linux-kernel.vger.kernel.org archive mirror
* BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
@ 2019-03-06 18:04 syzbot
  2019-04-11  1:43 ` syzbot
  2019-04-11 12:14 ` syzbot
  0 siblings, 2 replies; 30+ messages in thread
From: syzbot @ 2019-03-06 18:04 UTC (permalink / raw)
  To: davem, johan.hedberg, linux-bluetooth, linux-kernel, marcel,
	netdev, syzkaller-bugs

Hello,

syzbot found the following crash on:

HEAD commit:    cf08baa29613 Add linux-next specific files for 20190306
git tree:       linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=15bc76c7200000
kernel config:  https://syzkaller.appspot.com/x/.config?x=c8b6073d992e8217
dashboard link: https://syzkaller.appspot.com/bug?extid=91fd909b6e62ebe06131
compiler:       gcc (GCC) 9.0.0 20181231 (experimental)

Unfortunately, I don't have any reproducer for this crash yet.

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+91fd909b6e62ebe06131@syzkaller.appspotmail.com

BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
turning off the locking correctness validator.
CPU: 0 PID: 11902 Comm: kworker/u5:27 Not tainted 5.0.0-next-20190306 #4
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: hci94 hci_power_on
Call Trace:
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0x172/0x1f0 lib/dump_stack.c:113
  add_chain_cache kernel/locking/lockdep.c:2582 [inline]
  lookup_chain_cache_add kernel/locking/lockdep.c:2656 [inline]
  validate_chain kernel/locking/lockdep.c:2676 [inline]
  __lock_acquire.cold+0x250/0x50d kernel/locking/lockdep.c:3692
  lock_acquire+0x16f/0x3f0 kernel/locking/lockdep.c:4202
  __flush_work+0x677/0x8a0 kernel/workqueue.c:3034
  flush_work+0x18/0x20 kernel/workqueue.c:3060
  hci_dev_do_open+0xa92/0x1780 net/bluetooth/hci_core.c:1543
  hci_power_on+0x10d/0x580 net/bluetooth/hci_core.c:2173
  process_one_work+0x98e/0x1790 kernel/workqueue.c:2269
  worker_thread+0x98/0xe40 kernel/workqueue.c:2415
  kthread+0x357/0x430 kernel/kthread.c:253
  ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:352


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with syzbot.


* Re: BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
  2019-03-06 18:04 BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low! syzbot
@ 2019-04-11  1:43 ` syzbot
  2019-04-11 12:14 ` syzbot
  1 sibling, 0 replies; 30+ messages in thread
From: syzbot @ 2019-04-11  1:43 UTC (permalink / raw)
  To: davem, johan.hedberg, linux-bluetooth, linux-kernel, marcel,
	netdev, syzkaller-bugs

syzbot has found a reproducer for the following crash on:

HEAD commit:    771acc7e Bluetooth: btusb: request wake pin with NOAUTOEN
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=15f58add200000
kernel config:  https://syzkaller.appspot.com/x/.config?x=4fb64439e07a1ec0
dashboard link: https://syzkaller.appspot.com/bug?extid=91fd909b6e62ebe06131
compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=11770a8f200000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=128c945b200000

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+91fd909b6e62ebe06131@syzkaller.appspotmail.com

BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
turning off the locking correctness validator.
CPU: 0 PID: 1174 Comm: kworker/u5:0 Not tainted 5.1.0-rc4+ #63
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: hci1 hci_power_on
Call Trace:
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0x172/0x1f0 lib/dump_stack.c:113
  add_chain_cache kernel/locking/lockdep.c:2591 [inline]
  lookup_chain_cache_add kernel/locking/lockdep.c:2665 [inline]
  validate_chain kernel/locking/lockdep.c:2685 [inline]
  __lock_acquire.cold+0x250/0x50d kernel/locking/lockdep.c:3701
  lock_acquire+0x16f/0x3f0 kernel/locking/lockdep.c:4211
  __mutex_lock_common kernel/locking/mutex.c:925 [inline]
  __mutex_lock+0xf7/0x1310 kernel/locking/mutex.c:1072
  mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:1087
  hci_dev_do_close+0x317/0xf20 net/bluetooth/hci_core.c:1696
  hci_power_on+0x1d2/0x580 net/bluetooth/hci_core.c:2191
  process_one_work+0x98e/0x1790 kernel/workqueue.c:2269
  worker_thread+0x98/0xe40 kernel/workqueue.c:2415
  kthread+0x357/0x430 kernel/kthread.c:253
  ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:352



* Re: BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
  2019-03-06 18:04 BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low! syzbot
  2019-04-11  1:43 ` syzbot
@ 2019-04-11 12:14 ` syzbot
  2019-04-11 21:50   ` Benjamin Herrenschmidt
  1 sibling, 1 reply; 30+ messages in thread
From: syzbot @ 2019-04-11 12:14 UTC (permalink / raw)
  To: benh, davem, gregkh, johan.hedberg, linux-bluetooth,
	linux-kernel, marcel, netdev, rafael, syzkaller-bugs, tj,
	torvalds

syzbot has bisected this bug to:

commit 726e41097920a73e4c7c33385dcc0debb1281e18
Author: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Date:   Tue Jul 10 00:29:10 2018 +0000

     drivers: core: Remove glue dirs from sysfs earlier

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=15f69eaf200000
start commit:   771acc7e Bluetooth: btusb: request wake pin with NOAUTOEN
git tree:       upstream
final crash:    https://syzkaller.appspot.com/x/report.txt?x=17f69eaf200000
console output: https://syzkaller.appspot.com/x/log.txt?x=13f69eaf200000
kernel config:  https://syzkaller.appspot.com/x/.config?x=4fb64439e07a1ec0
dashboard link: https://syzkaller.appspot.com/bug?extid=91fd909b6e62ebe06131
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=11770a8f200000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=128c945b200000

Reported-by: syzbot+91fd909b6e62ebe06131@syzkaller.appspotmail.com
Fixes: 726e41097920 ("drivers: core: Remove glue dirs from sysfs earlier")

For information about bisection process see: https://goo.gl/tpsmEJ#bisection


* Re: BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
  2019-04-11 12:14 ` syzbot
@ 2019-04-11 21:50   ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 30+ messages in thread
From: Benjamin Herrenschmidt @ 2019-04-11 21:50 UTC (permalink / raw)
  To: davem, gregkh, johan.hedberg, linux-bluetooth, linux-kernel,
	marcel, netdev, rafael, syzkaller-bugs, tj, torvalds

On Thu, 2019-04-11 at 05:14 -0700, syzbot wrote:
> syzbot has bisected this bug to:
> 
> commit 726e41097920a73e4c7c33385dcc0debb1281e18
> Author: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Date:   Tue Jul 10 00:29:10 2018 +0000
> 
>      drivers: core: Remove glue dirs from sysfs earlier

Greg, any idea what this is? The log isn't terribly readable. The
above patch fixes a real bug that causes use-after-free and memory
corruption under some circumstances. I wonder if the BT stack is itself
manipulating stale objects?

Ben.

> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=15f69eaf200000
> start commit:   771acc7e Bluetooth: btusb: request wake pin with NOAUTOEN
> git tree:       upstream
> final crash:    https://syzkaller.appspot.com/x/report.txt?x=17f69eaf200000
> console output: https://syzkaller.appspot.com/x/log.txt?x=13f69eaf200000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=4fb64439e07a1ec0
> dashboard link: https://syzkaller.appspot.com/bug?extid=91fd909b6e62ebe06131
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=11770a8f200000
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=128c945b200000
> 
> Reported-by: syzbot+91fd909b6e62ebe06131@syzkaller.appspotmail.com
> Fixes: 726e41097920 ("drivers: core: Remove glue dirs from sysfs earlier")
> 
> For information about bisection process see: https://goo.gl/tpsmEJ#bisection



* Re: BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
  2023-01-27 14:26                   ` Waiman Long
@ 2023-01-27 15:33                     ` Chris Murphy
  0 siblings, 0 replies; 30+ messages in thread
From: Chris Murphy @ 2023-01-27 15:33 UTC (permalink / raw)
  To: Waiman Long, Boqun Feng
  Cc: Михаил
	Гаврилов,
	David Sterba, Btrfs BTRFS, linux-kernel, Peter Zijlstra,
	Ingo Molnar, Will Deacon, Joel Fernandes



On Fri, Jan 27, 2023, at 9:26 AM, Waiman Long wrote:
> On 1/26/23 23:07, Boqun Feng wrote:
>> On Thu, Jan 26, 2023 at 10:37:56PM -0500, Chris Murphy wrote:
>>>
>>> On Thu, Jan 26, 2023, at 7:20 PM, Waiman Long wrote:
>>>> On 1/26/23 17:42, Mikhail Gavrilov wrote:
>>>>>> I'm not sure whether these options are better than just increasing the
>>>>>> number; maybe, to unblock you ASAP, you can try making it 30 and make sure
>>>>>> you have large enough memory to test.
>>>>> About just to increase the LOCKDEP_CHAINS_BITS by 1. Where should this
>>>>> be done? In vanilla kernel on kernel.org? In a specific distribution?
>>>>> or the user must rebuild the kernel himself? Maybe increase
>>>>> LOCKDEP_CHAINS_BITS by 1 is most reliable solution, but it difficult
>>>>> to distribute to end users because the meaning of using packaged
>>>>> distributions is lost (user should change LOCKDEP_CHAINS_BITS in
>>>>> config and rebuild the kernel by yourself).
>>>> Note that lockdep is typically only enabled in a debug kernel shipped by
>>>> a distro because of the high performance overhead. The non-debug kernel
>>>> doesn't have lockdep enabled. When LOCKDEP_CHAINS_BITS isn't big enough
>>>> when testing on the debug kernel, you can file a ticket to the distro
>>>> asking for an increase in CONFIG_LOCKDEP_CHAINS_BITS. Or you can build
>>>> your own debug kernel with a bigger CONFIG_LOCKDEP_CHAINS_BITS.
>>> Fedora bumped CONFIG_LOCKDEP_CHAINS_BITS=17 to 18 just 6 months ago for debug kernels.
>>> https://gitlab.com/cki-project/kernel-ark/-/merge_requests/1921
>>>
>>> If 19 is the recommended value I don't mind sending an MR for it. But if
>>> the idea is we're going to be back here talking about bumping it to 20
>>> in six months, I'd like to avoid that.
>>>
>> How about a boot parameter then?
>
> A boot parameter doesn't work for a statically allocated array which is 
> determined at compile time. Dynamic memory allocation isn't enabled yet 
> at early boot when lockdep will be used.

Also, at least in Fedora Rawhide, where the mainline debug kernels appear, they mostly get used non-interactively with automated tests. So if we're going to discover lockdep issues, we need the kernel logs to be reliable at the time those tests are run, and we don't have a practical way of adding another boot parameter just for these tests.

Anyway I went ahead and submitted an MR to bump this to 19.
https://gitlab.com/cki-project/kernel-ark/-/merge_requests/2271



-- 
Chris Murphy


* Re: BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
  2023-01-27  4:07                 ` Boqun Feng
  2023-01-27  5:35                   ` Mikhail Gavrilov
@ 2023-01-27 14:26                   ` Waiman Long
  2023-01-27 15:33                     ` Chris Murphy
  1 sibling, 1 reply; 30+ messages in thread
From: Waiman Long @ 2023-01-27 14:26 UTC (permalink / raw)
  To: Boqun Feng, Chris Murphy
  Cc: Михаил
	Гаврилов,
	David Sterba, Btrfs BTRFS, linux-kernel, Peter Zijlstra,
	Ingo Molnar, Will Deacon, Joel Fernandes

On 1/26/23 23:07, Boqun Feng wrote:
> On Thu, Jan 26, 2023 at 10:37:56PM -0500, Chris Murphy wrote:
>>
>> On Thu, Jan 26, 2023, at 7:20 PM, Waiman Long wrote:
>>> On 1/26/23 17:42, Mikhail Gavrilov wrote:
>>>>> I'm not sure whether these options are better than just increasing the
>>>>> number; maybe, to unblock you ASAP, you can try making it 30 and make sure
>>>>> you have large enough memory to test.
>>>> About just to increase the LOCKDEP_CHAINS_BITS by 1. Where should this
>>>> be done? In vanilla kernel on kernel.org? In a specific distribution?
>>>> or the user must rebuild the kernel himself? Maybe increase
>>>> LOCKDEP_CHAINS_BITS by 1 is most reliable solution, but it difficult
>>>> to distribute to end users because the meaning of using packaged
>>>> distributions is lost (user should change LOCKDEP_CHAINS_BITS in
>>>> config and rebuild the kernel by yourself).
>>> Note that lockdep is typically only enabled in a debug kernel shipped by
>>> a distro because of the high performance overhead. The non-debug kernel
>>> doesn't have lockdep enabled. When LOCKDEP_CHAINS_BITS isn't big enough
>>> when testing on the debug kernel, you can file a ticket to the distro
>>> asking for an increase in CONFIG_LOCKDEP_CHAINS_BITS. Or you can build
>>> your own debug kernel with a bigger CONFIG_LOCKDEP_CHAINS_BITS.
>> Fedora bumped CONFIG_LOCKDEP_CHAINS_BITS=17 to 18 just 6 months ago for debug kernels.
>> https://gitlab.com/cki-project/kernel-ark/-/merge_requests/1921
>>
>> If 19 is the recommended value I don't mind sending an MR for it. But if
>> the idea is we're going to be back here talking about bumping it to 20
>> in six months, I'd like to avoid that.
>>
> How about a boot parameter then?

A boot parameter doesn't work for a statically allocated array which is 
determined at compile time. Dynamic memory allocation isn't enabled yet 
at early boot when lockdep will be used.

Cheers,
Longman
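
As a rough illustration of the statically sized array described above, here
is a toy Python model. It is not kernel code: the class and method names are
made up for this sketch, and the factor of 5 entries per chain is taken from
the "2^N * 5" sizing quoted elsewhere in this thread. The capacity is fixed
when the object is created (standing in for a compile-time constant), and
once a new chain no longer fits, the model prints the warning and stops
tracking.

```python
class FixedChainPool:
    """Toy model of a fixed-capacity pool of chain entries (hypothetical, not kernel code)."""

    def __init__(self, chains_bits: int, hlocks_per_chain: int = 5):
        # Capacity is fixed at construction, standing in for a compile-time constant.
        self.capacity = (1 << chains_bits) * hlocks_per_chain
        self.used = 0
        self.enabled = True

    def add_chain(self, depth: int) -> bool:
        """Reserve `depth` entries for a new lock chain; disable the pool if it cannot fit."""
        if not self.enabled:
            return False
        if self.used + depth > self.capacity:
            # No room left and no way to grow: warn once and turn tracking off.
            self.enabled = False
            print("BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!")
            return False
        self.used += depth
        return True


pool = FixedChainPool(chains_bits=2)  # capacity = 4 * 5 = 20 entries
results = [pool.add_chain(depth=6) for _ in range(5)]
print(results, pool.enabled)
```

In this model the only way to get more room is to construct the pool with a
larger chains_bits, which mirrors rebuilding with a larger config value.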



* Re: BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
  2023-01-27  4:07                 ` Boqun Feng
@ 2023-01-27  5:35                   ` Mikhail Gavrilov
  2023-01-27 14:26                   ` Waiman Long
  1 sibling, 0 replies; 30+ messages in thread
From: Mikhail Gavrilov @ 2023-01-27  5:35 UTC (permalink / raw)
  To: Boqun Feng
  Cc: Chris Murphy, Waiman Long, David Sterba, Btrfs BTRFS,
	linux-kernel, Peter Zijlstra, Ingo Molnar, Will Deacon,
	Joel Fernandes

On Fri, Jan 27, 2023 at 9:08 AM Boqun Feng <boqun.feng@gmail.com> wrote:
>
> On Thu, Jan 26, 2023 at 10:37:56PM -0500, Chris Murphy wrote:
> >
> >
> > On Thu, Jan 26, 2023, at 7:20 PM, Waiman Long wrote:
> > > On 1/26/23 17:42, Mikhail Gavrilov wrote:
> > >>> I'm not sure whether these options are better than just increasing the
> > >>> number; maybe, to unblock you ASAP, you can try making it 30 and make sure
> > >>> you have large enough memory to test.
> > >> About just to increase the LOCKDEP_CHAINS_BITS by 1. Where should this
> > >> be done? In vanilla kernel on kernel.org? In a specific distribution?
> > >> or the user must rebuild the kernel himself? Maybe increase
> > >> LOCKDEP_CHAINS_BITS by 1 is most reliable solution, but it difficult
> > >> to distribute to end users because the meaning of using packaged
> > >> distributions is lost (user should change LOCKDEP_CHAINS_BITS in
> > >> config and rebuild the kernel by yourself).
> > >
> > > Note that lockdep is typically only enabled in a debug kernel shipped by
> > > a distro because of the high performance overhead. The non-debug kernel
> > > doesn't have lockdep enabled. When LOCKDEP_CHAINS_BITS isn't big enough
> > > when testing on the debug kernel, you can file a ticket to the distro
> > > asking for an increase in CONFIG_LOCKDEP_CHAINS_BITS. Or you can build
> > > your own debug kernel with a bigger CONFIG_LOCKDEP_CHAINS_BITS.
> >
> > Fedora bumped CONFIG_LOCKDEP_CHAINS_BITS=17 to 18 just 6 months ago for debug kernels.
> > https://gitlab.com/cki-project/kernel-ark/-/merge_requests/1921
> >
> > If 19 is the recommended value I don't mind sending an MR for it. But if
> > the idea is we're going to be back here talking about bumping it to 20
> > in six months, I'd like to avoid that.
> >
>
> How about a boot parameter then?

I would like this option.
This is better than rebuilding the kernel yourself and asking the
distribution's maintainers to increase this value.

Thanks.

-- 
Best Regards,
Mike Gavrilov.


* Re: BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
  2023-01-27  3:37               ` Chris Murphy
@ 2023-01-27  4:07                 ` Boqun Feng
  2023-01-27  5:35                   ` Mikhail Gavrilov
  2023-01-27 14:26                   ` Waiman Long
  0 siblings, 2 replies; 30+ messages in thread
From: Boqun Feng @ 2023-01-27  4:07 UTC (permalink / raw)
  To: Chris Murphy
  Cc: Waiman Long,
	Михаил
	Гаврилов,
	David Sterba, Btrfs BTRFS, linux-kernel, Peter Zijlstra,
	Ingo Molnar, Will Deacon, Joel Fernandes

On Thu, Jan 26, 2023 at 10:37:56PM -0500, Chris Murphy wrote:
> 
> 
> On Thu, Jan 26, 2023, at 7:20 PM, Waiman Long wrote:
> > On 1/26/23 17:42, Mikhail Gavrilov wrote:
> >>> I'm not sure whether these options are better than just increasing the
> >>> number; maybe, to unblock you ASAP, you can try making it 30 and make sure
> >>> you have large enough memory to test.
> >> About just to increase the LOCKDEP_CHAINS_BITS by 1. Where should this
> >> be done? In vanilla kernel on kernel.org? In a specific distribution?
> >> or the user must rebuild the kernel himself? Maybe increase
> >> LOCKDEP_CHAINS_BITS by 1 is most reliable solution, but it difficult
> >> to distribute to end users because the meaning of using packaged
> >> distributions is lost (user should change LOCKDEP_CHAINS_BITS in
> >> config and rebuild the kernel by yourself).
> >
> > Note that lockdep is typically only enabled in a debug kernel shipped by 
> > a distro because of the high performance overhead. The non-debug kernel 
> > doesn't have lockdep enabled. When LOCKDEP_CHAINS_BITS isn't big enough 
> > when testing on the debug kernel, you can file a ticket to the distro 
> > asking for an increase in CONFIG_LOCKDEP_CHAINS_BITS. Or you can build
> > your own debug kernel with a bigger CONFIG_LOCKDEP_CHAINS_BITS.
> 
> Fedora bumped CONFIG_LOCKDEP_CHAINS_BITS=17 to 18 just 6 months ago for debug kernels.
> https://gitlab.com/cki-project/kernel-ark/-/merge_requests/1921
> 
> If 19 is the recommended value I don't mind sending an MR for it. But if
> the idea is we're going to be back here talking about bumping it to 20
> in six months, I'd like to avoid that.
> 

How about a boot parameter then?

Regards,
Boqun

> 
> 
> -- 
> Chris Murphy


* Re: BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
  2023-01-27  0:20             ` Waiman Long
@ 2023-01-27  3:37               ` Chris Murphy
  2023-01-27  4:07                 ` Boqun Feng
  0 siblings, 1 reply; 30+ messages in thread
From: Chris Murphy @ 2023-01-27  3:37 UTC (permalink / raw)
  To: Waiman Long,
	Михаил
	Гаврилов,
	Boqun Feng
  Cc: David Sterba, Btrfs BTRFS, linux-kernel, Peter Zijlstra,
	Ingo Molnar, Will Deacon, Joel Fernandes



On Thu, Jan 26, 2023, at 7:20 PM, Waiman Long wrote:
> On 1/26/23 17:42, Mikhail Gavrilov wrote:
>>> I'm not sure whether these options are better than just increasing the
>>> number; maybe, to unblock you ASAP, you can try making it 30 and make sure
>>> you have large enough memory to test.
>> About just to increase the LOCKDEP_CHAINS_BITS by 1. Where should this
>> be done? In vanilla kernel on kernel.org? In a specific distribution?
>> or the user must rebuild the kernel himself? Maybe increase
>> LOCKDEP_CHAINS_BITS by 1 is most reliable solution, but it difficult
>> to distribute to end users because the meaning of using packaged
>> distributions is lost (user should change LOCKDEP_CHAINS_BITS in
>> config and rebuild the kernel by yourself).
>
> Note that lockdep is typically only enabled in a debug kernel shipped by 
> a distro because of the high performance overhead. The non-debug kernel 
> doesn't have lockdep enabled. When LOCKDEP_CHAINS_BITS isn't big enough 
> when testing on the debug kernel, you can file a ticket to the distro 
> asking for an increase in CONFIG_LOCKDEP_CHAINS_BITS. Or you can build
> your own debug kernel with a bigger CONFIG_LOCKDEP_CHAINS_BITS.

Fedora bumped CONFIG_LOCKDEP_CHAINS_BITS=17 to 18 just 6 months ago for debug kernels.
https://gitlab.com/cki-project/kernel-ark/-/merge_requests/1921

If 19 is the recommended value I don't mind sending an MR for it. But if the idea is we're going to be back here talking about bumping it to 20 in six months, I'd like to avoid that.



-- 
Chris Murphy
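
For anyone comparing distro kernels, a minimal sketch of reading the value
out of a kernel config file follows. The helper name lockdep_chains_bits and
the sample text are assumptions made up for this example; only the option
name CONFIG_LOCKDEP_CHAINS_BITS comes from the thread.

```python
import re
from typing import Optional


def lockdep_chains_bits(config_text: str) -> Optional[int]:
    """Return the CONFIG_LOCKDEP_CHAINS_BITS value from kernel config text, or None if unset."""
    match = re.search(r"^CONFIG_LOCKDEP_CHAINS_BITS=(\d+)$", config_text, re.MULTILINE)
    return int(match.group(1)) if match else None


# Made-up sample standing in for the contents of a debug kernel's config file.
sample = "CONFIG_PROVE_LOCKING=y\nCONFIG_LOCKDEP_CHAINS_BITS=18\n"
print(lockdep_chains_bits(sample))
```

On a real system the text would typically come from the config file that
ships with the running kernel; a non-debug kernel would be expected to yield
None because lockdep is not enabled there.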


* Re: BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
  2023-01-26 22:42           ` Mikhail Gavrilov
  2023-01-26 22:51             ` Boqun Feng
  2023-01-26 23:49             ` Boqun Feng
@ 2023-01-27  0:20             ` Waiman Long
  2023-01-27  3:37               ` Chris Murphy
  2 siblings, 1 reply; 30+ messages in thread
From: Waiman Long @ 2023-01-27  0:20 UTC (permalink / raw)
  To: Mikhail Gavrilov, Boqun Feng
  Cc: dsterba, Btrfs BTRFS, Linux List Kernel Mailing, Chris Murphy,
	Peter Zijlstra, Ingo Molnar, Will Deacon, Joel Fernandes


On 1/26/23 17:42, Mikhail Gavrilov wrote:
>> I'm not sure whether these options are better than just increasing the
>> number; maybe, to unblock you ASAP, you can try making it 30 and make sure
>> you have large enough memory to test.
> About just to increase the LOCKDEP_CHAINS_BITS by 1. Where should this
> be done? In vanilla kernel on kernel.org? In a specific distribution?
> or the user must rebuild the kernel himself? Maybe increase
> LOCKDEP_CHAINS_BITS by 1 is most reliable solution, but it difficult
> to distribute to end users because the meaning of using packaged
> distributions is lost (user should change LOCKDEP_CHAINS_BITS in
> config and rebuild the kernel by yourself).

Note that lockdep is typically only enabled in a debug kernel shipped by 
a distro because of the high performance overhead. The non-debug kernel 
doesn't have lockdep enabled. When LOCKDEP_CHAINS_BITS isn't big enough 
when testing on the debug kernel, you can file a ticket to the distro 
asking for an increase in CONFIG_LOCKDEP_CHAINS_BITS. Or you can build 
your own debug kernel with a bigger CONFIG_LOCKDEP_CHAINS_BITS.

Cheers,
Longman



* Re: BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
  2023-01-26 22:42           ` Mikhail Gavrilov
  2023-01-26 22:51             ` Boqun Feng
@ 2023-01-26 23:49             ` Boqun Feng
  2023-01-27  0:20             ` Waiman Long
  2 siblings, 0 replies; 30+ messages in thread
From: Boqun Feng @ 2023-01-26 23:49 UTC (permalink / raw)
  To: Mikhail Gavrilov
  Cc: dsterba, Btrfs BTRFS, Linux List Kernel Mailing, Chris Murphy,
	Peter Zijlstra, Ingo Molnar, Will Deacon, Waiman Long,
	Joel Fernandes

On Fri, Jan 27, 2023 at 03:42:52AM +0500, Mikhail Gavrilov wrote:
> On Thu, Jan 26, 2023 at 10:39 PM Boqun Feng <boqun.feng@gmail.com> wrote:
> >
> > [Cc lock folks]
> >
> > On Thu, Jan 26, 2023 at 02:47:42PM +0500, Mikhail Gavrilov wrote:
> > > On Wed, Jan 25, 2023 at 10:21 PM David Sterba <dsterba@suse.cz> wrote:
> > > >
> > > > On Wed, Jan 25, 2023 at 01:27:48AM +0500, Mikhail Gavrilov wrote:
> > > > > On Tue, Jul 26, 2022 at 9:47 PM David Sterba <dsterba@suse.cz> wrote:
> > > > > >
> > > > > > On Tue, Jul 26, 2022 at 05:32:54PM +0500, Mikhail Gavrilov wrote:
> > > > > > > Hi guys.
> > > > > > > Always with intensive writing on a btrfs volume, the message "BUG:
> > > > > > > MAX_LOCKDEP_CHAIN_HLOCKS too low!" appears in the kernel logs.
> > > > > >
> > > > > > Increase the config value of LOCKDEP_CHAINS_BITS, default is 16, 18
> > > > > > tends to work.
> > > > >
> > > > > Hi,
> > > > > Today I was able to get the message "BUG: MAX_LOCKDEP_CHAIN_HLOCKS too
> > > > > low!" again even with LOCKDEP_CHAINS_BITS=18 and kernel 6.2-rc5.
> > > > >
> > > > > ❯ cat /boot/config-`uname -r` | grep LOCKDEP_CHAINS_BITS
> > > > > CONFIG_LOCKDEP_CHAINS_BITS=18
> > > > >
> > > > > [88685.088099] BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
> > > > > [88685.088124] turning off the locking correctness validator.
> > > > > [88685.088133] Please attach the output of /proc/lock_stat to the bug report
> > > > > [88685.088142] CPU: 14 PID: 1749746 Comm: mv Tainted: G        W    L
> > > > >   -------  ---  6.2.0-0.rc5.20230123git2475bf0250de.38.fc38.x86_64 #1
> > > > > [88685.088154] Hardware name: System manufacturer System Product
> > > > > Name/ROG STRIX X570-I GAMING, BIOS 4408 10/28/2022
> > > > >
> > > > > What's next? Increase this value to 19?
> > > >
> > > > Yes, though increasing the value is a workaround so you may see the
> > > > warning again.
> > >
> > > Is there any sense in this WARNING if we would ignore it and every
> > > time increase the threshold value?
> >
> > Lockdep uses a statically allocated array to track lock holding chains to
> > avoid dynamic memory allocation in its own code. So if you see the
> > warning it means your test has more combinations of lock holdings than
> > the array can record. In other words, you reached the resource limit,
> > and in that sense it makes sense to just ignore it and increase the
> > value: you want to give lockdep enough resources to work, right?
> 
> It is needed for correct working btrfs. David, am I right?
> 

Lockdep is not needed for btrfs to work correctly in production. It's a
tool that helps btrfs developers find deadlocks in
development/test/debug environments. End users, i.e. the users of the
Linux kernel, don't need it.

Regards,
Boqun

> >
> > > Maybe set 99 right away? Or remove such a check condition?
> >
> > That requires having 2^99 * 5 * sizeof(u16) bytes of memory for the lock
> > holding chains array.
> >
> > However, a few other options we can try in lockdep are:
> >
> > *       warn but not turn off lockdep: the lock holding chain is
> >         only a cache of the lock holding combinations lockdep has
> >         seen so far; we also record the dependency in the graph. Without
> >         the lock holding chain, lockdep can still work, just slower.
> >
> > *       allow dynamic memory allocation in lockdep: I think this might
> >         be OK since we have lockdep_recursion to avoid lockdep code ->
> >         mm code -> lockdep code -> mm code ... deadlock. But maybe I'm
> >         missing something. And even if we allow it, the use of memory
> >         doesn't change: you will still need that amount of memory to
> >         track lock holding chains.
> >
> > I'm not sure whether these options are better than just increasing the
> > number; maybe, to unblock you ASAP, you can try making it 30 and make sure
> > you have enough memory to test.
> 
> About just to increase the LOCKDEP_CHAINS_BITS by 1. Where should this
> be done? In vanilla kernel on kernel.org? In a specific distribution?
> or the user must rebuild the kernel himself? Maybe increase
> LOCKDEP_CHAINS_BITS by 1 is most reliable solution, but it difficult
> to distribute to end users because the meaning of using packaged
> distributions is lost (user should change LOCKDEP_CHAINS_BITS in
> config and rebuild the kernel by yourself).
> 
> It would be great if the chosen value would simply work always
> everywhere. 30? ok! But as I understand, btrfs does not have any
> guarantees for this. David, am I right?
> 
> Anyway, thank you for keeping the conversation going.
> 
> -- 
> Best Regards,
> Mike Gavrilov.
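
The sizing arithmetic quoted above (2^N * 5 * sizeof(u16)) can be worked
through with a short Python sketch. The function name is made up for this
example, and it covers only the one array described by that formula; other
lockdep data structures are not counted. Under that formula, 18 bits comes
to 2.5 MiB, 19 bits to 5 MiB, and 30 bits to 10 GiB.

```python
U16_BYTES = 2          # sizeof(u16)
HLOCKS_PER_CHAIN = 5   # the "* 5" factor from the formula quoted in the thread


def chain_hlocks_bytes(chains_bits: int) -> int:
    """Bytes needed for the lock holding chains array at a given LOCKDEP_CHAINS_BITS."""
    return (1 << chains_bits) * HLOCKS_PER_CHAIN * U16_BYTES


for bits in (16, 18, 19, 30):
    mib = chain_hlocks_bytes(bits) / 2**20
    print(f"LOCKDEP_CHAINS_BITS={bits}: {mib:,.1f} MiB")
```

Each extra bit doubles the array, which is why a value such as 99 is not
practical while a step from 18 to 19 is cheap on a test machine.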


* Re: BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
  2023-01-26 22:42           ` Mikhail Gavrilov
@ 2023-01-26 22:51             ` Boqun Feng
  2023-01-26 23:49             ` Boqun Feng
  2023-01-27  0:20             ` Waiman Long
  2 siblings, 0 replies; 30+ messages in thread
From: Boqun Feng @ 2023-01-26 22:51 UTC (permalink / raw)
  To: Mikhail Gavrilov
  Cc: dsterba, Btrfs BTRFS, Linux List Kernel Mailing, Chris Murphy,
	Peter Zijlstra, Ingo Molnar, Will Deacon, Waiman Long,
	Joel Fernandes

On Fri, Jan 27, 2023 at 03:42:52AM +0500, Mikhail Gavrilov wrote:
> On Thu, Jan 26, 2023 at 10:39 PM Boqun Feng <boqun.feng@gmail.com> wrote:
> >
> > [Cc lock folks]
> >
> > On Thu, Jan 26, 2023 at 02:47:42PM +0500, Mikhail Gavrilov wrote:
> > > On Wed, Jan 25, 2023 at 10:21 PM David Sterba <dsterba@suse.cz> wrote:
> > > >
> > > > On Wed, Jan 25, 2023 at 01:27:48AM +0500, Mikhail Gavrilov wrote:
> > > > > On Tue, Jul 26, 2022 at 9:47 PM David Sterba <dsterba@suse.cz> wrote:
> > > > > >
> > > > > > On Tue, Jul 26, 2022 at 05:32:54PM +0500, Mikhail Gavrilov wrote:
> > > > > > > Hi guys.
> > > > > > > Always with intensive writing on a btrfs volume, the message "BUG:
> > > > > > > MAX_LOCKDEP_CHAIN_HLOCKS too low!" appears in the kernel logs.
> > > > > >
> > > > > > Increase the config value of LOCKDEP_CHAINS_BITS, default is 16, 18
> > > > > > tends to work.
> > > > >
> > > > > Hi,
> > > > > Today I was able to get the message "BUG: MAX_LOCKDEP_CHAIN_HLOCKS too
> > > > > low!" again even with LOCKDEP_CHAINS_BITS=18 and kernel 6.2-rc5.
> > > > >
> > > > > ❯ cat /boot/config-`uname -r` | grep LOCKDEP_CHAINS_BITS
> > > > > CONFIG_LOCKDEP_CHAINS_BITS=18
> > > > >
> > > > > [88685.088099] BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
> > > > > [88685.088124] turning off the locking correctness validator.
> > > > > [88685.088133] Please attach the output of /proc/lock_stat to the bug report
> > > > > [88685.088142] CPU: 14 PID: 1749746 Comm: mv Tainted: G        W    L
> > > > >   -------  ---  6.2.0-0.rc5.20230123git2475bf0250de.38.fc38.x86_64 #1
> > > > > [88685.088154] Hardware name: System manufacturer System Product
> > > > > Name/ROG STRIX X570-I GAMING, BIOS 4408 10/28/2022
> > > > >
> > > > > What's next? Increase this value to 19?
> > > >
> > > > Yes, though increasing the value is a workaround so you may see the
> > > > warning again.
> > >
> > > Is there any sense in this WARNING if we would ignore it and every
> > > time increase the threshold value?
> >
> > Lockdep uses a statically allocated array to track lock holding chains to
> > avoid dynamic memory allocation in its own code. So if you see the
> > warning it means your test has more combinations of lock holdings than
> > the array can record. In other words, you reached the resource limit,
> > and in that sense it makes sense to just ignore it and increase the
> > value: you want to give lockdep enough resources to work, right?
> 
> It is needed for correct working btrfs. David, am I right?
> 
> >
> > > Maybe set 99 right away? Or remove such a check condition?
> >
> > That requires having 2^99 * 5 * sizeof(u16) bytes of memory for the lock
> > holding chains array.
> >
> > However, a few other options we can try in lockdep are:
> >
> > *       warn but not turn off lockdep: the lock holding chain is
> >         only a cache of the lock holding combinations lockdep has
> >         seen so far; we also record the dependency in the graph. Without
> >         the lock holding chain, lockdep can still work, just slower.
> >
> > *       allow dynamic memory allocation in lockdep: I think this might
> >         be OK since we have lockdep_recursion to avoid lockdep code ->
> >         mm code -> lockdep code -> mm code ... deadlock. But maybe I'm
> >         missing something. And even if we allow it, the use of memory
> >         doesn't change: you will still need that amount of memory to
> >         track lock holding chains.
> >
> > I'm not sure whether these options are better than just increasing the
> > number; maybe, to unblock you ASAP, you can try making it 30 and make sure
> > you have enough memory to test.
> 
> About just increasing LOCKDEP_CHAINS_BITS by 1: where should this
> be done? In the vanilla kernel on kernel.org? In a specific distribution?
> Or must the user rebuild the kernel themselves? Maybe increasing
> LOCKDEP_CHAINS_BITS by 1 is the most reliable solution, but it is difficult
> to distribute to end users, because the point of using a packaged
> distribution is lost (the user has to change LOCKDEP_CHAINS_BITS in the
> config and rebuild the kernel themselves).
> 

Lockdep is a dev tool that helps find deadlocks, and it introduces
overhead when enabled. Although it's possible, I doubt anyone will run a
LOCKDEP-enabled kernel in a production environment. In other words, it's
a debug/test-kernel-only option.

Regards,
Boqun

> It would be great if the chosen value simply worked always and
> everywhere. 30? OK! But as I understand it, btrfs does not give any
> guarantees for this. David, am I right?
> 
> Anyway, thank you for keeping the conversation going.
> 
> -- 
> Best Regards,
> Mike Gavrilov.


* Re: BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
  2023-01-26 17:38         ` Boqun Feng
  2023-01-26 18:30           ` Waiman Long
@ 2023-01-26 22:42           ` Mikhail Gavrilov
  2023-01-26 22:51             ` Boqun Feng
                               ` (2 more replies)
  1 sibling, 3 replies; 30+ messages in thread
From: Mikhail Gavrilov @ 2023-01-26 22:42 UTC (permalink / raw)
  To: Boqun Feng
  Cc: dsterba, Btrfs BTRFS, Linux List Kernel Mailing, Chris Murphy,
	Peter Zijlstra, Ingo Molnar, Will Deacon, Waiman Long,
	Joel Fernandes

On Thu, Jan 26, 2023 at 10:39 PM Boqun Feng <boqun.feng@gmail.com> wrote:
>
> [Cc lock folks]
>
> On Thu, Jan 26, 2023 at 02:47:42PM +0500, Mikhail Gavrilov wrote:
> > On Wed, Jan 25, 2023 at 10:21 PM David Sterba <dsterba@suse.cz> wrote:
> > >
> > > On Wed, Jan 25, 2023 at 01:27:48AM +0500, Mikhail Gavrilov wrote:
> > > > On Tue, Jul 26, 2022 at 9:47 PM David Sterba <dsterba@suse.cz> wrote:
> > > > >
> > > > > On Tue, Jul 26, 2022 at 05:32:54PM +0500, Mikhail Gavrilov wrote:
> > > > > > Hi guys.
> > > > > > Always with intensive writing on a btrfs volume, the message "BUG:
> > > > > > MAX_LOCKDEP_CHAIN_HLOCKS too low!" appears in the kernel logs.
> > > > >
> > > > > Increase the config value of LOCKDEP_CHAINS_BITS, default is 16, 18
> > > > > tends to work.
> > > >
> > > > Hi,
> > > > Today I was able to get the message "BUG: MAX_LOCKDEP_CHAIN_HLOCKS too
> > > > low!" again even with LOCKDEP_CHAINS_BITS=18 and kernel 6.2-rc5.
> > > >
> > > > ❯ cat /boot/config-`uname -r` | grep LOCKDEP_CHAINS_BITS
> > > > CONFIG_LOCKDEP_CHAINS_BITS=18
> > > >
> > > > [88685.088099] BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
> > > > [88685.088124] turning off the locking correctness validator.
> > > > [88685.088133] Please attach the output of /proc/lock_stat to the bug report
> > > > [88685.088142] CPU: 14 PID: 1749746 Comm: mv Tainted: G        W    L
> > > >   -------  ---  6.2.0-0.rc5.20230123git2475bf0250de.38.fc38.x86_64 #1
> > > > [88685.088154] Hardware name: System manufacturer System Product
> > > > Name/ROG STRIX X570-I GAMING, BIOS 4408 10/28/2022
> > > >
> > > > What's next? Increase this value to 19?
> > >
> > > Yes, though increasing the value is a workaround so you may see the
> > > warning again.
> >
> > Is there any point in this WARNING if we just ignore it and increase
> > the threshold value every time?
>
> Lockdep uses a statically allocated array to track lock holding chains, to
> avoid dynamic memory allocation in its own code. So if you see the
> warning, it means your test has more combinations of lock holdings than
> the array can record. In other words, you have hit a resource limit,
> and in that sense it makes sense to just ignore it and increase the
> value: you want to give lockdep enough resources to work, right?

It is needed for btrfs to work correctly. David, am I right?

>
> > Maybe set it to 99 right away? Or remove the check altogether?
>
> That requires 2^99 * 5 * sizeof(u16) bytes of memory for the lock holding
> chains array.
>
> However, a few other options we can try in lockdep are:
>
> *       warn but do not turn off lockdep: the lock holding chain is
>         only a cache of the lock holding combinations lockdep has already
>         seen; we also record the dependencies in the graph. Without the
>         lock holding chain, lockdep can still work, just more slowly.
>
> *       allow dynamic memory allocation in lockdep: I think this might
>         be OK since we have lockdep_recursion to avoid the lockdep code ->
>         mm code -> lockdep code -> mm code ... deadlock. But maybe I'm
>         missing something. And even if we allow it, the memory usage
>         doesn't change: you will still need that amount of memory to
>         track lock holding chains.
>
> I'm not sure whether these options are better than just increasing the
> number. Maybe, to unblock you ASAP, you can try making it 30 and make sure
> you have enough memory to test.

About just increasing LOCKDEP_CHAINS_BITS by 1: where should this
be done? In the vanilla kernel on kernel.org? In a specific distribution?
Or must the user rebuild the kernel themselves? Maybe increasing
LOCKDEP_CHAINS_BITS by 1 is the most reliable solution, but it is difficult
to distribute to end users, because the point of using a packaged
distribution is lost (the user has to change LOCKDEP_CHAINS_BITS in the
config and rebuild the kernel themselves).

It would be great if the chosen value simply worked always and
everywhere. 30? OK! But as I understand it, btrfs does not give any
guarantees for this. David, am I right?

Anyway, thank you for keeping the conversation going.

-- 
Best Regards,
Mike Gavrilov.


* Re: BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
  2023-01-26 18:59             ` Boqun Feng
@ 2023-01-26 19:07               ` Waiman Long
  0 siblings, 0 replies; 30+ messages in thread
From: Waiman Long @ 2023-01-26 19:07 UTC (permalink / raw)
  To: Boqun Feng
  Cc: Mikhail Gavrilov, dsterba, Btrfs BTRFS,
	Linux List Kernel Mailing, Chris Murphy, Peter Zijlstra,
	Ingo Molnar, Will Deacon, Joel Fernandes

On 1/26/23 13:59, Boqun Feng wrote:
> On Thu, Jan 26, 2023 at 01:30:34PM -0500, Waiman Long wrote:
>> On 1/26/23 12:38, Boqun Feng wrote:
>>> [Cc lock folks]
>>>
>>> On Thu, Jan 26, 2023 at 02:47:42PM +0500, Mikhail Gavrilov wrote:
>>>> On Wed, Jan 25, 2023 at 10:21 PM David Sterba <dsterba@suse.cz> wrote:
>>>>> On Wed, Jan 25, 2023 at 01:27:48AM +0500, Mikhail Gavrilov wrote:
>>>>>> On Tue, Jul 26, 2022 at 9:47 PM David Sterba <dsterba@suse.cz> wrote:
>>>>>>> On Tue, Jul 26, 2022 at 05:32:54PM +0500, Mikhail Gavrilov wrote:
>>>>>>>> Hi guys.
>>>>>>>> Always with intensive writing on a btrfs volume, the message "BUG:
>>>>>>>> MAX_LOCKDEP_CHAIN_HLOCKS too low!" appears in the kernel logs.
>>>>>>> Increase the config value of LOCKDEP_CHAINS_BITS, default is 16, 18
>>>>>>> tends to work.
>>>>>> Hi,
>>>>>> Today I was able to get the message "BUG: MAX_LOCKDEP_CHAIN_HLOCKS too
>>>>>> low!" again even with LOCKDEP_CHAINS_BITS=18 and kernel 6.2-rc5.
>>>>>>
>>>>>> ❯ cat /boot/config-`uname -r` | grep LOCKDEP_CHAINS_BITS
>>>>>> CONFIG_LOCKDEP_CHAINS_BITS=18
>>>>>>
>>>>>> [88685.088099] BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
>>>>>> [88685.088124] turning off the locking correctness validator.
>>>>>> [88685.088133] Please attach the output of /proc/lock_stat to the bug report
>>>>>> [88685.088142] CPU: 14 PID: 1749746 Comm: mv Tainted: G        W    L
>>>>>>     -------  ---  6.2.0-0.rc5.20230123git2475bf0250de.38.fc38.x86_64 #1
>>>>>> [88685.088154] Hardware name: System manufacturer System Product
>>>>>> Name/ROG STRIX X570-I GAMING, BIOS 4408 10/28/2022
>>>>>>
>>>>>> What's next? Increase this value to 19?
>>>>> Yes, though increasing the value is a workaround so you may see the
>>>>> warning again.
>>>> Is there any point in this WARNING if we just ignore it and increase
>>>> the threshold value every time?
>>> Lockdep uses a statically allocated array to track lock holding chains, to
>>> avoid dynamic memory allocation in its own code. So if you see the
>>> warning, it means your test has more combinations of lock holdings than
>>> the array can record. In other words, you have hit a resource limit,
>>> and in that sense it makes sense to just ignore it and increase the
>>> value: you want to give lockdep enough resources to work, right?
>>>
>>>> Maybe set it to 99 right away? Or remove the check altogether?
>>> That requires 2^99 * 5 * sizeof(u16) bytes of memory for the lock holding
>>> chains array.
>> Note that every increment of LOCKDEP_CHAINS_BITS doubles the storage space.
>> With 99, that will likely exceed the total amount of memory you have in your
>> system.
>>
>> Boqun, where does the figure 5 come from? It is just a simple u16 array of
> 	#define MAX_LOCKDEP_CHAINS_BITS	CONFIG_LOCKDEP_CHAINS_BITS
> 	#define MAX_LOCKDEP_CHAINS	(1UL << MAX_LOCKDEP_CHAINS_BITS)
>
> 	#define MAX_LOCKDEP_CHAIN_HLOCKS (MAX_LOCKDEP_CHAINS*5)
>
> I think the last one means we assume the average length of a lock chain
> is 5; in other words, on average, a task holds at most 5 locks. I don't
> know where the 5 came from either, but it's there ;-)

You are right. I missed that when I looked. So 5 is assumed to be the
average length of a lock chain.

Thanks,
Longman



* Re: BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
  2023-01-26 18:30           ` Waiman Long
@ 2023-01-26 18:59             ` Boqun Feng
  2023-01-26 19:07               ` Waiman Long
  0 siblings, 1 reply; 30+ messages in thread
From: Boqun Feng @ 2023-01-26 18:59 UTC (permalink / raw)
  To: Waiman Long
  Cc: Mikhail Gavrilov, dsterba, Btrfs BTRFS,
	Linux List Kernel Mailing, Chris Murphy, Peter Zijlstra,
	Ingo Molnar, Will Deacon, Joel Fernandes

On Thu, Jan 26, 2023 at 01:30:34PM -0500, Waiman Long wrote:
> On 1/26/23 12:38, Boqun Feng wrote:
> > [Cc lock folks]
> > 
> > On Thu, Jan 26, 2023 at 02:47:42PM +0500, Mikhail Gavrilov wrote:
> > > On Wed, Jan 25, 2023 at 10:21 PM David Sterba <dsterba@suse.cz> wrote:
> > > > On Wed, Jan 25, 2023 at 01:27:48AM +0500, Mikhail Gavrilov wrote:
> > > > > On Tue, Jul 26, 2022 at 9:47 PM David Sterba <dsterba@suse.cz> wrote:
> > > > > > On Tue, Jul 26, 2022 at 05:32:54PM +0500, Mikhail Gavrilov wrote:
> > > > > > > Hi guys.
> > > > > > > Always with intensive writing on a btrfs volume, the message "BUG:
> > > > > > > MAX_LOCKDEP_CHAIN_HLOCKS too low!" appears in the kernel logs.
> > > > > > Increase the config value of LOCKDEP_CHAINS_BITS, default is 16, 18
> > > > > > tends to work.
> > > > > Hi,
> > > > > Today I was able to get the message "BUG: MAX_LOCKDEP_CHAIN_HLOCKS too
> > > > > low!" again even with LOCKDEP_CHAINS_BITS=18 and kernel 6.2-rc5.
> > > > > 
> > > > > ❯ cat /boot/config-`uname -r` | grep LOCKDEP_CHAINS_BITS
> > > > > CONFIG_LOCKDEP_CHAINS_BITS=18
> > > > > 
> > > > > [88685.088099] BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
> > > > > [88685.088124] turning off the locking correctness validator.
> > > > > [88685.088133] Please attach the output of /proc/lock_stat to the bug report
> > > > > [88685.088142] CPU: 14 PID: 1749746 Comm: mv Tainted: G        W    L
> > > > >    -------  ---  6.2.0-0.rc5.20230123git2475bf0250de.38.fc38.x86_64 #1
> > > > > [88685.088154] Hardware name: System manufacturer System Product
> > > > > Name/ROG STRIX X570-I GAMING, BIOS 4408 10/28/2022
> > > > > 
> > > > > What's next? Increase this value to 19?
> > > > Yes, though increasing the value is a workaround so you may see the
> > > > warning again.
> > > > Is there any point in this WARNING if we just ignore it and increase
> > > > the threshold value every time?
> > > Lockdep uses a statically allocated array to track lock holding chains, to
> > > avoid dynamic memory allocation in its own code. So if you see the
> > > warning, it means your test has more combinations of lock holdings than
> > > the array can record. In other words, you have hit a resource limit,
> > > and in that sense it makes sense to just ignore it and increase the
> > > value: you want to give lockdep enough resources to work, right?
> > > 
> > > > Maybe set it to 99 right away? Or remove the check altogether?
> > > That requires 2^99 * 5 * sizeof(u16) bytes of memory for the lock holding
> > > chains array.
> 
> Note that every increment of LOCKDEP_CHAINS_BITS doubles the storage space.
> With 99, that will likely exceed the total amount of memory you have in your
> system.
> 
> Boqun, where does the figure 5 come from? It is just a simple u16 array of

	#define MAX_LOCKDEP_CHAINS_BITS	CONFIG_LOCKDEP_CHAINS_BITS
	#define MAX_LOCKDEP_CHAINS	(1UL << MAX_LOCKDEP_CHAINS_BITS)

	#define MAX_LOCKDEP_CHAIN_HLOCKS (MAX_LOCKDEP_CHAINS*5)

I think the last one means we assume the average length of a lock chain
is 5; in other words, on average, a task holds at most 5 locks. I don't
know where the 5 came from either, but it's there ;-)
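
To put numbers on these macros, here is a minimal sketch in plain userspace C
(not kernel code) that computes how many bytes a u16 chain_hlocks[] array
would need for a given LOCKDEP_CHAINS_BITS. The helper chain_hlocks_bytes()
and the macro AVG_CHAIN_LEN are made-up names for illustration; only the
formula (1 << bits) * 5 * sizeof(u16) comes from the macros above. It counts
chain_hlocks[] alone, not lockdep's other static tables.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative name; stands in for the literal 5 in MAX_LOCKDEP_CHAIN_HLOCKS. */
#define AVG_CHAIN_LEN 5ULL

/* chain_hlocks[] is described in this thread as an array of u16. */
static_assert(sizeof(uint16_t) == 2, "u16 is expected to be two bytes");

/*
 * Bytes needed for chain_hlocks[] at a given LOCKDEP_CHAINS_BITS:
 *   (1 << bits) * 5 * sizeof(u16)
 * Returns 0 when the result does not fit in 64 bits (bits > 60).
 */
static uint64_t chain_hlocks_bytes(unsigned int bits)
{
	if (bits > 60)
		return 0;
	return (1ULL << bits) * AVG_CHAIN_LEN * sizeof(uint16_t);
}
```

Under these assumptions, 16 bits gives 655360 bytes (640 KiB), 18 bits gives
2621440 bytes (2.5 MiB), 30 bits gives 10737418240 bytes (10 GiB), and 99 bits
does not fit in a 64-bit byte count at all.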

Regards,
Boqun

> size MAX_LOCKDEP_CHAIN_HLOCKS. The chain_hlocks array stores the lock chains
> that show up in the lockdep splats and in the /proc/lockdep* files. Each
> chain is of variable size. As we add new locks to a chain, we have to
> repeatedly deallocate and reallocate a larger chain buffer. That causes
> fragmentation in chain_hlocks[]. So if we have a very long lock chain,
> the allocation may fail because the largest free block is smaller than the
> requested chain length. There may be enough free space in chain_hlocks, but
> it is just too fragmented to be useful.
> 
> Maybe we should figure out a better way to handle this fragmentation. In the
> meantime, the easiest way forward is just to increase
> LOCKDEP_CHAINS_BITS by 1.
> 
> > 
> > However, a few other options we can try in lockdep are:
> > 
> > *	warn but do not turn off lockdep: the lock holding chain is
> > 	only a cache of the lock holding combinations lockdep has already
> > 	seen; we also record the dependencies in the graph. Without the
> > 	lock holding chain, lockdep can still work, just more slowly.
> > 
> > *	allow dynamic memory allocation in lockdep: I think this might
> > 	be OK since we have lockdep_recursion to avoid the lockdep code ->
> > 	mm code -> lockdep code -> mm code ... deadlock. But maybe I'm
> > 	missing something. And even if we allow it, the memory usage
> > 	doesn't change: you will still need that amount of memory to
> > 	track lock holding chains.
> 
> It is not just the issue of calling the memory allocator. There is also the
> issue of copying data from the old chain_hlocks to the new one while the old
> one may be updated during the copying process, unless we can freeze
> everything else.
> 
> Cheers,
> Longman
> 


* Re: BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
  2023-01-26 17:38         ` Boqun Feng
@ 2023-01-26 18:30           ` Waiman Long
  2023-01-26 18:59             ` Boqun Feng
  2023-01-26 22:42           ` Mikhail Gavrilov
  1 sibling, 1 reply; 30+ messages in thread
From: Waiman Long @ 2023-01-26 18:30 UTC (permalink / raw)
  To: Boqun Feng, Mikhail Gavrilov
  Cc: dsterba, Btrfs BTRFS, Linux List Kernel Mailing, Chris Murphy,
	Peter Zijlstra, Ingo Molnar, Will Deacon, Joel Fernandes

On 1/26/23 12:38, Boqun Feng wrote:
> [Cc lock folks]
>
> On Thu, Jan 26, 2023 at 02:47:42PM +0500, Mikhail Gavrilov wrote:
>> On Wed, Jan 25, 2023 at 10:21 PM David Sterba <dsterba@suse.cz> wrote:
>>> On Wed, Jan 25, 2023 at 01:27:48AM +0500, Mikhail Gavrilov wrote:
>>>> On Tue, Jul 26, 2022 at 9:47 PM David Sterba <dsterba@suse.cz> wrote:
>>>>> On Tue, Jul 26, 2022 at 05:32:54PM +0500, Mikhail Gavrilov wrote:
>>>>>> Hi guys.
>>>>>> Always with intensive writing on a btrfs volume, the message "BUG:
>>>>>> MAX_LOCKDEP_CHAIN_HLOCKS too low!" appears in the kernel logs.
>>>>> Increase the config value of LOCKDEP_CHAINS_BITS, default is 16, 18
>>>>> tends to work.
>>>> Hi,
>>>> Today I was able to get the message "BUG: MAX_LOCKDEP_CHAIN_HLOCKS too
>>>> low!" again even with LOCKDEP_CHAINS_BITS=18 and kernel 6.2-rc5.
>>>>
>>>> ❯ cat /boot/config-`uname -r` | grep LOCKDEP_CHAINS_BITS
>>>> CONFIG_LOCKDEP_CHAINS_BITS=18
>>>>
>>>> [88685.088099] BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
>>>> [88685.088124] turning off the locking correctness validator.
>>>> [88685.088133] Please attach the output of /proc/lock_stat to the bug report
>>>> [88685.088142] CPU: 14 PID: 1749746 Comm: mv Tainted: G        W    L
>>>>    -------  ---  6.2.0-0.rc5.20230123git2475bf0250de.38.fc38.x86_64 #1
>>>> [88685.088154] Hardware name: System manufacturer System Product
>>>> Name/ROG STRIX X570-I GAMING, BIOS 4408 10/28/2022
>>>>
>>>> What's next? Increase this value to 19?
>>> Yes, though increasing the value is a workaround so you may see the
>>> warning again.
>> Is there any point in this WARNING if we just ignore it and increase
>> the threshold value every time?
> Lockdep uses a statically allocated array to track lock holding chains, to
> avoid dynamic memory allocation in its own code. So if you see the
> warning, it means your test has more combinations of lock holdings than
> the array can record. In other words, you have hit a resource limit,
> and in that sense it makes sense to just ignore it and increase the
> value: you want to give lockdep enough resources to work, right?
>
>> Maybe set it to 99 right away? Or remove the check altogether?
> That requires 2^99 * 5 * sizeof(u16) bytes of memory for the lock holding
> chains array.

Note that every increment of LOCKDEP_CHAINS_BITS doubles the storage
space. With 99, that will likely exceed the total amount of memory you
have in your system.

Boqun, where does the figure 5 come from? It is just a simple u16 array
of size MAX_LOCKDEP_CHAIN_HLOCKS. The chain_hlocks array stores the lock
chains that show up in the lockdep splats and in the /proc/lockdep*
files. Each chain is of variable size. As we add new locks to a chain,
we have to repeatedly deallocate and reallocate a larger chain buffer.
That causes fragmentation in chain_hlocks[]. So if we have a
very long lock chain, the allocation may fail because the largest free
block is smaller than the requested chain length. There may be enough
free space in chain_hlocks, but it is just too fragmented to be useful.
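
The fragmentation failure described above can be illustrated with a toy
model in userspace C. This is only a sketch: struct free_map and the helpers
below are hypothetical and do not reflect how lockdep actually tracks free
space in chain_hlocks[]; they just show that a request can fail even when
the total free space would be large enough.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: block_len[i] is the length of the i-th contiguous free block. */
struct free_map {
	const unsigned int *block_len;
	size_t nr_blocks;
};

/* Sum of all free entries, regardless of where they sit in the array. */
static unsigned int total_free(const struct free_map *m)
{
	unsigned int sum = 0;

	for (size_t i = 0; i < m->nr_blocks; i++)
		sum += m->block_len[i];
	return sum;
}

/* Length of the largest single contiguous free block. */
static unsigned int largest_free(const struct free_map *m)
{
	unsigned int max = 0;

	for (size_t i = 0; i < m->nr_blocks; i++)
		if (m->block_len[i] > max)
			max = m->block_len[i];
	return max;
}

/* A chain needs one contiguous block, so only the largest block matters. */
static bool can_alloc_chain(const struct free_map *m, unsigned int chain_len)
{
	assert(chain_len > 0);
	return largest_free(m) >= chain_len;
}
```

For free blocks of lengths 3, 2, 4 and 1, total_free() is 10, yet
can_alloc_chain() fails for a chain of length 6 because no single block is
larger than 4.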

Maybe we should figure out a better way to handle this fragmentation. In
the meantime, the easiest way forward is just to increase
LOCKDEP_CHAINS_BITS by 1.

>
> However, a few other options we can try in lockdep are:
>
> *	warn but do not turn off lockdep: the lock holding chain is
> 	only a cache of the lock holding combinations lockdep has already
> 	seen; we also record the dependencies in the graph. Without the
> 	lock holding chain, lockdep can still work, just more slowly.
>
> *	allow dynamic memory allocation in lockdep: I think this might
> 	be OK since we have lockdep_recursion to avoid the lockdep code ->
> 	mm code -> lockdep code -> mm code ... deadlock. But maybe I'm
> 	missing something. And even if we allow it, the memory usage
> 	doesn't change: you will still need that amount of memory to
> 	track lock holding chains.

It is not just the issue of calling the memory allocator. There is also
the issue of copying data from the old chain_hlocks to the new one while
the old one may be updated during the copying process, unless we can
freeze everything else.

Cheers,
Longman



* Re: BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
  2023-01-26  9:47       ` Mikhail Gavrilov
@ 2023-01-26 17:38         ` Boqun Feng
  2023-01-26 18:30           ` Waiman Long
  2023-01-26 22:42           ` Mikhail Gavrilov
  0 siblings, 2 replies; 30+ messages in thread
From: Boqun Feng @ 2023-01-26 17:38 UTC (permalink / raw)
  To: Mikhail Gavrilov
  Cc: dsterba, Btrfs BTRFS, Linux List Kernel Mailing, Chris Murphy,
	Peter Zijlstra, Ingo Molnar, Will Deacon, Waiman Long,
	Joel Fernandes

[Cc lock folks]

On Thu, Jan 26, 2023 at 02:47:42PM +0500, Mikhail Gavrilov wrote:
> On Wed, Jan 25, 2023 at 10:21 PM David Sterba <dsterba@suse.cz> wrote:
> >
> > On Wed, Jan 25, 2023 at 01:27:48AM +0500, Mikhail Gavrilov wrote:
> > > On Tue, Jul 26, 2022 at 9:47 PM David Sterba <dsterba@suse.cz> wrote:
> > > >
> > > > On Tue, Jul 26, 2022 at 05:32:54PM +0500, Mikhail Gavrilov wrote:
> > > > > Hi guys.
> > > > > Always with intensive writing on a btrfs volume, the message "BUG:
> > > > > MAX_LOCKDEP_CHAIN_HLOCKS too low!" appears in the kernel logs.
> > > >
> > > > Increase the config value of LOCKDEP_CHAINS_BITS, default is 16, 18
> > > > tends to work.
> > >
> > > Hi,
> > > Today I was able to get the message "BUG: MAX_LOCKDEP_CHAIN_HLOCKS too
> > > low!" again even with LOCKDEP_CHAINS_BITS=18 and kernel 6.2-rc5.
> > >
> > > ❯ cat /boot/config-`uname -r` | grep LOCKDEP_CHAINS_BITS
> > > CONFIG_LOCKDEP_CHAINS_BITS=18
> > >
> > > [88685.088099] BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
> > > [88685.088124] turning off the locking correctness validator.
> > > [88685.088133] Please attach the output of /proc/lock_stat to the bug report
> > > [88685.088142] CPU: 14 PID: 1749746 Comm: mv Tainted: G        W    L
> > >   -------  ---  6.2.0-0.rc5.20230123git2475bf0250de.38.fc38.x86_64 #1
> > > [88685.088154] Hardware name: System manufacturer System Product
> > > Name/ROG STRIX X570-I GAMING, BIOS 4408 10/28/2022
> > >
> > > What's next? Increase this value to 19?
> >
> > Yes, though increasing the value is a workaround so you may see the
> > warning again.
> 
> Is there any point in this WARNING if we just ignore it and increase
> the threshold value every time?

Lockdep uses a statically allocated array to track lock holding chains, to
avoid dynamic memory allocation in its own code. So if you see the
warning, it means your test has more combinations of lock holdings than
the array can record. In other words, you have hit a resource limit,
and in that sense it makes sense to just ignore it and increase the
value: you want to give lockdep enough resources to work, right?

> Maybe set it to 99 right away? Or remove the check altogether?

That requires 2^99 * 5 * sizeof(u16) bytes of memory for the lock holding
chains array.

However, a few other options we can try in lockdep are:

*	warn but do not turn off lockdep: the lock holding chain is
	only a cache of the lock holding combinations lockdep has already
	seen; we also record the dependencies in the graph. Without the
	lock holding chain, lockdep can still work, just more slowly.

*	allow dynamic memory allocation in lockdep: I think this might
	be OK since we have lockdep_recursion to avoid the lockdep code ->
	mm code -> lockdep code -> mm code ... deadlock. But maybe I'm
	missing something. And even if we allow it, the memory usage
	doesn't change: you will still need that amount of memory to
	track lock holding chains.

I'm not sure whether these options are better than just increasing the
number. Maybe, to unblock you ASAP, you can try making it 30 and make sure
you have enough memory to test.

Regards,
Boqun

> 
> -- 
> Best Regards,
> Mike Gavrilov.


* Re: BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
  2023-01-25 17:15     ` David Sterba
@ 2023-01-26  9:47       ` Mikhail Gavrilov
  2023-01-26 17:38         ` Boqun Feng
  0 siblings, 1 reply; 30+ messages in thread
From: Mikhail Gavrilov @ 2023-01-26  9:47 UTC (permalink / raw)
  To: dsterba, boqun.feng; +Cc: Btrfs BTRFS, Linux List Kernel Mailing, Chris Murphy

On Wed, Jan 25, 2023 at 10:21 PM David Sterba <dsterba@suse.cz> wrote:
>
> On Wed, Jan 25, 2023 at 01:27:48AM +0500, Mikhail Gavrilov wrote:
> > On Tue, Jul 26, 2022 at 9:47 PM David Sterba <dsterba@suse.cz> wrote:
> > >
> > > On Tue, Jul 26, 2022 at 05:32:54PM +0500, Mikhail Gavrilov wrote:
> > > > Hi guys.
> > > > Always with intensive writing on a btrfs volume, the message "BUG:
> > > > MAX_LOCKDEP_CHAIN_HLOCKS too low!" appears in the kernel logs.
> > >
> > > Increase the config value of LOCKDEP_CHAINS_BITS, default is 16, 18
> > > tends to work.
> >
> > Hi,
> > Today I was able to get the message "BUG: MAX_LOCKDEP_CHAIN_HLOCKS too
> > low!" again even with LOCKDEP_CHAINS_BITS=18 and kernel 6.2-rc5.
> >
> > ❯ cat /boot/config-`uname -r` | grep LOCKDEP_CHAINS_BITS
> > CONFIG_LOCKDEP_CHAINS_BITS=18
> >
> > [88685.088099] BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
> > [88685.088124] turning off the locking correctness validator.
> > [88685.088133] Please attach the output of /proc/lock_stat to the bug report
> > [88685.088142] CPU: 14 PID: 1749746 Comm: mv Tainted: G        W    L
> >   -------  ---  6.2.0-0.rc5.20230123git2475bf0250de.38.fc38.x86_64 #1
> > [88685.088154] Hardware name: System manufacturer System Product
> > Name/ROG STRIX X570-I GAMING, BIOS 4408 10/28/2022
> >
> > What's next? Increase this value to 19?
>
> Yes, though increasing the value is a workaround so you may see the
> warning again.

Is there any point in this WARNING if we just ignore it and increase
the threshold value every time?
Maybe set it to 99 right away? Or remove the check altogether?

-- 
Best Regards,
Mike Gavrilov.


* Re: BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
  2023-01-24 20:27   ` Mikhail Gavrilov
@ 2023-01-25 17:15     ` David Sterba
  2023-01-26  9:47       ` Mikhail Gavrilov
  0 siblings, 1 reply; 30+ messages in thread
From: David Sterba @ 2023-01-25 17:15 UTC (permalink / raw)
  To: Mikhail Gavrilov
  Cc: dsterba, Btrfs BTRFS, Linux List Kernel Mailing, Chris Murphy

On Wed, Jan 25, 2023 at 01:27:48AM +0500, Mikhail Gavrilov wrote:
> On Tue, Jul 26, 2022 at 9:47 PM David Sterba <dsterba@suse.cz> wrote:
> >
> > On Tue, Jul 26, 2022 at 05:32:54PM +0500, Mikhail Gavrilov wrote:
> > > Hi guys.
> > > Always with intensive writing on a btrfs volume, the message "BUG:
> > > MAX_LOCKDEP_CHAIN_HLOCKS too low!" appears in the kernel logs.
> >
> > Increase the config value of LOCKDEP_CHAINS_BITS, default is 16, 18
> > tends to work.
> 
> Hi,
> Today I was able to get the message "BUG: MAX_LOCKDEP_CHAIN_HLOCKS too
> low!" again even with LOCKDEP_CHAINS_BITS=18 and kernel 6.2-rc5.
> 
> ❯ cat /boot/config-`uname -r` | grep LOCKDEP_CHAINS_BITS
> CONFIG_LOCKDEP_CHAINS_BITS=18
> 
> [88685.088099] BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
> [88685.088124] turning off the locking correctness validator.
> [88685.088133] Please attach the output of /proc/lock_stat to the bug report
> [88685.088142] CPU: 14 PID: 1749746 Comm: mv Tainted: G        W    L
>   -------  ---  6.2.0-0.rc5.20230123git2475bf0250de.38.fc38.x86_64 #1
> [88685.088154] Hardware name: System manufacturer System Product
> Name/ROG STRIX X570-I GAMING, BIOS 4408 10/28/2022
> 
> What's next? Increase this value to 19?

Yes, though increasing the value is a workaround so you may see the
warning again.


* Re: BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
  2022-07-26 16:42 ` David Sterba
  2022-07-26 19:19   ` Chris Murphy
  2022-08-03 19:28   ` Mikhail Gavrilov
@ 2023-01-24 20:27   ` Mikhail Gavrilov
  2023-01-25 17:15     ` David Sterba
  2 siblings, 1 reply; 30+ messages in thread
From: Mikhail Gavrilov @ 2023-01-24 20:27 UTC (permalink / raw)
  To: dsterba, Mikhail Gavrilov, Btrfs BTRFS, Linux List Kernel Mailing
  Cc: Chris Murphy

[-- Attachment #1: Type: text/plain, Size: 1220 bytes --]

On Tue, Jul 26, 2022 at 9:47 PM David Sterba <dsterba@suse.cz> wrote:
>
> On Tue, Jul 26, 2022 at 05:32:54PM +0500, Mikhail Gavrilov wrote:
> > Hi guys.
> > Always with intensive writing on a btrfs volume, the message "BUG:
> > MAX_LOCKDEP_CHAIN_HLOCKS too low!" appears in the kernel logs.
>
> Increase the config value of LOCKDEP_CHAINS_BITS, default is 16, 18
> tends to work.

Hi,
Today I was able to get the message "BUG: MAX_LOCKDEP_CHAIN_HLOCKS too
low!" again even with LOCKDEP_CHAINS_BITS=18 and kernel 6.2-rc5.

❯ cat /boot/config-`uname -r` | grep LOCKDEP_CHAINS_BITS
CONFIG_LOCKDEP_CHAINS_BITS=18

[88685.088099] BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
[88685.088124] turning off the locking correctness validator.
[88685.088133] Please attach the output of /proc/lock_stat to the bug report
[88685.088142] CPU: 14 PID: 1749746 Comm: mv Tainted: G        W    L
  -------  ---  6.2.0-0.rc5.20230123git2475bf0250de.38.fc38.x86_64 #1
[88685.088154] Hardware name: System manufacturer System Product
Name/ROG STRIX X570-I GAMING, BIOS 4408 10/28/2022

What's next? Increase this value to 19?
The full kernel log and lock_stat output are attached below.

-- 
Best Regards,
Mike Gavrilov.

[-- Attachment #2: dmesg.tar.xz --]
[-- Type: application/x-xz, Size: 36752 bytes --]

[-- Attachment #3: lock_stat.tar.xz --]
[-- Type: application/x-xz, Size: 96920 bytes --]


* Re: BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
  2022-08-04  7:35       ` Mikhail Gavrilov
@ 2022-08-04 11:23         ` Tetsuo Handa
  0 siblings, 0 replies; 30+ messages in thread
From: Tetsuo Handa @ 2022-08-04 11:23 UTC (permalink / raw)
  To: Mikhail Gavrilov, Chris Murphy
  Cc: David Sterba, Btrfs BTRFS, linux-kernel, dvyukov

On 2022/08/04 16:35, Mikhail Gavrilov wrote:
> On Thu, Aug 4, 2022 at 1:01 AM Chris Murphy <lists@colorremedies.com> wrote:
>>
>> This will be making it into Fedora debug kernels, which have lockdep enabled on them, starting with 5.20 series, which are now building in koji.
>> https://gitlab.com/cki-project/kernel-ark/-/merge_requests/1921
> 
> I saw this change, but it would be good if users of all other
> distributions were happy too.
> 

I'm not a lockdep maintainer.

Please submit a patch to lockdep maintainers and persuade lockdep maintainers
to change the default value. ;-)



* Re: BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
  2022-08-03 20:00     ` Chris Murphy
@ 2022-08-04  7:35       ` Mikhail Gavrilov
  2022-08-04 11:23         ` Tetsuo Handa
  0 siblings, 1 reply; 30+ messages in thread
From: Mikhail Gavrilov @ 2022-08-04  7:35 UTC (permalink / raw)
  To: Chris Murphy
  Cc: David Sterba, Btrfs BTRFS, linux-kernel, Tetsuo Handa, dvyukov

On Thu, Aug 4, 2022 at 1:01 AM Chris Murphy <lists@colorremedies.com> wrote:
>
> This will be making it into Fedora debug kernels, which have lockdep enabled on them, starting with 5.20 series, which are now building in koji.
> https://gitlab.com/cki-project/kernel-ark/-/merge_requests/1921

I saw this change, but it would be good if users of all other
distributions were happy too.

-- 
Best Regards,
Mike Gavrilov.


* Re: BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
  2022-08-03 19:28   ` Mikhail Gavrilov
@ 2022-08-03 20:00     ` Chris Murphy
  2022-08-04  7:35       ` Mikhail Gavrilov
  0 siblings, 1 reply; 30+ messages in thread
From: Chris Murphy @ 2022-08-03 20:00 UTC (permalink / raw)
  To: Михаил
	Гаврилов,
	David Sterba, Btrfs BTRFS, linux-kernel
  Cc: Tetsuo Handa, dvyukov



On Wed, Aug 3, 2022, at 3:28 PM, Mikhail Gavrilov wrote:
> On Tue, Jul 26, 2022 at 9:47 PM David Sterba <dsterba@suse.cz> wrote:
>>
>> On Tue, Jul 26, 2022 at 05:32:54PM +0500, Mikhail Gavrilov wrote:
>> > Hi guys.
>> > Always with intensive writing on a btrfs volume, the message "BUG:
>> > MAX_LOCKDEP_CHAIN_HLOCKS too low!" appears in the kernel logs.
>>
>> Increase the config value of LOCKDEP_CHAINS_BITS, default is 16, 18
>> tends to work.
>
> I confirm that after bumping LOCKDEP_CHAINS_BITS to 18, and after several
> days of continuous writing to the BTRFS partition with different files
> totalling 10Tb, I have not seen this kernel bug message again.
> Tetsuo, I saw that your commit 5dc33592e95534dc8455ce3e9baaaf3dae0fff82 [1]
> set the default value of LOCKDEP_CHAINS_BITS to 16.
> Why not increase LOCKDEP_CHAINS_BITS to 18 by default?

This will be making it into Fedora debug kernels, which have lockdep enabled on them, starting with 5.20 series, which are now building in koji.
https://gitlab.com/cki-project/kernel-ark/-/merge_requests/1921




-- 
Chris Murphy


* Re: BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
  2022-07-26 16:42 ` David Sterba
  2022-07-26 19:19   ` Chris Murphy
@ 2022-08-03 19:28   ` Mikhail Gavrilov
  2022-08-03 20:00     ` Chris Murphy
  2023-01-24 20:27   ` Mikhail Gavrilov
  2 siblings, 1 reply; 30+ messages in thread
From: Mikhail Gavrilov @ 2022-08-03 19:28 UTC (permalink / raw)
  To: dsterba, Mikhail Gavrilov, Btrfs BTRFS, Linux List Kernel Mailing
  Cc: Tetsuo Handa, dvyukov

On Tue, Jul 26, 2022 at 9:47 PM David Sterba <dsterba@suse.cz> wrote:
>
> On Tue, Jul 26, 2022 at 05:32:54PM +0500, Mikhail Gavrilov wrote:
> > Hi guys.
> > Always with intensive writing on a btrfs volume, the message "BUG:
> > MAX_LOCKDEP_CHAIN_HLOCKS too low!" appears in the kernel logs.
>
> Increase the config value of LOCKDEP_CHAINS_BITS, default is 16, 18
> tends to work.

I confirm that after bumping LOCKDEP_CHAINS_BITS to 18, and after several
days of continuous writing to the BTRFS partition (different files with a
total size of 10 TB), I didn't see this kernel bug message again.
Tetsuo, I saw that your commit 5dc33592e95534dc8455ce3e9baaaf3dae0fff82 [1]
set the default value of LOCKDEP_CHAINS_BITS to 16.
Why not increase LOCKDEP_CHAINS_BITS to 18 by default?
Thanks.


[1] https://github.com/torvalds/linux/blame/master/lib/Kconfig.debug#L1387

-- 
Best Regards,
Mike Gavrilov.


* Re: BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
  2022-07-26 19:21     ` Chris Murphy
@ 2022-07-26 20:42       ` David Sterba
  0 siblings, 0 replies; 30+ messages in thread
From: David Sterba @ 2022-07-26 20:42 UTC (permalink / raw)
  To: Chris Murphy
  Cc: David Sterba,
	Михаил
	Гаврилов,
	Btrfs BTRFS, linux-kernel

On Tue, Jul 26, 2022 at 03:21:32PM -0400, Chris Murphy wrote:
> 
> 
> On Tue, Jul 26, 2022, at 3:19 PM, Chris Murphy wrote:
> > On Tue, Jul 26, 2022, at 12:42 PM, David Sterba wrote:
> >> On Tue, Jul 26, 2022 at 05:32:54PM +0500, Mikhail Gavrilov wrote:
> >>> Hi guys.
> >>> Always with intensive writing on a btrfs volume, the message "BUG:
> >>> MAX_LOCKDEP_CHAIN_HLOCKS too low!" appears in the kernel logs.
> >>
> >> Increase the config value of LOCKDEP_CHAINS_BITS, default is 16, 18
> >> tends to work.
> >
> > Fedora is using 17. I'll make a request to bump it to 18. Thanks.
> 
> Should it be 18 across all archs? Or is it OK to only bump x86_64?

I think it applies to all architectures equally, but I'm no lockdep
expert.


* Re: BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
  2022-07-26 19:19   ` Chris Murphy
@ 2022-07-26 19:21     ` Chris Murphy
  2022-07-26 20:42       ` David Sterba
  0 siblings, 1 reply; 30+ messages in thread
From: Chris Murphy @ 2022-07-26 19:21 UTC (permalink / raw)
  To: David Sterba,
	Михаил
	Гаврилов
  Cc: Btrfs BTRFS, linux-kernel



On Tue, Jul 26, 2022, at 3:19 PM, Chris Murphy wrote:
> On Tue, Jul 26, 2022, at 12:42 PM, David Sterba wrote:
>> On Tue, Jul 26, 2022 at 05:32:54PM +0500, Mikhail Gavrilov wrote:
>>> Hi guys.
>>> Always with intensive writing on a btrfs volume, the message "BUG:
>>> MAX_LOCKDEP_CHAIN_HLOCKS too low!" appears in the kernel logs.
>>
>> Increase the config value of LOCKDEP_CHAINS_BITS, default is 16, 18
>> tends to work.
>
> Fedora is using 17. I'll make a request to bump it to 18. Thanks.

Should it be 18 across all archs? Or is it OK to only bump x86_64?

-- 
Chris Murphy


* Re: BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
  2022-07-26 16:42 ` David Sterba
@ 2022-07-26 19:19   ` Chris Murphy
  2022-07-26 19:21     ` Chris Murphy
  2022-08-03 19:28   ` Mikhail Gavrilov
  2023-01-24 20:27   ` Mikhail Gavrilov
  2 siblings, 1 reply; 30+ messages in thread
From: Chris Murphy @ 2022-07-26 19:19 UTC (permalink / raw)
  To: David Sterba,
	Михаил
	Гаврилов
  Cc: Btrfs BTRFS, linux-kernel



On Tue, Jul 26, 2022, at 12:42 PM, David Sterba wrote:
> On Tue, Jul 26, 2022 at 05:32:54PM +0500, Mikhail Gavrilov wrote:
>> Hi guys.
>> Always with intensive writing on a btrfs volume, the message "BUG:
>> MAX_LOCKDEP_CHAIN_HLOCKS too low!" appears in the kernel logs.
>
> Increase the config value of LOCKDEP_CHAINS_BITS, default is 16, 18
> tends to work.

Fedora is using 17. I'll make a request to bump it to 18. Thanks.

--
Chris Murphy


* Re: BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
  2022-07-26 12:32 Mikhail Gavrilov
@ 2022-07-26 16:42 ` David Sterba
  2022-07-26 19:19   ` Chris Murphy
                     ` (2 more replies)
  0 siblings, 3 replies; 30+ messages in thread
From: David Sterba @ 2022-07-26 16:42 UTC (permalink / raw)
  To: Mikhail Gavrilov; +Cc: Btrfs BTRFS, Linux List Kernel Mailing

On Tue, Jul 26, 2022 at 05:32:54PM +0500, Mikhail Gavrilov wrote:
> Hi guys.
> Always with intensive writing on a btrfs volume, the message "BUG:
> MAX_LOCKDEP_CHAIN_HLOCKS too low!" appears in the kernel logs.

Increase the config value of LOCKDEP_CHAINS_BITS, default is 16, 18
tends to work.
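
To make the effect of the bump concrete, here is a minimal sketch of the
arithmetic. It assumes the table sizes are derived as in
kernel/locking/lockdep_internals.h around v5.19 (MAX_LOCKDEP_CHAINS is
1 << CONFIG_LOCKDEP_CHAINS_BITS, and MAX_LOCKDEP_CHAIN_HLOCKS is five
times that); the helper name chain_limits is hypothetical, and the
formula should be checked against the tree actually being built.

```python
# Assumption (verify against your kernel tree):
#   MAX_LOCKDEP_CHAINS       = 1 << CONFIG_LOCKDEP_CHAINS_BITS
#   MAX_LOCKDEP_CHAIN_HLOCKS = MAX_LOCKDEP_CHAINS * 5
HLOCKS_PER_CHAIN = 5


def chain_limits(bits):
    """Return (max chains, max chain hlocks) for a LOCKDEP_CHAINS_BITS value."""
    chains = 1 << bits
    return chains, chains * HLOCKS_PER_CHAIN


if __name__ == "__main__":
    # Compare the default (16), the old Fedora value (17) and the suggested 18.
    for bits in (16, 17, 18):
        chains, hlocks = chain_limits(bits)
        print(f"LOCKDEP_CHAINS_BITS={bits}: {chains} chains, {hlocks} chain hlocks")
```

Under that assumption, each extra bit doubles both limits, so going from
16 to 18 gives four times the chain hlock capacity.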


* BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
@ 2022-07-26 12:32 Mikhail Gavrilov
  2022-07-26 16:42 ` David Sterba
  0 siblings, 1 reply; 30+ messages in thread
From: Mikhail Gavrilov @ 2022-07-26 12:32 UTC (permalink / raw)
  To: Btrfs BTRFS, Linux List Kernel Mailing

Hi guys.
Always with intensive writing on a btrfs volume, the message "BUG:
MAX_LOCKDEP_CHAIN_HLOCKS too low!" appears in the kernel logs.

[46729.134549] BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
[46729.134557] turning off the locking correctness validator.
[46729.134559] Please attach the output of /proc/lock_stat to the bug report
[46729.134561] CPU: 22 PID: 166516 Comm: ThreadPoolForeg Tainted: G     W    L   --------  --- 5.19.0-0.rc7.20220722git68e77ffbfd06.56.fc37.x86_64 #1
[46729.134566] Hardware name: System manufacturer System Product Name/ROG STRIX X570-I GAMING, BIOS 4403 04/27/2022
[46729.134569] Call Trace:
[46729.134572]  <TASK>
[46729.134576]  dump_stack_lvl+0x5b/0x77
[46729.134583]  __lock_acquire.cold+0x167/0x29e
[46729.134594]  lock_acquire+0xce/0x2d0
[46729.134599]  ? btrfs_reserve_extent+0xbd/0x250
[46729.134606]  ? btrfs_get_alloc_profile+0x17e/0x240
[46729.134611]  btrfs_get_alloc_profile+0x19c/0x240
[46729.134614]  ? btrfs_reserve_extent+0xbd/0x250
[46729.134618]  btrfs_reserve_extent+0xbd/0x250
[46729.134629]  btrfs_alloc_tree_block+0xa3/0x510
[46729.134635]  ? release_extent_buffer+0xa7/0xe0
[46729.134643]  split_node+0x131/0x3d0
[46729.134652]  btrfs_search_slot+0x2f3/0xc30
[46729.134659]  ? btrfs_insert_inode_ref+0x50/0x3b0
[46729.134664]  btrfs_insert_empty_items+0x31/0x70
[46729.134669]  btrfs_insert_inode_ref+0x99/0x3b0
[46729.134678]  btrfs_rename2+0x317/0x1510
[46729.134690]  ? vfs_rename+0x49d/0xd20
[46729.134693]  ? btrfs_symlink+0x460/0x460
[46729.134696]  vfs_rename+0x49d/0xd20
[46729.134705]  ? do_renameat2+0x4a0/0x510
[46729.134709]  do_renameat2+0x4a0/0x510
[46729.134720]  __x64_sys_rename+0x3f/0x50
[46729.134724]  do_syscall_64+0x5b/0x80
[46729.134729]  ? memcg_slab_free_hook+0x1fd/0x2e0
[46729.134735]  ? do_faccessat+0x111/0x260
[46729.134739]  ? kmem_cache_free+0x379/0x3d0
[46729.134744]  ? lock_is_held_type+0xe8/0x140
[46729.134749]  ? do_syscall_64+0x67/0x80
[46729.134752]  ? lockdep_hardirqs_on+0x7d/0x100
[46729.134757]  ? do_syscall_64+0x67/0x80
[46729.134760]  ? asm_exc_page_fault+0x22/0x30
[46729.134764]  ? lockdep_hardirqs_on+0x7d/0x100
[46729.134768]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[46729.134773] RIP: 0033:0x7fd2a29b5afb
[46729.134798] Code: e8 7a 27 0a 00 f7 d8 19 c0 5b c3 0f 1f 40 00 b8 ff ff ff ff 5b c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa b8 52 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 05 c3 0f 1f 40 00 48 8b 15 f1 82 17 00 f7 d8
[46729.134801] RSP: 002b:00007fd25b70a5a8 EFLAGS: 00000282 ORIG_RAX: 0000000000000052
[46729.134805] RAX: ffffffffffffffda RBX: 00007fd25b70a5e0 RCX: 00007fd2a29b5afb
[46729.134808] RDX: 0000000000000000 RSI: 00003ba01ef60820 RDI: 00003ba00e4b2da0
[46729.134810] RBP: 00007fd25b70a660 R08: 0000000000000000 R09: 00007fd25b70a570
[46729.134812] R10: 00007ffd36b1f080 R11: 0000000000000282 R12: 00007fd25b70a5b8
[46729.134815] R13: 00003ba00e4b2da0 R14: 00007fd25b70a6c4 R15: 00003ba01ef60820
[46729.134823]  </TASK>

In this regard, I want to ask, is this really a bug?
The kernel version is 5.19-rc7.

Here's the full kernel log: https://pastebin.com/hYWH7RHu
Here's /proc/lock_stat: https://pastebin.com/ex5w0QW9
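
One way to judge how close a running kernel is to these limits is to read
the counters that lockdep exports. The sketch below assumes that
/proc/lockdep_stats is available (a lockdep-enabled kernel, usually read
as root) and that the relevant lines look like "name: <used> [max: <limit>]";
the function names parse_usage and report are hypothetical.

```python
import re

# Assumed line shape: " dependency chains:      12345 [max: 65536]"
_LINE = re.compile(
    r"^\s*(?P<name>[^:]+):\s+(?P<used>\d+)\s+\[max:\s*(?P<max>\d+)\]"
)


def parse_usage(text):
    """Map each counter that has a '[max: N]' suffix to (used, limit)."""
    usage = {}
    for line in text.splitlines():
        m = _LINE.match(line)
        if m:
            usage[m.group("name").strip()] = (
                int(m.group("used")),
                int(m.group("max")),
            )
    return usage


def report(path="/proc/lockdep_stats"):
    """Print used/limit and the percentage for every bounded counter."""
    with open(path) as f:
        for name, (used, limit) in parse_usage(f.read()).items():
            print(f"{name}: {used}/{limit} ({100 * used / limit:.1f}%)")


if __name__ == "__main__":
    report()
```

Counters without a "[max: N]" suffix are skipped, so only the
capacity-bounded tables are reported.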

-- 
Best Regards,
Mike Gavrilov.


* BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
@ 2020-12-09  8:02 Dmitry Vyukov
  0 siblings, 0 replies; 30+ messages in thread
From: Dmitry Vyukov @ 2020-12-09  8:02 UTC (permalink / raw)
  To: syzbot; +Cc: syzkaller-bugs, LKML

This stopped happening a while ago, so let's close this to get
notifications about new instances.
One of the likely candidates:

#syz fix: net: partially revert dynamic lockdep key changes


end of thread, other threads:[~2023-01-27 15:34 UTC | newest]

Thread overview: 30+ messages
2019-03-06 18:04 BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low! syzbot
2019-04-11  1:43 ` syzbot
2019-04-11 12:14 ` syzbot
2019-04-11 21:50   ` Benjamin Herrenschmidt
2020-12-09  8:02 Dmitry Vyukov
2022-07-26 12:32 Mikhail Gavrilov
2022-07-26 16:42 ` David Sterba
2022-07-26 19:19   ` Chris Murphy
2022-07-26 19:21     ` Chris Murphy
2022-07-26 20:42       ` David Sterba
2022-08-03 19:28   ` Mikhail Gavrilov
2022-08-03 20:00     ` Chris Murphy
2022-08-04  7:35       ` Mikhail Gavrilov
2022-08-04 11:23         ` Tetsuo Handa
2023-01-24 20:27   ` Mikhail Gavrilov
2023-01-25 17:15     ` David Sterba
2023-01-26  9:47       ` Mikhail Gavrilov
2023-01-26 17:38         ` Boqun Feng
2023-01-26 18:30           ` Waiman Long
2023-01-26 18:59             ` Boqun Feng
2023-01-26 19:07               ` Waiman Long
2023-01-26 22:42           ` Mikhail Gavrilov
2023-01-26 22:51             ` Boqun Feng
2023-01-26 23:49             ` Boqun Feng
2023-01-27  0:20             ` Waiman Long
2023-01-27  3:37               ` Chris Murphy
2023-01-27  4:07                 ` Boqun Feng
2023-01-27  5:35                   ` Mikhail Gavrilov
2023-01-27 14:26                   ` Waiman Long
2023-01-27 15:33                     ` Chris Murphy
