b.a.t.m.a.n.lists.open-mesh.org archive mirror
* [syzbot] [batman?] BUG: soft lockup in sys_sendmsg
@ 2024-02-12 10:26 syzbot
  2024-02-12 10:41 ` Eric Dumazet
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: syzbot @ 2024-02-12 10:26 UTC
  To: a, b.a.t.m.a.n, davem, edumazet, kuba, linux-kernel,
	mareklindner, netdev, pabeni, sven, sw, syzkaller-bugs

Hello,

syzbot found the following issue on:

HEAD commit:    41bccc98fb79 Linux 6.8-rc2
git tree:       git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git for-kernelci
console output: https://syzkaller.appspot.com/x/log.txt?x=14200118180000
kernel config:  https://syzkaller.appspot.com/x/.config?x=451a1e62b11ea4a6
dashboard link: https://syzkaller.appspot.com/bug?extid=a6a4b5bb3da165594cff
compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
userspace arch: arm64

Unfortunately, I don't have any reproducer for this issue yet.

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/0772069e29cf/disk-41bccc98.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/659d3f0755b7/vmlinux-41bccc98.xz
kernel image: https://storage.googleapis.com/syzbot-assets/7780a45c3e51/Image-41bccc98.gz.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+a6a4b5bb3da165594cff@syzkaller.appspotmail.com

watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [syz-executor.0:28718]
Modules linked in:
irq event stamp: 45929391
hardirqs last  enabled at (45929390): [<ffff8000801d9dc8>] __local_bh_enable_ip+0x224/0x44c kernel/softirq.c:386
hardirqs last disabled at (45929391): [<ffff80008ad57108>] __el1_irq arch/arm64/kernel/entry-common.c:499 [inline]
hardirqs last disabled at (45929391): [<ffff80008ad57108>] el1_interrupt+0x24/0x68 arch/arm64/kernel/entry-common.c:517
softirqs last  enabled at (2040): [<ffff80008002189c>] softirq_handle_end kernel/softirq.c:399 [inline]
softirqs last  enabled at (2040): [<ffff80008002189c>] __do_softirq+0xac8/0xce4 kernel/softirq.c:582
softirqs last disabled at (2052): [<ffff80008aacbc40>] spin_lock_bh include/linux/spinlock.h:356 [inline]
softirqs last disabled at (2052): [<ffff80008aacbc40>] batadv_tt_local_resize_to_mtu+0x60/0x154 net/batman-adv/translation-table.c:3949
CPU: 1 PID: 28718 Comm: syz-executor.0 Not tainted 6.8.0-rc2-syzkaller-g41bccc98fb79 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/17/2023
pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : should_resched arch/arm64/include/asm/preempt.h:79 [inline]
pc : __local_bh_enable_ip+0x228/0x44c kernel/softirq.c:388
lr : __local_bh_enable_ip+0x224/0x44c kernel/softirq.c:386
sp : ffff80009a0670b0
x29: ffff80009a0670c0 x28: ffff70001340ce60 x27: ffff80009a0673d0
x26: ffff00011e860290 x25: ffff0000d08a9f08 x24: 0000000000000001
x23: 1fffe00023d4d3c1 x22: dfff800000000000 x21: ffff80008aacbf98
x20: 0000000000000202 x19: ffff00011ea69e08 x18: ffff80009a066800
x17: 77656e2074696620 x16: ffff80008031ffc8 x15: 0000000000000001
x14: 1fffe0001ba5a290 x13: 0000000000000000 x12: 0000000000000003
x11: 0000000000040000 x10: 0000000000000003 x9 : 0000000000000000
x8 : 0000000002bcd3ae x7 : ffff80008aacbe30 x6 : 0000000000000000
x5 : 0000000000000000 x4 : 0000000000000001 x3 : 0000000000000000
x2 : 0000000000000002 x1 : ffff80008aecd7e0 x0 : ffff80012545c000
Call trace:
 __daif_local_irq_enable arch/arm64/include/asm/irqflags.h:27 [inline]
 arch_local_irq_enable arch/arm64/include/asm/irqflags.h:49 [inline]
 __local_bh_enable_ip+0x228/0x44c kernel/softirq.c:386
 __raw_spin_unlock_bh include/linux/spinlock_api_smp.h:167 [inline]
 _raw_spin_unlock_bh+0x3c/0x4c kernel/locking/spinlock.c:210
 spin_unlock_bh include/linux/spinlock.h:396 [inline]
 batadv_tt_local_purge+0x264/0x2e8 net/batman-adv/translation-table.c:1356
 batadv_tt_local_resize_to_mtu+0xa0/0x154 net/batman-adv/translation-table.c:3956
 batadv_update_min_mtu+0x74/0xa4 net/batman-adv/hard-interface.c:651
 batadv_netlink_set_mesh+0x50c/0x1078 net/batman-adv/netlink.c:500
 genl_family_rcv_msg_doit net/netlink/genetlink.c:1113 [inline]
 genl_family_rcv_msg net/netlink/genetlink.c:1193 [inline]
 genl_rcv_msg+0x874/0xb6c net/netlink/genetlink.c:1208
 netlink_rcv_skb+0x214/0x3c4 net/netlink/af_netlink.c:2543
 genl_rcv+0x38/0x50 net/netlink/genetlink.c:1217
 netlink_unicast_kernel net/netlink/af_netlink.c:1341 [inline]
 netlink_unicast+0x65c/0x898 net/netlink/af_netlink.c:1367
 netlink_sendmsg+0x83c/0xb20 net/netlink/af_netlink.c:1908
 sock_sendmsg_nosec net/socket.c:730 [inline]
 __sock_sendmsg net/socket.c:745 [inline]
 ____sys_sendmsg+0x56c/0x840 net/socket.c:2584
 ___sys_sendmsg net/socket.c:2638 [inline]
 __sys_sendmsg+0x26c/0x33c net/socket.c:2667
 __do_sys_sendmsg net/socket.c:2676 [inline]
 __se_sys_sendmsg net/socket.c:2674 [inline]
 __arm64_sys_sendmsg+0x80/0x94 net/socket.c:2674
 __invoke_syscall arch/arm64/kernel/syscall.c:37 [inline]
 invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:51
 el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:136
 do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:155
 el0_svc+0x54/0x158 arch/arm64/kernel/entry-common.c:678
 el0t_64_sync_handler+0x84/0xfc arch/arm64/kernel/entry-common.c:696
 el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:598
Sending NMI from CPU 1 to CPUs 0:
NMI backtrace for cpu 0
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.8.0-rc2-syzkaller-g41bccc98fb79 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/17/2023
pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : arch_local_irq_enable+0x8/0xc arch/arm64/include/asm/irqflags.h:51
lr : default_idle_call+0xf8/0x128 kernel/sched/idle.c:103
sp : ffff80008ebe7cd0
x29: ffff80008ebe7cd0 x28: dfff800000000000 x27: 1ffff00011d7cfa8
x26: ffff80008ec6d000 x25: 0000000000000000 x24: 0000000000000001
x23: 1ffff00011d8da74 x22: ffff80008ec6d3a0 x21: 0000000000000000
x20: ffff80008ec94e00 x19: ffff8000802cff08 x18: 1fffe000367ff796
x17: ffff80008ec6d000 x16: ffff8000802cf7cc x15: 0000000000000001
x14: 1fffe00036801310 x13: 0000000000000000 x12: 0000000000000003
x11: 0000000000000001 x10: 0000000000000003 x9 : 0000000000000000
x8 : 0000000000bf0413 x7 : ffff800080461668 x6 : 0000000000000000
x5 : 0000000000000001 x4 : 0000000000000001 x3 : ffff80008ad5af48
x2 : 0000000000000000 x1 : ffff80008aecd7e0 x0 : ffff80012543a000
Call trace:
 __daif_local_irq_enable arch/arm64/include/asm/irqflags.h:27 [inline]
 arch_local_irq_enable+0x8/0xc arch/arm64/include/asm/irqflags.h:49
 cpuidle_idle_call kernel/sched/idle.c:170 [inline]
 do_idle+0x1f0/0x4e8 kernel/sched/idle.c:312
 cpu_startup_entry+0x5c/0x74 kernel/sched/idle.c:410
 rest_init+0x2dc/0x2f4 init/main.c:730
 start_kernel+0x0/0x4e8 init/main.c:827
 start_kernel+0x3e8/0x4e8 init/main.c:1072
 __primary_switched+0xb4/0xbc arch/arm64/kernel/head.S:523


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup

* Re: [syzbot] [batman?] BUG: soft lockup in sys_sendmsg
  2024-02-12 10:26 [syzbot] [batman?] BUG: soft lockup in sys_sendmsg syzbot
@ 2024-02-12 10:41 ` Eric Dumazet
  2024-02-12 11:23   ` Sven Eckelmann
  2024-02-12 13:28 ` Sven Eckelmann
  2024-03-21 18:26 ` [syzbot] [tipc?] " syzbot
  2 siblings, 1 reply; 6+ messages in thread
From: Eric Dumazet @ 2024-02-12 10:41 UTC
  To: syzbot
  Cc: a, b.a.t.m.a.n, davem, kuba, linux-kernel, mareklindner, netdev,
	pabeni, sven, sw, syzkaller-bugs

This patch [1] looks suspicious

I think batman-adv should reject too small MTU values.

[1]

commit d8e42a2b0addf238be8b3b37dcd9795a5c1be459
Author: Sven Eckelmann <sven@narfation.org>
Date:   Wed Jul 19 10:01:15 2023 +0200

    batman-adv: Don't increase MTU when set by user

    If the user set an MTU value, it usually means that there are special
    requirements for the MTU. But if an interface gets activated, the MTU was
    always recalculated and then the user set value was overwritten.

    The only reason why this user set value has to be overwritten, is when the
    MTU has to be decreased because batman-adv is not able to transfer packets
    with the user specified size.

    Fixes: c6c8fea29769 ("net: Add batman-adv meshing protocol")
    Cc: stable@vger.kernel.org
    Signed-off-by: Sven Eckelmann <sven@narfation.org>
    Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>

* Re: [syzbot] [batman?] BUG: soft lockup in sys_sendmsg
  2024-02-12 10:41 ` Eric Dumazet
@ 2024-02-12 11:23   ` Sven Eckelmann
  0 siblings, 0 replies; 6+ messages in thread
From: Sven Eckelmann @ 2024-02-12 11:23 UTC
  To: syzbot, Eric Dumazet
  Cc: a, b.a.t.m.a.n, davem, kuba, linux-kernel, mareklindner, netdev,
	pabeni, sw, syzkaller-bugs

On Monday, 12 February 2024 11:41:38 CET Eric Dumazet wrote:
> This patch [1] looks suspicious

It shouldn't be caused by this, but this might be another way to trigger
the problem. The problem would be visible even without it when an MTU is
explicitly set. But no reproducer is available, so I can't actually check
what is going on.

> I think batman-adv should reject too small MTU values.

You are referring to the size calculated by
batadv_tt_local_table_transmit_size(), right? And yes, I would agree that it
looks suspicious and might not have been correctly integrated in
batadv_max_header_len() when commit a19d3d85e1b8 ("batman-adv: limit local
translation table max size") introduced the code. But I think we also need to
remove interfaces again when receiving NETDEV_CHANGEMTU and an interface no
longer has a sufficiently large MTU. So I have to check how best to do
this.

Kind regards,
	Sven


* Re: [syzbot] [batman?] BUG: soft lockup in sys_sendmsg
  2024-02-12 10:26 [syzbot] [batman?] BUG: soft lockup in sys_sendmsg syzbot
  2024-02-12 10:41 ` Eric Dumazet
@ 2024-02-12 13:28 ` Sven Eckelmann
  2024-02-12 13:28   ` syzbot
  2024-03-21 18:26 ` [syzbot] [tipc?] " syzbot
  2 siblings, 1 reply; 6+ messages in thread
From: Sven Eckelmann @ 2024-02-12 13:28 UTC
  To: a, b.a.t.m.a.n, davem, edumazet, kuba, linux-kernel,
	mareklindner, netdev, pabeni, sw, syzkaller-bugs, syzbot

On Monday, 12 February 2024 11:26:24 CET syzbot wrote:
> syzbot found the following issue on:
> 
> HEAD commit:    41bccc98fb79 Linux 6.8-rc2
> git tree:       git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git for-kernelci
> console output: https://syzkaller.appspot.com/x/log.txt?x=14200118180000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=451a1e62b11ea4a6
> dashboard link: https://syzkaller.appspot.com/bug?extid=a6a4b5bb3da165594cff
> compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
> userspace arch: arm64
> 
> Unfortunately, I don't have any reproducer for this issue yet.
> 
> Downloadable assets:
> disk image: https://storage.googleapis.com/syzbot-assets/0772069e29cf/disk-41bccc98.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/659d3f0755b7/vmlinux-41bccc98.xz
> kernel image: https://storage.googleapis.com/syzbot-assets/7780a45c3e51/Image-41bccc98.gz.xz
> 
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+a6a4b5bb3da165594cff@syzkaller.appspotmail.com
> 

#syz test

From 5984ace8f8df7cf8d6f98ded0eebe7d962028992 Mon Sep 17 00:00:00 2001
From: Sven Eckelmann <sven@narfation.org>
Date: Mon, 12 Feb 2024 13:10:33 +0100
Subject: [PATCH] batman-adv: Avoid infinite loop trying to resize local TT

If the MTU of one of the attached interfaces becomes too small to transmit
the local translation table, then the table must be resized to fit inside
all fragments (when enabled) or a single packet.

But if the MTU becomes too low to transmit even the header + the VLAN
specific part, then the resizing of the local TT will never succeed. This
can happen, for example, when the usable space is 110 bytes and 11 VLANs
are on top of batman-adv. In this case, at least 116 bytes would be needed.
There will just be an endless spam of

   batman_adv: batadv0: Forced to purge local tt entries to fit new maximum fragment MTU (110)

in the log, but the function will never finish. The problem is that the
timeout is halved in each step until it stagnates at 0, and from then on
the table can never be reduced any further.

There are other possible scenarios with a similar result. For example, the
number of BATADV_TT_CLIENT_NOPURGE entries in the local TT can be too high
to fit inside a packet. Such a scenario can therefore also happen with only
a single VLAN + 7 non-purgeable addresses, requiring at least 120 bytes.

While this should be handled proactively when:

* interface with too low MTU is added
* VLAN is added
* non-purgeable local mac is added
* MTU of an attached interface is reduced
* fragmentation setting gets disabled (which most likely requires dropping
  attached interfaces)

not all of these scenarios can be prevented because batman-adv is only
consuming events without the possibility to prevent these actions
(non-purgeable MAC address added, MTU of an attached interface is reduced).
It is therefore necessary to make sure that the code can also handle
situations in which incompatible system configurations are already
present.

Cc: stable@vger.kernel.org
Fixes: a19d3d85e1b8 ("batman-adv: limit local translation table max size")
Reported-by: syzbot+a6a4b5bb3da165594cff@syzkaller.appspotmail.com
Signed-off-by: Sven Eckelmann <sven@narfation.org>
---
 net/batman-adv/translation-table.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/batman-adv/translation-table.c b/net/batman-adv/translation-table.c
index b95c36765d04..2243cec18ecc 100644
--- a/net/batman-adv/translation-table.c
+++ b/net/batman-adv/translation-table.c
@@ -3948,7 +3948,7 @@ void batadv_tt_local_resize_to_mtu(struct net_device *soft_iface)
 
 	spin_lock_bh(&bat_priv->tt.commit_lock);
 
-	while (true) {
+	while (timeout) {
 		table_size = batadv_tt_local_table_transmit_size(bat_priv);
 		if (packet_size_max >= table_size)
 			break;
-- 
2.39.2
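
As an illustration only (not part of the patch): the userspace sketch below
mimics the control flow of the loop changed above. All names (resize_sketch,
struct resize_result) are hypothetical, the purge step is reduced to a
comment, and the table size is held constant to model a local TT that cannot
shrink. The start value of 300000 used in the note after the code is an
assumption, not something stated in this mail.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical result type for this sketch; not a kernel structure. */
struct resize_result {
	int iterations;	/* number of purge attempts */
	bool fits;	/* true if the table fit before the timeout hit 0 */
};

/*
 * Userspace model of the patched loop. table_size stays constant, which
 * models a local TT that consists only of parts that cannot be purged.
 */
static struct resize_result resize_sketch(int packet_size_max, int table_size,
					  int timeout)
{
	struct resize_result res = { 0, false };

	assert(timeout >= 0);

	/* with "while (true)" this would spin forever once timeout is 0 */
	while (timeout) {
		if (packet_size_max >= table_size) {
			res.fits = true;
			break;
		}

		/* the real code purges local TT entries here */
		timeout /= 2;
		res.iterations++;
	}

	return res;
}
```

Calling resize_sketch(110, 116, 300000) returns after a bounded number of
halvings with fits == false. With "while (true)" the same inputs would never
leave the loop, which matches the soft lockup in the report.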

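The 116 and 120 byte figures in the commit message can be reproduced with a
little arithmetic. The sketch below rests on assumptions: the per-structure
sizes (20-byte unicast TVLV header, 4-byte TVLV header, 4-byte TT data
header, 8 bytes per VLAN, 12 bytes per TT entry) are not stated anywhere in
this thread and should be checked against the batman-adv packet definitions.
The constant and function names are hypothetical.

```c
/* Assumed on-wire sizes in bytes; verify against the batman-adv headers. */
enum {
	SKETCH_UNICAST_TVLV_LEN = 20,	/* unicast TVLV packet header */
	SKETCH_TVLV_HDR_LEN = 4,	/* generic TVLV header */
	SKETCH_TT_DATA_LEN = 4,		/* TT data header */
	SKETCH_TT_VLAN_LEN = 8,		/* per-VLAN part */
	SKETCH_TT_CHANGE_LEN = 12,	/* per local TT entry */
};

/* Bytes needed to transmit a local TT with the given VLAN and entry count. */
static int tt_transmit_size_sketch(int num_vlan, int num_entries)
{
	int hdr_size = SKETCH_UNICAST_TVLV_LEN + SKETCH_TVLV_HDR_LEN +
		       SKETCH_TT_DATA_LEN;

	hdr_size += num_vlan * SKETCH_TT_VLAN_LEN;

	return hdr_size + num_entries * SKETCH_TT_CHANGE_LEN;
}
```

Under these assumptions, 11 VLANs with no entries need 28 + 11 * 8 = 116
bytes, and a single VLAN with 7 non-purgeable entries needs
28 + 8 + 7 * 12 = 120 bytes. Both exceed the 110 usable bytes in the example.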


* Re: [syzbot] [batman?] BUG: soft lockup in sys_sendmsg
  2024-02-12 13:28 ` Sven Eckelmann
@ 2024-02-12 13:28   ` syzbot
  0 siblings, 0 replies; 6+ messages in thread
From: syzbot @ 2024-02-12 13:28 UTC
  To: sven
  Cc: a, b.a.t.m.a.n, davem, edumazet, kuba, linux-kernel,
	mareklindner, netdev, pabeni, sven, sw, syzkaller-bugs

> #syz test

This crash does not have a reproducer. I cannot test it.

* Re: [syzbot] [tipc?] [batman?] BUG: soft lockup in sys_sendmsg
  2024-02-12 10:26 [syzbot] [batman?] BUG: soft lockup in sys_sendmsg syzbot
  2024-02-12 10:41 ` Eric Dumazet
  2024-02-12 13:28 ` Sven Eckelmann
@ 2024-03-21 18:26 ` syzbot
  2 siblings, 0 replies; 6+ messages in thread
From: syzbot @ 2024-03-21 18:26 UTC
  To: a, b.a.t.m.a.n, davem, edumazet, jmaloy, kuba, linux-kernel,
	mareklindner, netdev, pabeni, sven, sw, syzkaller-bugs,
	tipc-discussion, ying.xue

syzbot has found a reproducer for the following issue on:

HEAD commit:    707081b61156 Merge branch 'for-next/core', remote-tracking..
git tree:       git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git for-kernelci
console output: https://syzkaller.appspot.com/x/log.txt?x=134d4fa5180000
kernel config:  https://syzkaller.appspot.com/x/.config?x=caeac3f3565b057a
dashboard link: https://syzkaller.appspot.com/bug?extid=a6a4b5bb3da165594cff
compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
userspace arch: arm64
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=139a4c81180000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=108b0ac9180000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/6cad68bf7532/disk-707081b6.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/1a27e5400778/vmlinux-707081b6.xz
kernel image: https://storage.googleapis.com/syzbot-assets/67dfc53755d0/Image-707081b6.gz.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+a6a4b5bb3da165594cff@syzkaller.appspotmail.com

watchdog: BUG: soft lockup - CPU#0 stuck for 27s! [syz-executor227:7772]
Modules linked in:
irq event stamp: 5373
hardirqs last  enabled at (5372): [<ffff80008ad68de8>] __exit_to_kernel_mode arch/arm64/kernel/entry-common.c:85 [inline]
hardirqs last  enabled at (5372): [<ffff80008ad68de8>] exit_to_kernel_mode+0xdc/0x10c arch/arm64/kernel/entry-common.c:95
hardirqs last disabled at (5373): [<ffff80008ad66a78>] __el1_irq arch/arm64/kernel/entry-common.c:533 [inline]
hardirqs last disabled at (5373): [<ffff80008ad66a78>] el1_interrupt+0x24/0x68 arch/arm64/kernel/entry-common.c:551
softirqs last  enabled at (542): [<ffff800088e9a56c>] spin_unlock_bh include/linux/spinlock.h:396 [inline]
softirqs last  enabled at (542): [<ffff800088e9a56c>] release_sock+0x154/0x1b8 net/core/sock.c:3547
softirqs last disabled at (548): [<ffff800088eaf8bc>] spin_lock_bh include/linux/spinlock.h:356 [inline]
softirqs last disabled at (548): [<ffff800088eaf8bc>] lock_sock_nested+0x74/0x11c net/core/sock.c:3526
CPU: 0 PID: 7772 Comm: syz-executor227 Not tainted 6.8.0-rc7-syzkaller-g707081b61156 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/29/2024
pstate: 00400005 (nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : queued_spin_lock_slowpath+0x15c/0xcf8 kernel/locking/qspinlock.c:383
lr : queued_spin_lock_slowpath+0x168/0xcf8 kernel/locking/qspinlock.c:383
sp : ffff800097ca76c0
x29: ffff800097ca7760 x28: 1fffe00018e1be6b x27: 1ffff00012f94ee4
x26: dfff800000000000 x25: 1fffe00018e1be6d x24: ffff800097ca76e0
x23: ffff800097ca7720 x22: ffff700012f94edc x21: 0000000000000001
x20: 0000000000000001 x19: ffff0000c70df358 x18: 0000000000000000
x17: 0000000000000000 x16: ffff8000809fd934 x15: 0000000000000001
x14: 1fffe00018e1be6b x13: 0000000000000000 x12: 0000000000000000
x11: ffff600018e1be6c x10: 1fffe00018e1be6b x9 : 0000000000000000
x8 : 0000000000000001 x7 : ffff800088eaf8bc x6 : 0000000000000000
x5 : 0000000000000000 x4 : 0000000000000001 x3 : ffff80008ae5db50
x2 : 0000000000000000 x1 : 0000000000000001 x0 : 0000000000000001
Call trace:
 __cmpwait_case_8 arch/arm64/include/asm/cmpxchg.h:229 [inline]
 __cmpwait arch/arm64/include/asm/cmpxchg.h:257 [inline]
 queued_spin_lock_slowpath+0x15c/0xcf8 kernel/locking/qspinlock.c:383
 queued_spin_lock include/asm-generic/qspinlock.h:114 [inline]
 do_raw_spin_lock+0x320/0x348 kernel/locking/spinlock_debug.c:116
 __raw_spin_lock_bh include/linux/spinlock_api_smp.h:127 [inline]
 _raw_spin_lock_bh+0x50/0x60 kernel/locking/spinlock.c:178
 spin_lock_bh include/linux/spinlock.h:356 [inline]
 lock_sock_nested+0x74/0x11c net/core/sock.c:3526
 lock_sock include/net/sock.h:1691 [inline]
 tipc_sendstream+0x50/0x84 net/tipc/socket.c:1550
 sock_sendmsg_nosec net/socket.c:730 [inline]
 __sock_sendmsg net/socket.c:745 [inline]
 ____sys_sendmsg+0x56c/0x840 net/socket.c:2584
 ___sys_sendmsg net/socket.c:2638 [inline]
 __sys_sendmsg+0x26c/0x33c net/socket.c:2667
 __do_sys_sendmsg net/socket.c:2676 [inline]
 __se_sys_sendmsg net/socket.c:2674 [inline]
 __arm64_sys_sendmsg+0x80/0x94 net/socket.c:2674
 __invoke_syscall arch/arm64/kernel/syscall.c:34 [inline]
 invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:48
 el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:133
 do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:152
 el0_svc+0x54/0x168 arch/arm64/kernel/entry-common.c:712
 el0t_64_sync_handler+0x84/0xfc arch/arm64/kernel/entry-common.c:730
 el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:598


---
If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.
