* [syzbot] [wireless?] possible deadlock in ieee80211_open
From: syzbot @ 2024-03-27 14:52 UTC
  To: davem, edumazet, johannes, kuba, linux-kernel, linux-wireless,
	netdev, pabeni, syzkaller-bugs

Hello,

syzbot found the following issue on:

HEAD commit:    237bb5f7f7f5 cxgb4: unnecessary check for 0 in the free_sg..
git tree:       net-next
console output: https://syzkaller.appspot.com/x/log.txt?x=113622a5180000
kernel config:  https://syzkaller.appspot.com/x/.config?x=6fb1be60a193d440
dashboard link: https://syzkaller.appspot.com/bug?extid=7526b1c2ce0b9a92e9a6
compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40

Unfortunately, I don't have any reproducer for this issue yet.

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/728c4d735738/disk-237bb5f7.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/fcd84ee276f5/vmlinux-237bb5f7.xz
kernel image: https://storage.googleapis.com/syzbot-assets/366f6292e769/bzImage-237bb5f7.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+7526b1c2ce0b9a92e9a6@syzkaller.appspotmail.com

netlink: 'syz-executor.0': attribute type 10 has an invalid length.
======================================================
WARNING: possible circular locking dependency detected
6.8.0-syzkaller-05204-g237bb5f7f7f5 #0 Not tainted
------------------------------------------------------
syz-executor.0/7478 is trying to acquire lock:
ffff888077110768 (&rdev->wiphy.mtx){+.+.}-{3:3}, at: wiphy_lock include/net/cfg80211.h:5951 [inline]
ffff888077110768 (&rdev->wiphy.mtx){+.+.}-{3:3}, at: ieee80211_open+0xe7/0x200 net/mac80211/iface.c:449

but task is already holding lock:
ffff888064974d20 (team->team_lock_key#17){+.+.}-{3:3}, at: team_add_slave+0xad/0x2750 drivers/net/team/team.c:1973

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #1 (team->team_lock_key#17){+.+.}-{3:3}:
       lock_acquire+0x1e4/0x530 kernel/locking/lockdep.c:5754
       __mutex_lock_common kernel/locking/mutex.c:608 [inline]
       __mutex_lock+0x136/0xd70 kernel/locking/mutex.c:752
       team_port_change_check+0x51/0x1e0 drivers/net/team/team.c:2995
       team_device_event+0x161/0x5b0 drivers/net/team/team.c:3021
       notifier_call_chain+0x18f/0x3b0 kernel/notifier.c:93
       call_netdevice_notifiers_extack net/core/dev.c:1988 [inline]
       call_netdevice_notifiers net/core/dev.c:2002 [inline]
       dev_close_many+0x33c/0x4c0 net/core/dev.c:1543
       unregister_netdevice_many_notify+0x544/0x16d0 net/core/dev.c:11071
       macvlan_device_event+0x7bc/0x850 drivers/net/macvlan.c:1828
       notifier_call_chain+0x18f/0x3b0 kernel/notifier.c:93
       call_netdevice_notifiers_extack net/core/dev.c:1988 [inline]
       call_netdevice_notifiers net/core/dev.c:2002 [inline]
       unregister_netdevice_many_notify+0xd96/0x16d0 net/core/dev.c:11096
       unregister_netdevice_many net/core/dev.c:11154 [inline]
       unregister_netdevice_queue+0x303/0x370 net/core/dev.c:11033
       unregister_netdevice include/linux/netdevice.h:3115 [inline]
       _cfg80211_unregister_wdev+0x162/0x560 net/wireless/core.c:1206
       ieee80211_if_remove+0x25d/0x3a0 net/mac80211/iface.c:2242
       ieee80211_del_iface+0x19/0x30 net/mac80211/cfg.c:202
       rdev_del_virtual_intf net/wireless/rdev-ops.h:62 [inline]
       cfg80211_remove_virtual_intf+0x230/0x3f0 net/wireless/util.c:2847
       genl_family_rcv_msg_doit net/netlink/genetlink.c:1113 [inline]
       genl_family_rcv_msg net/netlink/genetlink.c:1193 [inline]
       genl_rcv_msg+0xb14/0xec0 net/netlink/genetlink.c:1208
       netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2559
       genl_rcv+0x28/0x40 net/netlink/genetlink.c:1217
       netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline]
       netlink_unicast+0x7ea/0x980 net/netlink/af_netlink.c:1361
       netlink_sendmsg+0x8e1/0xcb0 net/netlink/af_netlink.c:1905
       sock_sendmsg_nosec net/socket.c:730 [inline]
       __sock_sendmsg+0x221/0x270 net/socket.c:745
       ____sys_sendmsg+0x525/0x7d0 net/socket.c:2584
       ___sys_sendmsg net/socket.c:2638 [inline]
       __sys_sendmsg+0x2b0/0x3a0 net/socket.c:2667
       do_syscall_64+0xfb/0x240
       entry_SYSCALL_64_after_hwframe+0x6d/0x75

-> #0 (&rdev->wiphy.mtx){+.+.}-{3:3}:
       check_prev_add kernel/locking/lockdep.c:3134 [inline]
       check_prevs_add kernel/locking/lockdep.c:3253 [inline]
       validate_chain+0x18cb/0x58e0 kernel/locking/lockdep.c:3869
       __lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137
       lock_acquire+0x1e4/0x530 kernel/locking/lockdep.c:5754
       __mutex_lock_common kernel/locking/mutex.c:608 [inline]
       __mutex_lock+0x136/0xd70 kernel/locking/mutex.c:752
       wiphy_lock include/net/cfg80211.h:5951 [inline]
       ieee80211_open+0xe7/0x200 net/mac80211/iface.c:449
       __dev_open+0x2d3/0x450 net/core/dev.c:1430
       dev_open+0xae/0x1b0 net/core/dev.c:1466
       team_port_add drivers/net/team/team.c:1214 [inline]
       team_add_slave+0x9b3/0x2750 drivers/net/team/team.c:1974
       do_set_master net/core/rtnetlink.c:2685 [inline]
       do_setlink+0xe70/0x41f0 net/core/rtnetlink.c:2891
       __rtnl_newlink net/core/rtnetlink.c:3680 [inline]
       rtnl_newlink+0x180b/0x20a0 net/core/rtnetlink.c:3727
       rtnetlink_rcv_msg+0x89b/0x10d0 net/core/rtnetlink.c:6595
       netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2559
       netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline]
       netlink_unicast+0x7ea/0x980 net/netlink/af_netlink.c:1361
       netlink_sendmsg+0x8e1/0xcb0 net/netlink/af_netlink.c:1905
       sock_sendmsg_nosec net/socket.c:730 [inline]
       __sock_sendmsg+0x221/0x270 net/socket.c:745
       ____sys_sendmsg+0x525/0x7d0 net/socket.c:2584
       ___sys_sendmsg net/socket.c:2638 [inline]
       __sys_sendmsg+0x2b0/0x3a0 net/socket.c:2667
       do_syscall_64+0xfb/0x240
       entry_SYSCALL_64_after_hwframe+0x6d/0x75

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(team->team_lock_key#17);
                               lock(&rdev->wiphy.mtx);
                               lock(team->team_lock_key#17);
  lock(&rdev->wiphy.mtx);

 *** DEADLOCK ***

2 locks held by syz-executor.0/7478:
 #0: ffffffff8f385a08 (rtnl_mutex){+.+.}-{3:3}, at: rtnl_lock net/core/rtnetlink.c:79 [inline]
 #0: ffffffff8f385a08 (rtnl_mutex){+.+.}-{3:3}, at: rtnetlink_rcv_msg+0x842/0x10d0 net/core/rtnetlink.c:6592
 #1: ffff888064974d20 (team->team_lock_key#17){+.+.}-{3:3}, at: team_add_slave+0xad/0x2750 drivers/net/team/team.c:1973

stack backtrace:
CPU: 0 PID: 7478 Comm: syz-executor.0 Not tainted 6.8.0-syzkaller-05204-g237bb5f7f7f5 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/29/2024
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0x1e7/0x2e0 lib/dump_stack.c:106
 check_noncircular+0x36a/0x4a0 kernel/locking/lockdep.c:2187
 check_prev_add kernel/locking/lockdep.c:3134 [inline]
 check_prevs_add kernel/locking/lockdep.c:3253 [inline]
 validate_chain+0x18cb/0x58e0 kernel/locking/lockdep.c:3869
 __lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137
 lock_acquire+0x1e4/0x530 kernel/locking/lockdep.c:5754
 __mutex_lock_common kernel/locking/mutex.c:608 [inline]
 __mutex_lock+0x136/0xd70 kernel/locking/mutex.c:752
 wiphy_lock include/net/cfg80211.h:5951 [inline]
 ieee80211_open+0xe7/0x200 net/mac80211/iface.c:449
 __dev_open+0x2d3/0x450 net/core/dev.c:1430
 dev_open+0xae/0x1b0 net/core/dev.c:1466
 team_port_add drivers/net/team/team.c:1214 [inline]
 team_add_slave+0x9b3/0x2750 drivers/net/team/team.c:1974
 do_set_master net/core/rtnetlink.c:2685 [inline]
 do_setlink+0xe70/0x41f0 net/core/rtnetlink.c:2891
 __rtnl_newlink net/core/rtnetlink.c:3680 [inline]
 rtnl_newlink+0x180b/0x20a0 net/core/rtnetlink.c:3727
 rtnetlink_rcv_msg+0x89b/0x10d0 net/core/rtnetlink.c:6595
 netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2559
 netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline]
 netlink_unicast+0x7ea/0x980 net/netlink/af_netlink.c:1361
 netlink_sendmsg+0x8e1/0xcb0 net/netlink/af_netlink.c:1905
 sock_sendmsg_nosec net/socket.c:730 [inline]
 __sock_sendmsg+0x221/0x270 net/socket.c:745
 ____sys_sendmsg+0x525/0x7d0 net/socket.c:2584
 ___sys_sendmsg net/socket.c:2638 [inline]
 __sys_sendmsg+0x2b0/0x3a0 net/socket.c:2667
 do_syscall_64+0xfb/0x240
 entry_SYSCALL_64_after_hwframe+0x6d/0x75
RIP: 0033:0x7fc81627dda9
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 e1 20 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fc81701e0c8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007fc8163abf80 RCX: 00007fc81627dda9
RDX: 0000000000000000 RSI: 0000000020000600 RDI: 0000000000000003
RBP: 00007fc8162ca47a R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 000000000000000b R14: 00007fc8163abf80 R15: 00007ffd5f0eb6a8
 </TASK>
team0: Port device wlan1 added


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup


* Re: [syzbot] [wireless?] possible deadlock in ieee80211_open
From: Johannes Berg @ 2024-03-28 22:37 UTC
  To: syzbot, davem, edumazet, kuba, linux-kernel, linux-wireless,
	netdev, pabeni, syzkaller-bugs

On Wed, 2024-03-27 at 07:52 -0700, syzbot wrote:
> 
> ======================================================
> WARNING: possible circular locking dependency detected
> 6.8.0-syzkaller-05204-g237bb5f7f7f5 #0 Not tainted
> ------------------------------------------------------
> syz-executor.0/7478 is trying to acquire lock:
> ffff888077110768 (&rdev->wiphy.mtx){+.+.}-{3:3}, at: wiphy_lock include/net/cfg80211.h:5951 [inline]
> ffff888077110768 (&rdev->wiphy.mtx){+.+.}-{3:3}, at: ieee80211_open+0xe7/0x200 net/mac80211/iface.c:449
> 
> but task is already holding lock:
> ffff888064974d20 (team->team_lock_key#17){+.+.}-{3:3}, at: team_add_slave+0xad/0x2750 drivers/net/team/team.c:1973
> 
> which lock already depends on the new lock.

Hmm.

> the existing dependency chain (in reverse order) is:
> 
> -> #1 (team->team_lock_key#17){+.+.}-{3:3}:
>        lock_acquire+0x1e4/0x530 kernel/locking/lockdep.c:5754
>        __mutex_lock_common kernel/locking/mutex.c:608 [inline]
>        __mutex_lock+0x136/0xd70 kernel/locking/mutex.c:752
>        team_port_change_check+0x51/0x1e0 drivers/net/team/team.c:2995
>        team_device_event+0x161/0x5b0 drivers/net/team/team.c:3021
>        notifier_call_chain+0x18f/0x3b0 kernel/notifier.c:93
>        call_netdevice_notifiers_extack net/core/dev.c:1988 [inline]
>        call_netdevice_notifiers net/core/dev.c:2002 [inline]
>        dev_close_many+0x33c/0x4c0 net/core/dev.c:1543
>        unregister_netdevice_many_notify+0x544/0x16d0 net/core/dev.c:11071
>        macvlan_device_event+0x7bc/0x850 drivers/net/macvlan.c:1828
>        notifier_call_chain+0x18f/0x3b0 kernel/notifier.c:93
>        call_netdevice_notifiers_extack net/core/dev.c:1988 [inline]
>        call_netdevice_notifiers net/core/dev.c:2002 [inline]
>        unregister_netdevice_many_notify+0xd96/0x16d0 net/core/dev.c:11096
>        unregister_netdevice_many net/core/dev.c:11154 [inline]
>        unregister_netdevice_queue+0x303/0x370 net/core/dev.c:11033
>        unregister_netdevice include/linux/netdevice.h:3115 [inline]
>        _cfg80211_unregister_wdev+0x162/0x560 net/wireless/core.c:1206
>        ieee80211_if_remove+0x25d/0x3a0 net/mac80211/iface.c:2242
>        ieee80211_del_iface+0x19/0x30 net/mac80211/cfg.c:202
>        rdev_del_virtual_intf net/wireless/rdev-ops.h:62 [inline]
>        cfg80211_remove_virtual_intf+0x230/0x3f0 net/wireless/util.c:2847

So this was the interface being removed via nl80211 (why do we even do
that? rtnetlink can do that too ...)

I guess it was a team port, since team_port_get_rtnl() must've been
non-NULL for this netdev. The notifier (team_port_change_check) then
acquires the team->lock mutex, while we hold the wiphy mutex around
unregister_netdevice().

> -> #0 (&rdev->wiphy.mtx){+.+.}-{3:3}:
>        check_prev_add kernel/locking/lockdep.c:3134 [inline]
>        check_prevs_add kernel/locking/lockdep.c:3253 [inline]
>        validate_chain+0x18cb/0x58e0 kernel/locking/lockdep.c:3869
>        __lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137
>        lock_acquire+0x1e4/0x530 kernel/locking/lockdep.c:5754
>        __mutex_lock_common kernel/locking/mutex.c:608 [inline]
>        __mutex_lock+0x136/0xd70 kernel/locking/mutex.c:752
>        wiphy_lock include/net/cfg80211.h:5951 [inline]
>        ieee80211_open+0xe7/0x200 net/mac80211/iface.c:449
>        __dev_open+0x2d3/0x450 net/core/dev.c:1430
>        dev_open+0xae/0x1b0 net/core/dev.c:1466
>        team_port_add drivers/net/team/team.c:1214 [inline]
>        team_add_slave+0x9b3/0x2750 drivers/net/team/team.c:1974
>        do_set_master net/core/rtnetlink.c:2685 [inline]
>        do_setlink+0xe70/0x41f0 net/core/rtnetlink.c:2891
>        __rtnl_newlink net/core/rtnetlink.c:3680 [inline]
>        rtnl_newlink+0x180b/0x20a0 net/core/rtnetlink.c:3727
>        rtnetlink_rcv_msg+0x89b/0x10d0 net/core/rtnetlink.c:6595

I guess this was actually adding it as a team slave/port, which acquired
the team->lock mutex, and then ieee80211_open() acquires the wiphy lock.
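
The two chains above record opposite acquisition orders for the same
pair of locks, which is what lockdep flags. A minimal sketch of that kind
of check, reduced to a depth-first search over "held -> acquired" edges,
might look like the following. The lock names come from the report; the
function name find_cycle and the two edge lists are illustrative
assumptions, and real lockdep tracks lock classes and contexts in far
more detail than this.

```python
from collections import defaultdict

# Simplified "held -> acquired" edges read off the two chains in the report.
# These lists are a hand-written model, not lockdep's internal state.
REMOVE_PATH = [("rtnl_mutex", "wiphy.mtx"), ("wiphy.mtx", "team->lock")]
ADD_PORT_PATH = [("rtnl_mutex", "team->lock"), ("team->lock", "wiphy.mtx")]


def find_cycle(edges):
    """Return one cycle in the lock-order graph as a list of names, or None."""
    graph = defaultdict(list)
    for held, acquired in edges:
        graph[held].append(acquired)

    visiting, done, path = set(), set(), []

    def dfs(node):
        visiting.add(node)
        path.append(node)
        for nxt in graph[node]:
            if nxt in visiting:
                # Back edge: the cycle is the tail of the current path.
                return path[path.index(nxt):] + [nxt]
            if nxt not in done:
                cycle = dfs(nxt)
                if cycle:
                    return cycle
        visiting.discard(node)
        done.add(node)
        path.pop()
        return None

    for start in list(graph):
        if start not in done:
            cycle = dfs(start)
            if cycle:
                return cycle
    return None


if __name__ == "__main__":
    print(find_cycle(REMOVE_PATH))                  # one path alone
    print(find_cycle(REMOVE_PATH + ADD_PORT_PATH))  # both paths combined
```

Either path on its own is acyclic; only the combination of both orders
closes the loop between the wiphy mutex and the team lock. Note that the
outer rtnl_mutex never appears inside the cycle.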

We _don't_ hold the wiphy mutex around dev_close() when invoked in this
path (see nl80211_del_interface), but regardless of how we delete the
interface, we will hold wiphy mutex around the unregister.

Thing is, I'm not sure I see a good way to avoid that? Maybe we could
defer the unregister, and just set the ieee80211_ptr to NULL to make it
effectively dead for wireless in the meantime. Not sure.

However, as far as I can tell it's not actually possible for the
deadlock to happen, because _both_ paths will necessarily be holding the
RTNL around them - from nl80211 (nl80211_del_interface has
NL80211_FLAG_NEED_RTNL) and rtnetlink_rcv_msg() respectively.

So ultimately, each side is holding its own mutex for internal reasons,
but given the outer RTNL, I don't see how this would really deadlock.
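
To illustrate the argument, here is a small userspace sketch, assuming
that every path takes the outer lock before either inner one. Two threads
take the inner locks in opposite orders, but both do so only while
holding a shared outer lock, so the inverted order can never interleave.
The names run_paths, remove_interface and add_team_port are hypothetical
stand-ins for the two kernel paths, and the sketch says nothing about
whether some other kernel path takes these mutexes without the RTNL.

```python
import threading


def run_paths(iterations=200):
    """Run both lock orders under a common outer lock; return who finished."""
    rtnl = threading.Lock()       # stands in for rtnl_mutex
    team_lock = threading.Lock()  # stands in for team->lock
    wiphy_mtx = threading.Lock()  # stands in for rdev->wiphy.mtx
    finished = []

    def remove_interface():
        # Order seen in chain #1: wiphy mutex, then team lock.
        for _ in range(iterations):
            with rtnl:
                with wiphy_mtx:
                    with team_lock:
                        pass
        finished.append("remove")

    def add_team_port():
        # Order seen in chain #0: team lock, then wiphy mutex.
        for _ in range(iterations):
            with rtnl:
                with team_lock:
                    with wiphy_mtx:
                        pass
        finished.append("add")

    threads = [
        threading.Thread(target=remove_interface, daemon=True),
        threading.Thread(target=add_team_port, daemon=True),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join(timeout=10)
    return sorted(finished)


if __name__ == "__main__":
    print(run_paths())
```

Dropping the outer "with rtnl:" from both functions would turn this into
a classic ABBA pattern that can hang, which is the scenario lockdep is
warning about without knowing the RTNL is always held.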

Given that, I'm inclined to ignore this, although it'd be nice to
silence lockdep about it somehow I guess?

johannes
