* [syzbot] [wireguard?] KASAN: slab-use-after-free Write in enqueue_timer
@ 2023-04-30 18:01 syzbot
  2023-05-23 15:46 ` Jason A. Donenfeld
  0 siblings, 1 reply; 18+ messages in thread
From: syzbot @ 2023-04-30 18:01 UTC (permalink / raw)
  To: Jason, davem, edumazet, kuba, linux-kernel, netdev, pabeni,
	syzkaller-bugs, wireguard

Hello,

syzbot found the following issue on:

HEAD commit:    825a0714d2b3 Merge tag 'efi-next-for-v6.4' of git://git.ke..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=17f56dc8280000
kernel config:  https://syzkaller.appspot.com/x/.config?x=7ecbb03c21601216
dashboard link: https://syzkaller.appspot.com/bug?extid=c2775460db0e1c70018e
compiler:       Debian clang version 15.0.7, GNU ld (GNU Binutils for Debian) 2.35.2

Unfortunately, I don't have any reproducer for this issue yet.

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/93b1af100ee7/disk-825a0714.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/3579f310db81/vmlinux-825a0714.xz
kernel image: https://storage.googleapis.com/syzbot-assets/0bd9cec144b8/bzImage-825a0714.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+c2775460db0e1c70018e@syzkaller.appspotmail.com

==================================================================
BUG: KASAN: slab-use-after-free in hlist_add_head include/linux/list.h:945 [inline]
BUG: KASAN: slab-use-after-free in enqueue_timer+0xad/0x560 kernel/time/timer.c:605
Write of size 8 at addr ffff88801ecc1500 by task kworker/0:11/5405

CPU: 0 PID: 5405 Comm: kworker/0:11 Not tainted 6.3.0-syzkaller-11733-g825a0714d2b3 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/14/2023
Workqueue: wg-crypt-wg1 wg_packet_decrypt_worker
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0x1e7/0x2d0 lib/dump_stack.c:106
 print_address_description mm/kasan/report.c:351 [inline]
 print_report+0x163/0x540 mm/kasan/report.c:462
 kasan_report+0x176/0x1b0 mm/kasan/report.c:572
 hlist_add_head include/linux/list.h:945 [inline]
 enqueue_timer+0xad/0x560 kernel/time/timer.c:605
 internal_add_timer kernel/time/timer.c:634 [inline]
 __mod_timer+0xa76/0xf40 kernel/time/timer.c:1131
 mod_peer_timer+0x158/0x220 drivers/net/wireguard/timers.c:37
 wg_packet_consume_data_done drivers/net/wireguard/receive.c:354 [inline]
 wg_packet_rx_poll+0xd9e/0x2250 drivers/net/wireguard/receive.c:474
 __napi_poll+0xc7/0x470 net/core/dev.c:6496
 napi_poll net/core/dev.c:6563 [inline]
 net_rx_action+0x78b/0x1010 net/core/dev.c:6696
 __do_softirq+0x2ab/0x908 kernel/softirq.c:571
 do_softirq+0x166/0x250 kernel/softirq.c:472
 </IRQ>
 <TASK>
 __local_bh_enable_ip+0x1b5/0x1f0 kernel/softirq.c:396
 spin_unlock_bh include/linux/spinlock.h:395 [inline]
 ptr_ring_consume_bh include/linux/ptr_ring.h:367 [inline]
 wg_packet_decrypt_worker+0xd40/0xde0 drivers/net/wireguard/receive.c:499
 process_one_work+0x8a0/0x10e0 kernel/workqueue.c:2405
 worker_thread+0xa63/0x1210 kernel/workqueue.c:2552
 kthread+0x2b8/0x350 kernel/kthread.c:379
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308
 </TASK>

Allocated by task 16792:
 kasan_save_stack mm/kasan/common.c:45 [inline]
 kasan_set_track+0x4f/0x70 mm/kasan/common.c:52
 ____kasan_kmalloc mm/kasan/common.c:374 [inline]
 __kasan_kmalloc+0x98/0xb0 mm/kasan/common.c:383
 kasan_kmalloc include/linux/kasan.h:196 [inline]
 __do_kmalloc_node mm/slab_common.c:966 [inline]
 __kmalloc_node+0xb8/0x230 mm/slab_common.c:973
 kmalloc_node include/linux/slab.h:579 [inline]
 kvmalloc_node+0x72/0x180 mm/util.c:604
 kvmalloc include/linux/slab.h:697 [inline]
 kvzalloc include/linux/slab.h:705 [inline]
 alloc_netdev_mqs+0x89/0xf30 net/core/dev.c:10626
 rtnl_create_link+0x2f7/0xc00 net/core/rtnetlink.c:3315
 rtnl_newlink_create net/core/rtnetlink.c:3433 [inline]
 __rtnl_newlink net/core/rtnetlink.c:3660 [inline]
 rtnl_newlink+0x1379/0x2010 net/core/rtnetlink.c:3673
 rtnetlink_rcv_msg+0x825/0xf40 net/core/rtnetlink.c:6395
 netlink_rcv_skb+0x1df/0x430 net/netlink/af_netlink.c:2546
 netlink_unicast_kernel net/netlink/af_netlink.c:1339 [inline]
 netlink_unicast+0x7c3/0x990 net/netlink/af_netlink.c:1365
 netlink_sendmsg+0xa2a/0xd60 net/netlink/af_netlink.c:1913
 sock_sendmsg_nosec net/socket.c:724 [inline]
 sock_sendmsg net/socket.c:747 [inline]
 __sys_sendto+0x475/0x630 net/socket.c:2144
 __do_sys_sendto net/socket.c:2156 [inline]
 __se_sys_sendto net/socket.c:2152 [inline]
 __x64_sys_sendto+0xde/0xf0 net/socket.c:2152
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

Freed by task 41:
 kasan_save_stack mm/kasan/common.c:45 [inline]
 kasan_set_track+0x4f/0x70 mm/kasan/common.c:52
 kasan_save_free_info+0x2b/0x40 mm/kasan/generic.c:521
 ____kasan_slab_free+0xd6/0x120 mm/kasan/common.c:236
 kasan_slab_free include/linux/kasan.h:162 [inline]
 slab_free_hook mm/slub.c:1781 [inline]
 slab_free_freelist_hook mm/slub.c:1807 [inline]
 slab_free mm/slub.c:3786 [inline]
 __kmem_cache_free+0x264/0x3c0 mm/slub.c:3799
 device_release+0x95/0x1c0
 kobject_cleanup lib/kobject.c:683 [inline]
 kobject_release lib/kobject.c:714 [inline]
 kref_put include/linux/kref.h:65 [inline]
 kobject_put+0x228/0x470 lib/kobject.c:731
 netdev_run_todo+0xe5a/0xf50 net/core/dev.c:10400
 default_device_exit_batch+0x5c9/0x630 net/core/dev.c:11392
 ops_exit_list net/core/net_namespace.c:175 [inline]
 cleanup_net+0x767/0xb80 net/core/net_namespace.c:614
 process_one_work+0x8a0/0x10e0 kernel/workqueue.c:2405
 worker_thread+0xa63/0x1210 kernel/workqueue.c:2552
 kthread+0x2b8/0x350 kernel/kthread.c:379
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308

Last potentially related work creation:
 kasan_save_stack+0x3f/0x60 mm/kasan/common.c:45
 __kasan_record_aux_stack+0xb0/0xc0 mm/kasan/generic.c:491
 insert_work+0x54/0x3d0 kernel/workqueue.c:1365
 __queue_work+0xb37/0xf10 kernel/workqueue.c:1526
 call_timer_fn+0x178/0x580 kernel/time/timer.c:1700
 expire_timers kernel/time/timer.c:1746 [inline]
 __run_timers+0x67a/0x860 kernel/time/timer.c:2022
 run_timer_softirq+0x67/0xf0 kernel/time/timer.c:2035
 __do_softirq+0x2ab/0x908 kernel/softirq.c:571

Second to last potentially related work creation:
 kasan_save_stack+0x3f/0x60 mm/kasan/common.c:45
 __kasan_record_aux_stack+0xb0/0xc0 mm/kasan/generic.c:491
 insert_work+0x54/0x3d0 kernel/workqueue.c:1365
 __queue_work+0xb37/0xf10 kernel/workqueue.c:1526
 call_timer_fn+0x178/0x580 kernel/time/timer.c:1700
 expire_timers kernel/time/timer.c:1746 [inline]
 __run_timers+0x67a/0x860 kernel/time/timer.c:2022
 run_timer_softirq+0x67/0xf0 kernel/time/timer.c:2035
 __do_softirq+0x2ab/0x908 kernel/softirq.c:571

The buggy address belongs to the object at ffff88801ecc0000
 which belongs to the cache kmalloc-cg-8k of size 8192
The buggy address is located 5376 bytes inside of
 freed 8192-byte region [ffff88801ecc0000, ffff88801ecc2000)

The buggy address belongs to the physical page:
page:ffffea00007b3000 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1ecc0
head:ffffea00007b3000 order:3 entire_mapcount:0 nr_pages_mapped:0 pincount:0
memcg:ffff88807621e8c1
flags: 0xfff00000010200(slab|head|node=0|zone=1|lastcpupid=0x7ff)
page_type: 0xffffffff()
raw: 00fff00000010200 ffff88801244f640 dead000000000122 0000000000000000
raw: 0000000000000000 0000000000020002 00000001ffffffff ffff88807621e8c1
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated
page last allocated via order 3, migratetype Unmovable, gfp_mask 0x1d60c0(__GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_RETRY_MAYFAIL|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC|__GFP_HARDWALL), pid 16792, tgid 16792 (syz-executor.2), ts 506275782663, free_ts 506274493341
 set_page_owner include/linux/page_owner.h:31 [inline]
 post_alloc_hook+0x1e6/0x210 mm/page_alloc.c:1722
 prep_new_page mm/page_alloc.c:1729 [inline]
 get_page_from_freelist+0x321c/0x33a0 mm/page_alloc.c:3493
 __alloc_pages+0x255/0x670 mm/page_alloc.c:4759
 alloc_slab_page+0x6a/0x160 mm/slub.c:1851
 allocate_slab mm/slub.c:1998 [inline]
 new_slab+0x84/0x2f0 mm/slub.c:2051
 ___slab_alloc+0xa85/0x10a0 mm/slub.c:3192
 __slab_alloc mm/slub.c:3291 [inline]
 __slab_alloc_node mm/slub.c:3344 [inline]
 slab_alloc_node mm/slub.c:3441 [inline]
 __kmem_cache_alloc_node+0x1b8/0x290 mm/slub.c:3490
 __do_kmalloc_node mm/slab_common.c:965 [inline]
 __kmalloc_node+0xa7/0x230 mm/slab_common.c:973
 kmalloc_node include/linux/slab.h:579 [inline]
 kvmalloc_node+0x72/0x180 mm/util.c:604
 kvmalloc include/linux/slab.h:697 [inline]
 kvzalloc include/linux/slab.h:705 [inline]
 alloc_netdev_mqs+0x89/0xf30 net/core/dev.c:10626
 rtnl_create_link+0x2f7/0xc00 net/core/rtnetlink.c:3315
 rtnl_newlink_create net/core/rtnetlink.c:3433 [inline]
 __rtnl_newlink net/core/rtnetlink.c:3660 [inline]
 rtnl_newlink+0x1379/0x2010 net/core/rtnetlink.c:3673
 rtnetlink_rcv_msg+0x825/0xf40 net/core/rtnetlink.c:6395
 netlink_rcv_skb+0x1df/0x430 net/netlink/af_netlink.c:2546
 netlink_unicast_kernel net/netlink/af_netlink.c:1339 [inline]
 netlink_unicast+0x7c3/0x990 net/netlink/af_netlink.c:1365
 netlink_sendmsg+0xa2a/0xd60 net/netlink/af_netlink.c:1913
page last free stack trace:
 reset_page_owner include/linux/page_owner.h:24 [inline]
 free_pages_prepare mm/page_alloc.c:1302 [inline]
 free_unref_page_prepare+0x903/0xa30 mm/page_alloc.c:2555
 free_unref_page+0x37/0x3f0 mm/page_alloc.c:2650
 qlist_free_all+0x22/0x60 mm/kasan/quarantine.c:185
 kasan_quarantine_reduce+0x14b/0x160 mm/kasan/quarantine.c:292
 ____kasan_kmalloc mm/kasan/common.c:340 [inline]
 __kasan_kmalloc+0x23/0xb0 mm/kasan/common.c:383
 kmalloc include/linux/slab.h:559 [inline]
 kzalloc include/linux/slab.h:680 [inline]
 ref_tracker_alloc+0x140/0x470 lib/ref_tracker.c:85
 register_netdevice+0x110b/0x1790 net/core/dev.c:10105
 ipcaif_newlink+0x1f0/0x4c0 net/caif/chnl_net.c:452
 rtnl_newlink_create net/core/rtnetlink.c:3443 [inline]
 __rtnl_newlink net/core/rtnetlink.c:3660 [inline]
 rtnl_newlink+0x1468/0x2010 net/core/rtnetlink.c:3673
 rtnetlink_rcv_msg+0x825/0xf40 net/core/rtnetlink.c:6395
 netlink_rcv_skb+0x1df/0x430 net/netlink/af_netlink.c:2546
 netlink_unicast_kernel net/netlink/af_netlink.c:1339 [inline]
 netlink_unicast+0x7c3/0x990 net/netlink/af_netlink.c:1365
 netlink_sendmsg+0xa2a/0xd60 net/netlink/af_netlink.c:1913
 sock_sendmsg_nosec net/socket.c:724 [inline]
 sock_sendmsg net/socket.c:747 [inline]
 __sys_sendto+0x475/0x630 net/socket.c:2144
 __do_sys_sendto net/socket.c:2156 [inline]
 __se_sys_sendto net/socket.c:2152 [inline]
 __x64_sys_sendto+0xde/0xf0 net/socket.c:2152
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80

Memory state around the buggy address:
 ffff88801ecc1400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff88801ecc1480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff88801ecc1500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                   ^
 ffff88801ecc1580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff88801ecc1600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the bug is already fixed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want to change the bug's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the bug is a duplicate of another bug, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup

* Re: [syzbot] [wireguard?] KASAN: slab-use-after-free Write in enqueue_timer
  2023-04-30 18:01 [syzbot] [wireguard?] KASAN: slab-use-after-free Write in enqueue_timer syzbot
@ 2023-05-23 15:46 ` Jason A. Donenfeld
  2023-05-23 16:05   ` Jakub Kicinski
  0 siblings, 1 reply; 18+ messages in thread
From: Jason A. Donenfeld @ 2023-05-23 15:46 UTC (permalink / raw)
  To: syzbot, edumazet, kuba, netdev, syzkaller-bugs
  Cc: davem, edumazet, kuba, linux-kernel, netdev, pabeni,
	syzkaller-bugs, wireguard, jann

Hey Syzkaller & Netdev folks,

I've been looking at this a bit and am slightly puzzled. At first I saw
this:

>  enqueue_timer+0xad/0x560 kernel/time/timer.c:605
>  internal_add_timer kernel/time/timer.c:634 [inline]
>  __mod_timer+0xa76/0xf40 kernel/time/timer.c:1131
>  mod_peer_timer+0x158/0x220 drivers/net/wireguard/timers.c:37
>  wg_packet_consume_data_done drivers/net/wireguard/receive.c:354 [inline]
>  wg_packet_rx_poll+0xd9e/0x2250 drivers/net/wireguard/receive.c:474

And I thought - darn, it's a bug where a struct wg_peer's timer is
modified -- in this case, timer_persistent_keepalive by way of
wg_timers_any_authenticated_packet_traversal() -- after the peer object
has been freed. This fits most clearly the designated line
receive.c:354, and the subsequent 8 byte write when enqueuing the timer.

So I traced through the peer shutdown code in peer.c -- the
peer_make_dead() + peer_remove_after_dead() combo -- and made sure the
peer->is_dead RCU logic was correct. And I couldn't find a bug.

But then I looked further down at the syzbot report:

> Allocated by task 16792:
>  kvzalloc include/linux/slab.h:705 [inline]
>  alloc_netdev_mqs+0x89/0xf30 net/core/dev.c:10626
>  rtnl_create_link+0x2f7/0xc00 net/core/rtnetlink.c:3315

and

> Freed by task 41:
>  __kmem_cache_free+0x264/0x3c0 mm/slub.c:3799
>  device_release+0x95/0x1c0
>  kobject_cleanup lib/kobject.c:683 [inline]
>  kobject_release lib/kobject.c:714 [inline]
>  kref_put include/linux/kref.h:65 [inline]
>  kobject_put+0x228/0x470 lib/kobject.c:731
>  netdev_run_todo+0xe5a/0xf50 net/core/dev.c:10400

So that means the memory in question is actually the one that's
allocated and freed by the networking stack. Specifically, dev.c:10626
is allocating a struct net_device with a trailing struct wg_device (its
priv_data). However, wg_device does not have any struct timer_lists in
it, and I don't see how net_device's watchdog_timer would be related to
the stacktrace which is clearly operating over a wg_peer timer.

So what on earth is going on here?

Jason

PS - Jakub, I have some WG fixes queued up for you, but I wanted to have
some resolution with this first before sending a tranche.

* Re: [syzbot] [wireguard?] KASAN: slab-use-after-free Write in enqueue_timer
  2023-05-23 15:46 ` Jason A. Donenfeld
@ 2023-05-23 16:05   ` Jakub Kicinski
  2023-05-23 16:12     ` Eric Dumazet
  2023-05-23 16:14     ` Jason A. Donenfeld
  0 siblings, 2 replies; 18+ messages in thread
From: Jakub Kicinski @ 2023-05-23 16:05 UTC (permalink / raw)
  To: Jason A. Donenfeld, edumazet
  Cc: syzbot, netdev, syzkaller-bugs, davem, linux-kernel, pabeni,
	wireguard, jann

On Tue, 23 May 2023 17:46:20 +0200 Jason A. Donenfeld wrote:
> > Freed by task 41:
> >  __kmem_cache_free+0x264/0x3c0 mm/slub.c:3799
> >  device_release+0x95/0x1c0
> >  kobject_cleanup lib/kobject.c:683 [inline]
> >  kobject_release lib/kobject.c:714 [inline]
> >  kref_put include/linux/kref.h:65 [inline]
> >  kobject_put+0x228/0x470 lib/kobject.c:731
> >  netdev_run_todo+0xe5a/0xf50 net/core/dev.c:10400  
> 
> So that means the memory in question is actually the one that's
> allocated and freed by the networking stack. Specifically, dev.c:10626
> is allocating a struct net_device with a trailing struct wg_device (its
> priv_data). However, wg_device does not have any struct timer_lists in
> it, and I don't see how net_device's watchdog_timer would be related to
> the stacktrace which is clearly operating over a wg_peer timer.
> 
> So what on earth is going on here?

Your timer had the pleasure of getting queued _after_ a dead watchdog
timer, no? IOW it tries to update the ->next pointer of a queued
watchdog timer. We should probably do:

diff --git a/net/core/dev.c b/net/core/dev.c
index 374d38fb8b9d..f3ed20ebcf5a 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -10389,6 +10389,8 @@ void netdev_run_todo(void)
                WARN_ON(rcu_access_pointer(dev->ip_ptr));
                WARN_ON(rcu_access_pointer(dev->ip6_ptr));
 
+               WARN_ON(timer_shutdown_sync(&dev->watchdog_timer));
+
                if (dev->priv_destructor)
                        dev->priv_destructor(dev);
                if (dev->needs_free_netdev)

to catch how that watchdog_timer is getting queued. Would that make
sense, Eric?

* Re: [syzbot] [wireguard?] KASAN: slab-use-after-free Write in enqueue_timer
  2023-05-23 16:05   ` Jakub Kicinski
@ 2023-05-23 16:12     ` Eric Dumazet
  2023-05-23 16:41       ` Jakub Kicinski
  2023-05-23 16:14     ` Jason A. Donenfeld
  1 sibling, 1 reply; 18+ messages in thread
From: Eric Dumazet @ 2023-05-23 16:12 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Jason A. Donenfeld, syzbot, netdev, syzkaller-bugs, davem,
	linux-kernel, pabeni, wireguard, jann

On Tue, May 23, 2023 at 6:05 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Tue, 23 May 2023 17:46:20 +0200 Jason A. Donenfeld wrote:
> > > Freed by task 41:
> > >  __kmem_cache_free+0x264/0x3c0 mm/slub.c:3799
> > >  device_release+0x95/0x1c0
> > >  kobject_cleanup lib/kobject.c:683 [inline]
> > >  kobject_release lib/kobject.c:714 [inline]
> > >  kref_put include/linux/kref.h:65 [inline]
> > >  kobject_put+0x228/0x470 lib/kobject.c:731
> > >  netdev_run_todo+0xe5a/0xf50 net/core/dev.c:10400
> >
> > So that means the memory in question is actually the one that's
> > allocated and freed by the networking stack. Specifically, dev.c:10626
> > is allocating a struct net_device with a trailing struct wg_device (its
> > priv_data). However, wg_device does not have any struct timer_lists in
> > it, and I don't see how net_device's watchdog_timer would be related to
> > the stacktrace which is clearly operating over a wg_peer timer.
> >
> > So what on earth is going on here?
>
> Your timer had the pleasure of getting queued _after_ a dead watchdog
> timer, no? IOW it tries to update the ->next pointer of a queued
> watchdog timer. We should probably do:
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 374d38fb8b9d..f3ed20ebcf5a 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -10389,6 +10389,8 @@ void netdev_run_todo(void)
>                 WARN_ON(rcu_access_pointer(dev->ip_ptr));
>                 WARN_ON(rcu_access_pointer(dev->ip6_ptr));
>
> +               WARN_ON(timer_shutdown_sync(&dev->watchdog_timer));
> +
>                 if (dev->priv_destructor)
>                         dev->priv_destructor(dev);
>                 if (dev->needs_free_netdev)
>
> to catch how that watchdog_timer is getting queued. Would that make
> sense, Eric?

Would this case be caught at the time the device is freed?

(CONFIG_DEBUG_OBJECTS_FREE=y or something)

* Re: [syzbot] [wireguard?] KASAN: slab-use-after-free Write in enqueue_timer
  2023-05-23 16:05   ` Jakub Kicinski
  2023-05-23 16:12     ` Eric Dumazet
@ 2023-05-23 16:14     ` Jason A. Donenfeld
  2023-05-23 16:46       ` Jakub Kicinski
  1 sibling, 1 reply; 18+ messages in thread
From: Jason A. Donenfeld @ 2023-05-23 16:14 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: edumazet, syzbot, netdev, syzkaller-bugs, davem, linux-kernel,
	pabeni, wireguard, jann

On Tue, May 23, 2023 at 09:05:12AM -0700, Jakub Kicinski wrote:
> On Tue, 23 May 2023 17:46:20 +0200 Jason A. Donenfeld wrote:
> > > Freed by task 41:
> > >  __kmem_cache_free+0x264/0x3c0 mm/slub.c:3799
> > >  device_release+0x95/0x1c0
> > >  kobject_cleanup lib/kobject.c:683 [inline]
> > >  kobject_release lib/kobject.c:714 [inline]
> > >  kref_put include/linux/kref.h:65 [inline]
> > >  kobject_put+0x228/0x470 lib/kobject.c:731
> > >  netdev_run_todo+0xe5a/0xf50 net/core/dev.c:10400  
> > 
> > So that means the memory in question is actually the one that's
> > allocated and freed by the networking stack. Specifically, dev.c:10626
> > is allocating a struct net_device with a trailing struct wg_device (its
> > priv_data). However, wg_device does not have any struct timer_lists in
> > it, and I don't see how net_device's watchdog_timer would be related to
> > the stacktrace which is clearly operating over a wg_peer timer.
> > 
> > So what on earth is going on here?
> 
> Your timer had the pleasure of getting queued _after_ a dead watchdog
> timer, no? IOW it tries to update the ->next pointer of a queued
> watchdog timer. 

Ahh, you're right! Specifically,

> hlist_add_head include/linux/list.h:945 [inline]
> enqueue_timer+0xad/0x560 kernel/time/timer.c:605

The write on line 945 refers to the side of the timer base, not the
peer's timer_list being queued. So indeed, the wireguard netdev is still
alive at this point, but it's being queued to a timer in a different
netdev that's already been freed (whether watchdog or otherwise in some
privdata). So, IOW, not a wireguard bug, right?

Jason
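
The point about line 945 can be sketched with a small userspace model of the hlist insert. This is an illustration only, assuming the usual mainline shape of hlist_add_head(); the `last_foreign_write` tracker is a hypothetical addition that does not exist in the kernel. It shows that adding a new node at the head also stores a pointer into the node that was previously first, so a stale node left in the bucket is what gets written to, not the node being queued.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model of the kernel's hlist types. */
struct hlist_node {
	struct hlist_node *next;
	struct hlist_node **pprev;
};

struct hlist_head {
	struct hlist_node *first;
};

/* Hypothetical instrumentation: the last pre-existing node that was written. */
static struct hlist_node *last_foreign_write;

/* Same steps as the kernel helper, minus WRITE_ONCE(). */
static void hlist_add_head(struct hlist_node *n, struct hlist_head *h)
{
	struct hlist_node *first = h->first;

	n->next = first;
	if (first) {
		/* Pointer-sized store into the OLD head node, not into n. */
		first->pprev = &n->next;
		last_foreign_write = first;
	}
	h->first = n;
	n->pprev = &h->first;
}
```

If the old head lives in memory that has already been freed, that pointer-sized store is the one KASAN would flag, which is consistent with the "Write of size 8" in the report even though the timer being queued is still valid.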

* Re: [syzbot] [wireguard?] KASAN: slab-use-after-free Write in enqueue_timer
  2023-05-23 16:12     ` Eric Dumazet
@ 2023-05-23 16:41       ` Jakub Kicinski
  2023-05-23 16:42         ` Jason A. Donenfeld
  0 siblings, 1 reply; 18+ messages in thread
From: Jakub Kicinski @ 2023-05-23 16:41 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Jason A. Donenfeld, syzbot, netdev, syzkaller-bugs, davem,
	linux-kernel, pabeni, wireguard, jann

On Tue, 23 May 2023 18:12:32 +0200 Eric Dumazet wrote:
> > Your timer had the pleasure of getting queued _after_ a dead watchdog
> > timer, no? IOW it tries to update the ->next pointer of a queued
> > watchdog timer. We should probably do:
> >
> > diff --git a/net/core/dev.c b/net/core/dev.c
> > index 374d38fb8b9d..f3ed20ebcf5a 100644
> > --- a/net/core/dev.c
> > +++ b/net/core/dev.c
> > @@ -10389,6 +10389,8 @@ void netdev_run_todo(void)
> >                 WARN_ON(rcu_access_pointer(dev->ip_ptr));
> >                 WARN_ON(rcu_access_pointer(dev->ip6_ptr));
> >
> > +               WARN_ON(timer_shutdown_sync(&dev->watchdog_timer));
> > +
> >                 if (dev->priv_destructor)
> >                         dev->priv_destructor(dev);
> >                 if (dev->needs_free_netdev)
> >
> > to catch how that watchdog_timer is getting queued. Would that make
> > sense, Eric?  
> 
> Would this case be caught at the time the device is freed?
> 
> (CONFIG_DEBUG_OBJECTS_FREE=y or something)

It should, no idea why it isn't. Looking thru the code now I don't see
any obvious gaps where timer object is on a list but not active :S
There's no way to get a vmcore from syzbot, right? :)

Also I thought the shutdown leads to a warning when someone tries to
schedule the dead timer but in fact add_timer() just exits cleanly.
So the shutdown won't help us find the culprit :(

* Re: [syzbot] [wireguard?] KASAN: slab-use-after-free Write in enqueue_timer
  2023-05-23 16:41       ` Jakub Kicinski
@ 2023-05-23 16:42         ` Jason A. Donenfeld
  2023-05-23 16:47           ` Jakub Kicinski
  0 siblings, 1 reply; 18+ messages in thread
From: Jason A. Donenfeld @ 2023-05-23 16:42 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Eric Dumazet, syzbot, netdev, syzkaller-bugs, davem,
	linux-kernel, pabeni, wireguard, jann

On Tue, May 23, 2023 at 6:41 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Tue, 23 May 2023 18:12:32 +0200 Eric Dumazet wrote:
> > > Your timer had the pleasure of getting queued _after_ a dead watchdog
> > > timer, no? IOW it tries to update the ->next pointer of a queued
> > > watchdog timer. We should probably do:
> > >
> > > diff --git a/net/core/dev.c b/net/core/dev.c
> > > index 374d38fb8b9d..f3ed20ebcf5a 100644
> > > --- a/net/core/dev.c
> > > +++ b/net/core/dev.c
> > > @@ -10389,6 +10389,8 @@ void netdev_run_todo(void)
> > >                 WARN_ON(rcu_access_pointer(dev->ip_ptr));
> > >                 WARN_ON(rcu_access_pointer(dev->ip6_ptr));
> > >
> > > +               WARN_ON(timer_shutdown_sync(&dev->watchdog_timer));
> > > +
> > >                 if (dev->priv_destructor)
> > >                         dev->priv_destructor(dev);
> > >                 if (dev->needs_free_netdev)
> > >
> > > to catch how that watchdog_timer is getting queued. Would that make
> > > sense, Eric?
> >
> > Would this case be caught at the time the device is freed?
> >
> > (CONFIG_DEBUG_OBJECTS_FREE=y or something)
>
> It should, no idea why it isn't. Looking thru the code now I don't see
> any obvious gaps where timer object is on a list but not active :S
> There's no way to get a vmcore from syzbot, right? :)
>
> Also I thought the shutdown leads to a warning when someone tries to
> schedule the dead timer but in fact add_timer() just exits cleanly.
> So the shutdown won't help us find the culprit :(

Worth noting that it could also be caused by adding to a dead timer
anywhere in priv_data of another netdev, not just the sole timer_list
in net_device.

* Re: [syzbot] [wireguard?] KASAN: slab-use-after-free Write in enqueue_timer
  2023-05-23 16:14     ` Jason A. Donenfeld
@ 2023-05-23 16:46       ` Jakub Kicinski
  2023-05-23 16:47         ` Jason A. Donenfeld
  0 siblings, 1 reply; 18+ messages in thread
From: Jakub Kicinski @ 2023-05-23 16:46 UTC (permalink / raw)
  To: Jason A. Donenfeld
  Cc: edumazet, syzbot, netdev, syzkaller-bugs, davem, linux-kernel,
	pabeni, wireguard, jann

On Tue, 23 May 2023 18:14:18 +0200 Jason A. Donenfeld wrote:
> So, IOW, not a wireguard bug, right?

What's slightly concerning is that there aren't any other timers
leading to

  KASAN: slab-use-after-free Write in enqueue_timer

:( If WG was just an innocent bystander there should be, right?

* Re: [syzbot] [wireguard?] KASAN: slab-use-after-free Write in enqueue_timer
  2023-05-23 16:42         ` Jason A. Donenfeld
@ 2023-05-23 16:47           ` Jakub Kicinski
  2023-05-23 17:01             ` Jason A. Donenfeld
  0 siblings, 1 reply; 18+ messages in thread
From: Jakub Kicinski @ 2023-05-23 16:47 UTC (permalink / raw)
  To: Jason A. Donenfeld
  Cc: Eric Dumazet, syzbot, netdev, syzkaller-bugs, davem,
	linux-kernel, pabeni, wireguard, jann

On Tue, 23 May 2023 18:42:53 +0200 Jason A. Donenfeld wrote:
> > It should, no idea why it isn't. Looking thru the code now I don't see
> > any obvious gaps where timer object is on a list but not active :S
> > There's no way to get a vmcore from syzbot, right? :)
> >
> > Also I thought the shutdown leads to a warning when someone tries to
> > schedule the dead timer but in fact add_timer() just exits cleanly.
> > So the shutdown won't help us find the culprit :(  
> 
> Worth noting that it could also be caused by adding to a dead timer
> anywhere in priv_data of another netdev, not just the sole timer_list
> in net_device.

Oh, I thought you zeroed in on the watchdog based on offsets.
Still, object debug should track all timers in the slab and complain
on the free path.

* Re: [syzbot] [wireguard?] KASAN: slab-use-after-free Write in enqueue_timer
  2023-05-23 16:46       ` Jakub Kicinski
@ 2023-05-23 16:47         ` Jason A. Donenfeld
  2023-05-23 17:16           ` Jason A. Donenfeld
  0 siblings, 1 reply; 18+ messages in thread
From: Jason A. Donenfeld @ 2023-05-23 16:47 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: edumazet, syzbot, netdev, syzkaller-bugs, davem, linux-kernel,
	pabeni, wireguard, jann

On Tue, May 23, 2023 at 6:46 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Tue, 23 May 2023 18:14:18 +0200 Jason A. Donenfeld wrote:
> > So, IOW, not a wireguard bug, right?
>
> What's slightly concerning is that there aren't any other timers
> leading to
>
>   KASAN: slab-use-after-free Write in enqueue_timer
>
> :( If WG was just an innocent bystander there should be, right?

Well, WG does mod this timer for every single packet in its RX path.
So that's bound to turn things up I suppose.

* Re: [syzbot] [wireguard?] KASAN: slab-use-after-free Write in enqueue_timer
  2023-05-23 16:47           ` Jakub Kicinski
@ 2023-05-23 17:01             ` Jason A. Donenfeld
  2023-05-23 17:05               ` Eric Dumazet
  0 siblings, 1 reply; 18+ messages in thread
From: Jason A. Donenfeld @ 2023-05-23 17:01 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Eric Dumazet, syzbot, netdev, syzkaller-bugs, davem,
	linux-kernel, pabeni, wireguard, jann

On Tue, May 23, 2023 at 09:47:36AM -0700, Jakub Kicinski wrote:
> On Tue, 23 May 2023 18:42:53 +0200 Jason A. Donenfeld wrote:
> > > It should, no idea why it isn't. Looking thru the code now I don't see
> > > any obvious gaps where timer object is on a list but not active :S
> > > There's no way to get a vmcore from syzbot, right? :)
> > >
> > > Also I thought the shutdown leads to a warning when someone tries to
> > > schedule the dead timer but in fact add_timer() just exits cleanly.
> > > So the shutdown won't help us find the culprit :(  
> > 
> > Worth noting that it could also be caused by adding to a dead timer
> > anywhere in priv_data of another netdev, not just the sole timer_list
> > in net_device.
> 
> Oh, I thought you zeroed in on the watchdog based on offsets.
> Still, object debug should track all timers in the slab and complain
> on the free path.

No, I mentioned watchdog because it's the only timer_list in struct
net_device.

Offset analysis is an interesting idea though. Look at this:

> The buggy address belongs to the object at ffff88801ecc0000
>  which belongs to the cache kmalloc-cg-8k of size 8192
> The buggy address is located 5376 bytes inside of
>  freed 8192-byte region [ffff88801ecc0000, ffff88801ecc2000)

IDA says that for syzkaller's vmlinux, net_device has a size of 0xc80
and wg_device has a size of 0x880. 0xc80+0x880=5376. Coincidence that
the address offset is just after what wg uses?

Hm.

Jason
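
The offset arithmetic above can be checked mechanically. The sketch below uses only numbers quoted in this thread: the object start and buggy address from the KASAN report, and the two struct sizes that IDA reported for syzkaller's vmlinux. The macro and function names are hypothetical, and it assumes the private area starts directly after struct net_device with no extra padding.

```c
#include <assert.h>
#include <stdint.h>

/* Addresses from the KASAN report. */
#define OBJECT_START  UINT64_C(0xffff88801ecc0000)
#define BUGGY_ADDR    UINT64_C(0xffff88801ecc1500)

/* Sizes quoted from IDA for syzkaller's vmlinux. */
#define NET_DEVICE_SZ UINT64_C(0xc80)
#define WG_DEVICE_SZ  UINT64_C(0x880)

/* Byte offset of an address inside its slab object. */
static uint64_t offset_in_object(uint64_t addr, uint64_t start)
{
	return addr - start;
}

/* First byte past net_device plus the wg_device private area
 * (assumption: no padding between the two). */
static uint64_t end_of_wg_priv(void)
{
	return NET_DEVICE_SZ + WG_DEVICE_SZ;
}
```

Both functions evaluate to 5376 (0x1500), so the flagged write lands at the first byte after where a wg_device would end in an allocation of that layout. That is only suggestive, since the freed object need not be a wireguard netdev.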

* Re: [syzbot] [wireguard?] KASAN: slab-use-after-free Write in enqueue_timer
  2023-05-23 17:01             ` Jason A. Donenfeld
@ 2023-05-23 17:05               ` Eric Dumazet
  2023-05-23 17:07                 ` Eric Dumazet
  0 siblings, 1 reply; 18+ messages in thread
From: Eric Dumazet @ 2023-05-23 17:05 UTC (permalink / raw)
  To: Jason A. Donenfeld
  Cc: Jakub Kicinski, syzbot, netdev, syzkaller-bugs, davem,
	linux-kernel, pabeni, wireguard, jann

On Tue, May 23, 2023 at 7:01 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
>
> On Tue, May 23, 2023 at 09:47:36AM -0700, Jakub Kicinski wrote:
> > On Tue, 23 May 2023 18:42:53 +0200 Jason A. Donenfeld wrote:
> > > > It should, no idea why it isn't. Looking thru the code now I don't see
> > > > any obvious gaps where timer object is on a list but not active :S
> > > > There's no way to get a vmcore from syzbot, right? :)
> > > >
> > > > Also I thought the shutdown leads to a warning when someone tries to
> > > > schedule the dead timer but in fact add_timer() just exits cleanly.
> > > > So the shutdown won't help us find the culprit :(
> > >
> > > Worth noting that it could also be caused by adding to a dead timer
> > > anywhere in priv_data of another netdev, not just the sole timer_list
> > > in net_device.
> >
> > Oh, I thought you zeroed in on the watchdog based on offsets.
> > Still, object debug should track all timers in the slab and complain
> > on the free path.
>
> No, I mentioned watchdog because it's the only timer_list in struct
> net_device.
>
> Offset analysis is an interesting idea though. Look at this:
>
> > The buggy address belongs to the object at ffff88801ecc0000
> >  which belongs to the cache kmalloc-cg-8k of size 8192
> > The buggy address is located 5376 bytes inside of
> >  freed 8192-byte region [ffff88801ecc0000, ffff88801ecc2000)
>
> IDA says that for syzkaller's vmlinux, net_device has a size of 0xc80
> and wg_device has a size of 0x880. 0xc80+0x880=5376. Coincidence that
> the address offset is just after what wg uses?


Note that the syzkaller report mentioned:

alloc_netdev_mqs+0x89/0xf30 net/core/dev.c:10626
 usbnet_probe+0x196/0x2770 drivers/net/usb/usbnet.c:1698
 usb_probe_interface+0x5c4/0xb00 drivers/usb/core/driver.c:396
 really_probe+0x294/0xc30 drivers/base/dd.c:658
 __driver_probe_device+0x1a2/0x3d0 drivers/base/dd.c:800
 driver_probe_device+0x50/0x420 drivers/base/dd.c:830
 __device_attach_driver+0x2d3/0x520 drivers/base/dd.c:958

So maybe a usbnet driver has a timer_list in its priv_data.


* Re: [syzbot] [wireguard?] KASAN: slab-use-after-free Write in enqueue_timer
  2023-05-23 17:05               ` Eric Dumazet
@ 2023-05-23 17:07                 ` Eric Dumazet
  2023-05-24  8:24                   ` Dmitry Vyukov
  0 siblings, 1 reply; 18+ messages in thread
From: Eric Dumazet @ 2023-05-23 17:07 UTC (permalink / raw)
  To: Jason A. Donenfeld
  Cc: Jakub Kicinski, syzbot, netdev, syzkaller-bugs, davem,
	linux-kernel, pabeni, wireguard, jann

On Tue, May 23, 2023 at 7:05 PM Eric Dumazet <edumazet@google.com> wrote:
>
> On Tue, May 23, 2023 at 7:01 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
> >
> > On Tue, May 23, 2023 at 09:47:36AM -0700, Jakub Kicinski wrote:
> > > On Tue, 23 May 2023 18:42:53 +0200 Jason A. Donenfeld wrote:
> > > > > It should, no idea why it isn't. Looking thru the code now I don't see
> > > > > any obvious gaps where timer object is on a list but not active :S
> > > > > There's no way to get a vmcore from syzbot, right? :)
> > > > >
> > > > > Also I thought the shutdown leads to a warning when someone tries to
> > > > > schedule the dead timer but in fact add_timer() just exits cleanly.
> > > > > So the shutdown won't help us find the culprit :(
> > > >
> > > > Worth noting that it could also be caused by adding to a dead timer
> > > > anywhere in priv_data of another netdev, not just the sole timer_list
> > > > in net_device.
> > >
> > > Oh, I thought you zero'ed in on the watchdog based on offsets.
> > > Still, object debug should track all timers in the slab and complain
> > > on the free path.
> >
> > No, I mentioned watchdog because it's the only timer_list in struct
> > net_device.
> >
> > Offset analysis is an interesting idea though. Look at this:
> >
> > > The buggy address belongs to the object at ffff88801ecc0000
> > >  which belongs to the cache kmalloc-cg-8k of size 8192
> > > The buggy address is located 5376 bytes inside of
> > >  freed 8192-byte region [ffff88801ecc0000, ffff88801ecc2000)
> >
> > IDA says that for syzkaller's vmlinux, net_device has a size of 0xc80
> > and wg_device has a size of 0x880. 0xc80+0x880=5376. Coincidence that
> > the address offset is just after what wg uses?
>
>
> Note that the syzkaller report mentioned:
>
> alloc_netdev_mqs+0x89/0xf30 net/core/dev.c:10626
>  usbnet_probe+0x196/0x2770 drivers/net/usb/usbnet.c:1698
>  usb_probe_interface+0x5c4/0xb00 drivers/usb/core/driver.c:396
>  really_probe+0x294/0xc30 drivers/base/dd.c:658
>  __driver_probe_device+0x1a2/0x3d0 drivers/base/dd.c:800
>  driver_probe_device+0x50/0x420 drivers/base/dd.c:830
>  __device_attach_driver+0x2d3/0x520 drivers/base/dd.c:958
>
> So maybe a usbnet driver has a timer_list in its priv_data.

struct usbnet {
	...
	struct timer_list	delay;
	...
};


* Re: [syzbot] [wireguard?] KASAN: slab-use-after-free Write in enqueue_timer
  2023-05-23 16:47         ` Jason A. Donenfeld
@ 2023-05-23 17:16           ` Jason A. Donenfeld
  2023-05-23 17:28             ` Jason A. Donenfeld
  0 siblings, 1 reply; 18+ messages in thread
From: Jason A. Donenfeld @ 2023-05-23 17:16 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: edumazet, syzbot, netdev, syzkaller-bugs, davem, linux-kernel,
	pabeni, wireguard, jann

On Tue, May 23, 2023 at 06:47:41PM +0200, Jason A. Donenfeld wrote:
> On Tue, May 23, 2023 at 6:46 PM Jakub Kicinski <kuba@kernel.org> wrote:
> >
> > On Tue, 23 May 2023 18:14:18 +0200 Jason A. Donenfeld wrote:
> > > So, IOW, not a wireguard bug, right?
> >
> > What's slightly concerning is that there aren't any other timers
> > leading to
> >
> >   KASAN: slab-use-after-free Write in enqueue_timer
> >
> > :( If WG was just an innocent bystander there should be, right?
> 
> Well, WG does mod this timer for every single packet in its RX path.
> So that's bound to turn things up I suppose.

Here's one that is seemingly the same -- enqueuing a timer to a freed
base -- with the allocation and free being the same netdev core
function, but the UaF trigger for it is a JBD2 transaction thing:
https://syzkaller.appspot.com/text?tag=CrashReport&x=17dd2446280000
No WG at all in it, but there's still the mysterious 5376 value...

Jason


* Re: [syzbot] [wireguard?] KASAN: slab-use-after-free Write in enqueue_timer
  2023-05-23 17:16           ` Jason A. Donenfeld
@ 2023-05-23 17:28             ` Jason A. Donenfeld
  0 siblings, 0 replies; 18+ messages in thread
From: Jason A. Donenfeld @ 2023-05-23 17:28 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: edumazet, syzbot, netdev, syzkaller-bugs, davem, linux-kernel,
	pabeni, wireguard, jann

On Tue, May 23, 2023 at 07:16:20PM +0200, Jason A. Donenfeld wrote:
> On Tue, May 23, 2023 at 06:47:41PM +0200, Jason A. Donenfeld wrote:
> > On Tue, May 23, 2023 at 6:46 PM Jakub Kicinski <kuba@kernel.org> wrote:
> > >
> > > On Tue, 23 May 2023 18:14:18 +0200 Jason A. Donenfeld wrote:
> > > > So, IOW, not a wireguard bug, right?
> > >
> > > What's slightly concerning is that there aren't any other timers
> > > leading to
> > >
> > >   KASAN: slab-use-after-free Write in enqueue_timer
> > >
> > > :( If WG was just an innocent bystander there should be, right?
> > 
> > Well, WG does mod this timer for every single packet in its RX path.
> > So that's bound to turn things up I suppose.
> 
> Here's one that is seemingly the same -- enqueuing a timer to a freed
> base -- with the allocation and free being the same netdev core
> function, but the UaF trigger for it is a JBD2 transaction thing:
> https://syzkaller.appspot.com/text?tag=CrashReport&x=17dd2446280000
> No WG at all in it, but there's still the mysterious 5376 value...

In this one, you see the free happens in some infiniband code.  Looking
at ipoib_dev_priv, and going to the member at net_device+ipoib_dev_priv,
we get this at 5320:

        struct delayed_work        neigh_reap_task;

5376-5320=56, which doesn't quite put us at the timer_list. Close but no
cigar?
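
Whether byte 56 of a delayed_work falls inside its embedded timer_list depends on how large the preceding work_struct is, and that varies with debugging options. The Python sketch below models this with ctypes. The struct layouts are simplified stand-ins, not the real kernel definitions; lockdep_map_size is a hypothetical knob, and the value 40 is only a placeholder rather than the size in syzkaller's build. The helper names make_delayed_work and member_at are invented here. Offsets read from the actual vmlinux remain the authority.

```python
import ctypes

U64 = ctypes.c_uint64  # stand-in for pointers and unsigned long on x86-64


class HlistNode(ctypes.Structure):
    _fields_ = [("next", U64), ("pprev", U64)]


class ListHead(ctypes.Structure):
    _fields_ = [("next", U64), ("prev", U64)]


def make_delayed_work(lockdep_map_size):
    """Build a simplified delayed_work model.

    lockdep_map_size is a hypothetical byte count for the debug-only
    lockdep_map member; 0 models a build without lock debugging.
    """
    debug = []
    if lockdep_map_size:
        debug = [("lockdep_map", ctypes.c_uint8 * lockdep_map_size)]

    class WorkStruct(ctypes.Structure):
        _fields_ = [("data", U64), ("entry", ListHead), ("func", U64)] + debug

    class TimerList(ctypes.Structure):
        _fields_ = [("entry", HlistNode), ("expires", U64),
                    ("function", U64), ("flags", ctypes.c_uint32)] + debug

    class DelayedWork(ctypes.Structure):
        _fields_ = [("work", WorkStruct), ("timer", TimerList),
                    ("wq", U64), ("cpu", ctypes.c_int32)]

    return DelayedWork


def member_at(struct_cls, offset):
    """Name the top-level member of struct_cls that covers byte offset."""
    for name, _ctype in struct_cls._fields_:
        field = getattr(struct_cls, name)
        if field.offset <= offset < field.offset + field.size:
            return name
    return "(padding or out of range)"


# Faulting offset minus the offset of neigh_reap_task, both quoted above.
DELTA = 5376 - 5320

for size in (0, 40):
    dw = make_delayed_work(size)
    print(f"lockdep_map_size={size}: timer at +{dw.timer.offset}, "
          f"byte {DELTA} falls in '{member_at(dw, DELTA)}'")
```

In this model the answer flips with the assumed debug padding, so a 56-byte delta alone neither confirms nor rules out the timer.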


* Re: [syzbot] [wireguard?] KASAN: slab-use-after-free Write in enqueue_timer
  2023-05-23 17:07                 ` Eric Dumazet
@ 2023-05-24  8:24                   ` Dmitry Vyukov
  2023-05-24 15:33                     ` Jakub Kicinski
  0 siblings, 1 reply; 18+ messages in thread
From: Dmitry Vyukov @ 2023-05-24  8:24 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Jason A. Donenfeld, Jakub Kicinski, syzbot, netdev,
	syzkaller-bugs, davem, linux-kernel, pabeni, wireguard, jann

On Tue, 23 May 2023 at 19:07, 'Eric Dumazet' via syzkaller-bugs
<syzkaller-bugs@googlegroups.com> wrote:
>
> On Tue, May 23, 2023 at 7:05 PM Eric Dumazet <edumazet@google.com> wrote:
> >
> > On Tue, May 23, 2023 at 7:01 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
> > >
> > > On Tue, May 23, 2023 at 09:47:36AM -0700, Jakub Kicinski wrote:
> > > > On Tue, 23 May 2023 18:42:53 +0200 Jason A. Donenfeld wrote:
> > > > > > It should, no idea why it isn't. Looking thru the code now I don't see
> > > > > > any obvious gaps where timer object is on a list but not active :S
> > > > > > There's no way to get a vmcore from syzbot, right? :)
> > > > > >
> > > > > > Also I thought the shutdown leads to a warning when someone tries to
> > > > > > schedule the dead timer but in fact add_timer() just exits cleanly.
> > > > > > So the shutdown won't help us find the culprit :(
> > > > >
> > > > > Worth noting that it could also be caused by adding to a dead timer
> > > > > anywhere in priv_data of another netdev, not just the sole timer_list
> > > > > in net_device.
> > > >
> > > > Oh, I thought you zero'ed in on the watchdog based on offsets.
> > > > Still, object debug should track all timers in the slab and complain
> > > > on the free path.
> > >
> > > No, I mentioned watchdog because it's the only timer_list in struct
> > > net_device.
> > >
> > > Offset analysis is an interesting idea though. Look at this:
> > >
> > > > The buggy address belongs to the object at ffff88801ecc0000
> > > >  which belongs to the cache kmalloc-cg-8k of size 8192
> > > > The buggy address is located 5376 bytes inside of
> > > >  freed 8192-byte region [ffff88801ecc0000, ffff88801ecc2000)
> > >
> > > IDA says that for syzkaller's vmlinux, net_device has a size of 0xc80
> > > and wg_device has a size of 0x880. 0xc80+0x880=5376. Coincidence that
> > > the address offset is just after what wg uses?
> >
> >
> > Note that the syzkaller report mentioned:
> >
> > alloc_netdev_mqs+0x89/0xf30 net/core/dev.c:10626
> >  usbnet_probe+0x196/0x2770 drivers/net/usb/usbnet.c:1698
> >  usb_probe_interface+0x5c4/0xb00 drivers/usb/core/driver.c:396
> >  really_probe+0x294/0xc30 drivers/base/dd.c:658
> >  __driver_probe_device+0x1a2/0x3d0 drivers/base/dd.c:800
> >  driver_probe_device+0x50/0x420 drivers/base/dd.c:830
> >  __device_attach_driver+0x2d3/0x520 drivers/base/dd.c:958
> >
> > So maybe a usbnet driver has a timer_list in its priv_data.
>
> struct usbnet {
> ...
> struct timer_list   delay;

FWIW There are more report examples on the dashboard.
There are some that mention neither wireguard nor usbnet, e.g.:
https://syzkaller.appspot.com/text?tag=CrashReport&x=17dd2446280000
So that's probably a red herring. But they all seem to mention alloc_netdev_mqs.
Let's do for now:
#syz set subsystems: net


* Re: [syzbot] [wireguard?] KASAN: slab-use-after-free Write in enqueue_timer
  2023-05-24  8:24                   ` Dmitry Vyukov
@ 2023-05-24 15:33                     ` Jakub Kicinski
  2023-05-24 15:39                       ` Jakub Kicinski
  0 siblings, 1 reply; 18+ messages in thread
From: Jakub Kicinski @ 2023-05-24 15:33 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Eric Dumazet, Jason A. Donenfeld, syzbot, netdev, syzkaller-bugs,
	davem, linux-kernel, pabeni, wireguard, jann

On Wed, 24 May 2023 10:24:31 +0200 Dmitry Vyukov wrote:
> FWIW There are more report examples on the dashboard.
> There are some that mention neither wireguard nor usbnet, e.g.:
> https://syzkaller.appspot.com/text?tag=CrashReport&x=17dd2446280000
> So that's probably a red herring. But they all seem to mention alloc_netdev_mqs.

While we have you, let me ask about the possibility of having vmcore
access - I think it'd be very useful to solve this mystery. 
With a bit of luck the timer still has the function set.


* Re: [syzbot] [wireguard?] KASAN: slab-use-after-free Write in enqueue_timer
  2023-05-24 15:33                     ` Jakub Kicinski
@ 2023-05-24 15:39                       ` Jakub Kicinski
  0 siblings, 0 replies; 18+ messages in thread
From: Jakub Kicinski @ 2023-05-24 15:39 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Eric Dumazet, Jason A. Donenfeld, syzbot, netdev, syzkaller-bugs,
	davem, linux-kernel, pabeni, wireguard, jann

On Wed, 24 May 2023 08:33:41 -0700 Jakub Kicinski wrote:
> On Wed, 24 May 2023 10:24:31 +0200 Dmitry Vyukov wrote:
> > FWIW There are more report examples on the dashboard.
> > There are some that mention neither wireguard nor usbnet, e.g.:
> > https://syzkaller.appspot.com/text?tag=CrashReport&x=17dd2446280000
> > So that's probably a red herring. But they all seem to mention alloc_netdev_mqs.
> 
> While we have you, let me ask about the possibility of having vmcore
> access - I think it'd be very useful to solve this mystery. 
> With a bit of luck the timer still has the function set.

I take that back.

Memory state around the buggy address:
 ffff88801ecc1400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff88801ecc1480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff88801ecc1500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb  
                   ^
 ffff88801ecc1580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff88801ecc1600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
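
For anyone reading the dump above, here is a short Python sketch of how the rows map back to memory. It assumes generic KASAN, where one shadow byte describes 8 bytes of memory and 0xfb marks freed slab memory; the constant names and the helper locate are illustrative only.

```python
KASAN_GRANULE = 8            # assumed: one shadow byte describes 8 bytes of memory
SHADOW_BYTES_PER_ROW = 16    # shadow bytes printed on each row of the dump
ROW_SPAN = KASAN_GRANULE * SHADOW_BYTES_PER_ROW  # bytes of memory per row
KASAN_SLAB_FREE = 0xFB       # assumed shadow value for freed slab memory

BUGGY_ADDR = 0xffff88801ecc1500
# The row marked with '>' in the dump, copied verbatim.
MARKED_ROW = "fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb"


def locate(addr):
    """Return (row start address, column index) of addr in the dump."""
    row_start = addr - (addr % ROW_SPAN)
    column = (addr - row_start) // KASAN_GRANULE
    return row_start, column


row_start, column = locate(BUGGY_ADDR)
shadow_row = bytes.fromhex(MARKED_ROW)

print(f"row start {row_start:#x}, column {column}")
print("whole row marked freed:",
      all(b == KASAN_SLAB_FREE for b in shadow_row))
```

Under these assumptions each row covers 128 bytes, which agrees with the 0x80 step between the row addresses, and the faulting address lands in column 0 of the marked row, where the caret points.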


Thread overview: 18+ messages
2023-04-30 18:01 [syzbot] [wireguard?] KASAN: slab-use-after-free Write in enqueue_timer syzbot
2023-05-23 15:46 ` Jason A. Donenfeld
2023-05-23 16:05   ` Jakub Kicinski
2023-05-23 16:12     ` Eric Dumazet
2023-05-23 16:41       ` Jakub Kicinski
2023-05-23 16:42         ` Jason A. Donenfeld
2023-05-23 16:47           ` Jakub Kicinski
2023-05-23 17:01             ` Jason A. Donenfeld
2023-05-23 17:05               ` Eric Dumazet
2023-05-23 17:07                 ` Eric Dumazet
2023-05-24  8:24                   ` Dmitry Vyukov
2023-05-24 15:33                     ` Jakub Kicinski
2023-05-24 15:39                       ` Jakub Kicinski
2023-05-23 16:14     ` Jason A. Donenfeld
2023-05-23 16:46       ` Jakub Kicinski
2023-05-23 16:47         ` Jason A. Donenfeld
2023-05-23 17:16           ` Jason A. Donenfeld
2023-05-23 17:28             ` Jason A. Donenfeld
