linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* net: use-after-free in worker_thread
@ 2016-12-03 12:56 Andrey Konovalov
  2016-12-03 12:58 ` Andrey Konovalov
  2016-12-03 17:41 ` Cong Wang
  0 siblings, 2 replies; 13+ messages in thread
From: Andrey Konovalov @ 2016-12-03 12:56 UTC (permalink / raw)
  To: David S. Miller, Cong Wang, Johannes Berg, Florian Westphal,
	Herbert Xu, Eric Dumazet, Bob Copeland, Tom Herbert,
	David Decotigny, netdev, LKML

Hi!

I'm seeing lots of the following error reports while running the
syzkaller fuzzer.

Reports appeared when I updated to 3c49de52 (Dec 2) from 2caceb32 (Dec 1).

==================================================================
BUG: KASAN: use-after-free in worker_thread+0x17d8/0x18a0
Read of size 8 at addr ffff880067f3ecd8 by task kworker/3:1/774

page:ffffea00019fce00 count:1 mapcount:0 mapping:          (null)
index:0xffff880067f39c10 compound_mapcount: 0
flags: 0x500000000004080(slab|head)
page dumped because: kasan: bad access detected

CPU: 3 PID: 774 Comm: kworker/3:1 Not tainted 4.9.0-rc7+ #66
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
 ffff88006c267838 ffffffff81f882da ffffffff6c25e338 1ffff1000d84ce9a
 ffffed000d84ce92 ffff88006c25e340 0000000041b58ab3 ffffffff8541e198
 ffffffff81f88048 0000000100000000 0000000041b58ab3 ffffffff853d3ee8
Call Trace:
 [<     inline     >] __dump_stack lib/dump_stack.c:15
 [<ffffffff81f882da>] dump_stack+0x292/0x398 lib/dump_stack.c:51
 [<     inline     >] describe_address mm/kasan/report.c:262
 [<ffffffff817e50d1>] kasan_report_error+0x121/0x560 mm/kasan/report.c:368
 [<     inline     >] kasan_report mm/kasan/report.c:390
 [<ffffffff817e560e>] __asan_report_load8_noabort+0x3e/0x40
mm/kasan/report.c:411
 [<ffffffff81329b88>] worker_thread+0x17d8/0x18a0 kernel/workqueue.c:2228
 [<ffffffff8133ebf3>] kthread+0x323/0x3e0 kernel/kthread.c:209
 [<ffffffff84a2a22a>] ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:433

The buggy address belongs to the object at ffff880067f3e6d0
 which belongs to the cache kmalloc-2048 of size 2048
The buggy address ffff880067f3ecd8 is located 1544 bytes inside
 of 2048-byte region [ffff880067f3e6d0, ffff880067f3eed0)

Freed by task 0:
 [<ffffffff81203526>] save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
 [<ffffffff817e4173>] save_stack+0x43/0xd0 mm/kasan/kasan.c:495
 [<     inline     >] set_track mm/kasan/kasan.c:507
 [<ffffffff817e4a53>] kasan_slab_free+0x73/0xc0 mm/kasan/kasan.c:571
 [<     inline     >] slab_free_hook mm/slub.c:1352
 [<     inline     >] slab_free_freelist_hook mm/slub.c:1374
 [<     inline     >] slab_free mm/slub.c:2951
 [<ffffffff817e0eb7>] kfree+0xe7/0x2b0 mm/slub.c:3871
 [<     inline     >] sk_prot_free net/core/sock.c:1372
 [<ffffffff831ea1c7>] __sk_destruct+0x5c7/0x6e0 net/core/sock.c:1445
 [<ffffffff831f3517>] sk_destruct+0x47/0x80 net/core/sock.c:1453
 [<ffffffff831f35a7>] __sk_free+0x57/0x230 net/core/sock.c:1461
 [<ffffffff831f37a3>] sk_free+0x23/0x30 net/core/sock.c:1472
 [<     inline     >] sock_put include/net/sock.h:1591
 [<ffffffff8348ca9c>] deferred_put_nlk_sk+0x2c/0x40 net/netlink/af_netlink.c:671
 [<     inline     >] __rcu_reclaim kernel/rcu/rcu.h:118
 [<ffffffff8146d42f>] rcu_do_batch.isra.67+0x8ff/0xc50 kernel/rcu/tree.c:2776
 [<     inline     >] invoke_rcu_callbacks kernel/rcu/tree.c:3040
 [<     inline     >] __rcu_process_callbacks kernel/rcu/tree.c:3007
 [<ffffffff8146e097>] rcu_process_callbacks+0x2b7/0xba0 kernel/rcu/tree.c:3024
 [<ffffffff84a2d08b>] __do_softirq+0x2fb/0xb63 kernel/softirq.c:284

Allocated by task 10748:
 [<ffffffff81203526>] save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
 [<ffffffff817e4173>] save_stack+0x43/0xd0 mm/kasan/kasan.c:495
 [<     inline     >] set_track mm/kasan/kasan.c:507
 [<ffffffff817e43fd>] kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:598
 [<ffffffff817e0050>] __kmalloc+0xa0/0x2d0 mm/slub.c:3734
 [<     inline     >] kmalloc include/linux/slab.h:495
 [<ffffffff831e4c01>] sk_prot_alloc+0x101/0x2a0 net/core/sock.c:1333
 [<ffffffff831efd15>] sk_alloc+0x105/0x1000 net/core/sock.c:1389
 [<ffffffff8348ad46>] __netlink_create+0x66/0x1d0 net/netlink/af_netlink.c:588
 [<ffffffff8348cdab>] netlink_create+0x2fb/0x500 net/netlink/af_netlink.c:647
 [<ffffffff831dd1d6>] __sock_create+0x4f6/0x880 net/socket.c:1168
 [<     inline     >] sock_create net/socket.c:1208
 [<     inline     >] SYSC_socket net/socket.c:1238
 [<ffffffff831dd799>] SyS_socket+0xf9/0x230 net/socket.c:1218
 [<ffffffff84a29fc1>] entry_SYSCALL_64_fastpath+0x1f/0xc2

Memory state around the buggy address:
 ffff880067f3eb80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff880067f3ec00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff880067f3ec80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                                    ^
 ffff880067f3ed00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff880067f3ed80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: net: use-after-free in worker_thread
  2016-12-03 12:56 net: use-after-free in worker_thread Andrey Konovalov
@ 2016-12-03 12:58 ` Andrey Konovalov
  2016-12-03 13:05   ` Andrey Konovalov
  2016-12-03 17:41 ` Cong Wang
  1 sibling, 1 reply; 13+ messages in thread
From: Andrey Konovalov @ 2016-12-03 12:58 UTC (permalink / raw)
  To: David S. Miller, Cong Wang, Johannes Berg, Florian Westphal,
	Herbert Xu, Eric Dumazet, Bob Copeland, Tom Herbert,
	David Decotigny, netdev, LKML
  Cc: Kostya Serebryany, Dmitry Vyukov, syzkaller

+syzkaller@googlegroups.com

On Sat, Dec 3, 2016 at 1:56 PM, Andrey Konovalov <andreyknvl@google.com> wrote:
> Hi!
>
> I'm seeing lots of the following error reports while running the
> syzkaller fuzzer.
>
> Reports appeared when I updated to 3c49de52 (Dec 2) from 2caceb32 (Dec 1).
>
> ==================================================================
> BUG: KASAN: use-after-free in worker_thread+0x17d8/0x18a0
> Read of size 8 at addr ffff880067f3ecd8 by task kworker/3:1/774
>
> page:ffffea00019fce00 count:1 mapcount:0 mapping:          (null)
> index:0xffff880067f39c10 compound_mapcount: 0
> flags: 0x500000000004080(slab|head)
> page dumped because: kasan: bad access detected
>
> CPU: 3 PID: 774 Comm: kworker/3:1 Not tainted 4.9.0-rc7+ #66
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
>  ffff88006c267838 ffffffff81f882da ffffffff6c25e338 1ffff1000d84ce9a
>  ffffed000d84ce92 ffff88006c25e340 0000000041b58ab3 ffffffff8541e198
>  ffffffff81f88048 0000000100000000 0000000041b58ab3 ffffffff853d3ee8
> Call Trace:
>  [<     inline     >] __dump_stack lib/dump_stack.c:15
>  [<ffffffff81f882da>] dump_stack+0x292/0x398 lib/dump_stack.c:51
>  [<     inline     >] describe_address mm/kasan/report.c:262
>  [<ffffffff817e50d1>] kasan_report_error+0x121/0x560 mm/kasan/report.c:368
>  [<     inline     >] kasan_report mm/kasan/report.c:390
>  [<ffffffff817e560e>] __asan_report_load8_noabort+0x3e/0x40
> mm/kasan/report.c:411
>  [<ffffffff81329b88>] worker_thread+0x17d8/0x18a0 kernel/workqueue.c:2228
>  [<ffffffff8133ebf3>] kthread+0x323/0x3e0 kernel/kthread.c:209
>  [<ffffffff84a2a22a>] ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:433
>
> The buggy address belongs to the object at ffff880067f3e6d0
>  which belongs to the cache kmalloc-2048 of size 2048
> The buggy address ffff880067f3ecd8 is located 1544 bytes inside
>  of 2048-byte region [ffff880067f3e6d0, ffff880067f3eed0)
>
> Freed by task 0:
>  [<ffffffff81203526>] save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
>  [<ffffffff817e4173>] save_stack+0x43/0xd0 mm/kasan/kasan.c:495
>  [<     inline     >] set_track mm/kasan/kasan.c:507
>  [<ffffffff817e4a53>] kasan_slab_free+0x73/0xc0 mm/kasan/kasan.c:571
>  [<     inline     >] slab_free_hook mm/slub.c:1352
>  [<     inline     >] slab_free_freelist_hook mm/slub.c:1374
>  [<     inline     >] slab_free mm/slub.c:2951
>  [<ffffffff817e0eb7>] kfree+0xe7/0x2b0 mm/slub.c:3871
>  [<     inline     >] sk_prot_free net/core/sock.c:1372
>  [<ffffffff831ea1c7>] __sk_destruct+0x5c7/0x6e0 net/core/sock.c:1445
>  [<ffffffff831f3517>] sk_destruct+0x47/0x80 net/core/sock.c:1453
>  [<ffffffff831f35a7>] __sk_free+0x57/0x230 net/core/sock.c:1461
>  [<ffffffff831f37a3>] sk_free+0x23/0x30 net/core/sock.c:1472
>  [<     inline     >] sock_put include/net/sock.h:1591
>  [<ffffffff8348ca9c>] deferred_put_nlk_sk+0x2c/0x40 net/netlink/af_netlink.c:671
>  [<     inline     >] __rcu_reclaim kernel/rcu/rcu.h:118
>  [<ffffffff8146d42f>] rcu_do_batch.isra.67+0x8ff/0xc50 kernel/rcu/tree.c:2776
>  [<     inline     >] invoke_rcu_callbacks kernel/rcu/tree.c:3040
>  [<     inline     >] __rcu_process_callbacks kernel/rcu/tree.c:3007
>  [<ffffffff8146e097>] rcu_process_callbacks+0x2b7/0xba0 kernel/rcu/tree.c:3024
>  [<ffffffff84a2d08b>] __do_softirq+0x2fb/0xb63 kernel/softirq.c:284
>
> Allocated by task 10748:
>  [<ffffffff81203526>] save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
>  [<ffffffff817e4173>] save_stack+0x43/0xd0 mm/kasan/kasan.c:495
>  [<     inline     >] set_track mm/kasan/kasan.c:507
>  [<ffffffff817e43fd>] kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:598
>  [<ffffffff817e0050>] __kmalloc+0xa0/0x2d0 mm/slub.c:3734
>  [<     inline     >] kmalloc include/linux/slab.h:495
>  [<ffffffff831e4c01>] sk_prot_alloc+0x101/0x2a0 net/core/sock.c:1333
>  [<ffffffff831efd15>] sk_alloc+0x105/0x1000 net/core/sock.c:1389
>  [<ffffffff8348ad46>] __netlink_create+0x66/0x1d0 net/netlink/af_netlink.c:588
>  [<ffffffff8348cdab>] netlink_create+0x2fb/0x500 net/netlink/af_netlink.c:647
>  [<ffffffff831dd1d6>] __sock_create+0x4f6/0x880 net/socket.c:1168
>  [<     inline     >] sock_create net/socket.c:1208
>  [<     inline     >] SYSC_socket net/socket.c:1238
>  [<ffffffff831dd799>] SyS_socket+0xf9/0x230 net/socket.c:1218
>  [<ffffffff84a29fc1>] entry_SYSCALL_64_fastpath+0x1f/0xc2
>
> Memory state around the buggy address:
>  ffff880067f3eb80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>  ffff880067f3ec00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>>ffff880067f3ec80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>                                                     ^
>  ffff880067f3ed00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>  ffff880067f3ed80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> ==================================================================

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: net: use-after-free in worker_thread
  2016-12-03 12:58 ` Andrey Konovalov
@ 2016-12-03 13:05   ` Andrey Konovalov
  2016-12-03 13:49     ` Eric Dumazet
  0 siblings, 1 reply; 13+ messages in thread
From: Andrey Konovalov @ 2016-12-03 13:05 UTC (permalink / raw)
  To: David S. Miller, Cong Wang, Johannes Berg, Florian Westphal,
	Herbert Xu, Eric Dumazet, Bob Copeland, Tom Herbert,
	David Decotigny, netdev, LKML
  Cc: Kostya Serebryany, Dmitry Vyukov, syzkaller

On Sat, Dec 3, 2016 at 1:58 PM, Andrey Konovalov <andreyknvl@google.com> wrote:
> +syzkaller@googlegroups.com
>
> On Sat, Dec 3, 2016 at 1:56 PM, Andrey Konovalov <andreyknvl@google.com> wrote:
>> Hi!
>>
>> I'm seeing lots of the following error reports while running the
>> syzkaller fuzzer.
>>
>> Reports appeared when I updated to 3c49de52 (Dec 2) from 2caceb32 (Dec 1).
>>
>> ==================================================================
>> BUG: KASAN: use-after-free in worker_thread+0x17d8/0x18a0
>> Read of size 8 at addr ffff880067f3ecd8 by task kworker/3:1/774
>>
>> page:ffffea00019fce00 count:1 mapcount:0 mapping:          (null)
>> index:0xffff880067f39c10 compound_mapcount: 0
>> flags: 0x500000000004080(slab|head)
>> page dumped because: kasan: bad access detected
>>
>> CPU: 3 PID: 774 Comm: kworker/3:1 Not tainted 4.9.0-rc7+ #66
>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
>>  ffff88006c267838 ffffffff81f882da ffffffff6c25e338 1ffff1000d84ce9a
>>  ffffed000d84ce92 ffff88006c25e340 0000000041b58ab3 ffffffff8541e198
>>  ffffffff81f88048 0000000100000000 0000000041b58ab3 ffffffff853d3ee8
>> Call Trace:
>>  [<     inline     >] __dump_stack lib/dump_stack.c:15
>>  [<ffffffff81f882da>] dump_stack+0x292/0x398 lib/dump_stack.c:51
>>  [<     inline     >] describe_address mm/kasan/report.c:262
>>  [<ffffffff817e50d1>] kasan_report_error+0x121/0x560 mm/kasan/report.c:368
>>  [<     inline     >] kasan_report mm/kasan/report.c:390
>>  [<ffffffff817e560e>] __asan_report_load8_noabort+0x3e/0x40
>> mm/kasan/report.c:411
>>  [<ffffffff81329b88>] worker_thread+0x17d8/0x18a0 kernel/workqueue.c:2228
>>  [<ffffffff8133ebf3>] kthread+0x323/0x3e0 kernel/kthread.c:209
>>  [<ffffffff84a2a22a>] ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:433
>>
>> The buggy address belongs to the object at ffff880067f3e6d0
>>  which belongs to the cache kmalloc-2048 of size 2048
>> The buggy address ffff880067f3ecd8 is located 1544 bytes inside
>>  of 2048-byte region [ffff880067f3e6d0, ffff880067f3eed0)
>>
>> Freed by task 0:
>>  [<ffffffff81203526>] save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
>>  [<ffffffff817e4173>] save_stack+0x43/0xd0 mm/kasan/kasan.c:495
>>  [<     inline     >] set_track mm/kasan/kasan.c:507
>>  [<ffffffff817e4a53>] kasan_slab_free+0x73/0xc0 mm/kasan/kasan.c:571
>>  [<     inline     >] slab_free_hook mm/slub.c:1352
>>  [<     inline     >] slab_free_freelist_hook mm/slub.c:1374
>>  [<     inline     >] slab_free mm/slub.c:2951
>>  [<ffffffff817e0eb7>] kfree+0xe7/0x2b0 mm/slub.c:3871
>>  [<     inline     >] sk_prot_free net/core/sock.c:1372
>>  [<ffffffff831ea1c7>] __sk_destruct+0x5c7/0x6e0 net/core/sock.c:1445
>>  [<ffffffff831f3517>] sk_destruct+0x47/0x80 net/core/sock.c:1453
>>  [<ffffffff831f35a7>] __sk_free+0x57/0x230 net/core/sock.c:1461
>>  [<ffffffff831f37a3>] sk_free+0x23/0x30 net/core/sock.c:1472
>>  [<     inline     >] sock_put include/net/sock.h:1591
>>  [<ffffffff8348ca9c>] deferred_put_nlk_sk+0x2c/0x40 net/netlink/af_netlink.c:671
>>  [<     inline     >] __rcu_reclaim kernel/rcu/rcu.h:118
>>  [<ffffffff8146d42f>] rcu_do_batch.isra.67+0x8ff/0xc50 kernel/rcu/tree.c:2776
>>  [<     inline     >] invoke_rcu_callbacks kernel/rcu/tree.c:3040
>>  [<     inline     >] __rcu_process_callbacks kernel/rcu/tree.c:3007
>>  [<ffffffff8146e097>] rcu_process_callbacks+0x2b7/0xba0 kernel/rcu/tree.c:3024
>>  [<ffffffff84a2d08b>] __do_softirq+0x2fb/0xb63 kernel/softirq.c:284
>>
>> Allocated by task 10748:
>>  [<ffffffff81203526>] save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
>>  [<ffffffff817e4173>] save_stack+0x43/0xd0 mm/kasan/kasan.c:495
>>  [<     inline     >] set_track mm/kasan/kasan.c:507
>>  [<ffffffff817e43fd>] kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:598
>>  [<ffffffff817e0050>] __kmalloc+0xa0/0x2d0 mm/slub.c:3734
>>  [<     inline     >] kmalloc include/linux/slab.h:495
>>  [<ffffffff831e4c01>] sk_prot_alloc+0x101/0x2a0 net/core/sock.c:1333
>>  [<ffffffff831efd15>] sk_alloc+0x105/0x1000 net/core/sock.c:1389
>>  [<ffffffff8348ad46>] __netlink_create+0x66/0x1d0 net/netlink/af_netlink.c:588
>>  [<ffffffff8348cdab>] netlink_create+0x2fb/0x500 net/netlink/af_netlink.c:647
>>  [<ffffffff831dd1d6>] __sock_create+0x4f6/0x880 net/socket.c:1168
>>  [<     inline     >] sock_create net/socket.c:1208
>>  [<     inline     >] SYSC_socket net/socket.c:1238
>>  [<ffffffff831dd799>] SyS_socket+0xf9/0x230 net/socket.c:1218
>>  [<ffffffff84a29fc1>] entry_SYSCALL_64_fastpath+0x1f/0xc2
>>
>> Memory state around the buggy address:
>>  ffff880067f3eb80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>>  ffff880067f3ec00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>>>ffff880067f3ec80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>>                                                     ^
>>  ffff880067f3ed00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>>  ffff880067f3ed80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>> ==================================================================

Here is another report that looks related:

==================================================================
BUG: KASAN: use-after-free in __list_add+0x236/0x2c0
Read of size 8 at addr ffff880068854780 by task ksoftirqd/2/20

page:ffffea0001a21400 count:1 mapcount:0 mapping:          (null)
index:0x0 compound_mapcount: 0
flags: 0x500000000004080(slab|head)
page dumped because: kasan: bad access detected

CPU: 2 PID: 20 Comm: ksoftirqd/2 Not tainted 4.9.0-rc7+ #66
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
 ffff88006daf6578 ffffffff81f882da ffffffff6daf62a0 1ffff1000db5ec42
 ffffed000db5ec3a dffffc0000000000 0000000041b58ab3 ffffffff8541e198
 ffffffff81f88048 ffff88006dac3610 ffff88006daf6300 0000000000000802
Call Trace:
 [<     inline     >] __dump_stack lib/dump_stack.c:15
 [<ffffffff81f882da>] dump_stack+0x292/0x398 lib/dump_stack.c:51
 [<     inline     >] describe_address mm/kasan/report.c:262
 [<ffffffff817e50d1>] kasan_report_error+0x121/0x560 mm/kasan/report.c:368
 [<     inline     >] kasan_report mm/kasan/report.c:390
 [<ffffffff817e560e>] __asan_report_load8_noabort+0x3e/0x40
mm/kasan/report.c:411
 [<ffffffff8200c166>] __list_add+0x236/0x2c0 lib/list_debug.c:30
 [<     inline     >] list_add_tail include/linux/list.h:77
 [<ffffffff8131e295>] insert_work+0x175/0x4b0 kernel/workqueue.c:1298
 [<ffffffff8131eb52>] __queue_work+0x582/0x11e0 kernel/workqueue.c:1459
 [<ffffffff81320c21>] queue_work_on+0x231/0x240 kernel/workqueue.c:1484
 [<     inline     >] queue_work include/linux/workqueue.h:474
 [<     inline     >] schedule_work include/linux/workqueue.h:532
 [<ffffffff8348c8cc>] netlink_sock_destruct+0x23c/0x2d0
net/netlink/af_netlink.c:361
 [<ffffffff831e9ce1>] __sk_destruct+0xe1/0x6e0 net/core/sock.c:1423
 [<ffffffff831f3517>] sk_destruct+0x47/0x80 net/core/sock.c:1453
 [<ffffffff831f35a7>] __sk_free+0x57/0x230 net/core/sock.c:1461
 [<ffffffff831f37a3>] sk_free+0x23/0x30 net/core/sock.c:1472
 [<     inline     >] sock_put include/net/sock.h:1591
 [<ffffffff8348ca9c>] deferred_put_nlk_sk+0x2c/0x40 net/netlink/af_netlink.c:671
 [<     inline     >] __rcu_reclaim kernel/rcu/rcu.h:118
 [<ffffffff8146d42f>] rcu_do_batch.isra.67+0x8ff/0xc50 kernel/rcu/tree.c:2776
 [<     inline     >] invoke_rcu_callbacks kernel/rcu/tree.c:3040
 [<     inline     >] __rcu_process_callbacks kernel/rcu/tree.c:3007
 [<ffffffff8146e097>] rcu_process_callbacks+0x2b7/0xba0 kernel/rcu/tree.c:3024
 [<ffffffff84a2d08b>] __do_softirq+0x2fb/0xb63 kernel/softirq.c:284
 [<ffffffff812d38c0>] run_ksoftirqd+0x20/0x60 kernel/softirq.c:676
 [<ffffffff81350132>] smpboot_thread_fn+0x562/0x860 kernel/smpboot.c:163
 [<ffffffff8133ebf3>] kthread+0x323/0x3e0 kernel/kthread.c:209
 [<ffffffff84a2a22a>] ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:433

The buggy address belongs to the object at ffff880068854170
 which belongs to the cache kmalloc-2048 of size 2048
The buggy address ffff880068854780 is located 1552 bytes inside
 of 2048-byte region [ffff880068854170, ffff880068854970)

Freed by task 20:
 [<ffffffff81203526>] save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
 [<ffffffff817e4173>] save_stack+0x43/0xd0 mm/kasan/kasan.c:495
 [<     inline     >] set_track mm/kasan/kasan.c:507
 [<ffffffff817e4a53>] kasan_slab_free+0x73/0xc0 mm/kasan/kasan.c:571
 [<     inline     >] slab_free_hook mm/slub.c:1352
 [<     inline     >] slab_free_freelist_hook mm/slub.c:1374
 [<     inline     >] slab_free mm/slub.c:2951
 [<ffffffff817e0eb7>] kfree+0xe7/0x2b0 mm/slub.c:3871
 [<     inline     >] sk_prot_free net/core/sock.c:1372
 [<ffffffff831ea1c7>] __sk_destruct+0x5c7/0x6e0 net/core/sock.c:1445
 [<ffffffff831f3517>] sk_destruct+0x47/0x80 net/core/sock.c:1453
 [<ffffffff831f35a7>] __sk_free+0x57/0x230 net/core/sock.c:1461
 [<ffffffff831f37a3>] sk_free+0x23/0x30 net/core/sock.c:1472
 [<     inline     >] sock_put include/net/sock.h:1591
 [<ffffffff8348ca9c>] deferred_put_nlk_sk+0x2c/0x40 net/netlink/af_netlink.c:671
 [<     inline     >] __rcu_reclaim kernel/rcu/rcu.h:118
 [<ffffffff8146d42f>] rcu_do_batch.isra.67+0x8ff/0xc50 kernel/rcu/tree.c:2776
 [<     inline     >] invoke_rcu_callbacks kernel/rcu/tree.c:3040
 [<     inline     >] __rcu_process_callbacks kernel/rcu/tree.c:3007
 [<ffffffff8146e097>] rcu_process_callbacks+0x2b7/0xba0 kernel/rcu/tree.c:3024
 [<ffffffff84a2d08b>] __do_softirq+0x2fb/0xb63 kernel/softirq.c:284

Allocated by task 9480:
 [<ffffffff81203526>] save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
 [<ffffffff817e4173>] save_stack+0x43/0xd0 mm/kasan/kasan.c:495
 [<     inline     >] set_track mm/kasan/kasan.c:507
 [<ffffffff817e43fd>] kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:598
 [<ffffffff817e0050>] __kmalloc+0xa0/0x2d0 mm/slub.c:3734
 [<     inline     >] kmalloc include/linux/slab.h:495
 [<ffffffff831e4c01>] sk_prot_alloc+0x101/0x2a0 net/core/sock.c:1333
 [<ffffffff831efd15>] sk_alloc+0x105/0x1000 net/core/sock.c:1389
 [<ffffffff8348ad46>] __netlink_create+0x66/0x1d0 net/netlink/af_netlink.c:588
 [<ffffffff8348cdab>] netlink_create+0x2fb/0x500 net/netlink/af_netlink.c:647
 [<ffffffff831dd1d6>] __sock_create+0x4f6/0x880 net/socket.c:1168
 [<     inline     >] sock_create net/socket.c:1208
 [<     inline     >] SYSC_socket net/socket.c:1238
 [<ffffffff831dd799>] SyS_socket+0xf9/0x230 net/socket.c:1218
 [<ffffffff84a29fc1>] entry_SYSCALL_64_fastpath+0x1f/0xc2

Memory state around the buggy address:
 ffff880068854680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff880068854700: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff880068854780: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                   ^
 ffff880068854800: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff880068854880: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: net: use-after-free in worker_thread
  2016-12-03 13:05   ` Andrey Konovalov
@ 2016-12-03 13:49     ` Eric Dumazet
  2016-12-03 15:39       ` Andrey Konovalov
  2016-12-05  7:21       ` Herbert Xu
  0 siblings, 2 replies; 13+ messages in thread
From: Eric Dumazet @ 2016-12-03 13:49 UTC (permalink / raw)
  To: Andrey Konovalov
  Cc: David S. Miller, Cong Wang, Johannes Berg, Florian Westphal,
	Herbert Xu, Eric Dumazet, Bob Copeland, Tom Herbert,
	David Decotigny, netdev, LKML, Kostya Serebryany, Dmitry Vyukov,
	syzkaller

On Sat, 2016-12-03 at 14:05 +0100, Andrey Konovalov wrote:
> On Sat, Dec 3, 2016 at 1:58 PM, Andrey Konovalov <andreyknvl@google.com> wrote:
> > +syzkaller@googlegroups.com
> >
> > On Sat, Dec 3, 2016 at 1:56 PM, Andrey Konovalov <andreyknvl@google.com> wrote:
> >> Hi!
> >>
> >> I'm seeing lots of the following error reports while running the
> >> syzkaller fuzzer.
> >>
> >> Reports appeared when I updated to 3c49de52 (Dec 2) from 2caceb32 (Dec 1).
> >>
> >> ==================================================================
> >> BUG: KASAN: use-after-free in worker_thread+0x17d8/0x18a0
> >> Read of size 8 at addr ffff880067f3ecd8 by task kworker/3:1/774
> >>
> >> page:ffffea00019fce00 count:1 mapcount:0 mapping:          (null)
> >> index:0xffff880067f39c10 compound_mapcount: 0
> >> flags: 0x500000000004080(slab|head)
> >> page dumped because: kasan: bad access detected
> >>
> >> CPU: 3 PID: 774 Comm: kworker/3:1 Not tainted 4.9.0-rc7+ #66
> >> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> >>  ffff88006c267838 ffffffff81f882da ffffffff6c25e338 1ffff1000d84ce9a
> >>  ffffed000d84ce92 ffff88006c25e340 0000000041b58ab3 ffffffff8541e198
> >>  ffffffff81f88048 0000000100000000 0000000041b58ab3 ffffffff853d3ee8
> >> Call Trace:
> >>  [<     inline     >] __dump_stack lib/dump_stack.c:15
> >>  [<ffffffff81f882da>] dump_stack+0x292/0x398 lib/dump_stack.c:51
> >>  [<     inline     >] describe_address mm/kasan/report.c:262
> >>  [<ffffffff817e50d1>] kasan_report_error+0x121/0x560 mm/kasan/report.c:368
> >>  [<     inline     >] kasan_report mm/kasan/report.c:390
> >>  [<ffffffff817e560e>] __asan_report_load8_noabort+0x3e/0x40
> >> mm/kasan/report.c:411
> >>  [<ffffffff81329b88>] worker_thread+0x17d8/0x18a0 kernel/workqueue.c:2228
> >>  [<ffffffff8133ebf3>] kthread+0x323/0x3e0 kernel/kthread.c:209
> >>  [<ffffffff84a2a22a>] ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:433
> >>
> >> The buggy address belongs to the object at ffff880067f3e6d0
> >>  which belongs to the cache kmalloc-2048 of size 2048
> >> The buggy address ffff880067f3ecd8 is located 1544 bytes inside
> >>  of 2048-byte region [ffff880067f3e6d0, ffff880067f3eed0)
> >>
> >> Freed by task 0:
> >>  [<ffffffff81203526>] save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
> >>  [<ffffffff817e4173>] save_stack+0x43/0xd0 mm/kasan/kasan.c:495
> >>  [<     inline     >] set_track mm/kasan/kasan.c:507
> >>  [<ffffffff817e4a53>] kasan_slab_free+0x73/0xc0 mm/kasan/kasan.c:571
> >>  [<     inline     >] slab_free_hook mm/slub.c:1352
> >>  [<     inline     >] slab_free_freelist_hook mm/slub.c:1374
> >>  [<     inline     >] slab_free mm/slub.c:2951
> >>  [<ffffffff817e0eb7>] kfree+0xe7/0x2b0 mm/slub.c:3871
> >>  [<     inline     >] sk_prot_free net/core/sock.c:1372
> >>  [<ffffffff831ea1c7>] __sk_destruct+0x5c7/0x6e0 net/core/sock.c:1445
> >>  [<ffffffff831f3517>] sk_destruct+0x47/0x80 net/core/sock.c:1453
> >>  [<ffffffff831f35a7>] __sk_free+0x57/0x230 net/core/sock.c:1461
> >>  [<ffffffff831f37a3>] sk_free+0x23/0x30 net/core/sock.c:1472
> >>  [<     inline     >] sock_put include/net/sock.h:1591
> >>  [<ffffffff8348ca9c>] deferred_put_nlk_sk+0x2c/0x40 net/netlink/af_netlink.c:671
> >>  [<     inline     >] __rcu_reclaim kernel/rcu/rcu.h:118
> >>  [<ffffffff8146d42f>] rcu_do_batch.isra.67+0x8ff/0xc50 kernel/rcu/tree.c:2776
> >>  [<     inline     >] invoke_rcu_callbacks kernel/rcu/tree.c:3040
> >>  [<     inline     >] __rcu_process_callbacks kernel/rcu/tree.c:3007
> >>  [<ffffffff8146e097>] rcu_process_callbacks+0x2b7/0xba0 kernel/rcu/tree.c:3024
> >>  [<ffffffff84a2d08b>] __do_softirq+0x2fb/0xb63 kernel/softirq.c:284
> >>
> >> Allocated by task 10748:
> >>  [<ffffffff81203526>] save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
> >>  [<ffffffff817e4173>] save_stack+0x43/0xd0 mm/kasan/kasan.c:495
> >>  [<     inline     >] set_track mm/kasan/kasan.c:507
> >>  [<ffffffff817e43fd>] kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:598
> >>  [<ffffffff817e0050>] __kmalloc+0xa0/0x2d0 mm/slub.c:3734
> >>  [<     inline     >] kmalloc include/linux/slab.h:495
> >>  [<ffffffff831e4c01>] sk_prot_alloc+0x101/0x2a0 net/core/sock.c:1333
> >>  [<ffffffff831efd15>] sk_alloc+0x105/0x1000 net/core/sock.c:1389
> >>  [<ffffffff8348ad46>] __netlink_create+0x66/0x1d0 net/netlink/af_netlink.c:588
> >>  [<ffffffff8348cdab>] netlink_create+0x2fb/0x500 net/netlink/af_netlink.c:647
> >>  [<ffffffff831dd1d6>] __sock_create+0x4f6/0x880 net/socket.c:1168
> >>  [<     inline     >] sock_create net/socket.c:1208
> >>  [<     inline     >] SYSC_socket net/socket.c:1238
> >>  [<ffffffff831dd799>] SyS_socket+0xf9/0x230 net/socket.c:1218
> >>  [<ffffffff84a29fc1>] entry_SYSCALL_64_fastpath+0x1f/0xc2
> >>
> >> Memory state around the buggy address:
> >>  ffff880067f3eb80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> >>  ffff880067f3ec00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> >>>ffff880067f3ec80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> >>                                                     ^
> >>  ffff880067f3ed00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> >>  ffff880067f3ed80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> >> ==================================================================
> 
> Here is another report that looks related:
> 
> ==================================================================
> BUG: KASAN: use-after-free in __list_add+0x236/0x2c0
> Read of size 8 at addr ffff880068854780 by task ksoftirqd/2/20
> 
> page:ffffea0001a21400 count:1 mapcount:0 mapping:          (null)
> index:0x0 compound_mapcount: 0
> flags: 0x500000000004080(slab|head)
> page dumped because: kasan: bad access detected
> 
> CPU: 2 PID: 20 Comm: ksoftirqd/2 Not tainted 4.9.0-rc7+ #66
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
>  ffff88006daf6578 ffffffff81f882da ffffffff6daf62a0 1ffff1000db5ec42
>  ffffed000db5ec3a dffffc0000000000 0000000041b58ab3 ffffffff8541e198
>  ffffffff81f88048 ffff88006dac3610 ffff88006daf6300 0000000000000802
> Call Trace:
>  [<     inline     >] __dump_stack lib/dump_stack.c:15
>  [<ffffffff81f882da>] dump_stack+0x292/0x398 lib/dump_stack.c:51
>  [<     inline     >] describe_address mm/kasan/report.c:262
>  [<ffffffff817e50d1>] kasan_report_error+0x121/0x560 mm/kasan/report.c:368
>  [<     inline     >] kasan_report mm/kasan/report.c:390
>  [<ffffffff817e560e>] __asan_report_load8_noabort+0x3e/0x40
> mm/kasan/report.c:411
>  [<ffffffff8200c166>] __list_add+0x236/0x2c0 lib/list_debug.c:30
>  [<     inline     >] list_add_tail include/linux/list.h:77
>  [<ffffffff8131e295>] insert_work+0x175/0x4b0 kernel/workqueue.c:1298
>  [<ffffffff8131eb52>] __queue_work+0x582/0x11e0 kernel/workqueue.c:1459
>  [<ffffffff81320c21>] queue_work_on+0x231/0x240 kernel/workqueue.c:1484
>  [<     inline     >] queue_work include/linux/workqueue.h:474
>  [<     inline     >] schedule_work include/linux/workqueue.h:532
>  [<ffffffff8348c8cc>] netlink_sock_destruct+0x23c/0x2d0
> net/netlink/af_netlink.c:361
>  [<ffffffff831e9ce1>] __sk_destruct+0xe1/0x6e0 net/core/sock.c:1423
>  [<ffffffff831f3517>] sk_destruct+0x47/0x80 net/core/sock.c:1453
>  [<ffffffff831f35a7>] __sk_free+0x57/0x230 net/core/sock.c:1461
>  [<ffffffff831f37a3>] sk_free+0x23/0x30 net/core/sock.c:1472
>  [<     inline     >] sock_put include/net/sock.h:1591
>  [<ffffffff8348ca9c>] deferred_put_nlk_sk+0x2c/0x40 net/netlink/af_netlink.c:671
>  [<     inline     >] __rcu_reclaim kernel/rcu/rcu.h:118
>  [<ffffffff8146d42f>] rcu_do_batch.isra.67+0x8ff/0xc50 kernel/rcu/tree.c:2776
>  [<     inline     >] invoke_rcu_callbacks kernel/rcu/tree.c:3040
>  [<     inline     >] __rcu_process_callbacks kernel/rcu/tree.c:3007
>  [<ffffffff8146e097>] rcu_process_callbacks+0x2b7/0xba0 kernel/rcu/tree.c:3024
>  [<ffffffff84a2d08b>] __do_softirq+0x2fb/0xb63 kernel/softirq.c:284
>  [<ffffffff812d38c0>] run_ksoftirqd+0x20/0x60 kernel/softirq.c:676
>  [<ffffffff81350132>] smpboot_thread_fn+0x562/0x860 kernel/smpboot.c:163
>  [<ffffffff8133ebf3>] kthread+0x323/0x3e0 kernel/kthread.c:209
>  [<ffffffff84a2a22a>] ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:433
> 
> The buggy address belongs to the object at ffff880068854170
>  which belongs to the cache kmalloc-2048 of size 2048
> The buggy address ffff880068854780 is located 1552 bytes inside
>  of 2048-byte region [ffff880068854170, ffff880068854970)
> 
> Freed by task 20:
>  [<ffffffff81203526>] save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
>  [<ffffffff817e4173>] save_stack+0x43/0xd0 mm/kasan/kasan.c:495
>  [<     inline     >] set_track mm/kasan/kasan.c:507
>  [<ffffffff817e4a53>] kasan_slab_free+0x73/0xc0 mm/kasan/kasan.c:571
>  [<     inline     >] slab_free_hook mm/slub.c:1352
>  [<     inline     >] slab_free_freelist_hook mm/slub.c:1374
>  [<     inline     >] slab_free mm/slub.c:2951
>  [<ffffffff817e0eb7>] kfree+0xe7/0x2b0 mm/slub.c:3871
>  [<     inline     >] sk_prot_free net/core/sock.c:1372
>  [<ffffffff831ea1c7>] __sk_destruct+0x5c7/0x6e0 net/core/sock.c:1445
>  [<ffffffff831f3517>] sk_destruct+0x47/0x80 net/core/sock.c:1453
>  [<ffffffff831f35a7>] __sk_free+0x57/0x230 net/core/sock.c:1461
>  [<ffffffff831f37a3>] sk_free+0x23/0x30 net/core/sock.c:1472
>  [<     inline     >] sock_put include/net/sock.h:1591
>  [<ffffffff8348ca9c>] deferred_put_nlk_sk+0x2c/0x40 net/netlink/af_netlink.c:671
>  [<     inline     >] __rcu_reclaim kernel/rcu/rcu.h:118
>  [<ffffffff8146d42f>] rcu_do_batch.isra.67+0x8ff/0xc50 kernel/rcu/tree.c:2776
>  [<     inline     >] invoke_rcu_callbacks kernel/rcu/tree.c:3040
>  [<     inline     >] __rcu_process_callbacks kernel/rcu/tree.c:3007
>  [<ffffffff8146e097>] rcu_process_callbacks+0x2b7/0xba0 kernel/rcu/tree.c:3024
>  [<ffffffff84a2d08b>] __do_softirq+0x2fb/0xb63 kernel/softirq.c:284
> 
> Allocated by task 9480:
>  [<ffffffff81203526>] save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
>  [<ffffffff817e4173>] save_stack+0x43/0xd0 mm/kasan/kasan.c:495
>  [<     inline     >] set_track mm/kasan/kasan.c:507
>  [<ffffffff817e43fd>] kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:598
>  [<ffffffff817e0050>] __kmalloc+0xa0/0x2d0 mm/slub.c:3734
>  [<     inline     >] kmalloc include/linux/slab.h:495
>  [<ffffffff831e4c01>] sk_prot_alloc+0x101/0x2a0 net/core/sock.c:1333
>  [<ffffffff831efd15>] sk_alloc+0x105/0x1000 net/core/sock.c:1389
>  [<ffffffff8348ad46>] __netlink_create+0x66/0x1d0 net/netlink/af_netlink.c:588
>  [<ffffffff8348cdab>] netlink_create+0x2fb/0x500 net/netlink/af_netlink.c:647
>  [<ffffffff831dd1d6>] __sock_create+0x4f6/0x880 net/socket.c:1168
>  [<     inline     >] sock_create net/socket.c:1208
>  [<     inline     >] SYSC_socket net/socket.c:1238
>  [<ffffffff831dd799>] SyS_socket+0xf9/0x230 net/socket.c:1218
>  [<ffffffff84a29fc1>] entry_SYSCALL_64_fastpath+0x1f/0xc2
> 
> Memory state around the buggy address:
>  ffff880068854680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>  ffff880068854700: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> >ffff880068854780: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>                    ^
>  ffff880068854800: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>  ffff880068854880: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> ==================================================================


Hi Andrey. Please give us some rest during the week end ;)

This looks like the bug I mentioned earlier for which I have a pending
patch ? Can you try it ?

The RCU conversion done by Thomas was quite buggy.

Thanks.


diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index 602e5ebe9db39ec6c72708628bc48efad9f0e680..c348c4a5ea4ecc05dcc9e2afbc069ab65a1a57fe 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -475,8 +475,8 @@ static struct sock *netlink_lookup(struct net *net, int protocol, u32 portid)
 
 	rcu_read_lock();
 	sk = __netlink_lookup(table, portid, net);
-	if (sk)
-		sock_hold(sk);
+	if (sk && !atomic_inc_not_zero(&sk->sk_refcnt))
+		sk = NULL;
 	rcu_read_unlock();
 
 	return sk;
@@ -600,6 +600,7 @@ static int __netlink_create(struct net *net, struct socket *sock,
 	}
 	init_waitqueue_head(&nlk->wait);
 
+	sock_set_flag(sk, SOCK_RCU_FREE);
 	sk->sk_destruct = netlink_sock_destruct;
 	sk->sk_protocol = protocol;
 	return 0;
@@ -664,13 +665,6 @@ static int netlink_create(struct net *net, struct socket *sock, int protocol,
 	goto out;
 }
 
-static void deferred_put_nlk_sk(struct rcu_head *head)
-{
-	struct netlink_sock *nlk = container_of(head, struct netlink_sock, rcu);
-
-	sock_put(&nlk->sk);
-}
-
 static int netlink_release(struct socket *sock)
 {
 	struct sock *sk = sock->sk;
@@ -743,7 +737,7 @@ static int netlink_release(struct socket *sock)
 	local_bh_disable();
 	sock_prot_inuse_add(sock_net(sk), &netlink_proto, -1);
 	local_bh_enable();
-	call_rcu(&nlk->rcu, deferred_put_nlk_sk);
+	sock_put(sk);
 	return 0;
 }
 
diff --git a/net/netlink/af_netlink.h b/net/netlink/af_netlink.h
index 4fdb3831897775547f77c069a8018c0d2a253c8c..988d1a02487e37b7efd4872dd0ab6d230e5a2021 100644
--- a/net/netlink/af_netlink.h
+++ b/net/netlink/af_netlink.h
@@ -33,7 +33,6 @@ struct netlink_sock {
 	struct module		*module;
 
 	struct rhash_head	node;
-	struct rcu_head		rcu;
 	struct work_struct	work;
 };
 

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: net: use-after-free in worker_thread
  2016-12-03 13:49     ` Eric Dumazet
@ 2016-12-03 15:39       ` Andrey Konovalov
  2016-12-05  7:21       ` Herbert Xu
  1 sibling, 0 replies; 13+ messages in thread
From: Andrey Konovalov @ 2016-12-03 15:39 UTC (permalink / raw)
  To: syzkaller
  Cc: David S. Miller, Cong Wang, Johannes Berg, Florian Westphal,
	Herbert Xu, Eric Dumazet, Bob Copeland, Tom Herbert,
	David Decotigny, netdev, LKML, Kostya Serebryany, Dmitry Vyukov

[-- Attachment #1: Type: text/plain, Size: 14438 bytes --]

On Sat, Dec 3, 2016 at 2:49 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Sat, 2016-12-03 at 14:05 +0100, Andrey Konovalov wrote:
>> On Sat, Dec 3, 2016 at 1:58 PM, Andrey Konovalov <andreyknvl@google.com> wrote:
>> > +syzkaller@googlegroups.com
>> >
>> > On Sat, Dec 3, 2016 at 1:56 PM, Andrey Konovalov <andreyknvl@google.com> wrote:
>> >> Hi!
>> >>
>> >> I'm seeing lots of the following error reports while running the
>> >> syzkaller fuzzer.
>> >>
>> >> Reports appeared when I updated to 3c49de52 (Dec 2) from 2caceb32 (Dec 1).
>> >>
>> >> ==================================================================
>> >> BUG: KASAN: use-after-free in worker_thread+0x17d8/0x18a0
>> >> Read of size 8 at addr ffff880067f3ecd8 by task kworker/3:1/774
>> >>
>> >> page:ffffea00019fce00 count:1 mapcount:0 mapping:          (null)
>> >> index:0xffff880067f39c10 compound_mapcount: 0
>> >> flags: 0x500000000004080(slab|head)
>> >> page dumped because: kasan: bad access detected
>> >>
>> >> CPU: 3 PID: 774 Comm: kworker/3:1 Not tainted 4.9.0-rc7+ #66
>> >> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
>> >>  ffff88006c267838 ffffffff81f882da ffffffff6c25e338 1ffff1000d84ce9a
>> >>  ffffed000d84ce92 ffff88006c25e340 0000000041b58ab3 ffffffff8541e198
>> >>  ffffffff81f88048 0000000100000000 0000000041b58ab3 ffffffff853d3ee8
>> >> Call Trace:
>> >>  [<     inline     >] __dump_stack lib/dump_stack.c:15
>> >>  [<ffffffff81f882da>] dump_stack+0x292/0x398 lib/dump_stack.c:51
>> >>  [<     inline     >] describe_address mm/kasan/report.c:262
>> >>  [<ffffffff817e50d1>] kasan_report_error+0x121/0x560 mm/kasan/report.c:368
>> >>  [<     inline     >] kasan_report mm/kasan/report.c:390
>> >>  [<ffffffff817e560e>] __asan_report_load8_noabort+0x3e/0x40
>> >> mm/kasan/report.c:411
>> >>  [<ffffffff81329b88>] worker_thread+0x17d8/0x18a0 kernel/workqueue.c:2228
>> >>  [<ffffffff8133ebf3>] kthread+0x323/0x3e0 kernel/kthread.c:209
>> >>  [<ffffffff84a2a22a>] ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:433
>> >>
>> >> The buggy address belongs to the object at ffff880067f3e6d0
>> >>  which belongs to the cache kmalloc-2048 of size 2048
>> >> The buggy address ffff880067f3ecd8 is located 1544 bytes inside
>> >>  of 2048-byte region [ffff880067f3e6d0, ffff880067f3eed0)
>> >>
>> >> Freed by task 0:
>> >>  [<ffffffff81203526>] save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
>> >>  [<ffffffff817e4173>] save_stack+0x43/0xd0 mm/kasan/kasan.c:495
>> >>  [<     inline     >] set_track mm/kasan/kasan.c:507
>> >>  [<ffffffff817e4a53>] kasan_slab_free+0x73/0xc0 mm/kasan/kasan.c:571
>> >>  [<     inline     >] slab_free_hook mm/slub.c:1352
>> >>  [<     inline     >] slab_free_freelist_hook mm/slub.c:1374
>> >>  [<     inline     >] slab_free mm/slub.c:2951
>> >>  [<ffffffff817e0eb7>] kfree+0xe7/0x2b0 mm/slub.c:3871
>> >>  [<     inline     >] sk_prot_free net/core/sock.c:1372
>> >>  [<ffffffff831ea1c7>] __sk_destruct+0x5c7/0x6e0 net/core/sock.c:1445
>> >>  [<ffffffff831f3517>] sk_destruct+0x47/0x80 net/core/sock.c:1453
>> >>  [<ffffffff831f35a7>] __sk_free+0x57/0x230 net/core/sock.c:1461
>> >>  [<ffffffff831f37a3>] sk_free+0x23/0x30 net/core/sock.c:1472
>> >>  [<     inline     >] sock_put include/net/sock.h:1591
>> >>  [<ffffffff8348ca9c>] deferred_put_nlk_sk+0x2c/0x40 net/netlink/af_netlink.c:671
>> >>  [<     inline     >] __rcu_reclaim kernel/rcu/rcu.h:118
>> >>  [<ffffffff8146d42f>] rcu_do_batch.isra.67+0x8ff/0xc50 kernel/rcu/tree.c:2776
>> >>  [<     inline     >] invoke_rcu_callbacks kernel/rcu/tree.c:3040
>> >>  [<     inline     >] __rcu_process_callbacks kernel/rcu/tree.c:3007
>> >>  [<ffffffff8146e097>] rcu_process_callbacks+0x2b7/0xba0 kernel/rcu/tree.c:3024
>> >>  [<ffffffff84a2d08b>] __do_softirq+0x2fb/0xb63 kernel/softirq.c:284
>> >>
>> >> Allocated by task 10748:
>> >>  [<ffffffff81203526>] save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
>> >>  [<ffffffff817e4173>] save_stack+0x43/0xd0 mm/kasan/kasan.c:495
>> >>  [<     inline     >] set_track mm/kasan/kasan.c:507
>> >>  [<ffffffff817e43fd>] kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:598
>> >>  [<ffffffff817e0050>] __kmalloc+0xa0/0x2d0 mm/slub.c:3734
>> >>  [<     inline     >] kmalloc include/linux/slab.h:495
>> >>  [<ffffffff831e4c01>] sk_prot_alloc+0x101/0x2a0 net/core/sock.c:1333
>> >>  [<ffffffff831efd15>] sk_alloc+0x105/0x1000 net/core/sock.c:1389
>> >>  [<ffffffff8348ad46>] __netlink_create+0x66/0x1d0 net/netlink/af_netlink.c:588
>> >>  [<ffffffff8348cdab>] netlink_create+0x2fb/0x500 net/netlink/af_netlink.c:647
>> >>  [<ffffffff831dd1d6>] __sock_create+0x4f6/0x880 net/socket.c:1168
>> >>  [<     inline     >] sock_create net/socket.c:1208
>> >>  [<     inline     >] SYSC_socket net/socket.c:1238
>> >>  [<ffffffff831dd799>] SyS_socket+0xf9/0x230 net/socket.c:1218
>> >>  [<ffffffff84a29fc1>] entry_SYSCALL_64_fastpath+0x1f/0xc2
>> >>
>> >> Memory state around the buggy address:
>> >>  ffff880067f3eb80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>> >>  ffff880067f3ec00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>> >>>ffff880067f3ec80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>> >>                                                     ^
>> >>  ffff880067f3ed00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>> >>  ffff880067f3ed80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>> >> ==================================================================
>>
>> Here is another report that looks related:
>>
>> ==================================================================
>> BUG: KASAN: use-after-free in __list_add+0x236/0x2c0
>> Read of size 8 at addr ffff880068854780 by task ksoftirqd/2/20
>>
>> page:ffffea0001a21400 count:1 mapcount:0 mapping:          (null)
>> index:0x0 compound_mapcount: 0
>> flags: 0x500000000004080(slab|head)
>> page dumped because: kasan: bad access detected
>>
>> CPU: 2 PID: 20 Comm: ksoftirqd/2 Not tainted 4.9.0-rc7+ #66
>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
>>  ffff88006daf6578 ffffffff81f882da ffffffff6daf62a0 1ffff1000db5ec42
>>  ffffed000db5ec3a dffffc0000000000 0000000041b58ab3 ffffffff8541e198
>>  ffffffff81f88048 ffff88006dac3610 ffff88006daf6300 0000000000000802
>> Call Trace:
>>  [<     inline     >] __dump_stack lib/dump_stack.c:15
>>  [<ffffffff81f882da>] dump_stack+0x292/0x398 lib/dump_stack.c:51
>>  [<     inline     >] describe_address mm/kasan/report.c:262
>>  [<ffffffff817e50d1>] kasan_report_error+0x121/0x560 mm/kasan/report.c:368
>>  [<     inline     >] kasan_report mm/kasan/report.c:390
>>  [<ffffffff817e560e>] __asan_report_load8_noabort+0x3e/0x40
>> mm/kasan/report.c:411
>>  [<ffffffff8200c166>] __list_add+0x236/0x2c0 lib/list_debug.c:30
>>  [<     inline     >] list_add_tail include/linux/list.h:77
>>  [<ffffffff8131e295>] insert_work+0x175/0x4b0 kernel/workqueue.c:1298
>>  [<ffffffff8131eb52>] __queue_work+0x582/0x11e0 kernel/workqueue.c:1459
>>  [<ffffffff81320c21>] queue_work_on+0x231/0x240 kernel/workqueue.c:1484
>>  [<     inline     >] queue_work include/linux/workqueue.h:474
>>  [<     inline     >] schedule_work include/linux/workqueue.h:532
>>  [<ffffffff8348c8cc>] netlink_sock_destruct+0x23c/0x2d0
>> net/netlink/af_netlink.c:361
>>  [<ffffffff831e9ce1>] __sk_destruct+0xe1/0x6e0 net/core/sock.c:1423
>>  [<ffffffff831f3517>] sk_destruct+0x47/0x80 net/core/sock.c:1453
>>  [<ffffffff831f35a7>] __sk_free+0x57/0x230 net/core/sock.c:1461
>>  [<ffffffff831f37a3>] sk_free+0x23/0x30 net/core/sock.c:1472
>>  [<     inline     >] sock_put include/net/sock.h:1591
>>  [<ffffffff8348ca9c>] deferred_put_nlk_sk+0x2c/0x40 net/netlink/af_netlink.c:671
>>  [<     inline     >] __rcu_reclaim kernel/rcu/rcu.h:118
>>  [<ffffffff8146d42f>] rcu_do_batch.isra.67+0x8ff/0xc50 kernel/rcu/tree.c:2776
>>  [<     inline     >] invoke_rcu_callbacks kernel/rcu/tree.c:3040
>>  [<     inline     >] __rcu_process_callbacks kernel/rcu/tree.c:3007
>>  [<ffffffff8146e097>] rcu_process_callbacks+0x2b7/0xba0 kernel/rcu/tree.c:3024
>>  [<ffffffff84a2d08b>] __do_softirq+0x2fb/0xb63 kernel/softirq.c:284
>>  [<ffffffff812d38c0>] run_ksoftirqd+0x20/0x60 kernel/softirq.c:676
>>  [<ffffffff81350132>] smpboot_thread_fn+0x562/0x860 kernel/smpboot.c:163
>>  [<ffffffff8133ebf3>] kthread+0x323/0x3e0 kernel/kthread.c:209
>>  [<ffffffff84a2a22a>] ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:433
>>
>> The buggy address belongs to the object at ffff880068854170
>>  which belongs to the cache kmalloc-2048 of size 2048
>> The buggy address ffff880068854780 is located 1552 bytes inside
>>  of 2048-byte region [ffff880068854170, ffff880068854970)
>>
>> Freed by task 20:
>>  [<ffffffff81203526>] save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
>>  [<ffffffff817e4173>] save_stack+0x43/0xd0 mm/kasan/kasan.c:495
>>  [<     inline     >] set_track mm/kasan/kasan.c:507
>>  [<ffffffff817e4a53>] kasan_slab_free+0x73/0xc0 mm/kasan/kasan.c:571
>>  [<     inline     >] slab_free_hook mm/slub.c:1352
>>  [<     inline     >] slab_free_freelist_hook mm/slub.c:1374
>>  [<     inline     >] slab_free mm/slub.c:2951
>>  [<ffffffff817e0eb7>] kfree+0xe7/0x2b0 mm/slub.c:3871
>>  [<     inline     >] sk_prot_free net/core/sock.c:1372
>>  [<ffffffff831ea1c7>] __sk_destruct+0x5c7/0x6e0 net/core/sock.c:1445
>>  [<ffffffff831f3517>] sk_destruct+0x47/0x80 net/core/sock.c:1453
>>  [<ffffffff831f35a7>] __sk_free+0x57/0x230 net/core/sock.c:1461
>>  [<ffffffff831f37a3>] sk_free+0x23/0x30 net/core/sock.c:1472
>>  [<     inline     >] sock_put include/net/sock.h:1591
>>  [<ffffffff8348ca9c>] deferred_put_nlk_sk+0x2c/0x40 net/netlink/af_netlink.c:671
>>  [<     inline     >] __rcu_reclaim kernel/rcu/rcu.h:118
>>  [<ffffffff8146d42f>] rcu_do_batch.isra.67+0x8ff/0xc50 kernel/rcu/tree.c:2776
>>  [<     inline     >] invoke_rcu_callbacks kernel/rcu/tree.c:3040
>>  [<     inline     >] __rcu_process_callbacks kernel/rcu/tree.c:3007
>>  [<ffffffff8146e097>] rcu_process_callbacks+0x2b7/0xba0 kernel/rcu/tree.c:3024
>>  [<ffffffff84a2d08b>] __do_softirq+0x2fb/0xb63 kernel/softirq.c:284
>>
>> Allocated by task 9480:
>>  [<ffffffff81203526>] save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
>>  [<ffffffff817e4173>] save_stack+0x43/0xd0 mm/kasan/kasan.c:495
>>  [<     inline     >] set_track mm/kasan/kasan.c:507
>>  [<ffffffff817e43fd>] kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:598
>>  [<ffffffff817e0050>] __kmalloc+0xa0/0x2d0 mm/slub.c:3734
>>  [<     inline     >] kmalloc include/linux/slab.h:495
>>  [<ffffffff831e4c01>] sk_prot_alloc+0x101/0x2a0 net/core/sock.c:1333
>>  [<ffffffff831efd15>] sk_alloc+0x105/0x1000 net/core/sock.c:1389
>>  [<ffffffff8348ad46>] __netlink_create+0x66/0x1d0 net/netlink/af_netlink.c:588
>>  [<ffffffff8348cdab>] netlink_create+0x2fb/0x500 net/netlink/af_netlink.c:647
>>  [<ffffffff831dd1d6>] __sock_create+0x4f6/0x880 net/socket.c:1168
>>  [<     inline     >] sock_create net/socket.c:1208
>>  [<     inline     >] SYSC_socket net/socket.c:1238
>>  [<ffffffff831dd799>] SyS_socket+0xf9/0x230 net/socket.c:1218
>>  [<ffffffff84a29fc1>] entry_SYSCALL_64_fastpath+0x1f/0xc2
>>
>> Memory state around the buggy address:
>>  ffff880068854680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>>  ffff880068854700: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>> >ffff880068854780: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>>                    ^
>>  ffff880068854800: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>>  ffff880068854880: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>> ==================================================================
>
>
> Hi Andrey. Please give us some rest during the week end ;)

Hi Eric,

Sorry, wanted to restart fuzzer on newer kernel and immediately
started getting enormous amount of crashes :)

>
> This looks like the bug I mentioned earlier for which I have a pending
> patch ? Can you try it ?

No, it seems that your patch doesn't help, this is apparently something else.

I've attached a reproducer.

Thanks!

>
> The RCU conversion done by Thomas was quite buggy.
>
> Thanks.
>
>
> diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
> index 602e5ebe9db39ec6c72708628bc48efad9f0e680..c348c4a5ea4ecc05dcc9e2afbc069ab65a1a57fe 100644
> --- a/net/netlink/af_netlink.c
> +++ b/net/netlink/af_netlink.c
> @@ -475,8 +475,8 @@ static struct sock *netlink_lookup(struct net *net, int protocol, u32 portid)
>
>         rcu_read_lock();
>         sk = __netlink_lookup(table, portid, net);
> -       if (sk)
> -               sock_hold(sk);
> +       if (sk && !atomic_inc_not_zero(&sk->sk_refcnt))
> +               sk = NULL;
>         rcu_read_unlock();
>
>         return sk;
> @@ -600,6 +600,7 @@ static int __netlink_create(struct net *net, struct socket *sock,
>         }
>         init_waitqueue_head(&nlk->wait);
>
> +       sock_set_flag(sk, SOCK_RCU_FREE);
>         sk->sk_destruct = netlink_sock_destruct;
>         sk->sk_protocol = protocol;
>         return 0;
> @@ -664,13 +665,6 @@ static int netlink_create(struct net *net, struct socket *sock, int protocol,
>         goto out;
>  }
>
> -static void deferred_put_nlk_sk(struct rcu_head *head)
> -{
> -       struct netlink_sock *nlk = container_of(head, struct netlink_sock, rcu);
> -
> -       sock_put(&nlk->sk);
> -}
> -
>  static int netlink_release(struct socket *sock)
>  {
>         struct sock *sk = sock->sk;
> @@ -743,7 +737,7 @@ static int netlink_release(struct socket *sock)
>         local_bh_disable();
>         sock_prot_inuse_add(sock_net(sk), &netlink_proto, -1);
>         local_bh_enable();
> -       call_rcu(&nlk->rcu, deferred_put_nlk_sk);
> +       sock_put(sk);
>         return 0;
>  }
>
> diff --git a/net/netlink/af_netlink.h b/net/netlink/af_netlink.h
> index 4fdb3831897775547f77c069a8018c0d2a253c8c..988d1a02487e37b7efd4872dd0ab6d230e5a2021 100644
> --- a/net/netlink/af_netlink.h
> +++ b/net/netlink/af_netlink.h
> @@ -33,7 +33,6 @@ struct netlink_sock {
>         struct module           *module;
>
>         struct rhash_head       node;
> -       struct rcu_head         rcu;
>         struct work_struct      work;
>  };
>
>
>
>
> --
> You received this message because you are subscribed to the Google Groups "syzkaller" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to syzkaller+unsubscribe@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

[-- Attachment #2: worker-crash-poc.c --]
[-- Type: text/x-csrc, Size: 8352 bytes --]

// autogenerated by syzkaller (http://github.com/google/syzkaller)

#ifndef __NR_socket
#define __NR_socket 41
#endif
#ifndef __NR_write
#define __NR_write 1
#endif
#ifndef __NR_readv
#define __NR_readv 19
#endif
#ifndef __NR_mmap
#define __NR_mmap 9
#endif

#define _GNU_SOURCE

#include <sys/ioctl.h>
#include <sys/mount.h>
#include <sys/prctl.h>
#include <sys/resource.h>
#include <sys/socket.h>
#include <sys/stat.h>
#include <sys/syscall.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/wait.h>

#include <linux/capability.h>
#include <linux/if.h>
#include <linux/if_tun.h>
#include <linux/sched.h>
#include <net/if_arp.h>

#include <assert.h>
#include <dirent.h>
#include <errno.h>
#include <fcntl.h>
#include <grp.h>
#include <pthread.h>
#include <setjmp.h>
#include <signal.h>
#include <stdarg.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

const int kFailStatus = 67;
const int kErrorStatus = 68;
const int kRetryStatus = 69;

__attribute__((noreturn)) void fail(const char* msg, ...)
{
  int e = errno;
  fflush(stdout);
  va_list args;
  va_start(args, msg);
  vfprintf(stderr, msg, args);
  va_end(args);
  fprintf(stderr, " (errno %d)\n", e);
  exit(kFailStatus);
}

__attribute__((noreturn)) void exitf(const char* msg, ...)
{
  int e = errno;
  fflush(stdout);
  va_list args;
  va_start(args, msg);
  vfprintf(stderr, msg, args);
  va_end(args);
  fprintf(stderr, " (errno %d)\n", e);
  exit(kRetryStatus);
}

static int flag_debug;

void debug(const char* msg, ...)
{
  if (!flag_debug)
    return;
  va_list args;
  va_start(args, msg);
  vfprintf(stdout, msg, args);
  va_end(args);
  fflush(stdout);
}

__thread int skip_segv;
__thread jmp_buf segv_env;

static void segv_handler(int sig, siginfo_t* info, void* uctx)
{
  if (__atomic_load_n(&skip_segv, __ATOMIC_RELAXED))
    _longjmp(segv_env, 1);
  exit(sig);
}

static void install_segv_handler()
{
  struct sigaction sa;
  memset(&sa, 0, sizeof(sa));
  sa.sa_sigaction = segv_handler;
  sa.sa_flags = SA_NODEFER | SA_SIGINFO;
  sigaction(SIGSEGV, &sa, NULL);
  sigaction(SIGBUS, &sa, NULL);
}

#define NONFAILING(...)                                                \
  {                                                                    \
    __atomic_fetch_add(&skip_segv, 1, __ATOMIC_SEQ_CST);               \
    if (_setjmp(segv_env) == 0) {                                      \
      __VA_ARGS__;                                                     \
    }                                                                  \
    __atomic_fetch_sub(&skip_segv, 1, __ATOMIC_SEQ_CST);               \
  }

static uintptr_t execute_syscall(int nr, uintptr_t a0, uintptr_t a1,
                                 uintptr_t a2, uintptr_t a3,
                                 uintptr_t a4, uintptr_t a5,
                                 uintptr_t a6, uintptr_t a7,
                                 uintptr_t a8)
{
  switch (nr) {
  default:
    return syscall(nr, a0, a1, a2, a3, a4, a5);
  }
}

static void setup_main_process(uint64_t pid, bool enable_tun)
{
  struct sigaction sa;
  memset(&sa, 0, sizeof(sa));
  sa.sa_handler = SIG_IGN;
  syscall(SYS_rt_sigaction, 0x20, &sa, NULL, 8);
  syscall(SYS_rt_sigaction, 0x21, &sa, NULL, 8);
  install_segv_handler();

  char tmpdir_template[] = "./syzkaller.XXXXXX";
  char* tmpdir = mkdtemp(tmpdir_template);
  if (!tmpdir)
    fail("failed to mkdtemp");
  if (chmod(tmpdir, 0777))
    fail("failed to chmod");
  if (chdir(tmpdir))
    fail("failed to chdir");
}

static void loop();

static void sandbox_common()
{
  prctl(PR_SET_PDEATHSIG, SIGKILL, 0, 0, 0);
  setpgrp();
  setsid();

  struct rlimit rlim;
  rlim.rlim_cur = rlim.rlim_max = 128 << 20;
  setrlimit(RLIMIT_AS, &rlim);
  rlim.rlim_cur = rlim.rlim_max = 1 << 20;
  setrlimit(RLIMIT_FSIZE, &rlim);
  rlim.rlim_cur = rlim.rlim_max = 1 << 20;
  setrlimit(RLIMIT_STACK, &rlim);
  rlim.rlim_cur = rlim.rlim_max = 0;
  setrlimit(RLIMIT_CORE, &rlim);

  unshare(CLONE_NEWNS);
  unshare(CLONE_NEWIPC);
  unshare(CLONE_IO);
}

static int do_sandbox_none()
{
  int pid = fork();
  if (pid)
    return pid;
  sandbox_common();
  loop();
  exit(1);
}

static void remove_dir(const char* dir)
{
  DIR* dp;
  struct dirent* ep;
  int iter = 0;
retry:
  dp = opendir(dir);
  if (dp == NULL) {
    if (errno == EMFILE) {
      exitf("opendir(%s) failed due to NOFILE, exiting");
    }
    exitf("opendir(%s) failed", dir);
  }
  while ((ep = readdir(dp))) {
    if (strcmp(ep->d_name, ".") == 0 || strcmp(ep->d_name, "..") == 0)
      continue;
    char filename[FILENAME_MAX];
    snprintf(filename, sizeof(filename), "%s/%s", dir, ep->d_name);
    struct stat st;
    if (lstat(filename, &st))
      exitf("lstat(%s) failed", filename);
    if (S_ISDIR(st.st_mode)) {
      remove_dir(filename);
      continue;
    }
    int i;
    for (i = 0;; i++) {
      debug("unlink(%s)\n", filename);
      if (unlink(filename) == 0)
        break;
      if (errno == EROFS) {
        debug("ignoring EROFS\n");
        break;
      }
      if (errno != EBUSY || i > 100)
        exitf("unlink(%s) failed", filename);
      debug("umount(%s)\n", filename);
      if (umount2(filename, MNT_DETACH))
        exitf("umount(%s) failed", filename);
    }
  }
  closedir(dp);
  int i;
  for (i = 0;; i++) {
    debug("rmdir(%s)\n", dir);
    if (rmdir(dir) == 0)
      break;
    if (i < 100) {
      if (errno == EROFS) {
        debug("ignoring EROFS\n");
        break;
      }
      if (errno == EBUSY) {
        debug("umount(%s)\n", dir);
        if (umount2(dir, MNT_DETACH))
          exitf("umount(%s) failed", dir);
        continue;
      }
      if (errno == ENOTEMPTY) {
        if (iter < 100) {
          iter++;
          goto retry;
        }
      }
    }
    exitf("rmdir(%s) failed", dir);
  }
}

static uint64_t current_time_ms()
{
  struct timespec ts;

  if (clock_gettime(CLOCK_MONOTONIC, &ts))
    fail("clock_gettime failed");
  return (uint64_t)ts.tv_sec * 1000 + (uint64_t)ts.tv_nsec / 1000000;
}

static void test();

void loop()
{
  int iter;
  for (iter = 0;; iter++) {
    char cwdbuf[256];
    sprintf(cwdbuf, "./%d", iter);
    if (mkdir(cwdbuf, 0777))
      fail("failed to mkdir");
    int pid = fork();
    if (pid < 0)
      fail("clone failed");
    if (pid == 0) {
      prctl(PR_SET_PDEATHSIG, SIGKILL, 0, 0, 0);
      setpgrp();
      if (chdir(cwdbuf))
        fail("failed to chdir");
      test();
      exit(0);
    }
    int status = 0;
    uint64_t start = current_time_ms();
    for (;;) {
      int res = waitpid(pid, &status, __WALL | WNOHANG);
      int errno0 = errno;
      if (res == pid)
        break;
      usleep(1000);
      if (current_time_ms() - start > 5 * 1000) {
        kill(-pid, SIGKILL);
        kill(pid, SIGKILL);
        waitpid(pid, &status, __WALL);
        break;
      }
    }
    remove_dir(cwdbuf);
  }
}

long r[7];
void* thr(void* arg)
{
  switch ((long)arg) {
  case 0:
    r[0] =
        execute_syscall(__NR_mmap, 0x20000000ul, 0xe43000ul, 0x3ul,
                        0x32ul, 0xfffffffffffffffful, 0x0ul, 0, 0, 0);
    break;
  case 1:
    r[1] = execute_syscall(__NR_socket, 0x10ul, 0x3ul, 0x0ul, 0, 0, 0,
                           0, 0, 0);
    break;
  case 2:
    NONFAILING(memcpy((void*)0x20e42fe1,
                      "\x1f\x00\x00\x00\x1a\x00\x03\xf2\x00\x00\x13\xff"
                      "\x07\x00\x00\x77\x00\x00\xff\xff\xff\x7f\xd8\x00"
                      "\x00\x00\x00\x00\x00\x00\x34",
                      31));
    r[3] = execute_syscall(__NR_write, r[1], 0x20e42fe1ul, 0x1ful, 0, 0,
                           0, 0, 0, 0);
    break;
  case 3:
    NONFAILING(*(uint64_t*)0x20e41000 = (uint64_t)0x20e3ef11);
    NONFAILING(*(uint64_t*)0x20e41008 = (uint64_t)0x1);
    r[6] = execute_syscall(__NR_readv, r[1], 0x20e41000ul, 0x1ul, 0, 0,
                           0, 0, 0, 0);
    break;
  }
  return 0;
}

void test()
{
  long i;
  pthread_t th[8];

  memset(r, -1, sizeof(r));
  srand(getpid());
  for (i = 0; i < 4; i++) {
    pthread_create(&th[i], 0, thr, (void*)i);
    usleep(10000);
  }
  usleep(100000);
}

int main()
{
  setup_main_process(0, false);
  int pid = do_sandbox_none();
  int status = 0;
  while (waitpid(pid, &status, __WALL) != pid) {
  }
  return 0;
}

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: net: use-after-free in worker_thread
  2016-12-03 12:56 net: use-after-free in worker_thread Andrey Konovalov
  2016-12-03 12:58 ` Andrey Konovalov
@ 2016-12-03 17:41 ` Cong Wang
  2016-12-03 18:14   ` Cong Wang
  1 sibling, 1 reply; 13+ messages in thread
From: Cong Wang @ 2016-12-03 17:41 UTC (permalink / raw)
  To: Andrey Konovalov
  Cc: David S. Miller, Johannes Berg, Florian Westphal, Herbert Xu,
	Eric Dumazet, Bob Copeland, Tom Herbert, David Decotigny, netdev,
	LKML

On Sat, Dec 3, 2016 at 4:56 AM, Andrey Konovalov <andreyknvl@google.com> wrote:
> Hi!
>
> I'm seeing lots of the following error reports while running the
> syzkaller fuzzer.
>
> Reports appeared when I updated to 3c49de52 (Dec 2) from 2caceb32 (Dec 1).
>
> ==================================================================
> BUG: KASAN: use-after-free in worker_thread+0x17d8/0x18a0
> Read of size 8 at addr ffff880067f3ecd8 by task kworker/3:1/774
>
> page:ffffea00019fce00 count:1 mapcount:0 mapping:          (null)
> index:0xffff880067f39c10 compound_mapcount: 0
> flags: 0x500000000004080(slab|head)
> page dumped because: kasan: bad access detected
>
> CPU: 3 PID: 774 Comm: kworker/3:1 Not tainted 4.9.0-rc7+ #66
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
>  ffff88006c267838 ffffffff81f882da ffffffff6c25e338 1ffff1000d84ce9a
>  ffffed000d84ce92 ffff88006c25e340 0000000041b58ab3 ffffffff8541e198
>  ffffffff81f88048 0000000100000000 0000000041b58ab3 ffffffff853d3ee8
> Call Trace:
>  [<     inline     >] __dump_stack lib/dump_stack.c:15
>  [<ffffffff81f882da>] dump_stack+0x292/0x398 lib/dump_stack.c:51
>  [<     inline     >] describe_address mm/kasan/report.c:262
>  [<ffffffff817e50d1>] kasan_report_error+0x121/0x560 mm/kasan/report.c:368
>  [<     inline     >] kasan_report mm/kasan/report.c:390
>  [<ffffffff817e560e>] __asan_report_load8_noabort+0x3e/0x40
> mm/kasan/report.c:411
>  [<ffffffff81329b88>] worker_thread+0x17d8/0x18a0 kernel/workqueue.c:2228
>  [<ffffffff8133ebf3>] kthread+0x323/0x3e0 kernel/kthread.c:209
>  [<ffffffff84a2a22a>] ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:433

Heck... this is the pending work vs. sk_destruct() race. :-/
We can't wait for the work in RCU callback, let me think about it...

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: net: use-after-free in worker_thread
  2016-12-03 17:41 ` Cong Wang
@ 2016-12-03 18:14   ` Cong Wang
  2016-12-05  7:19     ` Herbert Xu
  0 siblings, 1 reply; 13+ messages in thread
From: Cong Wang @ 2016-12-03 18:14 UTC (permalink / raw)
  To: Andrey Konovalov
  Cc: David S. Miller, Johannes Berg, Florian Westphal, Herbert Xu,
	Eric Dumazet, Bob Copeland, Tom Herbert, David Decotigny, netdev,
	LKML

[-- Attachment #1: Type: text/plain, Size: 2026 bytes --]

On Sat, Dec 3, 2016 at 9:41 AM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
> On Sat, Dec 3, 2016 at 4:56 AM, Andrey Konovalov <andreyknvl@google.com> wrote:
>> Hi!
>>
>> I'm seeing lots of the following error reports while running the
>> syzkaller fuzzer.
>>
>> Reports appeared when I updated to 3c49de52 (Dec 2) from 2caceb32 (Dec 1).
>>
>> ==================================================================
>> BUG: KASAN: use-after-free in worker_thread+0x17d8/0x18a0
>> Read of size 8 at addr ffff880067f3ecd8 by task kworker/3:1/774
>>
>> page:ffffea00019fce00 count:1 mapcount:0 mapping:          (null)
>> index:0xffff880067f39c10 compound_mapcount: 0
>> flags: 0x500000000004080(slab|head)
>> page dumped because: kasan: bad access detected
>>
>> CPU: 3 PID: 774 Comm: kworker/3:1 Not tainted 4.9.0-rc7+ #66
>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
>>  ffff88006c267838 ffffffff81f882da ffffffff6c25e338 1ffff1000d84ce9a
>>  ffffed000d84ce92 ffff88006c25e340 0000000041b58ab3 ffffffff8541e198
>>  ffffffff81f88048 0000000100000000 0000000041b58ab3 ffffffff853d3ee8
>> Call Trace:
>>  [<     inline     >] __dump_stack lib/dump_stack.c:15
>>  [<ffffffff81f882da>] dump_stack+0x292/0x398 lib/dump_stack.c:51
>>  [<     inline     >] describe_address mm/kasan/report.c:262
>>  [<ffffffff817e50d1>] kasan_report_error+0x121/0x560 mm/kasan/report.c:368
>>  [<     inline     >] kasan_report mm/kasan/report.c:390
>>  [<ffffffff817e560e>] __asan_report_load8_noabort+0x3e/0x40
>> mm/kasan/report.c:411
>>  [<ffffffff81329b88>] worker_thread+0x17d8/0x18a0 kernel/workqueue.c:2228
>>  [<ffffffff8133ebf3>] kthread+0x323/0x3e0 kernel/kthread.c:209
>>  [<ffffffff84a2a22a>] ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:433
>
> Heck... this is the pending work vs. sk_destruct() race. :-/
> We can't wait for the work in RCU callback, let me think about it...

Please try the attached patch, I only did compile test, I can't access
my desktop now, so can't do further tests.

Thanks!

[-- Attachment #2: netlink.diff --]
[-- Type: text/plain, Size: 2005 bytes --]

diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index 602e5eb..6f33013 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -322,11 +322,13 @@ static void netlink_skb_set_owner_r(struct sk_buff *skb, struct sock *sk)
 	sk_mem_charge(sk, skb->truesize);
 }
 
-static void __netlink_sock_destruct(struct sock *sk)
+static void netlink_sock_destruct(struct sock *sk)
 {
 	struct netlink_sock *nlk = nlk_sk(sk);
 
 	if (nlk->cb_running) {
+		if (nlk->cb.done)
+			nlk->cb.done(&nlk->cb);
 		module_put(nlk->cb.module);
 		kfree_skb(nlk->cb.skb);
 	}
@@ -343,28 +345,6 @@ static void __netlink_sock_destruct(struct sock *sk)
 	WARN_ON(nlk_sk(sk)->groups);
 }
 
-static void netlink_sock_destruct_work(struct work_struct *work)
-{
-	struct netlink_sock *nlk = container_of(work, struct netlink_sock,
-						work);
-
-	nlk->cb.done(&nlk->cb);
-	__netlink_sock_destruct(&nlk->sk);
-}
-
-static void netlink_sock_destruct(struct sock *sk)
-{
-	struct netlink_sock *nlk = nlk_sk(sk);
-
-	if (nlk->cb_running && nlk->cb.done) {
-		INIT_WORK(&nlk->work, netlink_sock_destruct_work);
-		schedule_work(&nlk->work);
-		return;
-	}
-
-	__netlink_sock_destruct(sk);
-}
-
 /* This lock without WQ_FLAG_EXCLUSIVE is good on UP and it is _very_ bad on
  * SMP. Look, when several writers sleep and reader wakes them up, all but one
  * immediately hit write lock and grab all the cpus. Exclusive sleep solves
@@ -664,11 +644,19 @@ static int netlink_create(struct net *net, struct socket *sock, int protocol,
 	goto out;
 }
 
+static void netlink_sock_put_work(struct work_struct *work)
+{
+	struct netlink_sock *nlk = container_of(work, struct netlink_sock,
+						work);
+	sock_put(&nlk->sk);
+}
+
 static void deferred_put_nlk_sk(struct rcu_head *head)
 {
 	struct netlink_sock *nlk = container_of(head, struct netlink_sock, rcu);
 
-	sock_put(&nlk->sk);
+	INIT_WORK(&nlk->work, netlink_sock_put_work);
+	schedule_work(&nlk->work);
 }
 
 static int netlink_release(struct socket *sock)

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: net: use-after-free in worker_thread
  2016-12-03 18:14   ` Cong Wang
@ 2016-12-05  7:19     ` Herbert Xu
  2016-12-05  7:26       ` [v2 PATCH] netlink: Do not schedule work from sk_destruct Herbert Xu
  0 siblings, 1 reply; 13+ messages in thread
From: Herbert Xu @ 2016-12-05  7:19 UTC (permalink / raw)
  To: Cong Wang
  Cc: Andrey Konovalov, David S. Miller, Johannes Berg,
	Florian Westphal, Eric Dumazet, Bob Copeland, Tom Herbert,
	David Decotigny, netdev, LKML

On Sat, Dec 03, 2016 at 10:14:48AM -0800, Cong Wang wrote:
> On Sat, Dec 3, 2016 at 9:41 AM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
> > On Sat, Dec 3, 2016 at 4:56 AM, Andrey Konovalov <andreyknvl@google.com> wrote:
> >> Hi!
> >>
> >> I'm seeing lots of the following error reports while running the
> >> syzkaller fuzzer.
> >>
> >> Reports appeared when I updated to 3c49de52 (Dec 2) from 2caceb32 (Dec 1).
> >>
> >> ==================================================================
> >> BUG: KASAN: use-after-free in worker_thread+0x17d8/0x18a0
> >> Read of size 8 at addr ffff880067f3ecd8 by task kworker/3:1/774
> >>
> >> page:ffffea00019fce00 count:1 mapcount:0 mapping:          (null)
> >> index:0xffff880067f39c10 compound_mapcount: 0
> >> flags: 0x500000000004080(slab|head)
> >> page dumped because: kasan: bad access detected
> >>
> >> CPU: 3 PID: 774 Comm: kworker/3:1 Not tainted 4.9.0-rc7+ #66
> >> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> >>  ffff88006c267838 ffffffff81f882da ffffffff6c25e338 1ffff1000d84ce9a
> >>  ffffed000d84ce92 ffff88006c25e340 0000000041b58ab3 ffffffff8541e198
> >>  ffffffff81f88048 0000000100000000 0000000041b58ab3 ffffffff853d3ee8
> >> Call Trace:
> >>  [<     inline     >] __dump_stack lib/dump_stack.c:15
> >>  [<ffffffff81f882da>] dump_stack+0x292/0x398 lib/dump_stack.c:51
> >>  [<     inline     >] describe_address mm/kasan/report.c:262
> >>  [<ffffffff817e50d1>] kasan_report_error+0x121/0x560 mm/kasan/report.c:368
> >>  [<     inline     >] kasan_report mm/kasan/report.c:390
> >>  [<ffffffff817e560e>] __asan_report_load8_noabort+0x3e/0x40
> >> mm/kasan/report.c:411
> >>  [<ffffffff81329b88>] worker_thread+0x17d8/0x18a0 kernel/workqueue.c:2228
> >>  [<ffffffff8133ebf3>] kthread+0x323/0x3e0 kernel/kthread.c:209
> >>  [<ffffffff84a2a22a>] ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:433
> >
> > Heck... this is the pending work vs. sk_destruct() race. :-/
> > We can't wait for the work in RCU callback, let me think about it...

Sorry, my patch was obviously crap as it was trying to delay the
freeing of a socket from sk_destruct which can't be done.

> Please try the attached patch, I only did compile test, I can't access
> my desktop now, so can't do further tests.

Thanks for the patch.  It'll obviously work but I wanted avoid that
because it penalises the common path for the rare case.

Andrey, please try this patch and let me know if it's any better.

---8<---
Subject: netlink: Do not schedule work from sk_destruct

It is wrong to schedule a work from sk_destruct using the socket
as the memory reserve because the socket will be freed immediately
after the return from sk_destruct.

Instead we should do the deferral prior to sk_free.

This patch does just that.

Fixes: 707693c8a498 ("netlink: Call cb->done from a worker thread")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index 602e5eb..8a642c5 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -322,11 +322,13 @@ static void netlink_skb_set_owner_r(struct sk_buff *skb, struct sock *sk)
 	sk_mem_charge(sk, skb->truesize);
 }
 
-static void __netlink_sock_destruct(struct sock *sk)
+static void netlink_sock_destruct(struct sock *sk)
 {
 	struct netlink_sock *nlk = nlk_sk(sk);
 
 	if (nlk->cb_running) {
+		if (nlk->cb.done)
+			nlk->cb.done(&nlk->cb);
 		module_put(nlk->cb.module);
 		kfree_skb(nlk->cb.skb);
 	}
@@ -348,21 +350,7 @@ static void netlink_sock_destruct_work(struct work_struct *work)
 	struct netlink_sock *nlk = container_of(work, struct netlink_sock,
 						work);
 
-	nlk->cb.done(&nlk->cb);
-	__netlink_sock_destruct(&nlk->sk);
-}
-
-static void netlink_sock_destruct(struct sock *sk)
-{
-	struct netlink_sock *nlk = nlk_sk(sk);
-
-	if (nlk->cb_running && nlk->cb.done) {
-		INIT_WORK(&nlk->work, netlink_sock_destruct_work);
-		schedule_work(&nlk->work);
-		return;
-	}
-
-	__netlink_sock_destruct(sk);
+	sk_free(&nlk->sk);
 }
 
 /* This lock without WQ_FLAG_EXCLUSIVE is good on UP and it is _very_ bad on
@@ -664,11 +652,17 @@ static int netlink_create(struct net *net, struct socket *sock, int protocol,
 	goto out;
 }
 
-static void deferred_put_nlk_sk(struct rcu_head *head)
+static void deferred_free_nlk_sk(struct rcu_head *head)
 {
 	struct netlink_sock *nlk = container_of(head, struct netlink_sock, rcu);
 
-	sock_put(&nlk->sk);
+	if (nlk->cb_running && nlk->cb.done) {
+		INIT_WORK(&nlk->work, netlink_sock_destruct_work);
+		schedule_work(&nlk->work);
+		return;
+	}
+
+	sk_free(&nlk->sk);
 }
 
 static int netlink_release(struct socket *sock)
@@ -743,7 +737,9 @@ static int netlink_release(struct socket *sock)
 	local_bh_disable();
 	sock_prot_inuse_add(sock_net(sk), &netlink_proto, -1);
 	local_bh_enable();
-	call_rcu(&nlk->rcu, deferred_put_nlk_sk);
+
+	if (atomic_dec_and_test(&sk->sk_refcnt))
+		call_rcu(&nlk->rcu, deferred_free_nlk_sk);
 	return 0;
 }
 
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: net: use-after-free in worker_thread
  2016-12-03 13:49     ` Eric Dumazet
  2016-12-03 15:39       ` Andrey Konovalov
@ 2016-12-05  7:21       ` Herbert Xu
  1 sibling, 0 replies; 13+ messages in thread
From: Herbert Xu @ 2016-12-05  7:21 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Andrey Konovalov, David S. Miller, Cong Wang, Johannes Berg,
	Florian Westphal, Eric Dumazet, Bob Copeland, Tom Herbert,
	David Decotigny, netdev, LKML, Kostya Serebryany, Dmitry Vyukov,
	syzkaller

On Sat, Dec 03, 2016 at 05:49:07AM -0800, Eric Dumazet wrote:
>
> @@ -600,6 +600,7 @@ static int __netlink_create(struct net *net, struct socket *sock,
>  	}
>  	init_waitqueue_head(&nlk->wait);
>  
> +	sock_set_flag(sk, SOCK_RCU_FREE);
>  	sk->sk_destruct = netlink_sock_destruct;
>  	sk->sk_protocol = protocol;
>  	return 0;

It's not necessarily a big deal but I just wanted to point out
that SOCK_RCU_FREE is not equivalent to the call_rcu thing that
netlink does.  The latter only does the RCU deferral for the socket
release call which is the only place where it's needed while
SOCK_RCU_FREE will force every path to do an RCU deferral.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [v2 PATCH] netlink: Do not schedule work from sk_destruct
  2016-12-05  7:19     ` Herbert Xu
@ 2016-12-05  7:26       ` Herbert Xu
  2016-12-05  7:28         ` [v3 " Herbert Xu
  0 siblings, 1 reply; 13+ messages in thread
From: Herbert Xu @ 2016-12-05  7:26 UTC (permalink / raw)
  To: Cong Wang
  Cc: Andrey Konovalov, David S. Miller, Johannes Berg,
	Florian Westphal, Eric Dumazet, Bob Copeland, Tom Herbert,
	David Decotigny, netdev, LKML

On Mon, Dec 05, 2016 at 03:19:46PM +0800, Herbert Xu wrote:
>
> Thanks for the patch.  It'll obviously work but I wanted avoid that
> because it penalises the common path for the rare case.
> 
> Andrey, please try this patch and let me know if it's any better.
> 
> ---8<---
> Subject: netlink: Do not schedule work from sk_destruct

Crap, I screwed it up again.  Here is a v2 which moves the atomic
call into the RCU callback as otherwise the socket can be freed from
another path while we await the RCU callback.

---8<---
It is wrong to schedule a work from sk_destruct using the socket
as the memory reserve because the socket will be freed immediately
after the return from sk_destruct.

Instead we should do the deferral prior to sk_free.

This patch does just that.

Fixes: 707693c8a498 ("netlink: Call cb->done from a worker thread")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index 602e5eb..463f5cf 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -322,11 +322,13 @@ static void netlink_skb_set_owner_r(struct sk_buff *skb, struct sock *sk)
 	sk_mem_charge(sk, skb->truesize);
 }
 
-static void __netlink_sock_destruct(struct sock *sk)
+static void netlink_sock_destruct(struct sock *sk)
 {
 	struct netlink_sock *nlk = nlk_sk(sk);
 
 	if (nlk->cb_running) {
+		if (nlk->cb.done)
+			nlk->cb.done(&nlk->cb);
 		module_put(nlk->cb.module);
 		kfree_skb(nlk->cb.skb);
 	}
@@ -348,21 +350,7 @@ static void netlink_sock_destruct_work(struct work_struct *work)
 	struct netlink_sock *nlk = container_of(work, struct netlink_sock,
 						work);
 
-	nlk->cb.done(&nlk->cb);
-	__netlink_sock_destruct(&nlk->sk);
-}
-
-static void netlink_sock_destruct(struct sock *sk)
-{
-	struct netlink_sock *nlk = nlk_sk(sk);
-
-	if (nlk->cb_running && nlk->cb.done) {
-		INIT_WORK(&nlk->work, netlink_sock_destruct_work);
-		schedule_work(&nlk->work);
-		return;
-	}
-
-	__netlink_sock_destruct(sk);
+	sk_free(&nlk->sk);
 }
 
 /* This lock without WQ_FLAG_EXCLUSIVE is good on UP and it is _very_ bad on
@@ -664,11 +652,21 @@ static int netlink_create(struct net *net, struct socket *sock, int protocol,
 	goto out;
 }
 
-static void deferred_put_nlk_sk(struct rcu_head *head)
+static void deferred_free_nlk_sk(struct rcu_head *head)
 {
 	struct netlink_sock *nlk = container_of(head, struct netlink_sock, rcu);
+	struct sock *sk = &nlk->sk;
+
+	if (!atomic_dec_and_test(&sk->sk_refcnt))
+		return;
+
+	if (nlk->cb_running && nlk->cb.done) {
+		INIT_WORK(&nlk->work, netlink_sock_destruct_work);
+		schedule_work(&nlk->work);
+		return;
+	}
 
-	sock_put(&nlk->sk);
+	sk_free(sk);
 }
 
 static int netlink_release(struct socket *sock)
@@ -743,7 +741,7 @@ static int netlink_release(struct socket *sock)
 	local_bh_disable();
 	sock_prot_inuse_add(sock_net(sk), &netlink_proto, -1);
 	local_bh_enable();
-	call_rcu(&nlk->rcu, deferred_put_nlk_sk);
+	call_rcu(&nlk->rcu, deferred_free_nlk_sk);
 	return 0;
 }
 
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [v3 PATCH] netlink: Do not schedule work from sk_destruct
  2016-12-05  7:26       ` [v2 PATCH] netlink: Do not schedule work from sk_destruct Herbert Xu
@ 2016-12-05  7:28         ` Herbert Xu
  2016-12-05 11:51           ` Andrey Konovalov
  2016-12-06  0:44           ` David Miller
  0 siblings, 2 replies; 13+ messages in thread
From: Herbert Xu @ 2016-12-05  7:28 UTC (permalink / raw)
  To: Cong Wang
  Cc: Andrey Konovalov, David S. Miller, Johannes Berg,
	Florian Westphal, Eric Dumazet, Bob Copeland, Tom Herbert,
	David Decotigny, netdev, LKML

On Mon, Dec 05, 2016 at 03:26:00PM +0800, Herbert Xu wrote:
> On Mon, Dec 05, 2016 at 03:19:46PM +0800, Herbert Xu wrote:
> >
> > Thanks for the patch.  It'll obviously work but I wanted avoid that
> > because it penalises the common path for the rare case.
> > 
> > Andrey, please try this patch and let me know if it's any better.
> > 
> > ---8<---
> > Subject: netlink: Do not schedule work from sk_destruct
> 
> Crap, I screwed it up again.  Here is a v2 which moves the atomic
> call into the RCU callback as otherwise the socket can be freed from
> another path while we await the RCU callback.

With the move it no longer makes sense to rename deferred_put_nlk_sk
so here is v3 which restores the original name.

---8<---
It is wrong to schedule a work from sk_destruct using the socket
as the memory reserve because the socket will be freed immediately
after the return from sk_destruct.

Instead we should do the deferral prior to sk_free.

This patch does just that.

Fixes: 707693c8a498 ("netlink: Call cb->done from a worker thread")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index 602e5eb..246f29d 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -322,11 +322,13 @@ static void netlink_skb_set_owner_r(struct sk_buff *skb, struct sock *sk)
 	sk_mem_charge(sk, skb->truesize);
 }
 
-static void __netlink_sock_destruct(struct sock *sk)
+static void netlink_sock_destruct(struct sock *sk)
 {
 	struct netlink_sock *nlk = nlk_sk(sk);
 
 	if (nlk->cb_running) {
+		if (nlk->cb.done)
+			nlk->cb.done(&nlk->cb);
 		module_put(nlk->cb.module);
 		kfree_skb(nlk->cb.skb);
 	}
@@ -348,21 +350,7 @@ static void netlink_sock_destruct_work(struct work_struct *work)
 	struct netlink_sock *nlk = container_of(work, struct netlink_sock,
 						work);
 
-	nlk->cb.done(&nlk->cb);
-	__netlink_sock_destruct(&nlk->sk);
-}
-
-static void netlink_sock_destruct(struct sock *sk)
-{
-	struct netlink_sock *nlk = nlk_sk(sk);
-
-	if (nlk->cb_running && nlk->cb.done) {
-		INIT_WORK(&nlk->work, netlink_sock_destruct_work);
-		schedule_work(&nlk->work);
-		return;
-	}
-
-	__netlink_sock_destruct(sk);
+	sk_free(&nlk->sk);
 }
 
 /* This lock without WQ_FLAG_EXCLUSIVE is good on UP and it is _very_ bad on
@@ -667,8 +655,18 @@ static int netlink_create(struct net *net, struct socket *sock, int protocol,
 static void deferred_put_nlk_sk(struct rcu_head *head)
 {
 	struct netlink_sock *nlk = container_of(head, struct netlink_sock, rcu);
+	struct sock *sk = &nlk->sk;
+
+	if (!atomic_dec_and_test(&sk->sk_refcnt))
+		return;
+
+	if (nlk->cb_running && nlk->cb.done) {
+		INIT_WORK(&nlk->work, netlink_sock_destruct_work);
+		schedule_work(&nlk->work);
+		return;
+	}
 
-	sock_put(&nlk->sk);
+	sk_free(sk);
 }
 
 static int netlink_release(struct socket *sock)
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [v3 PATCH] netlink: Do not schedule work from sk_destruct
  2016-12-05  7:28         ` [v3 " Herbert Xu
@ 2016-12-05 11:51           ` Andrey Konovalov
  2016-12-06  0:44           ` David Miller
  1 sibling, 0 replies; 13+ messages in thread
From: Andrey Konovalov @ 2016-12-05 11:51 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Cong Wang, David S. Miller, Johannes Berg, Florian Westphal,
	Eric Dumazet, Bob Copeland, Tom Herbert, David Decotigny, netdev,
	LKML, Dmitry Vyukov, Kostya Serebryany, syzkaller

On Mon, Dec 5, 2016 at 8:28 AM, Herbert Xu <herbert@gondor.apana.org.au> wrote:
> On Mon, Dec 05, 2016 at 03:26:00PM +0800, Herbert Xu wrote:
>> On Mon, Dec 05, 2016 at 03:19:46PM +0800, Herbert Xu wrote:
>> >
>> > Thanks for the patch.  It'll obviously work but I wanted avoid that
>> > because it penalises the common path for the rare case.
>> >
>> > Andrey, please try this patch and let me know if it's any better.
>> >
>> > ---8<---
>> > Subject: netlink: Do not schedule work from sk_destruct
>>
>> Crap, I screwed it up again.  Here is a v2 which moves the atomic
>> call into the RCU callback as otherwise the socket can be freed from
>> another path while we await the RCU callback.
>
> With the move it no longer makes sense to rename deferred_put_nlk_sk
> so here is v3 which restores the original name.
>
> ---8<---
> It is wrong to schedule a work from sk_destruct using the socket
> as the memory reserve because the socket will be freed immediately
> after the return from sk_destruct.
>
> Instead we should do the deferral prior to sk_free.
>
> This patch does just that.
>
> Fixes: 707693c8a498 ("netlink: Call cb->done from a worker thread")
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
>
> diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
> index 602e5eb..246f29d 100644
> --- a/net/netlink/af_netlink.c
> +++ b/net/netlink/af_netlink.c
> @@ -322,11 +322,13 @@ static void netlink_skb_set_owner_r(struct sk_buff *skb, struct sock *sk)
>         sk_mem_charge(sk, skb->truesize);
>  }
>
> -static void __netlink_sock_destruct(struct sock *sk)
> +static void netlink_sock_destruct(struct sock *sk)
>  {
>         struct netlink_sock *nlk = nlk_sk(sk);
>
>         if (nlk->cb_running) {
> +               if (nlk->cb.done)
> +                       nlk->cb.done(&nlk->cb);
>                 module_put(nlk->cb.module);
>                 kfree_skb(nlk->cb.skb);
>         }
> @@ -348,21 +350,7 @@ static void netlink_sock_destruct_work(struct work_struct *work)
>         struct netlink_sock *nlk = container_of(work, struct netlink_sock,
>                                                 work);
>
> -       nlk->cb.done(&nlk->cb);
> -       __netlink_sock_destruct(&nlk->sk);
> -}
> -
> -static void netlink_sock_destruct(struct sock *sk)
> -{
> -       struct netlink_sock *nlk = nlk_sk(sk);
> -
> -       if (nlk->cb_running && nlk->cb.done) {
> -               INIT_WORK(&nlk->work, netlink_sock_destruct_work);
> -               schedule_work(&nlk->work);
> -               return;
> -       }
> -
> -       __netlink_sock_destruct(sk);
> +       sk_free(&nlk->sk);
>  }
>
>  /* This lock without WQ_FLAG_EXCLUSIVE is good on UP and it is _very_ bad on
> @@ -667,8 +655,18 @@ static int netlink_create(struct net *net, struct socket *sock, int protocol,
>  static void deferred_put_nlk_sk(struct rcu_head *head)
>  {
>         struct netlink_sock *nlk = container_of(head, struct netlink_sock, rcu);
> +       struct sock *sk = &nlk->sk;
> +
> +       if (!atomic_dec_and_test(&sk->sk_refcnt))
> +               return;
> +
> +       if (nlk->cb_running && nlk->cb.done) {
> +               INIT_WORK(&nlk->work, netlink_sock_destruct_work);
> +               schedule_work(&nlk->work);
> +               return;
> +       }
>
> -       sock_put(&nlk->sk);
> +       sk_free(sk);
>  }
>
>  static int netlink_release(struct socket *sock)
> --
> Email: Herbert Xu <herbert@gondor.apana.org.au>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

Hi Herbert,

Tested the last version of your patch, the reports go away.

Thanks for the fix!

Tested-by: Andrey Konovalov <andreyknvl@google.com>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [v3 PATCH] netlink: Do not schedule work from sk_destruct
  2016-12-05  7:28         ` [v3 " Herbert Xu
  2016-12-05 11:51           ` Andrey Konovalov
@ 2016-12-06  0:44           ` David Miller
  1 sibling, 0 replies; 13+ messages in thread
From: David Miller @ 2016-12-06  0:44 UTC (permalink / raw)
  To: herbert
  Cc: xiyou.wangcong, andreyknvl, johannes.berg, fw, edumazet, me, tom,
	decot, netdev, linux-kernel

From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Mon, 5 Dec 2016 15:28:21 +0800

> It is wrong to schedule a work from sk_destruct using the socket
> as the memory reserve because the socket will be freed immediately
> after the return from sk_destruct.
> 
> Instead we should do the deferral prior to sk_free.
> 
> This patch does just that.
> 
> Fixes: 707693c8a498 ("netlink: Call cb->done from a worker thread")
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

Applied, thanks Herbert.

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2016-12-06  0:44 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-12-03 12:56 net: use-after-free in worker_thread Andrey Konovalov
2016-12-03 12:58 ` Andrey Konovalov
2016-12-03 13:05   ` Andrey Konovalov
2016-12-03 13:49     ` Eric Dumazet
2016-12-03 15:39       ` Andrey Konovalov
2016-12-05  7:21       ` Herbert Xu
2016-12-03 17:41 ` Cong Wang
2016-12-03 18:14   ` Cong Wang
2016-12-05  7:19     ` Herbert Xu
2016-12-05  7:26       ` [v2 PATCH] netlink: Do not schedule work from sk_destruct Herbert Xu
2016-12-05  7:28         ` [v3 " Herbert Xu
2016-12-05 11:51           ` Andrey Konovalov
2016-12-06  0:44           ` David Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).