* [PATCH net] VSOCK: check sk state before receive
@ 2018-05-27 1:02 Hangbin Liu
2018-05-27 15:29 ` Hangbin Liu
0 siblings, 1 reply; 8+ messages in thread
From: Hangbin Liu @ 2018-05-27 1:02 UTC (permalink / raw)
To: netdev; +Cc: Stefan Hajnoczi, Jorgen Hansen, David S. Miller, Hangbin Liu
Since vmci_transport_recv_dgram_cb() is a callback and accesses the socket
struct without holding the socket lock, it is possible that sk has already
been released by the time we use it. This may cause a NULL pointer
dereference later, while receiving. Here is the call trace:
[ 389.486319] BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
[ 389.494148] PGD 0 P4D 0
[ 389.496687] Oops: 0000 [#1] SMP PTI
[ 389.500170] Modules linked in: vhost_net vmw_vsock_vmci_transport tun vsock vhost vmw_vmci tap iptable_security iptable_raw iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_s
[ 389.510984] Failed to add new resource (handle=0x2:0x2711), error: -22
[ 389.543309] Failed to add new resource (handle=0x2:0x2711), error: -22
[ 389.570936] ttm drm crc32c_intel mptsas scsi_transport_sas serio_raw ata_piix mptscsih libata i2c_core mptbase bnx2 dm_mirror dm_region_hash dm_log dm_mod
[ 389.597899] CPU: 3 PID: 113 Comm: kworker/3:2 Tainted: G I 4.17.0-rc6.latest+ #25
[ 389.606673] Hardware name: Dell Inc. PowerEdge R710/0XDX06, BIOS 6.1.0 10/18/2011
[ 389.614158] Workqueue: events dg_delayed_dispatch [vmw_vmci]
[ 389.619820] RIP: 0010:selinux_socket_sock_rcv_skb+0x46/0x270
[ 389.625475] RSP: 0018:ffffbcb5416b7ce0 EFLAGS: 00010293
[ 389.630698] RAX: 0000000000000000 RBX: 0000000000000028 RCX: 0000000000000007
[ 389.637825] RDX: 0000000000000000 RSI: ffff94a29feec500 RDI: ffffbcb5416b7d18
[ 389.644953] RBP: ffff94a29bd9a640 R08: 0000000000000001 R09: ffff94a187c03080
[ 389.652080] R10: ffffbcb5416b7d80 R11: 0000000000000000 R12: ffffbcb5416b7d18
[ 389.659206] R13: ffff94a29feec500 R14: ffff94a2afda5e00 R15: 0ffff94a2afda5e0
[ 389.666336] FS: 0000000000000000(0000) GS:ffff94a2afd80000(0000) knlGS:0000000000000000
[ 389.674419] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 389.680160] CR2: 0000000000000010 CR3: 000000004320a003 CR4: 00000000000206e0
[ 389.687283] Call Trace:
[ 389.689738] ? __alloc_skb+0xa0/0x230
[ 389.693407] security_sock_rcv_skb+0x32/0x60
[ 389.697679] ? __alloc_skb+0xa0/0x230
[ 389.701343] sk_filter_trim_cap+0x4e/0x1f0
[ 389.705442] __sk_receive_skb+0x32/0x290
[ 389.709372] vmci_transport_recv_dgram_cb+0xa7/0xd0 [vmw_vsock_vmci_transport]
[ 389.716593] dg_delayed_dispatch+0x22/0x50 [vmw_vmci]
[ 389.721648] process_one_work+0x1f2/0x4a0
[ 389.725662] worker_thread+0x38/0x4c0
[ 389.729329] ? process_one_work+0x4a0/0x4a0
[ 389.733512] kthread+0x12f/0x150
[ 389.736743] ? kthread_create_worker_on_cpu+0x90/0x90
[ 389.741796] ret_from_fork+0x35/0x40
[ 389.745370] Code: 8b 04 25 28 00 00 00 48 89 44 24 70 31 c0 e8 42 15 db ff 0f b7 5d 10 48 8b 85 70 02 00 00 4c 8d 64 24 38 b9 07 00 00 00 4c 89 e7 <44> 8b 70 10 31 c0 41 89 df 41 83 e7 f7
[ 389.764342] RIP: selinux_socket_sock_rcv_skb+0x46/0x270 RSP: ffffbcb5416b7ce0
[ 389.771467] CR2: 0000000000000010
[ 389.774784] ---[ end trace e83d65291a15ae6a ]---
Fix this by checking the sk state before using it.
Fixes: d021c344051a ("VSOCK: Introduce VM Sockets")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
---
net/vmw_vsock/vmci_transport.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/net/vmw_vsock/vmci_transport.c b/net/vmw_vsock/vmci_transport.c
index a7a73ff..0d26040 100644
--- a/net/vmw_vsock/vmci_transport.c
+++ b/net/vmw_vsock/vmci_transport.c
@@ -612,6 +612,13 @@ static int vmci_transport_recv_dgram_cb(void *data, struct vmci_datagram *dg)
if (!vmci_transport_allow_dgram(vsk, dg->src.context))
return VMCI_ERROR_NO_ACCESS;
+ bh_lock_sock(sk);
+ if (sk->sk_state == TCP_CLOSE) {
+ bh_unlock_sock(sk);
+ return VMCI_ERROR_DATAGRAM_FAILED;
+ }
+ bh_unlock_sock(sk);
+
size = VMCI_DG_SIZE(dg);
/* Attach the packet to the socket's receive queue as an sk_buff. */
--
1.8.3.1
* Re: [PATCH net] VSOCK: check sk state before receive
2018-05-27 1:02 [PATCH net] VSOCK: check sk state before receive Hangbin Liu
@ 2018-05-27 15:29 ` Hangbin Liu
2018-05-30 9:17 ` Stefan Hajnoczi
From: Hangbin Liu @ 2018-05-27 15:29 UTC (permalink / raw)
To: netdev; +Cc: Stefan Hajnoczi, Jorgen Hansen, David S. Miller
Hmm... although I can no longer reproduce this bug with my reproducer after
applying my patch, I can still trigger a similar issue with the syzkaller sock vnet test.
It looks like this patch is not complete. Here is the KASAN call trace with my patch applied;
I can also reproduce it without my patch.
==================================================================
BUG: KASAN: use-after-free in vmci_transport_allow_dgram.part.7+0x155/0x1a0 [vmw_vsock_vmci_transport]
Read of size 4 at addr ffff880026a3a914 by task kworker/0:2/96
CPU: 0 PID: 96 Comm: kworker/0:2 Not tainted 4.17.0-rc6.vsock+ #28
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
Workqueue: events dg_delayed_dispatch [vmw_vmci]
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0xdd/0x18e lib/dump_stack.c:113
print_address_description+0x7a/0x3e0 mm/kasan/report.c:256
kasan_report_error mm/kasan/report.c:354 [inline]
kasan_report+0x1dd/0x460 mm/kasan/report.c:412
vmci_transport_allow_dgram.part.7+0x155/0x1a0 [vmw_vsock_vmci_transport]
vmci_transport_recv_dgram_cb+0x5d/0x200 [vmw_vsock_vmci_transport]
dg_delayed_dispatch+0x99/0x1b0 [vmw_vmci]
process_one_work+0xa4e/0x1720 kernel/workqueue.c:2145
worker_thread+0x1df/0x1400 kernel/workqueue.c:2279
kthread+0x343/0x4b0 kernel/kthread.c:240
ret_from_fork+0x35/0x40 arch/x86/entry/entry_64.S:412
Allocated by task 2684:
set_track mm/kasan/kasan.c:460 [inline]
kasan_kmalloc+0xa0/0xd0 mm/kasan/kasan.c:553
slab_post_alloc_hook mm/slab.h:444 [inline]
slab_alloc_node mm/slub.c:2741 [inline]
slab_alloc mm/slub.c:2749 [inline]
kmem_cache_alloc+0x105/0x330 mm/slub.c:2754
sk_prot_alloc+0x6a/0x2c0 net/core/sock.c:1468
sk_alloc+0xc9/0xbb0 net/core/sock.c:1528
__vsock_create+0xc8/0x9b0 [vsock]
vsock_create+0xfd/0x1a0 [vsock]
__sock_create+0x310/0x690 net/socket.c:1285
sock_create net/socket.c:1325 [inline]
__sys_socket+0x101/0x240 net/socket.c:1355
__do_sys_socket net/socket.c:1364 [inline]
__se_sys_socket net/socket.c:1362 [inline]
__x64_sys_socket+0x7d/0xd0 net/socket.c:1362
do_syscall_64+0x175/0x630 arch/x86/entry/common.c:287
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Freed by task 2684:
set_track mm/kasan/kasan.c:460 [inline]
__kasan_slab_free+0x130/0x180 mm/kasan/kasan.c:521
slab_free_hook mm/slub.c:1388 [inline]
slab_free_freelist_hook mm/slub.c:1415 [inline]
slab_free mm/slub.c:2988 [inline]
kmem_cache_free+0xce/0x410 mm/slub.c:3004
sk_prot_free net/core/sock.c:1509 [inline]
__sk_destruct+0x629/0x940 net/core/sock.c:1593
sk_destruct+0x4e/0x90 net/core/sock.c:1601
__sk_free+0xd3/0x320 net/core/sock.c:1612
sk_free+0x2a/0x30 net/core/sock.c:1623
__vsock_release+0x431/0x610 [vsock]
vsock_release+0x3c/0xc0 [vsock]
sock_release+0x91/0x200 net/socket.c:594
sock_close+0x17/0x20 net/socket.c:1149
__fput+0x368/0xa20 fs/file_table.c:209
task_work_run+0x1c5/0x2a0 kernel/task_work.c:113
exit_task_work include/linux/task_work.h:22 [inline]
do_exit+0x1876/0x26c0 kernel/exit.c:865
do_group_exit+0x159/0x3e0 kernel/exit.c:968
get_signal+0x65a/0x1780 kernel/signal.c:2482
do_signal+0xa4/0x1fe0 arch/x86/kernel/signal.c:810
exit_to_usermode_loop+0x1b8/0x260 arch/x86/entry/common.c:162
prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline]
syscall_return_slowpath arch/x86/entry/common.c:265 [inline]
do_syscall_64+0x505/0x630 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x44/0xa9
The buggy address belongs to the object at ffff880026a3a600
which belongs to the cache AF_VSOCK of size 1056
The buggy address is located 788 bytes inside of
1056-byte region [ffff880026a3a600, ffff880026a3aa20)
The buggy address belongs to the page:
page:ffffea00009a8e00 count:1 mapcount:0 mapping:0000000000000000 index:0x0 compound_mapcount: 0
flags: 0xfffffc0008100(slab|head)
raw: 000fffffc0008100 0000000000000000 0000000000000000 00000001000d000d
raw: dead000000000100 dead000000000200 ffff880034471a40 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff880026a3a800: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff880026a3a880: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff880026a3a900: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff880026a3a980: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff880026a3aa00: fb fb fb fb fc fc fc fc fc fc fc fc fc fc fc fc
==================================================================
* Re: [PATCH net] VSOCK: check sk state before receive
2018-05-27 15:29 ` Hangbin Liu
@ 2018-05-30 9:17 ` Stefan Hajnoczi
2018-06-04 16:02 ` Jorgen S. Hansen
From: Stefan Hajnoczi @ 2018-05-30 9:17 UTC (permalink / raw)
To: Hangbin Liu; +Cc: netdev, Jorgen Hansen, David S. Miller
On Sun, May 27, 2018 at 11:29:45PM +0800, Hangbin Liu wrote:
> Hmm... although I can no longer reproduce this bug with my reproducer after
> applying my patch, I can still trigger a similar issue with the syzkaller sock vnet test.
>
> It looks like this patch is not complete. Here is the KASAN call trace with my patch applied;
> I can also reproduce it without my patch.
Seems like a race between vmci_datagram_destroy_handle() and the
delayed callback, vmci_transport_recv_dgram_cb().
I don't know the VMCI transport well so I'll leave this to Jorgen.
> [...]
* Re: [PATCH net] VSOCK: check sk state before receive
2018-05-30 9:17 ` Stefan Hajnoczi
@ 2018-06-04 16:02 ` Jorgen S. Hansen
2018-06-13 1:44 ` Hangbin Liu
From: Jorgen S. Hansen @ 2018-06-04 16:02 UTC (permalink / raw)
To: Stefan Hajnoczi; +Cc: Hangbin Liu, netdev, David S. Miller
> On May 30, 2018, at 11:17 AM, Stefan Hajnoczi <stefanha@redhat.com> wrote:
>
> On Sun, May 27, 2018 at 11:29:45PM +0800, Hangbin Liu wrote:
>> Hmm... although I can no longer reproduce this bug with my reproducer after
>> applying my patch, I can still trigger a similar issue with the syzkaller sock vnet test.
>>
>> It looks like this patch is not complete. Here is the KASAN call trace with my patch applied;
>> I can also reproduce it without my patch.
>
> Seems like a race between vmci_datagram_destroy_handle() and the
> delayed callback, vmci_transport_recv_dgram_cb().
>
> I don't know the VMCI transport well so I'll leave this to Jorgen.
Yes, it looks like we are calling the delayed callback after we return from vmci_datagram_destroy_handle(). I’ll take a closer look at the VMCI side here. The refcounting of VMCI datagram endpoints should guard against this, since the delayed callback does a get on the datagram resource, so this could be a VMCI driver issue rather than a problem in the VMCI transport for AF_VSOCK.
>> [...]
* Re: [PATCH net] VSOCK: check sk state before receive
2018-06-04 16:02 ` Jorgen S. Hansen
@ 2018-06-13 1:44 ` Hangbin Liu
2018-09-21 7:48 ` Jorgen S. Hansen
From: Hangbin Liu @ 2018-06-13 1:44 UTC (permalink / raw)
To: Jorgen S. Hansen; +Cc: Stefan Hajnoczi, netdev, David S. Miller
On Mon, Jun 04, 2018 at 04:02:39PM +0000, Jorgen S. Hansen wrote:
>
> > On May 30, 2018, at 11:17 AM, Stefan Hajnoczi <stefanha@redhat.com> wrote:
> >
> > On Sun, May 27, 2018 at 11:29:45PM +0800, Hangbin Liu wrote:
> >> Hmm... although I can no longer reproduce this bug with my reproducer after
> >> applying my patch, I can still trigger a similar issue with the syzkaller sock vnet test.
> >>
> >> It looks like this patch is not complete. Here is the KASAN call trace with my patch applied;
> >> I can also reproduce it without my patch.
> >
> > Seems like a race between vmci_datagram_destroy_handle() and the
> > delayed callback, vmci_transport_recv_dgram_cb().
> >
> > I don't know the VMCI transport well so I'll leave this to Jorgen.
>
> Yes, it looks like we are calling the delayed callback after we return from vmci_datagram_destroy_handle(). I’ll take a closer look at the VMCI side here. The refcounting of VMCI datagram endpoints should guard against this, since the delayed callback does a get on the datagram resource, so this could be a VMCI driver issue rather than a problem in the VMCI transport for AF_VSOCK.
Hi Jorgen,
Thanks for helping look at this. I'm happy to run tests for your patch.
Thanks
Hangbin
* Re: [PATCH net] VSOCK: check sk state before receive
2018-06-13 1:44 ` Hangbin Liu
@ 2018-09-21 7:48 ` Jorgen S. Hansen
2018-09-22 6:27 ` Hangbin Liu
From: Jorgen S. Hansen @ 2018-09-21 7:48 UTC (permalink / raw)
To: Hangbin Liu; +Cc: Stefan Hajnoczi, netdev, David S. Miller
Hi Hangbin,
I finally got to the bottom of this - the issue was indeed in the VMCI driver. The patch is posted here:
https://lkml.org/lkml/2018/9/21/326
I used your reproduce.log to test the fix. Thanks for discovering this issue.
Thanks,
Jørgen
* Re: [PATCH net] VSOCK: check sk state before receive
2018-09-21 7:48 ` Jorgen S. Hansen
@ 2018-09-22 6:27 ` Hangbin Liu
2018-09-24 7:00 ` Jorgen S. Hansen
From: Hangbin Liu @ 2018-09-22 6:27 UTC (permalink / raw)
To: Jorgen S. Hansen; +Cc: Stefan Hajnoczi, netdev, David S. Miller
On Fri, Sep 21, 2018 at 07:48:25AM +0000, Jorgen S. Hansen wrote:
> Hi Hangbin,
>
> I finally got to the bottom of this - the issue was indeed in the VMCI driver. The patch is posted here:
>
> https://lkml.org/lkml/2018/9/21/326
>
> I used your reproduce.log to test the fix. Thanks for discovering this issue.
Hi Jorgen,
Thanks for your patch. I built a test kernel with your fix and ran my
reproducer and the syzkaller socket vnet test for a while. There was no such
error, so I think your patch has fixed this issue.
BTW, with FAULT_INJECTION enabled, I got another call trace:
[ 251.166377] FAULT_INJECTION: forcing a failure.
[ 251.178736] CPU: 15 PID: 10448 Comm: syz-executor7 Not tainted 4.19.0-rc4.syz.vnet+ #3
[ 251.187577] Hardware name: Dell Inc. PowerEdge R730/0WCJNT, BIOS 2.1.5 04/11/2016
[ 251.187578] Call Trace:
[ 251.187586] dump_stack+0x8c/0xce
[ 251.187594] should_fail+0x5dd/0x6b0
[ 251.199932] ? fault_create_debugfs_attr+0x1d0/0x1d0
[ 251.199937] __should_failslab+0xe8/0x120
[ 251.199945] should_failslab+0xa/0x20
[ 251.228430] kmem_cache_alloc_trace+0x43/0x1f0
[ 251.233392] ? vhost_dev_set_owner+0x366/0x790 [vhost]
[ 251.239129] vhost_dev_set_owner+0x366/0x790 [vhost]
[ 251.244672] ? vhost_poll_wakeup+0xa0/0xa0 [vhost]
[ 251.250018] ? kasan_unpoison_shadow+0x30/0x40
[ 251.254978] ? vhost_worker+0x370/0x370 [vhost]
[ 251.260035] ? kasan_kmalloc_large+0x71/0xe0
[ 251.264799] ? kmalloc_order+0x54/0x60
[ 251.268985] vhost_net_ioctl+0xc2e/0x14c0 [vhost_net]
[ 251.274635] ? avc_ss_reset+0x150/0x150
[ 251.278915] ? kstrtouint_from_user+0xe5/0x140
[ 251.283876] ? handle_tx_kick+0x40/0x40 [vhost_net]
[ 251.289320] ? save_stack+0x89/0xb0
[ 251.293213] ? __kasan_slab_free+0x12e/0x180
[ 251.297979] ? kmem_cache_free+0x7a/0x210
[ 251.302452] ? putname+0xe2/0x120
[ 251.306151] ? get_pid_task+0x6e/0x90
[ 251.310238] ? proc_fail_nth_write+0x91/0x1c0
[ 251.315100] ? map_files_get_link+0x3c0/0x3c0
[ 251.319963] ? exit_robust_list+0x1c0/0x1c0
[ 251.324633] ? __vfs_write+0xf7/0x6a0
[ 251.328711] ? handle_tx_kick+0x40/0x40 [vhost_net]
[ 251.334154] do_vfs_ioctl+0x1a5/0xfb0
[ 251.338241] ? ioctl_preallocate+0x1c0/0x1c0
[ 251.343009] ? selinux_file_ioctl+0x382/0x560
[ 251.347872] ? selinux_capable+0x40/0x40
[ 251.352250] ? __fget+0x211/0x2e0
[ 251.355949] ? iterate_fd+0x1c0/0x1c0
[ 251.360038] ? syscall_trace_enter+0x285/0xaa0
[ 251.365011] ? security_file_ioctl+0x5d/0xb0
[ 251.369776] ? selinux_capable+0x40/0x40
[ 251.374153] ksys_ioctl+0x89/0xa0
[ 251.377853] __x64_sys_ioctl+0x74/0xb0
[ 251.382036] do_syscall_64+0xc3/0x390
[ 251.386123] ? syscall_return_slowpath+0x14c/0x230
[ 251.391473] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 251.397111] RIP: 0033:0x451b89
[ 251.400519] Code: fc ff 48 81 c4 80 00 00 00 e9 f1 fe ff ff 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 0b 67 fc ff c3 66 2e 0f 1f 84 00 00 00 00
[ 251.421476] RSP: 002b:00007fc0d9673c48 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 251.429927] RAX: ffffffffffffffda RBX: 00007fc0d96746b4 RCX: 0000000000451b89
[ 251.437889] RDX: 0000000000000000 RSI: 000000000000af01 RDI: 0000000000000003
[ 251.445852] RBP: 000000000073bf00 R08: 0000000000000000 R09: 0000000000000000
[ 251.453815] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000004
[ 251.461778] R13: 0000000000006450 R14: 00000000004d3090 R15: 00007fc0d9674700
Thanks
Hangbin
* Re: [PATCH net] VSOCK: check sk state before receive
2018-09-22 6:27 ` Hangbin Liu
@ 2018-09-24 7:00 ` Jorgen S. Hansen
From: Jorgen S. Hansen @ 2018-09-24 7:00 UTC (permalink / raw)
To: Hangbin Liu; +Cc: Stefan Hajnoczi, netdev, David S. Miller
On Sep 22, 2018, at 8:27 AM, Hangbin Liu <liuhangbin@gmail.com> wrote:
>
> On Fri, Sep 21, 2018 at 07:48:25AM +0000, Jorgen S. Hansen wrote:
>> Hi Hangbin,
>>
>> I finally got to the bottom of this - the issue was indeed in the VMCI driver. The patch is posted here:
>>
>> https://lkml.org/lkml/2018/9/21/326
>>
>> I used your reproduce.log to test the fix. Thanks for discovering this issue.
>
> Hi Jorgen,
>
> Thanks for your patch. I built a test kernel with your fix and ran my
> reproducer and the syzkaller socket vnet test for a while. There was no such
> error, so I think your patch has fixed this issue.
Great. Thanks a lot for trying out the patch.
> BTW, with FAULT_INJECTION enabled, I got another call trace:
The vhost_* stuff is for Virtio. Stefan would know better what is going on there.
> [...]
Thanks,
Jorgen
^ permalink raw reply [flat|nested] 8+ messages in thread