linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* general protection fault in can_rx_register
@ 2020-01-17 13:46 syzbot
  2020-01-17 20:02 ` Oliver Hartkopp
  0 siblings, 1 reply; 11+ messages in thread
From: syzbot @ 2020-01-17 13:46 UTC (permalink / raw)
  To: davem, dev.kurt, linux-can, linux-kernel, mkl, netdev, o.rempel,
	socketcan, syzkaller-bugs

Hello,

syzbot found the following crash on:

HEAD commit:    f5ae2ea6 Fix built-in early-load Intel microcode alignment
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=1033df15e00000
kernel config:  https://syzkaller.appspot.com/x/.config?x=cfbb8fa33f49f9f3
dashboard link: https://syzkaller.appspot.com/bug?extid=c3ea30e1e2485573f953
compiler:       clang version 10.0.0 (https://github.com/llvm/llvm-project/  
c2443155a0fb245c8f17f2c1c72b6ea391e86e81)
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=13204f15e00000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=138f5db9e00000

The bug was bisected to:

commit 9868b5d44f3df9dd75247acd23dddff0a42f79be
Author: Kurt Van Dijck <dev.kurt@vandijck-laurijssen.be>
Date:   Mon Oct 8 09:48:33 2018 +0000

     can: introduce CAN_REQUIRED_SIZE macro

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=129bfdb9e00000
final crash:    https://syzkaller.appspot.com/x/report.txt?x=119bfdb9e00000
console output: https://syzkaller.appspot.com/x/log.txt?x=169bfdb9e00000

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+c3ea30e1e2485573f953@syzkaller.appspotmail.com
Fixes: 9868b5d44f3d ("can: introduce CAN_REQUIRED_SIZE macro")

kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] PREEMPT SMP KASAN
CPU: 0 PID: 9593 Comm: syz-executor302 Not tainted 5.5.0-rc6-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS  
Google 01/01/2011
RIP: 0010:hlist_add_head_rcu include/linux/rculist.h:528 [inline]
RIP: 0010:can_rx_register+0x43b/0x600 net/can/af_can.c:476
Code: 48 89 d8 48 c1 e8 03 42 80 3c 28 00 74 08 48 89 df e8 89 22 8a fa 4c  
89 33 4d 89 e5 49 c1 ed 03 48 b8 00 00 00 00 00 fc ff df <41> 80 7c 05 00  
00 74 08 4c 89 e7 e8 c5 21 8a fa 4d 8b 34 24 4c 89
RSP: 0018:ffffc90003e27d00 EFLAGS: 00010202
RAX: dffffc0000000000 RBX: ffff8880a77336c8 RCX: ffff88809306a100
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8880a77336c0
RBP: ffffc90003e27d58 R08: ffffffff87289cd6 R09: fffff520007c4f94
R10: fffff520007c4f94 R11: 0000000000000000 R12: 0000000000000008
R13: 0000000000000001 R14: ffff88809fbcf000 R15: ffff8880a7733690
FS:  00007fb132f26700(0000) GS:ffff8880aec00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000000178f590 CR3: 00000000996d6000 CR4: 00000000001406f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
  raw_enable_filters net/can/raw.c:189 [inline]
  raw_enable_allfilters net/can/raw.c:255 [inline]
  raw_bind+0x326/0x1230 net/can/raw.c:428
  __sys_bind+0x2bd/0x3a0 net/socket.c:1649
  __do_sys_bind net/socket.c:1660 [inline]
  __se_sys_bind net/socket.c:1658 [inline]
  __x64_sys_bind+0x7a/0x90 net/socket.c:1658
  do_syscall_64+0xf7/0x1c0 arch/x86/entry/common.c:294
  entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x446ba9
Code: e8 0c e8 ff ff 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7  
48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff  
ff 0f 83 5b 07 fc ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007fb132f25d98 EFLAGS: 00000246 ORIG_RAX: 0000000000000031
RAX: ffffffffffffffda RBX: 00000000006dbc88 RCX: 0000000000446ba9
RDX: 0000000000000008 RSI: 0000000020000180 RDI: 0000000000000003
RBP: 00000000006dbc80 R08: 00007fb132f26700 R09: 0000000000000000
R10: 00007fb132f26700 R11: 0000000000000246 R12: 00000000006dbc8c
R13: 0000000000000000 R14: 0000000000000000 R15: 068500100000003c
Modules linked in:
---[ end trace 0dedabb13ca8e7d7 ]---
RIP: 0010:hlist_add_head_rcu include/linux/rculist.h:528 [inline]
RIP: 0010:can_rx_register+0x43b/0x600 net/can/af_can.c:476
Code: 48 89 d8 48 c1 e8 03 42 80 3c 28 00 74 08 48 89 df e8 89 22 8a fa 4c  
89 33 4d 89 e5 49 c1 ed 03 48 b8 00 00 00 00 00 fc ff df <41> 80 7c 05 00  
00 74 08 4c 89 e7 e8 c5 21 8a fa 4d 8b 34 24 4c 89
RSP: 0018:ffffc90003e27d00 EFLAGS: 00010202
RAX: dffffc0000000000 RBX: ffff8880a77336c8 RCX: ffff88809306a100
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8880a77336c0
RBP: ffffc90003e27d58 R08: ffffffff87289cd6 R09: fffff520007c4f94
R10: fffff520007c4f94 R11: 0000000000000000 R12: 0000000000000008
R13: 0000000000000001 R14: ffff88809fbcf000 R15: ffff8880a7733690
FS:  00007fb132f26700(0000) GS:ffff8880aec00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000000178f590 CR3: 00000000996d6000 CR4: 00000000001406f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
For information about bisection process see: https://goo.gl/tpsmEJ#bisection
syzbot can test patches for this bug, for details see:
https://goo.gl/tpsmEJ#testing-patches

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: general protection fault in can_rx_register
  2020-01-17 13:46 general protection fault in can_rx_register syzbot
@ 2020-01-17 20:02 ` Oliver Hartkopp
  2020-01-20  9:11   ` Kurt Van Dijck
  0 siblings, 1 reply; 11+ messages in thread
From: Oliver Hartkopp @ 2020-01-17 20:02 UTC (permalink / raw)
  To: dev.kurt, mkl, o.rempel
  Cc: syzbot, davem, linux-can, linux-kernel, netdev, syzkaller-bugs

Hi Marc, Oleksij, Kurt,

On 17/01/2020 14.46, syzbot wrote:
> Hello,
> 
> syzbot found the following crash on:
> 
> HEAD commit:    f5ae2ea6 Fix built-in early-load Intel microcode alignment
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=1033df15e00000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=cfbb8fa33f49f9f3
> dashboard link: 
> https://syzkaller.appspot.com/bug?extid=c3ea30e1e2485573f953
> compiler:       clang version 10.0.0 
> (https://github.com/llvm/llvm-project/ 
> c2443155a0fb245c8f17f2c1c72b6ea391e86e81)
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=13204f15e00000
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=138f5db9e00000
> 
> The bug was bisected to:
> 
> commit 9868b5d44f3df9dd75247acd23dddff0a42f79be
> Author: Kurt Van Dijck <dev.kurt@vandijck-laurijssen.be>
> Date:   Mon Oct 8 09:48:33 2018 +0000
> 
>      can: introduce CAN_REQUIRED_SIZE macro
> 
> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=129bfdb9e00000
> final crash:    https://syzkaller.appspot.com/x/report.txt?x=119bfdb9e00000
> console output: https://syzkaller.appspot.com/x/log.txt?x=169bfdb9e00000
> 
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+c3ea30e1e2485573f953@syzkaller.appspotmail.com
> Fixes: 9868b5d44f3d ("can: introduce CAN_REQUIRED_SIZE macro")
> 
> kasan: CONFIG_KASAN_INLINE enabled
> kasan: GPF could be caused by NULL-ptr deref or user memory access
> general protection fault: 0000 [#1] PREEMPT SMP KASAN
> CPU: 0 PID: 9593 Comm: syz-executor302 Not tainted 5.5.0-rc6-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS 
> Google 01/01/2011
> RIP: 0010:hlist_add_head_rcu include/linux/rculist.h:528 [inline]
> RIP: 0010:can_rx_register+0x43b/0x600 net/can/af_can.c:476

include/linux/rculist.h:528 is

struct hlist_node *first = h->first;

which would mean that 'h' must be NULL.

But the h parameter is rcv_list from
rcv_list = can_rcv_list_find(&can_id, &mask, dev_rcv_lists);

Which can not return NULL - at least when dev_rcv_lists is a proper 
pointer to the dev_rcv_lists provided by can_dev_rcv_lists_find().

So either dev->ml_priv is NULL in the case of having a CAN interface 
(here vxcan) or we have not allocated net->can.rx_alldev_list in 
can_pernet_init() properly (which would lead to an -ENOMEM which is 
reported to whom?).

Hm. I'm lost. Any ideas?

Regards,
Oliver


> Code: 48 89 d8 48 c1 e8 03 42 80 3c 28 00 74 08 48 89 df e8 89 22 8a fa 
> 4c 89 33 4d 89 e5 49 c1 ed 03 48 b8 00 00 00 00 00 fc ff df <41> 80 7c 
> 05 00 00 74 08 4c 89 e7 e8 c5 21 8a fa 4d 8b 34 24 4c 89
> RSP: 0018:ffffc90003e27d00 EFLAGS: 00010202
> RAX: dffffc0000000000 RBX: ffff8880a77336c8 RCX: ffff88809306a100
> RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8880a77336c0
> RBP: ffffc90003e27d58 R08: ffffffff87289cd6 R09: fffff520007c4f94
> R10: fffff520007c4f94 R11: 0000000000000000 R12: 0000000000000008
> R13: 0000000000000001 R14: ffff88809fbcf000 R15: ffff8880a7733690
> FS:  00007fb132f26700(0000) GS:ffff8880aec00000(0000) 
> knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 000000000178f590 CR3: 00000000996d6000 CR4: 00000000001406f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
>   raw_enable_filters net/can/raw.c:189 [inline]
>   raw_enable_allfilters net/can/raw.c:255 [inline]
>   raw_bind+0x326/0x1230 net/can/raw.c:428
>   __sys_bind+0x2bd/0x3a0 net/socket.c:1649
>   __do_sys_bind net/socket.c:1660 [inline]
>   __se_sys_bind net/socket.c:1658 [inline]
>   __x64_sys_bind+0x7a/0x90 net/socket.c:1658
>   do_syscall_64+0xf7/0x1c0 arch/x86/entry/common.c:294
>   entry_SYSCALL_64_after_hwframe+0x49/0xbe
> RIP: 0033:0x446ba9
> Code: e8 0c e8 ff ff 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 
> f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 
> f0 ff ff 0f 83 5b 07 fc ff c3 66 2e 0f 1f 84 00 00 00 00
> RSP: 002b:00007fb132f25d98 EFLAGS: 00000246 ORIG_RAX: 0000000000000031
> RAX: ffffffffffffffda RBX: 00000000006dbc88 RCX: 0000000000446ba9
> RDX: 0000000000000008 RSI: 0000000020000180 RDI: 0000000000000003
> RBP: 00000000006dbc80 R08: 00007fb132f26700 R09: 0000000000000000
> R10: 00007fb132f26700 R11: 0000000000000246 R12: 00000000006dbc8c
> R13: 0000000000000000 R14: 0000000000000000 R15: 068500100000003c
> Modules linked in:
> ---[ end trace 0dedabb13ca8e7d7 ]---
> RIP: 0010:hlist_add_head_rcu include/linux/rculist.h:528 [inline]
> RIP: 0010:can_rx_register+0x43b/0x600 net/can/af_can.c:476
> Code: 48 89 d8 48 c1 e8 03 42 80 3c 28 00 74 08 48 89 df e8 89 22 8a fa 
> 4c 89 33 4d 89 e5 49 c1 ed 03 48 b8 00 00 00 00 00 fc ff df <41> 80 7c 
> 05 00 00 74 08 4c 89 e7 e8 c5 21 8a fa 4d 8b 34 24 4c 89
> RSP: 0018:ffffc90003e27d00 EFLAGS: 00010202
> RAX: dffffc0000000000 RBX: ffff8880a77336c8 RCX: ffff88809306a100
> RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8880a77336c0
> RBP: ffffc90003e27d58 R08: ffffffff87289cd6 R09: fffff520007c4f94
> R10: fffff520007c4f94 R11: 0000000000000000 R12: 0000000000000008
> R13: 0000000000000001 R14: ffff88809fbcf000 R15: ffff8880a7733690
> FS:  00007fb132f26700(0000) GS:ffff8880aec00000(0000) 
> knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 000000000178f590 CR3: 00000000996d6000 CR4: 00000000001406f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> 
> 
> ---
> This bug is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at syzkaller@googlegroups.com.
> 
> syzbot will keep track of this bug report. See:
> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
> For information about bisection process see: 
> https://goo.gl/tpsmEJ#bisection
> syzbot can test patches for this bug, for details see:
> https://goo.gl/tpsmEJ#testing-patches

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: general protection fault in can_rx_register
  2020-01-17 20:02 ` Oliver Hartkopp
@ 2020-01-20  9:11   ` Kurt Van Dijck
  2020-01-20  9:22     ` Dmitry Vyukov
  0 siblings, 1 reply; 11+ messages in thread
From: Kurt Van Dijck @ 2020-01-20  9:11 UTC (permalink / raw)
  To: Oliver Hartkopp
  Cc: mkl, o.rempel, syzbot, davem, linux-can, linux-kernel, netdev,
	syzkaller-bugs

If bisect was right with this:

> >The bug was bisected to:
> >
> >commit 9868b5d44f3df9dd75247acd23dddff0a42f79be
> >Author: Kurt Van Dijck <dev.kurt@vandijck-laurijssen.be>
> >Date:   Mon Oct 8 09:48:33 2018 +0000
> >
> >     can: introduce CAN_REQUIRED_SIZE macro

Then I'd start looking in malformed sockaddr_can data instead.

Is this code what triggers the bug?
> >C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=138f5db9e00000

Kind regards,
Kurt

On vr, 17 jan 2020 21:02:48 +0100, Oliver Hartkopp wrote:
> Hi Marc, Oleksij, Kurt,
> 
> On 17/01/2020 14.46, syzbot wrote:
> >Hello,
> >
> >syzbot found the following crash on:
> >
> >HEAD commit:    f5ae2ea6 Fix built-in early-load Intel microcode alignment
> >git tree:       upstream
> >console output: https://syzkaller.appspot.com/x/log.txt?x=1033df15e00000
> >kernel config:  https://syzkaller.appspot.com/x/.config?x=cfbb8fa33f49f9f3
> >dashboard link:
> >https://syzkaller.appspot.com/bug?extid=c3ea30e1e2485573f953
> >compiler:       clang version 10.0.0
> >(https://github.com/llvm/llvm-project/
> >c2443155a0fb245c8f17f2c1c72b6ea391e86e81)
> >syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=13204f15e00000
> >C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=138f5db9e00000
> >
> >The bug was bisected to:
> >
> >commit 9868b5d44f3df9dd75247acd23dddff0a42f79be
> >Author: Kurt Van Dijck <dev.kurt@vandijck-laurijssen.be>
> >Date:   Mon Oct 8 09:48:33 2018 +0000
> >
> >     can: introduce CAN_REQUIRED_SIZE macro
> >
> >bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=129bfdb9e00000
> >final crash:    https://syzkaller.appspot.com/x/report.txt?x=119bfdb9e00000
> >console output: https://syzkaller.appspot.com/x/log.txt?x=169bfdb9e00000
> >
> >IMPORTANT: if you fix the bug, please add the following tag to the commit:
> >Reported-by: syzbot+c3ea30e1e2485573f953@syzkaller.appspotmail.com
> >Fixes: 9868b5d44f3d ("can: introduce CAN_REQUIRED_SIZE macro")
> >
> >kasan: CONFIG_KASAN_INLINE enabled
> >kasan: GPF could be caused by NULL-ptr deref or user memory access
> >general protection fault: 0000 [#1] PREEMPT SMP KASAN
> >CPU: 0 PID: 9593 Comm: syz-executor302 Not tainted 5.5.0-rc6-syzkaller #0
> >Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> >Google 01/01/2011
> >RIP: 0010:hlist_add_head_rcu include/linux/rculist.h:528 [inline]
> >RIP: 0010:can_rx_register+0x43b/0x600 net/can/af_can.c:476
> 
> include/linux/rculist.h:528 is
> 
> struct hlist_node *first = h->first;
> 
> which would mean that 'h' must be NULL.
> 
> But the h parameter is rcv_list from
> rcv_list = can_rcv_list_find(&can_id, &mask, dev_rcv_lists);
> 
> Which can not return NULL - at least when dev_rcv_lists is a proper pointer
> to the dev_rcv_lists provided by can_dev_rcv_lists_find().
> 
> So either dev->ml_priv is NULL in the case of having a CAN interface (here
> vxcan) or we have not allocated net->can.rx_alldev_list in can_pernet_init()
> properly (which would lead to an -ENOMEM which is reported to whom?).
> 
> Hm. I'm lost. Any ideas?
> 
> Regards,
> Oliver
> 
> 
> >Code: 48 89 d8 48 c1 e8 03 42 80 3c 28 00 74 08 48 89 df e8 89 22 8a fa 4c
> >89 33 4d 89 e5 49 c1 ed 03 48 b8 00 00 00 00 00 fc ff df <41> 80 7c 05 00
> >00 74 08 4c 89 e7 e8 c5 21 8a fa 4d 8b 34 24 4c 89
> >RSP: 0018:ffffc90003e27d00 EFLAGS: 00010202
> >RAX: dffffc0000000000 RBX: ffff8880a77336c8 RCX: ffff88809306a100
> >RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8880a77336c0
> >RBP: ffffc90003e27d58 R08: ffffffff87289cd6 R09: fffff520007c4f94
> >R10: fffff520007c4f94 R11: 0000000000000000 R12: 0000000000000008
> >R13: 0000000000000001 R14: ffff88809fbcf000 R15: ffff8880a7733690
> >FS:  00007fb132f26700(0000) GS:ffff8880aec00000(0000)
> >knlGS:0000000000000000
> >CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >CR2: 000000000178f590 CR3: 00000000996d6000 CR4: 00000000001406f0
> >DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> >DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> >Call Trace:
> >  raw_enable_filters net/can/raw.c:189 [inline]
> >  raw_enable_allfilters net/can/raw.c:255 [inline]
> >  raw_bind+0x326/0x1230 net/can/raw.c:428
> >  __sys_bind+0x2bd/0x3a0 net/socket.c:1649
> >  __do_sys_bind net/socket.c:1660 [inline]
> >  __se_sys_bind net/socket.c:1658 [inline]
> >  __x64_sys_bind+0x7a/0x90 net/socket.c:1658
> >  do_syscall_64+0xf7/0x1c0 arch/x86/entry/common.c:294
> >  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> >RIP: 0033:0x446ba9
> >Code: e8 0c e8 ff ff 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7
> >48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff
> >ff 0f 83 5b 07 fc ff c3 66 2e 0f 1f 84 00 00 00 00
> >RSP: 002b:00007fb132f25d98 EFLAGS: 00000246 ORIG_RAX: 0000000000000031
> >RAX: ffffffffffffffda RBX: 00000000006dbc88 RCX: 0000000000446ba9
> >RDX: 0000000000000008 RSI: 0000000020000180 RDI: 0000000000000003
> >RBP: 00000000006dbc80 R08: 00007fb132f26700 R09: 0000000000000000
> >R10: 00007fb132f26700 R11: 0000000000000246 R12: 00000000006dbc8c
> >R13: 0000000000000000 R14: 0000000000000000 R15: 068500100000003c
> >Modules linked in:
> >---[ end trace 0dedabb13ca8e7d7 ]---
> >RIP: 0010:hlist_add_head_rcu include/linux/rculist.h:528 [inline]
> >RIP: 0010:can_rx_register+0x43b/0x600 net/can/af_can.c:476
> >Code: 48 89 d8 48 c1 e8 03 42 80 3c 28 00 74 08 48 89 df e8 89 22 8a fa 4c
> >89 33 4d 89 e5 49 c1 ed 03 48 b8 00 00 00 00 00 fc ff df <41> 80 7c 05 00
> >00 74 08 4c 89 e7 e8 c5 21 8a fa 4d 8b 34 24 4c 89
> >RSP: 0018:ffffc90003e27d00 EFLAGS: 00010202
> >RAX: dffffc0000000000 RBX: ffff8880a77336c8 RCX: ffff88809306a100
> >RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8880a77336c0
> >RBP: ffffc90003e27d58 R08: ffffffff87289cd6 R09: fffff520007c4f94
> >R10: fffff520007c4f94 R11: 0000000000000000 R12: 0000000000000008
> >R13: 0000000000000001 R14: ffff88809fbcf000 R15: ffff8880a7733690
> >FS:  00007fb132f26700(0000) GS:ffff8880aec00000(0000)
> >knlGS:0000000000000000
> >CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >CR2: 000000000178f590 CR3: 00000000996d6000 CR4: 00000000001406f0
> >DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> >DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> >
> >
> >---
> >This bug is generated by a bot. It may contain errors.
> >See https://goo.gl/tpsmEJ for more information about syzbot.
> >syzbot engineers can be reached at syzkaller@googlegroups.com.
> >
> >syzbot will keep track of this bug report. See:
> >https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
> >For information about bisection process see:
> >https://goo.gl/tpsmEJ#bisection
> >syzbot can test patches for this bug, for details see:
> >https://goo.gl/tpsmEJ#testing-patches

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: general protection fault in can_rx_register
  2020-01-20  9:11   ` Kurt Van Dijck
@ 2020-01-20  9:22     ` Dmitry Vyukov
  2020-01-20 22:02       ` Oliver Hartkopp
  0 siblings, 1 reply; 11+ messages in thread
From: Dmitry Vyukov @ 2020-01-20  9:22 UTC (permalink / raw)
  To: Oliver Hartkopp, Marc Kleine-Budde, o.rempel, syzbot,
	David Miller, linux-can, LKML, netdev, syzkaller-bugs

On Mon, Jan 20, 2020 at 10:11 AM Kurt Van Dijck
<dev.kurt@vandijck-laurijssen.be> wrote:
>
> If bisect was right with this:
>
> > >The bug was bisected to:
> > >
> > >commit 9868b5d44f3df9dd75247acd23dddff0a42f79be
> > >Author: Kurt Van Dijck <dev.kurt@vandijck-laurijssen.be>
> > >Date:   Mon Oct 8 09:48:33 2018 +0000
> > >
> > >     can: introduce CAN_REQUIRED_SIZE macro
>
> Then I'd start looking in malformed sockaddr_can data instead.
>
> Is this code what triggers the bug?
> > >C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=138f5db9e00000

yes

> Kind regards,
> Kurt
>
> On vr, 17 jan 2020 21:02:48 +0100, Oliver Hartkopp wrote:
> > Hi Marc, Oleksij, Kurt,
> >
> > On 17/01/2020 14.46, syzbot wrote:
> > >Hello,
> > >
> > >syzbot found the following crash on:
> > >
> > >HEAD commit:    f5ae2ea6 Fix built-in early-load Intel microcode alignment
> > >git tree:       upstream
> > >console output: https://syzkaller.appspot.com/x/log.txt?x=1033df15e00000
> > >kernel config:  https://syzkaller.appspot.com/x/.config?x=cfbb8fa33f49f9f3
> > >dashboard link:
> > >https://syzkaller.appspot.com/bug?extid=c3ea30e1e2485573f953
> > >compiler:       clang version 10.0.0
> > >(https://github.com/llvm/llvm-project/
> > >c2443155a0fb245c8f17f2c1c72b6ea391e86e81)
> > >syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=13204f15e00000
> > >C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=138f5db9e00000
> > >
> > >The bug was bisected to:
> > >
> > >commit 9868b5d44f3df9dd75247acd23dddff0a42f79be
> > >Author: Kurt Van Dijck <dev.kurt@vandijck-laurijssen.be>
> > >Date:   Mon Oct 8 09:48:33 2018 +0000
> > >
> > >     can: introduce CAN_REQUIRED_SIZE macro
> > >
> > >bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=129bfdb9e00000
> > >final crash:    https://syzkaller.appspot.com/x/report.txt?x=119bfdb9e00000
> > >console output: https://syzkaller.appspot.com/x/log.txt?x=169bfdb9e00000
> > >
> > >IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > >Reported-by: syzbot+c3ea30e1e2485573f953@syzkaller.appspotmail.com
> > >Fixes: 9868b5d44f3d ("can: introduce CAN_REQUIRED_SIZE macro")
> > >
> > >kasan: CONFIG_KASAN_INLINE enabled
> > >kasan: GPF could be caused by NULL-ptr deref or user memory access
> > >general protection fault: 0000 [#1] PREEMPT SMP KASAN
> > >CPU: 0 PID: 9593 Comm: syz-executor302 Not tainted 5.5.0-rc6-syzkaller #0
> > >Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> > >Google 01/01/2011
> > >RIP: 0010:hlist_add_head_rcu include/linux/rculist.h:528 [inline]
> > >RIP: 0010:can_rx_register+0x43b/0x600 net/can/af_can.c:476
> >
> > include/linux/rculist.h:528 is
> >
> > struct hlist_node *first = h->first;
> >
> > which would mean that 'h' must be NULL.
> >
> > But the h parameter is rcv_list from
> > rcv_list = can_rcv_list_find(&can_id, &mask, dev_rcv_lists);
> >
> > Which can not return NULL - at least when dev_rcv_lists is a proper pointer
> > to the dev_rcv_lists provided by can_dev_rcv_lists_find().
> >
> > So either dev->ml_priv is NULL in the case of having a CAN interface (here
> > vxcan) or we have not allocated net->can.rx_alldev_list in can_pernet_init()
> > properly (which would lead to an -ENOMEM which is reported to whom?).
> >
> > Hm. I'm lost. Any ideas?
> >
> > Regards,
> > Oliver
> >
> >
> > >Code: 48 89 d8 48 c1 e8 03 42 80 3c 28 00 74 08 48 89 df e8 89 22 8a fa 4c
> > >89 33 4d 89 e5 49 c1 ed 03 48 b8 00 00 00 00 00 fc ff df <41> 80 7c 05 00
> > >00 74 08 4c 89 e7 e8 c5 21 8a fa 4d 8b 34 24 4c 89
> > >RSP: 0018:ffffc90003e27d00 EFLAGS: 00010202
> > >RAX: dffffc0000000000 RBX: ffff8880a77336c8 RCX: ffff88809306a100
> > >RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8880a77336c0
> > >RBP: ffffc90003e27d58 R08: ffffffff87289cd6 R09: fffff520007c4f94
> > >R10: fffff520007c4f94 R11: 0000000000000000 R12: 0000000000000008
> > >R13: 0000000000000001 R14: ffff88809fbcf000 R15: ffff8880a7733690
> > >FS:  00007fb132f26700(0000) GS:ffff8880aec00000(0000)
> > >knlGS:0000000000000000
> > >CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > >CR2: 000000000178f590 CR3: 00000000996d6000 CR4: 00000000001406f0
> > >DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > >DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > >Call Trace:
> > >  raw_enable_filters net/can/raw.c:189 [inline]
> > >  raw_enable_allfilters net/can/raw.c:255 [inline]
> > >  raw_bind+0x326/0x1230 net/can/raw.c:428
> > >  __sys_bind+0x2bd/0x3a0 net/socket.c:1649
> > >  __do_sys_bind net/socket.c:1660 [inline]
> > >  __se_sys_bind net/socket.c:1658 [inline]
> > >  __x64_sys_bind+0x7a/0x90 net/socket.c:1658
> > >  do_syscall_64+0xf7/0x1c0 arch/x86/entry/common.c:294
> > >  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> > >RIP: 0033:0x446ba9
> > >Code: e8 0c e8 ff ff 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7
> > >48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff
> > >ff 0f 83 5b 07 fc ff c3 66 2e 0f 1f 84 00 00 00 00
> > >RSP: 002b:00007fb132f25d98 EFLAGS: 00000246 ORIG_RAX: 0000000000000031
> > >RAX: ffffffffffffffda RBX: 00000000006dbc88 RCX: 0000000000446ba9
> > >RDX: 0000000000000008 RSI: 0000000020000180 RDI: 0000000000000003
> > >RBP: 00000000006dbc80 R08: 00007fb132f26700 R09: 0000000000000000
> > >R10: 00007fb132f26700 R11: 0000000000000246 R12: 00000000006dbc8c
> > >R13: 0000000000000000 R14: 0000000000000000 R15: 068500100000003c
> > >Modules linked in:
> > >---[ end trace 0dedabb13ca8e7d7 ]---
> > >RIP: 0010:hlist_add_head_rcu include/linux/rculist.h:528 [inline]
> > >RIP: 0010:can_rx_register+0x43b/0x600 net/can/af_can.c:476
> > >Code: 48 89 d8 48 c1 e8 03 42 80 3c 28 00 74 08 48 89 df e8 89 22 8a fa 4c
> > >89 33 4d 89 e5 49 c1 ed 03 48 b8 00 00 00 00 00 fc ff df <41> 80 7c 05 00
> > >00 74 08 4c 89 e7 e8 c5 21 8a fa 4d 8b 34 24 4c 89
> > >RSP: 0018:ffffc90003e27d00 EFLAGS: 00010202
> > >RAX: dffffc0000000000 RBX: ffff8880a77336c8 RCX: ffff88809306a100
> > >RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8880a77336c0
> > >RBP: ffffc90003e27d58 R08: ffffffff87289cd6 R09: fffff520007c4f94
> > >R10: fffff520007c4f94 R11: 0000000000000000 R12: 0000000000000008
> > >R13: 0000000000000001 R14: ffff88809fbcf000 R15: ffff8880a7733690
> > >FS:  00007fb132f26700(0000) GS:ffff8880aec00000(0000)
> > >knlGS:0000000000000000
> > >CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > >CR2: 000000000178f590 CR3: 00000000996d6000 CR4: 00000000001406f0
> > >DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > >DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > >
> > >
> > >---
> > >This bug is generated by a bot. It may contain errors.
> > >See https://goo.gl/tpsmEJ for more information about syzbot.
> > >syzbot engineers can be reached at syzkaller@googlegroups.com.
> > >
> > >syzbot will keep track of this bug report. See:
> > >https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
> > >For information about bisection process see:
> > >https://goo.gl/tpsmEJ#bisection
> > >syzbot can test patches for this bug, for details see:
> > >https://goo.gl/tpsmEJ#testing-patches
>
> --
> You received this message because you are subscribed to the Google Groups "syzkaller-bugs" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to syzkaller-bugs+unsubscribe@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/syzkaller-bugs/20200120091146.GD11138%40x1.vandijck-laurijssen.be.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: general protection fault in can_rx_register
  2020-01-20  9:22     ` Dmitry Vyukov
@ 2020-01-20 22:02       ` Oliver Hartkopp
  2020-01-20 22:35         ` Oliver Hartkopp
  0 siblings, 1 reply; 11+ messages in thread
From: Oliver Hartkopp @ 2020-01-20 22:02 UTC (permalink / raw)
  To: Dmitry Vyukov, Marc Kleine-Budde, o.rempel, syzbot, David Miller,
	linux-can, LKML, netdev, syzkaller-bugs

Hi all,

On 20/01/2020 10.22, Dmitry Vyukov wrote:

>> Is this code what triggers the bug?
>>>> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=138f5db9e00000
> 
> yes
> 

(..)

>>>> RIP: 0010:hlist_add_head_rcu include/linux/rculist.h:528 [inline]
>>>> RIP: 0010:can_rx_register+0x43b/0x600 net/can/af_can.c:476
>>>
>>> include/linux/rculist.h:528 is
>>>
>>> struct hlist_node *first = h->first;
>>>
>>> which would mean that 'h' must be NULL.
>>>
>>> But the h parameter is rcv_list from
>>> rcv_list = can_rcv_list_find(&can_id, &mask, dev_rcv_lists);
>>>
>>> Which can not return NULL - at least when dev_rcv_lists is a proper pointer
>>> to the dev_rcv_lists provided by can_dev_rcv_lists_find().
>>>
>>> So either dev->ml_priv is NULL in the case of having a CAN interface (here
>>> vxcan) ...

Added some code to check whether dev->ml_priv is NULL:

~/linux$ git diff
diff --git a/net/can/af_can.c b/net/can/af_can.c
index 128d37a4c2e0..6fb4ae4c359e 100644
--- a/net/can/af_can.c
+++ b/net/can/af_can.c
@@ -463,6 +463,10 @@ int can_rx_register(struct net *net, struct 
net_device *dev, canid_t can_id,
         spin_lock_bh(&net->can.rcvlists_lock);

         dev_rcv_lists = can_dev_rcv_lists_find(net, dev);
+       if (!dev_rcv_lists) {
+               pr_err("dev_rcv_lists == NULL! %p\n", dev);
+               goto out_unlock;
+       }
         rcv_list = can_rcv_list_find(&can_id, &mask, dev_rcv_lists);

         rcv->can_id = can_id;
@@ -479,6 +483,7 @@ int can_rx_register(struct net *net, struct 
net_device *dev, canid_t can_id,
         rcv_lists_stats->rcv_entries++;
         rcv_lists_stats->rcv_entries_max = 
max(rcv_lists_stats->rcv_entries_max,
 
rcv_lists_stats->rcv_entries);
+out_unlock:
         spin_unlock_bh(&net->can.rcvlists_lock);

         return err;

And the output (after some time) is:

[  758.505841] netlink: 'crash': attribute type 1 has an invalid length.
[  758.508045] bond7148: (slave vxcan1): The slave device specified does 
not support setting the MAC address
[  758.508057] bond7148: (slave vxcan1): Error -22 calling dev_set_mtu
[  758.532025] bond10413: (slave vxcan1): The slave device specified 
does not support setting the MAC address
[  758.532043] bond10413: (slave vxcan1): Error -22 calling dev_set_mtu
[  758.532254] dev_rcv_lists == NULL! 000000006b9d257f
[  758.547392] netlink: 'crash': attribute type 1 has an invalid length.
[  758.549310] bond7145: (slave vxcan1): The slave device specified does 
not support setting the MAC address
[  758.549313] bond7145: (slave vxcan1): Error -22 calling dev_set_mtu
[  758.550464] netlink: 'crash': attribute type 1 has an invalid length.
[  758.552301] bond7146: (slave vxcan1): The slave device specified does 
not support setting the MAC address

So we can see that we get a ml_priv pointer which is NULL which should 
not be possible due to this:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/can/dev.c#n743

Btw. the variable 'size' is set two times at the top of 
alloc_candev_mqs() depending on echo_skb_max. This looks wrong.

Best regards,
Oliver

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: general protection fault in can_rx_register
  2020-01-20 22:02       ` Oliver Hartkopp
@ 2020-01-20 22:35         ` Oliver Hartkopp
  2020-01-21  8:30           ` Kurt Van Dijck
  0 siblings, 1 reply; 11+ messages in thread
From: Oliver Hartkopp @ 2020-01-20 22:35 UTC (permalink / raw)
  To: Dmitry Vyukov, Marc Kleine-Budde, o.rempel, syzbot, David Miller,
	linux-can, LKML, netdev, syzkaller-bugs, Kurt Van Dijck

Answering myself ...

On 20/01/2020 23.02, Oliver Hartkopp wrote:

> 
> Added some code to check whether dev->ml_priv is NULL:
> 
> ~/linux$ git diff
> diff --git a/net/can/af_can.c b/net/can/af_can.c
> index 128d37a4c2e0..6fb4ae4c359e 100644
> --- a/net/can/af_can.c
> +++ b/net/can/af_can.c
> @@ -463,6 +463,10 @@ int can_rx_register(struct net *net, struct 
> net_device *dev, canid_t can_id,
>          spin_lock_bh(&net->can.rcvlists_lock);
> 
>          dev_rcv_lists = can_dev_rcv_lists_find(net, dev);
> +       if (!dev_rcv_lists) {
> +               pr_err("dev_rcv_lists == NULL! %p\n", dev);
> +               goto out_unlock;
> +       }
>          rcv_list = can_rcv_list_find(&can_id, &mask, dev_rcv_lists);
> 
>          rcv->can_id = can_id;
> @@ -479,6 +483,7 @@ int can_rx_register(struct net *net, struct 
> net_device *dev, canid_t can_id,
>          rcv_lists_stats->rcv_entries++;
>          rcv_lists_stats->rcv_entries_max = 
> max(rcv_lists_stats->rcv_entries_max,
> 
> rcv_lists_stats->rcv_entries);
> +out_unlock:
>          spin_unlock_bh(&net->can.rcvlists_lock);
> 
>          return err;
> 
> And the output (after some time) is:
> 
> [  758.505841] netlink: 'crash': attribute type 1 has an invalid length.
> [  758.508045] bond7148: (slave vxcan1): The slave device specified does 
> not support setting the MAC address
> [  758.508057] bond7148: (slave vxcan1): Error -22 calling dev_set_mtu
> [  758.532025] bond10413: (slave vxcan1): The slave device specified 
> does not support setting the MAC address
> [  758.532043] bond10413: (slave vxcan1): Error -22 calling dev_set_mtu
> [  758.532254] dev_rcv_lists == NULL! 000000006b9d257f
> [  758.547392] netlink: 'crash': attribute type 1 has an invalid length.
> [  758.549310] bond7145: (slave vxcan1): The slave device specified does 
> not support setting the MAC address
> [  758.549313] bond7145: (slave vxcan1): Error -22 calling dev_set_mtu
> [  758.550464] netlink: 'crash': attribute type 1 has an invalid length.
> [  758.552301] bond7146: (slave vxcan1): The slave device specified does 
> not support setting the MAC address
> 
> So we can see that we get a ml_priv pointer which is NULL which should 
> not be possible due to this:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/can/dev.c#n743 

This reference doesn't point to the right code as vxcan has its own 
handling do assign ml_priv in vxcan.c .

> Btw. the variable 'size' is set two times at the top of 
> alloc_candev_mqs() depending on echo_skb_max. This looks wrong.

No. It looks right as I did not get behind the ALIGN() macro at first sight.

But it is still open why dev->ml_priv is not set correctly in vxcan.c as 
all the settings for .priv_size and in vxcan_setup look fine.

Best regards,
Oliver

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: general protection fault in can_rx_register
  2020-01-20 22:35         ` Oliver Hartkopp
@ 2020-01-21  8:30           ` Kurt Van Dijck
  2020-01-21  8:35             ` Kurt Van Dijck
  2020-01-21 18:54             ` Kurt Van Dijck
  0 siblings, 2 replies; 11+ messages in thread
From: Kurt Van Dijck @ 2020-01-21  8:30 UTC (permalink / raw)
  To: Oliver Hartkopp
  Cc: Dmitry Vyukov, Marc Kleine-Budde, o.rempel, syzbot, David Miller,
	linux-can, LKML, netdev, syzkaller-bugs

On ma, 20 jan 2020 23:35:16 +0100, Oliver Hartkopp wrote:
> Answering myself ...
> 
> On 20/01/2020 23.02, Oliver Hartkopp wrote:
> 
> >
> >Added some code to check whether dev->ml_priv is NULL:
> >
> >~/linux$ git diff
> >diff --git a/net/can/af_can.c b/net/can/af_can.c
> >index 128d37a4c2e0..6fb4ae4c359e 100644
> >--- a/net/can/af_can.c
> >+++ b/net/can/af_can.c
> >@@ -463,6 +463,10 @@ int can_rx_register(struct net *net, struct
> >net_device *dev, canid_t can_id,
> >         spin_lock_bh(&net->can.rcvlists_lock);
> >
> >         dev_rcv_lists = can_dev_rcv_lists_find(net, dev);
> >+       if (!dev_rcv_lists) {
> >+               pr_err("dev_rcv_lists == NULL! %p\n", dev);
> >+               goto out_unlock;
> >+       }
> >         rcv_list = can_rcv_list_find(&can_id, &mask, dev_rcv_lists);
> >
> >         rcv->can_id = can_id;
> >@@ -479,6 +483,7 @@ int can_rx_register(struct net *net, struct net_device
> >*dev, canid_t can_id,
> >         rcv_lists_stats->rcv_entries++;
> >         rcv_lists_stats->rcv_entries_max =
> >max(rcv_lists_stats->rcv_entries_max,
> >
> >rcv_lists_stats->rcv_entries);
> >+out_unlock:
> >         spin_unlock_bh(&net->can.rcvlists_lock);
> >
> >         return err;
> >
> >And the output (after some time) is:
> >
> >[  758.505841] netlink: 'crash': attribute type 1 has an invalid length.
> >[  758.508045] bond7148: (slave vxcan1): The slave device specified does
> >not support setting the MAC address
> >[  758.508057] bond7148: (slave vxcan1): Error -22 calling dev_set_mtu
> >[  758.532025] bond10413: (slave vxcan1): The slave device specified does
> >not support setting the MAC address
> >[  758.532043] bond10413: (slave vxcan1): Error -22 calling dev_set_mtu
> >[  758.532254] dev_rcv_lists == NULL! 000000006b9d257f
> >[  758.547392] netlink: 'crash': attribute type 1 has an invalid length.
> >[  758.549310] bond7145: (slave vxcan1): The slave device specified does
> >not support setting the MAC address
> >[  758.549313] bond7145: (slave vxcan1): Error -22 calling dev_set_mtu
> >[  758.550464] netlink: 'crash': attribute type 1 has an invalid length.
> >[  758.552301] bond7146: (slave vxcan1): The slave device specified does
> >not support setting the MAC address
> >
> >So we can see that we get a ml_priv pointer which is NULL which should not
> >be possible due to this:
> >
> >https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/can/dev.c#n743
> 
> This reference doesn't point to the right code as vxcan has its own handling
> do assign ml_priv in vxcan.c .
> 
> >Btw. the variable 'size' is set two times at the top of alloc_candev_mqs()
> >depending on echo_skb_max. This looks wrong.
> 
> No. It looks right as I did not get behind the ALIGN() macro at first sight.
> 
> But it is still open why dev->ml_priv is not set correctly in vxcan.c as all
> the settings for .priv_size and in vxcan_setup look fine.

Maybe I got completely lost:
Shouldn't can_ml_priv and vxcan_priv not be similar?
Where is the dev_rcv_lists in the vxcan case?

> 
> Best regards,
> Oliver

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: general protection fault in can_rx_register
  2020-01-21  8:30           ` Kurt Van Dijck
@ 2020-01-21  8:35             ` Kurt Van Dijck
  2020-01-21 18:54             ` Kurt Van Dijck
  1 sibling, 0 replies; 11+ messages in thread
From: Kurt Van Dijck @ 2020-01-21  8:35 UTC (permalink / raw)
  To: Oliver Hartkopp, Dmitry Vyukov, Marc Kleine-Budde, o.rempel,
	syzbot, David Miller, linux-can, LKML, netdev, syzkaller-bugs

On di, 21 jan 2020 09:30:35 +0100, Kurt Van Dijck wrote:
> On ma, 20 jan 2020 23:35:16 +0100, Oliver Hartkopp wrote:
> > Answering myself ...
> > 
> > On 20/01/2020 23.02, Oliver Hartkopp wrote:
> > 
> > >
> > >Added some code to check whether dev->ml_priv is NULL:
> > >
> > >~/linux$ git diff
> > >diff --git a/net/can/af_can.c b/net/can/af_can.c
> > >index 128d37a4c2e0..6fb4ae4c359e 100644
> > >--- a/net/can/af_can.c
> > >+++ b/net/can/af_can.c
> > >@@ -463,6 +463,10 @@ int can_rx_register(struct net *net, struct
> > >net_device *dev, canid_t can_id,
> > >         spin_lock_bh(&net->can.rcvlists_lock);
> > >
> > >         dev_rcv_lists = can_dev_rcv_lists_find(net, dev);
> > >+       if (!dev_rcv_lists) {
> > >+               pr_err("dev_rcv_lists == NULL! %p\n", dev);
> > >+               goto out_unlock;
> > >+       }
> > >         rcv_list = can_rcv_list_find(&can_id, &mask, dev_rcv_lists);
> > >
> > >         rcv->can_id = can_id;
> > >@@ -479,6 +483,7 @@ int can_rx_register(struct net *net, struct net_device
> > >*dev, canid_t can_id,
> > >         rcv_lists_stats->rcv_entries++;
> > >         rcv_lists_stats->rcv_entries_max =
> > >max(rcv_lists_stats->rcv_entries_max,
> > >
> > >rcv_lists_stats->rcv_entries);
> > >+out_unlock:
> > >         spin_unlock_bh(&net->can.rcvlists_lock);
> > >
> > >         return err;
> > >
> > >And the output (after some time) is:
> > >
> > >[  758.505841] netlink: 'crash': attribute type 1 has an invalid length.
> > >[  758.508045] bond7148: (slave vxcan1): The slave device specified does
> > >not support setting the MAC address
> > >[  758.508057] bond7148: (slave vxcan1): Error -22 calling dev_set_mtu
> > >[  758.532025] bond10413: (slave vxcan1): The slave device specified does
> > >not support setting the MAC address
> > >[  758.532043] bond10413: (slave vxcan1): Error -22 calling dev_set_mtu
> > >[  758.532254] dev_rcv_lists == NULL! 000000006b9d257f
> > >[  758.547392] netlink: 'crash': attribute type 1 has an invalid length.
> > >[  758.549310] bond7145: (slave vxcan1): The slave device specified does
> > >not support setting the MAC address
> > >[  758.549313] bond7145: (slave vxcan1): Error -22 calling dev_set_mtu
> > >[  758.550464] netlink: 'crash': attribute type 1 has an invalid length.
> > >[  758.552301] bond7146: (slave vxcan1): The slave device specified does
> > >not support setting the MAC address
> > >
> > >So we can see that we get a ml_priv pointer which is NULL which should not
> > >be possible due to this:
> > >
> > >https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/can/dev.c#n743
> > 
> > This reference doesn't point to the right code as vxcan has its own handling
> > do assign ml_priv in vxcan.c .
> > 
> > >Btw. the variable 'size' is set two times at the top of alloc_candev_mqs()
> > >depending on echo_skb_max. This looks wrong.
> > 
> > No. It looks right as I did not get behind the ALIGN() macro at first sight.
> > 
> > But it is still open why dev->ml_priv is not set correctly in vxcan.c as all
> > the settings for .priv_size and in vxcan_setup look fine.
> 
> Maybe I got completely lost:
> Shouldn't can_ml_priv and vxcan_priv not be similar?
> Where is the dev_rcv_lists in the vxcan case?

IMHO, net/can/af_can.c:306 is wrong in the vxcan case.

> 
> > 
> > Best regards,
> > Oliver

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: general protection fault in can_rx_register
  2020-01-21  8:30           ` Kurt Van Dijck
  2020-01-21  8:35             ` Kurt Van Dijck
@ 2020-01-21 18:54             ` Kurt Van Dijck
  2020-01-21 19:28               ` Oliver Hartkopp
  1 sibling, 1 reply; 11+ messages in thread
From: Kurt Van Dijck @ 2020-01-21 18:54 UTC (permalink / raw)
  To: Oliver Hartkopp, Dmitry Vyukov, Marc Kleine-Budde, o.rempel,
	syzbot, David Miller, linux-can, LKML, netdev, syzkaller-bugs

On di, 21 jan 2020 09:30:35 +0100, Kurt Van Dijck wrote:
> On ma, 20 jan 2020 23:35:16 +0100, Oliver Hartkopp wrote:
> > Answering myself ...
> > 
> > On 20/01/2020 23.02, Oliver Hartkopp wrote:
> > 
> > >
> > >Added some code to check whether dev->ml_priv is NULL:
> > >
> > >~/linux$ git diff
> > >diff --git a/net/can/af_can.c b/net/can/af_can.c
> > >index 128d37a4c2e0..6fb4ae4c359e 100644
> > >--- a/net/can/af_can.c
> > >+++ b/net/can/af_can.c
> > >@@ -463,6 +463,10 @@ int can_rx_register(struct net *net, struct
> > >net_device *dev, canid_t can_id,
> > >         spin_lock_bh(&net->can.rcvlists_lock);
> > >
> > >         dev_rcv_lists = can_dev_rcv_lists_find(net, dev);
> > >+       if (!dev_rcv_lists) {
> > >+               pr_err("dev_rcv_lists == NULL! %p\n", dev);
> > >+               goto out_unlock;
> > >+       }
> > >         rcv_list = can_rcv_list_find(&can_id, &mask, dev_rcv_lists);
> > >
> > >         rcv->can_id = can_id;
> > >@@ -479,6 +483,7 @@ int can_rx_register(struct net *net, struct net_device
> > >*dev, canid_t can_id,
> > >         rcv_lists_stats->rcv_entries++;
> > >         rcv_lists_stats->rcv_entries_max =
> > >max(rcv_lists_stats->rcv_entries_max,
> > >
> > >rcv_lists_stats->rcv_entries);
> > >+out_unlock:
> > >         spin_unlock_bh(&net->can.rcvlists_lock);
> > >
> > >         return err;
> > >
> > >And the output (after some time) is:
> > >
> > >[  758.505841] netlink: 'crash': attribute type 1 has an invalid length.
> > >[  758.508045] bond7148: (slave vxcan1): The slave device specified does
> > >not support setting the MAC address
> > >[  758.508057] bond7148: (slave vxcan1): Error -22 calling dev_set_mtu
> > >[  758.532025] bond10413: (slave vxcan1): The slave device specified does
> > >not support setting the MAC address
> > >[  758.532043] bond10413: (slave vxcan1): Error -22 calling dev_set_mtu
> > >[  758.532254] dev_rcv_lists == NULL! 000000006b9d257f
> > >[  758.547392] netlink: 'crash': attribute type 1 has an invalid length.
> > >[  758.549310] bond7145: (slave vxcan1): The slave device specified does
> > >not support setting the MAC address
> > >[  758.549313] bond7145: (slave vxcan1): Error -22 calling dev_set_mtu
> > >[  758.550464] netlink: 'crash': attribute type 1 has an invalid length.
> > >[  758.552301] bond7146: (slave vxcan1): The slave device specified does
> > >not support setting the MAC address
> > >
> > >So we can see that we get a ml_priv pointer which is NULL which should not
> > >be possible due to this:
> > >
> > >https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/can/dev.c#n743
> > 
> > This reference doesn't point to the right code as vxcan has its own handling
> > do assign ml_priv in vxcan.c .
> > 
> > >Btw. the variable 'size' is set two times at the top of alloc_candev_mqs()
> > >depending on echo_skb_max. This looks wrong.
> > 
> > No. It looks right as I did not get behind the ALIGN() macro at first sight.
> > 
> > But it is still open why dev->ml_priv is not set correctly in vxcan.c as all
> > the settings for .priv_size and in vxcan_setup look fine.
> 
> Maybe I got completely lost:
> Shouldn't can_ml_priv and vxcan_priv not be similar?
> Where is the dev_rcv_lists in the vxcan case?

I indeed got completely lost. vxcan_priv & can_ml_priv form together the
private part. I continue looking
> 
> > 
> > Best regards,
> > Oliver

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: general protection fault in can_rx_register
  2020-01-21 18:54             ` Kurt Van Dijck
@ 2020-01-21 19:28               ` Oliver Hartkopp
  2020-01-21 19:47                 ` Kurt Van Dijck
  0 siblings, 1 reply; 11+ messages in thread
From: Oliver Hartkopp @ 2020-01-21 19:28 UTC (permalink / raw)
  To: Dmitry Vyukov, Marc Kleine-Budde, o.rempel, syzbot, David Miller,
	linux-can, LKML, netdev, syzkaller-bugs, Kurt Van Dijck

Hi Kurt,

On 21/01/2020 19.54, Kurt Van Dijck wrote:
> On di, 21 jan 2020 09:30:35 +0100, Kurt Van Dijck wrote:
>> On ma, 20 jan 2020 23:35:16 +0100, Oliver Hartkopp wrote:


>>> But it is still open why dev->ml_priv is not set correctly in vxcan.c as all
>>> the settings for .priv_size and in vxcan_setup look fine.
>>
>> Maybe I got completely lost:
>> Shouldn't can_ml_priv and vxcan_priv not be similar?
>> Where is the dev_rcv_lists in the vxcan case?
> 
> I indeed got completely lost. vxcan_priv & can_ml_priv form together the
> private part. I continue looking

I added some more debug output:

@@ -463,6 +463,10 @@ int can_rx_register(struct net *net, struct 
net_device *dev, canid_t can_id,
         spin_lock_bh(&net->can.rcvlists_lock);

         dev_rcv_lists = can_dev_rcv_lists_find(net, dev);
+       if (!dev_rcv_lists) {
+               pr_err("dev_rcv_lists == NULL! %p (%s)\n", dev, dev->name);
+               goto out_unlock;
+       }
         rcv_list = can_rcv_list_find(&can_id, &mask, dev_rcv_lists);

         rcv->can_id = can_id;


and the output becomes:

[ 1814.644087] bond5130: (slave vxcan1): The slave device specified does 
not support setting the MAC address
[ 1814.644106] bond5130: (slave vxcan1): Error -22 calling dev_set_mtu
[ 1814.648867] bond5128: (slave vxcan1): The slave device specified does 
not support setting the MAC address
[ 1814.648904] bond5128: (slave vxcan1): Error -22 calling dev_set_mtu
[ 1814.649124] dev_rcv_lists == NULL! 000000008e41fb06 (bond5128)
[ 1814.696420] bond5129: (slave vxcan1): The slave device specified does 
not support setting the MAC address
[ 1814.696438] bond5129: (slave vxcan1): Error -22 calling dev_set_mtu

So it's not the vxcan1 netdev that causes the issue but (sporadically!!) 
the bonding netdev.

Interesting enough that the bonding device bond5128 obviously passes the

        if (dev && dev->type != ARPHRD_CAN)
                 return -ENODEV;
test.

?!?

Regards,
Oliver

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: general protection fault in can_rx_register
  2020-01-21 19:28               ` Oliver Hartkopp
@ 2020-01-21 19:47                 ` Kurt Van Dijck
  0 siblings, 0 replies; 11+ messages in thread
From: Kurt Van Dijck @ 2020-01-21 19:47 UTC (permalink / raw)
  To: Oliver Hartkopp
  Cc: Dmitry Vyukov, Marc Kleine-Budde, o.rempel, syzbot, David Miller,
	linux-can, LKML, netdev, syzkaller-bugs

On di, 21 jan 2020 20:28:51 +0100, Oliver Hartkopp wrote:
> Hi Kurt,
> 
> On 21/01/2020 19.54, Kurt Van Dijck wrote:
> >On di, 21 jan 2020 09:30:35 +0100, Kurt Van Dijck wrote:
> >>On ma, 20 jan 2020 23:35:16 +0100, Oliver Hartkopp wrote:
> 
> 
> >>>But it is still open why dev->ml_priv is not set correctly in vxcan.c as all
> >>>the settings for .priv_size and in vxcan_setup look fine.
> >>
> >>Maybe I got completely lost:
> >>Shouldn't can_ml_priv and vxcan_priv not be similar?
> >>Where is the dev_rcv_lists in the vxcan case?
> >
> >I indeed got completely lost. vxcan_priv & can_ml_priv form together the
> >private part. I continue looking
> 
> I added some more debug output:
> 
> @@ -463,6 +463,10 @@ int can_rx_register(struct net *net, struct net_device
> *dev, canid_t can_id,
>         spin_lock_bh(&net->can.rcvlists_lock);
> 
>         dev_rcv_lists = can_dev_rcv_lists_find(net, dev);
> +       if (!dev_rcv_lists) {
> +               pr_err("dev_rcv_lists == NULL! %p (%s)\n", dev, dev->name);
> +               goto out_unlock;
> +       }
>         rcv_list = can_rcv_list_find(&can_id, &mask, dev_rcv_lists);
> 
>         rcv->can_id = can_id;
> 
> 
> and the output becomes:
> 
> [ 1814.644087] bond5130: (slave vxcan1): The slave device specified does not
> support setting the MAC address
> [ 1814.644106] bond5130: (slave vxcan1): Error -22 calling dev_set_mtu
> [ 1814.648867] bond5128: (slave vxcan1): The slave device specified does not
> support setting the MAC address
> [ 1814.648904] bond5128: (slave vxcan1): Error -22 calling dev_set_mtu
> [ 1814.649124] dev_rcv_lists == NULL! 000000008e41fb06 (bond5128)
> [ 1814.696420] bond5129: (slave vxcan1): The slave device specified does not
> support setting the MAC address
> [ 1814.696438] bond5129: (slave vxcan1): Error -22 calling dev_set_mtu
> 
> So it's not the vxcan1 netdev that causes the issue but (sporadically!!) the
> bonding netdev.
> 
> Interesting enough that the bonding device bond5128 obviously passes the
> 
>        if (dev && dev->type != ARPHRD_CAN)
>                 return -ENODEV;
> test.
> 
> ?!?
Did you consider my hypothesis I sent you (at 20h22 tonight)?
I don't personally understand all the locks around networking, but your
observation acks my theory of race condition.

> 
> Regards,
> Oliver

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2020-01-21 19:47 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-01-17 13:46 general protection fault in can_rx_register syzbot
2020-01-17 20:02 ` Oliver Hartkopp
2020-01-20  9:11   ` Kurt Van Dijck
2020-01-20  9:22     ` Dmitry Vyukov
2020-01-20 22:02       ` Oliver Hartkopp
2020-01-20 22:35         ` Oliver Hartkopp
2020-01-21  8:30           ` Kurt Van Dijck
2020-01-21  8:35             ` Kurt Van Dijck
2020-01-21 18:54             ` Kurt Van Dijck
2020-01-21 19:28               ` Oliver Hartkopp
2020-01-21 19:47                 ` Kurt Van Dijck

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).