KCSAN: data-race in glue_cbc_decrypt_req_128bit / glue_cbc_decrypt_req_128bit

linux-crypto.vger.kernel.org archive mirror

* KCSAN: data-race in glue_cbc_decrypt_req_128bit / glue_cbc_decrypt_req_128bit
@ 2020-03-31 19:35 syzbot
  2020-03-31 20:27 ` Eric Biggers
  0 siblings, 1 reply; 7+ messages in thread
From: syzbot @ 2020-03-31 19:35 UTC
  To: bp, davem, elver, herbert, hpa, linux-crypto, linux-kernel,
	mingo, syzkaller-bugs, tglx, x86

Hello,

syzbot found the following crash on:

HEAD commit:    b12d66a6 mm, kcsan: Instrument SLAB free with ASSERT_EXCLU..
git tree:       https://github.com/google/ktsan.git kcsan
console output: https://syzkaller.appspot.com/x/log.txt?x=111f0865e00000
kernel config:  https://syzkaller.appspot.com/x/.config?x=10bc0131c4924ba9
dashboard link: https://syzkaller.appspot.com/bug?extid=6a6bca8169ffda8ce77b
compiler:       gcc (GCC) 9.0.0 20181231 (experimental)

Unfortunately, I don't have any reproducer for this crash yet.

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+6a6bca8169ffda8ce77b@syzkaller.appspotmail.com

==================================================================
BUG: KCSAN: data-race in glue_cbc_decrypt_req_128bit / glue_cbc_decrypt_req_128bit

write to 0xffff88809966e128 of 8 bytes by task 24119 on cpu 0:
 u128_xor include/crypto/b128ops.h:67 [inline]
 glue_cbc_decrypt_req_128bit+0x396/0x460 arch/x86/crypto/glue_helper.c:144
 cbc_decrypt+0x26/0x40 arch/x86/crypto/serpent_avx2_glue.c:152
 crypto_skcipher_decrypt+0x65/0x90 crypto/skcipher.c:652
 _skcipher_recvmsg crypto/algif_skcipher.c:142 [inline]
 skcipher_recvmsg+0x7fa/0x8c0 crypto/algif_skcipher.c:161
 skcipher_recvmsg_nokey+0x5e/0x80 crypto/algif_skcipher.c:279
 sock_recvmsg_nosec net/socket.c:886 [inline]
 sock_recvmsg net/socket.c:904 [inline]
 sock_recvmsg+0x92/0xb0 net/socket.c:900
 ____sys_recvmsg+0x167/0x3a0 net/socket.c:2566
 ___sys_recvmsg+0xb2/0x100 net/socket.c:2608
 __sys_recvmsg+0x9d/0x160 net/socket.c:2642
 __do_sys_recvmsg net/socket.c:2652 [inline]
 __se_sys_recvmsg net/socket.c:2649 [inline]
 __x64_sys_recvmsg+0x51/0x70 net/socket.c:2649
 do_syscall_64+0xcc/0x3a0 arch/x86/entry/common.c:294
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

read to 0xffff88809966e128 of 8 bytes by task 24118 on cpu 1:
 u128_xor include/crypto/b128ops.h:67 [inline]
 glue_cbc_decrypt_req_128bit+0x37c/0x460 arch/x86/crypto/glue_helper.c:144
 cbc_decrypt+0x26/0x40 arch/x86/crypto/serpent_avx2_glue.c:152
 crypto_skcipher_decrypt+0x65/0x90 crypto/skcipher.c:652
 _skcipher_recvmsg crypto/algif_skcipher.c:142 [inline]
 skcipher_recvmsg+0x7fa/0x8c0 crypto/algif_skcipher.c:161
 skcipher_recvmsg_nokey+0x5e/0x80 crypto/algif_skcipher.c:279
 sock_recvmsg_nosec net/socket.c:886 [inline]
 sock_recvmsg net/socket.c:904 [inline]
 sock_recvmsg+0x92/0xb0 net/socket.c:900
 ____sys_recvmsg+0x167/0x3a0 net/socket.c:2566
 ___sys_recvmsg+0xb2/0x100 net/socket.c:2608
 __sys_recvmsg+0x9d/0x160 net/socket.c:2642
 __do_sys_recvmsg net/socket.c:2652 [inline]
 __se_sys_recvmsg net/socket.c:2649 [inline]
 __x64_sys_recvmsg+0x51/0x70 net/socket.c:2649
 do_syscall_64+0xcc/0x3a0 arch/x86/entry/common.c:294
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 24118 Comm: syz-executor.1 Not tainted 5.6.0-rc1-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
==================================================================


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

* Re: KCSAN: data-race in glue_cbc_decrypt_req_128bit / glue_cbc_decrypt_req_128bit
  2020-03-31 19:35 KCSAN: data-race in glue_cbc_decrypt_req_128bit / glue_cbc_decrypt_req_128bit syzbot
@ 2020-03-31 20:27 ` Eric Biggers
  2020-04-01  7:04   ` Dmitry Vyukov
  0 siblings, 1 reply; 7+ messages in thread
From: Eric Biggers @ 2020-03-31 20:27 UTC
  To: syzbot
  Cc: bp, davem, elver, herbert, hpa, linux-crypto, linux-kernel,
	mingo, syzkaller-bugs, tglx, x86

On Tue, Mar 31, 2020 at 12:35:13PM -0700, syzbot wrote:
> [...]
> BUG: KCSAN: data-race in glue_cbc_decrypt_req_128bit / glue_cbc_decrypt_req_128bit
> [...]

I think this is a problem for almost all the crypto code.  Due to AF_ALG, both
the source and destination buffers can be userspace pages obtained with
get_user_pages().  Such pages can be concurrently modified, not just by the
kernel but also by userspace.
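
To make the access pattern concrete, here is a minimal userspace sketch of the
read-modify-write on the destination buffer that the report points at.  It is a
model under stated assumptions, not the kernel code: toy_encrypt_block(),
toy_decrypt_block(), toy_cbc_encrypt() and toy_cbc_decrypt() are hypothetical
names, the "cipher" is a plain XOR with the key (not Serpent, not secure), and
only the shape of u128_xor() follows include/crypto/b128ops.h.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct { uint64_t a, b; } u128;

/* Same shape as u128_xor() in include/crypto/b128ops.h: plain loads/stores. */
static void u128_xor(u128 *r, const u128 *p, const u128 *q)
{
	r->a = p->a ^ q->a;
	r->b = p->b ^ q->b;
}

/* Hypothetical toy block cipher: XOR with the key. Illustration only. */
static void toy_encrypt_block(u128 *dst, const u128 *src, const u128 *key)
{
	u128_xor(dst, src, key);
}

static void toy_decrypt_block(u128 *dst, const u128 *src, const u128 *key)
{
	u128_xor(dst, src, key);
}

/* CBC encryption: C[i] = E(P[i] ^ C[i-1]), with C[-1] = IV. */
static void toy_cbc_encrypt(u128 *dst, const u128 *src, size_t n,
			    const u128 *key, const u128 *iv)
{
	const u128 *prev = iv;

	for (size_t i = 0; i < n; i++) {
		u128 tmp;

		u128_xor(&tmp, &src[i], prev);
		toy_encrypt_block(&dst[i], &tmp, key);
		prev = &dst[i];
	}
}

/*
 * CBC decryption: P[i] = D(C[i]) ^ C[i-1]. Walking backwards lets it run
 * in place (dst == src), because C[i-1] is still intact when it is needed.
 */
static void toy_cbc_decrypt(u128 *dst, const u128 *src, size_t n,
			    const u128 *key, const u128 *iv)
{
	for (size_t i = n; i-- > 0; ) {
		toy_decrypt_block(&dst[i], &src[i], key);
		/*
		 * Read-modify-write of dst[i]. If dst is a page that another
		 * task (or userspace) writes at the same time, these plain
		 * accesses race with it.
		 */
		u128_xor(&dst[i], &dst[i], i ? &src[i - 1] : iv);
	}
}
```

Single-threaded, decrypting the output of toy_cbc_encrypt() in place gives back
the plaintext.  With two tasks handed the same destination page, nothing orders
the loads and stores in u128_xor() against each other.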

I'm not sure what can be done about this.

- Eric

* Re: KCSAN: data-race in glue_cbc_decrypt_req_128bit / glue_cbc_decrypt_req_128bit
  2020-03-31 20:27 ` Eric Biggers
@ 2020-04-01  7:04   ` Dmitry Vyukov
  2020-04-01 10:24     ` Marco Elver
  0 siblings, 1 reply; 7+ messages in thread
From: Dmitry Vyukov @ 2020-04-01  7:04 UTC
  To: Eric Biggers
  Cc: syzbot, Borislav Petkov, David Miller, Marco Elver, Herbert Xu,
	H. Peter Anvin, open list:HARDWARE RANDOM NUMBER GENERATOR CORE,
	LKML, Ingo Molnar, syzkaller-bugs, Thomas Gleixner,
	the arch/x86 maintainers

On Tue, Mar 31, 2020 at 10:27 PM Eric Biggers <ebiggers@kernel.org> wrote:
>
> On Tue, Mar 31, 2020 at 12:35:13PM -0700, syzbot wrote:
> > [...]
> > BUG: KCSAN: data-race in glue_cbc_decrypt_req_128bit / glue_cbc_decrypt_req_128bit
> > [...]
>
> I think this is a problem for almost all the crypto code.  Due to AF_ALG, both
> the source and destination buffers can be userspace pages obtained with
> get_user_pages().  Such pages can be concurrently modified, not just by the
> kernel but also by userspace.
>
> I'm not sure what can be done about this.

Oh, I thought it was something more serious, like a shared crypto object.
Thanks for debugging.
I think I've seen this before in another context (b/149818448):

BUG: KCSAN: data-race in copyin / copyin

write to 0xffff888103c8b000 of 4096 bytes by task 20917 on cpu 0:
 instrument_copy_from_user include/linux/instrumented.h:106 [inline]
 copyin+0xab/0xc0 lib/iov_iter.c:151
 copy_page_from_iter_iovec lib/iov_iter.c:296 [inline]
 copy_page_from_iter+0x23f/0x5f0 lib/iov_iter.c:942
 process_vm_rw_pages mm/process_vm_access.c:46 [inline]
 process_vm_rw_single_vec mm/process_vm_access.c:120 [inline]
 process_vm_rw_core.isra.0+0x448/0x820 mm/process_vm_access.c:218
 process_vm_rw+0x1c4/0x1e0 mm/process_vm_access.c:286
 __do_sys_process_vm_writev mm/process_vm_access.c:308 [inline]
 __se_sys_process_vm_writev mm/process_vm_access.c:303 [inline]
 __x64_sys_process_vm_writev+0x8b/0xb0 mm/process_vm_access.c:303
 do_syscall_64+0xcc/0x3a0 arch/x86/entry/common.c:294
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

write to 0xffff888103c8b000 of 4096 bytes by task 20918 on cpu 1:
 instrument_copy_from_user include/linux/instrumented.h:106 [inline]
 copyin+0xab/0xc0 lib/iov_iter.c:151
 copy_page_from_iter_iovec lib/iov_iter.c:296 [inline]
 copy_page_from_iter+0x23f/0x5f0 lib/iov_iter.c:942
 process_vm_rw_pages mm/process_vm_access.c:46 [inline]
 process_vm_rw_single_vec mm/process_vm_access.c:120 [inline]
 process_vm_rw_core.isra.0+0x448/0x820 mm/process_vm_access.c:218
 process_vm_rw+0x1c4/0x1e0 mm/process_vm_access.c:286
 __do_sys_process_vm_writev mm/process_vm_access.c:308 [inline]
 __se_sys_process_vm_writev mm/process_vm_access.c:303 [inline]
 __x64_sys_process_vm_writev+0x8b/0xb0 mm/process_vm_access.c:303
 do_syscall_64+0xcc/0x3a0 arch/x86/entry/common.c:294
 entry_SYSCALL_64_after_hwframe+0x44/0xa9


Marco, I think we need to ignore all memory that comes from
get_user_pages() somehow. Either don't set watchpoints on it at all, or
perhaps filter the reports out later if the check is not totally free.

* Re: KCSAN: data-race in glue_cbc_decrypt_req_128bit / glue_cbc_decrypt_req_128bit
  2020-04-01  7:04   ` Dmitry Vyukov
@ 2020-04-01 10:24     ` Marco Elver
  2020-04-01 16:20       ` Eric Biggers
  0 siblings, 1 reply; 7+ messages in thread
From: Marco Elver @ 2020-04-01 10:24 UTC
  To: Dmitry Vyukov
  Cc: Eric Biggers, syzbot, Borislav Petkov, David Miller, Herbert Xu,
	H. Peter Anvin, open list:HARDWARE RANDOM NUMBER GENERATOR CORE,
	LKML, Ingo Molnar, syzkaller-bugs, Thomas Gleixner,
	the arch/x86 maintainers

On Wed, 1 Apr 2020 at 09:04, Dmitry Vyukov <dvyukov@google.com> wrote:
>
> On Tue, Mar 31, 2020 at 10:27 PM Eric Biggers <ebiggers@kernel.org> wrote:
> >
> > On Tue, Mar 31, 2020 at 12:35:13PM -0700, syzbot wrote:
> > > [...]
> > > BUG: KCSAN: data-race in glue_cbc_decrypt_req_128bit / glue_cbc_decrypt_req_128bit
> > > [...]
> >
> > I think this is a problem for almost all the crypto code.  Due to AF_ALG, both
> > the source and destination buffers can be userspace pages obtained with
> > get_user_pages().  Such pages can be concurrently modified, not just by the
> > kernel but also by userspace.
> >
> > I'm not sure what can be done about this.
>
> Oh, I thought it was something more serious, like a shared crypto object.
> Thanks for debugging.
> I think I've seen this before in another context (b/149818448):
>
> BUG: KCSAN: data-race in copyin / copyin
> [...]
>
> Marco, I think we need to ignore all memory that comes from
> get_user_pages() somehow. Either don't set watchpoints on it at all, or
> perhaps filter the reports out later if the check is not totally free.

Makes sense. We already have similar checks, and they're in the
slow path, so it shouldn't be a problem. Let me investigate.

Thanks,
-- Marco

* Re: KCSAN: data-race in glue_cbc_decrypt_req_128bit / glue_cbc_decrypt_req_128bit
  2020-04-01 10:24     ` Marco Elver
@ 2020-04-01 16:20       ` Eric Biggers
  2020-04-01 22:53         ` Herbert Xu
  2020-04-14 17:49         ` Marco Elver
  0 siblings, 2 replies; 7+ messages in thread
From: Eric Biggers @ 2020-04-01 16:20 UTC
  To: Marco Elver
  Cc: Dmitry Vyukov, syzbot, Borislav Petkov, David Miller, Herbert Xu,
	H. Peter Anvin, open list:HARDWARE RANDOM NUMBER GENERATOR CORE,
	LKML, Ingo Molnar, syzkaller-bugs, Thomas Gleixner,
	the arch/x86 maintainers

On Wed, Apr 01, 2020 at 12:24:01PM +0200, Marco Elver wrote:
> On Wed, 1 Apr 2020 at 09:04, Dmitry Vyukov <dvyukov@google.com> wrote:
> >
> > On Tue, Mar 31, 2020 at 10:27 PM Eric Biggers <ebiggers@kernel.org> wrote:
> > >
> > > On Tue, Mar 31, 2020 at 12:35:13PM -0700, syzbot wrote:
> > > > [...]
> > > > BUG: KCSAN: data-race in glue_cbc_decrypt_req_128bit / glue_cbc_decrypt_req_128bit
> > > > [...]
> > >
> > > I think this is a problem for almost all the crypto code.  Due to AF_ALG, both
> > > the source and destination buffers can be userspace pages obtained with
> > > get_user_pages().  Such pages can be concurrently modified, not just by the
> > > kernel but also by userspace.
> > >
> > > I'm not sure what can be done about this.
> >
> > Oh, I thought it was something more serious, like a shared crypto object.
> > Thanks for debugging.
> > I think I've seen this before in another context (b/149818448):
> >
> > BUG: KCSAN: data-race in copyin / copyin
> > [...]
> >
> > Marco, I think we need to ignore all memory that comes from
> > get_user_pages() somehow. Either don't set watchpoints on it at all, or
> > perhaps filter the reports out later if the check is not totally free.
> 
> Makes sense. We already have similar checks, and they're in the
> slow path, so it shouldn't be a problem. Let me investigate.
> 

I wonder whether you should really move so quickly to ignoring these races.
They are still races; the crypto code is doing standard unannotated reads/writes
of memory that can be concurrently modified.

The issue is that fixing it would require adding READ_ONCE() / WRITE_ONCE() in
hundreds of different places, affecting most crypto-related .c files.
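
For a sense of what that annotation would look like, here is a userspace sketch
of a single helper, assuming every access to the shared buffer gets marked.
READ_ONCE() / WRITE_ONCE() below are simplified volatile-cast stand-ins for the
kernel macros (they rely on the GCC/Clang __typeof__ extension), and
u128_xor_once() is a hypothetical name, not an existing kernel function.

```c
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel's READ_ONCE()/WRITE_ONCE(). */
#define READ_ONCE(x)     (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

typedef struct { uint64_t a, b; } u128;

/* Plain version, same shape as u128_xor() in include/crypto/b128ops.h. */
static inline void u128_xor(u128 *r, const u128 *p, const u128 *q)
{
	r->a = p->a ^ q->a;
	r->b = p->b ^ q->b;
}

/*
 * Hypothetical annotated variant: each 64-bit half is loaded exactly once
 * and stored exactly once through a volatile access, so the compiler may
 * not refetch, tear or invent accesses to the shared words.
 */
static inline void u128_xor_once(u128 *r, const u128 *p, const u128 *q)
{
	WRITE_ONCE(r->a, READ_ONCE(p->a) ^ READ_ONCE(q->a));
	WRITE_ONCE(r->b, READ_ONCE(p->b) ^ READ_ONCE(q->b));
}
```

Single-threaded, both variants compute the same result; the marked accesses
only tell the compiler (and KCSAN) that concurrent modification is expected.
The same treatment would have to be repeated in every cipher and hash
implementation that touches the buffers directly.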

Generally, since encryption and hash algorithms are designed to handle arbitrary
data anyway, getting different values on each read won't crash the code.  So
hopefully this isn't a "real" problem.  But it's still undefined behavior.

- Eric

* Re: KCSAN: data-race in glue_cbc_decrypt_req_128bit / glue_cbc_decrypt_req_128bit
  2020-04-01 16:20       ` Eric Biggers
@ 2020-04-01 22:53         ` Herbert Xu
  2020-04-14 17:49         ` Marco Elver
  1 sibling, 0 replies; 7+ messages in thread
From: Herbert Xu @ 2020-04-01 22:53 UTC
  To: Eric Biggers
  Cc: Marco Elver, Dmitry Vyukov, syzbot, Borislav Petkov,
	David Miller, H. Peter Anvin,
	open list:HARDWARE RANDOM NUMBER GENERATOR CORE, LKML,
	Ingo Molnar, syzkaller-bugs, Thomas Gleixner,
	the arch/x86 maintainers

On Wed, Apr 01, 2020 at 09:20:28AM -0700, Eric Biggers wrote:
>
> The issue is that fixing it would require adding READ_ONCE() / WRITE_ONCE() in
> hundreds of different places, affecting most crypto-related .c files.

I don't think we should be doing that.  This is exactly the same
as using sendfile(2) and modifying the data during the send.  As
long as you don't trigger behaviours such as crashes or uncontrolled
execution then it's fine.  The output is simply undefined.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

* Re: KCSAN: data-race in glue_cbc_decrypt_req_128bit / glue_cbc_decrypt_req_128bit
  2020-04-01 16:20       ` Eric Biggers
  2020-04-01 22:53         ` Herbert Xu
@ 2020-04-14 17:49         ` Marco Elver
  1 sibling, 0 replies; 7+ messages in thread
From: Marco Elver @ 2020-04-14 17:49 UTC
  To: Eric Biggers
  Cc: Dmitry Vyukov, syzbot, Borislav Petkov, David Miller, Herbert Xu,
	H. Peter Anvin, open list:HARDWARE RANDOM NUMBER GENERATOR CORE,
	LKML, Ingo Molnar, syzkaller-bugs, Thomas Gleixner,
	the arch/x86 maintainers, kasan-dev, Paul E. McKenney

On Wed, 1 Apr 2020 at 18:20, Eric Biggers <ebiggers@kernel.org> wrote:
>
> On Wed, Apr 01, 2020 at 12:24:01PM +0200, Marco Elver wrote:
> > On Wed, 1 Apr 2020 at 09:04, Dmitry Vyukov <dvyukov@google.com> wrote:
> > >
> > > On Tue, Mar 31, 2020 at 10:27 PM Eric Biggers <ebiggers@kernel.org> wrote:
> > > >
> > > > On Tue, Mar 31, 2020 at 12:35:13PM -0700, syzbot wrote:
> > > > > Hello,
> > > > >
> > > > > syzbot found the following crash on:
> > > > >
> > > > > HEAD commit:    b12d66a6 mm, kcsan: Instrument SLAB free with ASSERT_EXCLU..
> > > > > git tree:       https://github.com/google/ktsan.git kcsan
> > > > > console output: https://syzkaller.appspot.com/x/log.txt?x=111f0865e00000
> > > > > kernel config:  https://syzkaller.appspot.com/x/.config?x=10bc0131c4924ba9
> > > > > dashboard link: https://syzkaller.appspot.com/bug?extid=6a6bca8169ffda8ce77b
> > > > > compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> > > > >
> > > > > Unfortunately, I don't have any reproducer for this crash yet.
> > > > >
> > > > > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > > > > Reported-by: syzbot+6a6bca8169ffda8ce77b@syzkaller.appspotmail.com
> > > > >
> > > > > ==================================================================
> > > > > BUG: KCSAN: data-race in glue_cbc_decrypt_req_128bit / glue_cbc_decrypt_req_128bit
> > > > >
> > > > > write to 0xffff88809966e128 of 8 bytes by task 24119 on cpu 0:
> > > > >  u128_xor include/crypto/b128ops.h:67 [inline]
> > > > >  glue_cbc_decrypt_req_128bit+0x396/0x460 arch/x86/crypto/glue_helper.c:144
> > > > >  cbc_decrypt+0x26/0x40 arch/x86/crypto/serpent_avx2_glue.c:152
> > > > >  crypto_skcipher_decrypt+0x65/0x90 crypto/skcipher.c:652
> > > > >  _skcipher_recvmsg crypto/algif_skcipher.c:142 [inline]
> > > > >  skcipher_recvmsg+0x7fa/0x8c0 crypto/algif_skcipher.c:161
> > > > >  skcipher_recvmsg_nokey+0x5e/0x80 crypto/algif_skcipher.c:279
> > > > >  sock_recvmsg_nosec net/socket.c:886 [inline]
> > > > >  sock_recvmsg net/socket.c:904 [inline]
> > > > >  sock_recvmsg+0x92/0xb0 net/socket.c:900
> > > > >  ____sys_recvmsg+0x167/0x3a0 net/socket.c:2566
> > > > >  ___sys_recvmsg+0xb2/0x100 net/socket.c:2608
> > > > >  __sys_recvmsg+0x9d/0x160 net/socket.c:2642
> > > > >  __do_sys_recvmsg net/socket.c:2652 [inline]
> > > > >  __se_sys_recvmsg net/socket.c:2649 [inline]
> > > > >  __x64_sys_recvmsg+0x51/0x70 net/socket.c:2649
> > > > >  do_syscall_64+0xcc/0x3a0 arch/x86/entry/common.c:294
> > > > >  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > > > >
> > > > > read to 0xffff88809966e128 of 8 bytes by task 24118 on cpu 1:
> > > > >  u128_xor include/crypto/b128ops.h:67 [inline]
> > > > >  glue_cbc_decrypt_req_128bit+0x37c/0x460 arch/x86/crypto/glue_helper.c:144
> > > > >  cbc_decrypt+0x26/0x40 arch/x86/crypto/serpent_avx2_glue.c:152
> > > > >  crypto_skcipher_decrypt+0x65/0x90 crypto/skcipher.c:652
> > > > >  _skcipher_recvmsg crypto/algif_skcipher.c:142 [inline]
> > > > >  skcipher_recvmsg+0x7fa/0x8c0 crypto/algif_skcipher.c:161
> > > > >  skcipher_recvmsg_nokey+0x5e/0x80 crypto/algif_skcipher.c:279
> > > > >  sock_recvmsg_nosec net/socket.c:886 [inline]
> > > > >  sock_recvmsg net/socket.c:904 [inline]
> > > > >  sock_recvmsg+0x92/0xb0 net/socket.c:900
> > > > >  ____sys_recvmsg+0x167/0x3a0 net/socket.c:2566
> > > > >  ___sys_recvmsg+0xb2/0x100 net/socket.c:2608
> > > > >  __sys_recvmsg+0x9d/0x160 net/socket.c:2642
> > > > >  __do_sys_recvmsg net/socket.c:2652 [inline]
> > > > >  __se_sys_recvmsg net/socket.c:2649 [inline]
> > > > >  __x64_sys_recvmsg+0x51/0x70 net/socket.c:2649
> > > > >  do_syscall_64+0xcc/0x3a0 arch/x86/entry/common.c:294
> > > > >  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > > > >
> > > > > Reported by Kernel Concurrency Sanitizer on:
> > > > > CPU: 1 PID: 24118 Comm: syz-executor.1 Not tainted 5.6.0-rc1-syzkaller #0
> > > > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> > > > > ==================================================================
> > > > >
> > > >
> > > > I think this is a problem for almost all the crypto code.  Due to AF_ALG, both
> > > > the source and destination buffers can be userspace pages that were gotten with
> > > > get_user_pages().  Such pages can be concurrently modified, not just by the
> > > > kernel but also by userspace.
> > > >
> > > > I'm not sure what can be done about this.
> > >
> > > Oh, I thought it's something more serious like a shared crypto object.
> > > Thanks for debugging.
[...]
> > >
> > > Marco, I think we need to ignore all memory that comes from
> > > get_user_pages() somehow. Either not set watchpoints at all, or
> > > perhaps filter them out later if the check is not totally free.
> >
> > Makes sense. We already have similar checks, and they're in the
> > slow-path, so it shouldn't be a problem. Let me investigate.
>
> I'm wondering whether you really should move so soon to ignoring these races?
> They are still races; the crypto code is doing standard unannotated reads/writes
> of memory that can be concurrently modified.
>
[...]

I wanted to follow up on this, just to clarify: the issue here
essentially boils down to a user-space race involving an API that
isn't designed to be thread-safe with the provided arguments (pointers
to the same user-space memory). The data race merely manifests in
kernel code, but otherwise the kernel is unaffected (if it were
affected, a real fix would be needed). I.e. if we observe this data
race, KCSAN is helpfully pointing out that user space has a bug.

There are some options to deal with cases like this:

1. Do nothing, and just let KCSAN report the data race.

2. Somehow make KCSAN distinguish in-kernel data races that are due to
user space misusing the API. KCSAN can still show the race, but
clearly denote the nature of it by e.g. saying "KCSAN: user data-race
in ..." (instead of "KCSAN: data-race in ..."). This will require one
of 2 things:

    a. Distinguish the access by memory range. This doesn't seem
great, because I don't know if we can apply a general rule like "all
races involving this memory are user-space's fault". What if we have
data races in the memory range that aren't user-space's fault?

    b. Mark the accesses somehow, e.g. by providing a region of code in
which all races are deemed user-space's fault. This is likely more
problematic than (a), because saying something like "all races in this
section of code are user-space's fault" may also hide real issues.

Because neither (2.a) nor (2.b) seems great, at present I would opt for (1).
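
To make the concern with (2.a) concrete, here is a rough, stand-alone C
sketch of what classifying by memory range could look like. It is not
KCSAN code: struct addr_range, range_contains() and race_title_prefix()
are hypothetical names made up for illustration, and the table of
"user-backed" ranges is assumed to be supplied by some other mechanism
(e.g. tracking pages obtained via get_user_pages()).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: a range assumed to be backed by pinned user-space pages. */
struct addr_range {
	uintptr_t start;	/* inclusive */
	uintptr_t end;		/* exclusive */
};

/* True if the access [addr, addr + size) lies entirely inside the range. */
bool range_contains(const struct addr_range *r, uintptr_t addr, size_t size)
{
	return addr >= r->start && addr < r->end && size <= r->end - addr;
}

/*
 * Option (2.a): pick the report title prefix from the racing address alone.
 * Note that nothing here looks at *who* performed the accesses.
 */
const char *race_title_prefix(const struct addr_range *user_ranges, size_t n,
			      uintptr_t addr, size_t size)
{
	for (size_t i = 0; i < n; i++) {
		if (range_contains(&user_ranges[i], addr, size))
			return "KCSAN: user data-race in";
	}
	return "KCSAN: data-race in";
}
```

Because the decision depends only on the address, a genuine kernel-side
race that happens to touch the same pages would be labelled a "user
data-race" as well, which is exactly the weakness described in (2.a).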

Anything better we can do here?

Thanks,
-- Marco


end of thread, other threads:[~2020-04-14 17:50 UTC | newest]

Thread overview: 7+ messages
2020-03-31 19:35 KCSAN: data-race in glue_cbc_decrypt_req_128bit / glue_cbc_decrypt_req_128bit syzbot
2020-03-31 20:27 ` Eric Biggers
2020-04-01  7:04   ` Dmitry Vyukov
2020-04-01 10:24     ` Marco Elver
2020-04-01 16:20       ` Eric Biggers
2020-04-01 22:53         ` Herbert Xu
2020-04-14 17:49         ` Marco Elver
