linux-crypto.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* crypto: NULL deref in sha512_mb_mgr_get_comp_job_avx2
@ 2017-01-31 13:16 Dmitry Vyukov
  2017-01-31 18:51 ` Tim Chen
  2017-02-01 18:45 ` Tim Chen
  0 siblings, 2 replies; 8+ messages in thread
From: Dmitry Vyukov @ 2017-01-31 13:16 UTC (permalink / raw)
  To: Herbert Xu, David Miller, linux-crypto, LKML, megha.dey,
	fenghua.yu, tim.c.chen
  Cc: syzkaller

Hello,

I am getting the following reports with low frequency while running
syzkaller fuzzer. Unfortunately they are not reproducible and happen
in a background thread, so it is difficult to extract any context on
my side. I see only few such crashes per week, so most likely it is
some hard to trigger data race. The following reports are from mmotm
tree, commits 00e20cfc2bf04a0cbe1f5405f61c8426f43eee84 and
fff7e71eac7788904753136f09bcad7471f7799e. Any ideas as to how this can
happen?

BUG: unable to handle kernel NULL pointer dereference at 0000000000000060
IP: [<ffffffff813fc09e>] sha512_mb_mgr_get_comp_job_avx2+0x6e/0xee
arch/x86/crypto/sha512-mb/sha512_mb_mgr_flush_avx2.S:251
PGD 1d2395067 [  220.874864] PUD 1d2860067
Oops: 0002 [#1] SMP KASAN
Dumping ftrace buffer:
   (ftrace buffer empty)
Modules linked in:
CPU: 0 PID: 516 Comm: kworker/0:1 Not tainted 4.9.0 #4
Hardware name: Google Google Compute Engine/Google Compute Engine,
BIOS Google 01/01/2011
Workqueue: crypto mcryptd_queue_worker
task: ffff8801d9f346c0 task.stack: ffff8801d9f08000
RIP: 0010:[<ffffffff813fc09e>]  [<ffffffff813fc09e>]
sha512_mb_mgr_get_comp_job_avx2+0x6e/0xee
arch/x86/crypto/sha512-mb/sha512_mb_mgr_flush_avx2.S:251
RSP: 0018:ffff8801d9f0eef8  EFLAGS: 00010202
RAX: 0000000000000000 RBX: ffff8801d7db1190 RCX: 0000000000000006
RDX: 0000000000000001 RSI: ffff8801d9f34ee8 RDI: ffff8801d7db1040
RBP: ffff8801d9f0f258 R08: 0000000100000000 R09: 0000000000000001
R10: 0000000000000002 R11: 0000000000000003 R12: ffff8801d9f0f230
R13: ffff8801c8bbc4e0 R14: ffff8801c8bbc530 R15: ffff8801d9f0ef70
FS:  0000000000000000(0000) GS:ffff8801dc000000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000060 CR3: 00000001cc15a000 CR4: 00000000001406f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Stack:
 ffff8801d7db1040 ffffffff813fa207 dffffc0000000000 ffffe8ffffc0f238
 0000000000000002 1ffff1003b3e1dea ffffe8ffffc0f218 ffff8801d9f0f190
 0000000000000282 ffffe8ffffc0f140 ffffe8ffffc0f220 0000000041b58ab3
Call Trace:
 [<ffffffff813fb407>] sha512_mb_update+0x2f7/0x4e0
arch/x86/crypto/sha512-mb/sha512_mb.c:588
 [<ffffffff8219d4ad>] crypto_ahash_update include/crypto/hash.h:512 [inline]
 [<ffffffff8219d4ad>] ahash_mcryptd_update crypto/mcryptd.c:627 [inline]
 [<ffffffff8219d4ad>] mcryptd_hash_update+0xcd/0x1c0 crypto/mcryptd.c:373
 [<ffffffff8219c99f>] mcryptd_queue_worker+0xff/0x6a0 crypto/mcryptd.c:181
 [<ffffffff81492960>] process_one_work+0xbd0/0x1c10 kernel/workqueue.c:2096
 [<ffffffff81493bc3>] worker_thread+0x223/0x1990 kernel/workqueue.c:2230
 [<ffffffff814abb33>] kthread+0x323/0x3e0 kernel/kthread.c:209
 [<ffffffff8436fbaa>] ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:433
Code: 49 0f 42 d3 48 f7 c2 f0 ff ff ff 0f 85 9a 00 00 00 48 83 e2 0f
48 6b da 08 48 8d 9c 1f 48 01 00 00 48 8b 03 48 c7 03 00 00 00 00 <c7>
40 60 02 00 00 00 48 8b 9f 40 01 00 00 48 c1 e3 08 48 09 d3
RIP  [<ffffffff813fc09e>] sha512_mb_mgr_get_comp_job_avx2+0x6e/0xee
arch/x86/crypto/sha512-mb/sha512_mb_mgr_flush_avx2.S:251
 RSP <ffff8801d9f0eef8>
CR2: 0000000000000060
---[ end trace 139fd4cda5dfe2c4 ]---

BUG: unable to handle kernel NULL pointer dereference at 0000000000000060
IP: [<ffffffff813fc09e>] sha512_mb_mgr_get_comp_job_avx2+0x6e/0xee
arch/x86/crypto/sha512-mb/sha512_mb_mgr_flush_avx2.S:251
PGD 1c68ad067 [  624.973638] PUD 1d485a067
Oops: 0002 [#1] SMP KASAN
Dumping ftrace buffer:
   (ftrace buffer empty)
Modules linked in:
CPU: 0 PID: 517 Comm: kworker/0:1 Not tainted 4.9.0 #3
Hardware name: Google Google Compute Engine/Google Compute Engine,
BIOS Google 01/01/2011
Workqueue: crypto mcryptd_queue_worker
task: ffff8801d9e64700 task.stack: ffff8801d9838000
RIP: 0010:[<ffffffff813fc09e>]  [<ffffffff813fc09e>]
sha512_mb_mgr_get_comp_job_avx2+0x6e/0xee
arch/x86/crypto/sha512-mb/sha512_mb_mgr_flush_avx2.S:251
RSP: 0018:ffff8801d983eef8  EFLAGS: 00010202
RAX: 0000000000000000 RBX: ffff8801d7d96950 RCX: 0000000000000006
RDX: 0000000000000001 RSI: ffff8801d9e64f28 RDI: ffff8801d7d96800
RBP: ffff8801d983f258 R08: 0000000100000000 R09: 0000000000000001
R10: 0000000000000002 R11: 0000000000000003 R12: ffff8801d983f230
R13: ffff8801b67f5720 R14: ffff8801b67f5770 R15: ffff8801d983ef70
FS:  0000000000000000(0000) GS:ffff8801dc000000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000060 CR3: 00000001cee58000 CR4: 00000000001406f0
Stack:
 ffff8801d7d96800 ffffffff813fa207 dffffc0000000000 ffffe8ffffc0f238
 0000000000000002 1ffff1003b307dea ffffe8ffffc0f218 ffff8801d983f190
 0000000000000282 ffffe8ffffc0f140 ffffe8ffffc0f220 0000000041b58ab3
Call Trace:
 [<ffffffff813fb407>] sha512_mb_update+0x2f7/0x4e0
arch/x86/crypto/sha512-mb/sha512_mb.c:588
 [<ffffffff8219cdad>] crypto_ahash_update include/crypto/hash.h:512 [inline]
 [<ffffffff8219cdad>] ahash_mcryptd_update crypto/mcryptd.c:627 [inline]
 [<ffffffff8219cdad>] mcryptd_hash_update+0xcd/0x1c0 crypto/mcryptd.c:373
 [<ffffffff8219c29f>] mcryptd_queue_worker+0xff/0x6a0 crypto/mcryptd.c:181
 [<ffffffff81492960>] process_one_work+0xbd0/0x1c10 kernel/workqueue.c:2096
 [<ffffffff81493bc3>] worker_thread+0x223/0x1990 kernel/workqueue.c:2230
 [<ffffffff814abb33>] kthread+0x323/0x3e0 kernel/kthread.c:209
 [<ffffffff8436fbea>] ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:433
Code: 49 0f 42 d3 48 f7 c2 f0 ff ff ff 0f 85 9a 00 00 00 48 83 e2 0f
48 6b da 08 48 8d 9c 1f 48 01 00 00 48 8b 03 48 c7 03 00 00 00 00 <c7>
40 60 02 00 00 00 48 8b 9f 40 01 00 00 48 c1 e3 08 48 09 d3
RIP  [<ffffffff813fc09e>] sha512_mb_mgr_get_comp_job_avx2+0x6e/0xee
arch/x86/crypto/sha512-mb/sha512_mb_mgr_flush_avx2.S:251
 RSP <ffff8801d983eef8>
CR2: 0000000000000060
---[ end trace 76403e033556dcb7 ]---

BUG: unable to handle kernel NULL pointer dereference at 0000000000000060
IP: [<ffffffff813fc09e>] sha512_mb_mgr_get_comp_job_avx2+0x6e/0xee
arch/x86/crypto/sha512-mb/sha512_mb_mgr_flush_avx2.S:251
PGD 1d6242067 [  226.248182] PUD 1d2093067
Oops: 0002 [#1] SMP KASAN
Dumping ftrace buffer:
   (ftrace buffer empty)
Modules linked in:
CPU: 1 PID: 1788 Comm: kworker/1:2 Not tainted 4.9.0 #3
Hardware name: Google Google Compute Engine/Google Compute Engine,
BIOS Google 01/01/2011
Workqueue: crypto mcryptd_queue_worker
task: ffff8801cc3ee100 task.stack: ffff8801cd068000
RIP: 0010:[<ffffffff813fc09e>]  [<ffffffff813fc09e>]
sha512_mb_mgr_get_comp_job_avx2+0x6e/0xee
arch/x86/crypto/sha512-mb/sha512_mb_mgr_flush_avx2.S:251
RSP: 0018:ffff8801cd06eef8  EFLAGS: 00010202
RAX: 0000000000000000 RBX: ffff8801d7dc3c10 RCX: 0000000000000006
RDX: 0000000000000001 RSI: ffff8801cc3ee928 RDI: ffff8801d7dc3ac0
RBP: ffff8801cd06f258 R08: 0000000100000000 R09: 0000000000000001
R10: 0000000000000002 R11: 0000000000000003 R12: ffff8801cd06f230
R13: ffff8801c6eb9aa0 R14: ffff8801c6eb9af0 R15: ffff8801cd06ef70
FS:  0000000000000000(0000) GS:ffff8801dc100000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000060 CR3: 00000001d6201000 CR4: 00000000001406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Stack:
 ffff8801d7dc3ac0 ffffffff813fa207 dffffc0000000000 ffffe8ffffd0f238
 0000000000000002 1ffff10039a0ddea ffffe8ffffd0f218 ffff8801cd06f190
 0000000000000282 ffffe8ffffd0f140 ffffe8ffffd0f220 0000000041b58ab3
Call Trace:
 [<ffffffff813fb407>] sha512_mb_update+0x2f7/0x4e0
arch/x86/crypto/sha512-mb/sha512_mb.c:588
 [<ffffffff8219cdad>] crypto_ahash_update include/crypto/hash.h:512 [inline]
 [<ffffffff8219cdad>] ahash_mcryptd_update crypto/mcryptd.c:627 [inline]
 [<ffffffff8219cdad>] mcryptd_hash_update+0xcd/0x1c0 crypto/mcryptd.c:373
 [<ffffffff8219c29f>] mcryptd_queue_worker+0xff/0x6a0 crypto/mcryptd.c:181
 [<ffffffff81492960>] process_one_work+0xbd0/0x1c10 kernel/workqueue.c:2096
 [<ffffffff81493bc3>] worker_thread+0x223/0x1990 kernel/workqueue.c:2230
 [<ffffffff814abb33>] kthread+0x323/0x3e0 kernel/kthread.c:209
 [<ffffffff8436fbea>] ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:433
Code: 49 0f 42 d3 48 f7 c2 f0 ff ff ff 0f 85 9a 00 00 00 48 83 e2 0f
48 6b da 08 48 8d 9c 1f 48 01 00 00 48 8b 03 48 c7 03 00 00 00 00 <c7>
40 60 02 00 00 00 48 8b 9f 40 01 00 00 48 c1 e3 08 48 09 d3
RIP  [<ffffffff813fc09e>] sha512_mb_mgr_get_comp_job_avx2+0x6e/0xee
arch/x86/crypto/sha512-mb/sha512_mb_mgr_flush_avx2.S:251
 RSP <ffff8801cd06eef8>
CR2: 0000000000000060
---[ end trace 47d3302a6c62cfbc ]---

BUG: unable to handle kernel NULL pointer dereference at 0000000000000060
IP: [<ffffffff813fc09e>] sha512_mb_mgr_get_comp_job_avx2+0x6e/0xee
arch/x86/crypto/sha512-mb/sha512_mb_mgr_flush_avx2.S:251
PGD 1ccad4067 [   32.785777] PUD 1cb96c067
Oops: 0002 [#1] SMP KASAN
Dumping ftrace buffer:
   (ftrace buffer empty)
Modules linked in:
CPU: 1 PID: 3231 Comm: kworker/1:2 Not tainted 4.9.0 #3
Hardware name: Google Google Compute Engine/Google Compute Engine,
BIOS Google 01/01/2011
Workqueue: crypto mcryptd_queue_worker
task: ffff8801cf472700 task.stack: ffff8801ce848000
RIP: 0010:[<ffffffff813fc09e>]  [<ffffffff813fc09e>]
sha512_mb_mgr_get_comp_job_avx2+0x6e/0xee
arch/x86/crypto/sha512-mb/sha512_mb_mgr_flush_avx2.S:251
RSP: 0018:ffff8801ce84eef8  EFLAGS: 00010202
RAX: 0000000000000000 RBX: ffff8801d7c82950 RCX: 0000000000000006
RDX: 0000000000000001 RSI: ffff8801cf472f28 RDI: ffff8801d7c82800
RBP: ffff8801ce84f258 R08: 0000000100000000 R09: 0000000000000001
R10: 0000000000000002 R11: 0000000000000003 R12: ffff8801ce84f230
R13: ffff8801c970e760 R14: ffff8801c970e7b0 R15: ffff8801ce84ef70
FS:  0000000000000000(0000) GS:ffff8801dc100000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000060 CR3: 00000001ca654000 CR4: 00000000001406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Stack:
 ffff8801d7c82800 ffffffff813fa207 dffffc0000000000 ffffe8ffffd0f238
 0000000000000002 1ffff10039d09dea ffffe8ffffd0f218 ffff8801ce84f190
 0000000000000282 ffffe8ffffd0f140 ffffe8ffffd0f220 0000000041b58ab3
Call Trace:
 [<ffffffff813fb407>] sha512_mb_update+0x2f7/0x4e0
arch/x86/crypto/sha512-mb/sha512_mb.c:588
 [<ffffffff8219cdad>] crypto_ahash_update include/crypto/hash.h:512 [inline]
 [<ffffffff8219cdad>] ahash_mcryptd_update crypto/mcryptd.c:627 [inline]
 [<ffffffff8219cdad>] mcryptd_hash_update+0xcd/0x1c0 crypto/mcryptd.c:373
 [<ffffffff8219c29f>] mcryptd_queue_worker+0xff/0x6a0 crypto/mcryptd.c:181
 [<ffffffff81492960>] process_one_work+0xbd0/0x1c10 kernel/workqueue.c:2096
 [<ffffffff81493bc3>] worker_thread+0x223/0x1990 kernel/workqueue.c:2230
 [<ffffffff814abb33>] kthread+0x323/0x3e0 kernel/kthread.c:209
 [<ffffffff8436fbea>] ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:433
Code: 49 0f 42 d3 48 f7 c2 f0 ff ff ff 0f 85 9a 00 00 00 48 83 e2 0f
48 6b da 08 48 8d 9c 1f 48 01 00 00 48 8b 03 48 c7 03 00 00 00 00 <c7>
40 60 02 00 00 00 48 8b 9f 40 01 00 00 48 c1 e3 08 48 09 d3
RIP  [<ffffffff813fc09e>] sha512_mb_mgr_get_comp_job_avx2+0x6e/0xee
arch/x86/crypto/sha512-mb/sha512_mb_mgr_flush_avx2.S:251
 RSP <ffff8801ce84eef8>
CR2: 0000000000000060
---[ end trace 3af8184eabd21203 ]---

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: crypto: NULL deref in sha512_mb_mgr_get_comp_job_avx2
  2017-01-31 13:16 crypto: NULL deref in sha512_mb_mgr_get_comp_job_avx2 Dmitry Vyukov
@ 2017-01-31 18:51 ` Tim Chen
  2017-02-01 18:45 ` Tim Chen
  1 sibling, 0 replies; 8+ messages in thread
From: Tim Chen @ 2017-01-31 18:51 UTC (permalink / raw)
  To: Dmitry Vyukov, Herbert Xu, David Miller, linux-crypto, LKML,
	megha.dey, fenghua.yu
  Cc: syzkaller

On Tue, 2017-01-31 at 14:16 +0100, Dmitry Vyukov wrote:
> Hello,
> 
> I am getting the following reports with low frequency while running
> syzkaller fuzzer. Unfortunately they are not reproducible and happen
> in a background thread, so it is difficult to extract any context on
> my side. I see only few such crashes per week, so most likely it is
> some hard to trigger data race. The following reports are from mmotm
> tree, commits 00e20cfc2bf04a0cbe1f5405f61c8426f43eee84 and
> fff7e71eac7788904753136f09bcad7471f7799e. Any ideas as to how this can
> happen?

Wonder if there is a race between the flusher thread that flush out
existing jobs if we don't have incoming jobs for a while and computation
via mcryptd.  Maybe the flusher fires at the same time when there is
a new job arriving.

Let Megha and I think a bit about it to come up with a patch to see
if that's the case.

Tim

> 
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000060
> IP: [<ffffffff813fc09e>] sha512_mb_mgr_get_comp_job_avx2+0x6e/0xee
> arch/x86/crypto/sha512-mb/sha512_mb_mgr_flush_avx2.S:251
> PGD 1d2395067 [  220.874864] PUD 1d2860067
> Oops: 0002 [#1] SMP KASAN
> Dumping ftrace buffer:
>    (ftrace buffer empty)
> Modules linked in:
> CPU: 0 PID: 516 Comm: kworker/0:1 Not tainted 4.9.0 #4
> Hardware name: Google Google Compute Engine/Google Compute Engine,
> BIOS Google 01/01/2011
> Workqueue: crypto mcryptd_queue_worker
> task: ffff8801d9f346c0 task.stack: ffff8801d9f08000
> RIP: 0010:[<ffffffff813fc09e>]  [<ffffffff813fc09e>]
> sha512_mb_mgr_get_comp_job_avx2+0x6e/0xee
> arch/x86/crypto/sha512-mb/sha512_mb_mgr_flush_avx2.S:251
> RSP: 0018:ffff8801d9f0eef8  EFLAGS: 00010202
> RAX: 0000000000000000 RBX: ffff8801d7db1190 RCX: 0000000000000006
> RDX: 0000000000000001 RSI: ffff8801d9f34ee8 RDI: ffff8801d7db1040
> RBP: ffff8801d9f0f258 R08: 0000000100000000 R09: 0000000000000001
> R10: 0000000000000002 R11: 0000000000000003 R12: ffff8801d9f0f230
> R13: ffff8801c8bbc4e0 R14: ffff8801c8bbc530 R15: ffff8801d9f0ef70
> FS:  0000000000000000(0000) GS:ffff8801dc000000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000060 CR3: 00000001cc15a000 CR4: 00000000001406f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Stack:
>  ffff8801d7db1040 ffffffff813fa207 dffffc0000000000 ffffe8ffffc0f238
>  0000000000000002 1ffff1003b3e1dea ffffe8ffffc0f218 ffff8801d9f0f190
>  0000000000000282 ffffe8ffffc0f140 ffffe8ffffc0f220 0000000041b58ab3
> Call Trace:
>  [<ffffffff813fb407>] sha512_mb_update+0x2f7/0x4e0
> arch/x86/crypto/sha512-mb/sha512_mb.c:588
>  [<ffffffff8219d4ad>] crypto_ahash_update include/crypto/hash.h:512 [inline]
>  [<ffffffff8219d4ad>] ahash_mcryptd_update crypto/mcryptd.c:627 [inline]
>  [<ffffffff8219d4ad>] mcryptd_hash_update+0xcd/0x1c0 crypto/mcryptd.c:373
>  [<ffffffff8219c99f>] mcryptd_queue_worker+0xff/0x6a0 crypto/mcryptd.c:181
>  [<ffffffff81492960>] process_one_work+0xbd0/0x1c10 kernel/workqueue.c:2096
>  [<ffffffff81493bc3>] worker_thread+0x223/0x1990 kernel/workqueue.c:2230
>  [<ffffffff814abb33>] kthread+0x323/0x3e0 kernel/kthread.c:209
>  [<ffffffff8436fbaa>] ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:433
> Code: 49 0f 42 d3 48 f7 c2 f0 ff ff ff 0f 85 9a 00 00 00 48 83 e2 0f
> 48 6b da 08 48 8d 9c 1f 48 01 00 00 48 8b 03 48 c7 03 00 00 00 00 <c7>
> 40 60 02 00 00 00 48 8b 9f 40 01 00 00 48 c1 e3 08 48 09 d3
> RIP  [<ffffffff813fc09e>] sha512_mb_mgr_get_comp_job_avx2+0x6e/0xee
> arch/x86/crypto/sha512-mb/sha512_mb_mgr_flush_avx2.S:251
>  RSP <ffff8801d9f0eef8>
> CR2: 0000000000000060
> ---[ end trace 139fd4cda5dfe2c4 ]---
> 
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000060
> IP: [<ffffffff813fc09e>] sha512_mb_mgr_get_comp_job_avx2+0x6e/0xee
> arch/x86/crypto/sha512-mb/sha512_mb_mgr_flush_avx2.S:251
> PGD 1c68ad067 [  624.973638] PUD 1d485a067
> Oops: 0002 [#1] SMP KASAN
> Dumping ftrace buffer:
>    (ftrace buffer empty)
> Modules linked in:
> CPU: 0 PID: 517 Comm: kworker/0:1 Not tainted 4.9.0 #3
> Hardware name: Google Google Compute Engine/Google Compute Engine,
> BIOS Google 01/01/2011
> Workqueue: crypto mcryptd_queue_worker
> task: ffff8801d9e64700 task.stack: ffff8801d9838000
> RIP: 0010:[<ffffffff813fc09e>]  [<ffffffff813fc09e>]
> sha512_mb_mgr_get_comp_job_avx2+0x6e/0xee
> arch/x86/crypto/sha512-mb/sha512_mb_mgr_flush_avx2.S:251
> RSP: 0018:ffff8801d983eef8  EFLAGS: 00010202
> RAX: 0000000000000000 RBX: ffff8801d7d96950 RCX: 0000000000000006
> RDX: 0000000000000001 RSI: ffff8801d9e64f28 RDI: ffff8801d7d96800
> RBP: ffff8801d983f258 R08: 0000000100000000 R09: 0000000000000001
> R10: 0000000000000002 R11: 0000000000000003 R12: ffff8801d983f230
> R13: ffff8801b67f5720 R14: ffff8801b67f5770 R15: ffff8801d983ef70
> FS:  0000000000000000(0000) GS:ffff8801dc000000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000060 CR3: 00000001cee58000 CR4: 00000000001406f0
> Stack:
>  ffff8801d7d96800 ffffffff813fa207 dffffc0000000000 ffffe8ffffc0f238
>  0000000000000002 1ffff1003b307dea ffffe8ffffc0f218 ffff8801d983f190
>  0000000000000282 ffffe8ffffc0f140 ffffe8ffffc0f220 0000000041b58ab3
> Call Trace:
>  [<ffffffff813fb407>] sha512_mb_update+0x2f7/0x4e0
> arch/x86/crypto/sha512-mb/sha512_mb.c:588
>  [<ffffffff8219cdad>] crypto_ahash_update include/crypto/hash.h:512 [inline]
>  [<ffffffff8219cdad>] ahash_mcryptd_update crypto/mcryptd.c:627 [inline]
>  [<ffffffff8219cdad>] mcryptd_hash_update+0xcd/0x1c0 crypto/mcryptd.c:373
>  [<ffffffff8219c29f>] mcryptd_queue_worker+0xff/0x6a0 crypto/mcryptd.c:181
>  [<ffffffff81492960>] process_one_work+0xbd0/0x1c10 kernel/workqueue.c:2096
>  [<ffffffff81493bc3>] worker_thread+0x223/0x1990 kernel/workqueue.c:2230
>  [<ffffffff814abb33>] kthread+0x323/0x3e0 kernel/kthread.c:209
>  [<ffffffff8436fbea>] ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:433
> Code: 49 0f 42 d3 48 f7 c2 f0 ff ff ff 0f 85 9a 00 00 00 48 83 e2 0f
> 48 6b da 08 48 8d 9c 1f 48 01 00 00 48 8b 03 48 c7 03 00 00 00 00 <c7>
> 40 60 02 00 00 00 48 8b 9f 40 01 00 00 48 c1 e3 08 48 09 d3
> RIP  [<ffffffff813fc09e>] sha512_mb_mgr_get_comp_job_avx2+0x6e/0xee
> arch/x86/crypto/sha512-mb/sha512_mb_mgr_flush_avx2.S:251
>  RSP <ffff8801d983eef8>
> CR2: 0000000000000060
> ---[ end trace 76403e033556dcb7 ]---
> 
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000060
> IP: [<ffffffff813fc09e>] sha512_mb_mgr_get_comp_job_avx2+0x6e/0xee
> arch/x86/crypto/sha512-mb/sha512_mb_mgr_flush_avx2.S:251
> PGD 1d6242067 [  226.248182] PUD 1d2093067
> Oops: 0002 [#1] SMP KASAN
> Dumping ftrace buffer:
>    (ftrace buffer empty)
> Modules linked in:
> CPU: 1 PID: 1788 Comm: kworker/1:2 Not tainted 4.9.0 #3
> Hardware name: Google Google Compute Engine/Google Compute Engine,
> BIOS Google 01/01/2011
> Workqueue: crypto mcryptd_queue_worker
> task: ffff8801cc3ee100 task.stack: ffff8801cd068000
> RIP: 0010:[<ffffffff813fc09e>]  [<ffffffff813fc09e>]
> sha512_mb_mgr_get_comp_job_avx2+0x6e/0xee
> arch/x86/crypto/sha512-mb/sha512_mb_mgr_flush_avx2.S:251
> RSP: 0018:ffff8801cd06eef8  EFLAGS: 00010202
> RAX: 0000000000000000 RBX: ffff8801d7dc3c10 RCX: 0000000000000006
> RDX: 0000000000000001 RSI: ffff8801cc3ee928 RDI: ffff8801d7dc3ac0
> RBP: ffff8801cd06f258 R08: 0000000100000000 R09: 0000000000000001
> R10: 0000000000000002 R11: 0000000000000003 R12: ffff8801cd06f230
> R13: ffff8801c6eb9aa0 R14: ffff8801c6eb9af0 R15: ffff8801cd06ef70
> FS:  0000000000000000(0000) GS:ffff8801dc100000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000060 CR3: 00000001d6201000 CR4: 00000000001406e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Stack:
>  ffff8801d7dc3ac0 ffffffff813fa207 dffffc0000000000 ffffe8ffffd0f238
>  0000000000000002 1ffff10039a0ddea ffffe8ffffd0f218 ffff8801cd06f190
>  0000000000000282 ffffe8ffffd0f140 ffffe8ffffd0f220 0000000041b58ab3
> Call Trace:
>  [<ffffffff813fb407>] sha512_mb_update+0x2f7/0x4e0
> arch/x86/crypto/sha512-mb/sha512_mb.c:588
>  [<ffffffff8219cdad>] crypto_ahash_update include/crypto/hash.h:512 [inline]
>  [<ffffffff8219cdad>] ahash_mcryptd_update crypto/mcryptd.c:627 [inline]
>  [<ffffffff8219cdad>] mcryptd_hash_update+0xcd/0x1c0 crypto/mcryptd.c:373
>  [<ffffffff8219c29f>] mcryptd_queue_worker+0xff/0x6a0 crypto/mcryptd.c:181
>  [<ffffffff81492960>] process_one_work+0xbd0/0x1c10 kernel/workqueue.c:2096
>  [<ffffffff81493bc3>] worker_thread+0x223/0x1990 kernel/workqueue.c:2230
>  [<ffffffff814abb33>] kthread+0x323/0x3e0 kernel/kthread.c:209
>  [<ffffffff8436fbea>] ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:433
> Code: 49 0f 42 d3 48 f7 c2 f0 ff ff ff 0f 85 9a 00 00 00 48 83 e2 0f
> 48 6b da 08 48 8d 9c 1f 48 01 00 00 48 8b 03 48 c7 03 00 00 00 00 <c7>
> 40 60 02 00 00 00 48 8b 9f 40 01 00 00 48 c1 e3 08 48 09 d3
> RIP  [<ffffffff813fc09e>] sha512_mb_mgr_get_comp_job_avx2+0x6e/0xee
> arch/x86/crypto/sha512-mb/sha512_mb_mgr_flush_avx2.S:251
>  RSP <ffff8801cd06eef8>
> CR2: 0000000000000060
> ---[ end trace 47d3302a6c62cfbc ]---
> 
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000060
> IP: [<ffffffff813fc09e>] sha512_mb_mgr_get_comp_job_avx2+0x6e/0xee
> arch/x86/crypto/sha512-mb/sha512_mb_mgr_flush_avx2.S:251
> PGD 1ccad4067 [   32.785777] PUD 1cb96c067
> Oops: 0002 [#1] SMP KASAN
> Dumping ftrace buffer:
>    (ftrace buffer empty)
> Modules linked in:
> CPU: 1 PID: 3231 Comm: kworker/1:2 Not tainted 4.9.0 #3
> Hardware name: Google Google Compute Engine/Google Compute Engine,
> BIOS Google 01/01/2011
> Workqueue: crypto mcryptd_queue_worker
> task: ffff8801cf472700 task.stack: ffff8801ce848000
> RIP: 0010:[<ffffffff813fc09e>]  [<ffffffff813fc09e>]
> sha512_mb_mgr_get_comp_job_avx2+0x6e/0xee
> arch/x86/crypto/sha512-mb/sha512_mb_mgr_flush_avx2.S:251
> RSP: 0018:ffff8801ce84eef8  EFLAGS: 00010202
> RAX: 0000000000000000 RBX: ffff8801d7c82950 RCX: 0000000000000006
> RDX: 0000000000000001 RSI: ffff8801cf472f28 RDI: ffff8801d7c82800
> RBP: ffff8801ce84f258 R08: 0000000100000000 R09: 0000000000000001
> R10: 0000000000000002 R11: 0000000000000003 R12: ffff8801ce84f230
> R13: ffff8801c970e760 R14: ffff8801c970e7b0 R15: ffff8801ce84ef70
> FS:  0000000000000000(0000) GS:ffff8801dc100000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000060 CR3: 00000001ca654000 CR4: 00000000001406e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Stack:
>  ffff8801d7c82800 ffffffff813fa207 dffffc0000000000 ffffe8ffffd0f238
>  0000000000000002 1ffff10039d09dea ffffe8ffffd0f218 ffff8801ce84f190
>  0000000000000282 ffffe8ffffd0f140 ffffe8ffffd0f220 0000000041b58ab3
> Call Trace:
>  [<ffffffff813fb407>] sha512_mb_update+0x2f7/0x4e0
> arch/x86/crypto/sha512-mb/sha512_mb.c:588
>  [<ffffffff8219cdad>] crypto_ahash_update include/crypto/hash.h:512 [inline]
>  [<ffffffff8219cdad>] ahash_mcryptd_update crypto/mcryptd.c:627 [inline]
>  [<ffffffff8219cdad>] mcryptd_hash_update+0xcd/0x1c0 crypto/mcryptd.c:373
>  [<ffffffff8219c29f>] mcryptd_queue_worker+0xff/0x6a0 crypto/mcryptd.c:181
>  [<ffffffff81492960>] process_one_work+0xbd0/0x1c10 kernel/workqueue.c:2096
>  [<ffffffff81493bc3>] worker_thread+0x223/0x1990 kernel/workqueue.c:2230
>  [<ffffffff814abb33>] kthread+0x323/0x3e0 kernel/kthread.c:209
>  [<ffffffff8436fbea>] ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:433
> Code: 49 0f 42 d3 48 f7 c2 f0 ff ff ff 0f 85 9a 00 00 00 48 83 e2 0f
> 48 6b da 08 48 8d 9c 1f 48 01 00 00 48 8b 03 48 c7 03 00 00 00 00 <c7>
> 40 60 02 00 00 00 48 8b 9f 40 01 00 00 48 c1 e3 08 48 09 d3
> RIP  [<ffffffff813fc09e>] sha512_mb_mgr_get_comp_job_avx2+0x6e/0xee
> arch/x86/crypto/sha512-mb/sha512_mb_mgr_flush_avx2.S:251
>  RSP <ffff8801ce84eef8>
> CR2: 0000000000000060
> ---[ end trace 3af8184eabd21203 ]---

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: crypto: NULL deref in sha512_mb_mgr_get_comp_job_avx2
  2017-01-31 13:16 crypto: NULL deref in sha512_mb_mgr_get_comp_job_avx2 Dmitry Vyukov
  2017-01-31 18:51 ` Tim Chen
@ 2017-02-01 18:45 ` Tim Chen
  2017-02-02 10:58   ` Dmitry Vyukov
  2017-02-11 10:50   ` Herbert Xu
  1 sibling, 2 replies; 8+ messages in thread
From: Tim Chen @ 2017-02-01 18:45 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Herbert Xu, David Miller, linux-crypto, LKML, megha.dey,
	fenghua.yu, syzkaller

[-- Attachment #1: Type: text/plain, Size: 11690 bytes --]

On Tue, Jan 31, 2017 at 02:16:31PM +0100, Dmitry Vyukov wrote:
> Hello,
> 
> I am getting the following reports with low frequency while running
> syzkaller fuzzer. Unfortunately they are not reproducible and happen
> in a background thread, so it is difficult to extract any context on
> my side. I see only few such crashes per week, so most likely it is
> some hard to trigger data race. The following reports are from mmotm
> tree, commits 00e20cfc2bf04a0cbe1f5405f61c8426f43eee84 and
> fff7e71eac7788904753136f09bcad7471f7799e. Any ideas as to how this can
> happen?
> 
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000060
> IP: [<ffffffff813fc09e>] sha512_mb_mgr_get_comp_job_avx2+0x6e/0xee
> arch/x86/crypto/sha512-mb/sha512_mb_mgr_flush_avx2.S:251
> PGD 1d2395067 [  220.874864] PUD 1d2860067
> Oops: 0002 [#1] SMP KASAN
> Dumping ftrace buffer:
>    (ftrace buffer empty)
> Modules linked in:
> CPU: 0 PID: 516 Comm: kworker/0:1 Not tainted 4.9.0 #4
> Hardware name: Google Google Compute Engine/Google Compute Engine,
> BIOS Google 01/01/2011
> Workqueue: crypto mcryptd_queue_worker
> task: ffff8801d9f346c0 task.stack: ffff8801d9f08000
> RIP: 0010:[<ffffffff813fc09e>]  [<ffffffff813fc09e>]
> sha512_mb_mgr_get_comp_job_avx2+0x6e/0xee
> arch/x86/crypto/sha512-mb/sha512_mb_mgr_flush_avx2.S:251
> RSP: 0018:ffff8801d9f0eef8  EFLAGS: 00010202
> RAX: 0000000000000000 RBX: ffff8801d7db1190 RCX: 0000000000000006
> RDX: 0000000000000001 RSI: ffff8801d9f34ee8 RDI: ffff8801d7db1040
> RBP: ffff8801d9f0f258 R08: 0000000100000000 R09: 0000000000000001
> R10: 0000000000000002 R11: 0000000000000003 R12: ffff8801d9f0f230
> R13: ffff8801c8bbc4e0 R14: ffff8801c8bbc530 R15: ffff8801d9f0ef70
> FS:  0000000000000000(0000) GS:ffff8801dc000000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000060 CR3: 00000001cc15a000 CR4: 00000000001406f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Stack:
>  ffff8801d7db1040 ffffffff813fa207 dffffc0000000000 ffffe8ffffc0f238
>  0000000000000002 1ffff1003b3e1dea ffffe8ffffc0f218 ffff8801d9f0f190
>  0000000000000282 ffffe8ffffc0f140 ffffe8ffffc0f220 0000000041b58ab3
> Call Trace:
>  [<ffffffff813fb407>] sha512_mb_update+0x2f7/0x4e0
> arch/x86/crypto/sha512-mb/sha512_mb.c:588
>  [<ffffffff8219d4ad>] crypto_ahash_update include/crypto/hash.h:512 [inline]
>  [<ffffffff8219d4ad>] ahash_mcryptd_update crypto/mcryptd.c:627 [inline]
>  [<ffffffff8219d4ad>] mcryptd_hash_update+0xcd/0x1c0 crypto/mcryptd.c:373
>  [<ffffffff8219c99f>] mcryptd_queue_worker+0xff/0x6a0 crypto/mcryptd.c:181
>  [<ffffffff81492960>] process_one_work+0xbd0/0x1c10 kernel/workqueue.c:2096
>  [<ffffffff81493bc3>] worker_thread+0x223/0x1990 kernel/workqueue.c:2230
>  [<ffffffff814abb33>] kthread+0x323/0x3e0 kernel/kthread.c:209
>  [<ffffffff8436fbaa>] ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:433
> Code: 49 0f 42 d3 48 f7 c2 f0 ff ff ff 0f 85 9a 00 00 00 48 83 e2 0f
> 48 6b da 08 48 8d 9c 1f 48 01 00 00 48 8b 03 48 c7 03 00 00 00 00 <c7>
> 40 60 02 00 00 00 48 8b 9f 40 01 00 00 48 c1 e3 08 48 09 d3
> RIP  [<ffffffff813fc09e>] sha512_mb_mgr_get_comp_job_avx2+0x6e/0xee
> arch/x86/crypto/sha512-mb/sha512_mb_mgr_flush_avx2.S:251
>  RSP <ffff8801d9f0eef8>
> CR2: 0000000000000060
> ---[ end trace 139fd4cda5dfe2c4 ]---
> 

Dmitry,

One theory that Mehga and I have is that perhaps the flusher
and regular computaion updates are stepping on each other. 
Can you try this patch and see if it helps?

Tim

--->8---

From: Tim Chen <tim.c.chen@linux.intel.com>
Subject: [PATCH] crypto/sha512-mb: Protect sha512 mb ctx mgr access
To: Herbert Xu <herbert@gondor.apana.org.au>, Dmitry Vyukov <dvyukov@google.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>, David Miller <davem@davemloft.net>, linux-crypto@vger.kernel.org, LKML <linux-kernel@vger.kernel.org>, megha.dey@linux.intel.com, fenghua.yu@intel.com

The flusher and regular multi-buffer computation via mcryptd may race with another.
Add here a lock and turn off interrupt to to access multi-buffer
computation state cstate->mgr before a round of computation. This should
prevent the flusher code jumping in.

Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
---
 arch/x86/crypto/sha512-mb/sha512_mb.c | 64 +++++++++++++++++++++++------------
 1 file changed, 42 insertions(+), 22 deletions(-)

diff --git a/arch/x86/crypto/sha512-mb/sha512_mb.c b/arch/x86/crypto/sha512-mb/sha512_mb.c
index d210174..f3c1c21 100644
--- a/arch/x86/crypto/sha512-mb/sha512_mb.c
+++ b/arch/x86/crypto/sha512-mb/sha512_mb.c
@@ -221,7 +221,7 @@ static struct sha512_hash_ctx *sha512_ctx_mgr_resubmit
 }
 
 static struct sha512_hash_ctx
-		*sha512_ctx_mgr_get_comp_ctx(struct sha512_ctx_mgr *mgr)
+		*sha512_ctx_mgr_get_comp_ctx(struct mcryptd_alg_cstate *cstate)
 {
 	/*
 	 * If get_comp_job returns NULL, there are no jobs complete.
@@ -233,11 +233,17 @@ static struct sha512_hash_ctx
 	 * Otherwise, all jobs currently being managed by the hash_ctx_mgr
 	 * still need processing.
 	 */
+	struct sha512_ctx_mgr *mgr;
 	struct sha512_hash_ctx *ctx;
+	unsigned long flags;
 
+	mgr = cstate->mgr;
+	spin_lock_irqsave(&cstate->work_lock, flags);
 	ctx = (struct sha512_hash_ctx *)
 				sha512_job_mgr_get_comp_job(&mgr->mgr);
-	return sha512_ctx_mgr_resubmit(mgr, ctx);
+	ctx = sha512_ctx_mgr_resubmit(mgr, ctx);
+	spin_unlock_irqrestore(&cstate->work_lock, flags);
+	return ctx;
 }
 
 static void sha512_ctx_mgr_init(struct sha512_ctx_mgr *mgr)
@@ -246,12 +252,17 @@ static void sha512_ctx_mgr_init(struct sha512_ctx_mgr *mgr)
 }
 
 static struct sha512_hash_ctx
-			*sha512_ctx_mgr_submit(struct sha512_ctx_mgr *mgr,
+			*sha512_ctx_mgr_submit(struct mcryptd_alg_cstate *cstate,
 					  struct sha512_hash_ctx *ctx,
 					  const void *buffer,
 					  uint32_t len,
 					  int flags)
 {
+	struct sha512_ctx_mgr *mgr;
+	unsigned long irqflags;
+
+	mgr = cstate->mgr;
+	spin_lock_irqsave(&cstate->work_lock, irqflags);
 	if (flags & (~HASH_ENTIRE)) {
 		/*
 		 * User should not pass anything other than FIRST, UPDATE, or
@@ -351,20 +362,26 @@ static struct sha512_hash_ctx
 		}
 	}
 
-	return sha512_ctx_mgr_resubmit(mgr, ctx);
+	ctx = sha512_ctx_mgr_resubmit(mgr, ctx);
+	spin_unlock_irqrestore(&cstate->work_lock, irqflags);
+	return ctx;
 }
 
-static struct sha512_hash_ctx *sha512_ctx_mgr_flush(struct sha512_ctx_mgr *mgr)
+static struct sha512_hash_ctx *sha512_ctx_mgr_flush(struct mcryptd_alg_cstate *cstate)
 {
+	struct sha512_ctx_mgr *mgr;
 	struct sha512_hash_ctx *ctx;
+	unsigned long flags;
 
+	mgr = cstate->mgr;
+	spin_lock_irqsave(&cstate->work_lock, flags);
 	while (1) {
 		ctx = (struct sha512_hash_ctx *)
 					sha512_job_mgr_flush(&mgr->mgr);
 
 		/* If flush returned 0, there are no more jobs in flight. */
 		if (!ctx)
-			return NULL;
+			break;
 
 		/*
 		 * If flush returned a job, resubmit the job to finish
@@ -378,8 +395,10 @@ static struct sha512_hash_ctx *sha512_ctx_mgr_flush(struct sha512_ctx_mgr *mgr)
 		 * the sha512_ctx_mgr still need processing. Loop.
 		 */
 		if (ctx)
-			return ctx;
+			break;
 	}
+	spin_unlock_irqrestore(&cstate->work_lock, flags);
+	return ctx;
 }
 
 static int sha512_mb_init(struct ahash_request *areq)
@@ -439,11 +458,11 @@ static int sha_finish_walk(struct mcryptd_hash_request_ctx **ret_rctx,
 		sha_ctx = (struct sha512_hash_ctx *)
 						ahash_request_ctx(&rctx->areq);
 		kernel_fpu_begin();
-		sha_ctx = sha512_ctx_mgr_submit(cstate->mgr, sha_ctx,
+		sha_ctx = sha512_ctx_mgr_submit(cstate, sha_ctx,
 						rctx->walk.data, nbytes, flag);
 		if (!sha_ctx) {
 			if (flush)
-				sha_ctx = sha512_ctx_mgr_flush(cstate->mgr);
+				sha_ctx = sha512_ctx_mgr_flush(cstate);
 		}
 		kernel_fpu_end();
 		if (sha_ctx)
@@ -471,11 +490,12 @@ static int sha_complete_job(struct mcryptd_hash_request_ctx *rctx,
 	struct sha512_hash_ctx *sha_ctx;
 	struct mcryptd_hash_request_ctx *req_ctx;
 	int ret;
+	unsigned long flags;
 
 	/* remove from work list */
-	spin_lock(&cstate->work_lock);
+	spin_lock_irqsave(&cstate->work_lock, flags);
 	list_del(&rctx->waiter);
-	spin_unlock(&cstate->work_lock);
+	spin_unlock_irqrestore(&cstate->work_lock, flags);
 
 	if (irqs_disabled())
 		rctx->complete(&req->base, err);
@@ -486,14 +506,14 @@ static int sha_complete_job(struct mcryptd_hash_request_ctx *rctx,
 	}
 
 	/* check to see if there are other jobs that are done */
-	sha_ctx = sha512_ctx_mgr_get_comp_ctx(cstate->mgr);
+	sha_ctx = sha512_ctx_mgr_get_comp_ctx(cstate);
 	while (sha_ctx) {
 		req_ctx = cast_hash_to_mcryptd_ctx(sha_ctx);
 		ret = sha_finish_walk(&req_ctx, cstate, false);
 		if (req_ctx) {
-			spin_lock(&cstate->work_lock);
+			spin_lock_irqsave(&cstate->work_lock, flags);
 			list_del(&req_ctx->waiter);
-			spin_unlock(&cstate->work_lock);
+			spin_unlock_irqrestore(&cstate->work_lock, flags);
 
 			req = cast_mcryptd_ctx_to_req(req_ctx);
 			if (irqs_disabled())
@@ -504,7 +524,7 @@ static int sha_complete_job(struct mcryptd_hash_request_ctx *rctx,
 				local_bh_enable();
 			}
 		}
-		sha_ctx = sha512_ctx_mgr_get_comp_ctx(cstate->mgr);
+		sha_ctx = sha512_ctx_mgr_get_comp_ctx(cstate);
 	}
 
 	return 0;
@@ -515,6 +535,7 @@ static void sha512_mb_add_list(struct mcryptd_hash_request_ctx *rctx,
 {
 	unsigned long next_flush;
 	unsigned long delay = usecs_to_jiffies(FLUSH_INTERVAL);
+	unsigned long flags;
 
 	/* initialize tag */
 	rctx->tag.arrival = jiffies;    /* tag the arrival time */
@@ -522,9 +543,9 @@ static void sha512_mb_add_list(struct mcryptd_hash_request_ctx *rctx,
 	next_flush = rctx->tag.arrival + delay;
 	rctx->tag.expire = next_flush;
 
-	spin_lock(&cstate->work_lock);
+	spin_lock_irqsave(&cstate->work_lock, flags);
 	list_add_tail(&rctx->waiter, &cstate->work_list);
-	spin_unlock(&cstate->work_lock);
+	spin_unlock_irqrestore(&cstate->work_lock, flags);
 
 	mcryptd_arm_flusher(cstate, delay);
 }
@@ -565,7 +586,7 @@ static int sha512_mb_update(struct ahash_request *areq)
 	sha_ctx = (struct sha512_hash_ctx *) ahash_request_ctx(areq);
 	sha512_mb_add_list(rctx, cstate);
 	kernel_fpu_begin();
-	sha_ctx = sha512_ctx_mgr_submit(cstate->mgr, sha_ctx, rctx->walk.data,
+	sha_ctx = sha512_ctx_mgr_submit(cstate, sha_ctx, rctx->walk.data,
 							nbytes, HASH_UPDATE);
 	kernel_fpu_end();
 
@@ -628,7 +649,7 @@ static int sha512_mb_finup(struct ahash_request *areq)
 	sha512_mb_add_list(rctx, cstate);
 
 	kernel_fpu_begin();
-	sha_ctx = sha512_ctx_mgr_submit(cstate->mgr, sha_ctx, rctx->walk.data,
+	sha_ctx = sha512_ctx_mgr_submit(cstate, sha_ctx, rctx->walk.data,
 								nbytes, flag);
 	kernel_fpu_end();
 
@@ -677,8 +698,7 @@ static int sha512_mb_final(struct ahash_request *areq)
 	/* flag HASH_FINAL and 0 data size */
 	sha512_mb_add_list(rctx, cstate);
 	kernel_fpu_begin();
-	sha_ctx = sha512_ctx_mgr_submit(cstate->mgr, sha_ctx, &data, 0,
-								HASH_LAST);
+	sha_ctx = sha512_ctx_mgr_submit(cstate, sha_ctx, &data, 0, HASH_LAST);
 	kernel_fpu_end();
 
 	/* check if anything is returned */
@@ -940,7 +960,7 @@ static unsigned long sha512_mb_flusher(struct mcryptd_alg_cstate *cstate)
 			break;
 		kernel_fpu_begin();
 		sha_ctx = (struct sha512_hash_ctx *)
-					sha512_ctx_mgr_flush(cstate->mgr);
+					sha512_ctx_mgr_flush(cstate);
 		kernel_fpu_end();
 		if (!sha_ctx) {
 			pr_err("sha512_mb error: nothing got flushed for"
-- 
2.5.5


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 801 bytes --]

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: crypto: NULL deref in sha512_mb_mgr_get_comp_job_avx2
  2017-02-01 18:45 ` Tim Chen
@ 2017-02-02 10:58   ` Dmitry Vyukov
  2017-02-02 16:00     ` Tim Chen
  2017-02-11 10:50   ` Herbert Xu
  1 sibling, 1 reply; 8+ messages in thread
From: Dmitry Vyukov @ 2017-02-02 10:58 UTC (permalink / raw)
  To: tim.c.chen
  Cc: Herbert Xu, David Miller, linux-crypto, LKML, megha.dey,
	fenghua.yu, syzkaller

On Wed, Feb 1, 2017 at 7:45 PM, Tim Chen <tim.c.chen@linux.intel.com> wrote:
> On Tue, Jan 31, 2017 at 02:16:31PM +0100, Dmitry Vyukov wrote:
>> Hello,
>>
>> I am getting the following reports with low frequency while running
>> syzkaller fuzzer. Unfortunately they are not reproducible and happen
>> in a background thread, so it is difficult to extract any context on
>> my side. I see only few such crashes per week, so most likely it is
>> some hard to trigger data race. The following reports are from mmotm
>> tree, commits 00e20cfc2bf04a0cbe1f5405f61c8426f43eee84 and
>> fff7e71eac7788904753136f09bcad7471f7799e. Any ideas as to how this can
>> happen?
>>
>> BUG: unable to handle kernel NULL pointer dereference at 0000000000000060
>> IP: [<ffffffff813fc09e>] sha512_mb_mgr_get_comp_job_avx2+0x6e/0xee
>> arch/x86/crypto/sha512-mb/sha512_mb_mgr_flush_avx2.S:251
>> PGD 1d2395067 [  220.874864] PUD 1d2860067
>> Oops: 0002 [#1] SMP KASAN
>> Dumping ftrace buffer:
>>    (ftrace buffer empty)
>> Modules linked in:
>> CPU: 0 PID: 516 Comm: kworker/0:1 Not tainted 4.9.0 #4
>> Hardware name: Google Google Compute Engine/Google Compute Engine,
>> BIOS Google 01/01/2011
>> Workqueue: crypto mcryptd_queue_worker
>> task: ffff8801d9f346c0 task.stack: ffff8801d9f08000
>> RIP: 0010:[<ffffffff813fc09e>]  [<ffffffff813fc09e>]
>> sha512_mb_mgr_get_comp_job_avx2+0x6e/0xee
>> arch/x86/crypto/sha512-mb/sha512_mb_mgr_flush_avx2.S:251
>> RSP: 0018:ffff8801d9f0eef8  EFLAGS: 00010202
>> RAX: 0000000000000000 RBX: ffff8801d7db1190 RCX: 0000000000000006
>> RDX: 0000000000000001 RSI: ffff8801d9f34ee8 RDI: ffff8801d7db1040
>> RBP: ffff8801d9f0f258 R08: 0000000100000000 R09: 0000000000000001
>> R10: 0000000000000002 R11: 0000000000000003 R12: ffff8801d9f0f230
>> R13: ffff8801c8bbc4e0 R14: ffff8801c8bbc530 R15: ffff8801d9f0ef70
>> FS:  0000000000000000(0000) GS:ffff8801dc000000(0000) knlGS:0000000000000000
>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 0000000000000060 CR3: 00000001cc15a000 CR4: 00000000001406f0
>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>> Stack:
>>  ffff8801d7db1040 ffffffff813fa207 dffffc0000000000 ffffe8ffffc0f238
>>  0000000000000002 1ffff1003b3e1dea ffffe8ffffc0f218 ffff8801d9f0f190
>>  0000000000000282 ffffe8ffffc0f140 ffffe8ffffc0f220 0000000041b58ab3
>> Call Trace:
>>  [<ffffffff813fb407>] sha512_mb_update+0x2f7/0x4e0
>> arch/x86/crypto/sha512-mb/sha512_mb.c:588
>>  [<ffffffff8219d4ad>] crypto_ahash_update include/crypto/hash.h:512 [inline]
>>  [<ffffffff8219d4ad>] ahash_mcryptd_update crypto/mcryptd.c:627 [inline]
>>  [<ffffffff8219d4ad>] mcryptd_hash_update+0xcd/0x1c0 crypto/mcryptd.c:373
>>  [<ffffffff8219c99f>] mcryptd_queue_worker+0xff/0x6a0 crypto/mcryptd.c:181
>>  [<ffffffff81492960>] process_one_work+0xbd0/0x1c10 kernel/workqueue.c:2096
>>  [<ffffffff81493bc3>] worker_thread+0x223/0x1990 kernel/workqueue.c:2230
>>  [<ffffffff814abb33>] kthread+0x323/0x3e0 kernel/kthread.c:209
>>  [<ffffffff8436fbaa>] ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:433
>> Code: 49 0f 42 d3 48 f7 c2 f0 ff ff ff 0f 85 9a 00 00 00 48 83 e2 0f
>> 48 6b da 08 48 8d 9c 1f 48 01 00 00 48 8b 03 48 c7 03 00 00 00 00 <c7>
>> 40 60 02 00 00 00 48 8b 9f 40 01 00 00 48 c1 e3 08 48 09 d3
>> RIP  [<ffffffff813fc09e>] sha512_mb_mgr_get_comp_job_avx2+0x6e/0xee
>> arch/x86/crypto/sha512-mb/sha512_mb_mgr_flush_avx2.S:251
>>  RSP <ffff8801d9f0eef8>
>> CR2: 0000000000000060
>> ---[ end trace 139fd4cda5dfe2c4 ]---
>>
>
> Dmitry,
>
> One theory that Mehga and I have is that perhaps the flusher
> and regular computaion updates are stepping on each other.
> Can you try this patch and see if it helps?


No, for this one I can't. Sorry.
It happens with very low frequency and only one fuzzer that tests
mmotm tree. If/when this is committed, I can keep an eye on these
reports and notify if I still see them.
If you have a hypothesis as to how it happens, perhaps you could write
a test that provokes the crash and maybe add some sleeps to kernel
code or alter timeouts to increase probability.



> --->8---
>
> From: Tim Chen <tim.c.chen@linux.intel.com>
> Subject: [PATCH] crypto/sha512-mb: Protect sha512 mb ctx mgr access
> To: Herbert Xu <herbert@gondor.apana.org.au>, Dmitry Vyukov <dvyukov@google.com>
> Cc: Tim Chen <tim.c.chen@linux.intel.com>, David Miller <davem@davemloft.net>, linux-crypto@vger.kernel.org, LKML <linux-kernel@vger.kernel.org>, megha.dey@linux.intel.com, fenghua.yu@intel.com
>
> The flusher and regular multi-buffer computation via mcryptd may race with another.
> Add here a lock and turn off interrupt to to access multi-buffer
> computation state cstate->mgr before a round of computation. This should
> prevent the flusher code jumping in.
>
> Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
> ---
>  arch/x86/crypto/sha512-mb/sha512_mb.c | 64 +++++++++++++++++++++++------------
>  1 file changed, 42 insertions(+), 22 deletions(-)
>
> diff --git a/arch/x86/crypto/sha512-mb/sha512_mb.c b/arch/x86/crypto/sha512-mb/sha512_mb.c
> index d210174..f3c1c21 100644
> --- a/arch/x86/crypto/sha512-mb/sha512_mb.c
> +++ b/arch/x86/crypto/sha512-mb/sha512_mb.c
> @@ -221,7 +221,7 @@ static struct sha512_hash_ctx *sha512_ctx_mgr_resubmit
>  }
>
>  static struct sha512_hash_ctx
> -               *sha512_ctx_mgr_get_comp_ctx(struct sha512_ctx_mgr *mgr)
> +               *sha512_ctx_mgr_get_comp_ctx(struct mcryptd_alg_cstate *cstate)
>  {
>         /*
>          * If get_comp_job returns NULL, there are no jobs complete.
> @@ -233,11 +233,17 @@ static struct sha512_hash_ctx
>          * Otherwise, all jobs currently being managed by the hash_ctx_mgr
>          * still need processing.
>          */
> +       struct sha512_ctx_mgr *mgr;
>         struct sha512_hash_ctx *ctx;
> +       unsigned long flags;
>
> +       mgr = cstate->mgr;
> +       spin_lock_irqsave(&cstate->work_lock, flags);
>         ctx = (struct sha512_hash_ctx *)
>                                 sha512_job_mgr_get_comp_job(&mgr->mgr);
> -       return sha512_ctx_mgr_resubmit(mgr, ctx);
> +       ctx = sha512_ctx_mgr_resubmit(mgr, ctx);
> +       spin_unlock_irqrestore(&cstate->work_lock, flags);
> +       return ctx;
>  }
>
>  static void sha512_ctx_mgr_init(struct sha512_ctx_mgr *mgr)
> @@ -246,12 +252,17 @@ static void sha512_ctx_mgr_init(struct sha512_ctx_mgr *mgr)
>  }
>
>  static struct sha512_hash_ctx
> -                       *sha512_ctx_mgr_submit(struct sha512_ctx_mgr *mgr,
> +                       *sha512_ctx_mgr_submit(struct mcryptd_alg_cstate *cstate,
>                                           struct sha512_hash_ctx *ctx,
>                                           const void *buffer,
>                                           uint32_t len,
>                                           int flags)
>  {
> +       struct sha512_ctx_mgr *mgr;
> +       unsigned long irqflags;
> +
> +       mgr = cstate->mgr;
> +       spin_lock_irqsave(&cstate->work_lock, irqflags);
>         if (flags & (~HASH_ENTIRE)) {
>                 /*
>                  * User should not pass anything other than FIRST, UPDATE, or
> @@ -351,20 +362,26 @@ static struct sha512_hash_ctx
>                 }
>         }
>
> -       return sha512_ctx_mgr_resubmit(mgr, ctx);
> +       ctx = sha512_ctx_mgr_resubmit(mgr, ctx);
> +       spin_unlock_irqrestore(&cstate->work_lock, irqflags);
> +       return ctx;
>  }
>
> -static struct sha512_hash_ctx *sha512_ctx_mgr_flush(struct sha512_ctx_mgr *mgr)
> +static struct sha512_hash_ctx *sha512_ctx_mgr_flush(struct mcryptd_alg_cstate *cstate)
>  {
> +       struct sha512_ctx_mgr *mgr;
>         struct sha512_hash_ctx *ctx;
> +       unsigned long flags;
>
> +       mgr = cstate->mgr;
> +       spin_lock_irqsave(&cstate->work_lock, flags);
>         while (1) {
>                 ctx = (struct sha512_hash_ctx *)
>                                         sha512_job_mgr_flush(&mgr->mgr);
>
>                 /* If flush returned 0, there are no more jobs in flight. */
>                 if (!ctx)
> -                       return NULL;
> +                       break;
>
>                 /*
>                  * If flush returned a job, resubmit the job to finish
> @@ -378,8 +395,10 @@ static struct sha512_hash_ctx *sha512_ctx_mgr_flush(struct sha512_ctx_mgr *mgr)
>                  * the sha512_ctx_mgr still need processing. Loop.
>                  */
>                 if (ctx)
> -                       return ctx;
> +                       break;
>         }
> +       spin_unlock_irqrestore(&cstate->work_lock, flags);
> +       return ctx;
>  }
>
>  static int sha512_mb_init(struct ahash_request *areq)
> @@ -439,11 +458,11 @@ static int sha_finish_walk(struct mcryptd_hash_request_ctx **ret_rctx,
>                 sha_ctx = (struct sha512_hash_ctx *)
>                                                 ahash_request_ctx(&rctx->areq);
>                 kernel_fpu_begin();
> -               sha_ctx = sha512_ctx_mgr_submit(cstate->mgr, sha_ctx,
> +               sha_ctx = sha512_ctx_mgr_submit(cstate, sha_ctx,
>                                                 rctx->walk.data, nbytes, flag);
>                 if (!sha_ctx) {
>                         if (flush)
> -                               sha_ctx = sha512_ctx_mgr_flush(cstate->mgr);
> +                               sha_ctx = sha512_ctx_mgr_flush(cstate);
>                 }
>                 kernel_fpu_end();
>                 if (sha_ctx)
> @@ -471,11 +490,12 @@ static int sha_complete_job(struct mcryptd_hash_request_ctx *rctx,
>         struct sha512_hash_ctx *sha_ctx;
>         struct mcryptd_hash_request_ctx *req_ctx;
>         int ret;
> +       unsigned long flags;
>
>         /* remove from work list */
> -       spin_lock(&cstate->work_lock);
> +       spin_lock_irqsave(&cstate->work_lock, flags);
>         list_del(&rctx->waiter);
> -       spin_unlock(&cstate->work_lock);
> +       spin_unlock_irqrestore(&cstate->work_lock, flags);
>
>         if (irqs_disabled())
>                 rctx->complete(&req->base, err);
> @@ -486,14 +506,14 @@ static int sha_complete_job(struct mcryptd_hash_request_ctx *rctx,
>         }
>
>         /* check to see if there are other jobs that are done */
> -       sha_ctx = sha512_ctx_mgr_get_comp_ctx(cstate->mgr);
> +       sha_ctx = sha512_ctx_mgr_get_comp_ctx(cstate);
>         while (sha_ctx) {
>                 req_ctx = cast_hash_to_mcryptd_ctx(sha_ctx);
>                 ret = sha_finish_walk(&req_ctx, cstate, false);
>                 if (req_ctx) {
> -                       spin_lock(&cstate->work_lock);
> +                       spin_lock_irqsave(&cstate->work_lock, flags);
>                         list_del(&req_ctx->waiter);
> -                       spin_unlock(&cstate->work_lock);
> +                       spin_unlock_irqrestore(&cstate->work_lock, flags);
>
>                         req = cast_mcryptd_ctx_to_req(req_ctx);
>                         if (irqs_disabled())
> @@ -504,7 +524,7 @@ static int sha_complete_job(struct mcryptd_hash_request_ctx *rctx,
>                                 local_bh_enable();
>                         }
>                 }
> -               sha_ctx = sha512_ctx_mgr_get_comp_ctx(cstate->mgr);
> +               sha_ctx = sha512_ctx_mgr_get_comp_ctx(cstate);
>         }
>
>         return 0;
> @@ -515,6 +535,7 @@ static void sha512_mb_add_list(struct mcryptd_hash_request_ctx *rctx,
>  {
>         unsigned long next_flush;
>         unsigned long delay = usecs_to_jiffies(FLUSH_INTERVAL);
> +       unsigned long flags;
>
>         /* initialize tag */
>         rctx->tag.arrival = jiffies;    /* tag the arrival time */
> @@ -522,9 +543,9 @@ static void sha512_mb_add_list(struct mcryptd_hash_request_ctx *rctx,
>         next_flush = rctx->tag.arrival + delay;
>         rctx->tag.expire = next_flush;
>
> -       spin_lock(&cstate->work_lock);
> +       spin_lock_irqsave(&cstate->work_lock, flags);
>         list_add_tail(&rctx->waiter, &cstate->work_list);
> -       spin_unlock(&cstate->work_lock);
> +       spin_unlock_irqrestore(&cstate->work_lock, flags);
>
>         mcryptd_arm_flusher(cstate, delay);
>  }
> @@ -565,7 +586,7 @@ static int sha512_mb_update(struct ahash_request *areq)
>         sha_ctx = (struct sha512_hash_ctx *) ahash_request_ctx(areq);
>         sha512_mb_add_list(rctx, cstate);
>         kernel_fpu_begin();
> -       sha_ctx = sha512_ctx_mgr_submit(cstate->mgr, sha_ctx, rctx->walk.data,
> +       sha_ctx = sha512_ctx_mgr_submit(cstate, sha_ctx, rctx->walk.data,
>                                                         nbytes, HASH_UPDATE);
>         kernel_fpu_end();
>
> @@ -628,7 +649,7 @@ static int sha512_mb_finup(struct ahash_request *areq)
>         sha512_mb_add_list(rctx, cstate);
>
>         kernel_fpu_begin();
> -       sha_ctx = sha512_ctx_mgr_submit(cstate->mgr, sha_ctx, rctx->walk.data,
> +       sha_ctx = sha512_ctx_mgr_submit(cstate, sha_ctx, rctx->walk.data,
>                                                                 nbytes, flag);
>         kernel_fpu_end();
>
> @@ -677,8 +698,7 @@ static int sha512_mb_final(struct ahash_request *areq)
>         /* flag HASH_FINAL and 0 data size */
>         sha512_mb_add_list(rctx, cstate);
>         kernel_fpu_begin();
> -       sha_ctx = sha512_ctx_mgr_submit(cstate->mgr, sha_ctx, &data, 0,
> -                                                               HASH_LAST);
> +       sha_ctx = sha512_ctx_mgr_submit(cstate, sha_ctx, &data, 0, HASH_LAST);
>         kernel_fpu_end();
>
>         /* check if anything is returned */
> @@ -940,7 +960,7 @@ static unsigned long sha512_mb_flusher(struct mcryptd_alg_cstate *cstate)
>                         break;
>                 kernel_fpu_begin();
>                 sha_ctx = (struct sha512_hash_ctx *)
> -                                       sha512_ctx_mgr_flush(cstate->mgr);
> +                                       sha512_ctx_mgr_flush(cstate);
>                 kernel_fpu_end();
>                 if (!sha_ctx) {
>                         pr_err("sha512_mb error: nothing got flushed for"
> --
> 2.5.5
>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: crypto: NULL deref in sha512_mb_mgr_get_comp_job_avx2
  2017-02-02 10:58   ` Dmitry Vyukov
@ 2017-02-02 16:00     ` Tim Chen
  0 siblings, 0 replies; 8+ messages in thread
From: Tim Chen @ 2017-02-02 16:00 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Herbert Xu, David Miller, linux-crypto, LKML, megha.dey,
	fenghua.yu, syzkaller

On Thu, 2017-02-02 at 11:58 +0100, Dmitry Vyukov wrote:
> On Wed, Feb 1, 2017 at 7:45 PM, Tim Chen <tim.c.chen@linux.intel.com> wrote:
> > 
> > On Tue, Jan 31, 2017 at 02:16:31PM +0100, Dmitry Vyukov wrote:
> > > 
> > > Hello,
> > > 
> > > I am getting the following reports with low frequency while running
> > > syzkaller fuzzer. Unfortunately they are not reproducible and happen
> > > in a background thread, so it is difficult to extract any context on
> > > my side. I see only few such crashes per week, so most likely it is
> > > some hard to trigger data race. The following reports are from mmotm
> > > tree, commits 00e20cfc2bf04a0cbe1f5405f61c8426f43eee84 and
> > > fff7e71eac7788904753136f09bcad7471f7799e. Any ideas as to how this can
> > > happen?
> > > 
> > > BUG: unable to handle kernel NULL pointer dereference at 0000000000000060
> > > IP: [<ffffffff813fc09e>] sha512_mb_mgr_get_comp_job_avx2+0x6e/0xee
> > > arch/x86/crypto/sha512-mb/sha512_mb_mgr_flush_avx2.S:251
> > > PGD 1d2395067 [  220.874864] PUD 1d2860067
> > > Oops: 0002 [#1] SMP KASAN
> > > Dumping ftrace buffer:
> > >    (ftrace buffer empty)
> > > Modules linked in:
> > > CPU: 0 PID: 516 Comm: kworker/0:1 Not tainted 4.9.0 #4
> > > Hardware name: Google Google Compute Engine/Google Compute Engine,
> > > BIOS Google 01/01/2011
> > > Workqueue: crypto mcryptd_queue_worker
> > > task: ffff8801d9f346c0 task.stack: ffff8801d9f08000
> > > RIP: 0010:[<ffffffff813fc09e>]  [<ffffffff813fc09e>]
> > > sha512_mb_mgr_get_comp_job_avx2+0x6e/0xee
> > > arch/x86/crypto/sha512-mb/sha512_mb_mgr_flush_avx2.S:251
> > > RSP: 0018:ffff8801d9f0eef8  EFLAGS: 00010202
> > > RAX: 0000000000000000 RBX: ffff8801d7db1190 RCX: 0000000000000006
> > > RDX: 0000000000000001 RSI: ffff8801d9f34ee8 RDI: ffff8801d7db1040
> > > RBP: ffff8801d9f0f258 R08: 0000000100000000 R09: 0000000000000001
> > > R10: 0000000000000002 R11: 0000000000000003 R12: ffff8801d9f0f230
> > > R13: ffff8801c8bbc4e0 R14: ffff8801c8bbc530 R15: ffff8801d9f0ef70
> > > FS:  0000000000000000(0000) GS:ffff8801dc000000(0000) knlGS:0000000000000000
> > > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > CR2: 0000000000000060 CR3: 00000001cc15a000 CR4: 00000000001406f0
> > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > > Stack:
> > >  ffff8801d7db1040 ffffffff813fa207 dffffc0000000000 ffffe8ffffc0f238
> > >  0000000000000002 1ffff1003b3e1dea ffffe8ffffc0f218 ffff8801d9f0f190
> > >  0000000000000282 ffffe8ffffc0f140 ffffe8ffffc0f220 0000000041b58ab3
> > > Call Trace:
> > >  [<ffffffff813fb407>] sha512_mb_update+0x2f7/0x4e0
> > > arch/x86/crypto/sha512-mb/sha512_mb.c:588
> > >  [<ffffffff8219d4ad>] crypto_ahash_update include/crypto/hash.h:512 [inline]
> > >  [<ffffffff8219d4ad>] ahash_mcryptd_update crypto/mcryptd.c:627 [inline]
> > >  [<ffffffff8219d4ad>] mcryptd_hash_update+0xcd/0x1c0 crypto/mcryptd.c:373
> > >  [<ffffffff8219c99f>] mcryptd_queue_worker+0xff/0x6a0 crypto/mcryptd.c:181
> > >  [<ffffffff81492960>] process_one_work+0xbd0/0x1c10 kernel/workqueue.c:2096
> > >  [<ffffffff81493bc3>] worker_thread+0x223/0x1990 kernel/workqueue.c:2230
> > >  [<ffffffff814abb33>] kthread+0x323/0x3e0 kernel/kthread.c:209
> > >  [<ffffffff8436fbaa>] ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:433
> > > Code: 49 0f 42 d3 48 f7 c2 f0 ff ff ff 0f 85 9a 00 00 00 48 83 e2 0f
> > > 48 6b da 08 48 8d 9c 1f 48 01 00 00 48 8b 03 48 c7 03 00 00 00 00 <c7>
> > > 40 60 02 00 00 00 48 8b 9f 40 01 00 00 48 c1 e3 08 48 09 d3
> > > RIP  [<ffffffff813fc09e>] sha512_mb_mgr_get_comp_job_avx2+0x6e/0xee
> > > arch/x86/crypto/sha512-mb/sha512_mb_mgr_flush_avx2.S:251
> > >  RSP <ffff8801d9f0eef8>
> > > CR2: 0000000000000060
> > > ---[ end trace 139fd4cda5dfe2c4 ]---
> > > 
> > Dmitry,
> > 
> > One theory that Mehga and I have is that perhaps the flusher
> > and regular computaion updates are stepping on each other.
> > Can you try this patch and see if it helps?
> 
> No, for this one I can't. Sorry.
> It happens with very low frequency and only one fuzzer that tests
> mmotm tree. If/when this is committed, I can keep an eye on these
> reports and notify if I still see them.
> If you have a hypothesis as to how it happens, perhaps you could write
> a test that provokes the crash and maybe add some sleeps to kernel
> code or alter timeouts to increase probability.
> 

If this patch is merged, it will most likely go into Herbert's crypto-dev
tree and not Andrew's mm tree for testing.  We will try to do the best
on our side for testing to replicate the crash scenario.  

Will it be possible to have one of your fuzzer run the crypto-dev tree 
once the patch got merged there?  

Thanks.

Tim

> 
> 
> > 
> > --->8---
> > 
> > From: Tim Chen <tim.c.chen@linux.intel.com>
> > Subject: [PATCH] crypto/sha512-mb: Protect sha512 mb ctx mgr access
> > To: Herbert Xu <herbert@gondor.apana.org.au>, Dmitry Vyukov <dvyukov@google.com>
> > Cc: Tim Chen <tim.c.chen@linux.intel.com>, David Miller <davem@davemloft.net>, linux-crypto@vger.kernel.org, LKML <linux-kernel@vger.kernel.org>, megha.dey@linux.intel.com, fenghua.yu@intel.com
> > 
> > The flusher and regular multi-buffer computation via mcryptd may race with another.
> > Add here a lock and turn off interrupt to to access multi-buffer
> > computation state cstate->mgr before a round of computation. This should
> > prevent the flusher code jumping in.
> > 
> > Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
> > ---
> >  arch/x86/crypto/sha512-mb/sha512_mb.c | 64 +++++++++++++++++++++++------------
> >  1 file changed, 42 insertions(+), 22 deletions(-)
> > 
> > diff --git a/arch/x86/crypto/sha512-mb/sha512_mb.c b/arch/x86/crypto/sha512-mb/sha512_mb.c
> > index d210174..f3c1c21 100644
> > --- a/arch/x86/crypto/sha512-mb/sha512_mb.c
> > +++ b/arch/x86/crypto/sha512-mb/sha512_mb.c
> > @@ -221,7 +221,7 @@ static struct sha512_hash_ctx *sha512_ctx_mgr_resubmit
> >  }
> > 
> >  static struct sha512_hash_ctx
> > -               *sha512_ctx_mgr_get_comp_ctx(struct sha512_ctx_mgr *mgr)
> > +               *sha512_ctx_mgr_get_comp_ctx(struct mcryptd_alg_cstate *cstate)
> >  {
> >         /*
> >          * If get_comp_job returns NULL, there are no jobs complete.
> > @@ -233,11 +233,17 @@ static struct sha512_hash_ctx
> >          * Otherwise, all jobs currently being managed by the hash_ctx_mgr
> >          * still need processing.
> >          */
> > +       struct sha512_ctx_mgr *mgr;
> >         struct sha512_hash_ctx *ctx;
> > +       unsigned long flags;
> > 
> > +       mgr = cstate->mgr;
> > +       spin_lock_irqsave(&cstate->work_lock, flags);
> >         ctx = (struct sha512_hash_ctx *)
> >                                 sha512_job_mgr_get_comp_job(&mgr->mgr);
> > -       return sha512_ctx_mgr_resubmit(mgr, ctx);
> > +       ctx = sha512_ctx_mgr_resubmit(mgr, ctx);
> > +       spin_unlock_irqrestore(&cstate->work_lock, flags);
> > +       return ctx;
> >  }
> > 
> >  static void sha512_ctx_mgr_init(struct sha512_ctx_mgr *mgr)
> > @@ -246,12 +252,17 @@ static void sha512_ctx_mgr_init(struct sha512_ctx_mgr *mgr)
> >  }
> > 
> >  static struct sha512_hash_ctx
> > -                       *sha512_ctx_mgr_submit(struct sha512_ctx_mgr *mgr,
> > +                       *sha512_ctx_mgr_submit(struct mcryptd_alg_cstate *cstate,
> >                                           struct sha512_hash_ctx *ctx,
> >                                           const void *buffer,
> >                                           uint32_t len,
> >                                           int flags)
> >  {
> > +       struct sha512_ctx_mgr *mgr;
> > +       unsigned long irqflags;
> > +
> > +       mgr = cstate->mgr;
> > +       spin_lock_irqsave(&cstate->work_lock, irqflags);
> >         if (flags & (~HASH_ENTIRE)) {
> >                 /*
> >                  * User should not pass anything other than FIRST, UPDATE, or
> > @@ -351,20 +362,26 @@ static struct sha512_hash_ctx
> >                 }
> >         }
> > 
> > -       return sha512_ctx_mgr_resubmit(mgr, ctx);
> > +       ctx = sha512_ctx_mgr_resubmit(mgr, ctx);
> > +       spin_unlock_irqrestore(&cstate->work_lock, irqflags);
> > +       return ctx;
> >  }
> > 
> > -static struct sha512_hash_ctx *sha512_ctx_mgr_flush(struct sha512_ctx_mgr *mgr)
> > +static struct sha512_hash_ctx *sha512_ctx_mgr_flush(struct mcryptd_alg_cstate *cstate)
> >  {
> > +       struct sha512_ctx_mgr *mgr;
> >         struct sha512_hash_ctx *ctx;
> > +       unsigned long flags;
> > 
> > +       mgr = cstate->mgr;
> > +       spin_lock_irqsave(&cstate->work_lock, flags);
> >         while (1) {
> >                 ctx = (struct sha512_hash_ctx *)
> >                                         sha512_job_mgr_flush(&mgr->mgr);
> > 
> >                 /* If flush returned 0, there are no more jobs in flight. */
> >                 if (!ctx)
> > -                       return NULL;
> > +                       break;
> > 
> >                 /*
> >                  * If flush returned a job, resubmit the job to finish
> > @@ -378,8 +395,10 @@ static struct sha512_hash_ctx *sha512_ctx_mgr_flush(struct sha512_ctx_mgr *mgr)
> >                  * the sha512_ctx_mgr still need processing. Loop.
> >                  */
> >                 if (ctx)
> > -                       return ctx;
> > +                       break;
> >         }
> > +       spin_unlock_irqrestore(&cstate->work_lock, flags);
> > +       return ctx;
> >  }
> > 
> >  static int sha512_mb_init(struct ahash_request *areq)
> > @@ -439,11 +458,11 @@ static int sha_finish_walk(struct mcryptd_hash_request_ctx **ret_rctx,
> >                 sha_ctx = (struct sha512_hash_ctx *)
> >                                                 ahash_request_ctx(&rctx->areq);
> >                 kernel_fpu_begin();
> > -               sha_ctx = sha512_ctx_mgr_submit(cstate->mgr, sha_ctx,
> > +               sha_ctx = sha512_ctx_mgr_submit(cstate, sha_ctx,
> >                                                 rctx->walk.data, nbytes, flag);
> >                 if (!sha_ctx) {
> >                         if (flush)
> > -                               sha_ctx = sha512_ctx_mgr_flush(cstate->mgr);
> > +                               sha_ctx = sha512_ctx_mgr_flush(cstate);
> >                 }
> >                 kernel_fpu_end();
> >                 if (sha_ctx)
> > @@ -471,11 +490,12 @@ static int sha_complete_job(struct mcryptd_hash_request_ctx *rctx,
> >         struct sha512_hash_ctx *sha_ctx;
> >         struct mcryptd_hash_request_ctx *req_ctx;
> >         int ret;
> > +       unsigned long flags;
> > 
> >         /* remove from work list */
> > -       spin_lock(&cstate->work_lock);
> > +       spin_lock_irqsave(&cstate->work_lock, flags);
> >         list_del(&rctx->waiter);
> > -       spin_unlock(&cstate->work_lock);
> > +       spin_unlock_irqrestore(&cstate->work_lock, flags);
> > 
> >         if (irqs_disabled())
> >                 rctx->complete(&req->base, err);
> > @@ -486,14 +506,14 @@ static int sha_complete_job(struct mcryptd_hash_request_ctx *rctx,
> >         }
> > 
> >         /* check to see if there are other jobs that are done */
> > -       sha_ctx = sha512_ctx_mgr_get_comp_ctx(cstate->mgr);
> > +       sha_ctx = sha512_ctx_mgr_get_comp_ctx(cstate);
> >         while (sha_ctx) {
> >                 req_ctx = cast_hash_to_mcryptd_ctx(sha_ctx);
> >                 ret = sha_finish_walk(&req_ctx, cstate, false);
> >                 if (req_ctx) {
> > -                       spin_lock(&cstate->work_lock);
> > +                       spin_lock_irqsave(&cstate->work_lock, flags);
> >                         list_del(&req_ctx->waiter);
> > -                       spin_unlock(&cstate->work_lock);
> > +                       spin_unlock_irqrestore(&cstate->work_lock, flags);
> > 
> >                         req = cast_mcryptd_ctx_to_req(req_ctx);
> >                         if (irqs_disabled())
> > @@ -504,7 +524,7 @@ static int sha_complete_job(struct mcryptd_hash_request_ctx *rctx,
> >                                 local_bh_enable();
> >                         }
> >                 }
> > -               sha_ctx = sha512_ctx_mgr_get_comp_ctx(cstate->mgr);
> > +               sha_ctx = sha512_ctx_mgr_get_comp_ctx(cstate);
> >         }
> > 
> >         return 0;
> > @@ -515,6 +535,7 @@ static void sha512_mb_add_list(struct mcryptd_hash_request_ctx *rctx,
> >  {
> >         unsigned long next_flush;
> >         unsigned long delay = usecs_to_jiffies(FLUSH_INTERVAL);
> > +       unsigned long flags;
> > 
> >         /* initialize tag */
> >         rctx->tag.arrival = jiffies;    /* tag the arrival time */
> > @@ -522,9 +543,9 @@ static void sha512_mb_add_list(struct mcryptd_hash_request_ctx *rctx,
> >         next_flush = rctx->tag.arrival + delay;
> >         rctx->tag.expire = next_flush;
> > 
> > -       spin_lock(&cstate->work_lock);
> > +       spin_lock_irqsave(&cstate->work_lock, flags);
> >         list_add_tail(&rctx->waiter, &cstate->work_list);
> > -       spin_unlock(&cstate->work_lock);
> > +       spin_unlock_irqrestore(&cstate->work_lock, flags);
> > 
> >         mcryptd_arm_flusher(cstate, delay);
> >  }
> > @@ -565,7 +586,7 @@ static int sha512_mb_update(struct ahash_request *areq)
> >         sha_ctx = (struct sha512_hash_ctx *) ahash_request_ctx(areq);
> >         sha512_mb_add_list(rctx, cstate);
> >         kernel_fpu_begin();
> > -       sha_ctx = sha512_ctx_mgr_submit(cstate->mgr, sha_ctx, rctx->walk.data,
> > +       sha_ctx = sha512_ctx_mgr_submit(cstate, sha_ctx, rctx->walk.data,
> >                                                         nbytes, HASH_UPDATE);
> >         kernel_fpu_end();
> > 
> > @@ -628,7 +649,7 @@ static int sha512_mb_finup(struct ahash_request *areq)
> >         sha512_mb_add_list(rctx, cstate);
> > 
> >         kernel_fpu_begin();
> > -       sha_ctx = sha512_ctx_mgr_submit(cstate->mgr, sha_ctx, rctx->walk.data,
> > +       sha_ctx = sha512_ctx_mgr_submit(cstate, sha_ctx, rctx->walk.data,
> >                                                                 nbytes, flag);
> >         kernel_fpu_end();
> > 
> > @@ -677,8 +698,7 @@ static int sha512_mb_final(struct ahash_request *areq)
> >         /* flag HASH_FINAL and 0 data size */
> >         sha512_mb_add_list(rctx, cstate);
> >         kernel_fpu_begin();
> > -       sha_ctx = sha512_ctx_mgr_submit(cstate->mgr, sha_ctx, &data, 0,
> > -                                                               HASH_LAST);
> > +       sha_ctx = sha512_ctx_mgr_submit(cstate, sha_ctx, &data, 0, HASH_LAST);
> >         kernel_fpu_end();
> > 
> >         /* check if anything is returned */
> > @@ -940,7 +960,7 @@ static unsigned long sha512_mb_flusher(struct mcryptd_alg_cstate *cstate)
> >                         break;
> >                 kernel_fpu_begin();
> >                 sha_ctx = (struct sha512_hash_ctx *)
> > -                                       sha512_ctx_mgr_flush(cstate->mgr);
> > +                                       sha512_ctx_mgr_flush(cstate);
> >                 kernel_fpu_end();
> >                 if (!sha_ctx) {
> >                         pr_err("sha512_mb error: nothing got flushed for"
> > --
> > 2.5.5
> > 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: crypto: NULL deref in sha512_mb_mgr_get_comp_job_avx2
  2017-02-01 18:45 ` Tim Chen
  2017-02-02 10:58   ` Dmitry Vyukov
@ 2017-02-11 10:50   ` Herbert Xu
  2017-02-13 17:20     ` Tim Chen
  1 sibling, 1 reply; 8+ messages in thread
From: Herbert Xu @ 2017-02-11 10:50 UTC (permalink / raw)
  To: Tim Chen
  Cc: Dmitry Vyukov, David Miller, linux-crypto, LKML, megha.dey,
	fenghua.yu, syzkaller

On Wed, Feb 01, 2017 at 10:45:02AM -0800, Tim Chen wrote:
>
> One theory that Mehga and I have is that perhaps the flusher
> and regular computaion updates are stepping on each other. 
> Can you try this patch and see if it helps?

Patch applied.  Thanks.
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: crypto: NULL deref in sha512_mb_mgr_get_comp_job_avx2
  2017-02-11 10:50   ` Herbert Xu
@ 2017-02-13 17:20     ` Tim Chen
  2017-02-15  3:42       ` Herbert Xu
  0 siblings, 1 reply; 8+ messages in thread
From: Tim Chen @ 2017-02-13 17:20 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Dmitry Vyukov, David Miller, linux-crypto, LKML, megha.dey,
	fenghua.yu, syzkaller

On Sat, 2017-02-11 at 18:50 +0800, Herbert Xu wrote:
> On Wed, Feb 01, 2017 at 10:45:02AM -0800, Tim Chen wrote:
> > 
> > 
> > One theory that Mehga and I have is that perhaps the flusher
> > and regular computaion updates are stepping on each other. 
> > Can you try this patch and see if it helps?
> Patch applied.  Thanks.

Herbert,

Megha is now able to create a test set up that produce
similar problem reported by Dmitry.  This patch did not
completely fix it.  So maybe you can hold off on merging
this patch to the mainline till we can develop a more
complete fix.

Thanks.

Tim

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: crypto: NULL deref in sha512_mb_mgr_get_comp_job_avx2
  2017-02-13 17:20     ` Tim Chen
@ 2017-02-15  3:42       ` Herbert Xu
  0 siblings, 0 replies; 8+ messages in thread
From: Herbert Xu @ 2017-02-15  3:42 UTC (permalink / raw)
  To: Tim Chen
  Cc: Dmitry Vyukov, David Miller, linux-crypto, LKML, megha.dey,
	fenghua.yu, syzkaller

On Mon, Feb 13, 2017 at 09:20:48AM -0800, Tim Chen wrote:
> 
> Megha is now able to create a test set up that produce
> similar problem reported by Dmitry.  This patch did not
> completely fix it.  So maybe you can hold off on merging
> this patch to the mainline till we can develop a more
> complete fix.

Tim, please send them as follow-up patches.

Thanks,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2017-02-15  3:42 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-01-31 13:16 crypto: NULL deref in sha512_mb_mgr_get_comp_job_avx2 Dmitry Vyukov
2017-01-31 18:51 ` Tim Chen
2017-02-01 18:45 ` Tim Chen
2017-02-02 10:58   ` Dmitry Vyukov
2017-02-02 16:00     ` Tim Chen
2017-02-11 10:50   ` Herbert Xu
2017-02-13 17:20     ` Tim Chen
2017-02-15  3:42       ` Herbert Xu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).