* race condition in kernel/padata.c
@ 2017-03-22 23:03 ` Jason A. Donenfeld
  0 siblings, 0 replies; 13+ messages in thread
From: Jason A. Donenfeld @ 2017-03-22 23:03 UTC (permalink / raw)
  To: Steffen Klassert, Linux Crypto Mailing List, LKML, Netdev
  Cc: WireGuard mailing list

Hey Steffen,

WireGuard makes really heavy use of padata, feeding it units of work
from different cores in different contexts all at the same time. For
the most part, everything has been fine, but one particular user has
consistently run into mysterious bugs. He's using a rather old dual-core
CPU, which has a tendency to bring out race conditions. After
struggling to get a good backtrace, we finally managed to extract this
from list debugging:

[87487.298728] WARNING: CPU: 1 PID: 882 at lib/list_debug.c:33
__list_add+0xae/0x130
[87487.301868] list_add corruption. prev->next should be next
(ffffb17abfc043d0), but was ffff8dba70872c80. (prev=ffff8dba70872b00).
[87487.339011]  [<ffffffff9a53d075>] dump_stack+0x68/0xa3
[87487.342198]  [<ffffffff99e119a1>] ? console_unlock+0x281/0x6d0
[87487.345364]  [<ffffffff99d6b91f>] __warn+0xff/0x140
[87487.348513]  [<ffffffff99d6b9aa>] warn_slowpath_fmt+0x4a/0x50
[87487.351659]  [<ffffffff9a58b5de>] __list_add+0xae/0x130
[87487.354772]  [<ffffffff9add5094>] ? _raw_spin_lock+0x64/0x70
[87487.357915]  [<ffffffff99eefd66>] padata_reorder+0x1e6/0x420
[87487.361084]  [<ffffffff99ef0055>] padata_do_serial+0xa5/0x120

padata_reorder calls list_add_tail with the list to which it's adding
locked, which seems correct:

spin_lock(&squeue->serial.lock);
list_add_tail(&padata->list, &squeue->serial.list);
spin_unlock(&squeue->serial.lock);

This therefore leaves only one place where such an inconsistency could
occur: if padata->list is added at the same time on two different
threads. This padata pointer comes from the call to
padata_get_next(pd), which contains the following block:

next_queue = per_cpu_ptr(pd->pqueue, cpu);
padata = NULL;
reorder = &next_queue->reorder;
if (!list_empty(&reorder->list)) {
       padata = list_entry(reorder->list.next,
                           struct padata_priv, list);
       spin_lock(&reorder->lock);
       list_del_init(&padata->list);
       atomic_dec(&pd->reorder_objects);
       spin_unlock(&reorder->lock);

       pd->processed++;

       goto out;
}
out:
return padata;

I strongly suspect that the problem here is that two threads can race
on the reorder list. Even though the deletion is locked, the call to
list_entry is not, which means it's feasible for two threads to
pick up the same padata object and subsequently call list_add_tail on
it at the same time. The fix would thus be to hoist that lock
outside of that block.

This theory is unconfirmed at the moment, but I'll be playing with
some patches to see if this fixes the issue and then I'll get back to
you. In the meantime, if you have any insight before I potentially
waste some time, I'm all ears.

Thanks,
Jason

* Re: race condition in kernel/padata.c
  2017-03-22 23:03 ` Jason A. Donenfeld
@ 2017-03-23  8:40   ` Steffen Klassert
  -1 siblings, 0 replies; 13+ messages in thread
From: Steffen Klassert @ 2017-03-23  8:40 UTC (permalink / raw)
  To: Jason A. Donenfeld
  Cc: Netdev, Linux Crypto Mailing List, WireGuard mailing list, LKML

On Thu, Mar 23, 2017 at 12:03:43AM +0100, Jason A. Donenfeld wrote:
> Hey Steffen,
> 
> WireGuard makes really heavy use of padata, feeding it units of work
> from different cores in different contexts all at the same time. For
> the most part, everything has been fine, but one particular user has
> consistently run into mysterious bugs. He's using a rather old dual-core
> CPU, which has a tendency to bring out race conditions. After
> struggling to get a good backtrace, we finally managed to extract this
> from list debugging:
> 
> [87487.298728] WARNING: CPU: 1 PID: 882 at lib/list_debug.c:33
> __list_add+0xae/0x130
> [87487.301868] list_add corruption. prev->next should be next
> (ffffb17abfc043d0), but was ffff8dba70872c80. (prev=ffff8dba70872b00).
> [87487.339011]  [<ffffffff9a53d075>] dump_stack+0x68/0xa3
> [87487.342198]  [<ffffffff99e119a1>] ? console_unlock+0x281/0x6d0
> [87487.345364]  [<ffffffff99d6b91f>] __warn+0xff/0x140
> [87487.348513]  [<ffffffff99d6b9aa>] warn_slowpath_fmt+0x4a/0x50
> [87487.351659]  [<ffffffff9a58b5de>] __list_add+0xae/0x130
> [87487.354772]  [<ffffffff9add5094>] ? _raw_spin_lock+0x64/0x70
> [87487.357915]  [<ffffffff99eefd66>] padata_reorder+0x1e6/0x420
> [87487.361084]  [<ffffffff99ef0055>] padata_do_serial+0xa5/0x120
> 
> padata_reorder calls list_add_tail with the list to which it's adding
> locked, which seems correct:
> 
> spin_lock(&squeue->serial.lock);
> list_add_tail(&padata->list, &squeue->serial.list);
> spin_unlock(&squeue->serial.lock);
> 
> This therefore leaves only one place where such an inconsistency could
> occur: if padata->list is added at the same time on two different
> threads. This padata pointer comes from the call to
> padata_get_next(pd), which contains the following block:
> 
> next_queue = per_cpu_ptr(pd->pqueue, cpu);
> padata = NULL;
> reorder = &next_queue->reorder;
> if (!list_empty(&reorder->list)) {
>        padata = list_entry(reorder->list.next,
>                            struct padata_priv, list);
>        spin_lock(&reorder->lock);
>        list_del_init(&padata->list);
>        atomic_dec(&pd->reorder_objects);
>        spin_unlock(&reorder->lock);
> 
>        pd->processed++;
> 
>        goto out;
> }
> out:
> return padata;
> 
> I strongly suspect that the problem here is that two threads can race
> on the reorder list. Even though the deletion is locked, the call to
> list_entry is not, which means it's feasible for two threads to
> pick up the same padata object and subsequently call list_add_tail on
> it at the same time. The fix would thus be to hoist that lock
> outside of that block.

Yes, looks like we should lock the whole list handling block here.

Thanks!

* [PATCH] padata: avoid race in reordering
  2017-03-22 23:03 ` Jason A. Donenfeld
@ 2017-03-23 11:24 ` Jason A. Donenfeld
  2017-03-24  9:41   ` Steffen Klassert
  2017-03-24 14:16   ` Herbert Xu
  -1 siblings, 2 replies; 13+ messages in thread
From: Jason A. Donenfeld @ 2017-03-23 11:24 UTC (permalink / raw)
  To: Steffen Klassert, Linux Crypto Mailing List, LKML, Netdev
  Cc: Jason A. Donenfeld

Under extremely heavy uses of padata, crashes occur, and with list
debugging turned on, this happens instead:

[87487.298728] WARNING: CPU: 1 PID: 882 at lib/list_debug.c:33
__list_add+0xae/0x130
[87487.301868] list_add corruption. prev->next should be next
(ffffb17abfc043d0), but was ffff8dba70872c80. (prev=ffff8dba70872b00).
[87487.339011]  [<ffffffff9a53d075>] dump_stack+0x68/0xa3
[87487.342198]  [<ffffffff99e119a1>] ? console_unlock+0x281/0x6d0
[87487.345364]  [<ffffffff99d6b91f>] __warn+0xff/0x140
[87487.348513]  [<ffffffff99d6b9aa>] warn_slowpath_fmt+0x4a/0x50
[87487.351659]  [<ffffffff9a58b5de>] __list_add+0xae/0x130
[87487.354772]  [<ffffffff9add5094>] ? _raw_spin_lock+0x64/0x70
[87487.357915]  [<ffffffff99eefd66>] padata_reorder+0x1e6/0x420
[87487.361084]  [<ffffffff99ef0055>] padata_do_serial+0xa5/0x120

padata_reorder calls list_add_tail with the list to which it's adding
locked, which seems correct:

spin_lock(&squeue->serial.lock);
list_add_tail(&padata->list, &squeue->serial.list);
spin_unlock(&squeue->serial.lock);

This therefore leaves only one place where such an inconsistency could
occur: if padata->list is added at the same time on two different
threads. This padata pointer comes from the call to
padata_get_next(pd), which contains the following block:

next_queue = per_cpu_ptr(pd->pqueue, cpu);
padata = NULL;
reorder = &next_queue->reorder;
if (!list_empty(&reorder->list)) {
       padata = list_entry(reorder->list.next,
                           struct padata_priv, list);
       spin_lock(&reorder->lock);
       list_del_init(&padata->list);
       atomic_dec(&pd->reorder_objects);
       spin_unlock(&reorder->lock);

       pd->processed++;

       goto out;
}
out:
return padata;

I strongly suspect that the problem here is that two threads can race
on the reorder list. Even though the deletion is locked, the call to
list_entry is not, which means it's feasible for two threads to
pick up the same padata object and subsequently call list_add_tail on
it at the same time. The fix is thus to hoist that lock outside of
that block.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
---
 kernel/padata.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/kernel/padata.c b/kernel/padata.c
index 05316c9f32da..3202aa17492c 100644
--- a/kernel/padata.c
+++ b/kernel/padata.c
@@ -186,19 +186,20 @@ static struct padata_priv *padata_get_next(struct parallel_data *pd)
 
 	reorder = &next_queue->reorder;
 
+	spin_lock(&reorder->lock);
 	if (!list_empty(&reorder->list)) {
 		padata = list_entry(reorder->list.next,
 				    struct padata_priv, list);
 
-		spin_lock(&reorder->lock);
 		list_del_init(&padata->list);
 		atomic_dec(&pd->reorder_objects);
-		spin_unlock(&reorder->lock);
 
 		pd->processed++;
 
+		spin_unlock(&reorder->lock);
 		goto out;
 	}
+	spin_unlock(&reorder->lock);
 
 	if (__this_cpu_read(pd->pqueue->cpu_index) == next_queue->cpu_index) {
 		padata = ERR_PTR(-ENODATA);
-- 
2.11.1


* Re: [PATCH] padata: avoid race in reordering
  2017-03-23 11:24 ` [PATCH] padata: avoid race in reordering Jason A. Donenfeld
@ 2017-03-24  9:41   ` Steffen Klassert
  2017-03-26  3:01     ` David Miller
  2017-03-24 14:16   ` Herbert Xu
  1 sibling, 1 reply; 13+ messages in thread
From: Steffen Klassert @ 2017-03-24  9:41 UTC (permalink / raw)
  To: Jason A. Donenfeld; +Cc: Linux Crypto Mailing List, LKML, Netdev

On Thu, Mar 23, 2017 at 12:24:43PM +0100, Jason A. Donenfeld wrote:
> Under extremely heavy uses of padata, crashes occur, and with list
> debugging turned on, this happens instead:
> 
> [87487.298728] WARNING: CPU: 1 PID: 882 at lib/list_debug.c:33
> __list_add+0xae/0x130
> [87487.301868] list_add corruption. prev->next should be next
> (ffffb17abfc043d0), but was ffff8dba70872c80. (prev=ffff8dba70872b00).
> [87487.339011]  [<ffffffff9a53d075>] dump_stack+0x68/0xa3
> [87487.342198]  [<ffffffff99e119a1>] ? console_unlock+0x281/0x6d0
> [87487.345364]  [<ffffffff99d6b91f>] __warn+0xff/0x140
> [87487.348513]  [<ffffffff99d6b9aa>] warn_slowpath_fmt+0x4a/0x50
> [87487.351659]  [<ffffffff9a58b5de>] __list_add+0xae/0x130
> [87487.354772]  [<ffffffff9add5094>] ? _raw_spin_lock+0x64/0x70
> [87487.357915]  [<ffffffff99eefd66>] padata_reorder+0x1e6/0x420
> [87487.361084]  [<ffffffff99ef0055>] padata_do_serial+0xa5/0x120
> 
> padata_reorder calls list_add_tail with the list to which it's adding
> locked, which seems correct:
> 
> spin_lock(&squeue->serial.lock);
> list_add_tail(&padata->list, &squeue->serial.list);
> spin_unlock(&squeue->serial.lock);
> 
> This therefore leaves only one place where such an inconsistency could
> occur: if padata->list is added at the same time on two different
> threads. This padata pointer comes from the call to
> padata_get_next(pd), which contains the following block:
> 
> next_queue = per_cpu_ptr(pd->pqueue, cpu);
> padata = NULL;
> reorder = &next_queue->reorder;
> if (!list_empty(&reorder->list)) {
>        padata = list_entry(reorder->list.next,
>                            struct padata_priv, list);
>        spin_lock(&reorder->lock);
>        list_del_init(&padata->list);
>        atomic_dec(&pd->reorder_objects);
>        spin_unlock(&reorder->lock);
> 
>        pd->processed++;
> 
>        goto out;
> }
> out:
> return padata;
> 
> I strongly suspect that the problem here is that two threads can race
> on the reorder list. Even though the deletion is locked, the call to
> list_entry is not, which means it's feasible for two threads to
> pick up the same padata object and subsequently call list_add_tail on
> it at the same time. The fix is thus to hoist that lock outside of
> that block.
> 
> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

Acked-by: Steffen Klassert <steffen.klassert@secunet.com>


* Re: [PATCH] padata: avoid race in reordering
  2017-03-23 11:24 ` [PATCH] padata: avoid race in reordering Jason A. Donenfeld
  2017-03-24  9:41   ` Steffen Klassert
@ 2017-03-24 14:16   ` Herbert Xu
  2017-04-04 11:53     ` Jason A. Donenfeld
  1 sibling, 1 reply; 13+ messages in thread
From: Herbert Xu @ 2017-03-24 14:16 UTC (permalink / raw)
  To: Jason A. Donenfeld
  Cc: steffen.klassert, linux-crypto, linux-kernel, netdev, Jason

Jason A. Donenfeld <Jason@zx2c4.com> wrote:
> Under extremely heavy uses of padata, crashes occur, and with list
> debugging turned on, this happens instead:
> 
> [87487.298728] WARNING: CPU: 1 PID: 882 at lib/list_debug.c:33
> __list_add+0xae/0x130
> [87487.301868] list_add corruption. prev->next should be next
> (ffffb17abfc043d0), but was ffff8dba70872c80. (prev=ffff8dba70872b00).
> [87487.339011]  [<ffffffff9a53d075>] dump_stack+0x68/0xa3
> [87487.342198]  [<ffffffff99e119a1>] ? console_unlock+0x281/0x6d0
> [87487.345364]  [<ffffffff99d6b91f>] __warn+0xff/0x140
> [87487.348513]  [<ffffffff99d6b9aa>] warn_slowpath_fmt+0x4a/0x50
> [87487.351659]  [<ffffffff9a58b5de>] __list_add+0xae/0x130
> [87487.354772]  [<ffffffff9add5094>] ? _raw_spin_lock+0x64/0x70
> [87487.357915]  [<ffffffff99eefd66>] padata_reorder+0x1e6/0x420
> [87487.361084]  [<ffffffff99ef0055>] padata_do_serial+0xa5/0x120
> 
> padata_reorder calls list_add_tail with the list to which it's adding
> locked, which seems correct:
> 
> spin_lock(&squeue->serial.lock);
> list_add_tail(&padata->list, &squeue->serial.list);
> spin_unlock(&squeue->serial.lock);
> 
> This therefore leaves only one place where such an inconsistency could
> occur: if padata->list is added at the same time on two different
> threads. This padata pointer comes from the call to
> padata_get_next(pd), which contains the following block:
> 
> next_queue = per_cpu_ptr(pd->pqueue, cpu);
> padata = NULL;
> reorder = &next_queue->reorder;
> if (!list_empty(&reorder->list)) {
>       padata = list_entry(reorder->list.next,
>                           struct padata_priv, list);
>       spin_lock(&reorder->lock);
>       list_del_init(&padata->list);
>       atomic_dec(&pd->reorder_objects);
>       spin_unlock(&reorder->lock);
> 
>       pd->processed++;
> 
>       goto out;
> }
> out:
> return padata;
> 
> I strongly suspect that the problem here is that two threads can race
> on the reorder list. Even though the deletion is locked, the call to
> list_entry is not, which means it's feasible for two threads to
> pick up the same padata object and subsequently call list_add_tail on
> it at the same time. The fix is thus to hoist that lock outside of
> that block.
> 
> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

Patch applied.  Thanks.
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* Re: [PATCH] padata: avoid race in reordering
  2017-03-24  9:41   ` Steffen Klassert
@ 2017-03-26  3:01     ` David Miller
  2017-03-26  3:11       ` Herbert Xu
  0 siblings, 1 reply; 13+ messages in thread
From: David Miller @ 2017-03-26  3:01 UTC (permalink / raw)
  To: steffen.klassert; +Cc: Jason, linux-crypto, linux-kernel, netdev, herbert

From: Steffen Klassert <steffen.klassert@secunet.com>
Date: Fri, 24 Mar 2017 10:41:59 +0100

> On Thu, Mar 23, 2017 at 12:24:43PM +0100, Jason A. Donenfeld wrote:
>> Under extremely heavy uses of padata, crashes occur, and with list
>> debugging turned on, this happens instead:
 ...
>> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
> 
> Acked-by: Steffen Klassert <steffen.klassert@secunet.com>

Herbert, this should probably go via your crypto tree.


* Re: [PATCH] padata: avoid race in reordering
  2017-03-26  3:01     ` David Miller
@ 2017-03-26  3:11       ` Herbert Xu
  2017-03-26 12:32         ` Jason A. Donenfeld
  0 siblings, 1 reply; 13+ messages in thread
From: Herbert Xu @ 2017-03-26  3:11 UTC (permalink / raw)
  To: David Miller; +Cc: steffen.klassert, Jason, linux-crypto, linux-kernel, netdev

On Sat, Mar 25, 2017 at 08:01:51PM -0700, David Miller wrote:
> From: Steffen Klassert <steffen.klassert@secunet.com>
> Date: Fri, 24 Mar 2017 10:41:59 +0100
> 
> > On Thu, Mar 23, 2017 at 12:24:43PM +0100, Jason A. Donenfeld wrote:
> >> Under extremely heavy uses of padata, crashes occur, and with list
> >> debugging turned on, this happens instead:
>  ...
> >> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
> > 
> > Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
> 
> Herbert, this should probably go via your crypto tree.

Thanks David.  It's already in the crypto tree.
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* Re: [PATCH] padata: avoid race in reordering
  2017-03-26  3:11       ` Herbert Xu
@ 2017-03-26 12:32         ` Jason A. Donenfeld
  0 siblings, 0 replies; 13+ messages in thread
From: Jason A. Donenfeld @ 2017-03-26 12:32 UTC (permalink / raw)
  To: Herbert Xu
  Cc: David Miller, Steffen Klassert, Linux Crypto Mailing List, LKML, Netdev

I've got a few other races in padata, I think, that I'm working on
fixing. If these pan out, I'll submit them exclusively to -crypto
instead of -netdev, to avoid this confusion next time. Of course, if
I'm able to fix these, then I'll probably be bald from pulling my hair
out during this insane debugging frenzy of the last few days...

Jason


* Re: [PATCH] padata: avoid race in reordering
  2017-03-24 14:16   ` Herbert Xu
@ 2017-04-04 11:53     ` Jason A. Donenfeld
  2017-04-04 18:26       ` Greg KH
  0 siblings, 1 reply; 13+ messages in thread
From: Jason A. Donenfeld @ 2017-04-04 11:53 UTC (permalink / raw)
  To: stable; +Cc: Steffen Klassert, Linux Crypto Mailing List, LKML, Herbert Xu

Herbert applied this to his tree. It's probably a good stable
candidate, since it's a two line change to fix a race condition.

On Fri, Mar 24, 2017 at 3:16 PM, Herbert Xu <herbert@gondor.apana.org.au> wrote:
> Jason A. Donenfeld <Jason@zx2c4.com> wrote:
>> Under extremely heavy uses of padata, crashes occur, and with list
>> debugging turned on, this happens instead:
>>
>> [87487.298728] WARNING: CPU: 1 PID: 882 at lib/list_debug.c:33
>> __list_add+0xae/0x130
>> [87487.301868] list_add corruption. prev->next should be next
>> (ffffb17abfc043d0), but was ffff8dba70872c80. (prev=ffff8dba70872b00).
>> [87487.339011]  [<ffffffff9a53d075>] dump_stack+0x68/0xa3
>> [87487.342198]  [<ffffffff99e119a1>] ? console_unlock+0x281/0x6d0
>> [87487.345364]  [<ffffffff99d6b91f>] __warn+0xff/0x140
>> [87487.348513]  [<ffffffff99d6b9aa>] warn_slowpath_fmt+0x4a/0x50
>> [87487.351659]  [<ffffffff9a58b5de>] __list_add+0xae/0x130
>> [87487.354772]  [<ffffffff9add5094>] ? _raw_spin_lock+0x64/0x70
>> [87487.357915]  [<ffffffff99eefd66>] padata_reorder+0x1e6/0x420
>> [87487.361084]  [<ffffffff99ef0055>] padata_do_serial+0xa5/0x120
>>
>> padata_reorder calls list_add_tail with the list to which it's adding
>> locked, which seems correct:
>>
>> spin_lock(&squeue->serial.lock);
>> list_add_tail(&padata->list, &squeue->serial.list);
>> spin_unlock(&squeue->serial.lock);
>>
>> This therefore leaves only one place where such an inconsistency could
>> occur: if padata->list is added at the same time on two different
>> threads. This padata pointer comes from the call to
>> padata_get_next(pd), which contains the following block:
>>
>> next_queue = per_cpu_ptr(pd->pqueue, cpu);
>> padata = NULL;
>> reorder = &next_queue->reorder;
>> if (!list_empty(&reorder->list)) {
>>       padata = list_entry(reorder->list.next,
>>                           struct padata_priv, list);
>>       spin_lock(&reorder->lock);
>>       list_del_init(&padata->list);
>>       atomic_dec(&pd->reorder_objects);
>>       spin_unlock(&reorder->lock);
>>
>>       pd->processed++;
>>
>>       goto out;
>> }
>> out:
>> return padata;
>>
>> I strongly suspect that the problem here is that two threads can race
>> on the reorder list. Even though the deletion is locked, the call to
>> list_entry is not, which means it's feasible for two threads to
>> pick up the same padata object and subsequently call list_add_tail on
>> it at the same time. The fix is thus to hoist that lock outside of
>> that block.
>>
>> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
>
> Patch applied.  Thanks.
> --
> Email: Herbert Xu <herbert@gondor.apana.org.au>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* Re: [PATCH] padata: avoid race in reordering
  2017-04-04 11:53     ` Jason A. Donenfeld
@ 2017-04-04 18:26       ` Greg KH
  2017-04-05 10:29         ` Herbert Xu
  0 siblings, 1 reply; 13+ messages in thread
From: Greg KH @ 2017-04-04 18:26 UTC (permalink / raw)
  To: Jason A. Donenfeld
  Cc: stable, Steffen Klassert, Linux Crypto Mailing List, LKML, Herbert Xu

On Tue, Apr 04, 2017 at 01:53:15PM +0200, Jason A. Donenfeld wrote:
> Herbert applied this to his tree. It's probably a good stable
> candidate, since it's a two line change to fix a race condition.
> 
> On Fri, Mar 24, 2017 at 3:16 PM, Herbert Xu <herbert@gondor.apana.org.au> wrote:
> > Jason A. Donenfeld <Jason@zx2c4.com> wrote:
> >> Under extremely heavy uses of padata, crashes occur, and with list
> >> debugging turned on, this happens instead:
> >>
> >> [87487.298728] WARNING: CPU: 1 PID: 882 at lib/list_debug.c:33
> >> __list_add+0xae/0x130
> >> [87487.301868] list_add corruption. prev->next should be next
> >> (ffffb17abfc043d0), but was ffff8dba70872c80. (prev=ffff8dba70872b00).
> >> [87487.339011]  [<ffffffff9a53d075>] dump_stack+0x68/0xa3
> >> [87487.342198]  [<ffffffff99e119a1>] ? console_unlock+0x281/0x6d0
> >> [87487.345364]  [<ffffffff99d6b91f>] __warn+0xff/0x140
> >> [87487.348513]  [<ffffffff99d6b9aa>] warn_slowpath_fmt+0x4a/0x50
> >> [87487.351659]  [<ffffffff9a58b5de>] __list_add+0xae/0x130
> >> [87487.354772]  [<ffffffff9add5094>] ? _raw_spin_lock+0x64/0x70
> >> [87487.357915]  [<ffffffff99eefd66>] padata_reorder+0x1e6/0x420
> >> [87487.361084]  [<ffffffff99ef0055>] padata_do_serial+0xa5/0x120
> >>
> >> padata_reorder calls list_add_tail with the list to which it's adding
> >> locked, which seems correct:
> >>
> >> spin_lock(&squeue->serial.lock);
> >> list_add_tail(&padata->list, &squeue->serial.list);
> >> spin_unlock(&squeue->serial.lock);
> >>
> >> This therefore leaves only one place where such an inconsistency could
> >> occur: if padata->list is added at the same time on two different
> >> threads. This padata pointer comes from the call to
> >> padata_get_next(pd), which contains the following block:
> >>
> >> next_queue = per_cpu_ptr(pd->pqueue, cpu);
> >> padata = NULL;
> >> reorder = &next_queue->reorder;
> >> if (!list_empty(&reorder->list)) {
> >>       padata = list_entry(reorder->list.next,
> >>                           struct padata_priv, list);
> >>       spin_lock(&reorder->lock);
> >>       list_del_init(&padata->list);
> >>       atomic_dec(&pd->reorder_objects);
> >>       spin_unlock(&reorder->lock);
> >>
> >>       pd->processed++;
> >>
> >>       goto out;
> >> }
> >> out:
> >> return padata;
> >>
> >> I strongly suspect that the problem here is that two threads can race
> >> on the reorder list. Even though the deletion is locked, the call to
> >> list_entry is not, which means it's feasible for two threads to
> >> pick up the same padata object and subsequently call list_add_tail on
> >> it at the same time. The fix is thus to hoist that lock outside of
> >> that block.
> >>
> >> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
> >
> > Patch applied.  Thanks.

Any clue as to what the git commit id is?

thanks,

greg k-h


* Re: [PATCH] padata: avoid race in reordering
  2017-04-04 18:26       ` Greg KH
@ 2017-04-05 10:29         ` Herbert Xu
  0 siblings, 0 replies; 13+ messages in thread
From: Herbert Xu @ 2017-04-05 10:29 UTC (permalink / raw)
  To: Greg KH
  Cc: Jason A. Donenfeld, stable, Steffen Klassert,
	Linux Crypto Mailing List, LKML

On Tue, Apr 04, 2017 at 08:26:12PM +0200, Greg KH wrote:
> Any clue as to what the git commit id is?

It's

https://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6.git/commit/?h=linus&id=de5540d088fe97ad583cc7d396586437b32149a5


Thanks,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
