All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net] packet: hold bind lock when rebinding to fanout hook
@ 2017-09-14 21:14 Willem de Bruijn
  2017-09-14 22:53 ` Eric Dumazet
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Willem de Bruijn @ 2017-09-14 21:14 UTC (permalink / raw)
  To: netdev; +Cc: davem, nixiaoming, Willem de Bruijn

Packet socket bind operations must hold the po->bind_lock. This keeps
po->running consistent with whether the socket is actually on a ptype
list to receive packets.

fanout_add unbinds a socket and its packet_rcv/tpacket_rcv call, then
binds the fanout object to receive through packet_rcv_fanout.

Make it hold the po->bind_lock when testing po->running and rebinding.
Else, it can race with other rebind operations, such as that in
packet_set_ring from packet_rcv to tpacket_rcv. Concurrent updates
can result in a socket being added to a fanout group twice, causing
use-after-free KASAN bug reports, among others.

Reported independently by both trinity and syzkaller.
Verified that the syzkaller reproducer passes after this patch.

Reported-by: nixioaming <nixiaoming@huawei.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
---
 net/packet/af_packet.c | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index c26172995511..d288f52c53f7 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -1684,10 +1684,6 @@ static int fanout_add(struct sock *sk, u16 id, u16 type_flags)
 
 	mutex_lock(&fanout_mutex);
 
-	err = -EINVAL;
-	if (!po->running)
-		goto out;
-
 	err = -EALREADY;
 	if (po->fanout)
 		goto out;
@@ -1749,7 +1745,10 @@ static int fanout_add(struct sock *sk, u16 id, u16 type_flags)
 		list_add(&match->list, &fanout_list);
 	}
 	err = -EINVAL;
-	if (match->type == type &&
+
+	spin_lock(&po->bind_lock);
+	if (po->running &&
+	    match->type == type &&
 	    match->prot_hook.type == po->prot_hook.type &&
 	    match->prot_hook.dev == po->prot_hook.dev) {
 		err = -ENOSPC;
@@ -1761,6 +1760,13 @@ static int fanout_add(struct sock *sk, u16 id, u16 type_flags)
 			err = 0;
 		}
 	}
+	spin_unlock(&po->bind_lock);
+
+	if (err && !refcount_read(&match->sk_ref)) {
+		list_del(&match->list);
+		kfree(match);
+	}
+
 out:
 	if (err && rollover) {
 		kfree(rollover);
-- 
2.14.1.690.gbb1197296e-goog

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH net] packet: hold bind lock when rebinding to fanout hook
  2017-09-14 21:14 [PATCH net] packet: hold bind lock when rebinding to fanout hook Willem de Bruijn
@ 2017-09-14 22:53 ` Eric Dumazet
  2017-09-15 14:07 ` Willem de Bruijn
  2017-09-15 18:33 ` Cong Wang
  2 siblings, 0 replies; 7+ messages in thread
From: Eric Dumazet @ 2017-09-14 22:53 UTC (permalink / raw)
  To: Willem de Bruijn; +Cc: netdev, davem, nixiaoming

On Thu, 2017-09-14 at 17:14 -0400, Willem de Bruijn wrote:
> Packet socket bind operations must hold the po->bind_lock. This keeps
> po->running consistent with whether the socket is actually on a ptype
> list to receive packets.
> 
> fanout_add unbinds a socket and its packet_rcv/tpacket_rcv call, then
> binds the fanout object to receive through packet_rcv_fanout.
> 
> Make it hold the po->bind_lock when testing po->running and rebinding.
> Else, it can race with other rebind operations, such as that in
> packet_set_ring from packet_rcv to tpacket_rcv. Concurrent updates
> can result in a socket being added to a fanout group twice, causing
> use-after-free KASAN bug reports, among others.
> 
> Reported independently by both trinity and syzkaller.
> Verified that the syzkaller reproducer passes after this patch.
> 
> Reported-by: nixioaming <nixiaoming@huawei.com>
> Signed-off-by: Willem de Bruijn <willemb@google.com>
> ---
>  net/packet/af_packet.c | 16 +++++++++++-----
>  1 file changed, 11 insertions(+), 5 deletions(-)
> 
> diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
> index c26172995511..d288f52c53f7 100644
> --- a/net/packet/af_packet.c
> +++ b/net/packet/af_packet.c
> @@ -1684,10 +1684,6 @@ static int fanout_add(struct sock *sk, u16 id, u16 type_flags)
>  
>  	mutex_lock(&fanout_mutex);
>  
> -	err = -EINVAL;
> -	if (!po->running)
> -		goto out;
> -
>  	err = -EALREADY;
>  	if (po->fanout)
>  		goto out;
> @@ -1749,7 +1745,10 @@ static int fanout_add(struct sock *sk, u16 id, u16 type_flags)
>  		list_add(&match->list, &fanout_list);
>  	}
>  	err = -EINVAL;
> -	if (match->type == type &&
> +
> +	spin_lock(&po->bind_lock);
> +	if (po->running &&
> +	    match->type == type &&
>  	    match->prot_hook.type == po->prot_hook.type &&
>  	    match->prot_hook.dev == po->prot_hook.dev) {
>  		err = -ENOSPC;
> @@ -1761,6 +1760,13 @@ static int fanout_add(struct sock *sk, u16 id, u16 type_flags)
>  			err = 0;
>  		}
>  	}
> +	spin_unlock(&po->bind_lock);
> +
> +	if (err && !refcount_read(&match->sk_ref)) {

It seems sk_ref is always read/changed under 
mutex_lock(&fanout_mutex) protection.

Not sure why we use a refcount_t (or an atomic_t in older kernels)

All these atomic/spinlock/mutexes are a maze. 

> +		list_del(&match->list);
> +		kfree(match);
> +	}
> +
>  out:
>  	if (err && rollover) {
>  		kfree(rollover);

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH net] packet: hold bind lock when rebinding to fanout hook
  2017-09-14 21:14 [PATCH net] packet: hold bind lock when rebinding to fanout hook Willem de Bruijn
  2017-09-14 22:53 ` Eric Dumazet
@ 2017-09-15 14:07 ` Willem de Bruijn
  2017-09-20 21:03   ` David Miller
  2017-09-15 18:33 ` Cong Wang
  2 siblings, 1 reply; 7+ messages in thread
From: Willem de Bruijn @ 2017-09-15 14:07 UTC (permalink / raw)
  To: Willem de Bruijn; +Cc: Network Development, David Miller, nixiaoming

On Thu, Sep 14, 2017 at 5:14 PM, Willem de Bruijn <willemb@google.com> wrote:
> Packet socket bind operations must hold the po->bind_lock. This keeps
> po->running consistent with whether the socket is actually on a ptype
> list to receive packets.
>
> fanout_add unbinds a socket and its packet_rcv/tpacket_rcv call, then
> binds the fanout object to receive through packet_rcv_fanout.
>
> Make it hold the po->bind_lock when testing po->running and rebinding.
> Else, it can race with other rebind operations, such as that in
> packet_set_ring from packet_rcv to tpacket_rcv. Concurrent updates
> can result in a socket being added to a fanout group twice, causing
> use-after-free KASAN bug reports, among others.
>
> Reported independently by both trinity and syzkaller.
> Verified that the syzkaller reproducer passes after this patch.
>

I forgot to add the Fixes tag, sorry.

Fixes: dc99f600698d ("packet: Add fanout support.")

> Reported-by: nixioaming <nixiaoming@huawei.com>
> Signed-off-by: Willem de Bruijn <willemb@google.com>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH net] packet: hold bind lock when rebinding to fanout hook
  2017-09-14 21:14 [PATCH net] packet: hold bind lock when rebinding to fanout hook Willem de Bruijn
  2017-09-14 22:53 ` Eric Dumazet
  2017-09-15 14:07 ` Willem de Bruijn
@ 2017-09-15 18:33 ` Cong Wang
  2017-09-15 20:01   ` Willem de Bruijn
  2 siblings, 1 reply; 7+ messages in thread
From: Cong Wang @ 2017-09-15 18:33 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: Linux Kernel Network Developers, David Miller, nixiaoming

On Thu, Sep 14, 2017 at 2:14 PM, Willem de Bruijn <willemb@google.com> wrote:
> Packet socket bind operations must hold the po->bind_lock. This keeps
> po->running consistent with whether the socket is actually on a ptype
> list to receive packets.
>
> fanout_add unbinds a socket and its packet_rcv/tpacket_rcv call, then
> binds the fanout object to receive through packet_rcv_fanout.
>
> Make it hold the po->bind_lock when testing po->running and rebinding.
> Else, it can race with other rebind operations, such as that in
> packet_set_ring from packet_rcv to tpacket_rcv. Concurrent updates
> can result in a socket being added to a fanout group twice, causing
> use-after-free KASAN bug reports, among others.
>
> Reported independently by both trinity and syzkaller.
> Verified that the syzkaller reproducer passes after this patch.
>
> Reported-by: nixioaming <nixiaoming@huawei.com>
> Signed-off-by: Willem de Bruijn <willemb@google.com>
> ---
>  net/packet/af_packet.c | 16 +++++++++++-----
>  1 file changed, 11 insertions(+), 5 deletions(-)
>
> diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
> index c26172995511..d288f52c53f7 100644
> --- a/net/packet/af_packet.c
> +++ b/net/packet/af_packet.c
> @@ -1684,10 +1684,6 @@ static int fanout_add(struct sock *sk, u16 id, u16 type_flags)
>
>         mutex_lock(&fanout_mutex);
>
> -       err = -EINVAL;
> -       if (!po->running)
> -               goto out;
> -
>         err = -EALREADY;
>         if (po->fanout)
>                 goto out;
> @@ -1749,7 +1745,10 @@ static int fanout_add(struct sock *sk, u16 id, u16 type_flags)
>                 list_add(&match->list, &fanout_list);
>         }
>         err = -EINVAL;
> -       if (match->type == type &&
> +
> +       spin_lock(&po->bind_lock);
> +       if (po->running &&


As you move the po->running check later after setting po->rollover, I wonder
if po->rollover possibly depends on po>running on other path?


> +           match->type == type &&
>             match->prot_hook.type == po->prot_hook.type &&
>             match->prot_hook.dev == po->prot_hook.dev) {
>                 err = -ENOSPC;
> @@ -1761,6 +1760,13 @@ static int fanout_add(struct sock *sk, u16 id, u16 type_flags)
>                         err = 0;
>                 }
>         }
> +       spin_unlock(&po->bind_lock);
> +
> +       if (err && !refcount_read(&match->sk_ref)) {
> +               list_del(&match->list);
> +               kfree(match);
> +       }

This looks correct but still seems odd, it smells you don't use refcnt in an
expected way.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH net] packet: hold bind lock when rebinding to fanout hook
  2017-09-15 18:33 ` Cong Wang
@ 2017-09-15 20:01   ` Willem de Bruijn
  0 siblings, 0 replies; 7+ messages in thread
From: Willem de Bruijn @ 2017-09-15 20:01 UTC (permalink / raw)
  To: Cong Wang
  Cc: Willem de Bruijn, Linux Kernel Network Developers, David Miller,
	nixiaoming

>> diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
>> index c26172995511..d288f52c53f7 100644
>> --- a/net/packet/af_packet.c
>> +++ b/net/packet/af_packet.c
>> @@ -1684,10 +1684,6 @@ static int fanout_add(struct sock *sk, u16 id, u16 type_flags)
>>
>>         mutex_lock(&fanout_mutex);
>>
>> -       err = -EINVAL;
>> -       if (!po->running)
>> -               goto out;
>> -
>>         err = -EALREADY;
>>         if (po->fanout)
>>                 goto out;
>> @@ -1749,7 +1745,10 @@ static int fanout_add(struct sock *sk, u16 id, u16 type_flags)
>>                 list_add(&match->list, &fanout_list);
>>         }
>>         err = -EINVAL;
>> -       if (match->type == type &&
>> +
>> +       spin_lock(&po->bind_lock);
>> +       if (po->running &&
>
>
> As you move the po->running check later after setting po->rollover, I wonder
> if po->rollover possibly depends on po>running on other path?

The rollover code does not explicitly check po->running, if that is
what you are concerned about.

A newly allocated po->rollover structure will also only be accessed
from the datapath once the socket is added to a fanout group. Both
before and after this patch, that happens well after the po->running
check.

If the socket already had a po->rollover, then it must already have
had a po->fanout, too, so fanout_add does not reach this code.

>> +           match->type == type &&
>>             match->prot_hook.type == po->prot_hook.type &&
>>             match->prot_hook.dev == po->prot_hook.dev) {
>>                 err = -ENOSPC;
>> @@ -1761,6 +1760,13 @@ static int fanout_add(struct sock *sk, u16 id, u16 type_flags)
>>                         err = 0;
>>                 }
>>         }
>> +       spin_unlock(&po->bind_lock);
>> +
>> +       if (err && !refcount_read(&match->sk_ref)) {
>> +               list_del(&match->list);
>> +               kfree(match);
>> +       }
>
> This looks correct but still seems odd, it smells you don't use refcnt in an
> expected way.

It tests whether the object has no references, in which case it must have
been newly allocated and the fanout join operation must have failed.

I don't see an obviously simpler patch.

The entire code could perhaps be restructured eventually. There is,
for instance, no reason to test match->{type, prot_hook.type, prot_hook.dev}
when having just allocated the structure. Nor to test whether sk_ref exceeds
PACKET_FANOUT_MAX. Conversely, this sk_ref == 0 test only makes sense
when having taken the (!match) branch earlier.

But a refactor is out of scope for a bug fix.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH net] packet: hold bind lock when rebinding to fanout hook
  2017-09-15 14:07 ` Willem de Bruijn
@ 2017-09-20 21:03   ` David Miller
  2017-09-21 21:10     ` Willem de Bruijn
  0 siblings, 1 reply; 7+ messages in thread
From: David Miller @ 2017-09-20 21:03 UTC (permalink / raw)
  To: willemdebruijn.kernel; +Cc: willemb, netdev, nixiaoming

From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Date: Fri, 15 Sep 2017 10:07:46 -0400

> On Thu, Sep 14, 2017 at 5:14 PM, Willem de Bruijn <willemb@google.com> wrote:
>> Packet socket bind operations must hold the po->bind_lock. This keeps
>> po->running consistent with whether the socket is actually on a ptype
>> list to receive packets.
>>
>> fanout_add unbinds a socket and its packet_rcv/tpacket_rcv call, then
>> binds the fanout object to receive through packet_rcv_fanout.
>>
>> Make it hold the po->bind_lock when testing po->running and rebinding.
>> Else, it can race with other rebind operations, such as that in
>> packet_set_ring from packet_rcv to tpacket_rcv. Concurrent updates
>> can result in a socket being added to a fanout group twice, causing
>> use-after-free KASAN bug reports, among others.
>>
>> Reported independently by both trinity and syzkaller.
>> Verified that the syzkaller reproducer passes after this patch.
>>
> 
> I forgot to add the Fixes tag, sorry.
> 
> Fixes: dc99f600698d ("packet: Add fanout support.")

Applied and queued up for stable as it fixes this race and I can't
see any new problems it introduces.

But boy is this one messy area.

The scariest thing to me now is the save/restore sequence done by
packet_set_ring(), for example.

	spin_lock(&po->bind_lock);
	was_running = po->running;
	num = po->num;
	if (was_running) {
		po->num = 0;
		__unregister_prot_hook(sk, false);
	}
	spin_unlock(&po->bind_lock);
 ...
	spin_lock(&po->bind_lock);
	if (was_running) {
		po->num = num;
		register_prot_hook(sk);
	}
	spin_unlock(&po->bind_lock);

The socket is also locked during this sequence but that doesn't
prevent parallel changes to the running state.

Since po->bind_lock is dropped, it's possible for another thread
to grab bind_lock and bind it meanwhile.

The above code seems to assume that can't happen, and that
register_prot_hook() will always see po->running set to zero
and rebind the socket.

If the race happens we'll have weird state, because we did not
rebind yet we modified po->num.

We seem to have a hierachy of sleeping and non-sleeping locks
that do not work well together.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH net] packet: hold bind lock when rebinding to fanout hook
  2017-09-20 21:03   ` David Miller
@ 2017-09-21 21:10     ` Willem de Bruijn
  0 siblings, 0 replies; 7+ messages in thread
From: Willem de Bruijn @ 2017-09-21 21:10 UTC (permalink / raw)
  To: David Miller; +Cc: Willem de Bruijn, Network Development, nixiaoming

On Wed, Sep 20, 2017 at 5:03 PM, David Miller <davem@davemloft.net> wrote:
> From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
> Date: Fri, 15 Sep 2017 10:07:46 -0400
>
>> On Thu, Sep 14, 2017 at 5:14 PM, Willem de Bruijn <willemb@google.com> wrote:
>>> Packet socket bind operations must hold the po->bind_lock. This keeps
>>> po->running consistent with whether the socket is actually on a ptype
>>> list to receive packets.
>>>
>>> fanout_add unbinds a socket and its packet_rcv/tpacket_rcv call, then
>>> binds the fanout object to receive through packet_rcv_fanout.
>>>
>>> Make it hold the po->bind_lock when testing po->running and rebinding.
>>> Else, it can race with other rebind operations, such as that in
>>> packet_set_ring from packet_rcv to tpacket_rcv. Concurrent updates
>>> can result in a socket being added to a fanout group twice, causing
>>> use-after-free KASAN bug reports, among others.
>>>
>>> Reported independently by both trinity and syzkaller.
>>> Verified that the syzkaller reproducer passes after this patch.
>>>
>>
>> I forgot to add the Fixes tag, sorry.
>>
>> Fixes: dc99f600698d ("packet: Add fanout support.")
>
> Applied and queued up for stable as it fixes this race and I can't
> see any new problems it introduces.
>
> But boy is this one messy area.

Yeah, I've been staring at this code for a while now. But I don't see
an obvious way to simplify it. packet_notifier does not hold the socket
lock, so even if all of bind, set_ring and fanout_add do hold that,
bind_lock is still needed. packet_mmap may not take the socket lock,
so pg_vec_lock must remain to synchronize with packet_set_ring
even if for no other reason. I had a look at using rcu pointers for rx_ring
and tx_ring, to avoid taking that lock in the datapath and possibly updating
without the unbind/bind dance. But that update needs to be atomic with
purging the socket queue, so the socket must be properly quiesced.

> The scariest thing to me now is the save/restore sequence done by
> packet_set_ring(), for example.
>
>         spin_lock(&po->bind_lock);
>         was_running = po->running;
>         num = po->num;
>         if (was_running) {
>                 po->num = 0;
>                 __unregister_prot_hook(sk, false);
>         }
>         spin_unlock(&po->bind_lock);
>  ...
>         spin_lock(&po->bind_lock);
>         if (was_running) {
>                 po->num = num;
>                 register_prot_hook(sk);
>         }
>         spin_unlock(&po->bind_lock);
>
> The socket is also locked during this sequence but that doesn't
> prevent parallel changes to the running state.
>
> Since po->bind_lock is dropped, it's possible for another thread
> to grab bind_lock and bind it meanwhile.
>
> The above code seems to assume that can't happen, and that
> register_prot_hook() will always see po->running set to zero
> and rebind the socket.
>
> If the race happens we'll have weird state, because we did not
> rebind yet we modified po->num.

It appears that the only path that may try to bind without holding the
socket lock is packet_notifier. That skips register_prot_hook if !po->num.

> We seem to have a hierachy of sleeping and non-sleeping locks
> that do not work well together.

Given the number recent bugs that were fixed by locking the socket
inside a particular setsockopt case, I think that we should lock that
entire function, similar to other protocol families (after verifying that
all cases can indeed sleep).

Another issue that looks fragile is the test for po->fanout in
packet_do_bind before taking the socket lock.

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2017-09-21 21:11 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-09-14 21:14 [PATCH net] packet: hold bind lock when rebinding to fanout hook Willem de Bruijn
2017-09-14 22:53 ` Eric Dumazet
2017-09-15 14:07 ` Willem de Bruijn
2017-09-20 21:03   ` David Miller
2017-09-21 21:10     ` Willem de Bruijn
2017-09-15 18:33 ` Cong Wang
2017-09-15 20:01   ` Willem de Bruijn

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.