* [PATCH 1/3] net: Fix possible race in peernet2id_alloc()
@ 2017-12-21 13:13 Kirill Tkhai
  2017-12-21 13:14 ` [PATCH 2/3] net: Add BUG_ON() to get_net() Kirill Tkhai
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Kirill Tkhai @ 2017-12-21 13:13 UTC (permalink / raw)
  To: netdev, davem; +Cc: eric.dumazet, ktkhai, ebiederm

peernet2id_alloc() is racy without rtnl_lock(), as atomic_read(&peer->count)
under net->nsid_lock does not guarantee that peer is alive:

rcu_read_lock()
peernet2id_alloc()                            ..
  spin_lock_bh(&net->nsid_lock)               ..
  atomic_read(&peer->count) == 1              ..
  ..                                          put_net()
  ..                                            cleanup_net()
  ..                                              for_each_net(tmp)
  ..                                                spin_lock_bh(&tmp->nsid_lock)
  ..                                                __peernet2id(tmp, net) == -1
  ..                                                    ..
  ..                                                    ..
    __peernet2id_alloc(alloc == true)                   ..
  ..                                                    ..
rcu_read_unlock()                                       ..
..                                                synchronize_rcu()
..                                                kmem_cache_free(net)

After the above situation, net::netns_ids contains an id pointing to freed memory,
and any later lookup by that id will operate on this freed memory.

Currently, peernet2id_alloc() is used under rtnl_lock() everywhere except
ovs_vport_cmd_fill_info(), so this race can't occur. But peernet2id_alloc()
is a generic interface, and it is better to fix it before someone really starts
to use it in the wrong context.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 net/core/net_namespace.c |   23 +++++++++++++++++++----
 1 file changed, 19 insertions(+), 4 deletions(-)

diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
index 60a71be75aea..6a4eab438221 100644
--- a/net/core/net_namespace.c
+++ b/net/core/net_namespace.c
@@ -221,17 +221,32 @@ static void rtnl_net_notifyid(struct net *net, int cmd, int id);
  */
 int peernet2id_alloc(struct net *net, struct net *peer)
 {
-	bool alloc;
+	bool alloc = false, alive = false;
 	int id;
 
-	if (atomic_read(&net->count) == 0)
-		return NETNSA_NSID_NOT_ASSIGNED;
 	spin_lock_bh(&net->nsid_lock);
-	alloc = atomic_read(&peer->count) == 0 ? false : true;
+	/* Spinlock guarantees we never hash a peer to net->netns_ids
+	 * after idr_destroy(&net->netns_ids) occurs in cleanup_net().
+	 */
+	if (atomic_read(&net->count) == 0) {
+		id = NETNSA_NSID_NOT_ASSIGNED;
+		goto unlock;
+	}
+	/*
+	 * When peer is obtained from RCU lists, we may race with
+	 * its cleanup. Check whether it's alive, and this guarantees
+	 * we never hash a peer back to net->netns_ids, after it has
+	 * just been idr_remove()'d from there in cleanup_net().
+	 */
+	if (maybe_get_net(peer))
+		alive = alloc = true;
 	id = __peernet2id_alloc(net, peer, &alloc);
+unlock:
 	spin_unlock_bh(&net->nsid_lock);
 	if (alloc && id >= 0)
 		rtnl_net_notifyid(net, RTM_NEWNSID, id);
+	if (alive)
+		put_net(peer);
 	return id;
 }
 EXPORT_SYMBOL_GPL(peernet2id_alloc);


* [PATCH 2/3] net: Add BUG_ON() to get_net()
  2017-12-21 13:13 [PATCH 1/3] net: Fix possible race in peernet2id_alloc() Kirill Tkhai
@ 2017-12-21 13:14 ` Kirill Tkhai
  2017-12-21 13:14 ` [PATCH 3/3] net: Remove spinlock from get_net_ns_by_id() Kirill Tkhai
  2017-12-21 17:39 ` [PATCH 1/3] net: Fix possible race in peernet2id_alloc() Eric W. Biederman
  2 siblings, 0 replies; 7+ messages in thread
From: Kirill Tkhai @ 2017-12-21 13:14 UTC (permalink / raw)
  To: netdev, davem; +Cc: eric.dumazet, ktkhai, ebiederm

Since people may mistakenly obtain a net that is being destroyed
from net_namespace_list or from net::netns_ids without checking
its net::count, let's protect against such situations and insert
a BUG_ON() that stops us from going any further.

A panic is better than memory corruption and undefined
behavior.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 include/net/net_namespace.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
index 10f99dafd5ac..ff0e47471d5b 100644
--- a/include/net/net_namespace.h
+++ b/include/net/net_namespace.h
@@ -195,7 +195,7 @@ void __put_net(struct net *net);
 
 static inline struct net *get_net(struct net *net)
 {
-	atomic_inc(&net->count);
+	BUG_ON(atomic_inc_return(&net->count) <= 1);
 	return net;
 }
 


* [PATCH 3/3] net: Remove spinlock from get_net_ns_by_id()
  2017-12-21 13:13 [PATCH 1/3] net: Fix possible race in peernet2id_alloc() Kirill Tkhai
  2017-12-21 13:14 ` [PATCH 2/3] net: Add BUG_ON() to get_net() Kirill Tkhai
@ 2017-12-21 13:14 ` Kirill Tkhai
  2017-12-21 17:39 ` [PATCH 1/3] net: Fix possible race in peernet2id_alloc() Eric W. Biederman
  2 siblings, 0 replies; 7+ messages in thread
From: Kirill Tkhai @ 2017-12-21 13:14 UTC (permalink / raw)
  To: netdev, davem; +Cc: eric.dumazet, ktkhai, ebiederm

idr_find() is safe under rcu_read_lock() and
maybe_get_net() guarantees that net is alive.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 net/core/net_namespace.c |    2 --
 1 file changed, 2 deletions(-)

diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
index 6a4eab438221..a675f35a18ff 100644
--- a/net/core/net_namespace.c
+++ b/net/core/net_namespace.c
@@ -279,11 +279,9 @@ struct net *get_net_ns_by_id(struct net *net, int id)
 		return NULL;
 
 	rcu_read_lock();
-	spin_lock_bh(&net->nsid_lock);
 	peer = idr_find(&net->netns_ids, id);
 	if (peer)
 		peer = maybe_get_net(peer);
-	spin_unlock_bh(&net->nsid_lock);
 	rcu_read_unlock();
 
 	return peer;


* Re: [PATCH 1/3] net: Fix possible race in peernet2id_alloc()
  2017-12-21 13:13 [PATCH 1/3] net: Fix possible race in peernet2id_alloc() Kirill Tkhai
  2017-12-21 13:14 ` [PATCH 2/3] net: Add BUG_ON() to get_net() Kirill Tkhai
  2017-12-21 13:14 ` [PATCH 3/3] net: Remove spinlock from get_net_ns_by_id() Kirill Tkhai
@ 2017-12-21 17:39 ` Eric W. Biederman
  2017-12-22 10:30   ` Kirill Tkhai
  2 siblings, 1 reply; 7+ messages in thread
From: Eric W. Biederman @ 2017-12-21 17:39 UTC (permalink / raw)
  To: Kirill Tkhai; +Cc: netdev, davem, eric.dumazet

Kirill Tkhai <ktkhai@virtuozzo.com> writes:

> peernet2id_alloc() is racy without rtnl_lock() as atomic_read(&peer->count)
> under net->nsid_lock does not guarantee, peer is alive:
>
> rcu_read_lock()
> peernet2id_alloc()                            ..
>   spin_lock_bh(&net->nsid_lock)               ..
>   atomic_read(&peer->count) == 1              ..
>   ..                                          put_net()
>   ..                                            cleanup_net()
>   ..                                              for_each_net(tmp)
>   ..                                                spin_lock_bh(&tmp->nsid_lock)
>   ..                                                __peernet2id(tmp, net) == -1
>   ..                                                    ..
>   ..                                                    ..
>     __peernet2id_alloc(alloc == true)                   ..
>   ..                                                    ..
> rcu_read_unlock()                                       ..
> ..                                                synchronize_rcu()
> ..                                                kmem_cache_free(net)
>
> After the above situation, net::netns_id contains id pointing to freed memory,
> and any other dereferencing by the id will operate with this freed memory.
>
> Currently, peernet2id_alloc() is used under rtnl_lock() everywhere except
> ovs_vport_cmd_fill_info(), and this race can't occur. But peernet2id_alloc()
> is generic interface, and better we fix it before someone really starts
> use it in wrong context.

So it comes down to this piece of code from ovs and just let me say ick.
	if (!net_eq(net, dev_net(vport->dev))) {
		int id = peernet2id_alloc(net, dev_net(vport->dev));

		if (nla_put_s32(skb, OVS_VPORT_ATTR_NETNSID, id))
			goto nla_put_failure;
	}

Without the rtnl lock dev_net can change between the test and the
call of peernet2id_alloc.

At first glance it looks like the bug is that we are running a control
path of the networking stack without the rtnl lock. So it may be that
ASSERT_RTNL() is the better fix.

Given that it would be nice to reduce the scope of the rtnl lock this
might not be a bad direction.  Let me see.

Is rtnl_notify safe without the rtnl lock?


>
> Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
> ---
>  net/core/net_namespace.c |   23 +++++++++++++++++++----
>  1 file changed, 19 insertions(+), 4 deletions(-)
>
> diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
> index 60a71be75aea..6a4eab438221 100644
> --- a/net/core/net_namespace.c
> +++ b/net/core/net_namespace.c
> @@ -221,17 +221,32 @@ static void rtnl_net_notifyid(struct net *net, int cmd, int id);
>   */
>  int peernet2id_alloc(struct net *net, struct net *peer)
>  {
> -	bool alloc;
> +	bool alloc = false, alive = false;
>  	int id;

        ^^^ Perhaps we want "ASSERT_RTNL();" here?
>  
> -	if (atomic_read(&net->count) == 0)
> -		return NETNSA_NSID_NOT_ASSIGNED;

Moving this hunk is of no benefit.  The code must be called with a valid
reference to net.   Which means net->count is a fancy way of testing to
see if the code is in cleanup_net.  In all other cases net->count should
be non-zero, and it should remain that way because our caller must
keep a reference.

>  	spin_lock_bh(&net->nsid_lock);
> -	alloc = atomic_read(&peer->count) == 0 ? false : true;
> +	/* Spinlock guarantees we never hash a peer to net->netns_ids
> +	 * after idr_destroy(&net->netns_ids) occurs in cleanup_net().
> +	 */
> +	if (atomic_read(&net->count) == 0) {
> +		id = NETNSA_NSID_NOT_ASSIGNED;
> +		goto unlock;
> +	}
> +	/*
> +	 * When peer is obtained from RCU lists, we may race with
> +	 * its cleanup. Check whether it's alive, and this guarantees
> +	 * we never hash a peer back to net->netns_ids, after it has
> +	 * just been idr_remove()'d from there in cleanup_net().
> +	 */
> +	if (maybe_get_net(peer))
> +		alive = alloc = true;

Yes this does seem reasonable.  The more obvious-looking code, which
would return NETNSA_NSID_NOT_ASSIGNED if the peer has a count of 0, is
silly, as it would make it appear that a peer is momentarily outside
of a network namespace when the peer is in fact moving from one network
namespace to another.
        
>  	id = __peernet2id_alloc(net, peer, &alloc);
> +unlock:
>  	spin_unlock_bh(&net->nsid_lock);
>  	if (alloc && id >= 0)
>  		rtnl_net_notifyid(net, RTM_NEWNSID, id);
                ^^^^^^
                Is this safe without the rtnl lock?
> +	if (alive)
> +		put_net(peer);
>  	return id;
>  }
>  EXPORT_SYMBOL_GPL(peernet2id_alloc);

Eric


* Re: [PATCH 1/3] net: Fix possible race in peernet2id_alloc()
  2017-12-21 17:39 ` [PATCH 1/3] net: Fix possible race in peernet2id_alloc() Eric W. Biederman
@ 2017-12-22 10:30   ` Kirill Tkhai
  2017-12-28 12:55     ` Kirill Tkhai
  0 siblings, 1 reply; 7+ messages in thread
From: Kirill Tkhai @ 2017-12-22 10:30 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: netdev, davem, eric.dumazet

Hi, Eric,

On 21.12.2017 20:39, Eric W. Biederman wrote:
> Kirill Tkhai <ktkhai@virtuozzo.com> writes:
> 
>> peernet2id_alloc() is racy without rtnl_lock() as atomic_read(&peer->count)
>> under net->nsid_lock does not guarantee, peer is alive:
>>
>> rcu_read_lock()
>> peernet2id_alloc()                            ..
>>   spin_lock_bh(&net->nsid_lock)               ..
>>   atomic_read(&peer->count) == 1              ..
>>   ..                                          put_net()
>>   ..                                            cleanup_net()
>>   ..                                              for_each_net(tmp)
>>   ..                                                spin_lock_bh(&tmp->nsid_lock)
>>   ..                                                __peernet2id(tmp, net) == -1
>>   ..                                                    ..
>>   ..                                                    ..
>>     __peernet2id_alloc(alloc == true)                   ..
>>   ..                                                    ..
>> rcu_read_unlock()                                       ..
>> ..                                                synchronize_rcu()
>> ..                                                kmem_cache_free(net)
>>
>> After the above situation, net::netns_id contains id pointing to freed memory,
>> and any other dereferencing by the id will operate with this freed memory.
>>
>> Currently, peernet2id_alloc() is used under rtnl_lock() everywhere except
>> ovs_vport_cmd_fill_info(), and this race can't occur. But peernet2id_alloc()
>> is generic interface, and better we fix it before someone really starts
>> use it in wrong context.
> 
> So it comes down to this piece of code from ovs and just let me say ick.
> 	if (!net_eq(net, dev_net(vport->dev))) {
> 		int id = peernet2id_alloc(net, dev_net(vport->dev));
> 
> 		if (nla_put_s32(skb, OVS_VPORT_ATTR_NETNSID, id))
> 			goto nla_put_failure;
> 	}
> 
> Without the rtnl lock dev_net can change between the test and the
> call of peernet2id_alloc.

Places like this are usually protected via a netdevice notifier. There is
ovs_dp_device_notifier in openvswitch, and it seems to be the handler for
the situation above. But I'm not sure it's safe, and it's not clear to me
why it does not take ovl_mutex to synchronize with ovs_vport_cmd_fill_info(),
or whether there is some other synchronization.

> At first glance it looks like the bug is that we are running a control
> path of the networking stack without the rtnl lock. So it may be that
> ASSERT_RTNL() is the better fix.

Do you mean inserting ASSERT_RTNL() into peernet2id_alloc()?
If so, it's not needed, as all nsids have been protected via a separate lock since

commit 95f38411df05 "netns: use a spin_lock to protect nsid management"

and commit de133464c9e7 "netns: make nsid_lock per net".

The nsid primitives are meant to be safe regardless of rtnl_lock().
My patch just adds some sanity checks to make them safer.
 
> Given that it would be nice to reduce the scope of the rtnl lock this
> might not be a bad direction.  Let me see.
> Is rtnl_notify safe without the rtnl lock?

It seems it has to be safe, as it only sends an skb to netlink.
rtnl_notify() is used without rtnl_lock() in most other places in the net code.

The only reason rtnl_lock() is needed in cleanup_net() is that we unlink net from the list.
The rtnl_lock() around the nsid cleanup is a leftover from before nsids were made protected
via a separate lock. So we may relax rtnl_lock() in cleanup_net(), and I'm going to do that
in one of my next patches (not yet sent).

>>
>> Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
>> ---
>>  net/core/net_namespace.c |   23 +++++++++++++++++++----
>>  1 file changed, 19 insertions(+), 4 deletions(-)
>>
>> diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
>> index 60a71be75aea..6a4eab438221 100644
>> --- a/net/core/net_namespace.c
>> +++ b/net/core/net_namespace.c
>> @@ -221,17 +221,32 @@ static void rtnl_net_notifyid(struct net *net, int cmd, int id);
>>   */
>>  int peernet2id_alloc(struct net *net, struct net *peer)
>>  {
>> -	bool alloc;
>> +	bool alloc = false, alive = false;
>>  	int id;
> 
>         ^^^ Perhaps we want "ASSERT_RTNL();" here?
>>  
>> -	if (atomic_read(&net->count) == 0)
>> -		return NETNSA_NSID_NOT_ASSIGNED;
> 
> Moving this hunk is of no benefit.  The code must be called with a valid
> reference to net.   Which means net->count is a fancy way of testing to
> see if the code is in cleanup_net.  In all other cases net->count should
> be non-zero and it should remain that way because of our caller must
> keep a reference.

Let me discuss this.
This interface is exported to drivers, and they don't always follow the rules
about what is allowed or disallowed. This hunk adds sanity checks, which
protect us from bad driver code etc. Some people want to use peernet2id_alloc()
in generic functions, which may be called in different contexts (also, for a net
obtained from an RCU list). Instead of every user making such a test in its own
function, this hunk introduces a generic validity check, which saves people from
inserting n+1 checks of their own.

Also, the unlocked check gives no performance advantage, as we take the lock
afterwards *almost always*. So my suggestion is to make peernet2id_alloc() safer
for the end user.
 
>>  	spin_lock_bh(&net->nsid_lock);
>> -	alloc = atomic_read(&peer->count) == 0 ? false : true;
>> +	/* Spinlock guarantees we never hash a peer to net->netns_ids
>> +	 * after idr_destroy(&net->netns_ids) occurs in cleanup_net().
>> +	 */
>> +	if (atomic_read(&net->count) == 0) {
>> +		id = NETNSA_NSID_NOT_ASSIGNED;
>> +		goto unlock;
>> +	}
>> +	/*
>> +	 * When peer is obtained from RCU lists, we may race with
>> +	 * its cleanup. Check whether it's alive, and this guarantees
>> +	 * we never hash a peer back to net->netns_ids, after it has
>> +	 * just been idr_remove()'d from there in cleanup_net().
>> +	 */
>> +	if (maybe_get_net(peer))
>> +		alive = alloc = true;
> 
> Yes this does seem reasonable.  The more obvious looking code which
> would return NETNSA_NSID_NOT_ASSIGNED if the peer has a count of 0, is
> silly as it makes would make it appear that a peer is momentary outside
> of a network namespace when the peer is in fact moving from one network
> namespace to another.
>         
>>  	id = __peernet2id_alloc(net, peer, &alloc);
>> +unlock:
>>  	spin_unlock_bh(&net->nsid_lock);
>>  	if (alloc && id >= 0)
>>  		rtnl_net_notifyid(net, RTM_NEWNSID, id);
>                 ^^^^^^
>                 Is this safe without the rtnl lock?

Yes, it seems to be safe. Please see above.

>> +	if (alive)
>> +		put_net(peer);
>>  	return id;
>>  }
>>  EXPORT_SYMBOL_GPL(peernet2id_alloc);

Kirill


* Re: [PATCH 1/3] net: Fix possible race in peernet2id_alloc()
  2017-12-22 10:30   ` Kirill Tkhai
@ 2017-12-28 12:55     ` Kirill Tkhai
  2017-12-28 17:17       ` David Miller
  0 siblings, 1 reply; 7+ messages in thread
From: Kirill Tkhai @ 2017-12-28 12:55 UTC (permalink / raw)
  To: Eric W. Biederman, davem; +Cc: netdev, eric.dumazet

Hi, David,

I see the status of the patchset has changed on https://patchwork.ozlabs.org/patch/851928/
and is now "Changes Requested".

There was Eric's reply, and my reply to his message. After that there were no
objections to my reply or to the other patches.

Could you please clarify the status, or what I should do with the patchset
(because it's not clear to me)?

Thanks,
Kirill

On 22.12.2017 13:30, Kirill Tkhai wrote:
> Hi, Eric,
> 
> On 21.12.2017 20:39, Eric W. Biederman wrote:
>> Kirill Tkhai <ktkhai@virtuozzo.com> writes:
>>
>>> peernet2id_alloc() is racy without rtnl_lock() as atomic_read(&peer->count)
>>> under net->nsid_lock does not guarantee, peer is alive:
>>>
>>> rcu_read_lock()
>>> peernet2id_alloc()                            ..
>>>   spin_lock_bh(&net->nsid_lock)               ..
>>>   atomic_read(&peer->count) == 1              ..
>>>   ..                                          put_net()
>>>   ..                                            cleanup_net()
>>>   ..                                              for_each_net(tmp)
>>>   ..                                                spin_lock_bh(&tmp->nsid_lock)
>>>   ..                                                __peernet2id(tmp, net) == -1
>>>   ..                                                    ..
>>>   ..                                                    ..
>>>     __peernet2id_alloc(alloc == true)                   ..
>>>   ..                                                    ..
>>> rcu_read_unlock()                                       ..
>>> ..                                                synchronize_rcu()
>>> ..                                                kmem_cache_free(net)
>>>
>>> After the above situation, net::netns_id contains id pointing to freed memory,
>>> and any other dereferencing by the id will operate with this freed memory.
>>>
>>> Currently, peernet2id_alloc() is used under rtnl_lock() everywhere except
>>> ovs_vport_cmd_fill_info(), and this race can't occur. But peernet2id_alloc()
>>> is generic interface, and better we fix it before someone really starts
>>> use it in wrong context.
>>
>> So it comes down to this piece of code from ovs and just let me say ick.
>> 	if (!net_eq(net, dev_net(vport->dev))) {
>> 		int id = peernet2id_alloc(net, dev_net(vport->dev));
>>
>> 		if (nla_put_s32(skb, OVS_VPORT_ATTR_NETNSID, id))
>> 			goto nla_put_failure;
>> 	}
>>
>> Without the rtnl lock dev_net can change between the test and the
>> call of peernet2id_alloc.
> 
> The places like this are usually protected via netdevice notifier. There is
> ovs_dp_device_notifier in openvswitch and it seems to be a handler for
> the situation above. But I'm not sure it's safe, and it's not clear for
> me why it does not take ovl_mutex to synchronize with ovs_vport_cmd_fill_info(),
> and whether there is another synchronization.
> 
>> At first glance it looks like the bug is that we are running a control
>> path of the networking stack without the rtnl lock. So it may be that
>> ASSERT_RTNL() is the better fix.
> 
> Do you mean inserting ASSERT_RTNL() into peernet2id_alloc()?
> If so, it's not need, as all nsid are protected via separate lock starting from
> 
> commit 95f38411df05 "netns: use a spin_lock to protect nsid management"
> 
> and commit de133464c9e7 "netns: make nsid_lock per net".
> 
> nsid primitives are aimed to be safe without respect to rtnl_lock().
> My patch just adds some sanity checks to make them more safe.
>  
>> Given that it would be nice to reduce the scope of the rtnl lock this
>> might not be a bad direction.  Let me see.
>>> Is rtnl_notify safe without the rtnl lock?
> 
> It seems they have to be safe, as there is only sending a skb to netlink.
> rtnl_notify() is used without rtnl_lock() in most other places of net code.
> 
> The only reason rtnl_lock() is needed in cleanup_net() is we unlink net from the list.
> rtnl_lock() around nsid cleanup is leftover after nsid are made protected via
> separate lock. So we may relax rtnl_lock() in cleanup_net() and I'm going to do that
> in one of next patches (not yet sent).
> 
>>>
>>> Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
>>> ---
>>>  net/core/net_namespace.c |   23 +++++++++++++++++++----
>>>  1 file changed, 19 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
>>> index 60a71be75aea..6a4eab438221 100644
>>> --- a/net/core/net_namespace.c
>>> +++ b/net/core/net_namespace.c
>>> @@ -221,17 +221,32 @@ static void rtnl_net_notifyid(struct net *net, int cmd, int id);
>>>   */
>>>  int peernet2id_alloc(struct net *net, struct net *peer)
>>>  {
>>> -	bool alloc;
>>> +	bool alloc = false, alive = false;
>>>  	int id;
>>
>>         ^^^ Perhaps we want "ASSERT_RTNL();" here?
>>>  
>>> -	if (atomic_read(&net->count) == 0)
>>> -		return NETNSA_NSID_NOT_ASSIGNED;
>>
>> Moving this hunk is of no benefit.  The code must be called with a valid
>> reference to net.   Which means net->count is a fancy way of testing to
>> see if the code is in cleanup_net.  In all other cases net->count should
>> be non-zero and it should remain that way because of our caller must
>> keep a reference.
> 
> Let me discuss about this.
> This interface is exported to drivers, and they don't always follow the rules
> about what is allowed or disallowed. This hunk adds sanity checks, which
> protects us from bad drivers code etc. Some people want to use peernet2id_alloc()
> in generic functions, which may be called in different context (also, for net
> obtained from rcu list). Instead of every user makes such a test in its function,
> this hunks introduces generic valid check, which allows people do not insert n+1
> their own checks.
> 
> Also, unlocked check does not give performance advantages as later we take the lock
> *almost always*. So, my suggestion is to make peernet2id_alloc() more safe for
> end user.
>  
>>>  	spin_lock_bh(&net->nsid_lock);
>>> -	alloc = atomic_read(&peer->count) == 0 ? false : true;
>>> +	/* Spinlock guarantees we never hash a peer to net->netns_ids
>>> +	 * after idr_destroy(&net->netns_ids) occurs in cleanup_net().
>>> +	 */
>>> +	if (atomic_read(&net->count) == 0) {
>>> +		id = NETNSA_NSID_NOT_ASSIGNED;
>>> +		goto unlock;
>>> +	}
>>> +	/*
>>> +	 * When peer is obtained from RCU lists, we may race with
>>> +	 * its cleanup. Check whether it's alive, and this guarantees
>>> +	 * we never hash a peer back to net->netns_ids, after it has
>>> +	 * just been idr_remove()'d from there in cleanup_net().
>>> +	 */
>>> +	if (maybe_get_net(peer))
>>> +		alive = alloc = true;
>>
>> Yes this does seem reasonable.  The more obvious looking code which
>> would return NETNSA_NSID_NOT_ASSIGNED if the peer has a count of 0, is
>> silly as it makes would make it appear that a peer is momentary outside
>> of a network namespace when the peer is in fact moving from one network
>> namespace to another.
>>         
>>>  	id = __peernet2id_alloc(net, peer, &alloc);
>>> +unlock:
>>>  	spin_unlock_bh(&net->nsid_lock);
>>>  	if (alloc && id >= 0)
>>>  		rtnl_net_notifyid(net, RTM_NEWNSID, id);
>>                 ^^^^^^
>>                 Is this safe without the rtnl lock?
> 
> Yes, it seems to be safe. Please, look above.
> 
>>> +	if (alive)
>>> +		put_net(peer);
>>>  	return id;
>>>  }
>>>  EXPORT_SYMBOL_GPL(peernet2id_alloc);
> 
> Kirill
> 


* Re: [PATCH 1/3] net: Fix possible race in peernet2id_alloc()
  2017-12-28 12:55     ` Kirill Tkhai
@ 2017-12-28 17:17       ` David Miller
  0 siblings, 0 replies; 7+ messages in thread
From: David Miller @ 2017-12-28 17:17 UTC (permalink / raw)
  To: ktkhai; +Cc: ebiederm, netdev, eric.dumazet

From: Kirill Tkhai <ktkhai@virtuozzo.com>
Date: Thu, 28 Dec 2017 15:55:15 +0300

> Could you please clarify the status or what I should do with the patchset
> (because it's not clear for me)?

Please resend.

