All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] SCTP: fix race between sctp_bind_addr_free() and sctp_bind_addr_conflict()
@ 2011-05-18  7:01 Jacek Luczak
  2011-05-18  7:48 ` Eric Dumazet
  0 siblings, 1 reply; 16+ messages in thread
From: Jacek Luczak @ 2011-05-18  7:01 UTC (permalink / raw)
  To: netdev

[-- Attachment #1: Type: text/plain, Size: 1717 bytes --]

During the sctp_close() call, we do not use rcu primitives to
destroy the address list attached to the endpoint.  At the same
time, we do the removal of addresses from this list before
attempting to remove the socket from the port hash

As a result, it is possible for another process to find the socket
in the port hash that is in the process of being closed.  It then
proceeds to traverse the address list to find the conflict, only
to have that address list suddenly disappear without rcu() critical
section.

This can result in a kernel crash with general protection fault or
kernel NULL pointer dereference.

Fix issue by closing address list removal inside RCU critical
section.

Signed-off-by: Jacek Luczak <luczak.jacek@gmail.com>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>

---
 bind_addr.c |   12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/net/sctp/bind_addr.c b/net/sctp/bind_addr.c
index faf71d1..19d1329 100644
--- a/net/sctp/bind_addr.c
+++ b/net/sctp/bind_addr.c
@@ -155,8 +155,16 @@ static void sctp_bind_addr_clean(struct sctp_bind_addr *bp)
 /* Dispose of an SCTP_bind_addr structure  */
 void sctp_bind_addr_free(struct sctp_bind_addr *bp)
 {
-       /* Empty the bind address list. */
-       sctp_bind_addr_clean(bp);
+       struct sctp_sockaddr_entry *addr;
+
+       /* Empty the bind address list inside RCU section. */
+       rcu_read_lock();
+       list_for_each_entry_rcu(addr, &bp->address_list, list) {
+               list_del_rcu(&addr->list);
+               call_rcu(&addr->rcu, sctp_local_addr_free);
+               SCTP_DBG_OBJCNT_DEC(addr);
+       }
+       rcu_read_unlock();

        if (bp->malloced) {
                kfree(bp);

[-- Attachment #2: sctp_fix_close_vs_conflict_race_in_bind_addr.patch --]
[-- Type: application/octet-stream, Size: 732 bytes --]

diff --git a/net/sctp/bind_addr.c b/net/sctp/bind_addr.c
index faf71d1..19d1329 100644
--- a/net/sctp/bind_addr.c
+++ b/net/sctp/bind_addr.c
@@ -155,8 +155,16 @@ static void sctp_bind_addr_clean(struct sctp_bind_addr *bp)
 /* Dispose of an SCTP_bind_addr structure  */
 void sctp_bind_addr_free(struct sctp_bind_addr *bp)
 {
-	/* Empty the bind address list. */
-	sctp_bind_addr_clean(bp);
+	struct sctp_sockaddr_entry *addr;
+
+	/* Empty the bind address list inside RCU section. */
+	rcu_read_lock();
+	list_for_each_entry_rcu(addr, &bp->address_list, list) {
+		list_del_rcu(&addr->list);
+		call_rcu(&addr->rcu, sctp_local_addr_free);
+		SCTP_DBG_OBJCNT_DEC(addr);
+	}
+	rcu_read_unlock();
 
 	if (bp->malloced) {
 		kfree(bp);

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH] SCTP: fix race between sctp_bind_addr_free() and sctp_bind_addr_conflict()
  2011-05-18  7:01 [PATCH] SCTP: fix race between sctp_bind_addr_free() and sctp_bind_addr_conflict() Jacek Luczak
@ 2011-05-18  7:48 ` Eric Dumazet
  2011-05-18  8:06   ` Jacek Luczak
  0 siblings, 1 reply; 16+ messages in thread
From: Eric Dumazet @ 2011-05-18  7:48 UTC (permalink / raw)
  To: Jacek Luczak; +Cc: netdev, Vlad Yasevich

Le mercredi 18 mai 2011 à 09:01 +0200, Jacek Luczak a écrit :
> During the sctp_close() call, we do not use rcu primitives to
> destroy the address list attached to the endpoint.  At the same
> time, we do the removal of addresses from this list before
> attempting to remove the socket from the port hash
> 
> As a result, it is possible for another process to find the socket
> in the port hash that is in the process of being closed.  It then
> proceeds to traverse the address list to find the conflict, only
> to have that address list suddenly disappear without rcu() critical
> section.
> 
> This can result in a kernel crash with general protection fault or
> kernel NULL pointer dereference.
> 
> Fix issue by closing address list removal inside RCU critical
> section.
> 
> Signed-off-by: Jacek Luczak <luczak.jacek@gmail.com>
> Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
> 
> ---
>  bind_addr.c |   12 ++++++++++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/net/sctp/bind_addr.c b/net/sctp/bind_addr.c
> index faf71d1..19d1329 100644
> --- a/net/sctp/bind_addr.c
> +++ b/net/sctp/bind_addr.c
> @@ -155,8 +155,16 @@ static void sctp_bind_addr_clean(struct sctp_bind_addr *bp)
>  /* Dispose of an SCTP_bind_addr structure  */
>  void sctp_bind_addr_free(struct sctp_bind_addr *bp)
>  {
> -       /* Empty the bind address list. */
> -       sctp_bind_addr_clean(bp);
> +       struct sctp_sockaddr_entry *addr;
> +
> +       /* Empty the bind address list inside RCU section. */
> +       rcu_read_lock();
> +       list_for_each_entry_rcu(addr, &bp->address_list, list) {
> +               list_del_rcu(&addr->list);
> +               call_rcu(&addr->rcu, sctp_local_addr_free);
> +               SCTP_DBG_OBJCNT_DEC(addr);
> +       }
> +       rcu_read_unlock();
> 

Sorry this looks odd.

If you're removing items from this list, you must be a writer here, with
exclusive access. So rcu_read_lock()/rcu_read_unlock() is not necessary.

Therefore, I guess following code is better :

list_for_each_entry(addr, &bp->address_list, list) {
	list_del_rcu(&addr->list);
	call_rcu(&addr->rcu, sctp_local_addr_free);
	SCTP_DBG_OBJCNT_DEC(addr);
}

Then, why dont you fix sctp_bind_addr_clean() instead ?

if 'struct sctp_sockaddr_entry' is recu protected, then all frees should
be protected as well.




^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] SCTP: fix race between sctp_bind_addr_free() and sctp_bind_addr_conflict()
  2011-05-18  7:48 ` Eric Dumazet
@ 2011-05-18  8:06   ` Jacek Luczak
  2011-05-18  8:29     ` Eric Dumazet
  0 siblings, 1 reply; 16+ messages in thread
From: Jacek Luczak @ 2011-05-18  8:06 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev, Vlad Yasevich

2011/5/18 Eric Dumazet <eric.dumazet@gmail.com>:
> Le mercredi 18 mai 2011 à 09:01 +0200, Jacek Luczak a écrit :
>> During the sctp_close() call, we do not use rcu primitives to
>> destroy the address list attached to the endpoint.  At the same
>> time, we do the removal of addresses from this list before
>> attempting to remove the socket from the port hash
>>
>> As a result, it is possible for another process to find the socket
>> in the port hash that is in the process of being closed.  It then
>> proceeds to traverse the address list to find the conflict, only
>> to have that address list suddenly disappear without rcu() critical
>> section.
>>
>> This can result in a kernel crash with general protection fault or
>> kernel NULL pointer dereference.
>>
>> Fix issue by closing address list removal inside RCU critical
>> section.
>>
>> Signed-off-by: Jacek Luczak <luczak.jacek@gmail.com>
>> Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
>>
>> ---
>>  bind_addr.c |   12 ++++++++++--
>>  1 file changed, 10 insertions(+), 2 deletions(-)
>>
>> diff --git a/net/sctp/bind_addr.c b/net/sctp/bind_addr.c
>> index faf71d1..19d1329 100644
>> --- a/net/sctp/bind_addr.c
>> +++ b/net/sctp/bind_addr.c
>> @@ -155,8 +155,16 @@ static void sctp_bind_addr_clean(struct sctp_bind_addr *bp)
>>  /* Dispose of an SCTP_bind_addr structure  */
>>  void sctp_bind_addr_free(struct sctp_bind_addr *bp)
>>  {
>> -       /* Empty the bind address list. */
>> -       sctp_bind_addr_clean(bp);
>> +       struct sctp_sockaddr_entry *addr;
>> +
>> +       /* Empty the bind address list inside RCU section. */
>> +       rcu_read_lock();
>> +       list_for_each_entry_rcu(addr, &bp->address_list, list) {
>> +               list_del_rcu(&addr->list);
>> +               call_rcu(&addr->rcu, sctp_local_addr_free);
>> +               SCTP_DBG_OBJCNT_DEC(addr);
>> +       }
>> +       rcu_read_unlock();
>>
>
> Sorry this looks odd.
>
> If you're removing items from this list, you must be a writer here, with
> exclusive access. So rcu_read_lock()/rcu_read_unlock() is not necessary.

I could agree to some extend ... but strict RCU section IMO is needed here.
I can check this if the issue exists.

> Therefore, I guess following code is better :
>
> list_for_each_entry(addr, &bp->address_list, list) {
>        list_del_rcu(&addr->list);
>        call_rcu(&addr->rcu, sctp_local_addr_free);
>        SCTP_DBG_OBJCNT_DEC(addr);
> }
>
> Then, why dont you fix sctp_bind_addr_clean() instead ?
>
> if 'struct sctp_sockaddr_entry' is recu protected, then all frees should
> be protected as well.

The _clean() as claimed by Vlad is called many times from various places
in code and this could give a overhead. I guess Vlad would need to comment.

-Jacek

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] SCTP: fix race between sctp_bind_addr_free() and sctp_bind_addr_conflict()
  2011-05-18  8:06   ` Jacek Luczak
@ 2011-05-18  8:29     ` Eric Dumazet
  2011-05-18  9:02       ` Wei Yongjun
  2011-05-18 12:06       ` Jacek Luczak
  0 siblings, 2 replies; 16+ messages in thread
From: Eric Dumazet @ 2011-05-18  8:29 UTC (permalink / raw)
  To: Jacek Luczak; +Cc: netdev, Vlad Yasevich

Le mercredi 18 mai 2011 à 10:06 +0200, Jacek Luczak a écrit :
> 2011/5/18 Eric Dumazet <eric.dumazet@gmail.com>:

> > If you're removing items from this list, you must be a writer here, with
> > exclusive access. So rcu_read_lock()/rcu_read_unlock() is not necessary.
> 
> I could agree to some extend ... but strict RCU section IMO is needed here.
> I can check this if the issue exists.
> 

I can tell you for sure rcu_read_lock() is not needed here. Its only
showing confusion from code's author.

Please read Documentation/RCU/listRCU.txt for concise explanations,
line 117.


> > Therefore, I guess following code is better :
> >
> > list_for_each_entry(addr, &bp->address_list, list) {
> >        list_del_rcu(&addr->list);
> >        call_rcu(&addr->rcu, sctp_local_addr_free);
> >        SCTP_DBG_OBJCNT_DEC(addr);
> > }
> >
> > Then, why dont you fix sctp_bind_addr_clean() instead ?
> >
> > if 'struct sctp_sockaddr_entry' is recu protected, then all frees should
> > be protected as well.
> 
> The _clean() as claimed by Vlad is called many times from various places
> in code and this could give a overhead. I guess Vlad would need to comment.

I guess a full review of this code is needed. You'll have to prove
sctp_bind_addr_clean() is always called after one RCU grace period if
you want to leave it as is.

You cant get RCU for free, some rules must be followed or you risk
crashes.




^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] SCTP: fix race between sctp_bind_addr_free() and sctp_bind_addr_conflict()
  2011-05-18  8:29     ` Eric Dumazet
@ 2011-05-18  9:02       ` Wei Yongjun
  2011-05-18 11:01         ` Jacek Luczak
  2011-05-18 12:33         ` Vladislav Yasevich
  2011-05-18 12:06       ` Jacek Luczak
  1 sibling, 2 replies; 16+ messages in thread
From: Wei Yongjun @ 2011-05-18  9:02 UTC (permalink / raw)
  To: Jacek Luczak; +Cc: Eric Dumazet, netdev, Vlad Yasevich


> Le mercredi 18 mai 2011 à 10:06 +0200, Jacek Luczak a écrit :
>> 2011/5/18 Eric Dumazet <eric.dumazet@gmail.com>:
>>> If you're removing items from this list, you must be a writer here, with
>>> exclusive access. So rcu_read_lock()/rcu_read_unlock() is not necessary.
>> I could agree to some extend ... but strict RCU section IMO is needed here.
>> I can check this if the issue exists.
>>
> I can tell you for sure rcu_read_lock() is not needed here. Its only
> showing confusion from code's author.
>
> Please read Documentation/RCU/listRCU.txt for concise explanations,
> line 117.
>
>
>>> Therefore, I guess following code is better :
>>>
>>> list_for_each_entry(addr, &bp->address_list, list) {
>>>        list_del_rcu(&addr->list);
>>>        call_rcu(&addr->rcu, sctp_local_addr_free);
>>>        SCTP_DBG_OBJCNT_DEC(addr);
>>> }
>>>
>>> Then, why dont you fix sctp_bind_addr_clean() instead ?
>>>
>>> if 'struct sctp_sockaddr_entry' is recu protected, then all frees should
>>> be protected as well.
>> The _clean() as claimed by Vlad is called many times from various places
>> in code and this could give a overhead. I guess Vlad would need to comment.
> I guess a full review of this code is needed. You'll have to prove
> sctp_bind_addr_clean() is always called after one RCU grace period if
> you want to leave it as is.
>
> You cant get RCU for free, some rules must be followed or you risk
> crashes.
>

fix the race between sctp_bind_addr_free() and sctp_bind_addr_conflict(), maybe you just
need to remove the socket from the port hash before empty the bind address list.
some thing like this, not test.

diff --git a/net/sctp/endpointola.c b/net/sctp/endpointola.c
index c8cc24e..924d846 100644
--- a/net/sctp/endpointola.c
+++ b/net/sctp/endpointola.c
@@ -268,12 +268,13 @@ static void sctp_endpoint_destroy(struct sctp_endpoint *ep)
 
 	/* Cleanup. */
 	sctp_inq_free(&ep->base.inqueue);
-	sctp_bind_addr_free(&ep->base.bind_addr);
 
 	/* Remove and free the port */
 	if (sctp_sk(ep->base.sk)->bind_hash)
 		sctp_put_port(ep->base.sk);
 
+	sctp_bind_addr_free(&ep->base.bind_addr);
+
 	/* Give up our hold on the sock. */
 	if (ep->base.sk)
 		sock_put(ep->base.sk);



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH] SCTP: fix race between sctp_bind_addr_free() and sctp_bind_addr_conflict()
  2011-05-18  9:02       ` Wei Yongjun
@ 2011-05-18 11:01         ` Jacek Luczak
  2011-05-18 11:41           ` Eric Dumazet
  2011-05-18 12:33         ` Vladislav Yasevich
  1 sibling, 1 reply; 16+ messages in thread
From: Jacek Luczak @ 2011-05-18 11:01 UTC (permalink / raw)
  To: Wei Yongjun; +Cc: Eric Dumazet, netdev, Vlad Yasevich

2011/5/18 Wei Yongjun <yjwei@cn.fujitsu.com>:
>
>> Le mercredi 18 mai 2011 à 10:06 +0200, Jacek Luczak a écrit :
>>> 2011/5/18 Eric Dumazet <eric.dumazet@gmail.com>:
>>>> If you're removing items from this list, you must be a writer here, with
>>>> exclusive access. So rcu_read_lock()/rcu_read_unlock() is not necessary.
>>> I could agree to some extend ... but strict RCU section IMO is needed here.
>>> I can check this if the issue exists.
>>>
>> I can tell you for sure rcu_read_lock() is not needed here. Its only
>> showing confusion from code's author.
>>
>> Please read Documentation/RCU/listRCU.txt for concise explanations,
>> line 117.
>>
>>
>>>> Therefore, I guess following code is better :
>>>>
>>>> list_for_each_entry(addr, &bp->address_list, list) {
>>>>        list_del_rcu(&addr->list);
>>>>        call_rcu(&addr->rcu, sctp_local_addr_free);
>>>>        SCTP_DBG_OBJCNT_DEC(addr);
>>>> }
>>>>
>>>> Then, why dont you fix sctp_bind_addr_clean() instead ?
>>>>
>>>> if 'struct sctp_sockaddr_entry' is recu protected, then all frees should
>>>> be protected as well.
>>> The _clean() as claimed by Vlad is called many times from various places
>>> in code and this could give a overhead. I guess Vlad would need to comment.
>> I guess a full review of this code is needed. You'll have to prove
>> sctp_bind_addr_clean() is always called after one RCU grace period if
>> you want to leave it as is.
>>
>> You cant get RCU for free, some rules must be followed or you risk
>> crashes.
>>
>
> fix the race between sctp_bind_addr_free() and sctp_bind_addr_conflict(), maybe you just
> need to remove the socket from the port hash before empty the bind address list.
> some thing like this, not test.
>
> diff --git a/net/sctp/endpointola.c b/net/sctp/endpointola.c
> index c8cc24e..924d846 100644
> --- a/net/sctp/endpointola.c
> +++ b/net/sctp/endpointola.c
> @@ -268,12 +268,13 @@ static void sctp_endpoint_destroy(struct sctp_endpoint *ep)
>
>        /* Cleanup. */
>        sctp_inq_free(&ep->base.inqueue);
> -       sctp_bind_addr_free(&ep->base.bind_addr);
>
>        /* Remove and free the port */
>        if (sctp_sk(ep->base.sk)->bind_hash)
>                sctp_put_port(ep->base.sk);
>
> +       sctp_bind_addr_free(&ep->base.bind_addr);
> +
>        /* Give up our hold on the sock. */
>        if (ep->base.sk)
>                sock_put(ep->base.sk);
>

Done tests with that one and looks like it survive the flood.

How then this will be handled is up to you. Both ways fix
the issue which makes me happy as damn hell.

@Eric, if you will take a look into the code you might notice
that there are few places where list operations could be
optimised and the main question here is do we really care
to have the data ,,safe'' so that things like that won't popup.
The good example can be the set of _local_ functions.
Ahhh... and I'm aware of how tricky can be abuse of RCU.

-Jacek

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] SCTP: fix race between sctp_bind_addr_free() and sctp_bind_addr_conflict()
  2011-05-18 11:01         ` Jacek Luczak
@ 2011-05-18 11:41           ` Eric Dumazet
  2011-05-18 11:58             ` Jacek Luczak
  0 siblings, 1 reply; 16+ messages in thread
From: Eric Dumazet @ 2011-05-18 11:41 UTC (permalink / raw)
  To: Jacek Luczak; +Cc: Wei Yongjun, netdev, Vlad Yasevich

Le mercredi 18 mai 2011 à 13:01 +0200, Jacek Luczak a écrit :

> @Eric, if you will take a look into the code you might notice
> that there are few places where list operations could be
> optimised and the main question here is do we really care
> to have the data ,,safe'' so that things like that won't popup.
> The good example can be the set of _local_ functions.
> Ahhh... and I'm aware of how tricky can be abuse of RCU.

I took a quick look at existing rcu_read_lock() uses in sctp and did not
find other problems [or optimizations if you prefer this point of view].
Please elaborate.




^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] SCTP: fix race between sctp_bind_addr_free() and sctp_bind_addr_conflict()
  2011-05-18 11:41           ` Eric Dumazet
@ 2011-05-18 11:58             ` Jacek Luczak
  0 siblings, 0 replies; 16+ messages in thread
From: Jacek Luczak @ 2011-05-18 11:58 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Wei Yongjun, netdev, Vlad Yasevich

2011/5/18 Eric Dumazet <eric.dumazet@gmail.com>:
> Le mercredi 18 mai 2011 à 13:01 +0200, Jacek Luczak a écrit :
>
>> @Eric, if you will take a look into the code you might notice
>> that there are few places where list operations could be
>> optimised and the main question here is do we really care
>> to have the data ,,safe'' so that things like that won't popup.
>> The good example can be the set of _local_ functions.

There should be a vertical space here ...

>> Ahhh... and I'm aware of how tricky can be abuse of RCU.
>
> I took a quick look at existing rcu_read_lock() uses in sctp and did not
> find other problems [or optimizations if you prefer this point of view].
> Please elaborate.

... so that the last line does not have any connection with what I've
wrote above.

I get your point Eric so you don't really have to prove it - if you still need.

-jacek

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] SCTP: fix race between sctp_bind_addr_free() and sctp_bind_addr_conflict()
  2011-05-18  8:29     ` Eric Dumazet
  2011-05-18  9:02       ` Wei Yongjun
@ 2011-05-18 12:06       ` Jacek Luczak
  1 sibling, 0 replies; 16+ messages in thread
From: Jacek Luczak @ 2011-05-18 12:06 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev, Vlad Yasevich

2011/5/18 Eric Dumazet <eric.dumazet@gmail.com>:
> Le mercredi 18 mai 2011 à 10:06 +0200, Jacek Luczak a écrit :
>> 2011/5/18 Eric Dumazet <eric.dumazet@gmail.com>:
>
>> > If you're removing items from this list, you must be a writer here, with
>> > exclusive access. So rcu_read_lock()/rcu_read_unlock() is not necessary.
>>
>> I could agree to some extend ... but strict RCU section IMO is needed here.
>> I can check this if the issue exists.
>>
>
> I can tell you for sure rcu_read_lock() is not needed here. Its only
> showing confusion from code's author.
>
> Please read Documentation/RCU/listRCU.txt for concise explanations,
> line 117.
>
>
>> > Therefore, I guess following code is better :
>> >
>> > list_for_each_entry(addr, &bp->address_list, list) {
>> >        list_del_rcu(&addr->list);
>> >        call_rcu(&addr->rcu, sctp_local_addr_free);
>> >        SCTP_DBG_OBJCNT_DEC(addr);
>> > }
>> >
>> > Then, why dont you fix sctp_bind_addr_clean() instead ?
>> >
>> > if 'struct sctp_sockaddr_entry' is recu protected, then all frees should
>> > be protected as well.
>>
>> The _clean() as claimed by Vlad is called many times from various places
>> in code and this could give a overhead. I guess Vlad would need to comment.
>
> I guess a full review of this code is needed. You'll have to prove
> sctp_bind_addr_clean() is always called after one RCU grace period if
> you want to leave it as is.

Eric this is actually good point which I think did not found at first
glance. As the original
issue occurs between _clean() and _conflict() then if the former is
used here and
there we can hit the grace period not only in that case. By that then
_clean() should
be ,,fixed''. Right?

-Jacek

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] SCTP: fix race between sctp_bind_addr_free() and sctp_bind_addr_conflict()
  2011-05-18  9:02       ` Wei Yongjun
  2011-05-18 11:01         ` Jacek Luczak
@ 2011-05-18 12:33         ` Vladislav Yasevich
  2011-05-18 12:47           ` Jacek Luczak
  1 sibling, 1 reply; 16+ messages in thread
From: Vladislav Yasevich @ 2011-05-18 12:33 UTC (permalink / raw)
  To: Wei Yongjun; +Cc: Jacek Luczak, Eric Dumazet, netdev

On 05/18/2011 05:02 AM, Wei Yongjun wrote:
 
> fix the race between sctp_bind_addr_free() and sctp_bind_addr_conflict(), maybe you just
> need to remove the socket from the port hash before empty the bind address list.
> some thing like this, not test.
> 
> diff --git a/net/sctp/endpointola.c b/net/sctp/endpointola.c
> index c8cc24e..924d846 100644
> --- a/net/sctp/endpointola.c
> +++ b/net/sctp/endpointola.c
> @@ -268,12 +268,13 @@ static void sctp_endpoint_destroy(struct sctp_endpoint *ep)
>  
>  	/* Cleanup. */
>  	sctp_inq_free(&ep->base.inqueue);
> -	sctp_bind_addr_free(&ep->base.bind_addr);
>  
>  	/* Remove and free the port */
>  	if (sctp_sk(ep->base.sk)->bind_hash)
>  		sctp_put_port(ep->base.sk);
>  
> +	sctp_bind_addr_free(&ep->base.bind_addr);
> +
>  	/* Give up our hold on the sock. */
>  	if (ep->base.sk)
>  		sock_put(ep->base.sk);
> 
> 

I am not sure that this will guarantee avoidance of this crash, even though it is the right
thing to do in general.

We simply make the race condition much smaller and much harder to hit.  Potentially, one
cpu may be doing lookup of the socket while the other is doing the destroy.  If the lookup cpu
finds the socket just as this code removes the socket from the hash, we still have this potential
race condition.

I agree with Eric, rcu_read_lock() is not strictly necessary, as what we are really after is call_rcu()
based destruction.  We need to delay memory destruction for the rcu grace period.

Thinking a little more about how bind_addr_clean() is used, it would probably benefit from getting
converted to call_rcu().  That function is used as local clean-up in case of malloc failure; however,
that doesn't preclude a potential race.  The fact that we do not have this race simply points out that
we don't have any kind of parallel lookup that can hit it (and the code confirms it).  This doesn't
mean that we wouldn't have it in the future.

-vlad

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] SCTP: fix race between sctp_bind_addr_free() and sctp_bind_addr_conflict()
  2011-05-18 12:33         ` Vladislav Yasevich
@ 2011-05-18 12:47           ` Jacek Luczak
  2011-05-18 12:50             ` Eric Dumazet
  0 siblings, 1 reply; 16+ messages in thread
From: Jacek Luczak @ 2011-05-18 12:47 UTC (permalink / raw)
  To: Vladislav Yasevich; +Cc: Wei Yongjun, Eric Dumazet, netdev

2011/5/18 Vladislav Yasevich <vladislav.yasevich@hp.com>:
> On 05/18/2011 05:02 AM, Wei Yongjun wrote:
>
>> fix the race between sctp_bind_addr_free() and sctp_bind_addr_conflict(), maybe you just
>> need to remove the socket from the port hash before empty the bind address list.
>> some thing like this, not test.
>>
>> diff --git a/net/sctp/endpointola.c b/net/sctp/endpointola.c
>> index c8cc24e..924d846 100644
>> --- a/net/sctp/endpointola.c
>> +++ b/net/sctp/endpointola.c
>> @@ -268,12 +268,13 @@ static void sctp_endpoint_destroy(struct sctp_endpoint *ep)
>>
>>       /* Cleanup. */
>>       sctp_inq_free(&ep->base.inqueue);
>> -     sctp_bind_addr_free(&ep->base.bind_addr);
>>
>>       /* Remove and free the port */
>>       if (sctp_sk(ep->base.sk)->bind_hash)
>>               sctp_put_port(ep->base.sk);
>>
>> +     sctp_bind_addr_free(&ep->base.bind_addr);
>> +
>>       /* Give up our hold on the sock. */
>>       if (ep->base.sk)
>>               sock_put(ep->base.sk);
>>
>>
>
> I am not sure that this will guarantee avoidance of this crash, even though it is the right
> thing to do in general.
>
> We simply make the race condition much smaller and much harder to hit.  Potentially, one
> cpu may be doing lookup of the socket while the other is doing the destroy.  If the lookup cpu
> finds the socket just as this code removes the socket from the hash, we still have this potential
> race condition.
>
> I agree with Eric, rcu_read_lock() is not strictly necessary, as what we are really after is call_rcu()
> based destruction.  We need to delay memory destruction for the rcu grace period.
>
> Thinking a little more about how bind_addr_clean() is used, it would probably benefit from getting
> converted to call_rcu().  That function is used as local clean-up in case of malloc failure; however,
> that doesn't preclude a potential race.  The fact that we do not have this race simply points out that
> we don't have any kind of parallel lookup that can hit it (and the code confirms it).  This doesn't
> mean that we wouldn't have it in the future.
>

OK then, at the end what Eric suggested is IMO valid:

diff --git a/net/sctp/bind_addr.c b/net/sctp/bind_addr.c
index faf71d1..0025d90 100644
--- a/net/sctp/bind_addr.c
+++ b/net/sctp/bind_addr.c
@@ -144,10 +144,9 @@ static void sctp_bind_addr_clean(struct sctp_bind_addr *bp)
        struct list_head *pos, *temp;

        /* Empty the bind address list. */
-       list_for_each_safe(pos, temp, &bp->address_list) {
-               addr = list_entry(pos, struct sctp_sockaddr_entry, list);
-               list_del(pos);
-               kfree(addr);
+       list_for_each_entry(pos, &bp->address_list, list) {
+               list_del_rcu(&pos->list);
+               call_rcu(&pos->rcu, sctp_local_addr_free);
                SCTP_DBG_OBJCNT_DEC(addr);
        }
 }


I will test this. Should be safe, avoid race not only in that case and
it consistent.

-Jacek

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH] SCTP: fix race between sctp_bind_addr_free() and sctp_bind_addr_conflict()
  2011-05-18 12:47           ` Jacek Luczak
@ 2011-05-18 12:50             ` Eric Dumazet
  2011-05-18 13:11               ` Jacek Luczak
  0 siblings, 1 reply; 16+ messages in thread
From: Eric Dumazet @ 2011-05-18 12:50 UTC (permalink / raw)
  To: Jacek Luczak; +Cc: Vladislav Yasevich, Wei Yongjun, netdev

Le mercredi 18 mai 2011 à 14:47 +0200, Jacek Luczak a écrit :

> OK then, at the end what Eric suggested is IMO valid:
> 
> diff --git a/net/sctp/bind_addr.c b/net/sctp/bind_addr.c
> index faf71d1..0025d90 100644
> --- a/net/sctp/bind_addr.c
> +++ b/net/sctp/bind_addr.c
> @@ -144,10 +144,9 @@ static void sctp_bind_addr_clean(struct sctp_bind_addr *bp)
>         struct list_head *pos, *temp;
> 
>         /* Empty the bind address list. */
> -       list_for_each_safe(pos, temp, &bp->address_list) {
> -               addr = list_entry(pos, struct sctp_sockaddr_entry, list);
> -               list_del(pos);
> -               kfree(addr);
> +       list_for_each_entry(pos, &bp->address_list, list) {

a 'safe' version is needed here, since we remove items in iterator.

> +               list_del_rcu(&pos->list);
> +               call_rcu(&pos->rcu, sctp_local_addr_free);
>                 SCTP_DBG_OBJCNT_DEC(addr);
>         }
>  }
> 
> 
> I will test this. Should be safe, avoid race not only in that case and
> it consistent.
> 
> -Jacek



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] SCTP: fix race between sctp_bind_addr_free() and sctp_bind_addr_conflict()
  2011-05-18 12:50             ` Eric Dumazet
@ 2011-05-18 13:11               ` Jacek Luczak
  2011-05-18 13:20                 ` Eric Dumazet
  2011-05-18 13:32                 ` Vladislav Yasevich
  0 siblings, 2 replies; 16+ messages in thread
From: Jacek Luczak @ 2011-05-18 13:11 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Vladislav Yasevich, Wei Yongjun, netdev

2011/5/18 Eric Dumazet <eric.dumazet@gmail.com>:
> Le mercredi 18 mai 2011 à 14:47 +0200, Jacek Luczak a écrit :
>
>> OK then, at the end what Eric suggested is IMO valid:
>>
>> diff --git a/net/sctp/bind_addr.c b/net/sctp/bind_addr.c
>> index faf71d1..0025d90 100644
>> --- a/net/sctp/bind_addr.c
>> +++ b/net/sctp/bind_addr.c
>> @@ -144,10 +144,9 @@ static void sctp_bind_addr_clean(struct sctp_bind_addr *bp)
>>         struct list_head *pos, *temp;
>>
>>         /* Empty the bind address list. */
>> -       list_for_each_safe(pos, temp, &bp->address_list) {
>> -               addr = list_entry(pos, struct sctp_sockaddr_entry, list);
>> -               list_del(pos);
>> -               kfree(addr);
>> +       list_for_each_entry(pos, &bp->address_list, list) {
>
> a 'safe' version is needed here, since we remove items in iterator.

Yep.

diff --git a/net/sctp/bind_addr.c b/net/sctp/bind_addr.c
index faf71d1..6150ac5 100644
--- a/net/sctp/bind_addr.c
+++ b/net/sctp/bind_addr.c
@@ -140,14 +140,12 @@ void sctp_bind_addr_init(struct sctp_bind_addr
*bp, __u16 port)
 /* Dispose of the address list. */
 static void sctp_bind_addr_clean(struct sctp_bind_addr *bp)
 {
-       struct sctp_sockaddr_entry *addr;
-       struct list_head *pos, *temp;
+       struct sctp_sockaddr_entry *addr, *temp;

        /* Empty the bind address list. */
-       list_for_each_safe(pos, temp, &bp->address_list) {
-               addr = list_entry(pos, struct sctp_sockaddr_entry, list);
-               list_del(pos);
-               kfree(addr);
+       list_for_each_entry_safe(addr, temp, &bp->address_list, list) {
+               list_del_rcu(&addr->list);
+               call_rcu(&addr->rcu, sctp_local_addr_free);
                SCTP_DBG_OBJCNT_DEC(addr);
        }
 }

Does it now look good?

-Jacek

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH] SCTP: fix race between sctp_bind_addr_free() and sctp_bind_addr_conflict()
  2011-05-18 13:11               ` Jacek Luczak
@ 2011-05-18 13:20                 ` Eric Dumazet
  2011-05-18 13:32                 ` Vladislav Yasevich
  1 sibling, 0 replies; 16+ messages in thread
From: Eric Dumazet @ 2011-05-18 13:20 UTC (permalink / raw)
  To: Jacek Luczak; +Cc: Vladislav Yasevich, Wei Yongjun, netdev

Le mercredi 18 mai 2011 à 15:11 +0200, Jacek Luczak a écrit :

> diff --git a/net/sctp/bind_addr.c b/net/sctp/bind_addr.c
> index faf71d1..6150ac5 100644
> --- a/net/sctp/bind_addr.c
> +++ b/net/sctp/bind_addr.c
> @@ -140,14 +140,12 @@ void sctp_bind_addr_init(struct sctp_bind_addr
> *bp, __u16 port)
>  /* Dispose of the address list. */
>  static void sctp_bind_addr_clean(struct sctp_bind_addr *bp)
>  {
> -       struct sctp_sockaddr_entry *addr;
> -       struct list_head *pos, *temp;
> +       struct sctp_sockaddr_entry *addr, *temp;
> 
>         /* Empty the bind address list. */
> -       list_for_each_safe(pos, temp, &bp->address_list) {
> -               addr = list_entry(pos, struct sctp_sockaddr_entry, list);
> -               list_del(pos);
> -               kfree(addr);
> +       list_for_each_entry_safe(addr, temp, &bp->address_list, list) {
> +               list_del_rcu(&addr->list);
> +               call_rcu(&addr->rcu, sctp_local_addr_free);
>                 SCTP_DBG_OBJCNT_DEC(addr);
>         }
>  }
> 
> Does it now look good?
> 

It seems fine yes !




^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] SCTP: fix race between sctp_bind_addr_free() and sctp_bind_addr_conflict()
  2011-05-18 13:11               ` Jacek Luczak
  2011-05-18 13:20                 ` Eric Dumazet
@ 2011-05-18 13:32                 ` Vladislav Yasevich
  2011-05-18 13:39                   ` Jacek Luczak
  1 sibling, 1 reply; 16+ messages in thread
From: Vladislav Yasevich @ 2011-05-18 13:32 UTC (permalink / raw)
  To: Jacek Luczak; +Cc: Eric Dumazet, Wei Yongjun, netdev

On 05/18/2011 09:11 AM, Jacek Luczak wrote:
> 2011/5/18 Eric Dumazet <eric.dumazet@gmail.com>:
>> Le mercredi 18 mai 2011 à 14:47 +0200, Jacek Luczak a écrit :
>>
>>> OK then, at the end what Eric suggested is IMO valid:
>>>
>>> diff --git a/net/sctp/bind_addr.c b/net/sctp/bind_addr.c
>>> index faf71d1..0025d90 100644
>>> --- a/net/sctp/bind_addr.c
>>> +++ b/net/sctp/bind_addr.c
>>> @@ -144,10 +144,9 @@ static void sctp_bind_addr_clean(struct sctp_bind_addr *bp)
>>>         struct list_head *pos, *temp;
>>>
>>>         /* Empty the bind address list. */
>>> -       list_for_each_safe(pos, temp, &bp->address_list) {
>>> -               addr = list_entry(pos, struct sctp_sockaddr_entry, list);
>>> -               list_del(pos);
>>> -               kfree(addr);
>>> +       list_for_each_entry(pos, &bp->address_list, list) {
>>
>> a 'safe' version is needed here, since we remove items in iterator.
> 
> Yep.
> 
> diff --git a/net/sctp/bind_addr.c b/net/sctp/bind_addr.c
> index faf71d1..6150ac5 100644
> --- a/net/sctp/bind_addr.c
> +++ b/net/sctp/bind_addr.c
> @@ -140,14 +140,12 @@ void sctp_bind_addr_init(struct sctp_bind_addr
> *bp, __u16 port)
>  /* Dispose of the address list. */
>  static void sctp_bind_addr_clean(struct sctp_bind_addr *bp)
>  {
> -       struct sctp_sockaddr_entry *addr;
> -       struct list_head *pos, *temp;
> +       struct sctp_sockaddr_entry *addr, *temp;
> 
>         /* Empty the bind address list. */
> -       list_for_each_safe(pos, temp, &bp->address_list) {
> -               addr = list_entry(pos, struct sctp_sockaddr_entry, list);
> -               list_del(pos);
> -               kfree(addr);
> +       list_for_each_entry_safe(addr, temp, &bp->address_list, list) {
> +               list_del_rcu(&addr->list);
> +               call_rcu(&addr->rcu, sctp_local_addr_free);
>                 SCTP_DBG_OBJCNT_DEC(addr);
>         }
>  }
> 
> Does it now look good?

Yes.  It should the fix the race.

-vlad

> 
> -Jacek
> 


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] SCTP: fix race between sctp_bind_addr_free() and sctp_bind_addr_conflict()
  2011-05-18 13:32                 ` Vladislav Yasevich
@ 2011-05-18 13:39                   ` Jacek Luczak
  0 siblings, 0 replies; 16+ messages in thread
From: Jacek Luczak @ 2011-05-18 13:39 UTC (permalink / raw)
  To: Vladislav Yasevich; +Cc: Eric Dumazet, Wei Yongjun, netdev

2011/5/18 Vladislav Yasevich <vladislav.yasevich@hp.com>:
> On 05/18/2011 09:11 AM, Jacek Luczak wrote:
>> 2011/5/18 Eric Dumazet <eric.dumazet@gmail.com>:
>>> Le mercredi 18 mai 2011 à 14:47 +0200, Jacek Luczak a écrit :
>>>
>>>> OK then, at the end what Eric suggested is IMO valid:
>>>>
>>>> diff --git a/net/sctp/bind_addr.c b/net/sctp/bind_addr.c
>>>> index faf71d1..0025d90 100644
>>>> --- a/net/sctp/bind_addr.c
>>>> +++ b/net/sctp/bind_addr.c
>>>> @@ -144,10 +144,9 @@ static void sctp_bind_addr_clean(struct sctp_bind_addr *bp)
>>>>         struct list_head *pos, *temp;
>>>>
>>>>         /* Empty the bind address list. */
>>>> -       list_for_each_safe(pos, temp, &bp->address_list) {
>>>> -               addr = list_entry(pos, struct sctp_sockaddr_entry, list);
>>>> -               list_del(pos);
>>>> -               kfree(addr);
>>>> +       list_for_each_entry(pos, &bp->address_list, list) {
>>>
>>> a 'safe' version is needed here, since we remove items in iterator.
>>
>> Yep.
>>
>> diff --git a/net/sctp/bind_addr.c b/net/sctp/bind_addr.c
>> index faf71d1..6150ac5 100644
>> --- a/net/sctp/bind_addr.c
>> +++ b/net/sctp/bind_addr.c
>> @@ -140,14 +140,12 @@ void sctp_bind_addr_init(struct sctp_bind_addr
>> *bp, __u16 port)
>>  /* Dispose of the address list. */
>>  static void sctp_bind_addr_clean(struct sctp_bind_addr *bp)
>>  {
>> -       struct sctp_sockaddr_entry *addr;
>> -       struct list_head *pos, *temp;
>> +       struct sctp_sockaddr_entry *addr, *temp;
>>
>>         /* Empty the bind address list. */
>> -       list_for_each_safe(pos, temp, &bp->address_list) {
>> -               addr = list_entry(pos, struct sctp_sockaddr_entry, list);
>> -               list_del(pos);
>> -               kfree(addr);
>> +       list_for_each_entry_safe(addr, temp, &bp->address_list, list) {
>> +               list_del_rcu(&addr->list);
>> +               call_rcu(&addr->rcu, sctp_local_addr_free);
>>                 SCTP_DBG_OBJCNT_DEC(addr);
>>         }
>>  }
>>
>> Does it now look good?
>
> Yes.  It should the fix the race.
>

Thanks guys then for your guidance. I will repost final patch.

-Jacek

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2011-05-18 13:44 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-05-18  7:01 [PATCH] SCTP: fix race between sctp_bind_addr_free() and sctp_bind_addr_conflict() Jacek Luczak
2011-05-18  7:48 ` Eric Dumazet
2011-05-18  8:06   ` Jacek Luczak
2011-05-18  8:29     ` Eric Dumazet
2011-05-18  9:02       ` Wei Yongjun
2011-05-18 11:01         ` Jacek Luczak
2011-05-18 11:41           ` Eric Dumazet
2011-05-18 11:58             ` Jacek Luczak
2011-05-18 12:33         ` Vladislav Yasevich
2011-05-18 12:47           ` Jacek Luczak
2011-05-18 12:50             ` Eric Dumazet
2011-05-18 13:11               ` Jacek Luczak
2011-05-18 13:20                 ` Eric Dumazet
2011-05-18 13:32                 ` Vladislav Yasevich
2011-05-18 13:39                   ` Jacek Luczak
2011-05-18 12:06       ` Jacek Luczak

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.