netdev.vger.kernel.org archive mirror
* [PATCH net-next] inet: Always increment refcount in inet_twsk_schedule
From: subashab @ 2015-07-19  3:31 UTC
  To: netdev; +Cc: eric.dumazet

I am seeing a reference count problem with timewait sockets that leads to
an active timer object being freed. It occurs in some data stress test
setups, so I have been unable to determine the exact step at which it
occurs. However, by logging the refcount I was able to identify the code
path that leads to the problem.

//Initialize time wait socket and setup timer
inet_twsk_alloc() tw_refcnt = 0
__inet_twsk_hashdance() tw_refcnt = 3
inet_twsk_schedule() tw_refcnt = 4
inet_twsk_put() tw_refcnt = 3

//Receive packet 1 in timewait state
tcp_timewait_state_process() -> inet_twsk_schedule tw_refcnt = 3 (no change)
TCP: tcp_v4_timewait_ack() -> inet_twsk_put() tw_refcnt = 2

//Receive packet 2 in timewait state
tcp_timewait_state_process() -> inet_twsk_schedule tw_refcnt = 2 (no change)
TCP: tcp_v4_timewait_ack() -> inet_twsk_put() tw_refcnt = 1

//Receive packet 3 in timewait state
tcp_timewait_state_process() -> inet_twsk_schedule tw_refcnt = 1 (no change)
TCP: tcp_v4_timewait_ack() -> inet_twsk_put() tw_refcnt = 0

After this step, the time wait socket is destroyed along with the active
timer object. This leads to a warning being printed which eventually leads
to a crash.

ODEBUG: free active (active state 0) object type: timer_list hint:
tw_timer_handler+0x0/0x68
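
The sequence logged above can be replayed in a small user-space model. This
is only a sketch, not kernel code: struct tw_model and the model_* helpers
are hypothetical stand-ins for tw_refcnt, inet_twsk_schedule() and
inet_twsk_put(), and it assumes, as the trace does, that nothing else on the
receive path takes a reference.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model of a timewait socket; not the kernel structure. */
struct tw_model {
	int refcnt;       /* stands in for tw_refcnt */
	bool timer_armed; /* true while the modelled tw_timer is pending */
	bool freed;       /* set once the last reference is dropped */
};

/* Models inet_twsk_schedule(): a reference is taken only when the timer
 * was not already pending, mirroring the !mod_timer_pinned() check. */
static void model_schedule(struct tw_model *tw)
{
	if (!tw->timer_armed) {
		tw->timer_armed = true;
		tw->refcnt++;
	}
}

/* Models inet_twsk_put(): drop one reference, free on reaching zero. */
static void model_put(struct tw_model *tw)
{
	assert(tw->refcnt > 0);
	if (--tw->refcnt == 0)
		tw->freed = true;
}

/* Replays the logged sequence for `packets` packets received in timewait
 * state. Returns true if the object ends up freed with its timer pending. */
static bool replay_reported_trace(int packets)
{
	struct tw_model tw = { .refcnt = 3 }; /* after __inet_twsk_hashdance() */

	model_schedule(&tw); /* refcnt = 4 */
	model_put(&tw);      /* refcnt = 3 */

	for (int i = 0; i < packets && !tw.freed; i++) {
		model_schedule(&tw); /* timer already pending: no change */
		model_put(&tw);      /* ACK path drops one reference */
	}
	return tw.freed && tw.timer_armed;
}
```

Under these assumptions the third packet drops the count to zero while the
modelled timer is still pending, which matches the ODEBUG report.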

It appears that inet_twsk_schedule() needs to increment the reference count
unconditionally. Otherwise the socket is destroyed, because the reference
count is decremented each time an ACK is sent in response to an incoming
packet.

Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
---
 net/ipv4/inet_timewait_sock.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv4/inet_timewait_sock.c b/net/ipv4/inet_timewait_sock.c
index cbeb022..99c349a 100644
--- a/net/ipv4/inet_timewait_sock.c
+++ b/net/ipv4/inet_timewait_sock.c
@@ -246,9 +246,9 @@ void inet_twsk_schedule(struct inet_timewait_sock *tw, const int timeo)

 	tw->tw_kill = timeo <= 4*HZ;
 	if (!mod_timer_pinned(&tw->tw_timer, jiffies + timeo)) {
-		atomic_inc(&tw->tw_refcnt);
 		atomic_inc(&tw->tw_dr->tw_count);
 	}
+	atomic_inc(&tw->tw_refcnt);
 }
 EXPORT_SYMBOL_GPL(inet_twsk_schedule);

--
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux
Foundation Collaborative Project


* Re: [PATCH net-next] inet: Always increment refcount in inet_twsk_schedule
From: Eric Dumazet @ 2015-07-19  8:51 UTC
  To: subashab; +Cc: netdev

On Sun, 2015-07-19 at 03:31 +0000, subashab@codeaurora.org wrote:
> I am seeing a reference count problem with timewait sockets that leads to
> an active timer object being freed. It occurs in some data stress test
> setups, so I have been unable to determine the exact step at which it
> occurs. However, by logging the refcount I was able to identify the code
> path that leads to the problem.
> 
> //Initialize time wait socket and setup timer
> inet_twsk_alloc() tw_refcnt = 0
> __inet_twsk_hashdance() tw_refcnt = 3
> inet_twsk_schedule() tw_refcnt = 4
> inet_twsk_put() tw_refcnt = 3
> 
> //Receive packet 1 in timewait state
> tcp_timewait_state_process() -> inet_twsk_schedule tw_refcnt = 3 (no change)

This is obviously wrong.

If a timewait socket is found, we do increment its refcnt before
proceeding.

> TCP: tcp_v4_timewait_ack() -> inet_twsk_put() tw_refcnt = 2
> 
> //Receive packet 2 in timewait state
> tcp_timewait_state_process() -> inet_twsk_schedule tw_refcnt = 2 (no change)
> TCP: tcp_v4_timewait_ack() -> inet_twsk_put() tw_refcnt = 1
> 
> //Receive packet 3 in timewait state
> tcp_timewait_state_process() -> inet_twsk_schedule tw_refcnt = 1 (no change)
> TCP: tcp_v4_timewait_ack() -> inet_twsk_put() tw_refcnt = 0
> 
> After this step, the time wait socket is destroyed along with the active
> timer object. This leads to a warning being printed which eventually leads
> to a crash.
> 
> ODEBUG: free active (active state 0) object type: timer_list hint:
> tw_timer_handler+0x0/0x68
> 
> It appears that inet_twsk_schedule() needs to increment the reference count
> unconditionally. Otherwise the socket is destroyed, because the reference
> count is decremented each time an ACK is sent in response to an incoming
> packet.
> 
> Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
> ---
>  net/ipv4/inet_timewait_sock.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/net/ipv4/inet_timewait_sock.c b/net/ipv4/inet_timewait_sock.c
> index cbeb022..99c349a 100644
> --- a/net/ipv4/inet_timewait_sock.c
> +++ b/net/ipv4/inet_timewait_sock.c
> @@ -246,9 +246,9 @@ void inet_twsk_schedule(struct inet_timewait_sock *tw, const int timeo)
> 
>  	tw->tw_kill = timeo <= 4*HZ;
>  	if (!mod_timer_pinned(&tw->tw_timer, jiffies + timeo)) {
> -		atomic_inc(&tw->tw_refcnt);
>  		atomic_inc(&tw->tw_dr->tw_count);
>  	}
> +	atomic_inc(&tw->tw_refcnt);
>  }
>  EXPORT_SYMBOL_GPL(inet_twsk_schedule);


This is wrong. You simply add a memory leak here. It might solve your
crash, but is not the proper way.
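
The leak can be illustrated with a minimal user-space sketch. It is not
kernel code: struct tw_model and the model_* helpers are hypothetical, and
it assumes, as stated above, that the lookup takes one reference for every
packet that finds the timewait socket and that the ACK path drops exactly
one. The starting count of 3 is taken from the logged trace.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model: one counter stands in for tw_refcnt. */
struct tw_model {
	int refcnt;
	bool timer_armed;
};

/* Models the lookup taking a reference for each packet that finds the
 * socket (an assumption of this sketch, per the reply above). */
static void model_lookup_get(struct tw_model *tw)
{
	tw->refcnt++;
}

/* Models inet_twsk_schedule(). With `unconditional` set, it behaves like
 * the proposed patch and takes a reference on every call. */
static void model_schedule(struct tw_model *tw, bool unconditional)
{
	bool was_armed = tw->timer_armed;

	tw->timer_armed = true;
	if (unconditional || !was_armed)
		tw->refcnt++;
}

/* Models inet_twsk_put() on the ACK path. */
static void model_put(struct tw_model *tw)
{
	assert(tw->refcnt > 0);
	tw->refcnt--;
}

/* Refcount after `packets` packets hit a timewait socket whose timer is
 * already pending. */
static int refcnt_after_packets(int packets, bool unconditional)
{
	struct tw_model tw = { .refcnt = 3, .timer_armed = true };

	for (int i = 0; i < packets; i++) {
		model_lookup_get(&tw);
		model_schedule(&tw, unconditional);
		model_put(&tw);
	}
	return tw.refcnt;
}
```

In this model the unpatched path is balanced (get, then put), while the
patched path gains one reference per packet that nothing ever drops.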

I've received some private mails about tw issues that turned out to be
caused by buggy drivers or buggy arch-specific code.

Is your crash observed on x86?


* Re: [PATCH net-next] inet: Always increment refcount in inet_twsk_schedule
From: subashab @ 2015-07-20 19:14 UTC
  To: eric.dumazet; +Cc: netdev

>> //Initialize time wait socket and setup timer
>> inet_twsk_alloc() tw_refcnt = 0
>> __inet_twsk_hashdance() tw_refcnt = 3
>> inet_twsk_schedule() tw_refcnt = 4
>> inet_twsk_put() tw_refcnt = 3
>>
>> //Receive packet 1 in timewait state
>> tcp_timewait_state_process() -> inet_twsk_schedule tw_refcnt = 3 (no
>> change)
>
> This is obviously wrong.
>
> If a timewait socket is found, we do increment its refcnt before
> proceeding.
We do not increment refcount currently when we find a timewait socket.

> I've received some private mails about tw issues that turned out to be
> caused by buggy drivers or buggy arch-specific code.
>
> Is your crash observed on x86?
>
This is observed on ARM devices. In the current debug, all time wait
socket refcount changes were happening in TCP stack only and there was no
platform / driver code involved.

According to my understanding, we would need to increment the time wait
socket refcount first before proceeding with any subsequent operations.
However, I request your expert opinion on this.


* Re: [PATCH net-next] inet: Always increment refcount in inet_twsk_schedule
From: Eric Dumazet @ 2015-07-21  7:10 UTC
  To: subashab; +Cc: netdev

On Mon, 2015-07-20 at 19:14 +0000, subashab@codeaurora.org wrote:
> >> //Initialize time wait socket and setup timer
> >> inet_twsk_alloc() tw_refcnt = 0
> >> __inet_twsk_hashdance() tw_refcnt = 3
> >> inet_twsk_schedule() tw_refcnt = 4
> >> inet_twsk_put() tw_refcnt = 3
> >>
> >> //Receive packet 1 in timewait state
> >> tcp_timewait_state_process() -> inet_twsk_schedule tw_refcnt = 3 (no
> >> change)
> >
> > This is obviously wrong.
> >
> > If a timewait socket is found, we do increment its refcnt before
> > proceeding.
> We do not increment refcount currently when we find a timewait socket.

Actually we do increment refcnt for every socket found in ehash.

Carefully read __inet_lookup_established() again.

This code is generic for ESTABLISHED and TIME-WAIT sockets.

If you have found code that performs the lookup without taking the refcnt,
please point me at it; this would be a serious bug.
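
As a rough illustration of what "taking the refcnt" during a lookup means,
here is a user-space sketch using C11 atomics. It is not the kernel's
__inet_lookup_established(): struct sock_model, ref_get_unless_zero() and
model_lookup() are hypothetical names, and a flat array stands in for the
ehash table. The point is only that a match is returned to the caller
after a reference has been taken, and that an object whose count already
reached zero is never handed out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a hashed socket (established or timewait). */
struct sock_model {
	atomic_int refcnt;
	int key; /* stands in for the connection 4-tuple */
};

/* Take a reference unless the count is already zero (object being freed). */
static bool ref_get_unless_zero(atomic_int *ref)
{
	int old = atomic_load(ref);

	while (old != 0) {
		/* On failure, `old` is reloaded with the current value. */
		if (atomic_compare_exchange_weak(ref, &old, old + 1))
			return true;
	}
	return false;
}

/* Return the entry matching `key` with one extra reference held by the
 * caller, or NULL. The caller must drop that reference when done. */
static struct sock_model *model_lookup(struct sock_model *table, size_t n,
				       int key)
{
	assert(table != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (table[i].key == key &&
		    ref_get_unless_zero(&table[i].refcnt))
			return &table[i];
	}
	return NULL;
}
```

With a lookup of this shape, the put on the ACK path releases the
reference taken here and leaves the references held by the hash tables and
the timer untouched.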

> 
> > I've received some private mails about tw issues that turned out to be
> > caused by buggy drivers or buggy arch-specific code.
> >
> > Is your crash observed on x86?
> >
> This is observed on ARM devices. In the current debug, all time wait
> socket refcount changes were happening in TCP stack only and there was no
> platform / driver code involved.
> 
> According to my understanding, we would need to increment the time wait
> socket refcount first before proceeding with any subsequent operations.
> However, I request your expert opinion on this.

Is it some Android kernel?

Android had private modules that needed an update in 3.18.


* Re: [PATCH net-next] inet: Always increment refcount in inet_twsk_schedule
From: subashab @ 2015-07-24  1:49 UTC
  To: eric.dumazet; +Cc: netdev

> Actually we do increment refcnt for every socket found in ehash.
>
> Carefully read __inet_lookup_established() again.
>
> This code is generic for ESTABLISHED and TIME-WAIT sockets.
>
> If you have found code that performs the lookup without taking the refcnt,
> please point me at it; this would be a serious bug.

From my previous observations, it appears that either
1. this check is bypassed, or
2. the refcount is incremented here but is decremented again before the
packet reaches tcp_timewait_state_process().

I will try to debug this and update.

> Is it some Android kernel ?
>
> Android had private modules that needed an update in 3.18

Yes, the kernel is based on Android 3.18.
