* kernel BUG at kernel/timer.c:748!
@ 2012-09-05  4:35 Dave Jones
  2012-09-05 16:04 ` Lin Ming
  2012-09-05 20:48 ` Julian Anastasov
  0 siblings, 2 replies; 20+ messages in thread
From: Dave Jones @ 2012-09-05  4:35 UTC (permalink / raw)
  To: netdev

Just hit this bug on 3.6-rc4.

The BUG is..

	BUG_ON(!timer->function);


Not much to go on... Any thoughts on what I could add to get
more debug info on which protocol etc. this was?

	Dave


kernel BUG at kernel/timer.c:748!
invalid opcode: 0000 [#1] SMP 
Modules linked in: tun fuse ipt_ULOG binfmt_misc nfnetlink nfc caif_socket caif phonet can llc2 pppoe pppox ppp_generic slhc irda crc_ccitt rds af_key decnet rose x25 atm netrom appletalk ipx p8023 psnap p8022 llc ax25 nfsv3 nfs_acl nfs fscache lockd sunrpc bluetooth rfkill ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack kvm_intel kvm crc32c_intel ghash_clmulni_intel microcode pcspkr i2c_i801 e1000e uinput i915 video i2c_algo_bit drm_kms_helper drm i2c_core
CPU 3 
Pid: 12330, comm: trinity-child3 Not tainted 3.6.0-rc4+ #36
RIP: 0010:[<ffffffff810813f5>]  [<ffffffff810813f5>] mod_timer+0x2c5/0x2f0
RSP: 0018:ffff88000dfd7e08  EFLAGS: 00010246
RAX: 000000000000001a RBX: ffff880122d62948 RCX: 000000000000001a
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88000dfd7e10
RBP: ffff88000dfd7e48 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000001517000 R11: 0000000000000246 R12: 000000016c000000
R13: 000000016c12bcb1 R14: ffff8801236cee00 R15: 00000000ffffff01
FS:  00007fa96745f740(0000) GS:ffff880148200000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000100ff000 CR3: 0000000099344000 CR4: 00000000001407e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process trinity-child3 (pid: 12330, threadinfo ffff88000dfd6000, task ffff880090890000)
Stack:
 ffffffff8154cb6d 0000000007b5edf7 ffff88000dfd7e28 ffff880122d62520
 0000000000000009 0000000000000004 ffff8801236cee00 00000000ffffff01
 ffff88000dfd7e68 ffffffff8154c79c ffffffff81550e6c ffff880122d62520
Call Trace:
 [<ffffffff8154cb6d>] ? lock_sock_nested+0x8d/0xa0
 [<ffffffff8154c79c>] sk_reset_timer+0x1c/0x30
 [<ffffffff81550e6c>] ? sock_setsockopt+0x8c/0x960
 [<ffffffff815a84a0>] inet_csk_reset_keepalive_timer+0x20/0x30
 [<ffffffff815c018d>] tcp_set_keepalive+0x3d/0x50
 [<ffffffff81551703>] sock_setsockopt+0x923/0x960
 [<ffffffff810ddf76>] ? trace_hardirqs_on_caller+0x16/0x1e0
 [<ffffffff811db0ac>] ? fget_light+0x24c/0x520
 [<ffffffff8154af86>] sys_setsockopt+0xc6/0xe0
 [<ffffffff816a50ed>] system_call_fastpath+0x1a/0x1f
Code: 00 74 43 9c 58 0f 1f 44 00 00 f6 c4 02 0f 84 14 ff ff ff eb 93 48 c7 c7 20 48 c3 81 e8 f5 70 05 00 85 c0 0f 85 fe fe ff ff eb b7 <0f> 0b 48 8b 75 08 48 89 df e8 3d f6 ff ff e9 b2 fd ff ff 4d 89 
RIP  [<ffffffff810813f5>] mod_timer+0x2c5/0x2f0
 RSP <ffff88000dfd7e08>
---[ end trace 7e7b5910138e49a3 ]---


* Re: kernel BUG at kernel/timer.c:748!
  2012-09-05  4:35 kernel BUG at kernel/timer.c:748! Dave Jones
@ 2012-09-05 16:04 ` Lin Ming
  2012-09-05 16:37   ` Yuchung Cheng
  2012-09-05 21:18   ` Jerry Chu
  2012-09-05 20:48 ` Julian Anastasov
  1 sibling, 2 replies; 20+ messages in thread
From: Lin Ming @ 2012-09-05 16:04 UTC (permalink / raw)
  To: Dave Jones; +Cc: netdev

On Wed, Sep 5, 2012 at 12:35 PM, Dave Jones <davej@redhat.com> wrote:
> Just hit this bug on 3.6-rc4.
>
> The BUG is..
>
>         BUG_ON(!timer->function);

The TCP keepalive timer is set up when the socket is created:

__sock_create
inet_create
tcp_v4_init_sock
tcp_init_sock
tcp_init_xmit_timers
inet_csk_init_xmit_timers

So timer->function should not be NULL when the keepalive option is set.

Strange... there must be a bug somewhere.

Lin Ming

>
>
> Not much to go on... Any thoughts on what I could add to get
> more debug info on which protocol etc this was ?
>
>         Dave
>
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: kernel BUG at kernel/timer.c:748!
  2012-09-05 16:04 ` Lin Ming
@ 2012-09-05 16:37   ` Yuchung Cheng
  2012-09-05 17:08     ` Dave Jones
  2012-09-05 21:18   ` Jerry Chu
  1 sibling, 1 reply; 20+ messages in thread
From: Yuchung Cheng @ 2012-09-05 16:37 UTC (permalink / raw)
  To: Lin Ming; +Cc: Dave Jones, netdev

On Wed, Sep 5, 2012 at 9:04 AM, Lin Ming <mlin@ss.pku.edu.cn> wrote:
> On Wed, Sep 5, 2012 at 12:35 PM, Dave Jones <davej@redhat.com> wrote:
>> Just hit this bug on 3.6-rc4.
>>
>> The BUG is..
>>
>>         BUG_ON(!timer->function);
>
> TCP keepalive timer is setup when the socket is created.
>
> __sock_create
> inet_create
> tcp_v4_init_sock
> tcp_init_sock
> tcp_init_xmit_timers
> inet_csk_init_xmit_timers
>
> timer->function should not be NULL when set keepalive option.
>
> Strange...have bug somewhere.

Is this a passively opened socket or an actively opened one?


>
> Lin Ming
>
>>
>>
>> Not much to go on... Any thoughts on what I could add to get
>> more debug info on which protocol etc this was ?
>>
>>         Dave
>>
>>


* Re: kernel BUG at kernel/timer.c:748!
  2012-09-05 16:37   ` Yuchung Cheng
@ 2012-09-05 17:08     ` Dave Jones
  0 siblings, 0 replies; 20+ messages in thread
From: Dave Jones @ 2012-09-05 17:08 UTC (permalink / raw)
  To: Yuchung Cheng; +Cc: Lin Ming, netdev

On Wed, Sep 05, 2012 at 09:37:12AM -0700, Yuchung Cheng wrote:
 > On Wed, Sep 5, 2012 at 9:04 AM, Lin Ming <mlin@ss.pku.edu.cn> wrote:
 > > On Wed, Sep 5, 2012 at 12:35 PM, Dave Jones <davej@redhat.com> wrote:
 > >> Just hit this bug on 3.6-rc4.
 > >>
 > >> The BUG is..
 > >>
 > >>         BUG_ON(!timer->function);
 > >
 > > TCP keepalive timer is setup when the socket is created.
 > >
 > > __sock_create
 > > inet_create
 > > tcp_v4_init_sock
 > > tcp_init_sock
 > > tcp_init_xmit_timers
 > > inet_csk_init_xmit_timers
 > >
 > > timer->function should not be NULL when set keepalive option.
 > >
 > > Strange...have bug somewhere.
 > 
 > is this a passively opened socket or actively opened one?

I have no idea. The fuzzer ran for around 10 hours before it triggered
this.  In that time, millions of syscalls were done.

Even though I had logs of every syscall, the volume of data to sift
through to try and map back to the socket that caused this is firmly
in 'needle in a haystack' territory.

	Dave


* Re: kernel BUG at kernel/timer.c:748!
  2012-09-05  4:35 kernel BUG at kernel/timer.c:748! Dave Jones
  2012-09-05 16:04 ` Lin Ming
@ 2012-09-05 20:48 ` Julian Anastasov
  2012-09-14 21:29   ` Dave Jones
  1 sibling, 1 reply; 20+ messages in thread
From: Julian Anastasov @ 2012-09-05 20:48 UTC (permalink / raw)
  To: Dave Jones; +Cc: netdev


	Hello,

On Wed, 5 Sep 2012, Dave Jones wrote:

> Just hit this bug on 3.6-rc4.
> 
> The BUG is..
> 
> 	BUG_ON(!timer->function);
> 
> 
> Not much to go on... Any thoughts on what I could add to get
> more debug info on which protocol etc this was ?
> 
> 	Dave
> 
> 

	Could this help? It applies in case you see ICMPV6_PKT_TOOBIG...

[PATCH] tcp: fix possible socket refcount problem for ipv6

	commit 144d56e91044181ec0ef67aeca91e9a8b5718348
("tcp: fix possible socket refcount problem") is missing
the IPv6 part. As tcp_release_cb is shared by both protocols,
we should hold a sock reference for the TCP_MTU_REDUCED_DEFERRED
bit.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
---
 net/ipv6/tcp_ipv6.c |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 09078b9..f3bfb8b 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -403,8 +403,9 @@ static void tcp_v6_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
 		tp->mtu_info = ntohl(info);
 		if (!sock_owned_by_user(sk))
 			tcp_v6_mtu_reduced(sk);
-		else
-			set_bit(TCP_MTU_REDUCED_DEFERRED, &tp->tsq_flags);
+		else if (!test_and_set_bit(TCP_MTU_REDUCED_DEFERRED,
+					   &tp->tsq_flags))
+			sock_hold(sk);
 		goto out;
 	}
 
-- 
1.7.3.4
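
A user-space sketch of the idiom this patch relies on, for readers following
along: only the caller that flips the deferred bit from 0 to 1 takes the extra
reference, so the single put done when the deferred work runs stays balanced.
This assumes the release callback drops one reference per deferred bit it
clears. C11 atomics stand in for test_and_set_bit() and sock_hold();
struct sk_model, defer_mtu_reduced() and run_deferred() are made-up names for
illustration, not kernel code.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two pieces of struct sock state involved. */
struct sk_model {
	atomic_uint flags;	/* models tp->tsq_flags */
	atomic_int refcnt;	/* models sk->sk_refcnt */
};

/* Hypothetical bit number, standing in for TCP_MTU_REDUCED_DEFERRED. */
#define MTU_REDUCED_DEFERRED_BIT 0u

/*
 * Defer the MTU work. Returns true if this call set the bit and therefore
 * took a reference; false if the bit was already set (no extra reference).
 */
bool defer_mtu_reduced(struct sk_model *sk)
{
	unsigned int mask = 1u << MTU_REDUCED_DEFERRED_BIT;
	unsigned int old = atomic_fetch_or(&sk->flags, mask);

	if (old & mask)
		return false;			/* someone already deferred it */
	atomic_fetch_add(&sk->refcnt, 1);	/* models sock_hold() */
	return true;
}

/*
 * Release side: run the deferred work once and drop the reference that
 * was taken when the bit was set. Returns true if work was pending.
 */
bool run_deferred(struct sk_model *sk)
{
	unsigned int mask = 1u << MTU_REDUCED_DEFERRED_BIT;
	unsigned int old = atomic_fetch_and(&sk->flags, ~mask);

	if (!(old & mask))
		return false;
	/* ... the deferred MTU handling would run here ... */
	atomic_fetch_sub(&sk->refcnt, 1);	/* models __sock_put() */
	return true;
}
```

With a plain set_bit() and no hold, the put on the release side would have no
matching get, which is the imbalance the commit message describes.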


* Re: kernel BUG at kernel/timer.c:748!
  2012-09-05 16:04 ` Lin Ming
  2012-09-05 16:37   ` Yuchung Cheng
@ 2012-09-05 21:18   ` Jerry Chu
  1 sibling, 0 replies; 20+ messages in thread
From: Jerry Chu @ 2012-09-05 21:18 UTC (permalink / raw)
  To: Lin Ming; +Cc: Dave Jones, netdev

On Wed, Sep 5, 2012 at 9:04 AM, Lin Ming <mlin@ss.pku.edu.cn> wrote:
> On Wed, Sep 5, 2012 at 12:35 PM, Dave Jones <davej@redhat.com> wrote:
>> Just hit this bug on 3.6-rc4.
>>
>> The BUG is..
>>
>>         BUG_ON(!timer->function);
>
> TCP keepalive timer is setup when the socket is created.
>
> __sock_create
> inet_create
> tcp_v4_init_sock
> tcp_init_sock
> tcp_init_xmit_timers
> inet_csk_init_xmit_timers
>
> timer->function should not be NULL when set keepalive option.

And tcp_init_xmit_timers() is called on the passive open side as well, for v4
as well as v6. I don't see any code that explicitly sets timer->function back to
NULL (unless through setup_timer(..., NULL, ...)). This may be a corrupted sock
(already released?).

Jerry

>
> Strange...have bug somewhere.
>
> Lin Ming
>
>>
>>
>> Not much to go on... Any thoughts on what I could add to get
>> more debug info on which protocol etc this was ?
>>
>>         Dave
>>
>>


* Re: kernel BUG at kernel/timer.c:748!
  2012-09-05 20:48 ` Julian Anastasov
@ 2012-09-14 21:29   ` Dave Jones
  2012-09-15 18:16     ` Yuchung Cheng
  0 siblings, 1 reply; 20+ messages in thread
From: Dave Jones @ 2012-09-14 21:29 UTC (permalink / raw)
  To: Julian Anastasov; +Cc: netdev

On Wed, Sep 05, 2012 at 11:48:29PM +0300, Julian Anastasov wrote:
 
 > > kernel BUG at kernel/timer.c:748!
 > > Call Trace:
 > >  ? lock_sock_nested+0x8d/0xa0
 > >  sk_reset_timer+0x1c/0x30
 > >  ? sock_setsockopt+0x8c/0x960
 > >  inet_csk_reset_keepalive_timer+0x20/0x30
 > >  tcp_set_keepalive+0x3d/0x50
 > >  sock_setsockopt+0x923/0x960
 > >  ? trace_hardirqs_on_caller+0x16/0x1e0
 > >  ? fget_light+0x24c/0x520
 > >  sys_setsockopt+0xc6/0xe0
 > >  system_call_fastpath+0x1a/0x1f
 > 
 > 	Can this help? In case you see ICMPV6_PKT_TOOBIG...
 > 
 > [PATCH] tcp: fix possible socket refcount problem for ipv6

I just managed to reproduce this bug on rc5 with this patch,
so it doesn't seem to help.

	Dave


* Re: kernel BUG at kernel/timer.c:748!
  2012-09-14 21:29   ` Dave Jones
@ 2012-09-15 18:16     ` Yuchung Cheng
  2012-09-19 21:10       ` Dave Jones
  0 siblings, 1 reply; 20+ messages in thread
From: Yuchung Cheng @ 2012-09-15 18:16 UTC (permalink / raw)
  To: Dave Jones; +Cc: Julian Anastasov, netdev

On Fri, Sep 14, 2012 at 2:29 PM, Dave Jones <davej@redhat.com> wrote:
> On Wed, Sep 05, 2012 at 11:48:29PM +0300, Julian Anastasov wrote:
>
>  > > kernel BUG at kernel/timer.c:748!
>  > > Call Trace:
>  > >  ? lock_sock_nested+0x8d/0xa0
>  > >  sk_reset_timer+0x1c/0x30
>  > >  ? sock_setsockopt+0x8c/0x960
>  > >  inet_csk_reset_keepalive_timer+0x20/0x30
>  > >  tcp_set_keepalive+0x3d/0x50
>  > >  sock_setsockopt+0x923/0x960
>  > >  ? trace_hardirqs_on_caller+0x16/0x1e0
>  > >  ? fget_light+0x24c/0x520
>  > >  sys_setsockopt+0xc6/0xe0
>  > >  system_call_fastpath+0x1a/0x1f
>  >
>  >      Can this help? In case you see ICMPV6_PKT_TOOBIG...
>  >
>  > [PATCH] tcp: fix possible socket refcount problem for ipv6
>
> I just managed to reproduce this bug on rc5 with this patch,
> so it doesn't seem to help.
Could you post some tcpdump traces?

>
>         Dave
>


* Re: kernel BUG at kernel/timer.c:748!
  2012-09-15 18:16     ` Yuchung Cheng
@ 2012-09-19 21:10       ` Dave Jones
  2012-09-19 22:01         ` Eric Dumazet
  0 siblings, 1 reply; 20+ messages in thread
From: Dave Jones @ 2012-09-19 21:10 UTC (permalink / raw)
  To: Yuchung Cheng; +Cc: Julian Anastasov, netdev

On Sat, Sep 15, 2012 at 11:16:52AM -0700, Yuchung Cheng wrote:
 > On Fri, Sep 14, 2012 at 2:29 PM, Dave Jones <davej@redhat.com> wrote:
 > > On Wed, Sep 05, 2012 at 11:48:29PM +0300, Julian Anastasov wrote:
 > >
 > >  > > kernel BUG at kernel/timer.c:748!
 > >  > > Call Trace:
 > >  > >  ? lock_sock_nested+0x8d/0xa0
 > >  > >  sk_reset_timer+0x1c/0x30
 > >  > >  ? sock_setsockopt+0x8c/0x960
 > >  > >  inet_csk_reset_keepalive_timer+0x20/0x30
 > >  > >  tcp_set_keepalive+0x3d/0x50
 > >  > >  sock_setsockopt+0x923/0x960
 > >  > >  ? trace_hardirqs_on_caller+0x16/0x1e0
 > >  > >  ? fget_light+0x24c/0x520
 > >  > >  sys_setsockopt+0xc6/0xe0
 > >  > >  system_call_fastpath+0x1a/0x1f
 > >  >
 > >  >      Can this help? In case you see ICMPV6_PKT_TOOBIG...
 > >  >
 > >  > [PATCH] tcp: fix possible socket refcount problem for ipv6
 > >
 > > I just managed to reproduce this bug on rc5 with this patch,
 > > so it doesn't seem to help.
 > Could you post some tcpdump traces?

It's likely that there aren't any packets.  The fuzzer isn't smart
enough (yet) to do anything too clever to the sockets it creates.

More likely is that this is some race where thread A is doing a setsockopt
while thread B is doing a tear-down of the same socket.

	Dave


* Re: kernel BUG at kernel/timer.c:748!
  2012-09-19 21:10       ` Dave Jones
@ 2012-09-19 22:01         ` Eric Dumazet
  2012-09-20  2:02           ` Dave Jones
  0 siblings, 1 reply; 20+ messages in thread
From: Eric Dumazet @ 2012-09-19 22:01 UTC (permalink / raw)
  To: Dave Jones; +Cc: Yuchung Cheng, Julian Anastasov, netdev

On Wed, 2012-09-19 at 17:10 -0400, Dave Jones wrote:
> On Sat, Sep 15, 2012 at 11:16:52AM -0700, Yuchung Cheng wrote:
>  > On Fri, Sep 14, 2012 at 2:29 PM, Dave Jones <davej@redhat.com> wrote:
>  > > On Wed, Sep 05, 2012 at 11:48:29PM +0300, Julian Anastasov wrote:
>  > >
>  > >  > > kernel BUG at kernel/timer.c:748!
>  > >  > > Call Trace:
>  > >  > >  ? lock_sock_nested+0x8d/0xa0
>  > >  > >  sk_reset_timer+0x1c/0x30
>  > >  > >  ? sock_setsockopt+0x8c/0x960
>  > >  > >  inet_csk_reset_keepalive_timer+0x20/0x30
>  > >  > >  tcp_set_keepalive+0x3d/0x50
>  > >  > >  sock_setsockopt+0x923/0x960
>  > >  > >  ? trace_hardirqs_on_caller+0x16/0x1e0
>  > >  > >  ? fget_light+0x24c/0x520
>  > >  > >  sys_setsockopt+0xc6/0xe0
>  > >  > >  system_call_fastpath+0x1a/0x1f
>  > >  >
>  > >  >      Can this help? In case you see ICMPV6_PKT_TOOBIG...
>  > >  >
>  > >  > [PATCH] tcp: fix possible socket refcount problem for ipv6
>  > >
>  > > I just managed to reproduce this bug on rc5 with this patch,
>  > > so it doesn't seem to help.
>  > Could you post some tcpdump traces?
> 
> It's likely that there aren't any packets.  The fuzzer isn't smart
> enough (yet) to do anything too clever to the sockets it creates.
> 
> More likely is that this is some race where thread A is doing a setsockopt
> while thread B is doing a tear-down of the same socket.

I spent some time trying to track down this bug, but found nothing so far.

The timer->function pointers are never cleared by the TCP stack at teardown,
and they should be set before the fd is installed and becomes visible to
other threads.

Most likely it's a refcounting issue...

The following debugging patch might trigger a bug sooner?

diff --git a/include/net/sock.h b/include/net/sock.h
index 84bdaec..5d3ad5b 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -511,18 +511,18 @@ static inline void sock_hold(struct sock *sk)
  */
 static inline void __sock_put(struct sock *sk)
 {
-	atomic_dec(&sk->sk_refcnt);
+	int newcnt = atomic_dec_return(&sk->sk_refcnt);
+
+	WARN_ON(newcnt <= 0);
 }
 
 static inline bool sk_del_node_init(struct sock *sk)
 {
 	bool rc = __sk_del_node_init(sk);
 
-	if (rc) {
-		/* paranoid for a while -acme */
-		WARN_ON(atomic_read(&sk->sk_refcnt) == 1);
+	if (rc)
 		__sock_put(sk);
-	}
+
 	return rc;
 }
 #define sk_del_node_init_rcu(sk)	sk_del_node_init(sk)
@@ -1620,7 +1620,10 @@ static inline void sk_filter_charge(struct sock *sk, struct sk_filter *fp)
 /* Ungrab socket and destroy it, if it was the last reference. */
 static inline void sock_put(struct sock *sk)
 {
-	if (atomic_dec_and_test(&sk->sk_refcnt))
+	int newcnt = atomic_dec_return(&sk->sk_refcnt);
+
+	WARN_ON(newcnt < 0);
+	if (newcnt == 0)
 		sk_free(sk);
 }
 


* Re: kernel BUG at kernel/timer.c:748!
  2012-09-19 22:01         ` Eric Dumazet
@ 2012-09-20  2:02           ` Dave Jones
  2012-09-24 15:39             ` Dave Jones
  0 siblings, 1 reply; 20+ messages in thread
From: Dave Jones @ 2012-09-20  2:02 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Yuchung Cheng, Julian Anastasov, netdev

On Thu, Sep 20, 2012 at 12:01:22AM +0200, Eric Dumazet wrote:

 > I spent some time trying to track this bug, but found nothing so far.
 > 
 > The timer->function are never cleared by TCP stack at tear down, and
 > should be set before fd is installed and can be caught by other threads.
 > 
 > Most likely its a refcounting issue...
 > 
 > Following debugging patch might trigger a bug sooner ?

Four hours so far, nothing. I'll leave it running overnight, but the patch
doesn't seem to have made much difference, if any.

	Dave


* Re: kernel BUG at kernel/timer.c:748!
  2012-09-20  2:02           ` Dave Jones
@ 2012-09-24 15:39             ` Dave Jones
  2012-09-24 16:34               ` Eric Dumazet
  0 siblings, 1 reply; 20+ messages in thread
From: Dave Jones @ 2012-09-24 15:39 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Yuchung Cheng, Julian Anastasov, netdev

On Wed, Sep 19, 2012 at 10:02:23PM -0400, Dave Jones wrote:
 > On Thu, Sep 20, 2012 at 12:01:22AM +0200, Eric Dumazet wrote:
 > 
 >  > I spent some time trying to track this bug, but found nothing so far.
 >  > 
 >  > The timer->function are never cleared by TCP stack at tear down, and
 >  > should be set before fd is installed and can be caught by other threads.
 >  > 
 >  > Most likely its a refcounting issue...
 >  > 
 >  > Following debugging patch might trigger a bug sooner ?
 > 
 > 4 hours so far, nothing.. I'll leave it run overnight, but it doesn't seem to have
 > made much difference if any.

One of my over-weekend runs hit this again, but it didn't trigger the WARN that
your patch added :-/

	Dave


* Re: kernel BUG at kernel/timer.c:748!
  2012-09-24 15:39             ` Dave Jones
@ 2012-09-24 16:34               ` Eric Dumazet
  2012-09-24 17:00                 ` Eric Dumazet
  0 siblings, 1 reply; 20+ messages in thread
From: Eric Dumazet @ 2012-09-24 16:34 UTC (permalink / raw)
  To: Dave Jones; +Cc: Yuchung Cheng, Julian Anastasov, netdev

On Mon, 2012-09-24 at 11:39 -0400, Dave Jones wrote:
> On Wed, Sep 19, 2012 at 10:02:23PM -0400, Dave Jones wrote:
>  > On Thu, Sep 20, 2012 at 12:01:22AM +0200, Eric Dumazet wrote:
>  > 
>  >  > I spent some time trying to track this bug, but found nothing so far.
>  >  > 
>  >  > The timer->function are never cleared by TCP stack at tear down, and
>  >  > should be set before fd is installed and can be caught by other threads.
>  >  > 
>  >  > Most likely its a refcounting issue...
>  >  > 
>  >  > Following debugging patch might trigger a bug sooner ?
>  > 
>  > 4 hours so far, nothing.. I'll leave it run overnight, but it doesn't seem to have
>  > made much difference if any.
> 
> One of my over-weekend runs hit this again, but it didn't trigger the WARN that
> your patch added :-/
> 
> 	Dave
> 

OK, I believe I found the reason. I will post a patch.

open a raw socket: socket(AF_INET, SOCK_RAW, IPPROTO_TCP)
+ connect() -> sk_state set to TCP_ESTABLISHED
+ setsockopt(SO_KEEPALIVE, &on) -> crash
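
The three steps above can be sketched as a small user-space C function. This
is an illustration only, not the reproducer actually used in the thread: the
function name raw_tcp_keepalive() and the loopback destination are
assumptions, and creating a raw socket needs CAP_NET_RAW (typically root). On
a kernel that carries the fix the setsockopt() call is expected to simply
succeed; on an affected 3.6-rc kernel it would presumably hit the BUG, so do
not run it there casually.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Walk through the sequence: raw TCP-protocol socket, connect(), then
 * SO_KEEPALIVE. Returns 0 if every step succeeded, -1 otherwise, with
 * errno describing the first failure (EPERM when run unprivileged).
 */
int raw_tcp_keepalive(void)
{
	struct sockaddr_in dst;
	int on = 1;
	int saved;
	int fd;

	/* The protocol is TCP, but the socket type is SOCK_RAW. */
	fd = socket(AF_INET, SOCK_RAW, IPPROTO_TCP);
	if (fd < 0)
		return -1;

	memset(&dst, 0, sizeof(dst));
	dst.sin_family = AF_INET;
	dst.sin_addr.s_addr = htonl(INADDR_LOOPBACK);	/* arbitrary peer */

	/* No handshake happens here; connect() only records the peer. */
	if (connect(fd, (struct sockaddr *)&dst, sizeof(dst)) < 0 ||
	    setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0) {
		saved = errno;		/* close() may clobber errno */
		close(fd);
		errno = saved;
		return -1;
	}

	close(fd);
	return 0;
}
```

Calling raw_tcp_keepalive() from a main() and printing strerror(errno) on
failure is enough to try it out.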


Thanks


* Re: kernel BUG at kernel/timer.c:748!
  2012-09-24 16:34               ` Eric Dumazet
@ 2012-09-24 17:00                 ` Eric Dumazet
  2012-09-24 17:11                   ` Dave Jones
  2012-09-24 20:53                   ` David Miller
  0 siblings, 2 replies; 20+ messages in thread
From: Eric Dumazet @ 2012-09-24 17:00 UTC (permalink / raw)
  To: Dave Jones; +Cc: Yuchung Cheng, Julian Anastasov, netdev

Signed-off-by: Eric Dumazet <edumazet@google.com>

On Mon, 2012-09-24 at 18:34 +0200, Eric Dumazet wrote:

> OK, I believe I found the reason. I will post a patch.
> 
> open a raw socket (AF_INET, SOCK_RAW, IPPROTO_TCP)
> + connect() -> sk_state set to TCP_ESTABLISHED
> + setsockopt(SO_KEEPALIVE, &on) -> crash

I confirm the following patch fixes the problem for me.

Thanks again

[PATCH] net: guard tcp_set_keepalive() to tcp sockets

It's possible to use RAW sockets to get a crash in
tcp_set_keepalive() / sk_reset_timer()

The fix is to make sure the socket is a SOCK_STREAM one.

Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/core/sock.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/core/sock.c b/net/core/sock.c
index 3057920..a6000fb 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -691,7 +691,8 @@ set_rcvbuf:
 
 	case SO_KEEPALIVE:
 #ifdef CONFIG_INET
-		if (sk->sk_protocol == IPPROTO_TCP)
+		if (sk->sk_protocol == IPPROTO_TCP &&
+		    sk->sk_type == SOCK_STREAM)
 			tcp_set_keepalive(sk, valbool);
 #endif
 		sock_valbool_flag(sk, SOCK_KEEPOPEN, valbool);
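
For readers following along, the effect of the hunk above can be modeled in
userspace. This is only a sketch, not kernel code: struct sock_model and the
keepalive_guard_*() helpers are hypothetical names invented for illustration,
standing in for the sk->sk_protocol and sk->sk_type fields the patch tests.

```c
#include <assert.h>
#include <stdbool.h>
#include <netinet/in.h>   /* IPPROTO_TCP */
#include <sys/socket.h>   /* SOCK_STREAM, SOCK_RAW */

/* Hypothetical stand-in for the two struct sock fields the patch tests. */
struct sock_model {
	int protocol;   /* models sk->sk_protocol */
	int type;       /* models sk->sk_type */
};

/* Old check: protocol only, so a raw socket opened with IPPROTO_TCP passes. */
static bool keepalive_guard_old(const struct sock_model *sk)
{
	return sk->protocol == IPPROTO_TCP;
}

/* Patched check: the socket must also be SOCK_STREAM. */
static bool keepalive_guard_new(const struct sock_model *sk)
{
	return sk->protocol == IPPROTO_TCP && sk->type == SOCK_STREAM;
}

/* Self-check covering the two cases discussed in this thread. */
static void keepalive_guard_selfcheck(void)
{
	struct sock_model tcp = { IPPROTO_TCP, SOCK_STREAM };
	struct sock_model raw = { IPPROTO_TCP, SOCK_RAW };

	assert(keepalive_guard_old(&tcp) && keepalive_guard_new(&tcp));
	assert(keepalive_guard_old(&raw));   /* old check lets the raw socket through */
	assert(!keepalive_guard_new(&raw));  /* patched check rejects it */
}
```

With the old check the raw socket would reach tcp_set_keepalive(); with the
patched check only a real TCP stream socket does, and SOCK_KEEPOPEN is still
set for both by sock_valbool_flag().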


* Re: kernel BUG at kernel/timer.c:748!
  2012-09-24 17:00                 ` Eric Dumazet
@ 2012-09-24 17:11                   ` Dave Jones
  2012-09-24 17:31                     ` Eric Dumazet
  2012-09-24 20:53                   ` David Miller
  1 sibling, 1 reply; 20+ messages in thread
From: Dave Jones @ 2012-09-24 17:11 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Yuchung Cheng, Julian Anastasov, netdev

On Mon, Sep 24, 2012 at 07:00:11PM +0200, Eric Dumazet wrote:
 > Signed-off-by: Eric Dumazet <edumazet@google.com>
 > 
 > On Mon, 2012-09-24 at 18:34 +0200, Eric Dumazet wrote:
 > 
 > > OK, I believe I found the reason. I will post a patch.
 > > 
 > > open a raw socket (AF_INET, SOCK_RAW, IPPROTO_TCP)
 > > + connect() -> sk_state set to TCP_ESTABLISHED
 > > + setsockopt(SO_KEEPALIVE, &on) -> crash
 > 
 > I confirm the following patch fixes the problem for me.
 > 
 > Thanks again
 > 
 > [PATCH] net: guard tcp_set_keepalive() to tcp sockets
 > 
 > It's possible to use RAW sockets to get a crash in
 > tcp_set_keepalive() / sk_reset_timer()
 > 
 > The fix is to make sure the socket is a SOCK_STREAM one.

Great, I'll give this a shot.

Any idea why this only just started triggering ?
(Ie, do we need this for stable too?)

thanks,

	Dave


* Re: kernel BUG at kernel/timer.c:748!
  2012-09-24 17:11                   ` Dave Jones
@ 2012-09-24 17:31                     ` Eric Dumazet
  2012-09-24 18:11                       ` Dave Jones
  0 siblings, 1 reply; 20+ messages in thread
From: Eric Dumazet @ 2012-09-24 17:31 UTC (permalink / raw)
  To: Dave Jones; +Cc: Yuchung Cheng, Julian Anastasov, netdev

On Mon, 2012-09-24 at 13:11 -0400, Dave Jones wrote:

> Great, I'll give this a shot.
> 
> Any idea why this only just started triggering ?
> (Ie, do we need this for stable too?)

Seems this is a very old bug.

I guess your trinity tool is getting better?

I used the following program to trigger the bug:

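#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <stdio.h>
#include <string.h>

int main(int argc, char *argv[])
{
	int on = 1;
	struct sockaddr_in addr;
	/* Raw socket with protocol TCP; creating it needs CAP_NET_RAW (root). */
	int raw_sock = socket(AF_INET, SOCK_RAW, IPPROTO_TCP);

	if (raw_sock < 0) {
		perror("socket");
		return 1;
	}

	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(0x7f000001);	/* 127.0.0.1 */
	addr.sin_port = htons(123);
	/* connect() on the raw socket sets sk_state to TCP_ESTABLISHED. */
	connect(raw_sock, (struct sockaddr *)&addr, sizeof(addr));
	/* On unpatched kernels this reaches tcp_set_keepalive() and crashes. */
	setsockopt(raw_sock, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on));
	return 0;
}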


* Re: kernel BUG at kernel/timer.c:748!
  2012-09-24 17:31                     ` Eric Dumazet
@ 2012-09-24 18:11                       ` Dave Jones
  2012-09-24 20:53                         ` David Miller
  0 siblings, 1 reply; 20+ messages in thread
From: Dave Jones @ 2012-09-24 18:11 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Yuchung Cheng, Julian Anastasov, netdev

On Mon, Sep 24, 2012 at 07:31:18PM +0200, Eric Dumazet wrote:
 > On Mon, 2012-09-24 at 13:11 -0400, Dave Jones wrote:
 > 
 > > Great, I'll give this a shot.
 > > 
 > > Any idea why this only just started triggering ?
 > > (Ie, do we need this for stable too?)
 > 
 > Seems this is a very old bug.

Indeed. As far as I can tell, this goes back to 2.1.8,
back in November 1996.

 > I guess your trinity tool is getting better?

Possible. Or it may be that it was finding other bugs first
that are now fixed, that prevented us getting far enough along
to try this.

 > I used the following program to trigger the bug:

I'm sure reproducers like that will be appreciated by anyone
running QA tests on older kernels.

tangent: Is there any kind of networking correctness suite
other than fuzz testers like isic etc ? 

thanks,

	Dave


* Re: kernel BUG at kernel/timer.c:748!
  2012-09-24 17:00                 ` Eric Dumazet
  2012-09-24 17:11                   ` Dave Jones
@ 2012-09-24 20:53                   ` David Miller
  2012-09-24 21:01                     ` Eric Dumazet
  1 sibling, 1 reply; 20+ messages in thread
From: David Miller @ 2012-09-24 20:53 UTC (permalink / raw)
  To: eric.dumazet; +Cc: davej, ycheng, ja, netdev

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Mon, 24 Sep 2012 19:00:11 +0200

> Signed-off-by: Eric Dumazet <edumazet@google.com>

I know you meant "From: Eric Dumazet <edumazet@google.com>"
here :-)

> [PATCH] net: guard tcp_set_keepalive() to tcp sockets
> 
> It's possible to use RAW sockets to get a crash in
> tcp_set_keepalive() / sk_reset_timer()
> 
> The fix is to make sure the socket is a SOCK_STREAM one.
> 
> Reported-by: Dave Jones <davej@redhat.com>
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Applied and queued up for -stable, thanks Eric.


* Re: kernel BUG at kernel/timer.c:748!
  2012-09-24 18:11                       ` Dave Jones
@ 2012-09-24 20:53                         ` David Miller
  0 siblings, 0 replies; 20+ messages in thread
From: David Miller @ 2012-09-24 20:53 UTC (permalink / raw)
  To: davej; +Cc: eric.dumazet, ycheng, ja, netdev

From: Dave Jones <davej@redhat.com>
Date: Mon, 24 Sep 2012 14:11:52 -0400

> tangent: Is there any kind of networking correctness suite
> other than fuzz testers like isic etc ? 

If people want to work on this I'm willing to maintain the
tree.


* Re: kernel BUG at kernel/timer.c:748!
  2012-09-24 20:53                   ` David Miller
@ 2012-09-24 21:01                     ` Eric Dumazet
  0 siblings, 0 replies; 20+ messages in thread
From: Eric Dumazet @ 2012-09-24 21:01 UTC (permalink / raw)
  To: David Miller; +Cc: davej, ycheng, ja, netdev

On Mon, 2012-09-24 at 16:53 -0400, David Miller wrote:
> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Mon, 24 Sep 2012 19:00:11 +0200
> 
> > Signed-off-by: Eric Dumazet <edumazet@google.com>
> 
> I know you meant "From: Eric Dumazet <edumazet@google.com>"
> here :-)

Oh well, I should use some automatic tool ;)

Thanks


end of thread, other threads:[~2012-09-24 21:01 UTC | newest]

Thread overview: 20+ messages
2012-09-05  4:35 kernel BUG at kernel/timer.c:748! Dave Jones
2012-09-05 16:04 ` Lin Ming
2012-09-05 16:37   ` Yuchung Cheng
2012-09-05 17:08     ` Dave Jones
2012-09-05 21:18   ` Jerry Chu
2012-09-05 20:48 ` Julian Anastasov
2012-09-14 21:29   ` Dave Jones
2012-09-15 18:16     ` Yuchung Cheng
2012-09-19 21:10       ` Dave Jones
2012-09-19 22:01         ` Eric Dumazet
2012-09-20  2:02           ` Dave Jones
2012-09-24 15:39             ` Dave Jones
2012-09-24 16:34               ` Eric Dumazet
2012-09-24 17:00                 ` Eric Dumazet
2012-09-24 17:11                   ` Dave Jones
2012-09-24 17:31                     ` Eric Dumazet
2012-09-24 18:11                       ` Dave Jones
2012-09-24 20:53                         ` David Miller
2012-09-24 20:53                   ` David Miller
2012-09-24 21:01                     ` Eric Dumazet
