All of lore.kernel.org
 help / color / mirror / Atom feed
* oops in tcp_get_metrics, followed by lockup.
@ 2013-11-13 20:45 Dave Jones
  2013-11-13 22:40 ` Eric Dumazet
  0 siblings, 1 reply; 11+ messages in thread
From: Dave Jones @ 2013-11-13 20:45 UTC (permalink / raw)
  To: netdev

My fuzzer just hit this on v3.12-7033-g42a2d923cc34

Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
Modules linked in: fuse hidp tun snd_seq_dummy bnep nfnetlink rfcomm ipt_ULOG can_bcm nfc caif_socket caif af_802154 phonet af_rxrpc bluetooth rfkill can_raw can llc2 pppoe pppox ppp
_generic slhc irda crc_ccitt rds scsi_transport_iscsi af_key rose x25 atm netrom appletalk ipx p8023 psnap p8022 llc ax25 xfs libcrc32c coretemp hwmon x86_pkg_temp_thermal kvm_intel kvm crct10dif_p
clmul crc32c_intel ghash_clmulni_intel usb_debug snd_hda_codec_realtek snd_hda_codec_hdmi microcode pcspkr snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_page_alloc snd_ti
mer e1000e snd ptp shpchp soundcore pps_core serio_raw
CPU: 1 PID: 16002 Comm: trinity-child1 Not tainted 3.12.0+ #2
task: ffff88023cd75580 ti: ffff88009ee26000 task.ti: ffff88009ee26000
RIP: 0010:[<ffffffff81658dd2>]  [<ffffffff81658dd2>] tcp_get_metrics+0x62/0x420
RSP: 0018:ffff880244a03d28  EFLAGS: 00010246
RAX: 0000000000000002 RBX: ffff88009c77a4c0 RCX: 0000000000000001
RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff88009c77a4c0
RBP: ffff880244a03d78 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 00000000000010ac R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff880244a00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000018 CR3: 0000000001c0b000 CR4: 00000000001407e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Stack:
 000000018165a6b5 ffffffff000010ac 0000000000000246 0000000044a00002
 ffffffff81c480a0 ffff88009c77a4c0 0000000000000000 0000000000000000
 0000000000000001 0000000000000000 ffff880244a03db8 ffffffff8165a740
Call Trace:
 <IRQ> 
 [<ffffffff8165a740>] tcp_fastopen_cache_set+0x90/0x280
 [<ffffffff8165a6b5>] ? tcp_fastopen_cache_set+0x5/0x280
 [<ffffffff8164f3a7>] tcp_retransmit_timer+0x1d7/0x930
 [<ffffffff8164fcb0>] ? tcp_write_timer_handler+0x1b0/0x1b0
 [<ffffffff8164fba0>] tcp_write_timer_handler+0xa0/0x1b0
 [<ffffffff8164fd2c>] tcp_write_timer+0x7c/0x80
 [<ffffffff81063c1a>] call_timer_fn+0x8a/0x340
 [<ffffffff81063b95>] ? call_timer_fn+0x5/0x340
 [<ffffffff8164fcb0>] ? tcp_write_timer_handler+0x1b0/0x1b0
 [<ffffffff81064114>] run_timer_softirq+0x244/0x3a0
 [<ffffffff8105aa9c>] __do_softirq+0xfc/0x490
 [<ffffffff8105b28d>] irq_exit+0x13d/0x160
 [<ffffffff8172fe25>] smp_apic_timer_interrupt+0x45/0x60
 [<ffffffff8172eaaf>] apic_timer_interrupt+0x6f/0x80
 <EOI> 
 [<ffffffff810d559d>] ? trace_hardirqs_on+0xd/0x10
 [<ffffffff811560af>] ? free_hot_cold_page+0xff/0x180
 [<ffffffff81156176>] free_hot_cold_page_list+0x46/0x160
 [<ffffffff8115c21e>] release_pages+0x8e/0x1f0
 [<ffffffff8118c135>] free_pages_and_swap_cache+0x95/0xb0
 [<ffffffff81175acc>] tlb_flush_mmu.part.73+0x4c/0x90
 [<ffffffff81176115>] tlb_finish_mmu+0x55/0x60
 [<ffffffff81180d84>] exit_mmap+0xf4/0x170
 [<ffffffff8105108b>] mmput+0x6b/0x100
 [<ffffffff810559e8>] do_exit+0x278/0xcb0
 [<ffffffff817250e1>] ? _raw_spin_unlock+0x31/0x50
 [<ffffffff810d53c6>] ? trace_hardirqs_on_caller+0x16/0x1e0
 [<ffffffff810d559d>] ? trace_hardirqs_on+0xd/0x10
 [<ffffffff810577ec>] do_group_exit+0x4c/0xc0
 [<ffffffff81057874>] SyS_exit_group+0x14/0x20
 [<ffffffff8172e064>] tracesys+0xdd/0xe2
Code: 0a 0f 85 c2 01 00 00 48 8b 47 38 48 8b 57 40 48 89 44 24 08 48 8b 47 40 48 89 54 24 10 48 33 47 38 49 89 c6 49 c1 ee 20 41 31 c6 <49> 8b 45 18 b9 20 00 00 00 45 69 f6 01 00 37 9e 48 8b 80 d8 04 
RIP  [<ffffffff81658dd2>] tcp_get_metrics+0x62/0x420
 RSP <ffff880244a03d28>
CR2: 0000000000000018
---[ end trace c25bf4de9744120a ]---


The disassembly looks like it happened here :-


static inline u32 ipv6_addr_hash(const struct in6_addr *a)
{
#if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) && BITS_PER_LONG == 64
        const unsigned long *ul = (const unsigned long *)a;
        unsigned long x = ul[0] ^ ul[1];
    10db:       48 8b 47 40             mov    0x40(%rdi),%rax
    10df:       48 89 54 24 10          mov    %rdx,0x10(%rsp)
    10e4:       48 33 47 38             xor    0x38(%rdi),%rax

        return (u32)(x ^ (x >> 32));
    10e8:       49 89 c6                mov    %rax,%r14
    10eb:       49 c1 ee 20             shr    $0x20,%r14
    10ef:       41 31 c6                xor    %eax,%r14d
    10f2:       49 8b 45 18             mov    0x18(%r13),%rax    <<<< Faulting instruction.
    10f6:       b9 20 00 00 00          mov    $0x20,%ecx
}

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: oops in tcp_get_metrics, followed by lockup.
  2013-11-13 20:45 oops in tcp_get_metrics, followed by lockup Dave Jones
@ 2013-11-13 22:40 ` Eric Dumazet
  2013-11-13 23:00   ` [PATCH] net-tcp: fix panic in tcp_fastopen_cache_set() Eric Dumazet
  0 siblings, 1 reply; 11+ messages in thread
From: Eric Dumazet @ 2013-11-13 22:40 UTC (permalink / raw)
  To: Dave Jones; +Cc: netdev

On Wed, 2013-11-13 at 15:45 -0500, Dave Jones wrote:
> My fuzzer just hit this on v3.12-7033-g42a2d923cc34
> 
> Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
> Modules linked in: fuse hidp tun snd_seq_dummy bnep nfnetlink rfcomm ipt_ULOG can_bcm nfc caif_socket caif af_802154 phonet af_rxrpc bluetooth rfkill can_raw can llc2 pppoe pppox ppp
> _generic slhc irda crc_ccitt rds scsi_transport_iscsi af_key rose x25 atm netrom appletalk ipx p8023 psnap p8022 llc ax25 xfs libcrc32c coretemp hwmon x86_pkg_temp_thermal kvm_intel kvm crct10dif_p
> clmul crc32c_intel ghash_clmulni_intel usb_debug snd_hda_codec_realtek snd_hda_codec_hdmi microcode pcspkr snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_page_alloc snd_ti
> mer e1000e snd ptp shpchp soundcore pps_core serio_raw
> CPU: 1 PID: 16002 Comm: trinity-child1 Not tainted 3.12.0+ #2
> task: ffff88023cd75580 ti: ffff88009ee26000 task.ti: ffff88009ee26000
> RIP: 0010:[<ffffffff81658dd2>]  [<ffffffff81658dd2>] tcp_get_metrics+0x62/0x420
> RSP: 0018:ffff880244a03d28  EFLAGS: 00010246
> RAX: 0000000000000002 RBX: ffff88009c77a4c0 RCX: 0000000000000001
> RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff88009c77a4c0
> RBP: ffff880244a03d78 R08: 0000000000000001 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
> R13: 0000000000000000 R14: 00000000000010ac R15: 0000000000000000
> FS:  0000000000000000(0000) GS:ffff880244a00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000018 CR3: 0000000001c0b000 CR4: 00000000001407e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Stack:
>  000000018165a6b5 ffffffff000010ac 0000000000000246 0000000044a00002
>  ffffffff81c480a0 ffff88009c77a4c0 0000000000000000 0000000000000000
>  0000000000000001 0000000000000000 ffff880244a03db8 ffffffff8165a740
> Call Trace:
>  <IRQ> 
>  [<ffffffff8165a740>] tcp_fastopen_cache_set+0x90/0x280
>  [<ffffffff8165a6b5>] ? tcp_fastopen_cache_set+0x5/0x280
>  [<ffffffff8164f3a7>] tcp_retransmit_timer+0x1d7/0x930
>  [<ffffffff8164fcb0>] ? tcp_write_timer_handler+0x1b0/0x1b0
>  [<ffffffff8164fba0>] tcp_write_timer_handler+0xa0/0x1b0
>  [<ffffffff8164fd2c>] tcp_write_timer+0x7c/0x80
>  [<ffffffff81063c1a>] call_timer_fn+0x8a/0x340
>  [<ffffffff81063b95>] ? call_timer_fn+0x5/0x340
>  [<ffffffff8164fcb0>] ? tcp_write_timer_handler+0x1b0/0x1b0
>  [<ffffffff81064114>] run_timer_softirq+0x244/0x3a0
>  [<ffffffff8105aa9c>] __do_softirq+0xfc/0x490
>  [<ffffffff8105b28d>] irq_exit+0x13d/0x160
>  [<ffffffff8172fe25>] smp_apic_timer_interrupt+0x45/0x60
>  [<ffffffff8172eaaf>] apic_timer_interrupt+0x6f/0x80
>  <EOI> 
>  [<ffffffff810d559d>] ? trace_hardirqs_on+0xd/0x10
>  [<ffffffff811560af>] ? free_hot_cold_page+0xff/0x180
>  [<ffffffff81156176>] free_hot_cold_page_list+0x46/0x160
>  [<ffffffff8115c21e>] release_pages+0x8e/0x1f0
>  [<ffffffff8118c135>] free_pages_and_swap_cache+0x95/0xb0
>  [<ffffffff81175acc>] tlb_flush_mmu.part.73+0x4c/0x90
>  [<ffffffff81176115>] tlb_finish_mmu+0x55/0x60
>  [<ffffffff81180d84>] exit_mmap+0xf4/0x170
>  [<ffffffff8105108b>] mmput+0x6b/0x100
>  [<ffffffff810559e8>] do_exit+0x278/0xcb0
>  [<ffffffff817250e1>] ? _raw_spin_unlock+0x31/0x50
>  [<ffffffff810d53c6>] ? trace_hardirqs_on_caller+0x16/0x1e0
>  [<ffffffff810d559d>] ? trace_hardirqs_on+0xd/0x10
>  [<ffffffff810577ec>] do_group_exit+0x4c/0xc0
>  [<ffffffff81057874>] SyS_exit_group+0x14/0x20
>  [<ffffffff8172e064>] tracesys+0xdd/0xe2
> Code: 0a 0f 85 c2 01 00 00 48 8b 47 38 48 8b 57 40 48 89 44 24 08 48 8b 47 40 48 89 54 24 10 48 33 47 38 49 89 c6 49 c1 ee 20 41 31 c6 <49> 8b 45 18 b9 20 00 00 00 45 69 f6 01 00 37 9e 48 8b 80 d8 04 
> RIP  [<ffffffff81658dd2>] tcp_get_metrics+0x62/0x420
>  RSP <ffff880244a03d28>
> CR2: 0000000000000018
> ---[ end trace c25bf4de9744120a ]---
> 
> 
> The disassembly looks like it happened here :-
> 
> 
> static inline u32 ipv6_addr_hash(const struct in6_addr *a)
> {
> #if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) && BITS_PER_LONG == 64
>         const unsigned long *ul = (const unsigned long *)a;
>         unsigned long x = ul[0] ^ ul[1];
>     10db:       48 8b 47 40             mov    0x40(%rdi),%rax
>     10df:       48 89 54 24 10          mov    %rdx,0x10(%rsp)
>     10e4:       48 33 47 38             xor    0x38(%rdi),%rax
> 
>         return (u32)(x ^ (x >> 32));
>     10e8:       49 89 c6                mov    %rax,%r14
>     10eb:       49 c1 ee 20             shr    $0x20,%r14
>     10ef:       41 31 c6                xor    %eax,%r14d
>     10f2:       49 8b 45 18             mov    0x18(%r13),%rax    <<<< Faulting instruction.
>     10f6:       b9 20 00 00 00          mov    $0x20,%ecx
> }

I do not think this is the ipv6_addr_hash()

%r13 here seems to be dst pointer

Trap on accessing dst->dev as in :

net = dev_net(dst->dev);

So we crash because socket has a NULL dst entry.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH] net-tcp: fix panic in tcp_fastopen_cache_set()
  2013-11-13 22:40 ` Eric Dumazet
@ 2013-11-13 23:00   ` Eric Dumazet
  2013-11-13 23:08     ` Yuchung Cheng
                       ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Eric Dumazet @ 2013-11-13 23:00 UTC (permalink / raw)
  To: Dave Jones, David Miller; +Cc: netdev, Yuchung Cheng

From: Eric Dumazet <edumazet@google.com>

We had some reports of crashes using TCP fastopen, and Dave Jones
gave a nice stack trace pointing to the error.

Issue is that tcp_get_metrics() should not be called with a NULL dst

Fixes: 1fe4c481ba637 ("net-tcp: Fast Open client - cookie cache")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Dave Jones <davej@redhat.com>
Cc: Yuchung Cheng <ycheng@google.com>
---
 net/ipv4/tcp_metrics.c |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/net/ipv4/tcp_metrics.c b/net/ipv4/tcp_metrics.c
index 2ab09cbae74d..d3ee2e0c28b6 100644
--- a/net/ipv4/tcp_metrics.c
+++ b/net/ipv4/tcp_metrics.c
@@ -663,10 +663,13 @@ void tcp_fastopen_cache_get(struct sock *sk, u16 *mss,
 void tcp_fastopen_cache_set(struct sock *sk, u16 mss,
 			    struct tcp_fastopen_cookie *cookie, bool syn_lost)
 {
+	struct dst_entry *dst = __sk_dst_get(sk);
 	struct tcp_metrics_block *tm;
 
+	if (!dst)
+		return;
 	rcu_read_lock();
-	tm = tcp_get_metrics(sk, __sk_dst_get(sk), true);
+	tm = tcp_get_metrics(sk, dst, true);
 	if (tm) {
 		struct tcp_fastopen_metrics *tfom = &tm->tcpm_fastopen;
 

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH] net-tcp: fix panic in tcp_fastopen_cache_set()
  2013-11-13 23:00   ` [PATCH] net-tcp: fix panic in tcp_fastopen_cache_set() Eric Dumazet
@ 2013-11-13 23:08     ` Yuchung Cheng
  2013-11-14 17:55     ` Dave Jones
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 11+ messages in thread
From: Yuchung Cheng @ 2013-11-13 23:08 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Dave Jones, David Miller, netdev

On Wed, Nov 13, 2013 at 3:00 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> From: Eric Dumazet <edumazet@google.com>
>
> We had some reports of crashes using TCP fastopen, and Dave Jones
> gave a nice stack trace pointing to the error.
>
> Issue is that tcp_get_metrics() should not be called with a NULL dst
>
> Fixes: 1fe4c481ba637 ("net-tcp: Fast Open client - cookie cache")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Reported-by: Dave Jones <davej@redhat.com>
> Cc: Yuchung Cheng <ycheng@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>

> ---
>  net/ipv4/tcp_metrics.c |    5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/net/ipv4/tcp_metrics.c b/net/ipv4/tcp_metrics.c
> index 2ab09cbae74d..d3ee2e0c28b6 100644
> --- a/net/ipv4/tcp_metrics.c
> +++ b/net/ipv4/tcp_metrics.c
> @@ -663,10 +663,13 @@ void tcp_fastopen_cache_get(struct sock *sk, u16 *mss,
>  void tcp_fastopen_cache_set(struct sock *sk, u16 mss,
>                             struct tcp_fastopen_cookie *cookie, bool syn_lost)
>  {
> +       struct dst_entry *dst = __sk_dst_get(sk);
>         struct tcp_metrics_block *tm;
>
> +       if (!dst)
> +               return;
>         rcu_read_lock();
> -       tm = tcp_get_metrics(sk, __sk_dst_get(sk), true);
> +       tm = tcp_get_metrics(sk, dst, true);
>         if (tm) {
>                 struct tcp_fastopen_metrics *tfom = &tm->tcpm_fastopen;
>
>
>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] net-tcp: fix panic in tcp_fastopen_cache_set()
  2013-11-13 23:00   ` [PATCH] net-tcp: fix panic in tcp_fastopen_cache_set() Eric Dumazet
  2013-11-13 23:08     ` Yuchung Cheng
@ 2013-11-14 17:55     ` Dave Jones
  2013-11-14 19:13     ` Johannes Berg
  2013-11-14 21:33     ` David Miller
  3 siblings, 0 replies; 11+ messages in thread
From: Dave Jones @ 2013-11-14 17:55 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, netdev, Yuchung Cheng

On Wed, Nov 13, 2013 at 03:00:46PM -0800, Eric Dumazet wrote:
 > From: Eric Dumazet <edumazet@google.com>
 > 
 > We had some reports of crashes using TCP fastopen, and Dave Jones
 > gave a nice stack trace pointing to the error.
 > 
 > Issue is that tcp_get_metrics() should not be called with a NULL dst
 > 
 > Fixes: 1fe4c481ba637 ("net-tcp: Fast Open client - cookie cache")
 > Signed-off-by: Eric Dumazet <edumazet@google.com>
 > Reported-by: Dave Jones <davej@redhat.com>
 > Cc: Yuchung Cheng <ycheng@google.com>

I let this run overnight, looks good.

Tested-by: Dave Jones <davej@fedoraproject.org>

thanks Eric.
 
	Dave

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] net-tcp: fix panic in tcp_fastopen_cache_set()
  2013-11-13 23:00   ` [PATCH] net-tcp: fix panic in tcp_fastopen_cache_set() Eric Dumazet
  2013-11-13 23:08     ` Yuchung Cheng
  2013-11-14 17:55     ` Dave Jones
@ 2013-11-14 19:13     ` Johannes Berg
  2013-11-14 19:36       ` Eric Dumazet
  2013-11-14 21:33     ` David Miller
  3 siblings, 1 reply; 11+ messages in thread
From: Johannes Berg @ 2013-11-14 19:13 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Dave Jones, David Miller, netdev, Yuchung Cheng

On Wed, 2013-11-13 at 15:00 -0800, Eric Dumazet wrote:

> --- a/net/ipv4/tcp_metrics.c
> +++ b/net/ipv4/tcp_metrics.c
> @@ -663,10 +663,13 @@ void tcp_fastopen_cache_get(struct sock *sk, u16 *mss,
>  void tcp_fastopen_cache_set(struct sock *sk, u16 mss,
>  			    struct tcp_fastopen_cookie *cookie, bool syn_lost)
>  {
> +	struct dst_entry *dst = __sk_dst_get(sk);
>  	struct tcp_metrics_block *tm;
>  
> +	if (!dst)
> +		return;
>  	rcu_read_lock();
> -	tm = tcp_get_metrics(sk, __sk_dst_get(sk), true);

Doesn't that __sk_dst_get() have to go inside the rcu_read_lock()?

Then again, I guess we hold the socket. Still looks a bit weird to be
moving it out.

johannes

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] net-tcp: fix panic in tcp_fastopen_cache_set()
  2013-11-14 19:13     ` Johannes Berg
@ 2013-11-14 19:36       ` Eric Dumazet
  2013-11-14 19:38         ` Eric Dumazet
  0 siblings, 1 reply; 11+ messages in thread
From: Eric Dumazet @ 2013-11-14 19:36 UTC (permalink / raw)
  To: Johannes Berg; +Cc: Dave Jones, David Miller, netdev, Yuchung Cheng

On Thu, 2013-11-14 at 20:13 +0100, Johannes Berg wrote:
> On Wed, 2013-11-13 at 15:00 -0800, Eric Dumazet wrote:
> 
> > --- a/net/ipv4/tcp_metrics.c
> > +++ b/net/ipv4/tcp_metrics.c
> > @@ -663,10 +663,13 @@ void tcp_fastopen_cache_get(struct sock *sk, u16 *mss,
> >  void tcp_fastopen_cache_set(struct sock *sk, u16 mss,
> >  			    struct tcp_fastopen_cookie *cookie, bool syn_lost)
> >  {
> > +	struct dst_entry *dst = __sk_dst_get(sk);
> >  	struct tcp_metrics_block *tm;
> >  
> > +	if (!dst)
> > +		return;
> >  	rcu_read_lock();
> > -	tm = tcp_get_metrics(sk, __sk_dst_get(sk), true);
> 
> Doesn't that __sk_dst_get() have to go inside the rcu_read_lock()?
> 
> Then again, I guess we hold the socket. Still looks a bit weird to be
> moving it out.

Yep, in fact this rcu_read_lock() is not needed. I'll send a v2.

Thanks

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] net-tcp: fix panic in tcp_fastopen_cache_set()
  2013-11-14 19:36       ` Eric Dumazet
@ 2013-11-14 19:38         ` Eric Dumazet
  2013-11-14 20:53           ` Johannes Berg
  0 siblings, 1 reply; 11+ messages in thread
From: Eric Dumazet @ 2013-11-14 19:38 UTC (permalink / raw)
  To: Johannes Berg; +Cc: Dave Jones, David Miller, netdev, Yuchung Cheng

On Thu, 2013-11-14 at 11:36 -0800, Eric Dumazet wrote:
> On Thu, 2013-11-14 at 20:13 +0100, Johannes Berg wrote:
> > On Wed, 2013-11-13 at 15:00 -0800, Eric Dumazet wrote:
> > 
> > > --- a/net/ipv4/tcp_metrics.c
> > > +++ b/net/ipv4/tcp_metrics.c
> > > @@ -663,10 +663,13 @@ void tcp_fastopen_cache_get(struct sock *sk, u16 *mss,
> > >  void tcp_fastopen_cache_set(struct sock *sk, u16 mss,
> > >  			    struct tcp_fastopen_cookie *cookie, bool syn_lost)
> > >  {
> > > +	struct dst_entry *dst = __sk_dst_get(sk);
> > >  	struct tcp_metrics_block *tm;
> > >  
> > > +	if (!dst)
> > > +		return;
> > >  	rcu_read_lock();
> > > -	tm = tcp_get_metrics(sk, __sk_dst_get(sk), true);
> > 
> > Doesn't that __sk_dst_get() have to go inside the rcu_read_lock()?
> > 
> > Then again, I guess we hold the socket. Still looks a bit weird to be
> > moving it out.
> 
> Yep, in fact this rcu_read_lock() is not needed. I'll send a v2.

I take it back.

the rcu_read_lock() protects the tcp_get_metrics(), not the
__sk_dst_get(sk)

So the patch is correct, unless you disagree of course ;)

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] net-tcp: fix panic in tcp_fastopen_cache_set()
  2013-11-14 19:38         ` Eric Dumazet
@ 2013-11-14 20:53           ` Johannes Berg
  2013-11-14 21:22             ` Eric Dumazet
  0 siblings, 1 reply; 11+ messages in thread
From: Johannes Berg @ 2013-11-14 20:53 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Dave Jones, David Miller, netdev, Yuchung Cheng

On Thu, 2013-11-14 at 11:38 -0800, Eric Dumazet wrote:

> > > Doesn't that __sk_dst_get() have to go inside the rcu_read_lock()?
> > > 
> > > Then again, I guess we hold the socket. Still looks a bit weird to be
> > > moving it out.
> > 
> > Yep, in fact this rcu_read_lock() is not needed. I'll send a v2.
> 
> I take it back.
> 
> the rcu_read_lock() protects the tcp_get_metrics(), not the
> __sk_dst_get(sk)
> 
> So the patch is correct, unless you disagree of course ;)

Heh. I have no idea, it just seemed a little odd on first look given
that __sk_dst_get() *can* actually use RCU protection. :)

johannes

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] net-tcp: fix panic in tcp_fastopen_cache_set()
  2013-11-14 20:53           ` Johannes Berg
@ 2013-11-14 21:22             ` Eric Dumazet
  0 siblings, 0 replies; 11+ messages in thread
From: Eric Dumazet @ 2013-11-14 21:22 UTC (permalink / raw)
  To: Johannes Berg; +Cc: Dave Jones, David Miller, netdev, Yuchung Cheng

On Thu, 2013-11-14 at 21:53 +0100, Johannes Berg wrote:

> Heh. I have no idea, it just seemed a little odd on first look given
> that __sk_dst_get() *can* actually use RCU protection. :)

Yep, the deal is that if you own socket lock, you do not need
rcu_read_lock()

Look for dst_negative_advice() as another example :

We call __sk_dst_get(sk) without rcu_read_lock() protection.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] net-tcp: fix panic in tcp_fastopen_cache_set()
  2013-11-13 23:00   ` [PATCH] net-tcp: fix panic in tcp_fastopen_cache_set() Eric Dumazet
                       ` (2 preceding siblings ...)
  2013-11-14 19:13     ` Johannes Berg
@ 2013-11-14 21:33     ` David Miller
  3 siblings, 0 replies; 11+ messages in thread
From: David Miller @ 2013-11-14 21:33 UTC (permalink / raw)
  To: eric.dumazet; +Cc: davej, netdev, ycheng

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Wed, 13 Nov 2013 15:00:46 -0800

> From: Eric Dumazet <edumazet@google.com>
> 
> We had some reports of crashes using TCP fastopen, and Dave Jones
> gave a nice stack trace pointing to the error.
> 
> Issue is that tcp_get_metrics() should not be called with a NULL dst
> 
> Fixes: 1fe4c481ba637 ("net-tcp: Fast Open client - cookie cache")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Reported-by: Dave Jones <davej@redhat.com>

Applied and queued up for -stable, thanks!

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2013-11-14 21:34 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-11-13 20:45 oops in tcp_get_metrics, followed by lockup Dave Jones
2013-11-13 22:40 ` Eric Dumazet
2013-11-13 23:00   ` [PATCH] net-tcp: fix panic in tcp_fastopen_cache_set() Eric Dumazet
2013-11-13 23:08     ` Yuchung Cheng
2013-11-14 17:55     ` Dave Jones
2013-11-14 19:13     ` Johannes Berg
2013-11-14 19:36       ` Eric Dumazet
2013-11-14 19:38         ` Eric Dumazet
2013-11-14 20:53           ` Johannes Berg
2013-11-14 21:22             ` Eric Dumazet
2013-11-14 21:33     ` David Miller

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.