netdev.vger.kernel.org archive mirror
* oops with ip6_rt_cache_alloc
@ 2018-08-24 22:26 Yonghong Song
  2018-08-24 23:04 ` David Ahern
  0 siblings, 1 reply; 3+ messages in thread
From: Yonghong Song @ 2018-08-24 22:26 UTC (permalink / raw)
  To: David Ahern, netdev, Alexei Starovoitov, Martin Lau, Dave Jones

Hi,

We got a kernel oops with the following stack trace:

CPU: 24 PID: 0 Comm: swapper/24 Not tainted 4.16.0-10_fbk1_1183_g7e4ee4c8171c #10
Hardware name: Quanta Leopard-DDR3/Leopard-DDR3, BIOS F06_3A16.DDR3 11/19/2015
RIP: 0010:ip6_rt_get_dev_rcu+0x6/0x60
RSP: 0018:ffff88046fb03c78 EFLAGS: 00010286
RAX: 0000000040000003 RBX: ffff88035a6c1500 RCX: ffffffff81ec5dc0
RDX: ffff88033192a090 RSI: ffff88033192a0a0 RDI: 0000000000000000
RBP: ffff88046fb03cb0 R08: 0000000040000003 R09: ffff8803eb770d00
R10: 0000000000000000 R11: 0000000000000000 R12: ffff88033192a0a0
R13: ffff88033192a090 R14: 0000000000000000 R15: ffff8803d748d700
FS:  0000000000000000(0000) GS:ffff88046fb00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000054 CR3: 000000000220a002 CR4: 00000000001606e0
Call Trace:
  <IRQ>
  ip6_rt_cache_alloc+0x20/0x100
  __ip6_rt_update_pmtu+0xae/0x180
  ip6_tnl_xmit+0x330/0x970 [ip6_tunnel]
  ? __gre6_xmit+0x2d5/0x540 [ip6_gre]
  ? ip6_forward+0x522/0x7e0
  ? ip6_tnl_parse_tlv_enc_lim+0x59/0x190 [ip6_tunnel]
  ? ip6gre_tunnel_xmit+0xe3/0x320 [ip6_gre]
  ip6gre_tunnel_xmit+0xe3/0x320 [ip6_gre]
  dev_hard_start_xmit+0x9e/0x200
  sch_direct_xmit+0xeb/0x250
  __qdisc_run+0x146/0x510
  net_tx_action+0xde/0x210
  __do_softirq+0xd8/0x2a8
  irq_exit+0xa8/0xb0
  smp_apic_timer_interrupt+0x6c/0x120
  apic_timer_interrupt+0xf/0x20
  </IRQ>
RIP: 0010:poll_idle+0x31/0x61
RSP: 0018:ffffc9000328fed8 EFLAGS: 00000246
  ORIG_RAX: ffffffffffffff12
RAX: 0000000000000000 RBX: ffffffff822da9e0 RCX: ffff88046d4e7000
RDX: 0000000000000000 RSI: ffffffff822da9e0 RDI: ffffe8fc00301c00
RBP: ffffe8fc00301c00 R08: 0000000000000f1a R09: 0000000000000001
R10: ffffc9000328fec8 R11: 0000000000000f15 R12: 0000000000000000
R13: ffffffff822da9f8 R14: 0000000000000000 R15: 00002e37d560bb8e
  ? acpi_idle_do_entry+0x40/0x40
  cpuidle_enter_state+0x70/0x2a0
  do_idle+0xdf/0x170
  cpu_startup_entry+0x19/0x20
  secondary_startup_64+0xa5/0xb0
Code: d7 be 01 00 00 00 48 83 e0 fe 48 8b 00 48 89 42 10 ba 0f 00 00 00 e9 7a fe ff ff 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 53 <f7> 47 54 00 00 10 80 48 8b 9f a8 00 00 00 74 22 8b 83 0c 02 00
RIP: ip6_rt_get_dev_rcu+0x6/0x60 RSP: ffff88046fb03c78
CR2: 0000000000000054

Our internal experiments showed that an early version of 4.16 works fine.
The problem above showed up after we backported some IPv6 route-related
changes.

Has anybody seen this issue?

Thanks!


* Re: oops with ip6_rt_cache_alloc
  2018-08-24 22:26 oops with ip6_rt_cache_alloc Yonghong Song
@ 2018-08-24 23:04 ` David Ahern
  2018-08-27  4:57   ` Yonghong Song
  0 siblings, 1 reply; 3+ messages in thread
From: David Ahern @ 2018-08-24 23:04 UTC (permalink / raw)
  To: Yonghong Song, netdev, Alexei Starovoitov, Martin Lau, Dave Jones

On 8/24/18 4:26 PM, Yonghong Song wrote:
> Hi,
> 
> We got a kernel oops with the following stack trace:
> 
> CPU: 24 PID: 0 Comm: swapper/24 Not tainted
> 4.16.0-10_fbk1_1183_g7e4ee4c8171c #10
> "Hardware name: Quanta Leopard-DDR3/Leopard-DDR3, BIOS F06_3A16.DDR3
> 11/19/2015"
> RIP: 0010:ip6_rt_get_dev_rcu+0x6/0x60
> RSP: 0018:ffff88046fb03c78 EFLAGS: 00010286
> RAX: 0000000040000003 RBX: ffff88035a6c1500 RCX: ffffffff81ec5dc0
> RDX: ffff88033192a090 RSI: ffff88033192a0a0 RDI: 0000000000000000

RDI = 0 means the rt passed to ip6_rt_get_dev_rcu is NULL. I believe
that can't happen prior to the fib6_info changes. After the fib6_info
changes, it means the 'from' is NULL and that is not expected.
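
The register reading above can be cross-checked against the "Code:" line
of the oops. The following is only a minimal Python sketch, not a general
disassembler: it handles the single encoding that appears at the <f7>
marker (TEST r/m32, imm32 with an 8-bit displacement), and the helper
names are made up for illustration. It shows that the faulting
instruction reads from RDI + 0x54, which with RDI = 0 gives the CR2 value
in the report. Which struct field lives at offset 0x54 is not stated in
this thread, so the sketch does not try to name it.

```python
# Bytes copied from the "Code:" line of the oops; <f7> marks the faulting byte.
CODE_LINE = (
    "d7 be 01 00 00 00 48 83 e0 fe 48 8b 00 48 89 42 10 ba 0f 00 00 00 "
    "e9 7a fe ff ff 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 53 <f7> 47 "
    "54 00 00 10 80 48 8b 9f a8 00 00 00 74 22 8b 83 0c 02 00"
)

# 64-bit register names indexed by the 3-bit ModRM.rm field (no REX prefix).
REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]


def faulting_bytes(code_line):
    """Return the bytes starting at the <xx> marker (hypothetical helper)."""
    tokens = code_line.split()
    start = next(i for i, t in enumerate(tokens) if t.startswith("<"))
    return bytes(int(t.strip("<>"), 16) for t in tokens[start:])


def decode_test_disp8(insn):
    """Decode only the form: f7 /0, ModRM with mod=01, disp8, imm32."""
    opcode, modrm = insn[0], insn[1]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    # rm == 4 would require a SIB byte, which this sketch does not handle.
    if opcode != 0xF7 or reg != 0 or mod != 1 or rm == 4:
        raise ValueError("not the simple TEST [reg+disp8], imm32 form")
    disp = int.from_bytes(insn[2:3], "little", signed=True)
    imm = int.from_bytes(insn[3:7], "little")
    return REG64[rm], disp, imm


base, disp, imm = decode_test_disp8(faulting_bytes(CODE_LINE))
rdi = 0x0  # RDI value from the register dump
fault_addr = rdi + disp

print(f"test dword ptr [{base}+{disp:#x}], {imm:#x}")
print(f"fault address with RDI=0: {fault_addr:#x}")
```

The computed address can be compared with the "CR2:" line of the trace.
For real debugging, the kernel tree's scripts/decodecode does this
decoding properly.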

...

> Our internal experiments showed that an early version of 4.16 works fine.
> The problem above showed up after we backported some IPv6 route-related
> changes.

Can you run the test on 4.18?


* Re: oops with ip6_rt_cache_alloc
  2018-08-24 23:04 ` David Ahern
@ 2018-08-27  4:57   ` Yonghong Song
  0 siblings, 0 replies; 3+ messages in thread
From: Yonghong Song @ 2018-08-27  4:57 UTC (permalink / raw)
  To: David Ahern, netdev, Alexei Starovoitov, Martin Lau, Dave Jones



On 8/24/18 4:04 PM, David Ahern wrote:
> On 8/24/18 4:26 PM, Yonghong Song wrote:
>> Hi,
>>
>> We got a kernel oops with the following stack trace:
>>
>> CPU: 24 PID: 0 Comm: swapper/24 Not tainted
>> 4.16.0-10_fbk1_1183_g7e4ee4c8171c #10
>> "Hardware name: Quanta Leopard-DDR3/Leopard-DDR3, BIOS F06_3A16.DDR3
>> 11/19/2015"
>> RIP: 0010:ip6_rt_get_dev_rcu+0x6/0x60
>> RSP: 0018:ffff88046fb03c78 EFLAGS: 00010286
>> RAX: 0000000040000003 RBX: ffff88035a6c1500 RCX: ffffffff81ec5dc0
>> RDX: ffff88033192a090 RSI: ffff88033192a0a0 RDI: 0000000000000000
> 
> RDI = 0 means the rt passed to ip6_rt_get_dev_rcu is NULL. I believe
> that can't happen prior to the fib6_info changes. After the fib6_info
> changes, it means the 'from' is NULL and that is not expected.
> 
> ...
> 
>> Our internal experiments showed that an early version of 4.16 works fine.
>> The problem above showed up after we backported some IPv6 route-related
>> changes.
> 
> Can you run the test on 4.18?

We will give it a try with 4.18. Thanks.
