All of lore.kernel.org
 help / color / mirror / Atom feed
* INFO: rcu detected stall in xfrm_hash_rebuild
@ 2018-12-10  7:50 syzbot
  2018-12-10 12:47 ` Florian Westphal
  0 siblings, 1 reply; 10+ messages in thread
From: syzbot @ 2018-12-10  7:50 UTC (permalink / raw)
  To: davem, herbert, linux-kernel, netdev, steffen.klassert, syzkaller-bugs

Hello,

syzbot found the following crash on:

HEAD commit:    74c4a24df7ca Add linux-next specific files for 20181207
git tree:       linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=17dbb3d5400000
kernel config:  https://syzkaller.appspot.com/x/.config?x=6e9413388bf37bed
dashboard link: https://syzkaller.appspot.com/bug?extid=592df0494801b6648ec6
compiler:       gcc (GCC) 8.0.1 20180413 (experimental)

Unfortunately, I don't have any reproducer for this crash yet.

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+592df0494801b6648ec6@syzkaller.appspotmail.com

device bridge_slave_1 entered promiscuous mode
IPv6: ADDRCONF(NETDEV_UP): veth0_to_bridge: link is not ready
IPv6: ADDRCONF(NETDEV_UP): veth1_to_bridge: link is not ready
bond0: Enslaving bond_slave_0 as an active interface with an up link
bond0: Enslaving bond_slave_1 as an active interface with an up link
rcu: INFO: rcu_preempt self-detected stall on CPU
rcu: 	1-....: (1 GPs behind) idle=3e2/1/0x4000000000000002  
softirq=136882/136885 fqs=5250
rcu: 	 (t=10501 jiffies g=189801 q=2008)
NMI backtrace for cpu 1
CPU: 1 PID: 6200 Comm: kworker/1:2 Not tainted 4.20.0-rc5-next-20181207+  
#163
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS  
Google 01/01/2011
Workqueue: events xfrm_hash_rebuild
Call Trace:
  <IRQ>
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0x244/0x39d lib/dump_stack.c:113
  nmi_cpu_backtrace.cold.2+0x5c/0xa1 lib/nmi_backtrace.c:101
  nmi_trigger_cpumask_backtrace+0x1e8/0x22a lib/nmi_backtrace.c:62
  arch_trigger_cpumask_backtrace+0x14/0x20 arch/x86/kernel/apic/hw_nmi.c:38
  trigger_single_cpu_backtrace include/linux/nmi.h:164 [inline]
  rcu_dump_cpu_stacks+0x16f/0x1bc kernel/rcu/tree.c:1211
  print_cpu_stall.cold.70+0x218/0x40a kernel/rcu/tree.c:1348
  check_cpu_stall kernel/rcu/tree.c:1422 [inline]
  rcu_pending kernel/rcu/tree.c:3018 [inline]
  rcu_check_callbacks+0xf3b/0x13f0 kernel/rcu/tree.c:2521
  update_process_times+0x2d/0x70 kernel/time/timer.c:1635
  tick_sched_handle+0x9f/0x180 kernel/time/tick-sched.c:161
  tick_sched_timer+0x45/0x130 kernel/time/tick-sched.c:1271
  __run_hrtimer kernel/time/hrtimer.c:1389 [inline]
  __hrtimer_run_queues+0x41c/0x10d0 kernel/time/hrtimer.c:1451
  hrtimer_interrupt+0x313/0x780 kernel/time/hrtimer.c:1509
  local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1034 [inline]
  smp_apic_timer_interrupt+0x1a1/0x760 arch/x86/kernel/apic/apic.c:1059
  apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:804
  </IRQ>
RIP: 0010:__read_once_size include/linux/compiler.h:182 [inline]
RIP: 0010:check_kcov_mode kernel/kcov.c:69 [inline]
RIP: 0010:write_comp_data+0x22/0x70 kernel/kcov.c:122
Code: 90 90 90 90 90 90 90 90 55 48 89 e5 65 4c 8b 04 25 40 ee 01 00 65 8b  
05 7c f6 81 7e a9 00 01 1f 00 75 51 41 8b 80 d8 12 00 00 <83> f8 03 75 45  
49 8b 80 e0 12 00 00 45 8b 80 dc 12 00 00 4c 8b 08
RSP: 0018:ffff8881872f7320 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
RAX: 0000000000000000 RBX: 0000000000000003 RCX: ffffffff86d1d8db
RDX: 0000000000000003 RSI: 000000000000000e RDI: 0000000000000005
RBP: ffff8881872f7320 R08: ffff8881872ea1c0 R09: 0000000000000008
R10: 0000000000000005 R11: ffff8881872ea1c0 R12: dffffc0000000000
R13: ffff8881b628ca14 R14: ffff8881b4050114 R15: 0000000000000000
  __sanitizer_cov_trace_const_cmp4+0x16/0x20 kernel/kcov.c:188
  selector_cmp net/xfrm/xfrm_policy.c:1390 [inline]
  xfrm_policy_insert_list+0x62b/0x1020 net/xfrm/xfrm_policy.c:1534
  xfrm_policy_inexact_insert+0x166/0xee0 net/xfrm/xfrm_policy.c:1195
  xfrm_hash_rebuild+0xfba/0x1380 net/xfrm/xfrm_policy.c:1317
  process_one_work+0xc90/0x1c40 kernel/workqueue.c:2153
  worker_thread+0x17f/0x1390 kernel/workqueue.c:2296
  kthread+0x35a/0x440 kernel/kthread.c:246
  ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:352


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with  
syzbot.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: INFO: rcu detected stall in xfrm_hash_rebuild
  2018-12-10  7:50 INFO: rcu detected stall in xfrm_hash_rebuild syzbot
@ 2018-12-10 12:47 ` Florian Westphal
  2018-12-10 17:58   ` David Miller
  0 siblings, 1 reply; 10+ messages in thread
From: Florian Westphal @ 2018-12-10 12:47 UTC (permalink / raw)
  To: syzbot
  Cc: davem, herbert, linux-kernel, netdev, steffen.klassert, syzkaller-bugs

syzbot <syzbot+592df0494801b6648ec6@syzkaller.appspotmail.com> wrote:
> Hello,
> 
> syzbot found the following crash on:
[..]

> Workqueue: events xfrm_hash_rebuild

Ignoring this report for a second -- I think it makes sense to see
if we can just remove the entire hash table rebuild/resize code.

After recent tree conversion, we could probably make the exact policies
part of the 'inexact tree' (which would be renamed to 'policy tree' or
some such).

Special-casing the exact policies made a lot of sense when we had
a single list for the inexact policies (to keep its length down).

But now I think we could try to unify all of this and only maintain
the existing tree-based storage.

Would also remove the need to do lookups in two different
data structures (bydst-hash-then-inexact-tree).

What do you think?

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: INFO: rcu detected stall in xfrm_hash_rebuild
  2018-12-10 12:47 ` Florian Westphal
@ 2018-12-10 17:58   ` David Miller
  2018-12-14 13:11     ` Wolfgang Walter
  0 siblings, 1 reply; 10+ messages in thread
From: David Miller @ 2018-12-10 17:58 UTC (permalink / raw)
  To: fw
  Cc: syzbot+592df0494801b6648ec6, herbert, linux-kernel, netdev,
	steffen.klassert, syzkaller-bugs

From: Florian Westphal <fw@strlen.de>
Date: Mon, 10 Dec 2018 13:47:24 +0100

> After recent tree conversion, we could probably make the exact policies
> part of the 'inexact tree' (which would be renamed to 'policy tree' or
> some such).
> 
> Special-casing the exact policies made a lot of sense when we had
> a single list for the inexact policies (to keep its length down).
> 
> But now I think we could try to unify all of this and only maintain
> the existing tree-based storage.
> 
> Would also remove the need to do lookups in two different
> data structures (bydst-hash-then-inexact-tree).
> 
> What do you think?

I think this makes a lot of sense.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: INFO: rcu detected stall in xfrm_hash_rebuild
  2018-12-10 17:58   ` David Miller
@ 2018-12-14 13:11     ` Wolfgang Walter
  2018-12-14 14:35       ` Florian Westphal
  0 siblings, 1 reply; 10+ messages in thread
From: Wolfgang Walter @ 2018-12-14 13:11 UTC (permalink / raw)
  To: Florian Westphal
  Cc: David Miller, herbert, linux-kernel, netdev, steffen.klassert

Am Montag, 10. Dezember 2018, 09:58:56 schrieb David Miller:
> From: Florian Westphal <fw@strlen.de>
> Date: Mon, 10 Dec 2018 13:47:24 +0100
> 
> > After recent tree conversion, we could probably make the exact policies
> > part of the 'inexact tree' (which would be renamed to 'policy tree' or
> > some such).
> > 
> > Special-casing the exact policies made a lot of sense when we had
> > a single list for the inexact policies (to keep its length down).
> > 
> > But now I think we could try to unify all of this and only maintain
> > the existing tree-based storage.
> > 
> > Would also remove the need to do lookups in two different
> > data structures (bydst-hash-then-inexact-tree).
> > 
> > What do you think?
> 
> I think this makes a lot of sense.

Sites mainly using tunnel mode this certainly makes sense.

I'm not so sure for transport mode. With transport mode the netmask usually is 
/32 or /128, respectively (there may also be trap-rules). So a site only using 
transport mode (road warrior scenario, for example) may see a large 
performance regression if this is changed. They may do not have many entries 
in the inexact list if any at all.

Maybe there are a lot more transport mode users than tunnel mode users, this 
would explain why the removal of the flowcache did not hit that many people.

We do not use transport mode, so I'm not familiar how strongswan for example 
handles that. I think that since 5.3 or so strongwans allows a catch rule 
(inexact) and then inserts exact policy rules on the fly. But I don't know for 
sure. There are a lot of tests on strongswan for different scenarios which 
also demonstrate how policy and state table finally will look like on all 
hosts.

Here is one with such a scenario (transport mode trap policy on a gateway, 
three road warriors):

https://www.strongswan.org/testing/testresults/ikev2/trap-any/

So I would try to find users who are heavy users of transport mode and see how 
this change would impact there performance.

Regards,
-- 
Wolfgang Walter
Studentenwerk München
Anstalt des öffentlichen Rechts

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: INFO: rcu detected stall in xfrm_hash_rebuild
  2018-12-14 13:11     ` Wolfgang Walter
@ 2018-12-14 14:35       ` Florian Westphal
  2018-12-14 14:56         ` Herbert Xu
  2018-12-14 16:04         ` Christophe Gouault
  0 siblings, 2 replies; 10+ messages in thread
From: Florian Westphal @ 2018-12-14 14:35 UTC (permalink / raw)
  To: Wolfgang Walter
  Cc: Florian Westphal, David Miller, herbert, linux-kernel, netdev,
	steffen.klassert, christophe.gouault

Wolfgang Walter <linux@stwm.de> wrote:

[ CCing Christophe ]

> Am Montag, 10. Dezember 2018, 09:58:56 schrieb David Miller:
> > From: Florian Westphal <fw@strlen.de>
> > Date: Mon, 10 Dec 2018 13:47:24 +0100
> > 
> > > After recent tree conversion, we could probably make the exact policies
> > > part of the 'inexact tree' (which would be renamed to 'policy tree' or
> > > some such).
> > > 
> > > Special-casing the exact policies made a lot of sense when we had
> > > a single list for the inexact policies (to keep its length down).
> > > 
> > > But now I think we could try to unify all of this and only maintain
> > > the existing tree-based storage.
> > > 
> > > Would also remove the need to do lookups in two different
> > > data structures (bydst-hash-then-inexact-tree).
> > > 
> > > What do you think?
> > 
> > I think this makes a lot of sense.
> 
> Sites mainly using tunnel mode this certainly makes sense.

Ok.  An alternative would be to remove the support for
policy hash table thresholds (which decide what kinds of policies
go to exact table and which ones go into inexact ones), i.e.
partially revert 880a6fab8f6ba5b5abe59ea6
("xfrm: configure policy hash table thresholds by netlink").

This would remove the need for the rehashing support that
re-sorts the policies into either exact/inexact lists) when the
those tunables are changed.

We could also easily convert the exact table to an rhashtable
then if we wanted to.

I guess we should probably wait to get some operational feedback on the
inexact storage first to see if it really improves things enough to
make threshold tuning unneccessary.

Christophe, whats your take?

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: INFO: rcu detected stall in xfrm_hash_rebuild
  2018-12-14 14:35       ` Florian Westphal
@ 2018-12-14 14:56         ` Herbert Xu
  2018-12-14 16:04         ` Christophe Gouault
  1 sibling, 0 replies; 10+ messages in thread
From: Herbert Xu @ 2018-12-14 14:56 UTC (permalink / raw)
  To: Florian Westphal
  Cc: Wolfgang Walter, David Miller, linux-kernel, netdev,
	steffen.klassert, christophe.gouault

On Fri, Dec 14, 2018 at 03:35:32PM +0100, Florian Westphal wrote:
>
> Ok.  An alternative would be to remove the support for
> policy hash table thresholds (which decide what kinds of policies
> go to exact table and which ones go into inexact ones), i.e.
> partially revert 880a6fab8f6ba5b5abe59ea6
> ("xfrm: configure policy hash table thresholds by netlink").
> 
> This would remove the need for the rehashing support that
> re-sorts the policies into either exact/inexact lists) when the
> those tunables are changed.
> 
> We could also easily convert the exact table to an rhashtable
> then if we wanted to.

We could also do both.  In fact that was the reason why I started
working on rhashtable in the first place.  The idea is to extend
the run-time rehashing to include both parts of the database.

So you would look up the old table/list combo and then move onto
the new one.

Of course I never had time to finish this and I think the entity
asking for this has moved onto something else.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: INFO: rcu detected stall in xfrm_hash_rebuild
  2018-12-14 14:35       ` Florian Westphal
  2018-12-14 14:56         ` Herbert Xu
@ 2018-12-14 16:04         ` Christophe Gouault
  2018-12-14 16:23           ` Florian Westphal
  1 sibling, 1 reply; 10+ messages in thread
From: Christophe Gouault @ 2018-12-14 16:04 UTC (permalink / raw)
  To: Florian Westphal
  Cc: Wolfgang Walter, David Miller, Herbert Xu, linux-kernel, netdev,
	Steffen Klassert

Le ven. 14 déc. 2018 à 15:35, Florian Westphal <fw@strlen.de> a écrit :
> Ok.  An alternative would be to remove the support for
> policy hash table thresholds (which decide what kinds of policies
> go to exact table and which ones go into inexact ones), i.e.
> partially revert 880a6fab8f6ba5b5abe59ea6
> ("xfrm: configure policy hash table thresholds by netlink").
>
> This would remove the need for the rehashing support that
> re-sorts the policies into either exact/inexact lists) when the
> those tunables are changed.
>
> We could also easily convert the exact table to an rhashtable
> then if we wanted to.
>
> I guess we should probably wait to get some operational feedback on the
> inexact storage first to see if it really improves things enough to
> make threshold tuning unneccessary.
>
> Christophe, whats your take?

Hi Florian,

The main use cases I have encountered and tried to address with the
hash-based lookup were network operator use cases:
- a lot of dynamic /32 <=> /32 policies (protecting GTP tunnels)
- or a lot of dynamic policies with the same prefix lengths (e.g. /16 <=> /24)
and a few non-hashed policies stored in the linked list.

This solutions gives good performance for both lookup *and*
configuration with a big number of SPs. The lookup time is essential,
but so is the IKE negotiation rate. And the configuration time has a
direct impact on it.

I have not yet had time to benchmark the new tree-based implementation.

Could you verify how the configuration time would behave in such use
cases if the hash-based lookup was replaced by a tree-based lookup?

Regards
Christophe

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: INFO: rcu detected stall in xfrm_hash_rebuild
  2018-12-14 16:04         ` Christophe Gouault
@ 2018-12-14 16:23           ` Florian Westphal
  2018-12-14 16:28             ` Christophe Gouault
  2018-12-14 19:07             ` David Miller
  0 siblings, 2 replies; 10+ messages in thread
From: Florian Westphal @ 2018-12-14 16:23 UTC (permalink / raw)
  To: Christophe Gouault
  Cc: Florian Westphal, Wolfgang Walter, David Miller, Herbert Xu,
	linux-kernel, netdev, Steffen Klassert

Christophe Gouault <christophe.gouault@6wind.com> wrote:
> The main use cases I have encountered and tried to address with the
> hash-based lookup were network operator use cases:
> - a lot of dynamic /32 <=> /32 policies (protecting GTP tunnels)
> - or a lot of dynamic policies with the same prefix lengths (e.g. /16 <=> /24)
> and a few non-hashed policies stored in the linked list.

Thanks for the detailed information.

[..]

> Could you verify how the configuration time would behave in such use
> cases if the hash-based lookup was replaced by a tree-based lookup?

I won't send a patch to remove your work, at least not at this time.

In case I'd do this removal (thresholds, hash table, or both)
i will make these tests to see how large the impact is.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: INFO: rcu detected stall in xfrm_hash_rebuild
  2018-12-14 16:23           ` Florian Westphal
@ 2018-12-14 16:28             ` Christophe Gouault
  2018-12-14 19:07             ` David Miller
  1 sibling, 0 replies; 10+ messages in thread
From: Christophe Gouault @ 2018-12-14 16:28 UTC (permalink / raw)
  To: Florian Westphal
  Cc: Wolfgang Walter, David Miller, Herbert Xu, linux-kernel, netdev,
	Steffen Klassert

Le ven. 14 déc. 2018 à 17:23, Florian Westphal <fw@strlen.de> a écrit :
> I won't send a patch to remove your work, at least not at this time.
>
> In case I'd do this removal (thresholds, hash table, or both)
> i will make these tests to see how large the impact is.

Perfect, thanks
Christophe

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: INFO: rcu detected stall in xfrm_hash_rebuild
  2018-12-14 16:23           ` Florian Westphal
  2018-12-14 16:28             ` Christophe Gouault
@ 2018-12-14 19:07             ` David Miller
  1 sibling, 0 replies; 10+ messages in thread
From: David Miller @ 2018-12-14 19:07 UTC (permalink / raw)
  To: fw
  Cc: christophe.gouault, linux, herbert, linux-kernel, netdev,
	steffen.klassert

From: Florian Westphal <fw@strlen.de>
Date: Fri, 14 Dec 2018 17:23:33 +0100

> Christophe Gouault <christophe.gouault@6wind.com> wrote:
>> The main use cases I have encountered and tried to address with the
>> hash-based lookup were network operator use cases:
>> - a lot of dynamic /32 <=> /32 policies (protecting GTP tunnels)
>> - or a lot of dynamic policies with the same prefix lengths (e.g. /16 <=> /24)
>> and a few non-hashed policies stored in the linked list.
> 
> Thanks for the detailed information.
> 
> [..]
> 
>> Could you verify how the configuration time would behave in such use
>> cases if the hash-based lookup was replaced by a tree-based lookup?
> 
> I won't send a patch to remove your work, at least not at this time.
> 
> In case I'd do this removal (thresholds, hash table, or both)
> i will make these tests to see how large the impact is.

Florian, thanks for working so hard on this.

The operational feedback is absolutely essential for figuring out how
to move forward on this issue, otherwise we'll just keep going back
and forth on the approach.

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2018-12-14 19:07 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-12-10  7:50 INFO: rcu detected stall in xfrm_hash_rebuild syzbot
2018-12-10 12:47 ` Florian Westphal
2018-12-10 17:58   ` David Miller
2018-12-14 13:11     ` Wolfgang Walter
2018-12-14 14:35       ` Florian Westphal
2018-12-14 14:56         ` Herbert Xu
2018-12-14 16:04         ` Christophe Gouault
2018-12-14 16:23           ` Florian Westphal
2018-12-14 16:28             ` Christophe Gouault
2018-12-14 19:07             ` David Miller

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.