netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* kernel panic in qdisc_match_from_root starting openvswitch
@ 2016-08-16 18:27 David Ahern
  2016-08-16 18:35 ` Cong Wang
  2016-08-16 19:18 ` Jiri Kosina
  0 siblings, 2 replies; 5+ messages in thread
From: David Ahern @ 2016-08-16 18:27 UTC (permalink / raw)
  To: jkosina; +Cc: netdev, Simon Horman

Jiri:

I am hitting a kernel panic with '/etc/init.d/openvswitch-switch restart' Stack trace below. Reverting 59cc1f61f09c ("net: sched: convert qdisc linked list to hashtable") clears the problem.


    [   30.664503] BUG: unable to handle kernel NULL pointer dereference at 0000000000000418
    [   30.665960] IP: [<ffffffff813ecf65>] qdisc_match_from_root+0x2a/0x58
    [   30.667146] PGD 134bfd067 PUD 1377d2067 PMD 0
    [   30.667983] Oops: 0000 [#1] SMP
    [   30.668574] Modules linked in: openvswitch nf_conntrack_ipv6 nf_nat_ipv6 nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_defrag_ipv6 nf_nat nf_conntrack libcrc32c crc32c_intel 8021q garp mrp stp llc vrf
    [   30.672770] CPU: 3 PID: 888 Comm: ovs-vswitchd Not tainted 4.7.0+ #69
    [   30.674167] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
    [   30.676353] task: ffff880134031240 task.stack: ffff880134bd4000
    [   30.677633] RIP: 0010:[<ffffffff813ecf65>]  [<ffffffff813ecf65>] qdisc_match_from_root+0x2a/0x58
    [   30.679555] RSP: 0018:ffff880134bd7ae0  EFLAGS: 00010202
    [   30.680689] RAX: 0000000000000410 RBX: 0000000000010000 RCX: ffff8801340bd800
    [   30.682222] RDX: ffffffff81a8e380 RSI: 0000000000010000 RDI: ffffffff81a8e240
    [   30.683751] RBP: ffff880134bd7ae0 R08: 0000000000000000 R09: ffff880134031240
    [   30.685281] R10: 0000000000000024 R11: 000000007ffff000 R12: ffff8801340bd800
    [   30.686807] R13: ffff8801340bd800 R14: ffff880139976a00 R15: ffffffff81a886c0
    [   30.688338] FS:  00007f4753c6b940(0000) GS:ffff88013fd80000(0000) knlGS:0000000000000000
    [   30.690074] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [   30.691321] CR2: 0000000000000418 CR3: 00000001340f2000 CR4: 00000000000406e0
    [   30.692863] Stack:
    [   30.693310]  ffff880134bd7b00 ffffffff813eec03 ffff8801399b0e00 0000000000000000
    [   30.695013]  ffff880134bd7b98 ffffffff813ef0c1 ffff880134bd7b80 ffffffff811e2309
    [   30.696696]  0000000000000000 0000000000000000 0000000000000000 0000000000000000
    [   30.698408] Call Trace:
    [   30.698958]  [<ffffffff813eec03>] qdisc_lookup+0x18/0x3a
    [   30.700113]  [<ffffffff813ef0c1>] tc_get_qdisc+0x102/0x19d
    [   30.701323]  [<ffffffff811e2309>] ? jbd2_journal_stop+0x6c/0x33e
    [   30.702616]  [<ffffffff813dd4ef>] rtnetlink_rcv_msg+0x178/0x187
    [   30.703911]  [<ffffffff81131d4a>] ? __kmalloc_track_caller+0xb1/0xc3
    [   30.705285]  [<ffffffff813bc885>] ? __alloc_skb+0x7b/0x19c
    [   30.706483]  [<ffffffff813f3be6>] ? rht_key_hashfn.isra.20.constprop.55+0x18/0x24
    [   30.708088]  [<ffffffff813dd377>] ? rtnl_newlink+0x6cf/0x6cf
    [   30.709327]  [<ffffffff813f60bd>] netlink_rcv_skb+0x45/0x8b
    [   30.710536]  [<ffffffff813d8fb4>] rtnetlink_rcv+0x1e/0x25
    [   30.711699]  [<ffffffff813f5c26>] netlink_unicast+0x114/0x192
    [   30.712948]  [<ffffffff813f5f71>] netlink_sendmsg+0x2cd/0x2ed
    [   30.714195]  [<ffffffff813b544b>] sock_sendmsg_nosec+0x12/0x1d
    [   30.715448]  [<ffffffff813b547a>] sock_sendmsg+0x24/0x29
    [   30.716602]  [<ffffffff813b5f30>] ___sys_sendmsg+0x1ad/0x21b
    [   30.717840]  [<ffffffff813c67c6>] ? dev_name_hash+0x27/0x40
    [   30.719039]  [<ffffffff813c6866>] ? dev_get_by_name_rcu+0x34/0x53
    [   30.720348]  [<ffffffff813e2062>] ? dev_ioctl+0x48f/0x5ab
    [   30.721532]  [<ffffffff813b32a5>] ? sock_do_ioctl+0x34/0x3d
    [   30.722734]  [<ffffffff813b32a5>] ? sock_do_ioctl+0x34/0x3d
    [   30.723947]  [<ffffffff81157dd3>] ? __fget+0x2a/0x64
    [   30.725018]  [<ffffffff813b6252>] __sys_sendmsg+0x40/0x5e
    [   30.726193]  [<ffffffff813b6252>] ? __sys_sendmsg+0x40/0x5e
    [   30.727389]  [<ffffffff813b6284>] SyS_sendmsg+0x14/0x16
    [   30.728523]  [<ffffffff814ae032>] entry_SYSCALL_64_fastpath+0x1a/0xa4
    [   30.729667] Code: c3 f6 47 10 01 55 48 89 e5 75 08 39 77 38 48 89 f8 74 44 69 c6 47 86 c8 61 48 8b 57 48 c1 e8 1c 48 8d 04 c5 d0 03 00 00 48 03 02 <48> 8b 50 08 31 c0 48 8d 4a d8 48 85 d2 48 0f 45 c1 48 85 c0 74
    [   30.733052] RIP  [<ffffffff813ecf65>] qdisc_match_from_root+0x2a/0x58
    [   30.733877]  RSP <ffff880134bd7ae0>
    [   30.734313] CR2: 0000000000000418

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: kernel panic in qdisc_match_from_root starting openvswitch
  2016-08-16 18:27 kernel panic in qdisc_match_from_root starting openvswitch David Ahern
@ 2016-08-16 18:35 ` Cong Wang
  2016-08-16 19:18 ` Jiri Kosina
  1 sibling, 0 replies; 5+ messages in thread
From: Cong Wang @ 2016-08-16 18:35 UTC (permalink / raw)
  To: David Ahern; +Cc: Jiri Kosina, netdev, Simon Horman

On Tue, Aug 16, 2016 at 11:27 AM, David Ahern <dsa@cumulusnetworks.com> wrote:
> Jiri:
>
> I am hitting a kernel panic with '/etc/init.d/openvswitch-switch restart' Stack trace below. Reverting 59cc1f61f09c ("net: sched: convert qdisc linked list to hashtable") clears the problem.
>

Jiri is working on it:
https://patchwork.ozlabs.org/patch/659633/

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: kernel panic in qdisc_match_from_root starting openvswitch
  2016-08-16 18:27 kernel panic in qdisc_match_from_root starting openvswitch David Ahern
  2016-08-16 18:35 ` Cong Wang
@ 2016-08-16 19:18 ` Jiri Kosina
  2016-08-16 19:51   ` Jiri Kosina
  1 sibling, 1 reply; 5+ messages in thread
From: Jiri Kosina @ 2016-08-16 19:18 UTC (permalink / raw)
  To: David Ahern; +Cc: netdev, Simon Horman

On Tue, 16 Aug 2016, David Ahern wrote:

> Jiri:
> 
> I am hitting a kernel panic with '/etc/init.d/openvswitch-switch 
> restart' Stack trace below. Reverting 59cc1f61f09c ("net: sched: convert 
> qdisc linked list to hashtable") clears the problem.

Thanks a lot for the report. Could you please test either with the 
attached two patches applied on top of net-next, or alternatively with 
'might-rebase/net-sched-qdiscs' branch of

	git://git.kernel.org/pub/scm/linux/kernel/git/jikos/jikos.git

(HEAD of that branch is 60b3487ad7f) and report back to me whether it 
fixes any issues you are seeing? Applying the attached patches on top of 
net-next or testing with the git branch above should be equivalent test 
wrt. qdiscs.

Thanks,

-- 
Jiri Kosina
SUSE Labs

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: kernel panic in qdisc_match_from_root starting openvswitch
  2016-08-16 19:18 ` Jiri Kosina
@ 2016-08-16 19:51   ` Jiri Kosina
  2016-08-16 20:06     ` David Ahern
  0 siblings, 1 reply; 5+ messages in thread
From: Jiri Kosina @ 2016-08-16 19:51 UTC (permalink / raw)
  To: David Ahern; +Cc: netdev, Simon Horman

[-- Attachment #1: Type: TEXT/PLAIN, Size: 828 bytes --]

On Tue, 16 Aug 2016, Jiri Kosina wrote:

> > I am hitting a kernel panic with '/etc/init.d/openvswitch-switch 
> > restart' Stack trace below. Reverting 59cc1f61f09c ("net: sched: convert 
> > qdisc linked list to hashtable") clears the problem.
> 
> Thanks a lot for the report. Could you please test either with the 
> attached two patches applied on top of net-next, or alternatively with 
> 'might-rebase/net-sched-qdiscs' branch of
> 
> 	git://git.kernel.org/pub/scm/linux/kernel/git/jikos/jikos.git
> 
> (HEAD of that branch is 60b3487ad7f) and report back to me whether it 
> fixes any issues you are seeing? Applying the attached patches on top of 
> net-next or testing with the git branch above should be equivalent test 
> wrt. qdiscs.

And this time hopefully with the actual patches ... 

-- 
Jiri Kosina
SUSE Labs

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: Type: TEXT/x-patch; name=0001-net-sched-fix-handling-of-singleton-qdiscs-with-qdis.patch, Size: 4776 bytes --]

From 39e6390a50335e88e1bd566f3a5d2f1b09272896 Mon Sep 17 00:00:00 2001
From: Jiri Kosina <jikos@kernel.org>
Date: Fri, 12 Aug 2016 15:53:04 +0200
Subject: [PATCH 1/2] net: sched: fix handling of singleton qdiscs with
 qdisc_hash

qdisc_match_from_root() is now iterating over per-netdevice qdisc hashtable
instead of going through a linked-list of qdiscs (independently on the actual
underlying netdev), which was the case before the switch to hashtable for
qdiscs.

For singleton qdiscs, there is no underlying netdev associated though, and
therefore dumping a singleton qdisc will panic, as qdisc_dev(root) will
always be NULL.

 BUG: unable to handle kernel NULL pointer dereference at 0000000000000410
 IP: [<ffffffff8167efac>] qdisc_match_from_root+0x2c/0x70
 PGD 1aceba067 PUD 1aceb7067 PMD 0
 Oops: 0000 [#1] PREEMPT SMP
[ ... ]
 task: ffff8801ec996e00 task.stack: ffff8801ec934000
 RIP: 0010:[<ffffffff8167efac>]  [<ffffffff8167efac>] qdisc_match_from_root+0x2c/0x70
 RSP: 0018:ffff8801ec937ab0  EFLAGS: 00010203
 RAX: 0000000000000408 RBX: ffff88025e612000 RCX: ffffffffffffffd8
 RDX: 0000000000000000 RSI: 00000000ffff0000 RDI: ffffffff81cf8100
 RBP: ffff8801ec937ab0 R08: 000000000001c160 R09: ffff8802668032c0
 R10: ffffffff81cf8100 R11: 0000000000000030 R12: 00000000ffff0000
 R13: ffff88025e612000 R14: ffffffff81cf3140 R15: 0000000000000000
 FS:  00007f24b9af6740(0000) GS:ffff88026f280000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000410 CR3: 00000001aceec000 CR4: 00000000001406e0
 Stack:
  ffff8801ec937ad0 ffffffff81681210 ffff88025dd51a00 00000000fffffff1
  ffff8801ec937b88 ffffffff81681e4e ffffffff81c42bc0 ffff880262431500
  ffffffff81cf3140 ffff88025dd51a10 ffff88025dd51a24 00000000ec937b38
 Call Trace:
  [<ffffffff81681210>] qdisc_lookup+0x40/0x50
  [<ffffffff81681e4e>] tc_modify_qdisc+0x21e/0x550
  [<ffffffff8166ae25>] rtnetlink_rcv_msg+0x95/0x220
  [<ffffffff81209602>] ? __kmalloc_track_caller+0x172/0x230
  [<ffffffff8166ad90>] ? rtnl_newlink+0x870/0x870
  [<ffffffff816897b7>] netlink_rcv_skb+0xa7/0xc0
  [<ffffffff816657c8>] rtnetlink_rcv+0x28/0x30
  [<ffffffff8168919b>] netlink_unicast+0x15b/0x210
  [<ffffffff81689569>] netlink_sendmsg+0x319/0x390
  [<ffffffff816379f8>] sock_sendmsg+0x38/0x50
  [<ffffffff81638296>] ___sys_sendmsg+0x256/0x260
  [<ffffffff811b1275>] ? __pagevec_lru_add_fn+0x135/0x280
  [<ffffffff811b1a90>] ? pagevec_lru_move_fn+0xd0/0xf0
  [<ffffffff811b1140>] ? trace_event_raw_event_mm_lru_insertion+0x180/0x180
  [<ffffffff811b1b85>] ? __lru_cache_add+0x75/0xb0
  [<ffffffff817708a6>] ? _raw_spin_unlock+0x16/0x40
  [<ffffffff811d8dff>] ? handle_mm_fault+0x39f/0x1160
  [<ffffffff81638b15>] __sys_sendmsg+0x45/0x80
  [<ffffffff81638b62>] SyS_sendmsg+0x12/0x20
  [<ffffffff810038e7>] do_syscall_64+0x57/0xb0

Fix this by special-casing singleton qdiscs (those that don't have underlying
netdevice) and introduce immediate handling of those rather than trying to
go over an underlying netdevice. We're in the same situation in
tc_dump_qdisc_root() and tc_dump_tclass_root().

Ultimately, this will have to be slightly reworked so that we are actually
able to show singleton qdiscs (noop) in the dump properly; but we're not
currently doing that anyway, so no regression there, and better do this
in a gradual manner.

Fixes: 59cc1f61f ("net: sched: convert qdisc linked list to hashtable")
Reported-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
---
 net/sched/sch_api.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index c093d32..2b75d68 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -262,6 +262,9 @@ static struct Qdisc *qdisc_match_from_root(struct Qdisc *root, u32 handle)
 {
 	struct Qdisc *q;
 
+	if (!qdisc_dev(root))
+		return (root->handle == handle ? root : NULL);
+
 	if (!(root->flags & TCQ_F_BUILTIN) &&
 	    root->handle == handle)
 		return root;
@@ -1456,6 +1459,10 @@ static int tc_dump_qdisc_root(struct Qdisc *root, struct sk_buff *skb,
 			goto done;
 		q_idx++;
 	}
+
+	if (!qdisc_dev(root))
+		goto out;
+
 	hash_for_each(qdisc_dev(root)->qdisc_hash, b, q, hash) {
 		if (q_idx < s_q_idx) {
 			q_idx++;
@@ -1781,6 +1788,9 @@ static int tc_dump_tclass_root(struct Qdisc *root, struct sk_buff *skb,
 	if (tc_dump_tclass_qdisc(root, skb, tcm, cb, t_p, s_t) < 0)
 		return -1;
 
+	if (!qdisc_dev(root))
+		return 0;
+
 	hash_for_each(qdisc_dev(root)->qdisc_hash, b, q, hash) {
 		if (tc_dump_tclass_qdisc(q, skb, tcm, cb, t_p, s_t) < 0)
 			return -1;
-- 
1.9.2


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #3: Type: TEXT/x-patch; name=0002-net-sched-avoid-duplicates-in-qdisc-dump.patch, Size: 1871 bytes --]

From 60b3487ad7fad232f9e89973fee409c33a1f65a1 Mon Sep 17 00:00:00 2001
From: Jiri Kosina <jkosina@suse.cz>
Date: Tue, 16 Aug 2016 18:14:35 +0200
Subject: [PATCH 2/2] net: sched: avoid duplicates in qdisc dump

FIXME: add changelog

Signed-off-by: Jiri Kosina <jkosina@suse.cz>
---
 net/sched/sch_api.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index 2b75d68..bb569ac 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -1440,7 +1440,7 @@ err_out:
 
 static int tc_dump_qdisc_root(struct Qdisc *root, struct sk_buff *skb,
 			      struct netlink_callback *cb,
-			      int *q_idx_p, int s_q_idx)
+			      int *q_idx_p, int s_q_idx, bool one)
 {
 	int ret = 0, q_idx = *q_idx_p;
 	struct Qdisc *q;
@@ -1460,7 +1460,13 @@ static int tc_dump_qdisc_root(struct Qdisc *root, struct sk_buff *skb,
 		q_idx++;
 	}
 
-	if (!qdisc_dev(root))
+	/* If dumping singletons, there is no qdisc_dev(root) and the singleton
+	 * itself has already been dumped.
+	 *
+	 * If we've already dumped the top-level (ingress) qdisc above and the global
+	 * qdisc hashtable, we don't want to hit it again
+	 */
+	if (!qdisc_dev(root) || one)
 		goto out;
 
 	hash_for_each(qdisc_dev(root)->qdisc_hash, b, q, hash) {
@@ -1504,13 +1510,13 @@ static int tc_dump_qdisc(struct sk_buff *skb, struct netlink_callback *cb)
 			s_q_idx = 0;
 		q_idx = 0;
 
-		if (tc_dump_qdisc_root(dev->qdisc, skb, cb, &q_idx, s_q_idx) < 0)
+		if (tc_dump_qdisc_root(dev->qdisc, skb, cb, &q_idx, s_q_idx, false) < 0)
 			goto done;
 
 		dev_queue = dev_ingress_queue(dev);
 		if (dev_queue &&
 		    tc_dump_qdisc_root(dev_queue->qdisc_sleeping, skb, cb,
-				       &q_idx, s_q_idx) < 0)
+				       &q_idx, s_q_idx, true) < 0)
 			goto done;
 
 cont:
-- 
1.9.2


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: kernel panic in qdisc_match_from_root starting openvswitch
  2016-08-16 19:51   ` Jiri Kosina
@ 2016-08-16 20:06     ` David Ahern
  0 siblings, 0 replies; 5+ messages in thread
From: David Ahern @ 2016-08-16 20:06 UTC (permalink / raw)
  To: Jiri Kosina; +Cc: netdev, Simon Horman

On 8/16/16 1:51 PM, Jiri Kosina wrote:
> On Tue, 16 Aug 2016, Jiri Kosina wrote:
> 
>>> I am hitting a kernel panic with '/etc/init.d/openvswitch-switch 
>>> restart' Stack trace below. Reverting 59cc1f61f09c ("net: sched: convert 
>>> qdisc linked list to hashtable") clears the problem.
>>
>> Thanks a lot for the report. Could you please test either with the 
>> attached two patches applied on top of net-next, or alternatively with 
>> 'might-rebase/net-sched-qdiscs' branch of
>>
>> 	git://git.kernel.org/pub/scm/linux/kernel/git/jikos/jikos.git
>>
>> (HEAD of that branch is 60b3487ad7f) and report back to me whether it 
>> fixes any issues you are seeing? Applying the attached patches on top of 
>> net-next or testing with the git branch above should be equivalent test 
>> wrt. qdiscs.
> 
> And this time hopefully with the actual patches ... 
> 

Applied both and no longer see the panic starting openvswitch.

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2016-08-16 20:06 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-08-16 18:27 kernel panic in qdisc_match_from_root starting openvswitch David Ahern
2016-08-16 18:35 ` Cong Wang
2016-08-16 19:18 ` Jiri Kosina
2016-08-16 19:51   ` Jiri Kosina
2016-08-16 20:06     ` David Ahern

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).