linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] net, cgroup: Fix boot failure due to iteration of uninitialized list
@ 2012-07-19 16:27 Srivatsa S. Bhat
  2012-07-19 16:44 ` Neil Horman
  2012-07-23  1:15 ` Gao feng
  0 siblings, 2 replies; 10+ messages in thread
From: Srivatsa S. Bhat @ 2012-07-19 16:27 UTC (permalink / raw)
  To: Srivatsa S. Bhat, gaofeng, eric.dumazet, nhorman, davem
  Cc: Srivatsa S. Bhat, linux-kernel, netdev, mark.d.rustad,
	john.r.fastabend, lizefan

After commit ef209f15 (net: cgroup: fix access the unallocated memory in
netprio cgroup), boot fails with the following NULL pointer dereference:

Initializing cgroup subsys devices
Initializing cgroup subsys freezer
Initializing cgroup subsys net_cls
Initializing cgroup subsys blkio
Initializing cgroup subsys perf_event
Initializing cgroup subsys net_prio
BUG: unable to handle kernel NULL pointer dereference at 0000000000000698
IP: [<ffffffff8145e8d6>] cgrp_create+0xf6/0x190
PGD 0
Oops: 0000 [#1] SMP
CPU 0
Modules linked in:

Pid: 0, comm: swapper/0 Not tainted 3.5.0-rc7-mandeep #1 IBM IBM System x -[7870C4Q]-/68Y8033
RIP: 0010:[<ffffffff8145e8d6>]  [<ffffffff8145e8d6>] cgrp_create+0xf6/0x190
RSP: 0000:ffffffff81a01ea8  EFLAGS: 00010213
RAX: 0000000000000000 RBX: ffffffffffffff10 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000246 RDI: ffffffff81aa70a0
RBP: ffffffff81a01ed8 R08: 0000000000000000 R09: 0000000000000000
R10: ffff8808ff8641c0 R11: 6e697a696c616974 R12: 0000000000000001
R13: ffff8808ff8641c0 R14: 0000000000000000 R15: 0000000000093970
FS:  0000000000000000(0000) GS:ffff8808ffc00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000698 CR3: 0000000001a0b000 CR4: 00000000000006b0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper/0 (pid: 0, threadinfo ffffffff81a00000, task ffffffff81a13420)
Stack:
 ffffffff81a01eb8 ffffffff818060ff ffffffff81d75ec8 ffffffff81aa8960
 ffffffff81aa8960 ffffffff81b4c2c0 ffffffff81a01ef8 ffffffff81b1cb78
 0000000000000018 0000000000000048 ffffffff81a01f18 ffffffff81b1ce13
Call Trace:
 [<ffffffff81b1cb78>] cgroup_init_subsys+0x83/0x169
 [<ffffffff81b1ce13>] cgroup_init+0x36/0x119
 [<ffffffff81affef7>] start_kernel+0x3ba/0x3ef
 [<ffffffff81aff95b>] ? kernel_init+0x27b/0x27b
 [<ffffffff81aff356>] x86_64_start_reservations+0x131/0x136
 [<ffffffff81aff45e>] x86_64_start_kernel+0x103/0x112
Code: 01 48 3d f8 e1 ec 81 48 8d 98 10 ff ff ff 75 1b eb 73 0f 1f 00 48 8b 83 f0 00 00 00 48 3d f8 e1 ec 81 48 8d 98 10 ff ff ff 74 5a <48> 8b 83 88 07 00 00 48 85 c0 74 de 44 3b 60 10 76 d8 44 89 e6
RIP  [<ffffffff8145e8d6>] cgrp_create+0xf6/0x190
 RSP <ffffffff81a01ea8>
CR2: 0000000000000698
---[ end trace a7919e7f17c0a725 ]---
Kernel panic - not syncing: Attempted to kill the idle task!

The code corresponds to:

update_netdev_tables():
        for_each_netdev(&init_net, dev) {
                map = rtnl_dereference(dev->priomap);  <---- HERE


The list head is initialized in netdev_init(), which is called much
later than cgrp_create(). So the problem is that we are calling
update_netdev_tables() way too early (in cgrp_create()), which will
end up traversing the not-yet-circular linked list. So at some point,
the dev pointer will become NULL and hence dev->priomap becomes an
invalid access.

To fix this, just remove the update_netdev_tables() function entirely,
since it appears that write_update_netdev_table() will handle things
just fine.

Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
---

Requesting a thorough review of this patch, since I am not sure whether
removing update_netdev_tables() is perfectly OK and whether that is the
right thing to do.

 net/core/netprio_cgroup.c |   32 --------------------------------
 1 files changed, 0 insertions(+), 32 deletions(-)

diff --git a/net/core/netprio_cgroup.c b/net/core/netprio_cgroup.c
index b2e9caa..68fe239 100644
--- a/net/core/netprio_cgroup.c
+++ b/net/core/netprio_cgroup.c
@@ -109,32 +109,6 @@ static int write_update_netdev_table(struct net_device *dev)
 	return ret;
 }
 
-static int update_netdev_tables(void)
-{
-	int ret = 0;
-	struct net_device *dev;
-	u32 max_len;
-	struct netprio_map *map;
-
-	rtnl_lock();
-	max_len = atomic_read(&max_prioidx) + 1;
-	for_each_netdev(&init_net, dev) {
-		map = rtnl_dereference(dev->priomap);
-		/*
-		 * don't allocate priomap if we didn't
-		 * change net_prio.ifpriomap (map == NULL),
-		 * this will speed up skb_update_prio.
-		 */
-		if (map && map->priomap_len < max_len) {
-			ret = extend_netdev_table(dev, max_len);
-			if (ret < 0)
-				break;
-		}
-	}
-	rtnl_unlock();
-	return ret;
-}
-
 static struct cgroup_subsys_state *cgrp_create(struct cgroup *cgrp)
 {
 	struct cgroup_netprio_state *cs;
@@ -153,12 +127,6 @@ static struct cgroup_subsys_state *cgrp_create(struct cgroup *cgrp)
 		goto out;
 	}
 
-	ret = update_netdev_tables();
-	if (ret < 0) {
-		put_prioidx(cs->prioidx);
-		goto out;
-	}
-
 	return &cs->css;
 out:
 	kfree(cs);


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH] net, cgroup: Fix boot failure due to iteration of uninitialized list
  2012-07-19 16:27 [PATCH] net, cgroup: Fix boot failure due to iteration of uninitialized list Srivatsa S. Bhat
@ 2012-07-19 16:44 ` Neil Horman
  2012-07-20 10:04   ` Srivatsa S. Bhat
  2012-07-23  1:15 ` Gao feng
  1 sibling, 1 reply; 10+ messages in thread
From: Neil Horman @ 2012-07-19 16:44 UTC (permalink / raw)
  To: Srivatsa S. Bhat
  Cc: gaofeng, eric.dumazet, davem, linux-kernel, netdev,
	mark.d.rustad, john.r.fastabend, lizefan

On Thu, Jul 19, 2012 at 09:57:37PM +0530, Srivatsa S. Bhat wrote:
> After commit ef209f15 (net: cgroup: fix access the unallocated memory in
> netprio cgroup), boot fails with the following NULL pointer dereference:
> 
> Initializing cgroup subsys devices
> Initializing cgroup subsys freezer
> Initializing cgroup subsys net_cls
> Initializing cgroup subsys blkio
> Initializing cgroup subsys perf_event
> Initializing cgroup subsys net_prio
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000698
> IP: [<ffffffff8145e8d6>] cgrp_create+0xf6/0x190
> PGD 0
> Oops: 0000 [#1] SMP
> CPU 0
> Modules linked in:
> 
> Pid: 0, comm: swapper/0 Not tainted 3.5.0-rc7-mandeep #1 IBM IBM System x -[7870C4Q]-/68Y8033
> RIP: 0010:[<ffffffff8145e8d6>]  [<ffffffff8145e8d6>] cgrp_create+0xf6/0x190
> RSP: 0000:ffffffff81a01ea8  EFLAGS: 00010213
> RAX: 0000000000000000 RBX: ffffffffffffff10 RCX: 0000000000000000
> RDX: 0000000000000000 RSI: 0000000000000246 RDI: ffffffff81aa70a0
> RBP: ffffffff81a01ed8 R08: 0000000000000000 R09: 0000000000000000
> R10: ffff8808ff8641c0 R11: 6e697a696c616974 R12: 0000000000000001
> R13: ffff8808ff8641c0 R14: 0000000000000000 R15: 0000000000093970
> FS:  0000000000000000(0000) GS:ffff8808ffc00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 0000000000000698 CR3: 0000000001a0b000 CR4: 00000000000006b0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process swapper/0 (pid: 0, threadinfo ffffffff81a00000, task ffffffff81a13420)
> Stack:
>  ffffffff81a01eb8 ffffffff818060ff ffffffff81d75ec8 ffffffff81aa8960
>  ffffffff81aa8960 ffffffff81b4c2c0 ffffffff81a01ef8 ffffffff81b1cb78
>  0000000000000018 0000000000000048 ffffffff81a01f18 ffffffff81b1ce13
> Call Trace:
>  [<ffffffff81b1cb78>] cgroup_init_subsys+0x83/0x169
>  [<ffffffff81b1ce13>] cgroup_init+0x36/0x119
>  [<ffffffff81affef7>] start_kernel+0x3ba/0x3ef
>  [<ffffffff81aff95b>] ? kernel_init+0x27b/0x27b
>  [<ffffffff81aff356>] x86_64_start_reservations+0x131/0x136
>  [<ffffffff81aff45e>] x86_64_start_kernel+0x103/0x112
> Code: 01 48 3d f8 e1 ec 81 48 8d 98 10 ff ff ff 75 1b eb 73 0f 1f 00 48 8b 83 f0 00 00 00 48 3d f8 e1 ec 81 48 8d 98 10 ff ff ff 74 5a <48> 8b 83 88 07 00 00 48 85 c0 74 de 44 3b 60 10 76 d8 44 89 e6
> RIP  [<ffffffff8145e8d6>] cgrp_create+0xf6/0x190
>  RSP <ffffffff81a01ea8>
> CR2: 0000000000000698
> ---[ end trace a7919e7f17c0a725 ]---
> Kernel panic - not syncing: Attempted to kill the idle task!
> 
> The code corresponds to:
> 
> update_netdev_tables():
>         for_each_netdev(&init_net, dev) {
>                 map = rtnl_dereference(dev->priomap);  <---- HERE
> 
> 
> The list head is initialized in netdev_init(), which is called much
> later than cgrp_create(). So the problem is that we are calling
> update_netdev_tables() way too early (in cgrp_create()), which will
> end up traversing the not-yet-circular linked list. So at some point,
> the dev pointer will become NULL and hence dev->priomap becomes an
> invalid access.
> 
> To fix this, just remove the update_netdev_tables() function entirely,
> since it appears that write_update_netdev_table() will handle things
> just fine.
> 
> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
> ---
> 
> Requesting a thorough review of this patch, since I am not sure whether
> removing update_netdev_tables() is perfectly OK and whether that is the
> right thing to do.
> 
We could do this I suppose, but this has already been fixed by
734b65417b24d6eea3e3d7457e1f11493890ee1d
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] net, cgroup: Fix boot failure due to iteration of uninitialized list
  2012-07-19 16:44 ` Neil Horman
@ 2012-07-20 10:04   ` Srivatsa S. Bhat
  2012-07-20 11:00     ` Neil Horman
  0 siblings, 1 reply; 10+ messages in thread
From: Srivatsa S. Bhat @ 2012-07-20 10:04 UTC (permalink / raw)
  To: Neil Horman
  Cc: gaofeng, eric.dumazet, davem, linux-kernel, netdev,
	mark.d.rustad, john.r.fastabend, lizefan

On 07/19/2012 10:14 PM, Neil Horman wrote:
> On Thu, Jul 19, 2012 at 09:57:37PM +0530, Srivatsa S. Bhat wrote:
>> After commit ef209f15 (net: cgroup: fix access the unallocated memory in
>> netprio cgroup), boot fails with the following NULL pointer dereference:
>>
>> Initializing cgroup subsys devices
>> Initializing cgroup subsys freezer
>> Initializing cgroup subsys net_cls
>> Initializing cgroup subsys blkio
>> Initializing cgroup subsys perf_event
>> Initializing cgroup subsys net_prio
>> BUG: unable to handle kernel NULL pointer dereference at 0000000000000698
>> IP: [<ffffffff8145e8d6>] cgrp_create+0xf6/0x190
>> PGD 0
>> Oops: 0000 [#1] SMP
>> CPU 0
>> Modules linked in:
>>
>> Pid: 0, comm: swapper/0 Not tainted 3.5.0-rc7-mandeep #1 IBM IBM System x -[7870C4Q]-/68Y8033
>> RIP: 0010:[<ffffffff8145e8d6>]  [<ffffffff8145e8d6>] cgrp_create+0xf6/0x190
>> RSP: 0000:ffffffff81a01ea8  EFLAGS: 00010213
>> RAX: 0000000000000000 RBX: ffffffffffffff10 RCX: 0000000000000000
>> RDX: 0000000000000000 RSI: 0000000000000246 RDI: ffffffff81aa70a0
>> RBP: ffffffff81a01ed8 R08: 0000000000000000 R09: 0000000000000000
>> R10: ffff8808ff8641c0 R11: 6e697a696c616974 R12: 0000000000000001
>> R13: ffff8808ff8641c0 R14: 0000000000000000 R15: 0000000000093970
>> FS:  0000000000000000(0000) GS:ffff8808ffc00000(0000) knlGS:0000000000000000
>> CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>> CR2: 0000000000000698 CR3: 0000000001a0b000 CR4: 00000000000006b0
>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>> Process swapper/0 (pid: 0, threadinfo ffffffff81a00000, task ffffffff81a13420)
>> Stack:
>>  ffffffff81a01eb8 ffffffff818060ff ffffffff81d75ec8 ffffffff81aa8960
>>  ffffffff81aa8960 ffffffff81b4c2c0 ffffffff81a01ef8 ffffffff81b1cb78
>>  0000000000000018 0000000000000048 ffffffff81a01f18 ffffffff81b1ce13
>> Call Trace:
>>  [<ffffffff81b1cb78>] cgroup_init_subsys+0x83/0x169
>>  [<ffffffff81b1ce13>] cgroup_init+0x36/0x119
>>  [<ffffffff81affef7>] start_kernel+0x3ba/0x3ef
>>  [<ffffffff81aff95b>] ? kernel_init+0x27b/0x27b
>>  [<ffffffff81aff356>] x86_64_start_reservations+0x131/0x136
>>  [<ffffffff81aff45e>] x86_64_start_kernel+0x103/0x112
>> Code: 01 48 3d f8 e1 ec 81 48 8d 98 10 ff ff ff 75 1b eb 73 0f 1f 00 48 8b 83 f0 00 00 00 48 3d f8 e1 ec 81 48 8d 98 10 ff ff ff 74 5a <48> 8b 83 88 07 00 00 48 85 c0 74 de 44 3b 60 10 76 d8 44 89 e6
>> RIP  [<ffffffff8145e8d6>] cgrp_create+0xf6/0x190
>>  RSP <ffffffff81a01ea8>
>> CR2: 0000000000000698
>> ---[ end trace a7919e7f17c0a725 ]---
>> Kernel panic - not syncing: Attempted to kill the idle task!
>>
>> The code corresponds to:
>>
>> update_netdev_tables():
>>         for_each_netdev(&init_net, dev) {
>>                 map = rtnl_dereference(dev->priomap);  <---- HERE
>>
>>
>> The list head is initialized in netdev_init(), which is called much
>> later than cgrp_create(). So the problem is that we are calling
>> update_netdev_tables() way too early (in cgrp_create()), which will
>> end up traversing the not-yet-circular linked list. So at some point,
>> the dev pointer will become NULL and hence dev->priomap becomes an
>> invalid access.
>>
>> To fix this, just remove the update_netdev_tables() function entirely,
>> since it appears that write_update_netdev_table() will handle things
>> just fine.
>>
>> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
>> ---
>>
>> Requesting a thorough review of this patch, since I am not sure whether
>> removing update_netdev_tables() is perfectly OK and whether that is the
>> right thing to do.
>>
> We could do this I suppose, but this has already been fixed by
> 734b65417b24d6eea3e3d7457e1f11493890ee1d

Oh good! But don't you think that my patch looks cleaner than that fix?
(Of course, provided that my patch is correct!)

Anyway, I'm happy to see that the boot failure is fixed. But if anyone feels
that the approach of removing the update_netdev_tables() function is correct
and better, I'll be happy to provide a patch on top of the boot-fix that
went upstream.

Regards,
Srivatsa S. Bhat


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] net, cgroup: Fix boot failure due to iteration of uninitialized list
  2012-07-20 10:04   ` Srivatsa S. Bhat
@ 2012-07-20 11:00     ` Neil Horman
  2012-07-20 11:18       ` Srivatsa S. Bhat
  0 siblings, 1 reply; 10+ messages in thread
From: Neil Horman @ 2012-07-20 11:00 UTC (permalink / raw)
  To: Srivatsa S. Bhat
  Cc: gaofeng, eric.dumazet, davem, linux-kernel, netdev,
	mark.d.rustad, john.r.fastabend, lizefan

On Fri, Jul 20, 2012 at 03:34:47PM +0530, Srivatsa S. Bhat wrote:
> On 07/19/2012 10:14 PM, Neil Horman wrote:
> > On Thu, Jul 19, 2012 at 09:57:37PM +0530, Srivatsa S. Bhat wrote:
> >> After commit ef209f15 (net: cgroup: fix access the unallocated memory in
> >> netprio cgroup), boot fails with the following NULL pointer dereference:
> >>
> >> Initializing cgroup subsys devices
> >> Initializing cgroup subsys freezer
> >> Initializing cgroup subsys net_cls
> >> Initializing cgroup subsys blkio
> >> Initializing cgroup subsys perf_event
> >> Initializing cgroup subsys net_prio
> >> BUG: unable to handle kernel NULL pointer dereference at 0000000000000698
> >> IP: [<ffffffff8145e8d6>] cgrp_create+0xf6/0x190
> >> PGD 0
> >> Oops: 0000 [#1] SMP
> >> CPU 0
> >> Modules linked in:
> >>
> >> Pid: 0, comm: swapper/0 Not tainted 3.5.0-rc7-mandeep #1 IBM IBM System x -[7870C4Q]-/68Y8033
> >> RIP: 0010:[<ffffffff8145e8d6>]  [<ffffffff8145e8d6>] cgrp_create+0xf6/0x190
> >> RSP: 0000:ffffffff81a01ea8  EFLAGS: 00010213
> >> RAX: 0000000000000000 RBX: ffffffffffffff10 RCX: 0000000000000000
> >> RDX: 0000000000000000 RSI: 0000000000000246 RDI: ffffffff81aa70a0
> >> RBP: ffffffff81a01ed8 R08: 0000000000000000 R09: 0000000000000000
> >> R10: ffff8808ff8641c0 R11: 6e697a696c616974 R12: 0000000000000001
> >> R13: ffff8808ff8641c0 R14: 0000000000000000 R15: 0000000000093970
> >> FS:  0000000000000000(0000) GS:ffff8808ffc00000(0000) knlGS:0000000000000000
> >> CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> >> CR2: 0000000000000698 CR3: 0000000001a0b000 CR4: 00000000000006b0
> >> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> >> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> >> Process swapper/0 (pid: 0, threadinfo ffffffff81a00000, task ffffffff81a13420)
> >> Stack:
> >>  ffffffff81a01eb8 ffffffff818060ff ffffffff81d75ec8 ffffffff81aa8960
> >>  ffffffff81aa8960 ffffffff81b4c2c0 ffffffff81a01ef8 ffffffff81b1cb78
> >>  0000000000000018 0000000000000048 ffffffff81a01f18 ffffffff81b1ce13
> >> Call Trace:
> >>  [<ffffffff81b1cb78>] cgroup_init_subsys+0x83/0x169
> >>  [<ffffffff81b1ce13>] cgroup_init+0x36/0x119
> >>  [<ffffffff81affef7>] start_kernel+0x3ba/0x3ef
> >>  [<ffffffff81aff95b>] ? kernel_init+0x27b/0x27b
> >>  [<ffffffff81aff356>] x86_64_start_reservations+0x131/0x136
> >>  [<ffffffff81aff45e>] x86_64_start_kernel+0x103/0x112
> >> Code: 01 48 3d f8 e1 ec 81 48 8d 98 10 ff ff ff 75 1b eb 73 0f 1f 00 48 8b 83 f0 00 00 00 48 3d f8 e1 ec 81 48 8d 98 10 ff ff ff 74 5a <48> 8b 83 88 07 00 00 48 85 c0 74 de 44 3b 60 10 76 d8 44 89 e6
> >> RIP  [<ffffffff8145e8d6>] cgrp_create+0xf6/0x190
> >>  RSP <ffffffff81a01ea8>
> >> CR2: 0000000000000698
> >> ---[ end trace a7919e7f17c0a725 ]---
> >> Kernel panic - not syncing: Attempted to kill the idle task!
> >>
> >> The code corresponds to:
> >>
> >> update_netdev_tables():
> >>         for_each_netdev(&init_net, dev) {
> >>                 map = rtnl_dereference(dev->priomap);  <---- HERE
> >>
> >>
> >> The list head is initialized in netdev_init(), which is called much
> >> later than cgrp_create(). So the problem is that we are calling
> >> update_netdev_tables() way too early (in cgrp_create()), which will
> >> end up traversing the not-yet-circular linked list. So at some point,
> >> the dev pointer will become NULL and hence dev->priomap becomes an
> >> invalid access.
> >>
> >> To fix this, just remove the update_netdev_tables() function entirely,
> >> since it appears that write_update_netdev_table() will handle things
> >> just fine.
> >>
> >> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
> >> ---
> >>
> >> Requesting a thorough review of this patch, since I am not sure whether
> >> removing update_netdev_tables() is perfectly OK and whether that is the
> >> right thing to do.
> >>
> > We could do this I suppose, but this has already been fixed by
> > 734b65417b24d6eea3e3d7457e1f11493890ee1d
> 
> Oh good! But don't you think that my patch looks cleaner than that fix?
> (Of course, provided that my patch is correct!)
> 
> Anyway, I'm happy to see that the boot failure is fixed. But if anyone feels
> that the approach of removing the update_netdev_tables() function is correct
> and better, I'll be happy to provide a patch on top of the boot-fix that
> went upstream.
> 
We're almost at the end of a release.  The fix that went in has been tesetd and
fixes the specific problem that was reported, with almost zero likelyhood of
causing other regressions. While this fix looks like it might be preferable,
this isn't a time to go doing something like this without alot more testing, as
it may cause unforseen problems.

Theres also a larger issue of initalization order that I'll be looking at in the
next few weeks.  Based on the outcome of that I may roll this change in.

Thanks!
Neil

> Regards,
> Srivatsa S. Bhat
> 
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] net, cgroup: Fix boot failure due to iteration of uninitialized list
  2012-07-20 11:00     ` Neil Horman
@ 2012-07-20 11:18       ` Srivatsa S. Bhat
  0 siblings, 0 replies; 10+ messages in thread
From: Srivatsa S. Bhat @ 2012-07-20 11:18 UTC (permalink / raw)
  To: Neil Horman
  Cc: gaofeng, eric.dumazet, davem, linux-kernel, netdev,
	mark.d.rustad, john.r.fastabend, lizefan

On 07/20/2012 04:30 PM, Neil Horman wrote:
> On Fri, Jul 20, 2012 at 03:34:47PM +0530, Srivatsa S. Bhat wrote:
>> On 07/19/2012 10:14 PM, Neil Horman wrote:
>>> On Thu, Jul 19, 2012 at 09:57:37PM +0530, Srivatsa S. Bhat wrote:
>>>> After commit ef209f15 (net: cgroup: fix access the unallocated memory in
>>>> netprio cgroup), boot fails with the following NULL pointer dereference:
>>>>
>>>> Initializing cgroup subsys devices
>>>> Initializing cgroup subsys freezer
>>>> Initializing cgroup subsys net_cls
>>>> Initializing cgroup subsys blkio
>>>> Initializing cgroup subsys perf_event
>>>> Initializing cgroup subsys net_prio
>>>> BUG: unable to handle kernel NULL pointer dereference at 0000000000000698
>>>> IP: [<ffffffff8145e8d6>] cgrp_create+0xf6/0x190
>>>> PGD 0
>>>> Oops: 0000 [#1] SMP
>>>> CPU 0
>>>> Modules linked in:
>>>>
>>>> Pid: 0, comm: swapper/0 Not tainted 3.5.0-rc7-mandeep #1 IBM IBM System x -[7870C4Q]-/68Y8033
>>>> RIP: 0010:[<ffffffff8145e8d6>]  [<ffffffff8145e8d6>] cgrp_create+0xf6/0x190
>>>> RSP: 0000:ffffffff81a01ea8  EFLAGS: 00010213
>>>> RAX: 0000000000000000 RBX: ffffffffffffff10 RCX: 0000000000000000
>>>> RDX: 0000000000000000 RSI: 0000000000000246 RDI: ffffffff81aa70a0
>>>> RBP: ffffffff81a01ed8 R08: 0000000000000000 R09: 0000000000000000
>>>> R10: ffff8808ff8641c0 R11: 6e697a696c616974 R12: 0000000000000001
>>>> R13: ffff8808ff8641c0 R14: 0000000000000000 R15: 0000000000093970
>>>> FS:  0000000000000000(0000) GS:ffff8808ffc00000(0000) knlGS:0000000000000000
>>>> CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>>>> CR2: 0000000000000698 CR3: 0000000001a0b000 CR4: 00000000000006b0
>>>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>>> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>>>> Process swapper/0 (pid: 0, threadinfo ffffffff81a00000, task ffffffff81a13420)
>>>> Stack:
>>>>  ffffffff81a01eb8 ffffffff818060ff ffffffff81d75ec8 ffffffff81aa8960
>>>>  ffffffff81aa8960 ffffffff81b4c2c0 ffffffff81a01ef8 ffffffff81b1cb78
>>>>  0000000000000018 0000000000000048 ffffffff81a01f18 ffffffff81b1ce13
>>>> Call Trace:
>>>>  [<ffffffff81b1cb78>] cgroup_init_subsys+0x83/0x169
>>>>  [<ffffffff81b1ce13>] cgroup_init+0x36/0x119
>>>>  [<ffffffff81affef7>] start_kernel+0x3ba/0x3ef
>>>>  [<ffffffff81aff95b>] ? kernel_init+0x27b/0x27b
>>>>  [<ffffffff81aff356>] x86_64_start_reservations+0x131/0x136
>>>>  [<ffffffff81aff45e>] x86_64_start_kernel+0x103/0x112
>>>> Code: 01 48 3d f8 e1 ec 81 48 8d 98 10 ff ff ff 75 1b eb 73 0f 1f 00 48 8b 83 f0 00 00 00 48 3d f8 e1 ec 81 48 8d 98 10 ff ff ff 74 5a <48> 8b 83 88 07 00 00 48 85 c0 74 de 44 3b 60 10 76 d8 44 89 e6
>>>> RIP  [<ffffffff8145e8d6>] cgrp_create+0xf6/0x190
>>>>  RSP <ffffffff81a01ea8>
>>>> CR2: 0000000000000698
>>>> ---[ end trace a7919e7f17c0a725 ]---
>>>> Kernel panic - not syncing: Attempted to kill the idle task!
>>>>
>>>> The code corresponds to:
>>>>
>>>> update_netdev_tables():
>>>>         for_each_netdev(&init_net, dev) {
>>>>                 map = rtnl_dereference(dev->priomap);  <---- HERE
>>>>
>>>>
>>>> The list head is initialized in netdev_init(), which is called much
>>>> later than cgrp_create(). So the problem is that we are calling
>>>> update_netdev_tables() way too early (in cgrp_create()), which will
>>>> end up traversing the not-yet-circular linked list. So at some point,
>>>> the dev pointer will become NULL and hence dev->priomap becomes an
>>>> invalid access.
>>>>
>>>> To fix this, just remove the update_netdev_tables() function entirely,
>>>> since it appears that write_update_netdev_table() will handle things
>>>> just fine.
>>>>
>>>> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
>>>> ---
>>>>
>>>> Requesting a thorough review of this patch, since I am not sure whether
>>>> removing update_netdev_tables() is perfectly OK and whether that is the
>>>> right thing to do.
>>>>
>>> We could do this I suppose, but this has already been fixed by
>>> 734b65417b24d6eea3e3d7457e1f11493890ee1d
>>
>> Oh good! But don't you think that my patch looks cleaner than that fix?
>> (Of course, provided that my patch is correct!)
>>
>> Anyway, I'm happy to see that the boot failure is fixed. But if anyone feels
>> that the approach of removing the update_netdev_tables() function is correct
>> and better, I'll be happy to provide a patch on top of the boot-fix that
>> went upstream.
>>
> We're almost at the end of a release.  The fix that went in has been tesetd and
> fixes the specific problem that was reported, with almost zero likelyhood of
> causing other regressions. While this fix looks like it might be preferable,
> this isn't a time to go doing something like this without alot more testing, as
> it may cause unforseen problems.
> 

Oh definitely! I didn't mean to suggest doing these changes right away.
It can surely wait.. :)

> Theres also a larger issue of initalization order that I'll be looking at in the
> next few weeks.  Based on the outcome of that I may roll this change in.
> 

Sure, thanks!

Regards,
Srivatsa S. Bhat


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] net, cgroup: Fix boot failure due to iteration of uninitialized list
  2012-07-19 16:27 [PATCH] net, cgroup: Fix boot failure due to iteration of uninitialized list Srivatsa S. Bhat
  2012-07-19 16:44 ` Neil Horman
@ 2012-07-23  1:15 ` Gao feng
  2012-07-23 11:40   ` Neil Horman
  1 sibling, 1 reply; 10+ messages in thread
From: Gao feng @ 2012-07-23  1:15 UTC (permalink / raw)
  To: Srivatsa S. Bhat, eric.dumazet, davem
  Cc: nhorman, linux-kernel, netdev, mark.d.rustad, john.r.fastabend, lizefan

于 2012年07月20日 00:27, Srivatsa S. Bhat 写道:
> After commit ef209f15 (net: cgroup: fix access the unallocated memory in
> netprio cgroup), boot fails with the following NULL pointer dereference:
> 
> Initializing cgroup subsys devices
> Initializing cgroup subsys freezer
> Initializing cgroup subsys net_cls
> Initializing cgroup subsys blkio
> Initializing cgroup subsys perf_event
> Initializing cgroup subsys net_prio
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000698
> IP: [<ffffffff8145e8d6>] cgrp_create+0xf6/0x190
> PGD 0
> Oops: 0000 [#1] SMP
> CPU 0
> Modules linked in:
> 
> Pid: 0, comm: swapper/0 Not tainted 3.5.0-rc7-mandeep #1 IBM IBM System x -[7870C4Q]-/68Y8033
> RIP: 0010:[<ffffffff8145e8d6>]  [<ffffffff8145e8d6>] cgrp_create+0xf6/0x190
> RSP: 0000:ffffffff81a01ea8  EFLAGS: 00010213
> RAX: 0000000000000000 RBX: ffffffffffffff10 RCX: 0000000000000000
> RDX: 0000000000000000 RSI: 0000000000000246 RDI: ffffffff81aa70a0
> RBP: ffffffff81a01ed8 R08: 0000000000000000 R09: 0000000000000000
> R10: ffff8808ff8641c0 R11: 6e697a696c616974 R12: 0000000000000001
> R13: ffff8808ff8641c0 R14: 0000000000000000 R15: 0000000000093970
> FS:  0000000000000000(0000) GS:ffff8808ffc00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 0000000000000698 CR3: 0000000001a0b000 CR4: 00000000000006b0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process swapper/0 (pid: 0, threadinfo ffffffff81a00000, task ffffffff81a13420)
> Stack:
>  ffffffff81a01eb8 ffffffff818060ff ffffffff81d75ec8 ffffffff81aa8960
>  ffffffff81aa8960 ffffffff81b4c2c0 ffffffff81a01ef8 ffffffff81b1cb78
>  0000000000000018 0000000000000048 ffffffff81a01f18 ffffffff81b1ce13
> Call Trace:
>  [<ffffffff81b1cb78>] cgroup_init_subsys+0x83/0x169
>  [<ffffffff81b1ce13>] cgroup_init+0x36/0x119
>  [<ffffffff81affef7>] start_kernel+0x3ba/0x3ef
>  [<ffffffff81aff95b>] ? kernel_init+0x27b/0x27b
>  [<ffffffff81aff356>] x86_64_start_reservations+0x131/0x136
>  [<ffffffff81aff45e>] x86_64_start_kernel+0x103/0x112
> Code: 01 48 3d f8 e1 ec 81 48 8d 98 10 ff ff ff 75 1b eb 73 0f 1f 00 48 8b 83 f0 00 00 00 48 3d f8 e1 ec 81 48 8d 98 10 ff ff ff 74 5a <48> 8b 83 88 07 00 00 48 85 c0 74 de 44 3b 60 10 76 d8 44 89 e6
> RIP  [<ffffffff8145e8d6>] cgrp_create+0xf6/0x190
>  RSP <ffffffff81a01ea8>
> CR2: 0000000000000698
> ---[ end trace a7919e7f17c0a725 ]---
> Kernel panic - not syncing: Attempted to kill the idle task!
> 
> The code corresponds to:
> 
> update_netdev_tables():
>         for_each_netdev(&init_net, dev) {
>                 map = rtnl_dereference(dev->priomap);  <---- HERE
> 
> 
> The list head is initialized in netdev_init(), which is called much
> later than cgrp_create(). So the problem is that we are calling
> update_netdev_tables() way too early (in cgrp_create()), which will
> end up traversing the not-yet-circular linked list. So at some point,
> the dev pointer will become NULL and hence dev->priomap becomes an
> invalid access.
> 
> To fix this, just remove the update_netdev_tables() function entirely,
> since it appears that write_update_netdev_table() will handle things
> just fine.

The reason I add update_netdev_tables in cgrp_create is to avoid additional
bound checkings when we accessing the dev->priomap.priomap.

Eric,can we revert this commit 91c68ce2b26319248a32d7baa1226f819d283758 now?
I think it's safe enough to access priomap without bound check.

Thanks

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] net, cgroup: Fix boot failure due to iteration of uninitialized list
  2012-07-23  1:15 ` Gao feng
@ 2012-07-23 11:40   ` Neil Horman
  2012-09-10  9:29     ` Srivatsa S. Bhat
  0 siblings, 1 reply; 10+ messages in thread
From: Neil Horman @ 2012-07-23 11:40 UTC (permalink / raw)
  To: Gao feng
  Cc: Srivatsa S. Bhat, eric.dumazet, davem, linux-kernel, netdev,
	mark.d.rustad, john.r.fastabend, lizefan

On Mon, Jul 23, 2012 at 09:15:05AM +0800, Gao feng wrote:
> 于 2012年07月20日 00:27, Srivatsa S. Bhat 写道:
> > After commit ef209f15 (net: cgroup: fix access the unallocated memory in
> > netprio cgroup), boot fails with the following NULL pointer dereference:
> > 
> > Initializing cgroup subsys devices
> > Initializing cgroup subsys freezer
> > Initializing cgroup subsys net_cls
> > Initializing cgroup subsys blkio
> > Initializing cgroup subsys perf_event
> > Initializing cgroup subsys net_prio
> > BUG: unable to handle kernel NULL pointer dereference at 0000000000000698
> > IP: [<ffffffff8145e8d6>] cgrp_create+0xf6/0x190
> > PGD 0
> > Oops: 0000 [#1] SMP
> > CPU 0
> > Modules linked in:
> > 
> > Pid: 0, comm: swapper/0 Not tainted 3.5.0-rc7-mandeep #1 IBM IBM System x -[7870C4Q]-/68Y8033
> > RIP: 0010:[<ffffffff8145e8d6>]  [<ffffffff8145e8d6>] cgrp_create+0xf6/0x190
> > RSP: 0000:ffffffff81a01ea8  EFLAGS: 00010213
> > RAX: 0000000000000000 RBX: ffffffffffffff10 RCX: 0000000000000000
> > RDX: 0000000000000000 RSI: 0000000000000246 RDI: ffffffff81aa70a0
> > RBP: ffffffff81a01ed8 R08: 0000000000000000 R09: 0000000000000000
> > R10: ffff8808ff8641c0 R11: 6e697a696c616974 R12: 0000000000000001
> > R13: ffff8808ff8641c0 R14: 0000000000000000 R15: 0000000000093970
> > FS:  0000000000000000(0000) GS:ffff8808ffc00000(0000) knlGS:0000000000000000
> > CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> > CR2: 0000000000000698 CR3: 0000000001a0b000 CR4: 00000000000006b0
> > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> > Process swapper/0 (pid: 0, threadinfo ffffffff81a00000, task ffffffff81a13420)
> > Stack:
> >  ffffffff81a01eb8 ffffffff818060ff ffffffff81d75ec8 ffffffff81aa8960
> >  ffffffff81aa8960 ffffffff81b4c2c0 ffffffff81a01ef8 ffffffff81b1cb78
> >  0000000000000018 0000000000000048 ffffffff81a01f18 ffffffff81b1ce13
> > Call Trace:
> >  [<ffffffff81b1cb78>] cgroup_init_subsys+0x83/0x169
> >  [<ffffffff81b1ce13>] cgroup_init+0x36/0x119
> >  [<ffffffff81affef7>] start_kernel+0x3ba/0x3ef
> >  [<ffffffff81aff95b>] ? kernel_init+0x27b/0x27b
> >  [<ffffffff81aff356>] x86_64_start_reservations+0x131/0x136
> >  [<ffffffff81aff45e>] x86_64_start_kernel+0x103/0x112
> > Code: 01 48 3d f8 e1 ec 81 48 8d 98 10 ff ff ff 75 1b eb 73 0f 1f 00 48 8b 83 f0 00 00 00 48 3d f8 e1 ec 81 48 8d 98 10 ff ff ff 74 5a <48> 8b 83 88 07 00 00 48 85 c0 74 de 44 3b 60 10 76 d8 44 89 e6
> > RIP  [<ffffffff8145e8d6>] cgrp_create+0xf6/0x190
> >  RSP <ffffffff81a01ea8>
> > CR2: 0000000000000698
> > ---[ end trace a7919e7f17c0a725 ]---
> > Kernel panic - not syncing: Attempted to kill the idle task!
> > 
> > The code corresponds to:
> > 
> > update_netdev_tables():
> >         for_each_netdev(&init_net, dev) {
> >                 map = rtnl_dereference(dev->priomap);  <---- HERE
> > 
> > 
> > The list head is initialized in netdev_init(), which is called much
> > later than cgrp_create(). So the problem is that we are calling
> > update_netdev_tables() way too early (in cgrp_create()), which will
> > end up traversing the not-yet-circular linked list. So at some point,
> > the dev pointer will become NULL and hence dev->priomap becomes an
> > invalid access.
> > 
> > To fix this, just remove the update_netdev_tables() function entirely,
> > since it appears that write_update_netdev_table() will handle things
> > just fine.
> 
> The reason I add update_netdev_tables in cgrp_create is to avoid additional
> bound checkings when we accessing the dev->priomap.priomap.
> 
> Eric,can we revert this commit 91c68ce2b26319248a32d7baa1226f819d283758 now?
> I think it's safe enough to access priomap without bound check.
> 
> Thanks
> 

I think its probably safe, yes, but lets leave it there for just a bit.  Its not
hurting anything, and I'd like to look into getting Srivatsa' patch in first.
Neil


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] net, cgroup: Fix boot failure due to iteration of uninitialized list
  2012-07-23 11:40   ` Neil Horman
@ 2012-09-10  9:29     ` Srivatsa S. Bhat
  2012-09-10 13:16       ` Neil Horman
  0 siblings, 1 reply; 10+ messages in thread
From: Srivatsa S. Bhat @ 2012-09-10  9:29 UTC (permalink / raw)
  To: Neil Horman
  Cc: Gao feng, eric.dumazet, davem, linux-kernel, netdev,
	mark.d.rustad, john.r.fastabend, lizefan

On 07/23/2012 05:10 PM, Neil Horman wrote:
> On Mon, Jul 23, 2012 at 09:15:05AM +0800, Gao feng wrote:
>> 于 2012年07月20日 00:27, Srivatsa S. Bhat 写道:
>>> After commit ef209f15 (net: cgroup: fix access the unallocated memory in
>>> netprio cgroup), boot fails with the following NULL pointer dereference:
>>>
[...]
>>> Call Trace:
>>>  [<ffffffff81b1cb78>] cgroup_init_subsys+0x83/0x169
>>>  [<ffffffff81b1ce13>] cgroup_init+0x36/0x119
>>>  [<ffffffff81affef7>] start_kernel+0x3ba/0x3ef
>>>  [<ffffffff81aff95b>] ? kernel_init+0x27b/0x27b
>>>  [<ffffffff81aff356>] x86_64_start_reservations+0x131/0x136
>>>  [<ffffffff81aff45e>] x86_64_start_kernel+0x103/0x112
>>> RIP  [<ffffffff8145e8d6>] cgrp_create+0xf6/0x190
>>>  RSP <ffffffff81a01ea8>
>>> CR2: 0000000000000698
>>> ---[ end trace a7919e7f17c0a725 ]---
>>> Kernel panic - not syncing: Attempted to kill the idle task!
>>>
>>> The code corresponds to:
>>>
>>> update_netdev_tables():
>>>         for_each_netdev(&init_net, dev) {
>>>                 map = rtnl_dereference(dev->priomap);  <---- HERE
>>>
>>>
>>> The list head is initialized in netdev_init(), which is called much
>>> later than cgrp_create(). So the problem is that we are calling
>>> update_netdev_tables() way too early (in cgrp_create()), which will
>>> end up traversing the not-yet-circular linked list. So at some point,
>>> the dev pointer will become NULL and hence dev->priomap becomes an
>>> invalid access.
>>>
>>> To fix this, just remove the update_netdev_tables() function entirely,
>>> since it appears that write_update_netdev_table() will handle things
>>> just fine.
>>
>> The reason I add update_netdev_tables in cgrp_create is to avoid additional
>> bound checkings when we accessing the dev->priomap.priomap.
>>
>> Eric,can we revert this commit 91c68ce2b26319248a32d7baa1226f819d283758 now?
>> I think it's safe enough to access priomap without bound check.
>>
> 
> I think its probably safe, yes, but lets leave it there for just a bit.  Its not
> hurting anything, and I'd like to look into getting Srivatsa' patch in first.

Hi Neil,

Did you get around to look into this again?

Regards,
Srivatsa S. Bhat


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] net, cgroup: Fix boot failure due to iteration of uninitialized list
  2012-09-10  9:29     ` Srivatsa S. Bhat
@ 2012-09-10 13:16       ` Neil Horman
  2012-09-11 11:21         ` Srivatsa S. Bhat
  0 siblings, 1 reply; 10+ messages in thread
From: Neil Horman @ 2012-09-10 13:16 UTC (permalink / raw)
  To: Srivatsa S. Bhat
  Cc: Gao feng, eric.dumazet, davem, linux-kernel, netdev,
	mark.d.rustad, john.r.fastabend, lizefan

On Mon, Sep 10, 2012 at 02:59:18PM +0530, Srivatsa S. Bhat wrote:
> On 07/23/2012 05:10 PM, Neil Horman wrote:
> > On Mon, Jul 23, 2012 at 09:15:05AM +0800, Gao feng wrote:
> >> 于 2012年07月20日 00:27, Srivatsa S. Bhat 写道:
> >>> After commit ef209f15 (net: cgroup: fix access the unallocated memory in
> >>> netprio cgroup), boot fails with the following NULL pointer dereference:
> >>>
> [...]
> >>> Call Trace:
> >>>  [<ffffffff81b1cb78>] cgroup_init_subsys+0x83/0x169
> >>>  [<ffffffff81b1ce13>] cgroup_init+0x36/0x119
> >>>  [<ffffffff81affef7>] start_kernel+0x3ba/0x3ef
> >>>  [<ffffffff81aff95b>] ? kernel_init+0x27b/0x27b
> >>>  [<ffffffff81aff356>] x86_64_start_reservations+0x131/0x136
> >>>  [<ffffffff81aff45e>] x86_64_start_kernel+0x103/0x112
> >>> RIP  [<ffffffff8145e8d6>] cgrp_create+0xf6/0x190
> >>>  RSP <ffffffff81a01ea8>
> >>> CR2: 0000000000000698
> >>> ---[ end trace a7919e7f17c0a725 ]---
> >>> Kernel panic - not syncing: Attempted to kill the idle task!
> >>>
> >>> The code corresponds to:
> >>>
> >>> update_netdev_tables():
> >>>         for_each_netdev(&init_net, dev) {
> >>>                 map = rtnl_dereference(dev->priomap);  <---- HERE
> >>>
> >>>
> >>> The list head is initialized in netdev_init(), which is called much
> >>> later than cgrp_create(). So the problem is that we are calling
> >>> update_netdev_tables() way too early (in cgrp_create()), which will
> >>> end up traversing the not-yet-circular linked list. So at some point,
> >>> the dev pointer will become NULL and hence dev->priomap becomes an
> >>> invalid access.
> >>>
> >>> To fix this, just remove the update_netdev_tables() function entirely,
> >>> since it appears that write_update_netdev_table() will handle things
> >>> just fine.
> >>
> >> The reason I add update_netdev_tables in cgrp_create is to avoid additional
> >> bound checkings when we accessing the dev->priomap.priomap.
> >>
> >> Eric,can we revert this commit 91c68ce2b26319248a32d7baa1226f819d283758 now?
> >> I think it's safe enough to access priomap without bound check.
> >>
> > 
> > I think its probably safe, yes, but lets leave it there for just a bit.  Its not
> > hurting anything, and I'd like to look into getting Srivatsa' patch in first.
> 
> Hi Neil,
> 
> Did you get around to look into this again?
> 
I haven't looked at it specifically no, I apologize.  That said I think the
other changes that went in back in that time frame have had time to soak, and
looking at the way we current update the priomap table, I think its safe for us
to remove the update_netdev_table call and definition.  If you repost your
patch, I'll ack it. 

Thanks!
Neil

> Regards,
> Srivatsa S. Bhat
> 
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] net, cgroup: Fix boot failure due to iteration of uninitialized list
  2012-09-10 13:16       ` Neil Horman
@ 2012-09-11 11:21         ` Srivatsa S. Bhat
  0 siblings, 0 replies; 10+ messages in thread
From: Srivatsa S. Bhat @ 2012-09-11 11:21 UTC (permalink / raw)
  To: Neil Horman
  Cc: Gao feng, eric.dumazet, davem, linux-kernel, netdev,
	mark.d.rustad, john.r.fastabend, lizefan

On 09/10/2012 06:46 PM, Neil Horman wrote:
> On Mon, Sep 10, 2012 at 02:59:18PM +0530, Srivatsa S. Bhat wrote:
>> On 07/23/2012 05:10 PM, Neil Horman wrote:
>>> On Mon, Jul 23, 2012 at 09:15:05AM +0800, Gao feng wrote:
>>>> 于 2012年07月20日 00:27, Srivatsa S. Bhat 写道:
>>>>> After commit ef209f15 (net: cgroup: fix access the unallocated memory in
>>>>> netprio cgroup), boot fails with the following NULL pointer dereference:
>>>>>
>> [...]
>>>>> Call Trace:
>>>>>  [<ffffffff81b1cb78>] cgroup_init_subsys+0x83/0x169
>>>>>  [<ffffffff81b1ce13>] cgroup_init+0x36/0x119
>>>>>  [<ffffffff81affef7>] start_kernel+0x3ba/0x3ef
>>>>>  [<ffffffff81aff95b>] ? kernel_init+0x27b/0x27b
>>>>>  [<ffffffff81aff356>] x86_64_start_reservations+0x131/0x136
>>>>>  [<ffffffff81aff45e>] x86_64_start_kernel+0x103/0x112
>>>>> RIP  [<ffffffff8145e8d6>] cgrp_create+0xf6/0x190
>>>>>  RSP <ffffffff81a01ea8>
>>>>> CR2: 0000000000000698
>>>>> ---[ end trace a7919e7f17c0a725 ]---
>>>>> Kernel panic - not syncing: Attempted to kill the idle task!
>>>>>
>>>>> The code corresponds to:
>>>>>
>>>>> update_netdev_tables():
>>>>>         for_each_netdev(&init_net, dev) {
>>>>>                 map = rtnl_dereference(dev->priomap);  <---- HERE
>>>>>
>>>>>
>>>>> The list head is initialized in netdev_init(), which is called much
>>>>> later than cgrp_create(). So the problem is that we are calling
>>>>> update_netdev_tables() way too early (in cgrp_create()), which will
>>>>> end up traversing the not-yet-circular linked list. So at some point,
>>>>> the dev pointer will become NULL and hence dev->priomap becomes an
>>>>> invalid access.
>>>>>
>>>>> To fix this, just remove the update_netdev_tables() function entirely,
>>>>> since it appears that write_update_netdev_table() will handle things
>>>>> just fine.
>>>>
>>>> The reason I add update_netdev_tables in cgrp_create is to avoid additional
>>>> bound checkings when we accessing the dev->priomap.priomap.
>>>>
>>>> Eric,can we revert this commit 91c68ce2b26319248a32d7baa1226f819d283758 now?
>>>> I think it's safe enough to access priomap without bound check.
>>>>
>>>
>>> I think its probably safe, yes, but lets leave it there for just a bit.  Its not
>>> hurting anything, and I'd like to look into getting Srivatsa' patch in first.
>>
>> Hi Neil,
>>
>> Did you get around to look into this again?
>>
> I haven't looked at it specifically no, I apologize.  That said I think the
> other changes that went in back in that time frame have had time to soak, and
> looking at the way we current update the priomap table, I think its safe for us
> to remove the update_netdev_table call and definition.  If you repost your
> patch, I'll ack it. 
> 

Cool! I'll repost the patch, along with another small improvement that I happened
to observe, as a separate patch. Thanks!

Regards,
Srivatsa S. Bhat


^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2012-09-11 11:22 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-07-19 16:27 [PATCH] net, cgroup: Fix boot failure due to iteration of uninitialized list Srivatsa S. Bhat
2012-07-19 16:44 ` Neil Horman
2012-07-20 10:04   ` Srivatsa S. Bhat
2012-07-20 11:00     ` Neil Horman
2012-07-20 11:18       ` Srivatsa S. Bhat
2012-07-23  1:15 ` Gao feng
2012-07-23 11:40   ` Neil Horman
2012-09-10  9:29     ` Srivatsa S. Bhat
2012-09-10 13:16       ` Neil Horman
2012-09-11 11:21         ` Srivatsa S. Bhat

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).