* iproute2-2.6.20-070313 bug ?
@ 2007-03-21 18:00 Denys
  2007-03-22 11:23 ` Patrick McHardy
  0 siblings, 1 reply; 41+ messages in thread
From: Denys @ 2007-03-21 18:00 UTC (permalink / raw)
  To: netdev

I may have discovered a bug, though it could be specific to my setup.

In your sources (tc/tc_core.h) I noticed
#define TIME_UNITS_PER_SEC 1000000000
When I change it to
#define TIME_UNITS_PER_SEC 1000000.0
(the value previously in the sources)
everything works fine. Otherwise tbf does not work at all; it drops all
packets.

Did anyone test the new iproute2 with tbf?

--
Virtual ISP S.A.L.

^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: iproute2-2.6.20-070313 bug ?
  2007-03-21 18:00 iproute2-2.6.20-070313 bug ? Denys
@ 2007-03-22 11:23 ` Patrick McHardy
  2007-03-22 12:46 ` Denys
  0 siblings, 1 reply; 41+ messages in thread
From: Patrick McHardy @ 2007-03-22 11:23 UTC (permalink / raw)
  To: Denys; +Cc: netdev

Denys wrote:
> Possible i discovered bug, but maybe specific to my setup.
>
> In your sources (tc/tc_core.h) i notice
> #define TIME_UNITS_PER_SEC 1000000000
> When i change it to
> #define TIME_UNITS_PER_SEC 1000000.0
> (it was value before in sources)
> everything works fine. Otherwise tbf not working at all, it is dropping all
> packets.
>
> Did anyone test new iproute2 with tbf?

Yes, please send the commands you use.

^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: iproute2-2.6.20-070313 bug ?
  2007-03-22 11:23 ` Patrick McHardy
@ 2007-03-22 12:46 ` Denys
  2007-03-22 13:09 ` Patrick McHardy
  0 siblings, 1 reply; 41+ messages in thread
From: Denys @ 2007-03-22 12:46 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: netdev

Dear Sir,

I already sent them; I will copy them here as well.

Working iproute2 ("patched by me"):

/sbin/tc qdisc del dev ppp0 root
/sbin/tc qdisc add dev ppp0 root handle 1: prio
/sbin/tc qdisc add dev ppp0 parent 1:1 handle 2: tbf buffer 1024kb latency 500ms rate 128kbit peakrate 256kbit minburst 16384
/sbin/tc filter add dev ppp0 parent 1:0 protocol ip prio 10 u32 match ip dst 0.0.0.0/0 flowid 2:1

tc (patched) monitor output:

deleted qdisc prio 1: dev ppp0 root bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
qdisc prio 1: dev ppp0 root bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
qdisc tbf 2: dev ppp0 parent 1:1 rate 128000bit burst 1024Kb peakrate 256000bit minburst 16Kb lat 500.0ms
filter dev ppp0 parent 1: protocol ip pref 10 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 2:1
  match 00000000/00000000 at 16

VISP-Office ~ #cat /proc/net/psched
00000001 00000001 000f4240 000003e8

Now running tc2, which is the "stock" version:

/sbin/tc2 qdisc del dev ppp0 root
/sbin/tc2 qdisc add dev ppp0 root handle 1: prio
/sbin/tc2 qdisc add dev ppp0 parent 1:1 handle 2: tbf buffer 1024kb latency 500ms rate 128kbit peakrate 256kbit minburst 16384
/sbin/tc2 filter add dev ppp0 parent 1:0 protocol ip prio 10 u32 match ip dst 0.0.0.0/0 flowid 2:1

Monitor output:

deleted qdisc prio 1: dev ppp0 root bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
qdisc prio 1: dev ppp0 root bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
qdisc tbf 2: dev ppp0 parent 1:1 rate 128000bit burst 1024Kb peakrate 256000bit minburst 16Kb lat 500.0ms
filter dev ppp0 parent 1: protocol ip pref 10 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 2:1
  match 00000000/00000000 at 16

VISP-Office ~ #cat /proc/net/psched
00000001 00000001 000f4240 000003e8

Of course, when I run tc2 I can see in the stats that traffic has stopped
(tc is the working binary, tc2 the buggy one):

VISP-Office ~ #tc -s qdisc show dev ppp0
qdisc ingress ffff: ----------------
 Sent 184893 bytes 2311 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
qdisc prio 1: bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 7765 bytes 64 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 64p requeues 0
qdisc tbf 2: parent 1:1 rate 128000bit burst 4294932937b peakrate 256000bit minburst 16Kb lat 4.2s
 Sent 7765 bytes 64 pkt (dropped 0, overlimits 64 requeues 0)
 rate 0bit 0pps backlog 0b 64p requeues 0

VISP-Office ~ #tc2 -s qdisc show dev ppp0
qdisc ingress ffff: ----------------
 Sent 186423 bytes 2324 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
qdisc prio 1: bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 8677 bytes 77 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 77p requeues 0
qdisc tbf 2: parent 1:1 rate 128000bit burst 4294932937b peakrate 256000bit minburst 16Kb lat 4.2s
 Sent 8677 bytes 77 pkt (dropped 0, overlimits 77 requeues 0)
 rate 0bit 0pps backlog 0b 77p requeues 0

I hope this is enough information.
Thanks for your help!

On Thu, 22 Mar 2007 12:23:03 +0100, Patrick McHardy wrote
> Denys wrote:
> > Possible i discovered bug, but maybe specific to my setup.
> >
> > In your sources (tc/tc_core.h) i notice
> > #define TIME_UNITS_PER_SEC 1000000000
> > When i change it to
> > #define TIME_UNITS_PER_SEC 1000000.0
> > (it was value before in sources)
> > everything works fine. Otherwise tbf not working at all, it is dropping all
> > packets.
> >
> > Did anyone test new iproute2 with tbf?
>
> Yes, please send the commands you use.

--
Virtual ISP S.A.L.

^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: iproute2-2.6.20-070313 bug ?
  2007-03-22 12:46 ` Denys
@ 2007-03-22 13:09 ` Patrick McHardy
  [not found] ` <20070322131245.M85528@visp.net.lb>
  ` (2 more replies)
  0 siblings, 3 replies; 41+ messages in thread
From: Patrick McHardy @ 2007-03-22 13:09 UTC (permalink / raw)
  To: Denys; +Cc: netdev, Stephen Hemminger

Denys wrote:
> /sbin/tc2 qdisc del dev ppp0 root
> /sbin/tc2 qdisc add dev ppp0 root handle 1: prio
> /sbin/tc2 qdisc add dev ppp0 parent 1:1 handle 2: tbf buffer 1024kb latency
> 500ms rate 128kbit peakrate 256kbit minburst 16384
> /sbin/tc2 filter add dev ppp0 parent 1:0 protocol ip prio 10 u32 match ip dst
> 0.0.0.0/0 flowid 2:1

That is an incredibly huge buffer value.

> qdisc tbf 2: parent 1:1 rate 128000bit burst 4294932937b peakrate 256000bit minburst 16Kb lat 4.2s

And it causes an overflow.

The limit for the TBF burst value with nanosecond resolution is
~ 4 * rate (10^9 * burst / rate < 2^32 needs to hold), resulting
in a worst-case latency of 4 seconds. I think this limit is in the
reasonable range. Your configuration results in a worst-case
queuing delay of 64s, and I doubt that you really want that.

Obviously it's not good to break existing configurations, but I
would argue that this configuration is broken.

^ permalink raw reply [flat|nested] 41+ messages in thread
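To make the overflow condition above concrete, here is a small sketch of the arithmetic. It assumes tc's usual unit conventions (kb = 1024 bytes, kbit = 1000 bits) and a 32-bit field for the tick count; the function names are hypothetical and are not taken from iproute2.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of iproute2: time needed to send
 * size_bytes at rate_bytes_per_sec, expressed in scheduler ticks.
 * 64-bit math is used so the intermediate product does not wrap for
 * the values discussed in this thread. */
static uint64_t xmit_time_ticks(uint64_t ticks_per_sec, uint64_t size_bytes,
                                uint64_t rate_bytes_per_sec)
{
	assert(rate_bytes_per_sec > 0);
	return ticks_per_sec * size_bytes / rate_bytes_per_sec;
}

/* Nonzero when the tick count still fits in an unsigned 32-bit field. */
static int fits_in_u32(uint64_t ticks)
{
	return ticks <= UINT32_MAX;
}
```

Under those assumptions, "buffer 1024kb" is 1048576 bytes and "rate 128kbit" is 16000 bytes per second. With 1000000000 ticks per second the result is 65536000000 ticks, which exceeds 2^32 (about 4.29 * 10^9); with 1000000 ticks per second it is 65536000 ticks, which fits.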
[parent not found: <20070322131245.M85528@visp.net.lb>]
* Re: iproute2-2.6.20-070313 bug ?
  [not found] ` <20070322131245.M85528@visp.net.lb>
@ 2007-03-22 13:23 ` Patrick McHardy
  2007-03-22 13:35 ` Denys
  [not found] ` <20070322132637.M88445@visp.net.lb>
  0 siblings, 2 replies; 41+ messages in thread
From: Patrick McHardy @ 2007-03-22 13:23 UTC (permalink / raw)
  To: Denys; +Cc: Linux Netdev List, Stephen Hemminger

Please don't remove CCs.

Denys wrote:
> 1024kb (if i am not wrong 1Mbyte) is huge?
>
> For me it is ok, as soon as i have RAM.

It's not about the memory, it's about the resulting queueing delay.
If you buffer packets for 64 seconds the sender will retransmit
them and you end up wasting bandwidth.

> Another thing, it is working well
> with old tc. Just really if i have plenty of RAM's and i want 32second
> buffer, why i cannot have that, and if i see it is really possible before?

I know it worked before. But I can't think of a reason why anyone
would want a buffer that large. Why do you want to queue packets
for up to 64 seconds?

^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: iproute2-2.6.20-070313 bug ?
  2007-03-22 13:23 ` Patrick McHardy
@ 2007-03-22 13:35 ` Denys
  [not found] ` <20070322132637.M88445@visp.net.lb>
  1 sibling, 0 replies; 41+ messages in thread
From: Denys @ 2007-03-22 13:35 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: netdev, Stephen Hemminger

On Thu, 22 Mar 2007 14:23:01 +0100, Patrick McHardy wrote
> Please don't remove CCs.
>
> Denys wrote:
> > 1024kb (if i am not wrong 1Mbyte) is huge?
> >
> > For me it is ok, as soon as i have RAM.
>
> Its not about the memory, its about the resulting queueing delay.
> If you buffer packets for 64 seconds the sender will retransmit
> them and you end up wasting bandwidth.
>
> > Another thing, it is working well
> > with old tc. Just really if i have plenty of RAM's and i want 32second
> > buffer, why i cannot have that, and if i see it is really possible before?
>
> I know it worked before. But I can't think of a reason why anyone
> would want a buffer that large. Why do you want to queue packets
> for up to 64 seconds?

It seems I misunderstand how it works. If I am not mistaken, as long as
the buffer is available, bandwidth is given at the "peakrate" speed, and
once the buffer is empty, at the "rate" speed. Am I wrong?
At least it worked like this before.

--
Virtual ISP S.A.L.

^ permalink raw reply [flat|nested] 41+ messages in thread
[parent not found: <20070322132637.M88445@visp.net.lb>]
* Re: iproute2-2.6.20-070313 bug ?
  [not found] ` <20070322132637.M88445@visp.net.lb>
@ 2007-03-22 13:43 ` Patrick McHardy
  2007-03-22 13:47 ` Denys
  0 siblings, 1 reply; 41+ messages in thread
From: Patrick McHardy @ 2007-03-22 13:43 UTC (permalink / raw)
  To: Denys; +Cc: Linux Netdev List, Stephen Hemminger

Denys wrote:
>>>Another thing, it is working well
>>>with old tc. Just really if i have plenty of RAM's and i want 32second
>>>buffer, why i cannot have that, and if i see it is really possible before?
>>
>>I know it worked before. But I can't think of a reason why anyone
>>would want a buffer that large. Why do you want to queue packets
>>for up to 64 seconds?
>
> Seems i misunderstand how it works. If i am not wrong, till buffer available,
> bandwidth will be given on "peakrate" speed, and when buffer is empty - on
> "rate" speed. I am wrong?

No, I got confused, sorry about that. Your configuration allows
bursts up to 64 seconds long. I guess there's nothing wrong with that.

I already asked Stephen to revert that patch, it was not meant to
be included yet, unfortunately it made it into the release. Even
more unfortunate is that it looks like we need larger types in the
ABI to properly support nano-second resolution.

^ permalink raw reply [flat|nested] 41+ messages in thread
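As a rough illustration of where a burst length of about 64 seconds can come from, consider an idealized token bucket. This is an assumption made for the sketch; the kernel's TBF differs in detail, and the function name is hypothetical. While the sender transmits at the peak rate the bucket refills at the base rate, so a full bucket drains at peak minus rate.

```c
#include <assert.h>
#include <stdint.h>

/* Idealized token-bucket model (an assumption, not sch_tbf code):
 * a full bucket of burst_bytes drains at (peak - rate) bytes/s while
 * the sender transmits at the peak rate. Returns the burst length in
 * milliseconds. */
static uint64_t burst_duration_ms(uint64_t burst_bytes,
                                  uint64_t rate_bytes_per_sec,
                                  uint64_t peak_bytes_per_sec)
{
	assert(peak_bytes_per_sec > rate_bytes_per_sec);
	return burst_bytes * 1000 / (peak_bytes_per_sec - rate_bytes_per_sec);
}
```

For the configuration in this thread (1048576 bytes of buffer, 16000 bytes/s rate, 32000 bytes/s peak rate, assuming kb = 1024 bytes and kbit = 1000 bits), the model gives 65536 ms, in line with the roughly 64 seconds mentioned above. Because the peak rate is exactly twice the rate here, this happens to equal burst / rate as well.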
* Re: iproute2-2.6.20-070313 bug ?
  2007-03-22 13:43 ` Patrick McHardy
@ 2007-03-22 13:47 ` Denys
  0 siblings, 0 replies; 41+ messages in thread
From: Denys @ 2007-03-22 13:47 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Linux Netdev List, Stephen Hemminger

Hi again

On Thu, 22 Mar 2007 14:43:43 +0100, Patrick McHardy wrote
> Denys wrote:
> >>>Another thing, it is working well
> >>>with old tc. Just really if i have plenty of RAM's and i want 32second
> >>>buffer, why i cannot have that, and if i see it is really possible before?
> >>
> >>I know it worked before. But I can't think of a reason why anyone
> >>would want a buffer that large. Why do you want to queue packets
> >>for up to 64 seconds?
> >
> > Seems i misunderstand how it works. If i am not wrong, till buffer available,
> > bandwidth will be given on "peakrate" speed, and when buffer is empty - on
> > "rate" speed. I am wrong?
>
> No, I got confused, sorry about that. Your configuration allows
> bursts up to 64 seconds long. I guess there's nothing wrong with that.
>
> I already asked Stephen to revert that patch, it was not meant to
> be included yet, unfortunately it made it into the release. Even
> more unfortunate is that it looks like we need larger types in the
> ABI to properly support nano-second resolution.

Thank you, sir, for your great help and great software, and sorry for
taking your time with my poor explanation.

--
Virtual ISP S.A.L.

^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: iproute2-2.6.20-070313 bug ?
  2007-03-22 13:09 ` Patrick McHardy
  [not found] ` <20070322131245.M85528@visp.net.lb>
@ 2007-03-22 13:26 ` Denys
  2007-03-22 17:12 ` Stephen Hemminger
  2 siblings, 0 replies; 41+ messages in thread
From: Denys @ 2007-03-22 13:26 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: netdev, Stephen Hemminger

Dear Sir,

Sorry, I forgot to CC the other members of the discussion.

Is 1024kb (1 Mbyte, if I am not wrong) huge?

For me it is OK, as long as I have the RAM. Also, it works well with the
old tc. If I have plenty of RAM and I want a 32-second buffer, why can't
I have that, given that it was clearly possible before?

Possibly I am misunderstanding something...

In the real world I find it reasonable to have much bigger buffers,
especially when resources (RAM, timer resolution, CPU) are not a problem.
For example, as I recall, we had a failure on one of our STM-1 links, and
the Ciscos at Teleglobe were buffering about 20-30 seconds of data without
major packet loss.

Also, here is why I was using the buffer, and possibly I am using it wrong:
say a customer has a 128 Kbit/s account and I want to give him a burst
(256 Kbit/s) so that web pages open fast; if he uses bandwidth non-stop, he
will exhaust this buffer and be throttled back to 128 Kbit/s.
Now it seems I cannot provide this functionality.

On Thu, 22 Mar 2007 14:09:06 +0100, Patrick McHardy wrote
> Denys wrote:
> > /sbin/tc2 qdisc del dev ppp0 root
> > /sbin/tc2 qdisc add dev ppp0 root handle 1: prio
> > /sbin/tc2 qdisc add dev ppp0 parent 1:1 handle 2: tbf buffer 1024kb latency
> > 500ms rate 128kbit peakrate 256kbit minburst 16384
> > /sbin/tc2 filter add dev ppp0 parent 1:0 protocol ip prio 10 u32 match ip dst
> > 0.0.0.0/0 flowid 2:1
>
> That is an incredible huge buffer value.
>
> > qdisc tbf 2: parent 1:1 rate 128000bit burst 4294932937b peakrate
> 256000bit minburst 16Kb lat 4.2s
>
> And it causes an overflow.
>
> The limit for the TBF burst value with nanosecond resolution is
> ~ 4 * rate (10^9 * burst / rate < 2^32 needs to hold), resulting
> in a worst-case latency of 4 seconds. I think this limit is in the
> reasonable range. Your configuration results in a worst-case
> queuing delay of 64s, and I doubt that you really want that.
>
> Obviously its not good to break existing configurations, but I
> would argue that this configuration is broken.

--
Virtual ISP S.A.L.

^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: iproute2-2.6.20-070313 bug ?
  2007-03-22 13:09 ` Patrick McHardy
  [not found] ` <20070322131245.M85528@visp.net.lb>
  2007-03-22 13:26 ` Denys
@ 2007-03-22 17:12 ` Stephen Hemminger
  2007-03-22 17:14 ` Patrick McHardy
  ` (2 more replies)
  2 siblings, 3 replies; 41+ messages in thread
From: Stephen Hemminger @ 2007-03-22 17:12 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Denys, netdev

On Thu, 22 Mar 2007 14:09:06 +0100
Patrick McHardy <kaber@trash.net> wrote:

> Denys wrote:
> > /sbin/tc2 qdisc del dev ppp0 root
> > /sbin/tc2 qdisc add dev ppp0 root handle 1: prio
> > /sbin/tc2 qdisc add dev ppp0 parent 1:1 handle 2: tbf buffer 1024kb latency
> > 500ms rate 128kbit peakrate 256kbit minburst 16384
> > /sbin/tc2 filter add dev ppp0 parent 1:0 protocol ip prio 10 u32 match ip dst
> > 0.0.0.0/0 flowid 2:1
>
>
> That is an incredible huge buffer value.
>
> > qdisc tbf 2: parent 1:1 rate 128000bit burst 4294932937b peakrate
> 256000bit minburst 16Kb lat 4.2s
>
> And it causes an overflow.
>
> The limit for the TBF burst value with nanosecond resolution is
> ~ 4 * rate (10^9 * burst / rate < 2^32 needs to hold), resulting
> in a worst-case latency of 4 seconds. I think this limit is in the
> reasonable range. Your configuration results in a worst-case
> queuing delay of 64s, and I doubt that you really want that.
>
> Obviously its not good to break existing configurations, but I
> would argue that this configuration is broken.
>

tc should check for overflows and doesn't. Do you want to make a patch
for the obvious cases?

--
Stephen Hemminger <shemminger@linux-foundation.org>

^ permalink raw reply [flat|nested] 41+ messages in thread
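The kind of overflow check suggested above could look roughly like the following. This is only a sketch under stated assumptions: the names are hypothetical, the tick field is taken to be 32 bits wide with nanosecond resolution, and none of this is the actual iproute2 code or the patch that was eventually written.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NSEC_PER_SEC 1000000000ULL /* hypothetical name */

/* Hypothetical check, not the actual iproute2 code: convert a burst size
 * to nanosecond ticks and refuse values that do not fit in 32 bits.
 * Returns 0 on success, -1 on overflow. */
static int burst_to_ticks_checked(uint64_t burst_bytes,
                                  uint64_t rate_bytes_per_sec,
                                  uint32_t *ticks_out)
{
	uint64_t ticks;

	assert(rate_bytes_per_sec > 0 && ticks_out != NULL);
	ticks = SKETCH_NSEC_PER_SEC * burst_bytes / rate_bytes_per_sec;
	if (ticks > UINT32_MAX)
		return -1;
	*ticks_out = (uint32_t)ticks;
	return 0;
}

/* A burst size in bytes that is guaranteed to pass the check at the
 * given rate; roughly 4.29 * rate, matching the "~ 4 * rate" limit. */
static uint64_t max_burst_bytes(uint64_t rate_bytes_per_sec)
{
	return (uint64_t)UINT32_MAX * rate_bytes_per_sec / SKETCH_NSEC_PER_SEC;
}
```

For a rate of 16000 bytes/s (128kbit), `max_burst_bytes` returns 68719, so a caller could reject "buffer 1024kb" with an error message instead of silently programming a wrapped value.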
* Re: iproute2-2.6.20-070313 bug ?
  2007-03-22 17:12 ` Stephen Hemminger
@ 2007-03-22 17:14 ` Patrick McHardy
  2007-03-26 18:49 ` more... iproute2/htb/whatever critical " Denys
  2007-03-31 2:26 ` more iproute2 issues (not critical) Denys
  2007-04-04 0:03 ` one more... iproute commands lockup whole system Denys
  2 siblings, 1 reply; 41+ messages in thread
From: Patrick McHardy @ 2007-03-22 17:14 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: Denys, netdev

Stephen Hemminger wrote:
> tc should check for overflows and doesn't. Do you want to make a patch
> for the obvious cases?

I agree. I'll take care of it once I'm done with my patches for
2.6.22.

^ permalink raw reply [flat|nested] 41+ messages in thread
* more... iproute2/htb/whatever critical bug ?
  2007-03-22 17:14 ` Patrick McHardy
@ 2007-03-26 18:49 ` Denys
  2007-03-27 15:10 ` Patrick McHardy
  0 siblings, 1 reply; 41+ messages in thread
From: Denys @ 2007-03-26 18:49 UTC (permalink / raw)
  To: Patrick McHardy, Stephen Hemminger; +Cc: netdev

Hello again,

One more bug, and it seems even more painful: a kernel panic...
Kernel 2.6.20.3.

It is a somewhat complicated setup.

Shaper initialization on the ifb0 device:

$TC qdisc add dev ifb0 root handle 1: htb default 1000
$TC filter add dev ifb0 parent 1:0 prio 1000 protocol ip u32 match ip dst 0.0.0.0/0 classid 1:131

Then, when an interface comes up:

$TC class add dev ifb0 classid 1:${id} parent 1: htb rate ${rate}bit quantum 1600
$TC qdisc add dev ifb0 parent 1:${id} handle ${id}: sfq perturb 5
$TC filter add dev ifb0 protocol ip pref ${id} parent 1: handle ${id} fw classid 1:${id}
$TC qdisc del dev $2 ingress 1>/dev/null 2>/dev/null
$TC qdisc add dev $2 ingress
$TC filter add dev $2 parent ffff: protocol ip prio 10 u32 \
 match u32 0 0 flowid 1:1 \
 action ipt -j MARK --set-mark ${id} \
 action mirred egress redirect dev ifb0

The panic happens when an interface (pppN) goes down; these commands are
executed in ip-down:

$TC filter del dev ifb0 protocol ip pref ${id}
$TC class del dev ifb0 classid 1:${id}

Mar 26 23:21:11 ROUTER-75 pppd[22773]: Connection terminated.
Mar 26 21:20:04 ROUTER-75 [ 551.481081] BUG: unable to handle kernel NULL pointer dereference
Mar 26 21:20:04 ROUTER-75 at virtual address 00000074
Mar 26 21:20:04 ROUTER-75 [ 551.481187] printing eip:
Mar 26 21:20:04 ROUTER-75 [ 551.481236] f8a11df1
Mar 26 21:20:04 ROUTER-75 [ 551.481289] *pde = 00000000
Mar 26 21:20:04 ROUTER-75 [ 551.481340] Oops: 0000 [#1]
Mar 26 21:20:04 ROUTER-75 [ 551.481384]
Mar 26 21:20:04 ROUTER-75 SMP
Mar 26 21:20:04 ROUTER-75
Mar 26 21:20:04 ROUTER-75 [ 551.481549] Modules linked in:
....
long module list ar 26 21:20:04 ROUTER-75 [ 551.485237] CPU: 0 Mar 26 21:20:04 ROUTER-75 [ 551.485238] EIP: 0060:[<f8a11df1>] Not tainted VLI Mar 26 21:20:04 ROUTER-75 [ 551.485239] EFLAGS: 00010282 (2.6.20.3-build- 0001 #4) Mar 26 21:20:04 ROUTER-75 [ 551.485438] EIP is at htb_qlen_notify+0x9/0x79 [sch_htb] Mar 26 21:20:04 ROUTER-75 [ 551.485487] eax: f3aa9800 ebx: 00000000 ecx: 00000000 edx: 00000000 Mar 26 21:20:04 ROUTER-75 [ 551.485594] esi: f3aa9800 edi: f8a13420 ebp: 0000000c esp: f22c3c40 Mar 26 21:20:04 ROUTER-75 [ 551.485644] ds: 007b es: 007b ss: 0068 Mar 26 21:20:04 ROUTER-75 [ 551.485691] Process tc (pid: 24816, ti=f22c2000 task=f204a030 task.ti=f22c2000) Mar 26 21:20:04 ROUTER-75 Mar 26 21:20:04 ROUTER-75 [ 551.485741] Stack: Mar 26 21:20:04 ROUTER-75 f2214000 Mar 26 21:20:04 ROUTER-75 f220e000 Mar 26 21:20:04 ROUTER-75 00000000 Mar 26 21:20:04 ROUTER-75 00010263 Mar 26 21:20:04 ROUTER-75 00000000 Mar 26 21:20:04 ROUTER-75 f3aa9800 Mar 26 21:20:04 ROUTER-75 c021b0b5 Mar 26 21:20:04 ROUTER-75 0000000c Mar 26 21:20:04 ROUTER-75 Mar 26 21:20:04 ROUTER-75 [ 551.486247] Mar 26 21:20:04 ROUTER-75 f2a4c600 Mar 26 21:20:04 ROUTER-75 00000000 Mar 26 21:20:04 ROUTER-75 f3aa9800 Mar 26 21:20:04 ROUTER-75 Mar 26 21:20:04 ROUTER-75 [ 551.486705] Mar 26 21:20:04 ROUTER-75 00000000 Mar 26 21:20:04 ROUTER-75 f2a4c600 Mar 26 21:20:04 ROUTER-75 ffffffed Mar 26 21:20:04 ROUTER-75 00010263 Mar 26 21:20:04 ROUTER-75 c02e5318 Mar 26 21:20:04 ROUTER-75 [ 551.487226] Call Trace: Mar 26 21:20:04 ROUTER-75 tc_ctl_tclass+0x144/0x1fd Mar 26 21:20:04 ROUTER-75 rtnetlink_rcv_msg+0x0/0x1d2 Mar 26 21:20:04 ROUTER-75 [ 551.488183] [<c02144d5>] Mar 26 21:20:04 ROUTER-75 [ 551.488266] [<c0220aae>] Mar 26 21:20:04 ROUTER-75 netlink_data_ready+0x12/0x4c Mar 26 21:20:04 ROUTER-75 [ 551.488415] [<c021fab9>] Mar 26 21:20:04 ROUTER-75 [ 551.488498] [<c0220a90>] Mar 26 21:20:04 ROUTER-75 netlink_sendmsg+0x23b/0x247 Mar 26 21:20:04 ROUTER-75 [ 551.488658] [<c0204f0c>] Mar 26 21:20:04 ROUTER-75 
sock_sendmsg+0xbc/0xd4 Mar 26 21:20:04 ROUTER-75 autoremove_wake_function+0x0/0x35 Mar 26 21:20:04 ROUTER-75 [ 551.488913] [<c0126545>] Mar 26 21:20:04 ROUTER-75 autoremove_wake_function+0x0/0x35 Mar 26 21:20:04 ROUTER-75 [ 551.489006] [<c021ad27>] Mar 26 21:20:04 ROUTER-75 __qdisc_run+0x2e/0x182 Mar 26 21:20:04 ROUTER-75 [ 551.489158] [<f8a0a03e>] Mar 26 21:20:04 ROUTER-75 tcf_mirred+0x0/0x14d [act_mirred] Mar 26 21:20:04 ROUTER-75 [ 551.489249] [<c020ad12>] Mar 26 21:20:04 ROUTER-75 verify_iovec+0x3e/0x70 Mar 26 21:20:04 ROUTER-75 sys_sendmsg+0x194/0x1f9 Mar 26 21:20:04 ROUTER-75 [ 551.489483] [<c020585d>] Mar 26 21:20:04 ROUTER-75 sys_recvmsg+0x14d/0x1cf Mar 26 21:20:04 ROUTER-75 [ 551.489628] [<c0137d13>] Mar 26 21:20:04 ROUTER-75 get_page_from_freelist+0x253/0x2d3 Mar 26 21:20:04 ROUTER-75 [ 551.489717] [<f8a2d18f>] Mar 26 21:20:04 ROUTER-75 ppp_input+0xc6/0xe6 [ppp_generic] Mar 26 21:20:04 ROUTER-75 [ 551.489868] [<f8a34cc4>] Mar 26 21:20:04 ROUTER-75 [ 551.489960] [<c021b315>] Mar 26 21:20:04 ROUTER-75 tc_classify+0x34/0xbc Mar 26 21:20:04 ROUTER-75 [ 551.490110] [<f89d010b>] Mar 26 21:20:04 ROUTER-75 ingress_enqueue+0x16/0x55 [sch_ingress] Mar 26 21:20:04 ROUTER-75 [ 551.490200] [<c020d88b>] Mar 26 21:20:04 ROUTER-75 netif_receive_skb+0x215/0x349 Mar 26 21:20:04 ROUTER-75 [ 551.490346] [<f8914b48>] Mar 26 21:20:04 ROUTER-75 rtl8139_interrupt+0x2a1/0x3c5 [8139too] Mar 26 21:20:04 ROUTER-75 [ 551.490438] [<c020f143>] Mar 26 21:20:04 ROUTER-75 [ 551.490907] Code: Mar 26 21:20:04 ROUTER-75 83 Mar 26 21:20:04 ROUTER-75 24 Mar 26 21:20:04 ROUTER-75 ff Mar 26 21:20:04 ROUTER-75 0f Mar 26 21:20:04 ROUTER-75 85 Mar 26 21:20:04 ROUTER-75 4f Mar 26 21:20:04 ROUTER-75 ff Mar 26 21:20:04 ROUTER-75 ff Mar 26 21:20:04 ROUTER-75 24 Mar 26 21:20:04 ROUTER-75 1c Mar 26 21:20:04 ROUTER-75 00 Mar 26 21:20:04 ROUTER-75 8b Mar 26 21:20:04 ROUTER-75 1c Mar 26 21:20:04 ROUTER-75 20 Mar 26 21:20:04 ROUTER-75 5b Mar 26 21:20:04 ROUTER-75 5f Mar 26 21:20:04 ROUTER-75 5d Mar 26 
21:20:04 ROUTER-75 c3 Mar 26 21:20:04 ROUTER-75 56 Mar 26 21:20:04 ROUTER-75 c6 Mar 26 21:20:04 ROUTER-75 53 Mar 26 21:20:04 ROUTER-75 83 Mar 26 21:20:04 ROUTER-75 ec Mar 26 21:20:04 ROUTER-75 10 Mar 26 21:20:04 ROUTER-75 unparseable log message: "<8b> " Mar 26 21:20:04 ROUTER-75 74 Mar 26 21:20:04 ROUTER-75 00 Mar 26 21:20:04 ROUTER-75 83 Mar 26 21:20:04 ROUTER-75 01 Mar 26 21:20:04 ROUTER-75 00 Mar 26 21:20:04 ROUTER-75 00 Mar 26 21:20:04 ROUTER-75 24 Mar 26 21:20:04 ROUTER-75 c7 Mar 26 21:20:04 ROUTER-75 Mar 26 21:20:04 ROUTER-75 [ 551.493934] EIP: [<f8a11df1>] Mar 26 21:20:04 ROUTER-75 htb_qlen_notify+0x9/0x79 [sch_htb] Mar 26 21:20:04 ROUTER-75 SS:ESP 0068:f22c3c40 Mar 26 21:20:04 ROUTER-75 [ 551.494059] Mar 26 21:20:04 ROUTER-75 Kernel panic - not syncing: Fatal exception in interrupt Mar 26 21:20:04 ROUTER-75 Rebooting in 10 seconds.. One more panic Mar 26 21:39:30 ROUTER-75 [ 429.699144] BUG: unable to handle kernel NULL pointer dereference Mar 26 21:39:30 ROUTER-75 at virtual address 00000074 Mar 26 21:39:30 ROUTER-75 [ 429.699272] printing eip: Mar 26 21:39:30 ROUTER-75 [ 429.699328] f8a11df1 Mar 26 21:39:30 ROUTER-75 [ 429.699386] *pde = 00000000 Mar 26 21:39:30 ROUTER-75 [ 429.699441] Oops: 0000 [#1] Mar 26 21:39:30 ROUTER-75 [ 429.699490] Mar 26 21:39:30 ROUTER-75 SMP Mar 26 21:39:30 ROUTER-75 Mar 26 21:39:30 ROUTER-75 [ 429.699607] Modules linked in: .... 
list of modules Mar 26 21:39:30 ROUTER-75 [ 429.702737] CPU: 1 Mar 26 21:39:30 ROUTER-75 [ 429.702738] EIP: 0060:[<f8a11df1>] Not tainted VLI Mar 26 21:39:30 ROUTER-75 [ 429.702740] EFLAGS: 00010282 (2.6.20.3-build- 0001 #4) Mar 26 21:39:30 ROUTER-75 [ 429.702889] EIP is at htb_qlen_notify+0x9/0x79 [sch_htb] Mar 26 21:39:30 ROUTER-75 [ 429.702940] eax: f36d3800 ebx: 00000000 ecx: 00000000 edx: 00000000 Mar 26 21:39:30 ROUTER-75 [ 429.702994] esi: f36d3800 edi: f8a13420 ebp: 0000000e esp: f2093c40 Mar 26 21:39:30 ROUTER-75 [ 429.703048] ds: 007b es: 007b ss: 0068 Mar 26 21:39:30 ROUTER-75 [ 429.703098] Process tc (pid: 27124, ti=f2092000 task=f2840550 task.ti=f2092000) Mar 26 21:39:30 ROUTER-75 Mar 26 21:39:30 ROUTER-75 [ 429.703151] Stack: Mar 26 21:39:30 ROUTER-75 00000000 Mar 26 21:39:30 ROUTER-75 00010253 Mar 26 21:39:30 ROUTER-75 c021b0b5 Mar 26 21:39:30 ROUTER-75 0000000e Mar 26 21:39:30 ROUTER-75 [ 429.703509] Mar 26 21:39:30 ROUTER-75 f2dd8a00 Mar 26 21:39:30 ROUTER-75 0000000e Mar 26 21:39:30 ROUTER-75 0000001c Mar 26 21:39:30 ROUTER-75 f2093ccc Mar 26 21:39:30 ROUTER-75 00010000 Mar 26 21:39:30 ROUTER-75 f2dd8a00 Mar 26 21:39:30 ROUTER-75 ffffffed Mar 26 21:39:30 ROUTER-75 00010253 Mar 26 21:39:30 ROUTER-75 00000000 Mar 26 21:39:30 ROUTER-75 00000000 Mar 26 21:39:30 ROUTER-75 Mar 26 21:39:30 ROUTER-75 [ 429.704311] [<c011ab71>] Mar 26 21:39:30 ROUTER-75 tasklet_action+0x4b/0xa4 Mar 26 21:39:30 ROUTER-75 qdisc_tree_decrease_qlen+0x39/0x4f Mar 26 21:39:30 ROUTER-75 [ 429.704506] [<f8a11ef6>] Mar 26 21:39:30 ROUTER-75 [ 429.705347] [<c0220a90>] Mar 26 21:39:30 ROUTER-75 netlink_sendmsg+0x23b/0x247 Mar 26 21:39:30 ROUTER-75 sock_sendmsg+0xbc/0xd4 Mar 26 21:39:30 ROUTER-75 [ 429.705567] [<c0126545>] Mar 26 21:39:30 ROUTER-75 [ 429.705665] [<c0126545>] Mar 26 21:39:30 ROUTER-75 autoremove_wake_function+0x0/0x35 Mar 26 21:39:30 ROUTER-75 [ 429.705776] [<c0126545>] Mar 26 21:39:30 ROUTER-75 autoremove_wake_function+0x0/0x35 Mar 26 21:39:30 ROUTER-75 
verify_iovec+0x3e/0x70 Mar 26 21:39:30 ROUTER-75 [ 429.705958] [<c02050b8>] Mar 26 21:39:30 ROUTER-75 sys_sendmsg+0x194/0x1f9 Mar 26 21:39:30 ROUTER-75 [ 429.706056] [<c020585d>] Mar 26 21:39:30 ROUTER-75 [ 429.706149] [<c0137d13>] Mar 26 21:39:30 ROUTER-75 get_page_from_freelist+0x253/0x2d3 Mar 26 21:39:30 ROUTER-75 [ 429.706264] [<c0137de5>] Mar 26 21:39:30 ROUTER-75 __alloc_pages+0x52/0x286 Mar 26 21:39:30 ROUTER-75 __handle_mm_fault+0x409/0x60a Mar 26 21:39:30 ROUTER-75 [ 429.706494] [<c0205f58>] Mar 26 21:39:30 ROUTER-75 sys_socketcall+0x223/0x242 Mar 26 21:39:30 ROUTER-75 [ 429.706596] [<c010f378>] Mar 26 21:39:30 ROUTER-75 [ 429.706681] [<c0102c8c>] Mar 26 21:39:30 ROUTER-75 sysenter_past_esp+0x5d/0x81 Mar 26 21:39:30 ROUTER-75 [ 429.706794] ======================= Mar 26 21:39:30 ROUTER-75 ff Mar 26 21:39:30 ROUTER-75 c7 Mar 26 21:39:30 ROUTER-75 44 Mar 26 21:39:30 ROUTER-75 00 Mar 26 21:39:30 ROUTER-75 00 Mar 26 21:39:30 ROUTER-75 24 Mar 26 21:39:30 ROUTER-75 1c Mar 26 21:39:30 ROUTER-75 20 Mar 26 21:39:30 ROUTER-75 5b Mar 26 21:39:30 ROUTER-75 5d Mar 26 21:39:30 ROUTER-75 c3 Mar 26 21:39:30 ROUTER-75 c6 Mar 26 21:39:30 ROUTER-75 53 Mar 26 21:39:30 ROUTER-75 83 Mar 26 21:39:30 ROUTER-75 ec Mar 26 21:39:30 ROUTER-75 42 Mar 26 21:39:30 ROUTER-75 74 Mar 26 21:39:30 ROUTER-75 00 Mar 26 21:39:30 ROUTER-75 75 Mar 26 21:39:30 ROUTER-75 83 Mar 26 21:39:30 ROUTER-75 ba Mar 26 21:39:30 ROUTER-75 00 Mar 26 21:39:30 ROUTER-75 00 Mar 26 21:39:30 ROUTER-75 24 Mar 26 21:39:30 ROUTER-75 c7 Mar 26 21:39:30 ROUTER-75 Mar 26 21:39:30 ROUTER-75 [ 429.709043] EIP: [<f8a11df1>] Mar 26 21:39:30 ROUTER-75 SS:ESP 0068:f2093c40 I am trying to remove command, that executed at interface removal, and moving it to ip-up stage (so when it init interface, it will try to delete old classes first). -- Virtual ISP S.A.L. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: more... iproute2/htb/whatever critical bug ?
  2007-03-26 18:49 ` more... iproute2/htb/whatever critical " Denys
@ 2007-03-27 15:10 ` Patrick McHardy
  2007-03-27 15:18 ` Denys
  2007-03-27 16:00 ` Denys
  0 siblings, 2 replies; 41+ messages in thread
From: Patrick McHardy @ 2007-03-27 15:10 UTC (permalink / raw)
  To: Denys; +Cc: Stephen Hemminger, netdev

[-- Attachment #1: Type: text/plain, Size: 1154 bytes --]

Denys wrote:
> Mar 26 21:20:04 ROUTER-75 [ 551.481081] BUG: unable to handle kernel NULL
> pointer dereference
> Mar 26 21:20:04 ROUTER-75 at virtual address 00000074
> Mar 26 21:20:04 ROUTER-75 [ 551.481187] printing eip:
> Mar 26 21:20:04 ROUTER-75 [ 551.481236] f8a11df1
> Mar 26 21:20:04 ROUTER-75 [ 551.481289] *pde = 00000000
> Mar 26 21:20:04 ROUTER-75 [ 551.481340] Oops: 0000 [#1]
> Mar 26 21:20:04 ROUTER-75 [ 551.481384]
> Mar 26 21:20:04 ROUTER-75 SMP
> Mar 26 21:20:04 ROUTER-75
> Mar 26 21:20:04 ROUTER-75 [ 551.481549] Modules linked in:
> .... long module list
> Mar 26 21:20:04 ROUTER-75 [ 551.485237] CPU: 0
> Mar 26 21:20:04 ROUTER-75 [ 551.485238] EIP: 0060:[<f8a11df1>] Not
> tainted VLI
> Mar 26 21:20:04 ROUTER-75 [ 551.485239] EFLAGS: 00010282 (2.6.20.3-build-
> 0001 #4)
> Mar 26 21:20:04 ROUTER-75 [ 551.485438] EIP is at htb_qlen_notify+0x9/0x79
> [sch_htb]

Oops, that seems to be my fault. Can you please try the attached patch?
To reproduce the problem you need to have packets queued while the
device is going down, so please make sure that is true by flooding
the device or something like that.

[-- Attachment #2: x --]
[-- Type: text/plain, Size: 1334 bytes --]

[NET_SCHED]: sch_htb: fix oops in htb_qlen_notify

htb_delete calls qdisc_tree_decrease_qlen after removing the class from
the class hash. This makes the ->get operation in qdisc_tree_decrease_qlen
fail, so it passes a NULL pointer to htb_qlen_notify, causing an oops.

Signed-off-by: Patrick McHardy <kaber@trash.net>

---
commit 866d03284bf4ae694c8c12c1742dde1c5c06eca7
tree e43ac244d44b2422f0acb92dc9a48a66f92c2792
parent 703071b5b93d88d5acb0edd5b9dd86c69ad970f2
author Patrick McHardy <kaber@trash.net> Tue, 27 Mar 2007 17:07:50 +0200
committer Patrick McHardy <kaber@trash.net> Tue, 27 Mar 2007 17:07:50 +0200

 net/sched/sch_htb.c | 6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c
index 97cbb9a..3c3294d 100644
--- a/net/sched/sch_htb.c
+++ b/net/sched/sch_htb.c
@@ -1380,15 +1380,15 @@ static int htb_delete(struct Qdisc *sch, unsigned long arg)
 
 	sch_tree_lock(sch);
 
-	/* delete from hash and active; remainder in destroy_class */
-	hlist_del_init(&cl->hlist);
-
 	if (!cl->level) {
 		qlen = cl->un.leaf.q->q.qlen;
 		qdisc_reset(cl->un.leaf.q);
 		qdisc_tree_decrease_qlen(cl->un.leaf.q, qlen);
 	}
 
+	/* delete from hash and active; remainder in destroy_class */
+	hlist_del_init(&cl->hlist);
+
 	if (cl->prio_activity)
 		htb_deactivate(q, cl);
 

^ permalink raw reply related [flat|nested] 41+ messages in thread
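The ordering problem described in the commit message above can be shown with a toy model. Everything in this sketch is hypothetical (the struct, the function names, the return codes); it only mimics the sequence "unhash, then notify" versus "notify, then unhash" and is not kernel code.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: these types and functions are hypothetical and merely
 * mimic the ordering issue described in the commit message. */
struct toy_class {
	int id;
	int hashed; /* 1 while the class is reachable via lookup */
	int qlen;   /* packets queued in the leaf qdisc */
};

/* Stand-in for the ->get class lookup: fails once the class is unhashed. */
static struct toy_class *toy_lookup(struct toy_class *cl)
{
	return cl->hashed ? cl : NULL;
}

/* Stand-in for the qlen notification path. Returns -1 in the situation
 * where the real callback would have been handed a NULL class pointer. */
static int toy_decrease_qlen(struct toy_class *cl)
{
	struct toy_class *found = toy_lookup(cl);

	if (found == NULL)
		return -1;
	found->qlen = 0;
	return 0;
}

/* Delete a class; unhash_first selects the pre-patch ordering. */
static int toy_delete(struct toy_class *cl, int unhash_first)
{
	int ret;

	assert(cl != NULL);
	if (unhash_first)
		cl->hashed = 0; /* old order: unhash before notifying */
	ret = toy_decrease_qlen(cl);
	cl->hashed = 0; /* patched order: the unhash happens after notifying */
	return ret;
}
```

Calling `toy_delete(&cl, 1)` on a hashed class with queued packets returns -1, the case that corresponds to the NULL pointer in the oops, while `toy_delete(&cl, 0)` finds the class, clears its queue length, and only then unhashes it.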
* Re: more... iproute2/htb/whatever critical bug ?
  2007-03-27 15:10 ` Patrick McHardy
@ 2007-03-27 15:18 ` Denys
  2007-03-27 16:00 ` Denys
  1 sibling, 0 replies; 41+ messages in thread
From: Denys @ 2007-03-27 15:18 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Stephen Hemminger, netdev

Yes, correct, I am able to reproduce the problem; I was afraid I would
not be able to. It happens when I kill pppd on the "server" side while
flooding the interface.

I will try the patch within the next 30 minutes at most.

I also have a bug related to mirred to ifb0 and Ethernet checksum
offloading, but maybe it is already fixed. I will try to remember to
report it. Right now I do not have a local computer with a card capable
of checksum offloading. It also causes a kernel panic, but I work around
it by disabling the feature with ethtool.

On Tue, 27 Mar 2007 17:10:36 +0200, Patrick McHardy wrote
> Denys wrote:
> > Mar 26 21:20:04 ROUTER-75 [ 551.481081] BUG: unable to handle kernel NULL
> > pointer dereference
> > Mar 26 21:20:04 ROUTER-75 at virtual address 00000074
> > Mar 26 21:20:04 ROUTER-75 [ 551.481187] printing eip:
> > Mar 26 21:20:04 ROUTER-75 [ 551.481236] f8a11df1
> > Mar 26 21:20:04 ROUTER-75 [ 551.481289] *pde = 00000000
> > Mar 26 21:20:04 ROUTER-75 [ 551.481340] Oops: 0000 [#1]
> > Mar 26 21:20:04 ROUTER-75 [ 551.481384]
> > Mar 26 21:20:04 ROUTER-75 SMP
> > Mar 26 21:20:04 ROUTER-75
> > Mar 26 21:20:04 ROUTER-75 [ 551.481549] Modules linked in:
> > .... long module list
> > Mar 26 21:20:04 ROUTER-75 [ 551.485237] CPU: 0
> > Mar 26 21:20:04 ROUTER-75 [ 551.485238] EIP: 0060:[<f8a11df1>] Not
> > tainted VLI
> > Mar 26 21:20:04 ROUTER-75 [ 551.485239] EFLAGS: 00010282 (2.6.20.3-build-
> > 0001 #4)
> > Mar 26 21:20:04 ROUTER-75 [ 551.485438] EIP is at htb_qlen_notify+0x9/0x79
> > [sch_htb]
>
> Oops, that seems to be my fault. Can you please try the attached patch?
> To reproduce the problem you need to have packets queued while the
> device is going down, so please make sure that is true by flooding
> the device or something like that.

--
Virtual ISP S.A.L.

^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: more... iproute2/htb/whatever critical bug ? 2007-03-27 15:10 ` Patrick McHardy 2007-03-27 15:18 ` Denys @ 2007-03-27 16:00 ` Denys 2007-03-27 16:21 ` Patrick McHardy 1 sibling, 1 reply; 41+ messages in thread From: Denys @ 2007-03-27 16:00 UTC (permalink / raw) To: Patrick McHardy; +Cc: Stephen Hemminger, netdev

Hi again,

Looks like everything is fine. I will deploy it on a few NASes tonight and
will see if there are any issues.

Were both oopses I sent related to this bug? No need to dig into anything
else?

On Tue, 27 Mar 2007 17:10:36 +0200, Patrick McHardy wrote
> Denys wrote:
> > Mar 26 21:20:04 ROUTER-75 [ 551.481081] BUG: unable to handle kernel NULL pointer dereference
> > Mar 26 21:20:04 ROUTER-75 at virtual address 00000074
> > Mar 26 21:20:04 ROUTER-75 [ 551.481187] printing eip:
> > Mar 26 21:20:04 ROUTER-75 [ 551.481236] f8a11df1
> > Mar 26 21:20:04 ROUTER-75 [ 551.481289] *pde = 00000000
> > Mar 26 21:20:04 ROUTER-75 [ 551.481340] Oops: 0000 [#1]
> > Mar 26 21:20:04 ROUTER-75 [ 551.481384]
> > Mar 26 21:20:04 ROUTER-75 SMP
> > Mar 26 21:20:04 ROUTER-75
> > Mar 26 21:20:04 ROUTER-75 [ 551.481549] Modules linked in:
> > .... long module list
> > Mar 26 21:20:04 ROUTER-75 [ 551.485237] CPU: 0
> > Mar 26 21:20:04 ROUTER-75 [ 551.485238] EIP: 0060:[<f8a11df1>] Not tainted VLI
> > Mar 26 21:20:04 ROUTER-75 [ 551.485239] EFLAGS: 00010282 (2.6.20.3-build-0001 #4)
> > Mar 26 21:20:04 ROUTER-75 [ 551.485438] EIP is at htb_qlen_notify+0x9/0x79 [sch_htb]
>
> Oops, that seems to be my fault. Can you please try the attached patch?
> To reproduce the problem you need to have packets queued while the
> device is going down, so please make sure that is true by flooding
> the device or something like that.

--
Virtual ISP S.A.L.

^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: more... iproute2/htb/whatever critical bug ? 2007-03-27 16:00 ` Denys @ 2007-03-27 16:21 ` Patrick McHardy 2007-03-28 17:38 ` another " Denys 2007-03-28 23:55 ` Denys 0 siblings, 2 replies; 41+ messages in thread From: Patrick McHardy @ 2007-03-27 16:21 UTC (permalink / raw) To: Denys; +Cc: Stephen Hemminger, netdev

Denys wrote:
> Hi again,
>
> Looks like everything is fine. I will deploy it on a few NASes tonight and
> will see if there are any issues.

Thanks for testing.

> Were both oopses I sent related to this bug?

Yes, they were both because of the same bug.

^ permalink raw reply [flat|nested] 41+ messages in thread
* another critical bug ? 2007-03-27 16:21 ` Patrick McHardy @ 2007-03-28 17:38 ` Denys 2007-03-28 23:55 ` Denys 1 sibling, 0 replies; 41+ messages in thread From: Denys @ 2007-03-28 17:38 UTC (permalink / raw) To: netdev

Something more.

With all kernel debugging enabled it did not give this panic (maybe because
the system becomes too slow). Vanilla kernel 2.6.20.3 with the htb patch
applied, ethernet cards are RTL8139. If you need anything more, let me know.

I am not sure it is iproute2, but if you can, please point me in the right
direction and tell me who I need to report it to. It also happens when I
bring the interface down while it is being flooded:

(some data was not accepted by the kernel mailing list, so I changed it to "*")

Mar 28 22:14:36 OFFICE-PPPOE pppoe-server[1456]: Session 1 closed for client 00:16:ec:7e:47:ea (172.16.102.2) on eth1
Mar 28 22:14:36 OFFICE-PPPOE pppoe-server[1456]: Sent PADT
Mar 28 22:14:36 OFFICE-PPPOE pppoe-server[1456]: PADT: Generic-Error: ******
Mar 28 22:14:36 OFFICE-PPPOE pppoe-server[1456]: PADT: Generic-Error: Received PADT from peer
Mar 28 22:14:36 OFFICE-PPPOE pppoe-server[1456]: PADT: Generic-Error: *********
Mar 28 22:14:36 OFFICE-PPPOE pppoe-server[1456]: Sent PADT
Mar 28 20:13:29 OFFICE-PPPOE [ 1758.148236] BUG: unable to handle kernel paging request Mar 28 20:13:29 OFFICE-PPPOE at virtual address 5b5a596c Mar 28 20:13:29 OFFICE-PPPOE [ 1758.148497] printing eip: Mar 28 20:13:29 OFFICE-PPPOE [ 1758.148625] *pde = 00000000 Mar 28 20:13:29 OFFICE-PPPOE [ 1758.148743] Oops: 0000 [#1] Mar 28 20:13:29 OFFICE-PPPOE [ 1758.148798] Mar 28 20:13:29 OFFICE-PPPOE SMP Mar 28 20:13:29 OFFICE-PPPOE Mar 28 20:13:29 OFFICE-PPPOE [ 1758.148992] Modules linked in: Mar 28 20:13:29 OFFICE-PPPOE netconsole Mar 28 20:13:29 OFFICE-PPPOE xt_mac Mar 28 20:13:29 OFFICE-PPPOE xt_tcpmss Mar 28 20:13:29 OFFICE-PPPOE ipt_TCPMSS Mar 28 20:13:29 OFFICE-PPPOE ipt_REJECT Mar 28 20:13:29 OFFICE-PPPOE ts_bm Mar 28 20:13:29 OFFICE-PPPOE xt_string Mar 28 20:13:29 OFFICE-PPPOE ipt_ttl Mar 28 20:13:29 OFFICE-PPPOE ifb Mar 28 20:13:29
OFFICE-PPPOE iptable_mangle Mar 28 20:13:29 OFFICE-PPPOE xt_MARK Mar 28 20:13:29 OFFICE-PPPOE xt_mark Mar 28 20:13:29 OFFICE-PPPOE pppoe Mar 28 20:13:29 OFFICE-PPPOE pppox Mar 28 20:13:29 OFFICE-PPPOE ppp_generic Mar 28 20:13:29 OFFICE-PPPOE slhc Mar 28 20:13:29 OFFICE-PPPOE xt_tcpudp Mar 28 20:13:29 OFFICE-PPPOE em_nbyte Mar 28 20:13:29 OFFICE-PPPOE cls_tcindex Mar 28 20:13:29 OFFICE-PPPOE act_gact Mar 28 20:13:29 OFFICE-PPPOE cls_rsvp Mar 28 20:13:29 OFFICE-PPPOE sch_htb Mar 28 20:13:29 OFFICE-PPPOE cls_fw Mar 28 20:13:29 OFFICE-PPPOE act_mirred Mar 28 20:13:29 OFFICE-PPPOE em_u32 Mar 28 20:13:29 OFFICE-PPPOE sch_red Mar 28 20:13:29 OFFICE-PPPOE sch_sfq Mar 28 20:13:29 OFFICE-PPPOE sch_tbf Mar 28 20:13:29 OFFICE-PPPOE sch_teql Mar 28 20:13:29 OFFICE-PPPOE cls_basic Mar 28 20:13:29 OFFICE-PPPOE sch_gred Mar 28 20:13:29 OFFICE-PPPOE act_pedit Mar 28 20:13:29 OFFICE-PPPOE sch_hfsc Mar 28 20:13:29 OFFICE-PPPOE cls_rsvp6 Mar 28 20:13:29 OFFICE-PPPOE sch_ingress Mar 28 20:13:29 OFFICE-PPPOE em_meta Mar 28 20:13:29 OFFICE-PPPOE em_text Mar 28 20:13:29 OFFICE-PPPOE act_ipt Mar 28 20:13:29 OFFICE-PPPOE sch_dsmark Mar 28 20:13:29 OFFICE-PPPOE sch_prio Mar 28 20:13:29 OFFICE-PPPOE sch_netem Mar 28 20:13:29 OFFICE-PPPOE act_simple Mar 28 20:13:29 OFFICE-PPPOE cls_u32 Mar 28 20:13:29 OFFICE-PPPOE em_cmp Mar 28 20:13:29 OFFICE-PPPOE sch_cbq Mar 28 20:13:29 OFFICE-PPPOE cls_route Mar 28 20:13:29 OFFICE-PPPOE iptable_nat Mar 28 20:13:29 OFFICE-PPPOE nf_conntrack_ipv4 Mar 28 20:13:29 OFFICE-PPPOE ipt_LOG Mar 28 20:13:29 OFFICE-PPPOE ipt_MASQUERADE Mar 28 20:13:29 OFFICE-PPPOE ipt_REDIRECT Mar 28 20:13:29 OFFICE-PPPOE nf_nat Mar 28 20:13:29 OFFICE-PPPOE nf_conntrack Mar 28 20:13:29 OFFICE-PPPOE nfnetlink Mar 28 20:13:29 OFFICE-PPPOE iptable_filter Mar 28 20:13:29 OFFICE-PPPOE ip_tables Mar 28 20:13:29 OFFICE-PPPOE x_tables Mar 28 20:13:29 OFFICE-PPPOE 8021q Mar 28 20:13:29 OFFICE-PPPOE tun Mar 28 20:13:29 OFFICE-PPPOE via_velocity Mar 28 20:13:29 OFFICE-PPPOE via_rhine Mar 28 
20:13:29 OFFICE-PPPOE sis900 Mar 28 20:13:29 OFFICE-PPPOE ne2k_pci Mar 28 20:13:29 OFFICE-PPPOE 8390 Mar 28 20:13:29 OFFICE-PPPOE skge Mar 28 20:13:29 OFFICE-PPPOE tg3 Mar 28 20:13:29 OFFICE-PPPOE 8139too Mar 28 20:13:29 OFFICE-PPPOE e1000 Mar 28 20:13:29 OFFICE-PPPOE e100 Mar 28 20:13:29 OFFICE-PPPOE block2mtd Mar 28 20:13:29 OFFICE-PPPOE usb_storage Mar 28 20:13:29 OFFICE-PPPOE mtdblock Mar 28 20:13:29 OFFICE-PPPOE mtd_blkdevs Mar 28 20:13:29 OFFICE-PPPOE usbhid Mar 28 20:13:29 OFFICE-PPPOE uhci_hcd Mar 28 20:13:29 OFFICE-PPPOE ehci_hcd Mar 28 20:13:29 OFFICE-PPPOE ohci_hcd Mar 28 20:13:29 OFFICE-PPPOE usbcore Mar 28 20:13:29 OFFICE-PPPOE Mar 28 20:13:29 OFFICE-PPPOE [ 1758.153713] CPU: 0 Mar 28 20:13:29 OFFICE-PPPOE [ 1758.153716] EIP: 0060:[<c02113c7>] Not tainted VLI Mar 28 20:13:29 OFFICE-PPPOE [ 1758.153718] EFLAGS: 00010202 (2.6.20.3- build-0005 #18) Mar 28 20:13:29 OFFICE-PPPOE [ 1758.153949] EIP is at netif_rx+0x18/0x126 Mar 28 20:13:29 OFFICE-PPPOE [ 1758.154009] eax: c0c42800 ebx: 5b5a5958 ecx: 00000001 edx: c6541c80 Mar 28 20:13:29 OFFICE-PPPOE [ 1758.154073] esi: c0319800 edi: c6541c80 ebp: c02f5f14 esp: c02f5f04 Mar 28 20:13:29 OFFICE-PPPOE [ 1758.154135] ds: 007b es: 007b ss: 0068 Mar 28 20:13:29 OFFICE-PPPOE [ 1758.154253] Process swapper (pid: 0, ti=c02f4000 task=c02cd440 task.ti=c02f4000) Mar 28 20:13:29 OFFICE-PPPOE Mar 28 20:13:29 OFFICE-PPPOE [ 1758.154316] Stack: Mar 28 20:13:29 OFFICE-PPPOE c020c2bc Mar 28 20:13:29 OFFICE-PPPOE c0319c00 Mar 28 20:13:29 OFFICE-PPPOE c0319800 Mar 28 20:13:29 OFFICE-PPPOE 00000007 Mar 28 20:13:29 OFFICE-PPPOE c02f5f24 Mar 28 20:13:29 OFFICE-PPPOE c8a1925d Mar 28 20:13:29 OFFICE-PPPOE c0319c5c Mar 28 20:13:29 OFFICE-PPPOE 00000000 Mar 28 20:13:29 OFFICE-PPPOE Mar 28 20:13:29 OFFICE-PPPOE [ 1758.154880] Mar 28 20:13:29 OFFICE-PPPOE c02f5f34 Mar 28 20:13:29 OFFICE-PPPOE c011bd9a Mar 28 20:13:29 OFFICE-PPPOE 00000001 Mar 28 20:13:29 OFFICE-PPPOE c02ea328 Mar 28 20:13:29 OFFICE-PPPOE c02f5f4c Mar 28 20:13:29 
OFFICE-PPPOE c011ba5c Mar 28 20:13:29 OFFICE-PPPOE 00000000 Mar 28 20:13:29 OFFICE-PPPOE 00000046 Mar 28 20:13:29 OFFICE-PPPOE Mar 28 20:13:29 OFFICE-PPPOE [ 1758.155536] Mar 28 20:13:29 OFFICE-PPPOE 00000000 Mar 28 20:13:29 OFFICE-PPPOE c1105164 Mar 28 20:13:29 OFFICE-PPPOE c02f5f58 Mar 28 20:13:29 OFFICE-PPPOE c011baf3 Mar 28 20:13:29 OFFICE-PPPOE 00000011 Mar 28 20:13:29 OFFICE-PPPOE c02f5f60 Mar 28 20:13:29 OFFICE-PPPOE c011bcb2 Mar 28 20:13:29 OFFICE-PPPOE c02f5f7c Mar 28 20:13:29 OFFICE-PPPOE Mar 28 20:13:29 OFFICE-PPPOE [ 1758.156124] Call Trace: Mar 28 20:13:29 OFFICE-PPPOE [ 1758.156293] [<c0103d08>] Mar 28 20:13:29 OFFICE-PPPOE show_trace_log_lvl+0x1a/0x2f Mar 28 20:13:29 OFFICE-PPPOE [ 1758.156402] [<c0103dba>] Mar 28 20:13:29 OFFICE-PPPOE show_stack_log_lvl+0x9d/0xa5 Mar 28 20:13:29 OFFICE-PPPOE [ 1758.156564] [<c0103f5c>] Mar 28 20:13:29 OFFICE-PPPOE show_registers+0x19a/0x270 Mar 28 20:13:29 OFFICE-PPPOE [ 1758.156674] [<c010414b>] Mar 28 20:13:29 OFFICE-PPPOE die+0x119/0x231 Mar 28 20:13:29 OFFICE-PPPOE [ 1758.156833] [<c0110853>] Mar 28 20:13:29 OFFICE-PPPOE do_page_fault+0x443/0x51c Mar 28 20:13:29 OFFICE-PPPOE [ 1758.156943] [<c026697c>] Mar 28 20:13:29 OFFICE-PPPOE error_code+0x7c/0x84 Mar 28 20:13:29 OFFICE-PPPOE [ 1758.157119] [<c8a1925d>] Mar 28 20:13:29 OFFICE-PPPOE ri_tasklet+0xd3/0x196 [ifb] Mar 28 20:13:29 OFFICE-PPPOE [ 1758.157227] [<c011bd9a>] Mar 28 20:13:29 OFFICE-PPPOE tasklet_action+0x4e/0xa8 Mar 28 20:13:29 OFFICE-PPPOE [ 1758.157401] [<c011ba5c>] Mar 28 20:13:29 OFFICE-PPPOE __do_softirq+0x64/0xc6 Mar 28 20:13:29 OFFICE-PPPOE [ 1758.157504] [<c011baf3>] Mar 28 20:13:29 OFFICE-PPPOE do_softirq+0x35/0x3a Mar 28 20:13:29 OFFICE-PPPOE [ 1758.157665] [<c011bcb2>] Mar 28 20:13:29 OFFICE-PPPOE irq_exit+0x38/0x3a Mar 28 20:13:29 OFFICE-PPPOE [ 1758.157768] [<c01053b1>] Mar 28 20:13:29 OFFICE-PPPOE do_IRQ+0x8a/0x9d Mar 28 20:13:29 OFFICE-PPPOE [ 1758.157927] [<c0103763>] Mar 28 20:13:29 OFFICE-PPPOE common_interrupt+0x23/0x28 Mar 28 
20:13:29 OFFICE-PPPOE [ 1758.158029] [<c0101363>] Mar 28 20:13:29 OFFICE-PPPOE cpu_idle+0x61/0x76 Mar 28 20:13:29 OFFICE-PPPOE [ 1758.158188] [<c010068f>] Mar 28 20:13:29 OFFICE-PPPOE rest_init+0x23/0x28 Mar 28 20:13:29 OFFICE-PPPOE [ 1758.158288] [<c02f977d>] Mar 28 20:13:29 OFFICE-PPPOE start_kernel+0x387/0x38f Mar 28 20:13:29 OFFICE-PPPOE [ 1758.158448] [<00000000>] Mar 28 20:13:29 OFFICE-PPPOE _stext+0x3feffc6c/0x19 Mar 28 20:13:29 OFFICE-PPPOE [ 1758.158550] ======================= Mar 28 20:13:29 OFFICE-PPPOE [ 1758.158666] Code: Mar 28 20:13:29 OFFICE-PPPOE ed Mar 28 20:13:29 OFFICE-PPPOE 53 Mar 28 20:13:29 OFFICE-PPPOE 05 Mar 28 20:13:29 OFFICE-PPPOE 00 Mar 28 20:13:29 OFFICE-PPPOE bf Mar 28 20:13:29 OFFICE-PPPOE 01 Mar 28 20:13:29 OFFICE-PPPOE 00 Mar 28 20:13:29 OFFICE-PPPOE 00 Mar 28 20:13:29 OFFICE-PPPOE 00 Mar 28 20:13:29 OFFICE-PPPOE 83 Mar 28 20:13:29 OFFICE-PPPOE c4 Mar 28 20:13:29 OFFICE-PPPOE 1c Mar 28 20:13:29 OFFICE-PPPOE 89 Mar 28 20:13:29 OFFICE-PPPOE f8 Mar 28 20:13:29 OFFICE-PPPOE 5b Mar 28 20:13:29 OFFICE-PPPOE 5e Mar 28 20:13:29 OFFICE-PPPOE 5f Mar 28 20:13:29 OFFICE-PPPOE 5d Mar 28 20:13:29 OFFICE-PPPOE c3 Mar 28 20:13:29 OFFICE-PPPOE 55 Mar 28 20:13:29 OFFICE-PPPOE 89 Mar 28 20:13:29 OFFICE-PPPOE e5 Mar 28 20:13:29 OFFICE-PPPOE 57 Mar 28 20:13:29 OFFICE-PPPOE 89 Mar 28 20:13:29 OFFICE-PPPOE c7 Mar 28 20:13:29 OFFICE-PPPOE 56 Mar 28 20:13:29 OFFICE-PPPOE 53 Mar 28 20:13:29 OFFICE-PPPOE 83 Mar 28 20:13:29 OFFICE-PPPOE ec Mar 28 20:13:29 OFFICE-PPPOE 04 Mar 28 20:13:29 OFFICE-PPPOE 8b Mar 28 20:13:29 OFFICE-PPPOE 40 Mar 28 20:13:29 OFFICE-PPPOE 14 Mar 28 20:13:29 OFFICE-PPPOE 8b Mar 28 20:13:29 OFFICE-PPPOE 98 Mar 28 20:13:29 OFFICE-PPPOE e4 Mar 28 20:13:29 OFFICE-PPPOE 02 Mar 28 20:13:29 OFFICE-PPPOE 00 Mar 28 20:13:29 OFFICE-PPPOE 00 Mar 28 20:13:29 OFFICE-PPPOE 85 Mar 28 20:13:29 OFFICE-PPPOE db Mar 28 20:13:29 OFFICE-PPPOE 74 Mar 28 20:13:29 OFFICE-PPPOE 38 Mar 28 20:13:29 OFFICE-PPPOE Mar 28 20:13:29 OFFICE-PPPOE 7b Mar 28 20:13:29 
OFFICE-PPPOE 14 Mar 28 20:13:29 OFFICE-PPPOE 00 Mar 28 20:13:29 OFFICE-PPPOE 75 Mar 28 20:13:29 OFFICE-PPPOE 06 Mar 28 20:13:29 OFFICE-PPPOE 83 Mar 28 20:13:29 OFFICE-PPPOE 7b Mar 28 20:13:29 OFFICE-PPPOE 0c Mar 28 20:13:29 OFFICE-PPPOE 00 Mar 28 20:13:29 OFFICE-PPPOE 74 Mar 28 20:13:29 OFFICE-PPPOE 2c Mar 28 20:13:29 OFFICE-PPPOE 8d Mar 28 20:13:29 OFFICE-PPPOE 73 Mar 28 20:13:29 OFFICE-PPPOE 10 Mar 28 20:13:29 OFFICE-PPPOE 89 Mar 28 20:13:29 OFFICE-PPPOE f0 Mar 28 20:13:29 OFFICE-PPPOE e8 Mar 28 20:13:29 OFFICE-PPPOE 3d Mar 28 20:13:29 OFFICE-PPPOE 53 Mar 28 20:13:29 OFFICE-PPPOE 05 Mar 28 20:13:29 OFFICE-PPPOE Mar 28 20:13:29 OFFICE-PPPOE [ 1758.162385] EIP: [<c02113c7>] Mar 28 20:13:29 OFFICE-PPPOE netif_rx+0x18/0x126 Mar 28 20:13:29 OFFICE-PPPOE SS:ESP 0068:c02f5f04 Mar 28 20:13:29 OFFICE-PPPOE [ 1758.162594] Mar 28 20:13:29 OFFICE-PPPOE Kernel panic - not syncing: Fatal exception in interrupt Mar 28 20:13:29 OFFICE-PPPOE [ 1758.162719] Mar 28 20:13:29 OFFICE-PPPOE Rebooting in 10 seconds.. -- Virtual ISP S.A.L. ^ permalink raw reply [flat|nested] 41+ messages in thread
* another critical bug ? 2007-03-27 16:21 ` Patrick McHardy 2007-03-28 17:38 ` another " Denys @ 2007-03-28 23:55 ` Denys 2007-03-29 10:58 ` Patrick McHardy 1 sibling, 1 reply; 41+ messages in thread From: Denys @ 2007-03-28 23:55 UTC (permalink / raw) To: Patrick McHardy; +Cc: Stephen Hemminger, netdev

Tried on 2.6.21-rc5-git3, but with preempt enabled.
Same panic, seemingly in the same place.

Mar 29 02:50:53 LINUX [ 164.644102] BUG: unable to handle kernel paging request Mar 29 02:50:53 LINUX at virtual address 0302014c Mar 29 02:50:53 LINUX [ 164.644242] printing eip: Mar 29 02:50:53 LINUX [ 164.644301] *pde = 00000000 Mar 29 02:50:53 LINUX [ 164.644371] Oops: 0000 [#1] Mar 29 02:50:53 LINUX [ 164.644485] PREEMPT Mar 29 02:50:53 LINUX SMP Mar 29 02:50:53 LINUX Mar 29 02:50:53 LINUX [ 164.644629] Modules linked in: .... LIST OF MODULES .... Mar 29 02:50:53 LINUX [ 164.648758] CPU: 0 Mar 29 02:50:53 LINUX [ 164.648760] EIP: 0060:[<c0216e37>] Not tainted VLI Mar 29 02:50:53 LINUX [ 164.648762] EFLAGS: 00010206 (2.6.20.3-build-0002 #14) Mar 29 02:50:53 LINUX [ 164.648948] EIP is at netif_rx+0x12/0x115 Mar 29 02:50:53 LINUX [ 164.649011] eax: c33fb000 ebx: 03020100 ecx: 00000001 edx: c3380d80 Mar 29 02:50:53 LINUX [ 164.649078] esi: c4cfc000 edi: c3380d80 ebp: 00000000 esp: c7fb7f74 Mar 29 02:50:53 LINUX [ 164.649144] ds: 007b es: 007b ss: 0068 preempt: 00000001 Mar 29 02:50:53 LINUX [ 164.649210] Process softirq-tasklet (pid: 9, ti=c7fb6000 task=c7f9f000 task.ti=c7fb6000) Mar 29 02:50:53 LINUX Mar 29 02:50:53 LINUX [ 164.649276] Stack: Mar 29 02:50:53 LINUX c4cfc400 Mar 29 02:50:53 LINUX c4cfc000 Mar 29 02:50:53 LINUX 00000000 Mar 29 02:50:53 LINUX c8a1f268 Mar 29 02:50:53 LINUX c4cfc45c Mar 29 02:50:53 LINUX 000f4240 Mar 29 02:50:53 LINUX 00000000 Mar 29 02:50:53 LINUX c011c5d8 Mar 29 02:50:53 LINUX Mar 29 02:50:53 LINUX [ 164.649754] Mar 29 02:50:53 LINUX c7fb7fac Mar 29 02:50:53 LINUX c026ba53 Mar 29 02:50:53 LINUX 00000006 Mar 29 02:50:53 LINUX c11c5c98 Mar 29 02:50:53 LINUX
c11c5c98 Mar 29 02:50:53 LINUX 00000020 Mar 29 02:50:53 LINUX c011cadf Mar 29 02:50:53 LINUX c011cbc9 Mar 29 02:50:53 LINUX Mar 29 02:50:53 LINUX [ 164.650231] Mar 29 02:50:53 LINUX 00000000 Mar 29 02:50:53 LINUX 00000001 Mar 29 02:50:53 LINUX c7fb7fc0 Mar 29 02:50:53 LINUX 00000032 Mar 29 02:50:53 LINUX c11c5c98 Mar 29 02:50:53 LINUX c7fa1ef8 Mar 29 02:50:53 LINUX c0128757 Mar 29 02:50:53 LINUX ffffffff Mar 29 02:50:53 LINUX Mar 29 02:50:53 LINUX [ 164.650707] Call Trace: Mar 29 02:50:53 LINUX [ 164.650829] [<c8a1f268>] Mar 29 02:50:53 LINUX ri_tasklet+0xd5/0x1a1 [ifb] Mar 29 02:50:53 LINUX [ 164.650954] [<c011c5d8>] Mar 29 02:50:53 LINUX __tasklet_action+0xe5/0x126 Mar 29 02:50:53 LINUX [ 164.651073] [<c026ba53>] Mar 29 02:50:53 LINUX schedule+0xe0/0xfa Mar 29 02:50:53 LINUX [ 164.651201] [<c011cadf>] Mar 29 02:50:53 LINUX ksoftirqd+0x0/0x178 Mar 29 02:50:53 LINUX [ 164.651312] [<c011cbc9>] Mar 29 02:50:53 LINUX ksoftirqd+0xea/0x178 Mar 29 02:50:53 LINUX [ 164.651443] [<c0128757>] Mar 29 02:50:53 LINUX kthread+0xb2/0xdb Mar 29 02:50:53 LINUX [ 164.651560] [<c01286a5>] Mar 29 02:50:53 LINUX kthread+0x0/0xdb Mar 29 02:50:53 LINUX [ 164.651683] [<c0103a5f>] Mar 29 02:50:53 LINUX kernel_thread_helper+0x7/0x10 Mar 29 02:50:53 LINUX [ 164.651823] ======================= Mar 29 02:50:53 LINUX [ 164.651890] Code: Mar 29 02:50:53 LINUX 00 Mar 29 02:50:53 LINUX eb Mar 29 02:50:53 LINUX 0c Mar 29 02:50:53 LINUX 89 Mar 29 02:50:53 LINUX d8 Mar 29 02:50:53 LINUX e8 Mar 29 02:50:53 LINUX 7a Mar 29 02:50:53 LINUX 5e Mar 29 02:50:53 LINUX 05 Mar 29 02:50:53 LINUX 00 Mar 29 02:50:53 LINUX e9 Mar 29 02:50:53 LINUX bb Mar 29 02:50:53 LINUX fe Mar 29 02:50:53 LINUX ff Mar 29 02:50:53 LINUX ff Mar 29 02:50:53 LINUX 83 Mar 29 02:50:53 LINUX c4 Mar 29 02:50:53 LINUX 14 Mar 29 02:50:53 LINUX 89 Mar 29 02:50:53 LINUX f0 Mar 29 02:50:53 LINUX 5b Mar 29 02:50:53 LINUX 5e Mar 29 02:50:53 LINUX 5f Mar 29 02:50:53 LINUX 5d Mar 29 02:50:53 LINUX c3 Mar 29 02:50:53 LINUX 57 Mar 29 02:50:53 
LINUX 89 Mar 29 02:50:53 LINUX c7 Mar 29 02:50:53 LINUX 56 Mar 29 02:50:53 LINUX 53 Mar 29 02:50:53 LINUX 8b Mar 29 02:50:53 LINUX 40 Mar 29 02:50:53 LINUX 14 Mar 29 02:50:53 LINUX 8b Mar 29 02:50:53 LINUX 98 Mar 29 02:50:53 LINUX e4 Mar 29 02:50:53 LINUX 02 Mar 29 02:50:53 LINUX 00 Mar 29 02:50:53 LINUX 00 Mar 29 02:50:53 LINUX 85 Mar 29 02:50:53 LINUX db Mar 29 02:50:53 LINUX 74 Mar 29 02:50:53 LINUX 32 Mar 29 02:50:53 LINUX Mar 29 02:50:53 LINUX 7b Mar 29 02:50:53 LINUX 4c Mar 29 02:50:53 LINUX 00 Mar 29 02:50:53 LINUX 75 Mar 29 02:50:53 LINUX 06 Mar 29 02:50:53 LINUX 83 Mar 29 02:50:53 LINUX 7b Mar 29 02:50:53 LINUX 28 Mar 29 02:50:53 LINUX 00 Mar 29 02:50:53 LINUX 74 Mar 29 02:50:53 LINUX 26 Mar 29 02:50:53 LINUX 8d Mar 29 02:50:53 LINUX 73 Mar 29 02:50:53 LINUX 2c Mar 29 02:50:53 LINUX 89 Mar 29 02:50:53 LINUX f0 Mar 29 02:50:53 LINUX e8 Mar 29 02:50:53 LINUX 65 Mar 29 02:50:53 LINUX 5e Mar 29 02:50:53 LINUX 05 Mar 29 02:50:53 LINUX Mar 29 02:50:53 LINUX [ 164.654978] EIP: [<c0216e37>] Mar 29 02:50:53 LINUX netif_rx+0x12/0x115 Mar 29 02:50:53 LINUX SS:ESP 0068:c7fb7f74 Mar 29 02:50:53 LINUX [ 164.655137] Mar 29 02:50:53 LINUX Kernel panic - not syncing: Fatal exception Mar 29 02:50:53 LINUX [ 164.655280] [<c0118387>] Mar 29 02:50:53 LINUX panic+0x50/0xf1 Mar 29 02:50:53 LINUX [ 164.655433] [<c0104327>] Mar 29 02:50:53 LINUX die+0x207/0x23b Mar 29 02:50:53 LINUX [ 164.655644] [<c0110879>] Mar 29 02:50:53 LINUX do_page_fault+0x46c/0x543 Mar 29 02:50:53 LINUX [ 164.655803] [<c026b781>] Mar 29 02:50:53 LINUX __sched_text_start+0xba1/0xcc8 Mar 29 02:50:53 LINUX [ 164.655973] [<c011040d>] Mar 29 02:50:53 LINUX do_page_fault+0x0/0x543 Mar 29 02:50:53 LINUX [ 164.656114] [<c026d304>] Mar 29 02:50:53 LINUX error_code+0x7c/0x84 Mar 29 02:50:53 LINUX [ 164.656305] [<c0216e37>] Mar 29 02:50:53 LINUX netif_rx+0x12/0x115 Mar 29 02:50:53 LINUX [ 164.656462] [<c8a1f268>] Mar 29 02:50:53 LINUX ri_tasklet+0xd5/0x1a1 [ifb] Mar 29 02:50:53 LINUX [ 164.656609] [<c011c5d8>] Mar 29 
02:50:53 LINUX __tasklet_action+0xe5/0x126 Mar 29 02:50:53 LINUX [ 164.656746] [<c026ba53>] Mar 29 02:50:53 LINUX schedule+0xe0/0xfa Mar 29 02:50:53 LINUX [ 164.656893] [<c011cadf>] Mar 29 02:50:53 LINUX ksoftirqd+0x0/0x178 Mar 29 02:50:53 LINUX [ 164.657023] [<c011cbc9>] Mar 29 02:50:53 LINUX ksoftirqd+0xea/0x178 Mar 29 02:50:53 LINUX [ 164.657190] [<c0128757>] Mar 29 02:50:53 LINUX kthread+0xb2/0xdb Mar 29 02:50:53 LINUX [ 164.657328] [<c01286a5>] Mar 29 02:50:53 LINUX kthread+0x0/0xdb Mar 29 02:50:53 LINUX [ 164.657466] [<c0103a5f>] Mar 29 02:50:53 LINUX kernel_thread_helper+0x7/0x10 Mar 29 02:50:53 LINUX [ 164.657628] ======================= Mar 29 02:50:53 LINUX [ 164.657703] Mar 29 02:50:53 LINUX Rebooting in 10 seconds.. -- Virtual ISP S.A.L. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: another critical bug ? 2007-03-28 23:55 ` Denys @ 2007-03-29 10:58 ` Patrick McHardy 0 siblings, 0 replies; 41+ messages in thread From: Patrick McHardy @ 2007-03-29 10:58 UTC (permalink / raw) To: Denys; +Cc: Stephen Hemminger, netdev

Denys wrote:
> Tried on 2.6.21-rc5-git3, but with preempt enabled.
> Same panic, seemingly in the same place.
>
> Mar 29 02:50:53 LINUX [ 164.644102] BUG: unable to handle kernel paging request
> Mar 29 02:50:53 LINUX at virtual address 0302014c
> Mar 29 02:50:53 LINUX [ 164.648948] EIP is at netif_rx+0x12/0x115
>[...]
> Mar 29 02:50:53 LINUX [ 164.650829] [<c8a1f268>]
> Mar 29 02:50:53 LINUX ri_tasklet+0xd5/0x1a1 [ifb]

It's somehow related to ifb. Please send your .config and the commands
you use to set up ifb.

^ permalink raw reply [flat|nested] 41+ messages in thread
* more iproute2 issues (not critical) 2007-03-22 17:12 ` Stephen Hemminger 2007-03-22 17:14 ` Patrick McHardy @ 2007-03-31 2:26 ` Denys 2007-03-31 2:31 ` Denys 2007-04-04 0:03 ` one more... iproute commands lockup whole system Denys 2 siblings, 1 reply; 41+ messages in thread From: Denys @ 2007-03-31 2:26 UTC (permalink / raw) To: Stephen Hemminger, Patrick McHardy; +Cc: netdev

While running tc monitor:

defaulthost ~ #/sbin/tc2 monitor
qdisc prio 1: dev if92 root bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
qdisc tbf 2: dev if92 parent 1:1 rate 121600bit burst 512Kb peakrate 1280Kbit minburst 16Kb lat 500.0ms
filter dev if92 parent 1: protocol ip pref 5 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 3:1
  match c2929918/ffffffff at 12
filter dev if92 parent 1: protocol ip pref 5 u32 fh 800::801 order 2049 key ht 800 bkt 0 flowid 3:1
  match c292991a/ffffffff at 12
filter dev if92 parent 1: protocol ip pref 5 u32 fh 800::802 order 2050 key ht 800 bkt 0 flowid 3:1
  match 02020266/ffffffff at 12
filter dev if92 parent 1: protocol ip pref 5 u32 fh 800::803 order 2051 key ht 800 bkt 0 flowid 3:1
  match 02020269/ffffffff at 12
filter dev if92 parent 1: protocol ip pref 5 u32 fh 800::804 order 2052 key ht 800 bkt 0 flowid 3:1
  match 0202026a/ffffffff at 12
filter dev if92 parent 1: protocol ip pref 10 u32 fh 801::800 order 2048 key ht 801 bkt 0 flowid 2:1
  match 00000000/00000000 at 16
deleted filter dev ifb0 parent 1: protocol ip pref 117 fw
deleted class htb 1:117 dev ifb0 root leaf 117: prio 0 rate 32000bit ceil 32000bit burst 1604b cburst 1604b
class htb 1:117 dev ifb0 root prio 0 rate 32000bit ceil 32000bit burst 1604b cburst 1604b
qdisc sfq 117: dev ifb0 parent 1:117 limit 128p quantum 1514b perturb 5sec
filter dev ifb0 parent 1: protocol ip pref 117 fw handle 0x75 classid 1:117
qdisc ingress ffff: dev if92 parent ffff:fff1 ----------------
filter dev if92 parent ffff: protocol ip pref 5 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:1
  match 02020266/ffffffff at 16
filter dev if92 parent ffff: protocol ip pref 5 u32 fh 800::801 order 2049 key ht 800 bkt 0 flowid 1:1
  match 02020269/ffffffff at 16
filter dev if92 parent ffff: protocol ip pref 5 u32 fh 800::802 order 2050 key ht 800 bkt 0 flowid 1:1
  match 0202026a/ffffffff at 16
filter dev if92 parent ffff: protocol ip pref 5 u32 fh 800::803 order 2051 key ht 800 bkt 0 flowid 1:1
  match c2929918/ffffffff at 16
filter dev if92 parent ffff: protocol ip pref 5 u32 fh 800::804 order 2052 key ht 800 bkt 0 flowid 1:1
  match c292991a/ffffffff at 16
filter dev if92 parent ffff: protocol ip pref 10 u32 fh 801::800 order 2048 key ht 801 bkt 0 flowid 1:1
  match 00000000/00000000 at 0
  action order 1: tablename: mangle hook: NF_IP_PRE_ROUTING
Segmentation fault (core dumped)

Program terminated with signal 11, Segmentation fault.
#0 0x08069f38 in get_target_name (name=0xbfa397f2 "MARK") at m_ipt.c:219
219 char path[strlen(lib_dir) + sizeof ("/libipt_.so") + strlen(name)];

Do you need "bt full"?
It was compiled from the git tree, but not the latest (a little newer than
the latest release).

^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: more iproute2 issues (not critical) 2007-03-31 2:26 ` more iproute2 issues (not critical) Denys @ 2007-03-31 2:31 ` Denys 2007-03-31 14:16 ` Patrick McHardy 0 siblings, 1 reply; 41+ messages in thread From: Denys @ 2007-03-31 2:31 UTC (permalink / raw) To: Denys, Stephen Hemminger, Patrick McHardy; +Cc: netdev

Oops, sorry, it seems to be my fault: the library does not exist on this system.
But I guess it should not dump core in this case? Is it possible to check
whether the library exists and just print a nice message if it does not?
It is trivial, I guess.

On Sat, 31 Mar 2007 05:26:00 +0300, Denys wrote
> While running tc monitor
>
> defaulthost ~ #/sbin/tc2 monitor
> qdisc prio 1: dev if92 root bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1
> 1 1 1 1 qdisc tbf 2: dev if92 parent 1:1 rate 121600bit burst 512Kb
> peakrate 1280Kbit minburst 16Kb lat 500.0ms filter dev if92 parent
> 1: protocol ip pref 5 u32 fh 800::800 order 2048 key ht 800 bkt 0
> flowid 3:1 match c2929918/ffffffff at 12 filter dev if92 parent 1:
> protocol ip pref 5 u32 fh 800::801 order 2049 key ht 800 bkt 0
> flowid 3:1 match c292991a/ffffffff at 12 filter dev if92 parent 1:
> protocol ip pref 5 u32 fh 800::802 order 2050 key ht 800 bkt 0
> flowid 3:1 match 02020266/ffffffff at 12 filter dev if92 parent 1:
> protocol ip pref 5 u32 fh 800::803 order 2051 key ht 800 bkt 0
> flowid 3:1 match 02020269/ffffffff at 12 filter dev if92 parent 1:
> protocol ip pref 5 u32 fh 800::804 order 2052 key ht 800 bkt 0
> flowid 3:1 match 0202026a/ffffffff at 12 filter dev if92 parent 1:
> protocol ip pref 10 u32 fh 801::800 order 2048 key ht 801 bkt 0
> flowid 2:1 match 00000000/00000000 at 16 deleted filter dev ifb0
> parent 1: protocol ip pref 117 fw deleted class htb 1:117 dev ifb0
> root leaf 117: prio 0 rate 32000bit ceil 32000bit burst 1604b cburst
> 1604b class htb 1:117 dev ifb0 root prio 0 rate 32000bit ceil
> 32000bit burst 1604b cburst 1604b qdisc sfq 117: dev ifb0 parent
> 1:117 limit 128p quantum 1514b perturb 5sec filter dev ifb0 parent
> 1: protocol ip pref 117 fw handle 0x75 classid 1:117 qdisc ingress
> ffff: dev if92 parent ffff:fff1 ---------------- filter dev if92
> parent ffff: protocol ip pref 5 u32 fh 800::800 order 2048 key ht
> 800 bkt 0 flowid 1:1 match 02020266/ffffffff at 16 filter dev if92
> parent ffff: protocol ip pref 5 u32 fh 800::801 order 2049 key ht
> 800 bkt 0 flowid 1:1 match 02020269/ffffffff at 16 filter dev if92
> parent ffff: protocol ip pref 5 u32 fh 800::802 order 2050 key ht
> 800 bkt 0 flowid 1:1 match 0202026a/ffffffff at 16 filter dev if92
> parent ffff: protocol ip pref 5 u32 fh 800::803 order 2051 key ht
> 800 bkt 0 flowid 1:1 match c2929918/ffffffff at 16 filter dev if92
> parent ffff: protocol ip pref 5 u32 fh 800::804 order 2052 key ht
> 800 bkt 0 flowid 1:1 match c292991a/ffffffff at 16 filter dev if92
> parent ffff: protocol ip pref 10 u32 fh 801::800 order 2048 key ht
> 801 bkt 0 flowid 1:1 match 00000000/00000000 at 0 action
> order 1: tablename: mangle hook: NF_IP_PRE_ROUTING Segmentation
> fault (core dumped)
>
> Program terminated with signal 11, Segmentation fault.
> #0 0x08069f38 in get_target_name (name=0xbfa397f2 "MARK") at m_ipt.c:219
> 219 char path[strlen(lib_dir) + sizeof ("/libipt_.so") +
> strlen(name)];
>
> Do you need "bt full"?
> It was compiled from the git tree, but not the latest (a little newer than
> the latest release).
>
> -
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html

--
Virtual ISP S.A.L.

^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: more iproute2 issues (not critical) 2007-03-31 2:31 ` Denys @ 2007-03-31 14:16 ` Patrick McHardy 0 siblings, 0 replies; 41+ messages in thread From: Patrick McHardy @ 2007-03-31 14:16 UTC (permalink / raw) To: Denys; +Cc: Stephen Hemminger, netdev

[-- Attachment #1: Type: text/plain, Size: 328 bytes --]

Denys wrote:
> Oops, sorry, it seems to be my fault: the library does not exist on this system.
> But I guess it should not dump core in this case? Is it possible to check
> whether the library exists and just print a nice message if it does not?
> It is trivial, I guess.

The problem is that lib_dir is NULL when calling get_target_name.
This patch fixes it.

[-- Attachment #2: x --]
[-- Type: text/plain, Size: 957 bytes --]

[IPROUTE]: m_ipt: fix crash when dumping rules

lib_dir is NULL when calling get_target_name, causing a NULL pointer
dereference in the strlen call.

Signed-off-by: Patrick McHardy <kaber@trash.net>

---
commit 5093ef7504b7f3d76ec421c5193d11d0e9791a8d
tree 4dfa97ecd6aa01f33334f605f152a0d7da2ccde9
parent ab4c2f14fb93700c9aefeb02ed9918565ba332a1
author Patrick McHardy <kaber@trash.net> Sat, 31 Mar 2007 16:14:38 +0200
committer Patrick McHardy <kaber@trash.net> Sat, 31 Mar 2007 16:14:38 +0200

 tc/m_ipt.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/tc/m_ipt.c b/tc/m_ipt.c
index 38d2311..76fa768 100644
--- a/tc/m_ipt.c
+++ b/tc/m_ipt.c
@@ -506,6 +506,10 @@ print_ipt(struct action_util *au,FILE * f, struct rtattr *arg)
 	if (arg == NULL)
 		return -1;
 
+	lib_dir = getenv("IPTABLES_LIB_DIR");
+	if (!lib_dir)
+		lib_dir = IPT_LIB_DIR;
+
 	parse_rtattr_nested(tb, TCA_IPT_MAX, arg);
 
 	if (tb[TCA_IPT_TABLE] == NULL) {

^ permalink raw reply related [flat|nested] 41+ messages in thread
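The fix above follows a common pattern: read an environment override, fall back to a compiled-in default, and only then compute string lengths. The stand-alone sketch below shows that pattern outside of tc. The helper names (toy_lib_dir, toy_path_size) and the TOY_DEFAULT_LIB_DIR value are assumptions made for this example; only the IPTABLES_LIB_DIR variable and the size expression come from the thread.

```c
#include <stdlib.h>
#include <string.h>

/* Placeholder default for this sketch only; tc uses its own IPT_LIB_DIR. */
#define TOY_DEFAULT_LIB_DIR "/usr/local/lib/iptables"

/*
 * Return the extension directory: the environment override if it is set,
 * otherwise the compiled-in default. Never returns NULL, so callers can
 * pass the result straight to strlen().
 */
static const char *toy_lib_dir(void)
{
	const char *dir = getenv("IPTABLES_LIB_DIR");

	return dir ? dir : TOY_DEFAULT_LIB_DIR;
}

/*
 * Bytes needed for "<dir>/libipt_<name>.so" including the trailing NUL,
 * using the same size expression as the line shown in the backtrace.
 */
static size_t toy_path_size(const char *name)
{
	return strlen(toy_lib_dir()) + sizeof("/libipt_.so") + strlen(name);
}
```

With IPTABLES_LIB_DIR unset, toy_lib_dir() returns the default instead of NULL, which is the property the crash in the backtrace was missing.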
* one more... iproute commands lockup whole system 2007-03-22 17:12 ` Stephen Hemminger 2007-03-22 17:14 ` Patrick McHardy 2007-03-31 2:26 ` more iproute2 issues (not critical) Denys @ 2007-04-04 0:03 ` Denys 2007-04-04 1:10 ` jamal 2 siblings, 1 reply; 41+ messages in thread From: Denys @ 2007-04-04 0:03 UTC (permalink / raw) To: Stephen Hemminger, Patrick McHardy; +Cc: netdev

I'm not sure whether this is a mistake on my side or a bug, but I feel it is
dangerous, because these commands lock up the system: no kernel panic, no
oops, so only a watchdog can save the poor server (and I am not sure even
that would).

Commands to lock up the system (this is just my example; I did not sort out
what exactly locked up the system, I guess it is redirecting to eth0.5, which
is not intended for that):

vconfig add eth0 5
ifconfig eth0.5 192.168.1.2 netmask 255.255.255.128

tc qdisc add dev eth0 ingress
tc filter add dev eth0 parent ffff: protocol ip prio 6 u32 \
	match ip src 195.69.208.252/32 flowid 1:16 \
	action police rate 64kbit burst 90k pipe \
	action mirred egress mirror dev eth0.5

--
Denys Fedoryshchenko
Technical Manager
Virtual ISP S.A.L.

^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: one more... iproute commands lockup whole system 2007-04-04 0:03 ` one more... iproute commands lockup whole system Denys @ 2007-04-04 1:10 ` jamal 2007-04-04 1:39 ` Patrick McHardy 0 siblings, 1 reply; 41+ messages in thread From: jamal @ 2007-04-04 1:10 UTC (permalink / raw) To: Denys; +Cc: Stephen Hemminger, Patrick McHardy, netdev

On Wed, 2007-04-04 at 03:03 +0300, Denys wrote:
> I'm not sure whether this is a mistake on my side or a bug, but I feel it is
> dangerous, because these commands lock up the system: no kernel panic, no
> oops, so only a watchdog can save the poor server (and I am not sure even
> that would).
>
> Commands to lock up the system (this is just my example; I did not sort out
> what exactly locked up the system, I guess it is redirecting to eth0.5, which
> is not intended for that):
>

read:
doc/actions/mirred-usage

cheers,
jamal

> vconfig add eth0 5
> ifconfig eth0.5 192.168.1.2 netmask 255.255.255.128
>
> tc qdisc add dev eth0 ingress
> tc filter add dev eth0 parent ffff: protocol ip prio 6 u32 \
>	match ip src 195.69.208.252/32 flowid 1:16 \
>	action police rate 64kbit burst 90k pipe \
>	action mirred egress mirror dev eth0.5
>
> --
> Denys Fedoryshchenko
> Technical Manager
> Virtual ISP S.A.L.

^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: one more... iproute commands lockup whole system 2007-04-04 1:10 ` jamal @ 2007-04-04 1:39 ` Patrick McHardy 2007-04-04 2:09 ` jamal 2007-04-04 2:11 ` Denys 0 siblings, 2 replies; 41+ messages in thread
From: Patrick McHardy @ 2007-04-04 1:39 UTC (permalink / raw)
To: hadi; +Cc: Denys, Stephen Hemminger, netdev

jamal wrote:
> On Wed, 2007-04-04 at 03:03 +0300, Denys wrote:
>
>>I'm not sure it is mistake or error, but i feel it is dangerous, cause
>>commands locking up the system, no kernel panic, no oops, so only watchdog
>>can save poor server (and not sure this even)
>>
>>Commands to lockup system (just i am giving my example, i didnt sort out what
>>exactly locked up system, i guess redirecting to eth0.5, which is not
>>intended for that):
>
>
> read:
> doc/actions/mirred-usage

Are you referring to "What NOT to do if you dont want your machine to
crash:"? I think we should make sure users can't even accidentally
crash their box, so this should at least be caught at runtime.
I thought the TTL stuff was intended to avoid this ..

^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: one more... iproute commands lockup whole system 2007-04-04 1:39 ` Patrick McHardy @ 2007-04-04 2:09 ` jamal 2007-04-04 2:11 ` Denys 1 sibling, 0 replies; 41+ messages in thread
From: jamal @ 2007-04-04 2:09 UTC (permalink / raw)
To: Patrick McHardy; +Cc: Denys, Stephen Hemminger, netdev

On Wed, 2007-04-04 at 03:39 +0200, Patrick McHardy wrote:

> Are you refering to "What NOT to do if you dont want your machine to
> crash:"?

yes.

> I think we should make sure users can't even accidentally
> crash their box, so this should at least be caught at runtime.

It is hard to do without penalizing the common use. Hence the
documentation. Sometimes that is the most sane thing to do.

>I thought the TTL stuff was intended to avoid this ..

The problem is recursive dev queue locking of the same device. TTL is
not very helpful there.
I don't want to rehash all those discussions we already had; so if you
have a clever way to fix this (without affecting performance of a sane
user), please send a patch and let's discuss.

cheers,
jamal

^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: one more... iproute commands lockup whole system 2007-04-04 1:39 ` Patrick McHardy 2007-04-04 2:09 ` jamal @ 2007-04-04 2:11 ` Denys 2007-04-04 10:55 ` jamal 1 sibling, 1 reply; 41+ messages in thread
From: Denys @ 2007-04-04 2:11 UTC (permalink / raw)
To: Patrick McHardy, hadi; +Cc: Stephen Hemminger, netdev

I think this highly useful feature given by jamal is difficult to protect from crashing if the user is not experienced enough in networking (like me). I guess the packet may not even be an IPv4/IPv6 packet; it could be a cloned IPX or ARP packet, so the TTL field cannot be used. I checked whether sk_buff has some usable field, but no luck there either. If there could be something like an "internal" per-packet counter of how many times the packet got redirected, it would help. But in my case with VLANs it is really my own mistake, and it is difficult to avoid. The only bad thing is that the machine got completely locked up, and if it is a remote system, it will not even oops or reboot. But I don't have any idea how to avoid this, other than a big warning in the docs and in the internal iproute2 help :-)

On Wed, 04 Apr 2007 03:39:12 +0200, Patrick McHardy wrote
> jamal wrote:
> > On Wed, 2007-04-04 at 03:03 +0300, Denys wrote:
> >
> >>I'm not sure it is mistake or error, but i feel it is dangerous, cause
> >>commands locking up the system, no kernel panic, no oops, so only watchdog
> >>can save poor server (and not sure this even)
> >>
> >>Commands to lockup system (just i am giving my example, i didnt sort out what
> >>exactly locked up system, i guess redirecting to eth0.5, which is not
> >>intended for that):
> >
> >
> > read:
> > doc/actions/mirred-usage
>
> Are you refering to "What NOT to do if you dont want your machine to
> crash:"? I think we should make sure users can't even accidentally
> crash their box, so this should at least be caught at runtime.
> I thought the TTL stuff was intended to avoid this ..

--
Denys Fedoryshchenko
Technical Manager
Virtual ISP S.A.L.

^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: one more... iproute commands lockup whole system 2007-04-04 2:11 ` Denys @ 2007-04-04 10:55 ` jamal 2007-04-04 12:56 ` Denys 2007-04-04 13:36 ` one more... iproute commands lockup whole system Patrick McHardy 0 siblings, 2 replies; 41+ messages in thread
From: jamal @ 2007-04-04 10:55 UTC (permalink / raw)
To: Denys; +Cc: Patrick McHardy, Stephen Hemminger, netdev

On Wed, 2007-04-04 at 05:11 +0300, Denys wrote:
> I think this highly useful feature given by jamal, difficult to be avoided
> from crash, if user not enough experienced in networking(like me). I guess
> packet can be even not ipv4/ipv6 packet, maybe it can be cloned IPX or ARP,
> so TTL field cannot be used. I checked maybe sk_buff have some fields, seems
> also bad luck, if there can be something like "internal" counter for packet,
> how much times it got redirected, it will help.

Adding a field in the skb that keeps track of things would work well,
but would be a controversial thing to do because it actually requires a
vector, not just one field. There is a field called cb[] but it can't be
used in this case because every time we redirect it could be trampled.

> But in my case of VLAN's it
> is really my own mistake and difficult to avoid it. Only bad thing - machine
> got completely locked up, and if it is remote system - it will not oops/or
> reboot even. But i dont have any idea in mind how to avoid this, only than
> big warning in DOC and internal iproute2 help :-)

Your case is easy to detect in user space because it is within the same
policy.
Would simple detection and rejection in tc/userspace be useful to add?
Note, this doesn't help the general problem though, where you have nesting
as described in the document.

cheers,
jamal

^ permalink raw reply [flat|nested] 41+ messages in thread
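The userspace detection jamal floats in the message above could take roughly the following shape. This is only a sketch under stated assumptions, not something tc does: the helper names `list_links`, `root_dev` and `mirred_target_ok` are made up for illustration, and the parsing assumes the `N: name@lower: <FLAGS> ...` layout that `ip -o link show` prints for stacked devices such as VLANs. As jamal notes, a check like this only covers the simple case and does nothing for nested setups.

```shell
#!/bin/sh
# Sketch only: refuse a mirred target that ends up on the same bottom
# device as the interface the filter is attached to. All function names
# here are hypothetical; none of this is part of tc.

# list_links: print one "name@lower" (or bare "name") per interface.
# Assumes the "N: name@lower: <FLAGS> ..." layout of `ip -o link show`.
list_links() {
    ip -o link show | awk -F': ' '{ print $2 }'
}

# root_dev DEV: follow the "@lower" chain and print the bottom device.
root_dev() {
    dev=$1
    hops=0
    while [ "$hops" -lt 8 ]; do     # arbitrary bound in case the input loops
        lower=$(list_links | awk -F'@' -v d="$dev" '$1 == d && NF > 1 { print $2 }')
        [ -n "$lower" ] || break
        dev=$lower
        hops=$((hops + 1))
    done
    printf '%s\n' "$dev"
}

# mirred_target_ok FILTER_DEV TARGET_DEV: succeed only if the two devices
# do not share a bottom device.
mirred_target_ok() {
    [ "$(root_dev "$1")" != "$(root_dev "$2")" ]
}
```

A wrapper script might call `mirred_target_ok eth0 eth0.5 || echo "refusing: target is stacked on the filter device"` before handing the filter to tc. The check is deliberately conservative: it flags any pair of devices that share a bottom device, whether or not that particular combination would actually hang.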
* Re: one more... iproute commands lockup whole system 2007-04-04 10:55 ` jamal @ 2007-04-04 12:56 ` Denys 2007-04-04 14:10 ` jamal 2007-04-04 14:13 ` HTB/act_mirred problem [was: one more... iproute commands lockup whole system] Patrick McHardy 2007-04-04 13:36 ` one more... iproute commands lockup whole system Patrick McHardy 1 sibling, 2 replies; 41+ messages in thread
From: Denys @ 2007-04-04 12:56 UTC (permalink / raw)
To: hadi; +Cc: Patrick McHardy, Stephen Hemminger, netdev

Well, my case is my own mistake; I guess it is just a misconfiguration, not an actual bug. It is also a good push for me to "read the docs and think well before adding rules". Maybe it can go on the TODO list, but it is not the number one priority, I guess. There are more important things that you want to do.
Another thing: adding one more field to the skb will add more overhead to the whole kernel (I guess).

I have some interesting thing:

Rules:
tc qdisc del dev eth0.5 root
tc qdisc add dev eth0.5 handle 1: root htb
tc class add dev eth0.5 parent 1:0 classid 1:2 htb rate 128Kbit

tc qdisc add dev eth0.5 parent 1:2 handle 2: prio

tc filter add dev eth0.5 parent 1: protocol ip prio 10 u32 \
 match ip src 195.69.208.253/32 flowid 1:2

tc filter add dev eth0.5 parent 2: protocol ip prio 10 u32 \
 match ip src 195.69.208.253/32 flowid 2:1 \
 action mirred egress redirect dev eth0.6

(it is not working, but I was just trying a few things)

In the morning I woke up and saw the following in dmesg; again I'm not sure whether it is a bug or the result of misconfiguration:

[46632.941527] KERNEL: assertion (!cl->level && cl->un.leaf.q && cl->un.leaf.q->q.qlen) failed at net/sched/sch_htb.c (585)
[46633.270732] KERNEL: assertion (!cl->level && cl->un.leaf.q && cl->un.leaf.q->q.qlen) failed at net/sched/sch_htb.c (585)
[46633.379446] KERNEL: assertion (!cl->level && cl->un.leaf.q && cl->un.leaf.q->q.qlen) failed at net/sched/sch_htb.c (585)
[46633.450751] KERNEL: assertion (!cl->level && cl->un.leaf.q && cl->un.leaf.q->q.qlen) failed at net/sched/sch_htb.c (585)
[46633.570702] KERNEL: assertion (!cl->level && cl->un.leaf.q && cl->un.leaf.q->q.qlen) failed at net/sched/sch_htb.c (585)

On Wed, 04 Apr 2007 06:55:14 -0400, jamal wrote
> On Wed, 2007-04-04 at 05:11 +0300, Denys wrote:
> > I think this highly useful feature given by jamal, difficult to be avoided
> > from crash, if user not enough experienced in networking(like me). I guess
> > packet can be even not ipv4/ipv6 packet, maybe it can be cloned IPX or ARP,
> > so TTL field cannot be used. I checked maybe sk_buff have some fields, seems
> > also bad luck, if there can be something like "internal" counter for packet,
> > how much times it got redirected, it will help.
>
> Adding a field in the skb that keeps track of things would work well,
> but would be a controvesial thing to do because it actually requires
> a vector not just one field. There is a filed called cb[] but it
> cant be used in this case because every time we redirect it could be
> trampled.
>
> > But in my case of VLAN's it
> > is really my own mistake and difficult to avoid it. Only bad thing - machine
> > got completely locked up, and if it is remote system - it will not oops/ or
> > reboot even. But i dont have any idea in mind how to avoid this, only than
> > big warning in DOC and internal iproute2 help :-)
>
> Your case is easy to detect in user space because it is within the same
> policy.
> Would simple detection and rejection in tc/userspace be useful to
> add? Note, this doesnt help the general problem though where you
> have nesting as described in the document.
>
> cheers,
> jamal

--
Denys Fedoryshchenko
Technical Manager
Virtual ISP S.A.L.

^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: one more... iproute commands lockup whole system 2007-04-04 12:56 ` Denys @ 2007-04-04 14:10 ` jamal 2007-04-04 14:35 ` Denys 2007-04-04 14:13 ` HTB/act_mirred problem [was: one more... iproute commands lockup whole system] Patrick McHardy 1 sibling, 1 reply; 41+ messages in thread
From: jamal @ 2007-04-04 14:10 UTC (permalink / raw)
To: Denys; +Cc: Patrick McHardy, Stephen Hemminger, netdev

On Wed, 2007-04-04 at 15:56 +0300, Denys wrote:

> Rules:
> tc qdisc del dev eth0.5 root
> tc qdisc add dev eth0.5 handle 1: root htb
> tc class add dev eth0.5 parent 1:0 classid 1:2 htb rate 128Kbit
>
> tc qdisc add dev eth0.5 parent 1:2 handle 2: prio
>
> tc filter add dev eth0.5 parent 1: protocol ip prio 10 u32 \
> match ip src 195.69.208.253/32 flowid 1:2
>
> tc filter add dev eth0.5 parent 2: protocol ip prio 10 u32 \
> match ip src 195.69.208.253/32 flowid 2:1 \
> action mirred egress redirect dev eth0.6
>
> (it is not working, but just i tried few things)
>

This will still mean the physical device is eventually going to be eth0.
You are hanging, correct?
If you redirected to another vlan on top of a different physical
ethernet, and the problem persists, then it will be worrisome.
Actually i am beginning to doubt myself. Maybe the vlan device is acting
strangely. I will try to test your simple setup towards the end of the day
and see what happens.

cheers,
jamal

^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: one more... iproute commands lockup whole system 2007-04-04 14:10 ` jamal @ 2007-04-04 14:35 ` Denys 0 siblings, 0 replies; 41+ messages in thread From: Denys @ 2007-04-04 14:35 UTC (permalink / raw) To: hadi; +Cc: Patrick McHardy, Stephen Hemminger, netdev Yes, it is real setup by the way, i just copy&paste script here. I tried to remove some commands to leave only minimum to reproduce the problem. Right now i dont have another ethernet card on that system, i will add it today, when i come back home and i will test in different situations. Also it is interesting, how system will act when i will redirect from eth0 to ppp0, tunnels, device with hardware offloading to device without hardware offloading and etc. On Wed, 04 Apr 2007 10:10:10 -0400, jamal wrote > On Wed, 2007-04-04 at 15:56 +0300, Denys wrote: > > > Rules: > > tc qdisc del dev eth0.5 root > > tc qdisc add dev eth0.5 handle 1: root htb > > tc class add dev eth0.5 parent 1:0 classid 1:2 htb rate 128Kbit > > > > tc qdisc add dev eth0.5 parent 1:2 handle 2: prio > > > > tc filter add dev eth0.5 parent 1: protocol ip prio 10 u32 \ > > match ip src 195.69.208.253/32 flowid 1:2 > > > > tc filter add dev eth0.5 parent 2: protocol ip prio 10 u32 \ > > match ip src 195.69.208.253/32 flowid 2:1 \ > > action mirred egress redirect dev eth0.6 > > > > (it is not working, but just i tried few things) > > > > This will still mean the physical device is eventually going to be > eth0. You are hanging, correct? > If you redirected to another vlan ontop of a different physical > ethernet, and the problem persists then it will be worrisome. > Actually i am begining to doubt myself. Maybe the vlan device is acting > strangely. I will try to test your simple setup towards the end of > day and see what happens. 
> > cheers, > jamal -- Denys Fedoryshchenko Technical Manager Virtual ISP S.A.L. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: HTB/act_mirred problem [was: one more... iproute commands lockup whole system] 2007-04-04 12:56 ` Denys 2007-04-04 14:10 ` jamal @ 2007-04-04 14:13 ` Patrick McHardy 2007-04-04 14:33 ` jamal 1 sibling, 1 reply; 41+ messages in thread
From: Patrick McHardy @ 2007-04-04 14:13 UTC (permalink / raw)
To: Denys; +Cc: hadi, Stephen Hemminger, netdev

Denys wrote:
> I have some interesting thing:
>
> Rules:
> tc qdisc del dev eth0.5 root
> tc qdisc add dev eth0.5 handle 1: root htb
> tc class add dev eth0.5 parent 1:0 classid 1:2 htb rate 128Kbit
>
> tc qdisc add dev eth0.5 parent 1:2 handle 2: prio
>
> tc filter add dev eth0.5 parent 1: protocol ip prio 10 u32 \
> match ip src 195.69.208.253/32 flowid 1:2
>
> tc filter add dev eth0.5 parent 2: protocol ip prio 10 u32 \
> match ip src 195.69.208.253/32 flowid 2:1 \
> action mirred egress redirect dev eth0.6
>
> (it is not working, but just i tried few things)
>
> At morning i wakeup and see in dmesg, also not sure if it's bug or result of
> misconfiguration:
>
> [46632.941527] KERNEL: assertion (!cl->level && cl->un.leaf.q && cl->un.leaf.q->q.qlen) failed at net/sched/sch_htb.c (585)

This seems to be caused by act_mirred returning TC_ACT_STOLEN,
which is translated to NET_XMIT_SUCCESS within prio, causing HTB to
increase the q.qlen counter and activate the class despite no packet
being queued.

Jamal, we can't return NET_XMIT_SUCCESS unless we've really queued the
packet. I can't remember the reason why this is done, could you remind
me?

^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: HTB/act_mirred problem [was: one more... iproute commands lockup whole system] 2007-04-04 14:13 ` HTB/act_mirred problem [was: one more... iproute commands lockup whole system] Patrick McHardy @ 2007-04-04 14:33 ` jamal 2007-04-04 14:50 ` Patrick McHardy 0 siblings, 1 reply; 41+ messages in thread
From: jamal @ 2007-04-04 14:33 UTC (permalink / raw)
To: Patrick McHardy; +Cc: Denys, Stephen Hemminger, netdev

On Wed, 2007-04-04 at 16:13 +0200, Patrick McHardy wrote:

> This seems to be due to be caused by act_mirred returning TC_ACT_STOLEN,
> which is translated to NET_XMIT_SUCCESS within prio, causing HTB to
> increase the q.qlen counter and activating the class despite no packet
> beeing queued.
>
> Jamal, we can't return NET_XMIT_SUCCESS unless we've really queued the
> packet. I can't remeber the reason why this is done, could you remind
> me?

IIRC, it had to do with not confusing TCP into trying to retransmit. I can
go back and look at my notes to be certain. At one point i posted those
notes; it may be time to add them to the kernel code or doc somewhere.

cheers,
jamal

PS:- i may be a little slow in responding for the next few hours.

^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: HTB/act_mirred problem [was: one more... iproute commands lockup whole system] 2007-04-04 14:33 ` jamal @ 2007-04-04 14:50 ` Patrick McHardy 2007-04-05 1:33 ` jamal 0 siblings, 1 reply; 41+ messages in thread From: Patrick McHardy @ 2007-04-04 14:50 UTC (permalink / raw) To: hadi; +Cc: Denys, Stephen Hemminger, netdev jamal wrote: > On Wed, 2007-04-04 at 16:13 +0200, Patrick McHardy wrote: > >>This seems to be due to be caused by act_mirred returning TC_ACT_STOLEN, >>which is translated to NET_XMIT_SUCCESS within prio, causing HTB to >>increase the q.qlen counter and activating the class despite no packet >>beeing queued. >> >>Jamal, we can't return NET_XMIT_SUCCESS unless we've really queued the >>packet. I can't remeber the reason why this is done, could you remind >>me? > > > IIRC, It had to do with not confusing TCP to try and retransmit. No, that was the default return code which applies to TC_ACT_SHOT and unclassified packets (29f1df6cc1c3ee3530939f0e38d80a9b50645ba5). Returning NET_XMIT_SUCCESS for TC_ACT_STOLEN/TC_ACT_QUEUED has always been done. Anyway, we can't return NET_XMIT_SUCCESS, so how about just returning NET_XMIT_BYPASS in all cases where the packet was stolen/dropped/... by TC actions? > I can > go back and look at my notes to be certain. At one point i posted those > notes, it maybe time to add them to the kernel code or doc somewhere. In case they're still up to date, adding them somewhere under Documentation/ sounds good. ^ permalink raw reply [flat|nested] 41+ messages in thread
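The accounting problem Patrick describes in this sub-thread can be illustrated with a toy model. This is not kernel code, only a sketch under simplifying assumptions: a parent qdisc (standing in for HTB) bumps its own counter whenever the child (standing in for prio) reports NET_XMIT_SUCCESS, whether or not the child really queued anything. The function and variable names are made up for illustration; only the return-code names come from the thread.

```shell
#!/bin/sh
# Toy model, not kernel code: a parent qdisc that trusts its child's
# return code when counting queued packets.

child_len=0      # packets really sitting in the child (prio) queue
parent_qlen=0    # what the parent (HTB) believes is queued below it

# child_enqueue VERDICT STOLEN_RC: set $rc to the code reported upward.
# VERDICT is "ok" (packet queued) or "stolen" (an action took the packet).
# STOLEN_RC is the code the child reports for a stolen packet.
child_enqueue() {
    case $1 in
        ok)     child_len=$((child_len + 1)); rc=NET_XMIT_SUCCESS ;;
        stolen) rc=$2 ;;
        *)      rc=NET_XMIT_DROP ;;
    esac
}

# parent_enqueue VERDICT STOLEN_RC: count the packet only when the child
# reports success.
parent_enqueue() {
    child_enqueue "$1" "$2"
    if [ "$rc" = NET_XMIT_SUCCESS ]; then
        parent_qlen=$((parent_qlen + 1))
    fi
}

# consistent: true while the parent's counter matches the child's queue.
consistent() {
    [ "$parent_qlen" -eq "$child_len" ]
}
```

In this model, `parent_enqueue stolen NET_XMIT_SUCCESS` leaves the parent believing a packet is queued below it while the child is empty, which is the kind of mismatch the sch_htb.c assertion complains about. With `parent_enqueue stolen NET_XMIT_BYPASS`, as Patrick proposes, the two counters stay in step.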
* Re: HTB/act_mirred problem [was: one more... iproute commands lockup whole system] 2007-04-04 14:50 ` Patrick McHardy @ 2007-04-05 1:33 ` jamal 0 siblings, 0 replies; 41+ messages in thread
From: jamal @ 2007-04-05 1:33 UTC (permalink / raw)
To: Patrick McHardy; +Cc: Denys, Stephen Hemminger, netdev

On Wed, 2007-04-04 at 16:50 +0200, Patrick McHardy wrote:

> Anyway, we can't return NET_XMIT_SUCCESS, so how about just returning
> NET_XMIT_BYPASS in all cases where the packet was stolen/dropped/...
> by TC actions?

Ok, I think i get you now. Yes, that would work, but: for the dropped
case, you need to record into the dropped stats (unlike the other
two).

> > I can
> > go back and look at my notes to be certain. At one point i posted those
> > notes, it maybe time to add them to the kernel code or doc somewhere.
>
>
> In case they're still up to date, adding them somewhere under
> Documentation/ sounds good.

I think they are still up to date; if not, i will make sure they are
before i submit them.

cheers,
jamal

^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: one more... iproute commands lockup whole system 2007-04-04 10:55 ` jamal 2007-04-04 12:56 ` Denys @ 2007-04-04 13:36 ` Patrick McHardy 2007-04-04 13:58 ` jamal 1 sibling, 1 reply; 41+ messages in thread From: Patrick McHardy @ 2007-04-04 13:36 UTC (permalink / raw) To: hadi; +Cc: Denys, Stephen Hemminger, netdev jamal wrote: > On Wed, 2007-04-04 at 05:11 +0300, Denys wrote: > >>I think this highly useful feature given by jamal, difficult to be avoided >>from crash, if user not enough experienced in networking(like me). I guess >>packet can be even not ipv4/ipv6 packet, maybe it can be cloned IPX or ARP, >>so TTL field cannot be used. We have a loop counter (RTTL) in tc_verd. For some reason it is reset after ing_filter though. > I checked maybe sk_buff have some fields, seems >>also bad luck, if there can be something like "internal" counter for packet, >>how much times it got redirected, it will help. > > > Adding a field in the skb that keeps track of things would work well, > but would be a controvesial thing to do because it actually requires a > vector not just one field. There is a filed called cb[] but it cant be > used in this case because every time we redirect it could be trampled. It would be interesting to find out what the problem is exactly. The configuration itself looks harmless, so I'm guessing its rather a deadlock than a loop. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: one more... iproute commands lockup whole system 2007-04-04 13:36 ` one more... iproute commands lockup whole system Patrick McHardy @ 2007-04-04 13:58 ` jamal 2007-04-04 14:07 ` Patrick McHardy 0 siblings, 1 reply; 41+ messages in thread From: jamal @ 2007-04-04 13:58 UTC (permalink / raw) To: Patrick McHardy; +Cc: Denys, Stephen Hemminger, netdev On Wed, 2007-04-04 at 15:36 +0200, Patrick McHardy wrote: > > We have a loop counter (RTTL) in tc_verd. For some reason it is reset > after ing_filter though. > Essentially it is valuable just to avoid a lot of stacking (separate issue) and not to avoid the locking issue he is seeing. > It would be interesting to find out what the problem is exactly. > The configuration itself looks harmless, so I'm guessing its > rather a deadlock than a loop. We know it is a deadlock. If you redirect the first time queue lock for eth0 will be held, before it is released if you do another redirect, it will again be heading towards eth0 and it will deadlock on grabbing the queue lock. cheers, jamal ^ permalink raw reply [flat|nested] 41+ messages in thread
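The recursion jamal describes above can be sketched as a toy model. This is not kernel code and it rests on simplifying assumptions: every device has one non-recursive lock taken on entry to its transmit path, a mirred-style action runs while that lock is still held, and a VLAN device hands the packet down to the device it is stacked on. The `held` list, the `lower_of` table and `xmit` are all made up for illustration.

```shell
#!/bin/sh
# Toy model, not kernel code: one non-recursive lock per device, taken on
# the way into that device's transmit path.

held=""     # space-separated list of devices whose lock is currently held

# lower_of DEV: hypothetical stacking table; eth0.N devices sit on eth0.
lower_of() {
    case $1 in
        eth0.*) echo eth0 ;;
    esac
}

# xmit DEV [TARGET]: take DEV's lock, then either run a mirred-style
# action towards TARGET while the lock is held or, for a stacked device,
# hand the packet to the lower device. Returns 1 at the point where a
# real spinlock would spin forever.
xmit() {
    case " $held " in
        *" $1 "*) echo "deadlock: lock of $1 is already held"; return 1 ;;
    esac
    set -- "$1" "${2:-}" "$held"    # $3 remembers the lock list to restore
    held="$held $1"
    status=0
    if [ -n "$2" ]; then
        xmit "$2" || status=1
    elif [ -n "$(lower_of "$1")" ]; then
        xmit "$(lower_of "$1")" || status=1
    fi
    held=$3                         # release everything taken in this call
    return $status
}
```

Calling `xmit eth0 eth0.5` walks eth0, then eth0.5, then eth0 again and reports the second attempt to take eth0's lock, which matches the shape of the hang in the first report of this thread. A call such as `xmit eth0.5 eth1`, with eth1 as a hypothetical second NIC, completes without re-entering any lock.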
* Re: one more... iproute commands lockup whole system 2007-04-04 13:58 ` jamal @ 2007-04-04 14:07 ` Patrick McHardy 2007-04-04 14:30 ` jamal 0 siblings, 1 reply; 41+ messages in thread From: Patrick McHardy @ 2007-04-04 14:07 UTC (permalink / raw) To: hadi; +Cc: Denys, Stephen Hemminger, netdev jamal wrote: > On Wed, 2007-04-04 at 15:36 +0200, Patrick McHardy wrote: > >>It would be interesting to find out what the problem is exactly. >>The configuration itself looks harmless, so I'm guessing its >>rather a deadlock than a loop. > > > We know it is a deadlock. > If you redirect the first time queue lock for eth0 will be held, before > it is released if you do another redirect, it will again be heading > towards eth0 and it will deadlock on grabbing the queue lock. He only used a single redirect to eth0.5, but its probably due to the fact that the VLAN hard_start_xmit function transmits on eth0 again. How about adding something like ifb's ri_tasklet to act_mirred? ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: one more... iproute commands lockup whole system 2007-04-04 14:07 ` Patrick McHardy @ 2007-04-04 14:30 ` jamal 2007-04-04 14:39 ` Patrick McHardy 0 siblings, 1 reply; 41+ messages in thread From: jamal @ 2007-04-04 14:30 UTC (permalink / raw) To: Patrick McHardy; +Cc: Denys, Stephen Hemminger, netdev On Wed, 2007-04-04 at 16:07 +0200, Patrick McHardy wrote: > He only used a single redirect to eth0.5, but its probably due to the > fact that the VLAN hard_start_xmit function transmits on eth0 again. Ok, that validates what i was saying earlier then - so i dont need to run the test. > How about adding something like ifb's ri_tasklet to act_mirred? You mean having a tasklet or the constraint checks? cheers, jamal ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: one more... iproute commands lockup whole system 2007-04-04 14:30 ` jamal @ 2007-04-04 14:39 ` Patrick McHardy 2007-04-05 2:14 ` jamal 0 siblings, 1 reply; 41+ messages in thread From: Patrick McHardy @ 2007-04-04 14:39 UTC (permalink / raw) To: hadi; +Cc: Denys, Stephen Hemminger, netdev jamal wrote: > On Wed, 2007-04-04 at 16:07 +0200, Patrick McHardy wrote: > >>How about adding something like ifb's ri_tasklet to act_mirred? > > > You mean having a tasklet or the constraint checks? Have a tasklet to avoid the deadlocks. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: one more... iproute commands lockup whole system 2007-04-04 14:39 ` Patrick McHardy @ 2007-04-05 2:14 ` jamal 0 siblings, 0 replies; 41+ messages in thread
From: jamal @ 2007-04-05 2:14 UTC (permalink / raw)
To: Patrick McHardy; +Cc: Denys, Stephen Hemminger, netdev

On Wed, 2007-04-04 at 16:39 +0200, Patrick McHardy wrote:

> Have a tasklet to avoid the deadlocks.

sounds like a good idea from the outset. The principle of punishing only
users of mirred is the right one. It will also provide a clean way to
drop all redirect to self.
The dilemma is that the cost may end up being in complexity and
performance. Let me throw a few wrenches your way:

- Mirroring doesn't have this problem, only redirecting does. And the main
problem is on egress. So you may have to do the enqueueing only for
those cases, or we may just have to separate redirects from mirroring as
two separate actions.
- We need to be able to support literally hundreds of such actions, so if
you do a single tasklet per action you will have quite a few running - i
don't know what the impact is. A solution might be to do a single tasklet
with the mirred table; you will need many queues.
- There may be others like lock contention etc on trying to reinject,
which will affect performance.

I am not against the idea Patrick, I am just worried about performance,
and i feel that the majority of people who redirect will, once bitten (as
was Denys), eventually resort to doing the right config and would rather
have good performance. I could be selfish and wrong by looking at my
needs for this action, but i prefer performance, and from that perspective
i look at this as a case of someone who didn't read the instructions on a
gun and shot at their toe. You could add a sensor that recognises a toe is
being aimed at and pop up an LCD with a question "are you sure you want
to shoot at _a toe_?" ;-> but that adds to the cost.
I am exaggerating - but i hope you get my point.

cheers,
jamal

^ permalink raw reply [flat|nested] 41+ messages in thread