* HTB accuracy for high speed
       [not found] <298f5c050905150745p13dc226eia1ff50ffa8c4b300@mail.gmail.com>
@ 2009-05-15 14:49 ` Antonio Almeida
  2009-05-15 18:12   ` Stephen Hemminger
                     ` (4 more replies)
  0 siblings, 5 replies; 104+ messages in thread
From: Antonio Almeida @ 2009-05-15 14:49 UTC (permalink / raw)
  To: netdev, jarkao2, kaber, davem, devik

Hi!
I've been using HTB on a Linux bridge and recently noticed that, at
high speeds, the configured rate/ceil is not respected the way it is
at lower speeds.
I'm using a packet generator/analyser to inject over 950Mbps and see
what comes back to it on the other side of my bridge. The generated
packets are 800 bytes. I noticed that, for several tc HTB rate/ceil
configurations, the amount of traffic received by the analyser stays
the same. See these values:

HTB rate/ceil Analyser reception (bit/s)
476000Kbit    544.260.329
500000Kbit    545.880.017
510000Kbit    544.489.469
512000Kbit    546.890.972
-------------------------
513000Kbit    596.061.383
520000Kbit    596.791.866
550000Kbit    596.543.271
554000Kbit    596.193.545
-------------------------
555000Kbit    654.773.221
570000Kbit    654.996.381
590000Kbit    655.363.253
605000Kbit    654.112.017
-------------------------
606000Kbit    728.262.237
665000Kbit    727.014.365
-------------------------

There are these steps, and it looks like it doesn't matter whether I
configure HTB to 555Mbit or to 605Mbit - the result is the same:
654Mbit. That is up to 18% more traffic than configured (654Mbit vs.
555Mbit). I also noticed that for smaller packets it gets worse,
reaching 30% more traffic than what I configured. For packets of
1514 bytes the accuracy is quite good.
I'm using kernel 2.6.25.

My 'tc -s -d class ls dev eth1' output:

class htb 1:10 parent 1:2 rate 1000Mbit ceil 1000Mbit burst 126375b/8
mpu 0b overhead 0b cburst 126375b/8 mpu 0b overhead 0b level 5
 Sent 51888579644 bytes 62067679 pkt (dropped 0, overlimits 0 requeues 0)
 rate 653124Kbit 97656pps backlog 0b 0p requeues 0
 lended: 0 borrowed: 0 giants: 0
 tokens: 113 ctokens: 113

class htb 1:1 root rate 1000Mbit ceil 1000Mbit burst 126375b/8 mpu 0b
overhead 0b cburst 126375b/8 mpu 0b overhead 0b level 7
 Sent 51888579644 bytes 62067679 pkt (dropped 0, overlimits 0 requeues 0)
 rate 653123Kbit 97656pps backlog 0b 0p requeues 0
 lended: 0 borrowed: 0 giants: 0
 tokens: 113 ctokens: 113

class htb 1:2 parent 1:1 rate 1000Mbit ceil 1000Mbit burst 126375b/8
mpu 0b overhead 0b cburst 126375b/8 mpu 0b overhead 0b level 6
 Sent 51888579644 bytes 62067679 pkt (dropped 0, overlimits 0 requeues 0)
 rate 653124Kbit 97656pps backlog 0b 0p requeues 0
 lended: 0 borrowed: 0 giants: 0
 tokens: 113 ctokens: 113

class htb 1:108 parent 1:10 leaf 108: prio 7 quantum 1514 rate
555000Kbit ceil 555000Kbit burst 70901b/8 mpu 0b overhead 0b cburst
70901b/8 mpu 0b overhead 0b level 0
 Sent 51888579644 bytes 62067679 pkt (dropped 27801917, overlimits 0 requeues 0)
 rate 653124Kbit 97656pps backlog 0b 0p requeues 0
 lended: 62067679 borrowed: 0 giants: 0
 tokens: -798 ctokens: -798

As you can see, the rate of class htb 1:108 is 653124Kbit! Much bigger
than its ceil.

I also noticed that, for HTB rate configurations over 500Mbit/s on the
leaf class, when I stop the traffic, the output of "tc -s -d class ls
dev eth1" shows the leaf's rate (in bits/s) growing instead of
decreasing (as expected, since I've stopped the traffic). The rate in
pps is OK and decreases to 0pps. The rate in bits/s increases above
1000Mbit and stays there for a few minutes. After two or three minutes
it becomes 0bit. The same happens for its ancestors (including the
root class). Here's the tc output of my leaf class in this situation:

class htb 1:108 parent 1:10 leaf 108: prio 7 quantum 1514 rate
555000Kbit ceil 555000Kbit burst 70901b/8 mpu 0b overhead 0b cburst
70901b/8 mpu 0b overhead 0b level 0
 Sent 120267768144 bytes 242475339 pkt (dropped 62272599, overlimits 0
requeues 0)
 rate 1074Mbit 0pps backlog 0b 0p requeues 0
 lended: 242475339 borrowed: 0 giants: 0
 tokens: 8 ctokens: 8


  Antonio Almeida


* Re: HTB accuracy for high speed
  2009-05-15 14:49 ` HTB accuracy for high speed Antonio Almeida
@ 2009-05-15 18:12   ` Stephen Hemminger
  2009-05-18 10:01     ` Antonio Almeida
  2009-05-16  8:31   ` Jarek Poplawski
                     ` (3 subsequent siblings)
  4 siblings, 1 reply; 104+ messages in thread
From: Stephen Hemminger @ 2009-05-15 18:12 UTC (permalink / raw)
  To: Antonio Almeida; +Cc: netdev, jarkao2, kaber, davem, devik

You are probably hitting the limit of the timer resolution. So it matters
what the clock source is.  
    cat /sys/devices/system/clocksource/clocksource0/current_clocksource

Also, is HFSC any better than HTB?

-- 


* Re: HTB accuracy for high speed
  2009-05-15 14:49 ` HTB accuracy for high speed Antonio Almeida
  2009-05-15 18:12   ` Stephen Hemminger
@ 2009-05-16  8:31   ` Jarek Poplawski
  2009-05-18 10:39     ` Antonio Almeida
  2009-05-16 14:14   ` Jarek Poplawski
                     ` (2 subsequent siblings)
  4 siblings, 1 reply; 104+ messages in thread
From: Jarek Poplawski @ 2009-05-16  8:31 UTC (permalink / raw)
  To: Antonio Almeida; +Cc: netdev, kaber, davem, devik

On Fri, May 15, 2009 at 03:49:31PM +0100, Antonio Almeida wrote:
> Hi!
> I've been using HTB in a Linux bridge and recently I noticed that, for
> high speed, the configured rate/ceil is not respected as for lower
> speeds.
> I'm using a packet generator/analyser to inject over 950Mbps, and see
> what comes back to it, on the other side of my bridge. Generated
> packets have 800 bytes. I noticed that, for several tc HTB rate/ceil
> configurations the amount of traffic received by the analyser stays
> the same. See these values:
> 
> HTB conf      Analyser reception
> 476000Kbit    544.260.329
...
> As you can see, the rate of class htb 1:108 is 653124Kbit! Much bigger
> than its ceil.

Are you sure there is no gso/tso enabled on this dev (checked with an
up-to-date ethtool -k)? It would also be nice to see more details, like
the .config, ifconfig output before and after the test, tc -s qdisc,
and the byte/packet counts seen by the analyser, plus maybe some proof
that you can obtain such flows with something simpler like tbf. Of
course, using the current kernel, even if it makes no difference,
would give us a more valuable perspective.

Thanks,
Jarek P.


* Re: HTB accuracy for high speed
  2009-05-15 14:49 ` HTB accuracy for high speed Antonio Almeida
  2009-05-15 18:12   ` Stephen Hemminger
  2009-05-16  8:31   ` Jarek Poplawski
@ 2009-05-16 14:14   ` Jarek Poplawski
  2009-05-18 14:36     ` Antonio Almeida
  2009-05-18 16:40     ` HTB accuracy for high speed Eric Dumazet
  2009-05-17 20:15   ` HTB accuracy for high speed Jarek Poplawski
  2009-05-17 20:29   ` Vladimir Ivashchenko
  4 siblings, 2 replies; 104+ messages in thread
From: Jarek Poplawski @ 2009-05-16 14:14 UTC (permalink / raw)
  To: Antonio Almeida; +Cc: netdev, kaber, davem, devik

On Fri, May 15, 2009 at 03:49:31PM +0100, Antonio Almeida wrote:
...
> I also noticed that, for HTB rate configurations over 500Mbit/s on the
> leaf class, when I stop the traffic, the output of "tc -s -d class ls
> dev eth1" shows the leaf's rate (in bits/s) growing instead of
> decreasing (as expected, since I've stopped the traffic). The rate in
> pps is OK and decreases to 0pps. The rate in bits/s increases above
> 1000Mbit and stays there for a few minutes. After two or three minutes
> it becomes 0bit. The same happens for its ancestors (including the
> root class). Here's the tc output of my leaf class in this situation:
> 
> class htb 1:108 parent 1:10 leaf 108: prio 7 quantum 1514 rate
> 555000Kbit ceil 555000Kbit burst 70901b/8 mpu 0b overhead 0b cburst
> 70901b/8 mpu 0b overhead 0b level 0
>  Sent 120267768144 bytes 242475339 pkt (dropped 62272599, overlimits 0
> requeues 0)
>  rate 1074Mbit 0pps backlog 0b 0p requeues 0
>  lended: 242475339 borrowed: 0 giants: 0
>  tokens: 8 ctokens: 8

This looks like a regular bug. I guess it's an overflow in
gen_estimator's est_timer(), but I'm not sure that's all there is to
it. Could you try the patch below? (An offset warning when patching
2.6.25 is OK.)

Thanks,
Jarek P.
---

 net/core/gen_estimator.c |    6 +++++-
 1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/net/core/gen_estimator.c b/net/core/gen_estimator.c
index 9cc9f95..87f0ced 100644
--- a/net/core/gen_estimator.c
+++ b/net/core/gen_estimator.c
@@ -127,7 +127,11 @@ static void est_timer(unsigned long arg)
 		npackets = e->bstats->packets;
 		rate = (nbytes - e->last_bytes)<<(7 - idx);
 		e->last_bytes = nbytes;
-		e->avbps += ((long)rate - (long)e->avbps) >> e->ewma_log;
+		if (rate > e->avbps)
+			e->avbps += (rate - e->avbps) >> e->ewma_log;
+		else
+			e->avbps -= (e->avbps - rate) >> e->ewma_log;
+
 		e->rate_est->bps = (e->avbps+0xF)>>5;
 
 		rate = (npackets - e->last_packets)<<(12 - idx);
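
A minimal sketch of why the signed form can misbehave, not the kernel
code itself. It assumes a 32-bit `long` (emulated here with int32_t; the
thread does not say whether the kernel was a 32-bit build) and the
estimator's internal scale of bytes/s shifted left by 5, as implied by
the `>>5` in the hunk above. The function names and the ewma_log value
used in the note below are hypothetical. Under these assumptions avbps
passes 2^31 at 2^31/32 bytes/s, about 537Mbit/s, which would fit the
"over 500Mbit/s" observation, and 2^32/32 bytes/s is about 1074Mbit/s,
close to the stuck reading.

```c
#include <stdint.h>

/*
 * Old update, with a 32-bit `long` emulated by int32_t (assumption).
 * avbps and rate are both in the estimator's scale: bytes/s << 5.
 * Relies on two's-complement conversion and arithmetic right shift,
 * which is what gcc and clang provide.
 */
static uint32_t ewma_signed(uint32_t avbps, uint32_t rate, unsigned ewma_log)
{
    /* Once avbps exceeds 2^31, a needed decrease can come out positive. */
    int32_t diff = (int32_t)(rate - avbps);
    return avbps + (uint32_t)(diff >> ewma_log);
}

/* Update in the style of the patch: choose the direction, stay unsigned. */
static uint32_t ewma_unsigned(uint32_t avbps, uint32_t rate, unsigned ewma_log)
{
    if (rate > avbps)
        return avbps + ((rate - avbps) >> ewma_log);
    return avbps - ((avbps - rate) >> ewma_log);
}

/* Value exported as bytes/s, mirroring (e->avbps+0xF)>>5. */
static uint32_t est_bps(uint32_t avbps)
{
    return (avbps + 0xF) >> 5;
}
```

With avbps = 2220000000 (555Mbit/s in this scale), rate = 0 (traffic
stopped) and an illustrative ewma_log of 3, ewma_signed() moves the
average up while ewma_unsigned() moves it down. Below 2^31 the two
forms agree.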


* Re: HTB accuracy for high speed
  2009-05-15 14:49 ` HTB accuracy for high speed Antonio Almeida
                     ` (2 preceding siblings ...)
  2009-05-16 14:14   ` Jarek Poplawski
@ 2009-05-17 20:15   ` Jarek Poplawski
  2009-05-18  6:56     ` [PATCH iproute2] " Jarek Poplawski
  2009-05-18  7:01     ` [PATCH iproute2 v2] " Jarek Poplawski
  2009-05-17 20:29   ` Vladimir Ivashchenko
  4 siblings, 2 replies; 104+ messages in thread
From: Jarek Poplawski @ 2009-05-17 20:15 UTC (permalink / raw)
  To: Antonio Almeida; +Cc: netdev, kaber, davem, devik

Here is some additional explanation. It looks like these rates above
500Mbit hit the design limits of packet scheduling. The internal
resolution currently used, PSCHED_TICKS_PER_SEC, is 1,000,000. A
550Mbit rate with 800-byte packets means 550M/8/800 = 85938 packets/s,
so on average 1000000/85938 = 11.6 ticks per packet. Accounting for
only 11 ticks leaves 0.6*85938 = 51563 ticks per second unaccounted,
which allows an additional 51563/11 = 4687 packets/s, or 4687*800*8 =
30Mbit, to be sent. Of course it could be worse (up to 0.9 tick/packet
lost) depending on packet sizes vs. rates, and the effect grows at
higher rates.
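
The arithmetic above can be sketched as a small model. This is only an
illustration of the truncation effect, assuming each packet is charged
a whole number of ticks; it is not HTB's rate-table code and does not
try to reproduce the measured steps exactly. The function names are
hypothetical, and the tick rate is a parameter so the 1,000,000 figure
quoted above can be passed in.

```c
#include <stdint.h>

/* Whole ticks charged per packet; the fractional part is dropped. */
static uint64_t ticks_charged(uint64_t rate_bps, uint64_t pkt_bytes,
                              uint64_t ticks_per_sec)
{
    return pkt_bytes * 8 * ticks_per_sec / rate_bps;
}

/* Rate actually let through when a packet costs only whole ticks. */
static uint64_t effective_rate_bps(uint64_t rate_bps, uint64_t pkt_bytes,
                                   uint64_t ticks_per_sec)
{
    uint64_t t = ticks_charged(rate_bps, pkt_bytes, ticks_per_sec);

    if (t == 0)
        return UINT64_MAX; /* packet costs nothing: effectively unlimited */
    return pkt_bytes * 8 * ticks_per_sec / t;
}
```

For 550Mbit and 800-byte packets this model charges 11 ticks and lets
through about 582Mbit, i.e. roughly the 30Mbit excess estimated above.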

Jarek P.


* Re: HTB accuracy for high speed
  2009-05-15 14:49 ` HTB accuracy for high speed Antonio Almeida
                     ` (3 preceding siblings ...)
  2009-05-17 20:15   ` HTB accuracy for high speed Jarek Poplawski
@ 2009-05-17 20:29   ` Vladimir Ivashchenko
  4 siblings, 0 replies; 104+ messages in thread
From: Vladimir Ivashchenko @ 2009-05-17 20:29 UTC (permalink / raw)
  To: Antonio Almeida; +Cc: netdev, jarkao2, kaber, davem, devik

Hi Antonio,

FYI, these are exactly the same problems I get in real life.
Check the later posts in "bond + tc regression" thread.


-- 
Best Regards
Vladimir Ivashchenko
Chief Technology Officer
PrimeTel, Cyprus - www.prime-tel.com


* [PATCH iproute2] Re: HTB accuracy for high speed
  2009-05-17 20:15   ` HTB accuracy for high speed Jarek Poplawski
@ 2009-05-18  6:56     ` Jarek Poplawski
  2009-05-18 16:54       ` Antonio Almeida
  2009-05-18  7:01     ` [PATCH iproute2 v2] " Jarek Poplawski
  1 sibling, 1 reply; 104+ messages in thread
From: Jarek Poplawski @ 2009-05-18  6:56 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: Antonio Almeida, netdev, kaber, davem, devik

Return non-zero tc_calc_xmittime() for rate tables

While looking at the problem of HTB accuracy for high speed (~500Mbit
rates) I've found that rate tables have cells filled with zeros for
the smallest sizes. It means such packets aren't accounted at all.
Apart from the correctness of such configs, let's make it safe with
rather overaccounting than living it unlimited.

Reported-by: Antonio Almeida <vexwek@gmail.com>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
---

 tc/tc_core.c |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/tc/tc_core.c b/tc/tc_core.c
index 9a0ff39..14f25bc 100644
--- a/tc/tc_core.c
+++ b/tc/tc_core.c
@@ -58,7 +58,9 @@ unsigned tc_core_ktime2time(unsigned ktime)
 
 unsigned tc_calc_xmittime(unsigned rate, unsigned size)
 {
-	return tc_core_time2tick(TIME_UNITS_PER_SEC*((double)size/rate));
+	unsigned t;
+	t = tc_core_time2tick(TIME_UNITS_PER_SEC*((double)size/rate));
+	return t ? : 1;
 }
 
 unsigned tc_calc_xmitsize(unsigned rate, unsigned ticks)


* [PATCH iproute2 v2] Re: HTB accuracy for high speed
  2009-05-17 20:15   ` HTB accuracy for high speed Jarek Poplawski
  2009-05-18  6:56     ` [PATCH iproute2] " Jarek Poplawski
@ 2009-05-18  7:01     ` Jarek Poplawski
  1 sibling, 0 replies; 104+ messages in thread
From: Jarek Poplawski @ 2009-05-18  7:01 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: Antonio Almeida, netdev, kaber, davem, devik

-----------> (One misspelling fixed.)
Return non-zero tc_calc_xmittime() for rate tables

While looking at the problem of HTB accuracy for high speed (~500Mbit
rates) I've found that rate tables have cells filled with zeros for
the smallest sizes. It means such packets aren't accounted at all.
Apart from the correctness of such configs, let's make it safe with
rather overaccounting than leaving it unlimited.

Reported-by: Antonio Almeida <vexwek@gmail.com>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
---

 tc/tc_core.c |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/tc/tc_core.c b/tc/tc_core.c
index 9a0ff39..14f25bc 100644
--- a/tc/tc_core.c
+++ b/tc/tc_core.c
@@ -58,7 +58,9 @@ unsigned tc_core_ktime2time(unsigned ktime)
 
 unsigned tc_calc_xmittime(unsigned rate, unsigned size)
 {
-	return tc_core_time2tick(TIME_UNITS_PER_SEC*((double)size/rate));
+	unsigned t;
+	t = tc_core_time2tick(TIME_UNITS_PER_SEC*((double)size/rate));
+	return t ? : 1;
 }
 
 unsigned tc_calc_xmitsize(unsigned rate, unsigned ticks)
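
A rough sketch of the zero-cell effect described in the commit message.
It assumes one tick per microsecond in place of the real
tc_core_time2tick() conversion, and the two function names are
hypothetical stand-ins for the behaviour before and after the patch.
The rate is in bytes/s, so 555000Kbit is 69375000.

```c
#include <stdint.h>

#define TIME_UNITS_PER_SEC 1000000

/* Before the patch: the transmit time is truncated, so it can be 0. */
static unsigned xmittime_before(unsigned rate_Bps, unsigned size)
{
    /* Assumption: 1 tick == 1 usec, so no further conversion is done. */
    return (unsigned)(TIME_UNITS_PER_SEC * ((double)size / rate_Bps));
}

/* After the patch: never return 0, so every packet is charged something. */
static unsigned xmittime_after(unsigned rate_Bps, unsigned size)
{
    unsigned t = xmittime_before(rate_Bps, size);

    return t ? t : 1;
}
```

At 69375000 bytes/s, sizes up to 64 bytes truncate to 0 ticks in this
sketch, while an 800-byte packet is charged 11 ticks either way.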


* Re: HTB accuracy for high speed
  2009-05-15 18:12   ` Stephen Hemminger
@ 2009-05-18 10:01     ` Antonio Almeida
  2009-05-18 10:45       ` Jarek Poplawski
  2009-05-18 16:13       ` Stephen Hemminger
  0 siblings, 2 replies; 104+ messages in thread
From: Antonio Almeida @ 2009-05-18 10:01 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev, jarkao2, kaber, davem, devik

Hi!

cat /sys/devices/system/clocksource/clocksource0/current_clocksource
returns "jiffies"

With HFSC the accuracy is good. Again with 800-byte packets, I got
these values:
received (bit/s)   configured (bit/s)   error (%)
904596519          900000000            0.51
804293658          800000000            0.54
703662853          700000000            0.52
603354059          600000000            0.56
502805411          500000000            0.56
402527055          400000000            0.63
301484904          300000000            0.49
201074301          200000000            0.54
100546656          100000000            0.55


Thanks
  Antonio Almeida





* Re: HTB accuracy for high speed
  2009-05-16  8:31   ` Jarek Poplawski
@ 2009-05-18 10:39     ` Antonio Almeida
  2009-05-18 11:14       ` Jarek Poplawski
  0 siblings, 1 reply; 104+ messages in thread
From: Antonio Almeida @ 2009-05-18 10:39 UTC (permalink / raw)
  To: Jarek Poplawski; +Cc: netdev, kaber, davem, devik

[-- Attachment #1: Type: text/plain, Size: 13280 bytes --]

Hi!

Here is the information you asked for:

# ethtool -k eth0
Offload parameters for eth0:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp segmentation offload: on
udp fragmentation offload: off
generic segmentation offload: off

# ethtool -k eth1
Offload parameters for eth1:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp segmentation offload: on
udp fragmentation offload: off
generic segmentation offload: off

The bridge is between eth0 and eth1

---------------------------
Before traffic starts:
---------------------------
Analyser sent bytes: 0
Analyser sent packets: 0
Analyser received bytes: 0
Analyser received packets: 0


# tc -s -d class ls dev eth1
class htb 1:10 parent 1:2 rate 900000Kbit ceil 900000Kbit burst
113962b/8 mpu 0b overhead 0b cburst 113962b/8 mpu 0b overhead 0b level
5
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
 lended: 0 borrowed: 0 giants: 0
 tokens: 990 ctokens: 990

class htb 1:1 root rate 900000Kbit ceil 900000Kbit burst 113962b/8 mpu
0b overhead 0b cburst 113962b/8 mpu 0b overhead 0b level 7
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
 lended: 0 borrowed: 0 giants: 0
 tokens: 990 ctokens: 990

class htb 1:2 parent 1:1 rate 900000Kbit ceil 900000Kbit burst
113962b/8 mpu 0b overhead 0b cburst 113962b/8 mpu 0b overhead 0b level
6
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
 lended: 0 borrowed: 0 giants: 0
 tokens: 990 ctokens: 990

class htb 1:108 parent 1:10 leaf 108: prio 7 quantum 1514 rate
555000Kbit ceil 555000Kbit burst 70901b/8 mpu 0b overhead 0b cburst
70901b/8 mpu 0b overhead 0b level 0
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
 lended: 0 borrowed: 0 giants: 0
 tokens: 999 ctokens: 999


# ifconfig
br0       Link encap:Ethernet  HWaddr 00:E0:ED:10:7C:6C
          UP BROADCAST RUNNING NOARP MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)

eth0      Link encap:Ethernet  HWaddr 00:E0:ED:10:7C:6C
          UP BROADCAST RUNNING PROMISC MULTICAST  MTU:1500  Metric:1
          RX packets:69617616 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:4154463648 (3.8 GiB)  TX bytes:0 (0.0 b)
          Base address:0x4000 Memory:e8200000-e8220000

eth1      Link encap:Ethernet  HWaddr 00:E0:ED:10:7C:6D
          UP BROADCAST RUNNING PROMISC MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:50262048 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 b)  TX bytes:1554907136 (1.4 GiB)
          Base address:0x4040 Memory:e8220000-e8240000

eth3      Link encap:Ethernet  HWaddr 00:11:25:C4:60:AF
          inet addr:192.168.0.244  Bcast:19.168.0.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:461403 errors:0 dropped:0 overruns:0 frame:0
          TX packets:13573 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:34150991 (32.5 MiB)  TX bytes:1247864 (1.1 MiB)
          Interrupt:27

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:4 errors:0 dropped:0 overruns:0 frame:0
          TX packets:4 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:188 (188.0 b)  TX bytes:188 (188.0 b)


# tc -s qdisc
qdisc pfifo_fast 0: dev eth3 root bands 3 priomap  1 2 2 2 1 2 0 0 1 1
1 1 1 1 1 1
 Sent 5459409 bytes 25647 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
qdisc htb 1: dev eth0 root r2q 10 default 0 direct_packets_stat 0
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
qdisc sfq 108: dev eth0 parent 1:108 limit 127p quantum 1514b perturb 15sec
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
qdisc htb 1: dev eth1 root r2q 10 default 0 direct_packets_stat 0
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
qdisc sfq 108: dev eth1 parent 1:108 limit 127p quantum 1514b perturb 15sec
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0


---------------------
Traffic running:
---------------------

Analyser sent rate: 704218764 bits/s
Analyser received rate: 624942839 bits/s


# tc -s -d class ls dev eth1
class htb 1:10 parent 1:2 rate 900000Kbit ceil 900000Kbit burst
113962b/8 mpu 0b overhead 0b cburst 113962b/8 mpu 0b overhead 0b level
5
 Sent 5772939852 bytes 7252437 pkt (dropped 0, overlimits 0 requeues 0)
 rate 624826Kbit 97169pps backlog 0b 0p requeues 0
 lended: 0 borrowed: 0 giants: 0
 tokens: 402 ctokens: 402

class htb 1:1 root rate 900000Kbit ceil 900000Kbit burst 113962b/8 mpu
0b overhead 0b cburst 113962b/8 mpu 0b overhead 0b level 7
 Sent 5772939852 bytes 7252437 pkt (dropped 0, overlimits 0 requeues 0)
 rate 624826Kbit 97169pps backlog 0b 0p requeues 0
 lended: 0 borrowed: 0 giants: 0
 tokens: 402 ctokens: 402

class htb 1:2 parent 1:1 rate 900000Kbit ceil 900000Kbit burst
113962b/8 mpu 0b overhead 0b cburst 113962b/8 mpu 0b overhead 0b level
6
 Sent 5772939852 bytes 7252437 pkt (dropped 0, overlimits 0 requeues 0)
 rate 624826Kbit 97169pps backlog 0b 0p requeues 0
 lended: 0 borrowed: 0 giants: 0
 tokens: 402 ctokens: 402

class htb 1:108 parent 1:10 leaf 108: prio 7 quantum 1514 rate
555000Kbit ceil 555000Kbit burst 70901b/8 mpu 0b overhead 0b cburst
70901b/8 mpu 0b overhead 0b level 0
 Sent 5773001940 bytes 7252515 pkt (dropped 916587, overlimits 0 requeues 0)
 rate 624826Kbit 97169pps backlog 0b 78p requeues 0
 lended: 7252437 borrowed: 0 giants: 0
 tokens: -10 ctokens: -10



# tc -s qdisc
qdisc pfifo_fast 0: dev eth3 root bands 3 priomap  1 2 2 2 1 2 0 0 1 1
1 1 1 1 1 1
 Sent 5611186 bytes 26259 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
qdisc htb 1: dev eth0 root r2q 10 default 0 direct_packets_stat 0
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
qdisc sfq 108: dev eth0 parent 1:108 limit 127p quantum 1514b perturb 15sec
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
qdisc htb 1: dev eth1 root r2q 10 default 0 direct_packets_stat 0
 Sent 7122619144 bytes 8948014 pkt (dropped 1130906, overlimits
10090666 requeues 0)
 rate 0bit 0pps backlog 0b 70p requeues 0
qdisc sfq 108: dev eth1 parent 1:108 limit 127p quantum 1514b perturb 15sec
 Sent 7122619144 bytes 8948014 pkt (dropped 1130906, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 55720b 70p requeues 0




---------------------------
After traffic stopped:
---------------------------
(traffic ran for 170 seconds)

Analyser sent bytes: 15143884800
Analyser sent packets: 18929856
Analyser received bytes: 13444564800
Analyser received packets: 16805706


# tc -s -d class ls dev eth1
class htb 1:10 parent 1:2 rate 900000Kbit ceil 900000Kbit burst
113962b/8 mpu 0b overhead 0b cburst 113962b/8 mpu 0b overhead 0b level
5
 Sent 13377341976 bytes 16805706 pkt (dropped 0, overlimits 0 requeues 0)
 rate 1061Mbit 2066pps backlog 0b 0p requeues 0
 lended: 0 borrowed: 0 giants: 0
 tokens: 708 ctokens: 708

class htb 1:1 root rate 900000Kbit ceil 900000Kbit burst 113962b/8 mpu
0b overhead 0b cburst 113962b/8 mpu 0b overhead 0b level 7
 Sent 13377341976 bytes 16805706 pkt (dropped 0, overlimits 0 requeues 0)
 rate 1061Mbit 2066pps backlog 0b 0p requeues 0
 lended: 0 borrowed: 0 giants: 0
 tokens: 708 ctokens: 708

class htb 1:2 parent 1:1 rate 900000Kbit ceil 900000Kbit burst
113962b/8 mpu 0b overhead 0b cburst 113962b/8 mpu 0b overhead 0b level
6
 Sent 13377341976 bytes 16805706 pkt (dropped 0, overlimits 0 requeues 0)
 rate 1061Mbit 2066pps backlog 0b 0p requeues 0
 lended: 0 borrowed: 0 giants: 0
 tokens: 708 ctokens: 708

class htb 1:108 parent 1:10 leaf 108: prio 7 quantum 1514 rate
555000Kbit ceil 555000Kbit burst 70901b/8 mpu 0b overhead 0b cburst
70901b/8 mpu 0b overhead 0b level 0
 Sent 13377341976 bytes 16805706 pkt (dropped 2124150, overlimits 0 requeues 0)
 rate 1061Mbit 2066pps backlog 0b 0p requeues 0
 lended: 16805706 borrowed: 0 giants: 0
 tokens: 503 ctokens: 503



# ifconfig
br0       Link encap:Ethernet  HWaddr 00:E0:ED:10:7C:6C
          UP BROADCAST RUNNING NOARP MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)

eth0      Link encap:Ethernet  HWaddr 00:E0:ED:10:7C:6C
          UP BROADCAST RUNNING PROMISC MULTICAST  MTU:1500  Metric:1
          RX packets:88547472 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:2118475264 (1.9 GiB)  TX bytes:0 (0.0 b)
          Base address:0x4000 Memory:e8200000-e8220000

eth1      Link encap:Ethernet  HWaddr 00:E0:ED:10:7C:6D
          UP BROADCAST RUNNING PROMISC MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:67067754 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 b)  TX bytes:2114553248 (1.9 GiB)
          Base address:0x4040 Memory:e8220000-e8240000

eth3      Link encap:Ethernet  HWaddr 00:11:25:C4:60:AF
          inet addr:192.168.0.244  Bcast:19.168.0.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:476452 errors:0 dropped:0 overruns:0 frame:0
          TX packets:27435 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:35918090 (34.2 MiB)  TX bytes:5939712 (5.6 MiB)
          Interrupt:27

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:4 errors:0 dropped:0 overruns:0 frame:0
          TX packets:4 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:188 (188.0 b)  TX bytes:188 (188.0 b)


# tc -s qdisc
qdisc pfifo_fast 0: dev eth3 root bands 3 priomap  1 2 2 2 1 2 0 0 1 1
1 1 1 1 1 1
 Sent 5623502 bytes 26347 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
qdisc htb 1: dev eth0 root r2q 10 default 0 direct_packets_stat 0
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
qdisc sfq 108: dev eth0 parent 1:108 limit 127p quantum 1514b perturb 15sec
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
qdisc htb 1: dev eth1 root r2q 10 default 0 direct_packets_stat 0
 Sent 13377341976 bytes 16805706 pkt (dropped 2124150, overlimits
18953263 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
qdisc sfq 108: dev eth1 parent 1:108 limit 127p quantum 1514b perturb 15sec
 Sent 13377341976 bytes 16805706 pkt (dropped 2124150, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
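
As a cross-check of the counters above, here is a small Python sketch (not
part of the original report) that derives the average packet size, the drop
count and the overshoot relative to the configured 555000Kbit. The helper
names are hypothetical, and the 170 second duration is the approximate figure
quoted above, so the averaged rate is approximate too. The per-packet sizes
seen by the analyser and by tc differ by a constant 4 bytes; reading that as
the Ethernet FCS is an assumption, not something stated in the report.

```python
# Figures copied from the report above; helper names are illustrative only.
CONFIGURED_BPS = 555_000 * 1000   # class 1:108 rate/ceil 555000Kbit (tc: 1 Kbit = 1000 bit/s)
DURATION_S = 170                  # "traffic ran for 170 seconds" (approximate)

ANALYSER_SENT_PKTS = 18929856
ANALYSER_RECV_BYTES = 13444564800
ANALYSER_RECV_PKTS = 16805706
ANALYSER_RUNNING_BPS = 624942839  # "Analyser received rate" while traffic was running

TC_SENT_BYTES = 13377341976       # class 1:108 "Sent ... bytes" after the test
TC_DROPPED_PKTS = 2124150         # class 1:108 "dropped" after the test


def average_rate_bps(total_bytes, seconds):
    """Average bit rate over the whole run."""
    return total_bytes * 8 / seconds


def overshoot_percent(measured_bps, configured_bps):
    """How far the measured rate exceeds the configured rate, in percent."""
    return (measured_bps / configured_bps - 1.0) * 100.0


if __name__ == "__main__":
    # Average packet size as seen by the analyser and by tc.
    print("analyser bytes/pkt:", ANALYSER_RECV_BYTES / ANALYSER_RECV_PKTS)
    print("tc bytes/pkt:      ", TC_SENT_BYTES / ANALYSER_RECV_PKTS)

    # Packets lost between generator and receiver vs. the tc drop counter.
    print("analyser lost pkts:", ANALYSER_SENT_PKTS - ANALYSER_RECV_PKTS)
    print("tc dropped pkts:   ", TC_DROPPED_PKTS)

    # Overshoot while running and averaged over the (approximate) duration.
    avg_bps = average_rate_bps(ANALYSER_RECV_BYTES, DURATION_S)
    print(f"running overshoot: {overshoot_percent(ANALYSER_RUNNING_BPS, CONFIGURED_BPS):.1f}%")
    print(f"average overshoot: {overshoot_percent(avg_bps, CONFIGURED_BPS):.1f}%")
```

The same two helpers can be pointed at the tc byte counter instead of the
analyser's to see how much of the difference comes from the per-packet size
each side reports.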



Thanks
  Antonio Almeida



On Sat, May 16, 2009 at 9:31 AM, Jarek Poplawski <jarkao2@gmail.com> wrote:
> On Fri, May 15, 2009 at 03:49:31PM +0100, Antonio Almeida wrote:
>> Hi!
>> I've been using HTB in a Linux bridge and recently I noticed that, at
>> high speeds, the configured rate/ceil is not respected as it is at
>> lower speeds.
>> I'm using a packet generator/analyser to inject over 950Mbps and see
>> what comes back to it on the other side of my bridge. Generated
>> packets are 800 bytes. I noticed that, for several tc HTB rate/ceil
>> configurations, the amount of traffic received by the analyser stays
>> the same. See these values:
>>
>> HTB conf      Analyser reception
>> 476000Kbit    544.260.329
> ...
>> As you can see, the rate of class htb 1:108 is 653124Kbit! Much bigger
>> than its ceil.
>
> Are you sure there is no gso/tso enabled on this dev (checked with an
> up-to-date ethtool -k)? It would also be nice to see more details like
> .config, ifconfigs before and after the test, tc -s qdisc and the
> byte/packet counts seen by this analyser, plus maybe some proof that
> you can obtain such flows with something simpler like tbf. Of course,
> using the current kernel, even if it makes no difference, would give
> us a more valuable perspective.
>
> Thanks,
> Jarek P.
>

[-- Attachment #2: config --]
[-- Type: application/octet-stream, Size: 60662 bytes --]

#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.25.1
# Wed May 13 17:48:02 2009
#
# CONFIG_64BIT is not set
CONFIG_X86_32=y
# CONFIG_X86_64 is not set
CONFIG_X86=y
# CONFIG_GENERIC_LOCKBREAK is not set
CONFIG_GENERIC_TIME=y
CONFIG_GENERIC_CMOS_UPDATE=y
CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_HAVE_LATENCYTOP_SUPPORT=y
CONFIG_SEMAPHORE_SLEEPERS=y
CONFIG_FAST_CMPXCHG_LOCAL=y
CONFIG_MMU=y
CONFIG_ZONE_DMA=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_HWEIGHT=y
# CONFIG_GENERIC_GPIO is not set
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_DMI=y
# CONFIG_RWSEM_GENERIC_SPINLOCK is not set
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
# CONFIG_ARCH_HAS_ILOG2_U32 is not set
# CONFIG_ARCH_HAS_ILOG2_U64 is not set
CONFIG_ARCH_HAS_CPU_IDLE_WAIT=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
# CONFIG_GENERIC_TIME_VSYSCALL is not set
CONFIG_ARCH_HAS_CPU_RELAX=y
# CONFIG_HAVE_SETUP_PER_CPU_AREA is not set
CONFIG_ARCH_HIBERNATION_POSSIBLE=y
CONFIG_ARCH_SUSPEND_POSSIBLE=y
# CONFIG_ZONE_DMA32 is not set
CONFIG_ARCH_POPULATES_NODE_MAP=y
# CONFIG_AUDIT_ARCH is not set
CONFIG_ARCH_SUPPORTS_AOUT=y
CONFIG_GENERIC_HARDIRQS=y
CONFIG_GENERIC_IRQ_PROBE=y
CONFIG_GENERIC_PENDING_IRQ=y
CONFIG_X86_SMP=y
CONFIG_X86_32_SMP=y
CONFIG_X86_HT=y
CONFIG_X86_BIOS_REBOOT=y
CONFIG_X86_TRAMPOLINE=y
CONFIG_KTIME_SCALAR=y
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"

#
# General setup
#
CONFIG_EXPERIMENTAL=y
CONFIG_LOCK_KERNEL=y
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_LOCALVERSION="SIMPLESHAPER2G-2G-1000Hz"
# CONFIG_LOCALVERSION_AUTO is not set
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
CONFIG_POSIX_MQUEUE=y
CONFIG_BSD_PROCESS_ACCT=y
CONFIG_BSD_PROCESS_ACCT_V3=y
# CONFIG_TASKSTATS is not set
CONFIG_AUDIT=y
# CONFIG_AUDITSYSCALL is not set
# CONFIG_IKCONFIG is not set
CONFIG_LOG_BUF_SHIFT=15
# CONFIG_CGROUPS is not set
CONFIG_GROUP_SCHED=y
CONFIG_FAIR_GROUP_SCHED=y
# CONFIG_RT_GROUP_SCHED is not set
CONFIG_USER_SCHED=y
# CONFIG_CGROUP_SCHED is not set
CONFIG_SYSFS_DEPRECATED=y
CONFIG_SYSFS_DEPRECATED_V2=y
# CONFIG_RELAY is not set
CONFIG_NAMESPACES=y
# CONFIG_UTS_NS is not set
# CONFIG_IPC_NS is not set
# CONFIG_USER_NS is not set
# CONFIG_PID_NS is not set
CONFIG_BLK_DEV_INITRD=y
CONFIG_INITRAMFS_SOURCE=""
CONFIG_CC_OPTIMIZE_FOR_SIZE=y
CONFIG_SYSCTL=y
# CONFIG_EMBEDDED is not set
CONFIG_UID16=y
CONFIG_SYSCTL_SYSCALL=y
CONFIG_KALLSYMS=y
# CONFIG_KALLSYMS_EXTRA_PASS is not set
CONFIG_HOTPLUG=y
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_ELF_CORE=y
CONFIG_COMPAT_BRK=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_ANON_INODES=y
CONFIG_EPOLL=y
CONFIG_SIGNALFD=y
CONFIG_TIMERFD=y
CONFIG_EVENTFD=y
CONFIG_SHMEM=y
CONFIG_VM_EVENT_COUNTERS=y
CONFIG_SLAB=y
# CONFIG_SLUB is not set
# CONFIG_SLOB is not set
CONFIG_PROFILING=y
# CONFIG_MARKERS is not set
CONFIG_OPROFILE=m
CONFIG_HAVE_OPROFILE=y
# CONFIG_KPROBES is not set
CONFIG_HAVE_KPROBES=y
CONFIG_HAVE_KRETPROBES=y
CONFIG_PROC_PAGE_MONITOR=y
CONFIG_SLABINFO=y
CONFIG_RT_MUTEXES=y
# CONFIG_TINY_SHMEM is not set
CONFIG_BASE_SMALL=0
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
CONFIG_MODULE_FORCE_UNLOAD=y
CONFIG_MODVERSIONS=y
# CONFIG_MODULE_SRCVERSION_ALL is not set
CONFIG_KMOD=y
CONFIG_STOP_MACHINE=y
CONFIG_BLOCK=y
CONFIG_LBD=y
# CONFIG_BLK_DEV_IO_TRACE is not set
CONFIG_LSF=y
# CONFIG_BLK_DEV_BSG is not set

#
# IO Schedulers
#
CONFIG_IOSCHED_NOOP=y
CONFIG_IOSCHED_AS=y
CONFIG_IOSCHED_DEADLINE=y
CONFIG_IOSCHED_CFQ=y
# CONFIG_DEFAULT_AS is not set
# CONFIG_DEFAULT_DEADLINE is not set
CONFIG_DEFAULT_CFQ=y
# CONFIG_DEFAULT_NOOP is not set
CONFIG_DEFAULT_IOSCHED="cfq"
CONFIG_CLASSIC_RCU=y

#
# Processor type and features
#
CONFIG_TICK_ONESHOT=y
# CONFIG_NO_HZ is not set
CONFIG_HIGH_RES_TIMERS=y
CONFIG_GENERIC_CLOCKEVENTS_BUILD=y
CONFIG_SMP=y
CONFIG_X86_PC=y
# CONFIG_X86_ELAN is not set
# CONFIG_X86_VOYAGER is not set
# CONFIG_X86_NUMAQ is not set
# CONFIG_X86_SUMMIT is not set
# CONFIG_X86_BIGSMP is not set
# CONFIG_X86_VISWS is not set
# CONFIG_X86_GENERICARCH is not set
# CONFIG_X86_ES7000 is not set
# CONFIG_X86_RDC321X is not set
# CONFIG_X86_VSMP is not set
CONFIG_SCHED_NO_NO_OMIT_FRAME_POINTER=y
# CONFIG_PARAVIRT_GUEST is not set
# CONFIG_M386 is not set
# CONFIG_M486 is not set
# CONFIG_M586 is not set
# CONFIG_M586TSC is not set
# CONFIG_M586MMX is not set
# CONFIG_M686 is not set
# CONFIG_MPENTIUMII is not set
# CONFIG_MPENTIUMIII is not set
# CONFIG_MPENTIUMM is not set
# CONFIG_MPENTIUM4 is not set
# CONFIG_MK6 is not set
# CONFIG_MK7 is not set
CONFIG_MK8=y
# CONFIG_MCRUSOE is not set
# CONFIG_MEFFICEON is not set
# CONFIG_MWINCHIPC6 is not set
# CONFIG_MWINCHIP2 is not set
# CONFIG_MWINCHIP3D is not set
# CONFIG_MGEODEGX1 is not set
# CONFIG_MGEODE_LX is not set
# CONFIG_MCYRIXIII is not set
# CONFIG_MVIAC3_2 is not set
# CONFIG_MVIAC7 is not set
# CONFIG_MPSC is not set
# CONFIG_MCORE2 is not set
# CONFIG_GENERIC_CPU is not set
# CONFIG_X86_GENERIC is not set
CONFIG_X86_CMPXCHG=y
CONFIG_X86_L1_CACHE_SHIFT=6
CONFIG_X86_XADD=y
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_X86_GOOD_APIC=y
CONFIG_X86_INTEL_USERCOPY=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
CONFIG_X86_TSC=y
CONFIG_X86_MINIMUM_CPU_FAMILY=4
CONFIG_X86_DEBUGCTLMSR=y
CONFIG_HPET_TIMER=y
CONFIG_HPET_EMULATE_RTC=y
# CONFIG_IOMMU_HELPER is not set
CONFIG_NR_CPUS=4
CONFIG_SCHED_SMT=y
CONFIG_SCHED_MC=y
# CONFIG_PREEMPT_NONE is not set
CONFIG_PREEMPT_VOLUNTARY=y
# CONFIG_PREEMPT is not set
CONFIG_X86_LOCAL_APIC=y
CONFIG_X86_IO_APIC=y
CONFIG_X86_MCE=y
CONFIG_X86_MCE_NONFATAL=m
CONFIG_X86_MCE_P4THERMAL=y
CONFIG_VM86=y
# CONFIG_TOSHIBA is not set
# CONFIG_I8K is not set
# CONFIG_X86_REBOOTFIXUPS is not set
CONFIG_MICROCODE=m
CONFIG_MICROCODE_OLD_INTERFACE=y
CONFIG_X86_MSR=m
CONFIG_X86_CPUID=m
# CONFIG_NOHIGHMEM is not set
CONFIG_HIGHMEM4G=y
# CONFIG_HIGHMEM64G is not set
# CONFIG_VMSPLIT_3G is not set
# CONFIG_VMSPLIT_3G_OPT is not set
CONFIG_VMSPLIT_2G=y
# CONFIG_VMSPLIT_2G_OPT is not set
# CONFIG_VMSPLIT_1G is not set
CONFIG_PAGE_OFFSET=0x80000000
CONFIG_HIGHMEM=y
CONFIG_ARCH_FLATMEM_ENABLE=y
CONFIG_ARCH_SPARSEMEM_ENABLE=y
CONFIG_ARCH_SELECT_MEMORY_MODEL=y
CONFIG_SELECT_MEMORY_MODEL=y
CONFIG_FLATMEM_MANUAL=y
# CONFIG_DISCONTIGMEM_MANUAL is not set
# CONFIG_SPARSEMEM_MANUAL is not set
CONFIG_FLATMEM=y
CONFIG_FLAT_NODE_MEM_MAP=y
CONFIG_SPARSEMEM_STATIC=y
# CONFIG_SPARSEMEM_VMEMMAP_ENABLE is not set
CONFIG_SPLIT_PTLOCK_CPUS=4
CONFIG_RESOURCES_64BIT=y
CONFIG_ZONE_DMA_FLAG=1
CONFIG_BOUNCE=y
CONFIG_VIRT_TO_BUS=y
# CONFIG_HIGHPTE is not set
# CONFIG_MATH_EMULATION is not set
CONFIG_MTRR=y
CONFIG_IRQBALANCE=y
# CONFIG_SECCOMP is not set
# CONFIG_HZ_100 is not set
# CONFIG_HZ_250 is not set
# CONFIG_HZ_300 is not set
CONFIG_HZ_1000=y
CONFIG_HZ=1000
CONFIG_SCHED_HRTICK=y
CONFIG_KEXEC=y
# CONFIG_CRASH_DUMP is not set
CONFIG_PHYSICAL_START=0x100000
# CONFIG_RELOCATABLE is not set
CONFIG_PHYSICAL_ALIGN=0x100000
CONFIG_HOTPLUG_CPU=y
# CONFIG_COMPAT_VDSO is not set
CONFIG_ARCH_ENABLE_MEMORY_HOTPLUG=y

#
# Power management options
#
# CONFIG_PM is not set

#
# CPU Frequency scaling
#
CONFIG_CPU_FREQ=y
CONFIG_CPU_FREQ_TABLE=m
# CONFIG_CPU_FREQ_DEBUG is not set
CONFIG_CPU_FREQ_STAT=m
# CONFIG_CPU_FREQ_STAT_DETAILS is not set
CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE=y
# CONFIG_CPU_FREQ_DEFAULT_GOV_USERSPACE is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_CONSERVATIVE is not set
CONFIG_CPU_FREQ_GOV_PERFORMANCE=y
CONFIG_CPU_FREQ_GOV_POWERSAVE=m
CONFIG_CPU_FREQ_GOV_USERSPACE=m
CONFIG_CPU_FREQ_GOV_ONDEMAND=m
CONFIG_CPU_FREQ_GOV_CONSERVATIVE=m

#
# CPUFreq processor drivers
#
CONFIG_X86_POWERNOW_K6=m
CONFIG_X86_POWERNOW_K7=m
CONFIG_X86_POWERNOW_K8=m
CONFIG_X86_GX_SUSPMOD=m
CONFIG_X86_SPEEDSTEP_CENTRINO=m
CONFIG_X86_SPEEDSTEP_CENTRINO_TABLE=y
CONFIG_X86_SPEEDSTEP_ICH=m
CONFIG_X86_SPEEDSTEP_SMI=m
CONFIG_X86_P4_CLOCKMOD=m
CONFIG_X86_CPUFREQ_NFORCE2=m
CONFIG_X86_LONGRUN=m
# CONFIG_X86_E_POWERSAVER is not set

#
# shared options
#
CONFIG_X86_SPEEDSTEP_LIB=m
CONFIG_X86_SPEEDSTEP_RELAXED_CAP_CHECK=y
CONFIG_CPU_IDLE=y
CONFIG_CPU_IDLE_GOV_LADDER=y

#
# Bus options (PCI etc.)
#
CONFIG_PCI=y
# CONFIG_PCI_GOBIOS is not set
# CONFIG_PCI_GOMMCONFIG is not set
# CONFIG_PCI_GODIRECT is not set
CONFIG_PCI_GOANY=y
CONFIG_PCI_BIOS=y
CONFIG_PCI_DIRECT=y
CONFIG_PCI_DOMAINS=y
CONFIG_PCIEPORTBUS=y
CONFIG_HOTPLUG_PCI_PCIE=m
CONFIG_PCIEAER=y
CONFIG_ARCH_SUPPORTS_MSI=y
CONFIG_PCI_MSI=y
CONFIG_PCI_LEGACY=y
CONFIG_HT_IRQ=y
CONFIG_ISA_DMA_API=y
CONFIG_ISA=y
# CONFIG_EISA is not set
# CONFIG_MCA is not set
CONFIG_SCx200=m
CONFIG_SCx200HR_TIMER=m
CONFIG_K8_NB=y
CONFIG_PCCARD=m
# CONFIG_PCMCIA_DEBUG is not set
CONFIG_PCMCIA=m
CONFIG_PCMCIA_LOAD_CIS=y
CONFIG_PCMCIA_IOCTL=y
CONFIG_CARDBUS=y

#
# PC-card bridges
#
CONFIG_YENTA=m
CONFIG_YENTA_O2=y
CONFIG_YENTA_RICOH=y
CONFIG_YENTA_TI=y
CONFIG_YENTA_ENE_TUNE=y
CONFIG_YENTA_TOSHIBA=y
CONFIG_PD6729=m
CONFIG_I82092=m
CONFIG_I82365=m
CONFIG_TCIC=m
CONFIG_PCMCIA_PROBE=y
CONFIG_PCCARD_NONSTATIC=m
CONFIG_HOTPLUG_PCI=m
CONFIG_HOTPLUG_PCI_FAKE=m
CONFIG_HOTPLUG_PCI_COMPAQ=m
# CONFIG_HOTPLUG_PCI_COMPAQ_NVRAM is not set
CONFIG_HOTPLUG_PCI_IBM=m
CONFIG_HOTPLUG_PCI_CPCI=y
CONFIG_HOTPLUG_PCI_CPCI_ZT5550=m
CONFIG_HOTPLUG_PCI_CPCI_GENERIC=m
CONFIG_HOTPLUG_PCI_SHPC=m

#
# Executable file formats / Emulations
#
CONFIG_BINFMT_ELF=y
CONFIG_BINFMT_AOUT=m
CONFIG_BINFMT_MISC=m

#
# Networking
#
CONFIG_NET=y

#
# Networking options
#
CONFIG_PACKET=y
CONFIG_PACKET_MMAP=y
CONFIG_UNIX=y
CONFIG_XFRM=y
CONFIG_XFRM_USER=m
# CONFIG_XFRM_SUB_POLICY is not set
# CONFIG_XFRM_MIGRATE is not set
# CONFIG_XFRM_STATISTICS is not set
CONFIG_NET_KEY=m
# CONFIG_NET_KEY_MIGRATE is not set
CONFIG_INET=y
CONFIG_IP_MULTICAST=y
CONFIG_IP_ADVANCED_ROUTER=y
CONFIG_ASK_IP_FIB_HASH=y
# CONFIG_IP_FIB_TRIE is not set
CONFIG_IP_FIB_HASH=y
CONFIG_IP_MULTIPLE_TABLES=y
# CONFIG_IP_ROUTE_MULTIPATH is not set
CONFIG_IP_ROUTE_VERBOSE=y
# CONFIG_IP_PNP is not set
CONFIG_NET_IPIP=y
CONFIG_NET_IPGRE=m
CONFIG_NET_IPGRE_BROADCAST=y
CONFIG_IP_MROUTE=y
CONFIG_IP_PIMSM_V1=y
CONFIG_IP_PIMSM_V2=y
# CONFIG_ARPD is not set
CONFIG_SYN_COOKIES=y
CONFIG_INET_AH=m
CONFIG_INET_ESP=m
CONFIG_INET_IPCOMP=m
CONFIG_INET_XFRM_TUNNEL=m
CONFIG_INET_TUNNEL=y
# CONFIG_INET_XFRM_MODE_TRANSPORT is not set
# CONFIG_INET_XFRM_MODE_TUNNEL is not set
# CONFIG_INET_XFRM_MODE_BEET is not set
CONFIG_INET_LRO=m
CONFIG_INET_DIAG=m
CONFIG_INET_TCP_DIAG=m
# CONFIG_TCP_CONG_ADVANCED is not set
CONFIG_TCP_CONG_CUBIC=y
CONFIG_DEFAULT_TCP_CONG="cubic"
# CONFIG_TCP_MD5SIG is not set
# CONFIG_IP_VS is not set
# CONFIG_IPV6 is not set
# CONFIG_INET6_XFRM_TUNNEL is not set
# CONFIG_INET6_TUNNEL is not set
# CONFIG_NETLABEL is not set
CONFIG_NETWORK_SECMARK=y
CONFIG_NETFILTER=y
# CONFIG_NETFILTER_DEBUG is not set
CONFIG_NETFILTER_ADVANCED=y
CONFIG_BRIDGE_NETFILTER=y

#
# Core Netfilter Configuration
#
CONFIG_NETFILTER_NETLINK=m
CONFIG_NETFILTER_NETLINK_QUEUE=m
CONFIG_NETFILTER_NETLINK_LOG=m
CONFIG_NF_CONNTRACK=m
CONFIG_NF_CT_ACCT=y
CONFIG_NF_CONNTRACK_MARK=y
CONFIG_NF_CONNTRACK_SECMARK=y
CONFIG_NF_CONNTRACK_EVENTS=y
# CONFIG_NF_CT_PROTO_SCTP is not set
# CONFIG_NF_CT_PROTO_UDPLITE is not set
# CONFIG_NF_CONNTRACK_AMANDA is not set
CONFIG_NF_CONNTRACK_FTP=m
# CONFIG_NF_CONNTRACK_H323 is not set
# CONFIG_NF_CONNTRACK_IRC is not set
# CONFIG_NF_CONNTRACK_NETBIOS_NS is not set
# CONFIG_NF_CONNTRACK_PPTP is not set
# CONFIG_NF_CONNTRACK_SANE is not set
CONFIG_NF_CONNTRACK_SIP=m
CONFIG_NF_CONNTRACK_TFTP=m
CONFIG_NF_CT_NETLINK=m
CONFIG_NETFILTER_XTABLES=m
CONFIG_NETFILTER_XT_TARGET_CLASSIFY=m
CONFIG_NETFILTER_XT_TARGET_CONNMARK=m
CONFIG_NETFILTER_XT_TARGET_DSCP=m
CONFIG_NETFILTER_XT_TARGET_MARK=m
CONFIG_NETFILTER_XT_TARGET_NFQUEUE=m
CONFIG_NETFILTER_XT_TARGET_NFLOG=m
CONFIG_NETFILTER_XT_TARGET_NOTRACK=m
CONFIG_NETFILTER_XT_TARGET_RATEEST=m
# CONFIG_NETFILTER_XT_TARGET_TRACE is not set
CONFIG_NETFILTER_XT_TARGET_SECMARK=m
CONFIG_NETFILTER_XT_TARGET_CONNSECMARK=m
CONFIG_NETFILTER_XT_TARGET_TCPMSS=m
# CONFIG_NETFILTER_XT_TARGET_TCPOPTSTRIP is not set
# CONFIG_NETFILTER_XT_MATCH_COMMENT is not set
CONFIG_NETFILTER_XT_MATCH_CONNBYTES=m
CONFIG_NETFILTER_XT_MATCH_CONNLIMIT=m
CONFIG_NETFILTER_XT_MATCH_CONNMARK=m
CONFIG_NETFILTER_XT_MATCH_CONNTRACK=m
CONFIG_NETFILTER_XT_MATCH_DCCP=m
CONFIG_NETFILTER_XT_MATCH_DSCP=m
# CONFIG_NETFILTER_XT_MATCH_ESP is not set
# CONFIG_NETFILTER_XT_MATCH_HELPER is not set
CONFIG_NETFILTER_XT_MATCH_IPRANGE=m
CONFIG_NETFILTER_XT_MATCH_LENGTH=m
CONFIG_NETFILTER_XT_MATCH_LIMIT=m
CONFIG_NETFILTER_XT_MATCH_MAC=m
CONFIG_NETFILTER_XT_MATCH_MARK=m
# CONFIG_NETFILTER_XT_MATCH_OWNER is not set
# CONFIG_NETFILTER_XT_MATCH_POLICY is not set
CONFIG_NETFILTER_XT_MATCH_MULTIPORT=m
CONFIG_NETFILTER_XT_MATCH_PHYSDEV=m
CONFIG_NETFILTER_XT_MATCH_PKTTYPE=m
CONFIG_NETFILTER_XT_MATCH_QUOTA=m
CONFIG_NETFILTER_XT_MATCH_RATEEST=m
CONFIG_NETFILTER_XT_MATCH_REALM=m
CONFIG_NETFILTER_XT_MATCH_SCTP=m
CONFIG_NETFILTER_XT_MATCH_STATE=m
CONFIG_NETFILTER_XT_MATCH_LAYER7=m
# CONFIG_NETFILTER_XT_MATCH_LAYER7_DEBUG is not set
CONFIG_NETFILTER_XT_MATCH_STATISTIC=m
CONFIG_NETFILTER_XT_MATCH_STRING=m
CONFIG_NETFILTER_XT_MATCH_TCPMSS=m
# CONFIG_NETFILTER_XT_MATCH_TIME is not set
# CONFIG_NETFILTER_XT_MATCH_U32 is not set
CONFIG_NETFILTER_XT_MATCH_HASHLIMIT=m

#
# IP: Netfilter Configuration
#
CONFIG_NF_CONNTRACK_IPV4=m
CONFIG_NF_CONNTRACK_PROC_COMPAT=y
CONFIG_IP_NF_QUEUE=m
CONFIG_IP_NF_IPTABLES=m
CONFIG_IP_NF_MATCH_RECENT=m
CONFIG_IP_NF_MATCH_ECN=m
CONFIG_IP_NF_MATCH_AH=m
CONFIG_IP_NF_MATCH_TTL=m
CONFIG_IP_NF_MATCH_ADDRTYPE=m
CONFIG_IP_NF_FILTER=m
CONFIG_IP_NF_TARGET_REJECT=m
CONFIG_IP_NF_TARGET_LOG=m
CONFIG_IP_NF_TARGET_ULOG=m
# CONFIG_NF_NAT is not set
CONFIG_IP_NF_MANGLE=m
CONFIG_IP_NF_TARGET_ECN=m
CONFIG_IP_NF_TARGET_TTL=m
# CONFIG_IP_NF_TARGET_CLUSTERIP is not set
CONFIG_IP_NF_RAW=m
CONFIG_IP_NF_ARPTABLES=m
CONFIG_IP_NF_ARPFILTER=m
CONFIG_IP_NF_ARP_MANGLE=m

#
# Bridge: Netfilter Configuration
#
CONFIG_BRIDGE_NF_EBTABLES=m
CONFIG_BRIDGE_EBT_BROUTE=m
CONFIG_BRIDGE_EBT_T_FILTER=m
CONFIG_BRIDGE_EBT_T_NAT=m
CONFIG_BRIDGE_EBT_802_3=m
CONFIG_BRIDGE_EBT_AMONG=m
CONFIG_BRIDGE_EBT_ARP=m
CONFIG_BRIDGE_EBT_IP=m
CONFIG_BRIDGE_EBT_LIMIT=m
CONFIG_BRIDGE_EBT_MARK=m
CONFIG_BRIDGE_EBT_PKTTYPE=m
CONFIG_BRIDGE_EBT_STP=m
CONFIG_BRIDGE_EBT_VLAN=m
CONFIG_BRIDGE_EBT_ARPREPLY=m
CONFIG_BRIDGE_EBT_DNAT=m
CONFIG_BRIDGE_EBT_MARK_T=m
CONFIG_BRIDGE_EBT_REDIRECT=m
CONFIG_BRIDGE_EBT_SNAT=m
CONFIG_BRIDGE_EBT_LOG=m
CONFIG_BRIDGE_EBT_ULOG=m
# CONFIG_IP_DCCP is not set
# CONFIG_IP_SCTP is not set
# CONFIG_TIPC is not set
# CONFIG_ATM is not set
CONFIG_BRIDGE=m
# CONFIG_VLAN_8021Q is not set
# CONFIG_DECNET is not set
CONFIG_LLC=m
# CONFIG_LLC2 is not set
# CONFIG_IPX is not set
# CONFIG_ATALK is not set
# CONFIG_X25 is not set
# CONFIG_LAPB is not set
# CONFIG_ECONET is not set
CONFIG_WAN_ROUTER=m
CONFIG_NET_SCHED=y

#
# Queueing/Scheduling
#
CONFIG_NET_SCH_CBQ=m
CONFIG_NET_SCH_HTB=m
CONFIG_NET_SCH_HFSC=m
CONFIG_NET_SCH_PRIO=m
# CONFIG_NET_SCH_RR is not set
CONFIG_NET_SCH_RED=m
CONFIG_NET_SCH_SFQ=m
CONFIG_NET_SCH_TEQL=m
CONFIG_NET_SCH_TBF=m
CONFIG_NET_SCH_GRED=m
CONFIG_NET_SCH_DSMARK=m
CONFIG_NET_SCH_NETEM=m
CONFIG_NET_SCH_INGRESS=m

#
# Classification
#
CONFIG_NET_CLS=y
CONFIG_NET_CLS_BASIC=m
CONFIG_NET_CLS_TCINDEX=m
CONFIG_NET_CLS_ROUTE4=m
CONFIG_NET_CLS_ROUTE=y
CONFIG_NET_CLS_FW=m
CONFIG_NET_CLS_U32=m
CONFIG_CLS_U32_PERF=y
CONFIG_CLS_U32_MARK=y
CONFIG_NET_CLS_RSVP=m
CONFIG_NET_CLS_RSVP6=m
# CONFIG_NET_CLS_FLOW is not set
CONFIG_NET_EMATCH=y
CONFIG_NET_EMATCH_STACK=32
CONFIG_NET_EMATCH_CMP=m
CONFIG_NET_EMATCH_NBYTE=m
CONFIG_NET_EMATCH_U32=m
CONFIG_NET_EMATCH_META=m
CONFIG_NET_EMATCH_TEXT=m
CONFIG_NET_CLS_ACT=y
CONFIG_NET_ACT_POLICE=m
CONFIG_NET_ACT_GACT=m
CONFIG_GACT_PROB=y
CONFIG_NET_ACT_MIRRED=m
CONFIG_NET_ACT_IPT=m
# CONFIG_NET_ACT_NAT is not set
CONFIG_NET_ACT_PEDIT=m
CONFIG_NET_ACT_SIMP=m
CONFIG_NET_CLS_IND=y
CONFIG_NET_SCH_FIFO=y

#
# Network testing
#
# CONFIG_NET_PKTGEN is not set
# CONFIG_HAMRADIO is not set
# CONFIG_CAN is not set
# CONFIG_IRDA is not set
# CONFIG_BT is not set
CONFIG_AF_RXRPC=m
# CONFIG_AF_RXRPC_DEBUG is not set
# CONFIG_RXKAD is not set
CONFIG_FIB_RULES=y

#
# Wireless
#
# CONFIG_CFG80211 is not set
# CONFIG_WIRELESS_EXT is not set
# CONFIG_MAC80211 is not set
# CONFIG_IEEE80211 is not set
# CONFIG_RFKILL is not set
# CONFIG_NET_9P is not set

#
# Device Drivers
#

#
# Generic Driver Options
#
CONFIG_UEVENT_HELPER_PATH="/sbin/hotplug"
CONFIG_STANDALONE=y
CONFIG_PREVENT_FIRMWARE_BUILD=y
CONFIG_FW_LOADER=m
# CONFIG_SYS_HYPERVISOR is not set
CONFIG_CONNECTOR=m
CONFIG_MTD=m
# CONFIG_MTD_DEBUG is not set
CONFIG_MTD_CONCAT=m
CONFIG_MTD_PARTITIONS=y
CONFIG_MTD_REDBOOT_PARTS=m
CONFIG_MTD_REDBOOT_DIRECTORY_BLOCK=-1
# CONFIG_MTD_REDBOOT_PARTS_UNALLOCATED is not set
# CONFIG_MTD_REDBOOT_PARTS_READONLY is not set

#
# User Modules And Translation Layers
#
CONFIG_MTD_CHAR=m
CONFIG_MTD_BLKDEVS=m
CONFIG_MTD_BLOCK=m
CONFIG_MTD_BLOCK_RO=m
CONFIG_FTL=m
CONFIG_NFTL=m
CONFIG_NFTL_RW=y
CONFIG_INFTL=m
CONFIG_RFD_FTL=m
# CONFIG_SSFDC is not set
# CONFIG_MTD_OOPS is not set

#
# RAM/ROM/Flash chip drivers
#
CONFIG_MTD_CFI=m
CONFIG_MTD_JEDECPROBE=m
CONFIG_MTD_GEN_PROBE=m
# CONFIG_MTD_CFI_ADV_OPTIONS is not set
CONFIG_MTD_MAP_BANK_WIDTH_1=y
CONFIG_MTD_MAP_BANK_WIDTH_2=y
CONFIG_MTD_MAP_BANK_WIDTH_4=y
# CONFIG_MTD_MAP_BANK_WIDTH_8 is not set
# CONFIG_MTD_MAP_BANK_WIDTH_16 is not set
# CONFIG_MTD_MAP_BANK_WIDTH_32 is not set
CONFIG_MTD_CFI_I1=y
CONFIG_MTD_CFI_I2=y
# CONFIG_MTD_CFI_I4 is not set
# CONFIG_MTD_CFI_I8 is not set
CONFIG_MTD_CFI_INTELEXT=m
CONFIG_MTD_CFI_AMDSTD=m
CONFIG_MTD_CFI_STAA=m
CONFIG_MTD_CFI_UTIL=m
CONFIG_MTD_RAM=m
CONFIG_MTD_ROM=m
CONFIG_MTD_ABSENT=m

#
# Mapping drivers for chip access
#
CONFIG_MTD_COMPLEX_MAPPINGS=y
CONFIG_MTD_PHYSMAP=m
CONFIG_MTD_PHYSMAP_START=0x8000000
CONFIG_MTD_PHYSMAP_LEN=0x4000000
CONFIG_MTD_PHYSMAP_BANKWIDTH=2
CONFIG_MTD_SC520CDP=m
CONFIG_MTD_NETSC520=m
CONFIG_MTD_TS5500=m
CONFIG_MTD_SBC_GXX=m
CONFIG_MTD_SCx200_DOCFLASH=m
# CONFIG_MTD_AMD76XROM is not set
# CONFIG_MTD_ICHXROM is not set
# CONFIG_MTD_ESB2ROM is not set
# CONFIG_MTD_CK804XROM is not set
# CONFIG_MTD_SCB2_FLASH is not set
CONFIG_MTD_NETtel=m
CONFIG_MTD_DILNETPC=m
CONFIG_MTD_DILNETPC_BOOTSIZE=0x80000
# CONFIG_MTD_L440GX is not set
CONFIG_MTD_PCI=m
# CONFIG_MTD_INTEL_VR_NOR is not set
CONFIG_MTD_PLATRAM=m

#
# Self-contained MTD device drivers
#
CONFIG_MTD_PMC551=m
# CONFIG_MTD_PMC551_BUGFIX is not set
# CONFIG_MTD_PMC551_DEBUG is not set
CONFIG_MTD_DATAFLASH=m
CONFIG_MTD_M25P80=m
CONFIG_MTD_SLRAM=m
CONFIG_MTD_PHRAM=m
CONFIG_MTD_MTDRAM=m
CONFIG_MTDRAM_TOTAL_SIZE=4096
CONFIG_MTDRAM_ERASE_SIZE=128
CONFIG_MTD_BLOCK2MTD=m

#
# Disk-On-Chip Device Drivers
#
CONFIG_MTD_DOC2000=m
CONFIG_MTD_DOC2001=m
CONFIG_MTD_DOC2001PLUS=m
CONFIG_MTD_DOCPROBE=m
CONFIG_MTD_DOCECC=m
# CONFIG_MTD_DOCPROBE_ADVANCED is not set
CONFIG_MTD_DOCPROBE_ADDRESS=0
CONFIG_MTD_NAND=m
# CONFIG_MTD_NAND_VERIFY_WRITE is not set
# CONFIG_MTD_NAND_ECC_SMC is not set
# CONFIG_MTD_NAND_MUSEUM_IDS is not set
CONFIG_MTD_NAND_IDS=m
CONFIG_MTD_NAND_DISKONCHIP=m
# CONFIG_MTD_NAND_DISKONCHIP_PROBE_ADVANCED is not set
CONFIG_MTD_NAND_DISKONCHIP_PROBE_ADDRESS=0
# CONFIG_MTD_NAND_DISKONCHIP_BBTWRITE is not set
# CONFIG_MTD_NAND_CAFE is not set
CONFIG_MTD_NAND_CS553X=m
# CONFIG_MTD_NAND_NANDSIM is not set
# CONFIG_MTD_NAND_PLATFORM is not set
# CONFIG_MTD_ALAUDA is not set
CONFIG_MTD_ONENAND=m
CONFIG_MTD_ONENAND_VERIFY_WRITE=y
# CONFIG_MTD_ONENAND_OTP is not set
# CONFIG_MTD_ONENAND_2X_PROGRAM is not set
# CONFIG_MTD_ONENAND_SIM is not set

#
# UBI - Unsorted block images
#
# CONFIG_MTD_UBI is not set
# CONFIG_PARPORT is not set
CONFIG_PNP=y
# CONFIG_PNP_DEBUG is not set

#
# Protocols
#
CONFIG_ISAPNP=y
CONFIG_PNPBIOS=y
CONFIG_PNPBIOS_PROC_FS=y
# CONFIG_PNPACPI is not set
CONFIG_BLK_DEV=y
CONFIG_BLK_DEV_FD=m
CONFIG_BLK_DEV_XD=m
CONFIG_BLK_CPQ_DA=m
CONFIG_BLK_CPQ_CISS_DA=m
CONFIG_CISS_SCSI_TAPE=y
CONFIG_BLK_DEV_DAC960=m
CONFIG_BLK_DEV_UMEM=m
# CONFIG_BLK_DEV_COW_COMMON is not set
CONFIG_BLK_DEV_LOOP=m
CONFIG_BLK_DEV_CRYPTOLOOP=m
CONFIG_BLK_DEV_NBD=m
CONFIG_BLK_DEV_SX8=m
# CONFIG_BLK_DEV_UB is not set
CONFIG_BLK_DEV_RAM=y
CONFIG_BLK_DEV_RAM_COUNT=16
CONFIG_BLK_DEV_RAM_SIZE=8192
# CONFIG_BLK_DEV_XIP is not set
CONFIG_CDROM_PKTCDVD=m
CONFIG_CDROM_PKTCDVD_BUFFERS=8
# CONFIG_CDROM_PKTCDVD_WCACHE is not set
CONFIG_ATA_OVER_ETH=m
CONFIG_MISC_DEVICES=y
CONFIG_IBM_ASM=m
# CONFIG_PHANTOM is not set
# CONFIG_EEPROM_93CX6 is not set
# CONFIG_SGI_IOC4 is not set
# CONFIG_TIFM_CORE is not set
# CONFIG_ENCLOSURE_SERVICES is not set
CONFIG_HAVE_IDE=y
CONFIG_IDE=m
CONFIG_BLK_DEV_IDE=m

#
# Please see Documentation/ide/ide.txt for help/info on IDE drives
#
# CONFIG_BLK_DEV_IDE_SATA is not set
# CONFIG_BLK_DEV_HD_IDE is not set
CONFIG_BLK_DEV_IDEDISK=m
# CONFIG_IDEDISK_MULTI_MODE is not set
CONFIG_BLK_DEV_IDECS=m
# CONFIG_BLK_DEV_DELKIN is not set
CONFIG_BLK_DEV_IDECD=m
CONFIG_BLK_DEV_IDECD_VERBOSE_ERRORS=y
CONFIG_BLK_DEV_IDETAPE=m
CONFIG_BLK_DEV_IDEFLOPPY=m
# CONFIG_BLK_DEV_IDESCSI is not set
# CONFIG_IDE_TASK_IOCTL is not set
CONFIG_IDE_PROC_FS=y

#
# IDE chipset support/bugfixes
#
CONFIG_IDE_GENERIC=m
# CONFIG_BLK_DEV_PLATFORM is not set
CONFIG_BLK_DEV_CMD640=m
# CONFIG_BLK_DEV_CMD640_ENHANCED is not set
CONFIG_BLK_DEV_IDEPNP=m
CONFIG_BLK_DEV_IDEDMA_SFF=y

#
# PCI IDE chipsets support
#
CONFIG_BLK_DEV_IDEPCI=y
# CONFIG_BLK_DEV_OFFBOARD is not set
CONFIG_BLK_DEV_GENERIC=m
CONFIG_BLK_DEV_OPTI621=m
CONFIG_BLK_DEV_RZ1000=m
CONFIG_BLK_DEV_IDEDMA_PCI=y
CONFIG_BLK_DEV_AEC62XX=m
CONFIG_BLK_DEV_ALI15X3=m
# CONFIG_WDC_ALI15X3 is not set
CONFIG_BLK_DEV_AMD74XX=m
CONFIG_BLK_DEV_ATIIXP=m
CONFIG_BLK_DEV_CMD64X=m
CONFIG_BLK_DEV_TRIFLEX=m
CONFIG_BLK_DEV_CY82C693=m
CONFIG_BLK_DEV_CS5520=m
CONFIG_BLK_DEV_CS5530=m
CONFIG_BLK_DEV_CS5535=m
CONFIG_BLK_DEV_HPT34X=m
# CONFIG_HPT34X_AUTODMA is not set
CONFIG_BLK_DEV_HPT366=m
CONFIG_BLK_DEV_JMICRON=m
CONFIG_BLK_DEV_SC1200=m
CONFIG_BLK_DEV_PIIX=m
# CONFIG_BLK_DEV_IT8213 is not set
CONFIG_BLK_DEV_IT821X=m
CONFIG_BLK_DEV_NS87415=m
CONFIG_BLK_DEV_PDC202XX_OLD=m
CONFIG_BLK_DEV_PDC202XX_NEW=m
CONFIG_BLK_DEV_SVWKS=m
CONFIG_BLK_DEV_SIIMAGE=m
CONFIG_BLK_DEV_SIS5513=m
CONFIG_BLK_DEV_SLC90E66=m
CONFIG_BLK_DEV_TRM290=m
CONFIG_BLK_DEV_VIA82CXXX=m
# CONFIG_BLK_DEV_TC86C001 is not set

#
# Other IDE chipsets support
#

#
# Note: most of these also require special kernel boot parameters
#
# CONFIG_BLK_DEV_4DRIVES is not set
# CONFIG_BLK_DEV_ALI14XX is not set
# CONFIG_BLK_DEV_DTC2278 is not set
# CONFIG_BLK_DEV_HT6560B is not set
# CONFIG_BLK_DEV_QD65XX is not set
# CONFIG_BLK_DEV_UMC8672 is not set
CONFIG_BLK_DEV_IDEDMA=y
CONFIG_IDE_ARCH_OBSOLETE_INIT=y
# CONFIG_BLK_DEV_HD is not set

#
# SCSI device support
#
CONFIG_RAID_ATTRS=m
CONFIG_SCSI=y
CONFIG_SCSI_DMA=y
# CONFIG_SCSI_TGT is not set
CONFIG_SCSI_NETLINK=y
CONFIG_SCSI_PROC_FS=y

#
# SCSI support type (disk, tape, CD-ROM)
#
CONFIG_BLK_DEV_SD=m
CONFIG_CHR_DEV_ST=m
CONFIG_CHR_DEV_OSST=m
CONFIG_BLK_DEV_SR=m
CONFIG_BLK_DEV_SR_VENDOR=y
CONFIG_CHR_DEV_SG=m
CONFIG_CHR_DEV_SCH=m

#
# Some SCSI devices (e.g. CD jukebox) support multiple LUNs
#
CONFIG_SCSI_MULTI_LUN=y
CONFIG_SCSI_CONSTANTS=y
CONFIG_SCSI_LOGGING=y
# CONFIG_SCSI_SCAN_ASYNC is not set
CONFIG_SCSI_WAIT_SCAN=m

#
# SCSI Transports
#
CONFIG_SCSI_SPI_ATTRS=m
CONFIG_SCSI_FC_ATTRS=m
CONFIG_SCSI_ISCSI_ATTRS=m
# CONFIG_SCSI_SAS_LIBSAS is not set
CONFIG_SCSI_SRP_ATTRS=m
CONFIG_SCSI_LOWLEVEL=y
CONFIG_ISCSI_TCP=m
CONFIG_BLK_DEV_3W_XXXX_RAID=m
CONFIG_SCSI_3W_9XXX=m
CONFIG_SCSI_7000FASST=m
CONFIG_SCSI_ACARD=m
CONFIG_SCSI_AHA152X=m
CONFIG_SCSI_AHA1542=m
CONFIG_SCSI_AACRAID=m
CONFIG_SCSI_AIC7XXX=m
CONFIG_AIC7XXX_CMDS_PER_DEVICE=8
CONFIG_AIC7XXX_RESET_DELAY_MS=15000
CONFIG_AIC7XXX_DEBUG_ENABLE=y
CONFIG_AIC7XXX_DEBUG_MASK=0
CONFIG_AIC7XXX_REG_PRETTY_PRINT=y
CONFIG_SCSI_AIC7XXX_OLD=m
CONFIG_SCSI_AIC79XX=m
CONFIG_AIC79XX_CMDS_PER_DEVICE=32
CONFIG_AIC79XX_RESET_DELAY_MS=15000
CONFIG_AIC79XX_DEBUG_ENABLE=y
CONFIG_AIC79XX_DEBUG_MASK=0
CONFIG_AIC79XX_REG_PRETTY_PRINT=y
# CONFIG_SCSI_AIC94XX is not set
CONFIG_SCSI_DPT_I2O=m
CONFIG_SCSI_ADVANSYS=m
CONFIG_SCSI_IN2000=m
CONFIG_SCSI_ARCMSR=m
# CONFIG_SCSI_ARCMSR_AER is not set
CONFIG_MEGARAID_NEWGEN=y
CONFIG_MEGARAID_MM=m
CONFIG_MEGARAID_MAILBOX=m
CONFIG_MEGARAID_LEGACY=m
CONFIG_MEGARAID_SAS=m
CONFIG_SCSI_HPTIOP=m
CONFIG_SCSI_BUSLOGIC=m
# CONFIG_SCSI_OMIT_FLASHPOINT is not set
CONFIG_SCSI_DMX3191D=m
CONFIG_SCSI_DTC3280=m
CONFIG_SCSI_EATA=m
CONFIG_SCSI_EATA_TAGGED_QUEUE=y
CONFIG_SCSI_EATA_LINKED_COMMANDS=y
CONFIG_SCSI_EATA_MAX_TAGS=16
CONFIG_SCSI_FUTURE_DOMAIN=m
CONFIG_SCSI_GDTH=m
CONFIG_SCSI_GENERIC_NCR5380=m
CONFIG_SCSI_GENERIC_NCR5380_MMIO=m
CONFIG_SCSI_GENERIC_NCR53C400=y
CONFIG_SCSI_IPS=m
CONFIG_SCSI_INITIO=m
# CONFIG_SCSI_INIA100 is not set
# CONFIG_SCSI_MVSAS is not set
CONFIG_SCSI_NCR53C406A=m
# CONFIG_SCSI_STEX is not set
CONFIG_SCSI_SYM53C8XX_2=m
CONFIG_SCSI_SYM53C8XX_DMA_ADDRESSING_MODE=1
CONFIG_SCSI_SYM53C8XX_DEFAULT_TAGS=16
CONFIG_SCSI_SYM53C8XX_MAX_TAGS=64
CONFIG_SCSI_SYM53C8XX_MMIO=y
# CONFIG_SCSI_IPR is not set
CONFIG_SCSI_PAS16=m
CONFIG_SCSI_QLOGIC_FAS=m
CONFIG_SCSI_QLOGIC_1280=m
CONFIG_SCSI_QLA_FC=m
# CONFIG_SCSI_QLA_ISCSI is not set
CONFIG_SCSI_LPFC=m
CONFIG_SCSI_SYM53C416=m
CONFIG_SCSI_DC395x=m
CONFIG_SCSI_DC390T=m
CONFIG_SCSI_T128=m
CONFIG_SCSI_U14_34F=m
CONFIG_SCSI_U14_34F_TAGGED_QUEUE=y
CONFIG_SCSI_U14_34F_LINKED_COMMANDS=y
CONFIG_SCSI_U14_34F_MAX_TAGS=8
CONFIG_SCSI_ULTRASTOR=m
CONFIG_SCSI_NSP32=m
CONFIG_SCSI_DEBUG=m
# CONFIG_SCSI_SRP is not set
# CONFIG_SCSI_LOWLEVEL_PCMCIA is not set
CONFIG_ATA=m
# CONFIG_ATA_NONSTANDARD is not set
# CONFIG_SATA_AHCI is not set
CONFIG_SATA_SVW=m
# CONFIG_ATA_PIIX is not set
# CONFIG_SATA_MV is not set
# CONFIG_SATA_NV is not set
# CONFIG_PDC_ADMA is not set
# CONFIG_SATA_QSTOR is not set
# CONFIG_SATA_PROMISE is not set
# CONFIG_SATA_SX4 is not set
# CONFIG_SATA_SIL is not set
# CONFIG_SATA_SIL24 is not set
# CONFIG_SATA_SIS is not set
# CONFIG_SATA_ULI is not set
# CONFIG_SATA_VIA is not set
# CONFIG_SATA_VITESSE is not set
# CONFIG_SATA_INIC162X is not set
# CONFIG_PATA_ALI is not set
# CONFIG_PATA_AMD is not set
# CONFIG_PATA_ARTOP is not set
# CONFIG_PATA_ATIIXP is not set
# CONFIG_PATA_CMD640_PCI is not set
# CONFIG_PATA_CMD64X is not set
# CONFIG_PATA_CS5520 is not set
# CONFIG_PATA_CS5530 is not set
# CONFIG_PATA_CS5535 is not set
# CONFIG_PATA_CS5536 is not set
# CONFIG_PATA_CYPRESS is not set
# CONFIG_PATA_EFAR is not set
# CONFIG_ATA_GENERIC is not set
# CONFIG_PATA_HPT366 is not set
# CONFIG_PATA_HPT37X is not set
# CONFIG_PATA_HPT3X2N is not set
# CONFIG_PATA_HPT3X3 is not set
# CONFIG_PATA_ISAPNP is not set
# CONFIG_PATA_IT821X is not set
# CONFIG_PATA_IT8213 is not set
# CONFIG_PATA_JMICRON is not set
# CONFIG_PATA_LEGACY is not set
# CONFIG_PATA_TRIFLEX is not set
# CONFIG_PATA_MARVELL is not set
# CONFIG_PATA_MPIIX is not set
# CONFIG_PATA_OLDPIIX is not set
# CONFIG_PATA_NETCELL is not set
# CONFIG_PATA_NINJA32 is not set
# CONFIG_PATA_NS87410 is not set
# CONFIG_PATA_NS87415 is not set
# CONFIG_PATA_OPTI is not set
# CONFIG_PATA_OPTIDMA is not set
# CONFIG_PATA_PCMCIA is not set
# CONFIG_PATA_PDC_OLD is not set
# CONFIG_PATA_QDI is not set
# CONFIG_PATA_RADISYS is not set
# CONFIG_PATA_RZ1000 is not set
# CONFIG_PATA_SC1200 is not set
# CONFIG_PATA_SERVERWORKS is not set
# CONFIG_PATA_PDC2027X is not set
# CONFIG_PATA_SIL680 is not set
# CONFIG_PATA_SIS is not set
# CONFIG_PATA_VIA is not set
# CONFIG_PATA_WINBOND is not set
# CONFIG_PATA_WINBOND_VLB is not set
CONFIG_MD=y
CONFIG_BLK_DEV_MD=m
CONFIG_MD_LINEAR=m
CONFIG_MD_RAID0=m
CONFIG_MD_RAID1=m
CONFIG_MD_RAID10=m
CONFIG_MD_RAID456=m
CONFIG_MD_RAID5_RESHAPE=y
CONFIG_MD_MULTIPATH=m
CONFIG_MD_FAULTY=m
CONFIG_BLK_DEV_DM=m
# CONFIG_DM_DEBUG is not set
CONFIG_DM_CRYPT=m
CONFIG_DM_SNAPSHOT=m
CONFIG_DM_MIRROR=m
CONFIG_DM_ZERO=m
CONFIG_DM_MULTIPATH=m
CONFIG_DM_MULTIPATH_EMC=m
# CONFIG_DM_MULTIPATH_RDAC is not set
# CONFIG_DM_MULTIPATH_HP is not set
# CONFIG_DM_DELAY is not set
# CONFIG_DM_UEVENT is not set
# CONFIG_FUSION is not set

#
# IEEE 1394 (FireWire) support
#
# CONFIG_FIREWIRE is not set
CONFIG_IEEE1394=m

#
# Subsystem Options
#
# CONFIG_IEEE1394_VERBOSEDEBUG is not set

#
# Controllers
#
CONFIG_IEEE1394_PCILYNX=m
CONFIG_IEEE1394_OHCI1394=m

#
# Protocols
#
CONFIG_IEEE1394_VIDEO1394=m
CONFIG_IEEE1394_SBP2=m
# CONFIG_IEEE1394_SBP2_PHYS_DMA is not set
CONFIG_IEEE1394_ETH1394_ROM_ENTRY=y
CONFIG_IEEE1394_ETH1394=m
CONFIG_IEEE1394_DV1394=m
CONFIG_IEEE1394_RAWIO=m
CONFIG_I2O=m
CONFIG_I2O_LCT_NOTIFY_ON_CHANGES=y
CONFIG_I2O_EXT_ADAPTEC=y
CONFIG_I2O_CONFIG=m
CONFIG_I2O_CONFIG_OLD_IOCTL=y
CONFIG_I2O_BUS=m
CONFIG_I2O_BLOCK=m
CONFIG_I2O_SCSI=m
CONFIG_I2O_PROC=m
# CONFIG_MACINTOSH_DRIVERS is not set
CONFIG_NETDEVICES=y
# CONFIG_NETDEVICES_MULTIQUEUE is not set
CONFIG_IFB=m
CONFIG_DUMMY=m
CONFIG_BONDING=m
# CONFIG_MACVLAN is not set
# CONFIG_EQUALIZER is not set
CONFIG_TUN=m
# CONFIG_VETH is not set
CONFIG_NET_SB1000=m
CONFIG_ARCNET=m
CONFIG_ARCNET_1201=m
CONFIG_ARCNET_1051=m
CONFIG_ARCNET_RAW=m
CONFIG_ARCNET_CAP=m
CONFIG_ARCNET_COM90xx=m
CONFIG_ARCNET_COM90xxIO=m
CONFIG_ARCNET_RIM_I=m
CONFIG_ARCNET_COM20020=m
CONFIG_ARCNET_COM20020_ISA=m
CONFIG_ARCNET_COM20020_PCI=m
CONFIG_PHYLIB=m

#
# MII PHY device drivers
#
CONFIG_MARVELL_PHY=m
CONFIG_DAVICOM_PHY=m
CONFIG_QSEMI_PHY=m
CONFIG_LXT_PHY=m
CONFIG_CICADA_PHY=m
CONFIG_VITESSE_PHY=m
CONFIG_SMSC_PHY=m
# CONFIG_BROADCOM_PHY is not set
# CONFIG_ICPLUS_PHY is not set
# CONFIG_REALTEK_PHY is not set
# CONFIG_MDIO_BITBANG is not set
CONFIG_NET_ETHERNET=y
CONFIG_MII=m
CONFIG_HAPPYMEAL=m
CONFIG_SUNGEM=m
CONFIG_CASSINI=m
CONFIG_NET_VENDOR_3COM=y
CONFIG_EL1=m
CONFIG_EL2=m
CONFIG_ELPLUS=m
CONFIG_EL16=m
CONFIG_EL3=m
CONFIG_3C515=m
CONFIG_VORTEX=m
CONFIG_TYPHOON=m
CONFIG_LANCE=m
CONFIG_NET_VENDOR_SMC=y
CONFIG_WD80x3=m
CONFIG_ULTRA=m
CONFIG_SMC9194=m
# CONFIG_ENC28J60 is not set
CONFIG_NET_VENDOR_RACAL=y
CONFIG_NI52=m
CONFIG_NI65=m
CONFIG_NET_TULIP=y
CONFIG_DE2104X=m
CONFIG_TULIP=m
# CONFIG_TULIP_MWI is not set
# CONFIG_TULIP_MMIO is not set
CONFIG_TULIP_NAPI=y
CONFIG_TULIP_NAPI_HW_MITIGATION=y
CONFIG_DE4X5=m
CONFIG_WINBOND_840=m
CONFIG_DM9102=m
CONFIG_ULI526X=m
CONFIG_PCMCIA_XIRCOM=m
CONFIG_AT1700=m
CONFIG_DEPCA=m
CONFIG_HP100=m
CONFIG_NET_ISA=y
CONFIG_E2100=m
CONFIG_EWRK3=m
CONFIG_EEXPRESS=m
CONFIG_EEXPRESS_PRO=m
CONFIG_HPLAN_PLUS=m
CONFIG_HPLAN=m
CONFIG_LP486E=m
CONFIG_ETH16I=m
CONFIG_NE2000=m
CONFIG_ZNET=m
CONFIG_SEEQ8005=m
# CONFIG_IBM_NEW_EMAC_ZMII is not set
# CONFIG_IBM_NEW_EMAC_RGMII is not set
# CONFIG_IBM_NEW_EMAC_TAH is not set
# CONFIG_IBM_NEW_EMAC_EMAC4 is not set
CONFIG_NET_PCI=y
CONFIG_PCNET32=m
# CONFIG_PCNET32_NAPI is not set
CONFIG_AMD8111_ETH=m
CONFIG_AMD8111E_NAPI=y
CONFIG_ADAPTEC_STARFIRE=m
CONFIG_ADAPTEC_STARFIRE_NAPI=y
CONFIG_AC3200=m
CONFIG_APRICOT=m
CONFIG_B44=m
CONFIG_B44_PCI_AUTOSELECT=y
CONFIG_B44_PCICORE_AUTOSELECT=y
CONFIG_B44_PCI=y
CONFIG_FORCEDETH=m
# CONFIG_FORCEDETH_NAPI is not set
CONFIG_CS89x0=m
CONFIG_EEPRO100=m
CONFIG_E100=m
CONFIG_FEALNX=m
CONFIG_NATSEMI=m
CONFIG_NE2K_PCI=m
CONFIG_8139CP=m
CONFIG_8139TOO=m
CONFIG_8139TOO_PIO=y
CONFIG_8139TOO_TUNE_TWISTER=y
CONFIG_8139TOO_8129=y
# CONFIG_8139_OLD_RX_RESET is not set
# CONFIG_R6040 is not set
CONFIG_SIS900=m
CONFIG_EPIC100=m
CONFIG_SUNDANCE=m
# CONFIG_SUNDANCE_MMIO is not set
CONFIG_TLAN=m
CONFIG_VIA_RHINE=m
# CONFIG_VIA_RHINE_MMIO is not set
CONFIG_VIA_RHINE_NAPI=y
# CONFIG_SC92031 is not set
CONFIG_NETDEV_1000=y
CONFIG_ACENIC=m
# CONFIG_ACENIC_OMIT_TIGON_I is not set
CONFIG_DL2K=m
CONFIG_E1000=m
CONFIG_E1000_NAPI=y
# CONFIG_E1000_DISABLE_PACKET_SPLIT is not set
# CONFIG_E1000E is not set
# CONFIG_E1000E_ENABLED is not set
# CONFIG_IP1000 is not set
# CONFIG_IGB is not set
CONFIG_NS83820=m
CONFIG_HAMACHI=m
CONFIG_YELLOWFIN=m
CONFIG_R8169=m
CONFIG_R8169_NAPI=y
CONFIG_SIS190=m
CONFIG_SKGE=m
# CONFIG_SKGE_DEBUG is not set
CONFIG_SKY2=m
# CONFIG_SKY2_DEBUG is not set
# CONFIG_SK98LIN is not set
CONFIG_VIA_VELOCITY=m
CONFIG_TIGON3=m
CONFIG_BNX2=m
# CONFIG_QLA3XXX is not set
# CONFIG_ATL1 is not set
CONFIG_NETDEV_10000=y
CONFIG_CHELSIO_T1=m
# CONFIG_CHELSIO_T1_1G is not set
CONFIG_CHELSIO_T1_NAPI=y
# CONFIG_CHELSIO_T3 is not set
# CONFIG_IXGBE is not set
CONFIG_IXGB=m
CONFIG_IXGB_NAPI=y
CONFIG_S2IO=m
CONFIG_S2IO_NAPI=y
CONFIG_MYRI10GE=m
# CONFIG_NETXEN_NIC is not set
# CONFIG_NIU is not set
# CONFIG_MLX4_CORE is not set
# CONFIG_TEHUTI is not set
# CONFIG_BNX2X is not set
# CONFIG_TR is not set

#
# Wireless LAN
#
# CONFIG_WLAN_PRE80211 is not set
# CONFIG_WLAN_80211 is not set

#
# USB Network Adapters
#
CONFIG_USB_CATC=m
CONFIG_USB_KAWETH=m
CONFIG_USB_PEGASUS=m
CONFIG_USB_RTL8150=m
CONFIG_USB_USBNET=m
CONFIG_USB_NET_AX8817X=m
CONFIG_USB_NET_CDCETHER=m
# CONFIG_USB_NET_DM9601 is not set
CONFIG_USB_NET_GL620A=m
CONFIG_USB_NET_NET1080=m
CONFIG_USB_NET_PLUSB=m
# CONFIG_USB_NET_MCS7830 is not set
CONFIG_USB_NET_RNDIS_HOST=m
CONFIG_USB_NET_CDC_SUBSET=m
CONFIG_USB_ALI_M5632=y
CONFIG_USB_AN2720=y
CONFIG_USB_BELKIN=y
CONFIG_USB_ARMLINUX=y
CONFIG_USB_EPSON2888=y
# CONFIG_USB_KC2190 is not set
CONFIG_USB_NET_ZAURUS=m
CONFIG_NET_PCMCIA=y
CONFIG_PCMCIA_3C589=m
CONFIG_PCMCIA_3C574=m
CONFIG_PCMCIA_FMVJ18X=m
CONFIG_PCMCIA_PCNET=m
CONFIG_PCMCIA_NMCLAN=m
CONFIG_PCMCIA_SMC91C92=m
CONFIG_PCMCIA_XIRC2PS=m
CONFIG_PCMCIA_AXNET=m
CONFIG_ARCNET_COM20020_CS=m
CONFIG_WAN=y
CONFIG_HOSTESS_SV11=m
CONFIG_COSA=m
CONFIG_LANMEDIA=m
CONFIG_SEALEVEL_4021=m
CONFIG_HDLC=m
CONFIG_HDLC_RAW=m
CONFIG_HDLC_RAW_ETH=m
CONFIG_HDLC_CISCO=m
CONFIG_HDLC_FR=m

#
# X.25/LAPB support is disabled
#
CONFIG_PCI200SYN=m
CONFIG_WANXL=m
CONFIG_PC300=m

#
# Cyclades-PC300 MLPPP support is disabled.
#

#
# Refer to the file README.mlppp, provided by PC300 package.
#
# CONFIG_PC300TOO is not set
CONFIG_N2=m
CONFIG_C101=m
CONFIG_FARSYNC=m
CONFIG_DSCC4=m
CONFIG_DSCC4_PCISYNC=y
CONFIG_DSCC4_PCI_RST=y
CONFIG_DLCI=m
CONFIG_DLCI_MAX=8
CONFIG_SDLA=m
CONFIG_WAN_ROUTER_DRIVERS=m
CONFIG_CYCLADES_SYNC=m
CONFIG_CYCLOMX_X25=y
CONFIG_SBNI=m
# CONFIG_SBNI_MULTILINE is not set
CONFIG_FDDI=y
CONFIG_DEFXX=m
# CONFIG_DEFXX_MMIO is not set
CONFIG_SKFP=m
CONFIG_HIPPI=y
CONFIG_ROADRUNNER=m
# CONFIG_ROADRUNNER_LARGE_RINGS is not set
# CONFIG_PPP is not set
# CONFIG_SLIP is not set
# CONFIG_NET_FC is not set
CONFIG_NETCONSOLE=m
# CONFIG_NETCONSOLE_DYNAMIC is not set
CONFIG_NETPOLL=y
# CONFIG_NETPOLL_TRAP is not set
CONFIG_NET_POLL_CONTROLLER=y
# CONFIG_ISDN is not set
# CONFIG_PHONE is not set

#
# Input device support
#
CONFIG_INPUT=y
# CONFIG_INPUT_FF_MEMLESS is not set
CONFIG_INPUT_POLLDEV=m

#
# Userland interfaces
#
CONFIG_INPUT_MOUSEDEV=y
CONFIG_INPUT_MOUSEDEV_PSAUX=y
CONFIG_INPUT_MOUSEDEV_SCREEN_X=1024
CONFIG_INPUT_MOUSEDEV_SCREEN_Y=768
CONFIG_INPUT_JOYDEV=m
CONFIG_INPUT_EVDEV=m
# CONFIG_INPUT_EVBUG is not set

#
# Input Device Drivers
#
CONFIG_INPUT_KEYBOARD=y
CONFIG_KEYBOARD_ATKBD=y
CONFIG_KEYBOARD_SUNKBD=m
CONFIG_KEYBOARD_LKKBD=m
CONFIG_KEYBOARD_XTKBD=m
CONFIG_KEYBOARD_NEWTON=m
# CONFIG_KEYBOARD_STOWAWAY is not set
# CONFIG_INPUT_MOUSE is not set
# CONFIG_INPUT_JOYSTICK is not set
# CONFIG_INPUT_TABLET is not set
# CONFIG_INPUT_TOUCHSCREEN is not set
CONFIG_INPUT_MISC=y
# CONFIG_INPUT_PCSPKR is not set
# CONFIG_INPUT_APANEL is not set
CONFIG_INPUT_WISTRON_BTNS=m
# CONFIG_INPUT_ATI_REMOTE is not set
# CONFIG_INPUT_ATI_REMOTE2 is not set
# CONFIG_INPUT_KEYSPAN_REMOTE is not set
# CONFIG_INPUT_POWERMATE is not set
# CONFIG_INPUT_YEALINK is not set
CONFIG_INPUT_UINPUT=m

#
# Hardware I/O ports
#
CONFIG_SERIO=y
CONFIG_SERIO_I8042=y
CONFIG_SERIO_SERPORT=m
CONFIG_SERIO_CT82C710=m
CONFIG_SERIO_PCIPS2=m
CONFIG_SERIO_LIBPS2=y
CONFIG_SERIO_RAW=m
CONFIG_GAMEPORT=m
CONFIG_GAMEPORT_NS558=m
CONFIG_GAMEPORT_L4=m
CONFIG_GAMEPORT_EMU10K1=m
CONFIG_GAMEPORT_FM801=m

#
# Character devices
#
CONFIG_VT=y
CONFIG_VT_CONSOLE=y
CONFIG_HW_CONSOLE=y
# CONFIG_VT_HW_CONSOLE_BINDING is not set
CONFIG_SERIAL_NONSTANDARD=y
# CONFIG_COMPUTONE is not set
CONFIG_ROCKETPORT=m
CONFIG_CYCLADES=m
# CONFIG_CYZ_INTR is not set
# CONFIG_DIGIEPCA is not set
# CONFIG_ESPSERIAL is not set
# CONFIG_MOXA_INTELLIO is not set
CONFIG_MOXA_SMARTIO=m
# CONFIG_ISI is not set
CONFIG_SYNCLINK=m
CONFIG_SYNCLINKMP=m
CONFIG_SYNCLINK_GT=m
CONFIG_N_HDLC=m
# CONFIG_RISCOM8 is not set
# CONFIG_SPECIALIX is not set
CONFIG_SX=m
# CONFIG_RIO is not set
CONFIG_STALDRV=y
# CONFIG_NOZOMI is not set

#
# Serial drivers
#
CONFIG_SERIAL_8250=y
CONFIG_SERIAL_8250_CONSOLE=y
CONFIG_FIX_EARLYCON_MEM=y
CONFIG_SERIAL_8250_PCI=y
CONFIG_SERIAL_8250_PNP=y
CONFIG_SERIAL_8250_CS=m
CONFIG_SERIAL_8250_NR_UARTS=16
CONFIG_SERIAL_8250_RUNTIME_UARTS=4
CONFIG_SERIAL_8250_EXTENDED=y
CONFIG_SERIAL_8250_MANY_PORTS=y
CONFIG_SERIAL_8250_FOURPORT=m
CONFIG_SERIAL_8250_ACCENT=m
CONFIG_SERIAL_8250_BOCA=m
# CONFIG_SERIAL_8250_EXAR_ST16C554 is not set
CONFIG_SERIAL_8250_HUB6=m
CONFIG_SERIAL_8250_SHARE_IRQ=y
# CONFIG_SERIAL_8250_DETECT_IRQ is not set
CONFIG_SERIAL_8250_RSA=y

#
# Non-8250 serial port support
#
CONFIG_SERIAL_CORE=y
CONFIG_SERIAL_CORE_CONSOLE=y
CONFIG_SERIAL_JSM=m
CONFIG_UNIX98_PTYS=y
CONFIG_LEGACY_PTYS=y
CONFIG_LEGACY_PTY_COUNT=16
CONFIG_IPMI_HANDLER=m
# CONFIG_IPMI_PANIC_EVENT is not set
CONFIG_IPMI_DEVICE_INTERFACE=m
CONFIG_IPMI_SI=m
CONFIG_IPMI_WATCHDOG=m
CONFIG_IPMI_POWEROFF=m
CONFIG_HW_RANDOM=y
CONFIG_HW_RANDOM_INTEL=m
CONFIG_HW_RANDOM_AMD=m
CONFIG_HW_RANDOM_GEODE=m
CONFIG_HW_RANDOM_VIA=m
CONFIG_NVRAM=m
CONFIG_RTC=m
CONFIG_GEN_RTC=m
CONFIG_GEN_RTC_X=y
CONFIG_DTLK=m
CONFIG_R3964=m
CONFIG_APPLICOM=m
CONFIG_SONYPI=m

#
# PCMCIA character devices
#
CONFIG_SYNCLINK_CS=m
CONFIG_CARDMAN_4000=m
CONFIG_CARDMAN_4040=m
# CONFIG_IPWIRELESS is not set
CONFIG_MWAVE=m
CONFIG_SCx200_GPIO=m
CONFIG_PC8736x_GPIO=m
CONFIG_NSC_GPIO=m
CONFIG_CS5535_GPIO=m
CONFIG_RAW_DRIVER=m
CONFIG_MAX_RAW_DEVS=256
CONFIG_HANGCHECK_TIMER=m
# CONFIG_TCG_TPM is not set
CONFIG_TELCLOCK=m
CONFIG_DEVPORT=y
CONFIG_I2C=m
CONFIG_I2C_BOARDINFO=y
CONFIG_I2C_CHARDEV=m

#
# I2C Algorithms
#
CONFIG_I2C_ALGOBIT=m
CONFIG_I2C_ALGOPCF=m
CONFIG_I2C_ALGOPCA=m

#
# I2C Hardware Bus support
#
CONFIG_I2C_ALI1535=m
CONFIG_I2C_ALI1563=m
CONFIG_I2C_ALI15X3=m
CONFIG_I2C_AMD756=m
CONFIG_I2C_AMD756_S4882=m
CONFIG_I2C_AMD8111=m
CONFIG_I2C_I801=m
CONFIG_I2C_I810=m
CONFIG_I2C_PIIX4=m
CONFIG_I2C_NFORCE2=m
CONFIG_I2C_OCORES=m
CONFIG_I2C_PARPORT_LIGHT=m
CONFIG_I2C_PROSAVAGE=m
CONFIG_I2C_SAVAGE4=m
# CONFIG_I2C_SIMTEC is not set
CONFIG_SCx200_I2C=m
CONFIG_SCx200_I2C_SCL=12
CONFIG_SCx200_I2C_SDA=13
CONFIG_SCx200_ACB=m
CONFIG_I2C_SIS5595=m
CONFIG_I2C_SIS630=m
CONFIG_I2C_SIS96X=m
# CONFIG_I2C_TAOS_EVM is not set
CONFIG_I2C_STUB=m
# CONFIG_I2C_TINY_USB is not set
CONFIG_I2C_VIA=m
CONFIG_I2C_VIAPRO=m
CONFIG_I2C_VOODOO3=m
CONFIG_I2C_PCA_ISA=m

#
# Miscellaneous I2C Chip support
#
# CONFIG_DS1682 is not set
CONFIG_SENSORS_EEPROM=m
CONFIG_SENSORS_PCF8574=m
# CONFIG_PCF8575 is not set
CONFIG_SENSORS_PCF8591=m
# CONFIG_TPS65010 is not set
CONFIG_SENSORS_MAX6875=m
# CONFIG_SENSORS_TSL2550 is not set
# CONFIG_I2C_DEBUG_CORE is not set
# CONFIG_I2C_DEBUG_ALGO is not set
# CONFIG_I2C_DEBUG_BUS is not set
# CONFIG_I2C_DEBUG_CHIP is not set

#
# SPI support
#
CONFIG_SPI=y
CONFIG_SPI_MASTER=y

#
# SPI Master Controller Drivers
#
CONFIG_SPI_BITBANG=m

#
# SPI Protocol Masters
#
# CONFIG_SPI_AT25 is not set
# CONFIG_SPI_SPIDEV is not set
# CONFIG_SPI_TLE62X0 is not set
# CONFIG_W1 is not set
CONFIG_POWER_SUPPLY=y
# CONFIG_POWER_SUPPLY_DEBUG is not set
# CONFIG_PDA_POWER is not set
# CONFIG_BATTERY_DS2760 is not set
CONFIG_HWMON=y
CONFIG_HWMON_VID=m
CONFIG_SENSORS_ABITUGURU=m
# CONFIG_SENSORS_ABITUGURU3 is not set
# CONFIG_SENSORS_AD7418 is not set
CONFIG_SENSORS_ADM1021=m
CONFIG_SENSORS_ADM1025=m
CONFIG_SENSORS_ADM1026=m
# CONFIG_SENSORS_ADM1029 is not set
CONFIG_SENSORS_ADM1031=m
CONFIG_SENSORS_ADM9240=m
# CONFIG_SENSORS_ADT7470 is not set
# CONFIG_SENSORS_ADT7473 is not set
# CONFIG_SENSORS_K8TEMP is not set
CONFIG_SENSORS_ASB100=m
CONFIG_SENSORS_ATXP1=m
CONFIG_SENSORS_DS1621=m
# CONFIG_SENSORS_I5K_AMB is not set
CONFIG_SENSORS_F71805F=m
# CONFIG_SENSORS_F71882FG is not set
# CONFIG_SENSORS_F75375S is not set
CONFIG_SENSORS_FSCHER=m
CONFIG_SENSORS_FSCPOS=m
# CONFIG_SENSORS_FSCHMD is not set
CONFIG_SENSORS_GL518SM=m
CONFIG_SENSORS_GL520SM=m
# CONFIG_SENSORS_CORETEMP is not set
# CONFIG_SENSORS_IBMPEX is not set
CONFIG_SENSORS_IT87=m
CONFIG_SENSORS_LM63=m
CONFIG_SENSORS_LM70=m
CONFIG_SENSORS_LM75=m
CONFIG_SENSORS_LM77=m
CONFIG_SENSORS_LM78=m
CONFIG_SENSORS_LM80=m
CONFIG_SENSORS_LM83=m
CONFIG_SENSORS_LM85=m
CONFIG_SENSORS_LM87=m
CONFIG_SENSORS_LM90=m
CONFIG_SENSORS_LM92=m
# CONFIG_SENSORS_LM93 is not set
CONFIG_SENSORS_MAX1619=m
# CONFIG_SENSORS_MAX6650 is not set
CONFIG_SENSORS_PC87360=m
# CONFIG_SENSORS_PC87427 is not set
CONFIG_SENSORS_SIS5595=m
# CONFIG_SENSORS_DME1737 is not set
CONFIG_SENSORS_SMSC47M1=m
CONFIG_SENSORS_SMSC47M192=m
CONFIG_SENSORS_SMSC47B397=m
# CONFIG_SENSORS_ADS7828 is not set
# CONFIG_SENSORS_THMC50 is not set
CONFIG_SENSORS_VIA686A=m
# CONFIG_SENSORS_VT1211 is not set
CONFIG_SENSORS_VT8231=m
CONFIG_SENSORS_W83781D=m
CONFIG_SENSORS_W83791D=m
CONFIG_SENSORS_W83792D=m
CONFIG_SENSORS_W83793=m
CONFIG_SENSORS_W83L785TS=m
# CONFIG_SENSORS_W83L786NG is not set
CONFIG_SENSORS_W83627HF=m
CONFIG_SENSORS_W83627EHF=m
CONFIG_SENSORS_HDAPS=m
# CONFIG_SENSORS_APPLESMC is not set
# CONFIG_HWMON_DEBUG_CHIP is not set
CONFIG_THERMAL=y
CONFIG_WATCHDOG=y
# CONFIG_WATCHDOG_NOWAYOUT is not set

#
# Watchdog Device Drivers
#
CONFIG_SOFT_WATCHDOG=m
CONFIG_ACQUIRE_WDT=m
CONFIG_ADVANTECH_WDT=m
CONFIG_ALIM1535_WDT=m
CONFIG_ALIM7101_WDT=m
CONFIG_SC520_WDT=m
CONFIG_EUROTECH_WDT=m
CONFIG_IB700_WDT=m
CONFIG_IBMASR=m
CONFIG_WAFER_WDT=m
CONFIG_I6300ESB_WDT=m
# CONFIG_ITCO_WDT is not set
# CONFIG_IT8712F_WDT is not set
# CONFIG_HP_WATCHDOG is not set
CONFIG_SC1200_WDT=m
CONFIG_SCx200_WDT=m
# CONFIG_PC87413_WDT is not set
CONFIG_60XX_WDT=m
CONFIG_SBC8360_WDT=m
# CONFIG_SBC7240_WDT is not set
CONFIG_CPU5_WDT=m
# CONFIG_SMSC37B787_WDT is not set
CONFIG_W83627HF_WDT=m
# CONFIG_W83697HF_WDT is not set
CONFIG_W83877F_WDT=m
CONFIG_W83977F_WDT=m
CONFIG_MACHZ_WDT=m
CONFIG_SBC_EPX_C3_WATCHDOG=m

#
# ISA-based Watchdog Cards
#
CONFIG_PCWATCHDOG=m
CONFIG_MIXCOMWD=m
CONFIG_WDT=m
CONFIG_WDT_501=y

#
# PCI-based Watchdog Cards
#
CONFIG_PCIPCWATCHDOG=m
CONFIG_WDTPCI=m
CONFIG_WDT_501_PCI=y

#
# USB-based Watchdog Cards
#
CONFIG_USBPCWATCHDOG=m

#
# Sonics Silicon Backplane
#
CONFIG_SSB_POSSIBLE=y
CONFIG_SSB=m
CONFIG_SSB_PCIHOST_POSSIBLE=y
CONFIG_SSB_PCIHOST=y
# CONFIG_SSB_B43_PCI_BRIDGE is not set
CONFIG_SSB_PCMCIAHOST_POSSIBLE=y
# CONFIG_SSB_PCMCIAHOST is not set
# CONFIG_SSB_DEBUG is not set
CONFIG_SSB_DRIVER_PCICORE_POSSIBLE=y
CONFIG_SSB_DRIVER_PCICORE=y

#
# Multifunction device drivers
#
# CONFIG_MFD_SM501 is not set

#
# Multimedia devices
#
# CONFIG_VIDEO_DEV is not set
# CONFIG_DVB_CORE is not set
# CONFIG_DAB is not set

#
# Graphics support
#
CONFIG_AGP=m
CONFIG_AGP_ALI=m
CONFIG_AGP_ATI=m
CONFIG_AGP_AMD=m
CONFIG_AGP_AMD64=m
CONFIG_AGP_INTEL=m
CONFIG_AGP_NVIDIA=m
CONFIG_AGP_SIS=m
CONFIG_AGP_SWORKS=m
CONFIG_AGP_VIA=m
CONFIG_AGP_EFFICEON=m
CONFIG_DRM=m
CONFIG_DRM_TDFX=m
CONFIG_DRM_R128=m
CONFIG_DRM_RADEON=m
CONFIG_DRM_I810=m
CONFIG_DRM_I830=m
CONFIG_DRM_I915=m
CONFIG_DRM_MGA=m
CONFIG_DRM_SIS=m
CONFIG_DRM_VIA=m
CONFIG_DRM_SAVAGE=m
CONFIG_VGASTATE=m
# CONFIG_VIDEO_OUTPUT_CONTROL is not set
CONFIG_FB=y
CONFIG_FIRMWARE_EDID=y
CONFIG_FB_DDC=m
CONFIG_FB_CFB_FILLRECT=y
CONFIG_FB_CFB_COPYAREA=y
CONFIG_FB_CFB_IMAGEBLIT=y
# CONFIG_FB_CFB_REV_PIXELS_IN_BYTE is not set
CONFIG_FB_SYS_FILLRECT=m
CONFIG_FB_SYS_COPYAREA=m
CONFIG_FB_SYS_IMAGEBLIT=m
CONFIG_FB_SYS_FOPS=m
CONFIG_FB_DEFERRED_IO=y
# CONFIG_FB_SVGALIB is not set
# CONFIG_FB_MACMODES is not set
CONFIG_FB_BACKLIGHT=y
CONFIG_FB_MODE_HELPERS=y
CONFIG_FB_TILEBLITTING=y

#
# Frame buffer hardware drivers
#
CONFIG_FB_CIRRUS=m
CONFIG_FB_PM2=m
CONFIG_FB_PM2_FIFO_DISCONNECT=y
CONFIG_FB_CYBER2000=m
CONFIG_FB_ARC=m
# CONFIG_FB_ASILIANT is not set
# CONFIG_FB_IMSTT is not set
CONFIG_FB_VGA16=m
# CONFIG_FB_UVESA is not set
CONFIG_FB_VESA=y
# CONFIG_FB_EFI is not set
# CONFIG_FB_HECUBA is not set
CONFIG_FB_HGA=m
# CONFIG_FB_HGA_ACCEL is not set
CONFIG_FB_S1D13XXX=m
CONFIG_FB_NVIDIA=m
CONFIG_FB_NVIDIA_I2C=y
# CONFIG_FB_NVIDIA_DEBUG is not set
CONFIG_FB_NVIDIA_BACKLIGHT=y
# CONFIG_FB_RIVA is not set
CONFIG_FB_I810=m
# CONFIG_FB_I810_GTF is not set
# CONFIG_FB_LE80578 is not set
CONFIG_FB_INTEL=m
# CONFIG_FB_INTEL_DEBUG is not set
CONFIG_FB_INTEL_I2C=y
CONFIG_FB_MATROX=m
CONFIG_FB_MATROX_MILLENIUM=y
CONFIG_FB_MATROX_MYSTIQUE=y
CONFIG_FB_MATROX_G=y
CONFIG_FB_MATROX_I2C=m
CONFIG_FB_MATROX_MAVEN=m
CONFIG_FB_MATROX_MULTIHEAD=y
CONFIG_FB_RADEON=m
CONFIG_FB_RADEON_I2C=y
CONFIG_FB_RADEON_BACKLIGHT=y
# CONFIG_FB_RADEON_DEBUG is not set
CONFIG_FB_ATY128=m
CONFIG_FB_ATY128_BACKLIGHT=y
CONFIG_FB_ATY=m
CONFIG_FB_ATY_CT=y
CONFIG_FB_ATY_GENERIC_LCD=y
CONFIG_FB_ATY_GX=y
CONFIG_FB_ATY_BACKLIGHT=y
# CONFIG_FB_S3 is not set
CONFIG_FB_SAVAGE=m
CONFIG_FB_SAVAGE_I2C=y
# CONFIG_FB_SAVAGE_ACCEL is not set
CONFIG_FB_SIS=m
CONFIG_FB_SIS_300=y
CONFIG_FB_SIS_315=y
CONFIG_FB_NEOMAGIC=m
CONFIG_FB_KYRO=m
CONFIG_FB_3DFX=m
# CONFIG_FB_3DFX_ACCEL is not set
CONFIG_FB_VOODOO1=m
# CONFIG_FB_VT8623 is not set
CONFIG_FB_CYBLA=m
CONFIG_FB_TRIDENT=m
# CONFIG_FB_TRIDENT_ACCEL is not set
# CONFIG_FB_ARK is not set
# CONFIG_FB_PM3 is not set
CONFIG_FB_GEODE=y
# CONFIG_FB_GEODE_LX is not set
CONFIG_FB_GEODE_GX=m
# CONFIG_FB_GEODE_GX_SET_FBSIZE is not set
CONFIG_FB_GEODE_GX1=m
CONFIG_FB_VIRTUAL=m
CONFIG_BACKLIGHT_LCD_SUPPORT=y
# CONFIG_LCD_CLASS_DEVICE is not set
CONFIG_BACKLIGHT_CLASS_DEVICE=y
# CONFIG_BACKLIGHT_CORGI is not set
# CONFIG_BACKLIGHT_PROGEAR is not set

#
# Display device support
#
# CONFIG_DISPLAY_SUPPORT is not set

#
# Console display driver support
#
CONFIG_VGA_CONSOLE=y
# CONFIG_VGACON_SOFT_SCROLLBACK is not set
CONFIG_VIDEO_SELECT=y
CONFIG_MDA_CONSOLE=m
CONFIG_DUMMY_CONSOLE=y
CONFIG_FRAMEBUFFER_CONSOLE=y
# CONFIG_FRAMEBUFFER_CONSOLE_DETECT_PRIMARY is not set
CONFIG_FRAMEBUFFER_CONSOLE_ROTATION=y
# CONFIG_FONTS is not set
CONFIG_FONT_8x8=y
CONFIG_FONT_8x16=y
# CONFIG_LOGO is not set

#
# Sound
#
# CONFIG_SOUND is not set
CONFIG_HID_SUPPORT=y
CONFIG_HID=y
CONFIG_HID_DEBUG=y
# CONFIG_HIDRAW is not set

#
# USB Input Devices
#
CONFIG_USB_HID=m
# CONFIG_USB_HIDINPUT_POWERBOOK is not set
# CONFIG_HID_FF is not set
# CONFIG_USB_HIDDEV is not set

#
# USB HID Boot Protocol drivers
#
CONFIG_USB_KBD=m
CONFIG_USB_MOUSE=m
CONFIG_USB_SUPPORT=y
CONFIG_USB_ARCH_HAS_HCD=y
CONFIG_USB_ARCH_HAS_OHCI=y
CONFIG_USB_ARCH_HAS_EHCI=y
CONFIG_USB=m
# CONFIG_USB_DEBUG is not set
# CONFIG_USB_ANNOUNCE_NEW_DEVICES is not set

#
# Miscellaneous USB options
#
CONFIG_USB_DEVICEFS=y
CONFIG_USB_DEVICE_CLASS=y
# CONFIG_USB_DYNAMIC_MINORS is not set
# CONFIG_USB_OTG is not set

#
# USB Host Controller Drivers
#
CONFIG_USB_EHCI_HCD=m
CONFIG_USB_EHCI_ROOT_HUB_TT=y
# CONFIG_USB_EHCI_TT_NEWSCHED is not set
CONFIG_USB_ISP116X_HCD=m
CONFIG_USB_OHCI_HCD=m
# CONFIG_USB_OHCI_HCD_SSB is not set
# CONFIG_USB_OHCI_BIG_ENDIAN_DESC is not set
# CONFIG_USB_OHCI_BIG_ENDIAN_MMIO is not set
CONFIG_USB_OHCI_LITTLE_ENDIAN=y
CONFIG_USB_UHCI_HCD=m
CONFIG_USB_SL811_HCD=m
CONFIG_USB_SL811_CS=m
# CONFIG_USB_R8A66597_HCD is not set

#
# USB Device Class drivers
#
CONFIG_USB_ACM=m
CONFIG_USB_PRINTER=m

#
# NOTE: USB_STORAGE enables SCSI, and 'SCSI disk support'
#

#
# may also be needed; see USB_STORAGE Help for more information
#
CONFIG_USB_STORAGE=m
# CONFIG_USB_STORAGE_DEBUG is not set
CONFIG_USB_STORAGE_DATAFAB=y
CONFIG_USB_STORAGE_FREECOM=y
CONFIG_USB_STORAGE_ISD200=y
CONFIG_USB_STORAGE_DPCM=y
CONFIG_USB_STORAGE_USBAT=y
CONFIG_USB_STORAGE_SDDR09=y
CONFIG_USB_STORAGE_SDDR55=y
CONFIG_USB_STORAGE_JUMPSHOT=y
CONFIG_USB_STORAGE_ALAUDA=y
# CONFIG_USB_STORAGE_ONETOUCH is not set
# CONFIG_USB_STORAGE_KARMA is not set
# CONFIG_USB_LIBUSUAL is not set

#
# USB Imaging devices
#
CONFIG_USB_MDC800=m
CONFIG_USB_MICROTEK=m
CONFIG_USB_MON=y

#
# USB port drivers
#
CONFIG_USB_SERIAL=m
CONFIG_USB_EZUSB=y
CONFIG_USB_SERIAL_GENERIC=y
# CONFIG_USB_SERIAL_AIRCABLE is not set
CONFIG_USB_SERIAL_AIRPRIME=m
CONFIG_USB_SERIAL_ARK3116=m
CONFIG_USB_SERIAL_BELKIN=m
# CONFIG_USB_SERIAL_CH341 is not set
# CONFIG_USB_SERIAL_WHITEHEAT is not set
CONFIG_USB_SERIAL_DIGI_ACCELEPORT=m
CONFIG_USB_SERIAL_CP2101=m
CONFIG_USB_SERIAL_CYPRESS_M8=m
CONFIG_USB_SERIAL_EMPEG=m
CONFIG_USB_SERIAL_FTDI_SIO=m
CONFIG_USB_SERIAL_FUNSOFT=m
CONFIG_USB_SERIAL_VISOR=m
CONFIG_USB_SERIAL_IPAQ=m
CONFIG_USB_SERIAL_IR=m
CONFIG_USB_SERIAL_EDGEPORT=m
CONFIG_USB_SERIAL_EDGEPORT_TI=m
CONFIG_USB_SERIAL_GARMIN=m
CONFIG_USB_SERIAL_IPW=m
# CONFIG_USB_SERIAL_IUU is not set
CONFIG_USB_SERIAL_KEYSPAN_PDA=m
# CONFIG_USB_SERIAL_KEYSPAN is not set
CONFIG_USB_SERIAL_KLSI=m
CONFIG_USB_SERIAL_KOBIL_SCT=m
CONFIG_USB_SERIAL_MCT_U232=m
# CONFIG_USB_SERIAL_MOS7720 is not set
# CONFIG_USB_SERIAL_MOS7840 is not set
CONFIG_USB_SERIAL_NAVMAN=m
CONFIG_USB_SERIAL_PL2303=m
# CONFIG_USB_SERIAL_OTI6858 is not set
CONFIG_USB_SERIAL_HP4X=m
CONFIG_USB_SERIAL_SAFE=m
# CONFIG_USB_SERIAL_SAFE_PADDED is not set
CONFIG_USB_SERIAL_SIERRAWIRELESS=m
CONFIG_USB_SERIAL_TI=m
CONFIG_USB_SERIAL_CYBERJACK=m
CONFIG_USB_SERIAL_XIRCOM=m
CONFIG_USB_SERIAL_OPTION=m
CONFIG_USB_SERIAL_OMNINET=m
# CONFIG_USB_SERIAL_DEBUG is not set

#
# USB Miscellaneous drivers
#
# CONFIG_USB_EMI62 is not set
# CONFIG_USB_EMI26 is not set
# CONFIG_USB_ADUTUX is not set
CONFIG_USB_AUERSWALD=m
CONFIG_USB_RIO500=m
CONFIG_USB_LEGOTOWER=m
CONFIG_USB_LCD=m
# CONFIG_USB_BERRY_CHARGE is not set
CONFIG_USB_LED=m
CONFIG_USB_CYPRESS_CY7C63=m
CONFIG_USB_CYTHERM=m
# CONFIG_USB_PHIDGET is not set
CONFIG_USB_IDMOUSE=m
# CONFIG_USB_FTDI_ELAN is not set
CONFIG_USB_APPLEDISPLAY=m
CONFIG_USB_SISUSBVGA=m
CONFIG_USB_SISUSBVGA_CON=y
CONFIG_USB_LD=m
# CONFIG_USB_TRANCEVIBRATOR is not set
# CONFIG_USB_IOWARRIOR is not set
CONFIG_USB_TEST=m
CONFIG_USB_GADGET=m
# CONFIG_USB_GADGET_DEBUG_FILES is not set
# CONFIG_USB_GADGET_DEBUG_FS is not set
CONFIG_USB_GADGET_SELECTED=y
# CONFIG_USB_GADGET_AMD5536UDC is not set
# CONFIG_USB_GADGET_ATMEL_USBA is not set
# CONFIG_USB_GADGET_FSL_USB2 is not set
CONFIG_USB_GADGET_NET2280=y
CONFIG_USB_NET2280=m
# CONFIG_USB_GADGET_PXA2XX is not set
# CONFIG_USB_GADGET_M66592 is not set
# CONFIG_USB_GADGET_GOKU is not set
# CONFIG_USB_GADGET_LH7A40X is not set
# CONFIG_USB_GADGET_OMAP is not set
# CONFIG_USB_GADGET_S3C2410 is not set
# CONFIG_USB_GADGET_AT91 is not set
# CONFIG_USB_GADGET_DUMMY_HCD is not set
CONFIG_USB_GADGET_DUALSPEED=y
CONFIG_USB_ZERO=m
CONFIG_USB_ETH=m
CONFIG_USB_ETH_RNDIS=y
CONFIG_USB_GADGETFS=m
CONFIG_USB_FILE_STORAGE=m
# CONFIG_USB_FILE_STORAGE_TEST is not set
CONFIG_USB_G_SERIAL=m
# CONFIG_USB_MIDI_GADGET is not set
# CONFIG_USB_G_PRINTER is not set
CONFIG_MMC=m
# CONFIG_MMC_DEBUG is not set
# CONFIG_MMC_UNSAFE_RESUME is not set

#
# MMC/SD Card Drivers
#
CONFIG_MMC_BLOCK=m
CONFIG_MMC_BLOCK_BOUNCE=y
# CONFIG_SDIO_UART is not set

#
# MMC/SD Host Controller Drivers
#
CONFIG_MMC_SDHCI=m
# CONFIG_MMC_RICOH_MMC is not set
CONFIG_MMC_WBSD=m
# CONFIG_MMC_TIFM_SD is not set
# CONFIG_MEMSTICK is not set
CONFIG_NEW_LEDS=y
CONFIG_LEDS_CLASS=m

#
# LED drivers
#
CONFIG_LEDS_NET48XX=m
# CONFIG_LEDS_WRAP is not set
# CONFIG_LEDS_CLEVO_MAIL is not set

#
# LED Triggers
#
CONFIG_LEDS_TRIGGERS=y
CONFIG_LEDS_TRIGGER_TIMER=m
CONFIG_LEDS_TRIGGER_IDE_DISK=y
CONFIG_LEDS_TRIGGER_HEARTBEAT=m
# CONFIG_INFINIBAND is not set
# CONFIG_EDAC is not set
CONFIG_RTC_LIB=m
CONFIG_RTC_CLASS=m

#
# Conflicting RTC option has been selected, check GEN_RTC and RTC
#

#
# RTC interfaces
#
CONFIG_RTC_INTF_SYSFS=y
CONFIG_RTC_INTF_PROC=y
CONFIG_RTC_INTF_DEV=y
# CONFIG_RTC_INTF_DEV_UIE_EMUL is not set
# CONFIG_RTC_DRV_TEST is not set

#
# I2C RTC drivers
#
CONFIG_RTC_DRV_DS1307=m
# CONFIG_RTC_DRV_DS1374 is not set
CONFIG_RTC_DRV_DS1672=m
# CONFIG_RTC_DRV_MAX6900 is not set
CONFIG_RTC_DRV_RS5C372=m
CONFIG_RTC_DRV_ISL1208=m
CONFIG_RTC_DRV_X1205=m
CONFIG_RTC_DRV_PCF8563=m
CONFIG_RTC_DRV_PCF8583=m
# CONFIG_RTC_DRV_M41T80 is not set
# CONFIG_RTC_DRV_S35390A is not set

#
# SPI RTC drivers
#
CONFIG_RTC_DRV_MAX6902=m
# CONFIG_RTC_DRV_R9701 is not set
CONFIG_RTC_DRV_RS5C348=m

#
# Platform RTC drivers
#
# CONFIG_RTC_DRV_CMOS is not set
# CONFIG_RTC_DRV_DS1511 is not set
CONFIG_RTC_DRV_DS1553=m
CONFIG_RTC_DRV_DS1742=m
# CONFIG_RTC_DRV_STK17TA8 is not set
CONFIG_RTC_DRV_M48T86=m
# CONFIG_RTC_DRV_M48T59 is not set
CONFIG_RTC_DRV_V3020=m

#
# on-CPU RTC drivers
#
# CONFIG_DMADEVICES is not set

#
# Userspace I/O
#
# CONFIG_UIO is not set

#
# Firmware Drivers
#
CONFIG_EDD=m
CONFIG_DELL_RBU=m
CONFIG_DCDBAS=m
CONFIG_DMIID=y

#
# File systems
#
CONFIG_EXT2_FS=m
CONFIG_EXT2_FS_XATTR=y
CONFIG_EXT2_FS_POSIX_ACL=y
CONFIG_EXT2_FS_SECURITY=y
# CONFIG_EXT2_FS_XIP is not set
CONFIG_EXT3_FS=y
CONFIG_EXT3_FS_XATTR=y
CONFIG_EXT3_FS_POSIX_ACL=y
CONFIG_EXT3_FS_SECURITY=y
# CONFIG_EXT4DEV_FS is not set
CONFIG_JBD=y
# CONFIG_JBD_DEBUG is not set
CONFIG_FS_MBCACHE=y
CONFIG_REISERFS_FS=m
# CONFIG_REISERFS_CHECK is not set
# CONFIG_REISERFS_PROC_INFO is not set
CONFIG_REISERFS_FS_XATTR=y
CONFIG_REISERFS_FS_POSIX_ACL=y
CONFIG_REISERFS_FS_SECURITY=y
CONFIG_JFS_FS=m
CONFIG_JFS_POSIX_ACL=y
CONFIG_JFS_SECURITY=y
# CONFIG_JFS_DEBUG is not set
# CONFIG_JFS_STATISTICS is not set
CONFIG_FS_POSIX_ACL=y
CONFIG_XFS_FS=m
CONFIG_XFS_QUOTA=y
CONFIG_XFS_SECURITY=y
CONFIG_XFS_POSIX_ACL=y
CONFIG_XFS_RT=y
# CONFIG_GFS2_FS is not set
CONFIG_OCFS2_FS=m
CONFIG_OCFS2_DEBUG_MASKLOG=y
# CONFIG_OCFS2_DEBUG_FS is not set
CONFIG_DNOTIFY=y
CONFIG_INOTIFY=y
CONFIG_INOTIFY_USER=y
# CONFIG_QUOTA is not set
CONFIG_QUOTACTL=y
CONFIG_AUTOFS_FS=m
CONFIG_AUTOFS4_FS=m
CONFIG_FUSE_FS=m

#
# CD-ROM/DVD Filesystems
#
CONFIG_ISO9660_FS=m
CONFIG_JOLIET=y
CONFIG_ZISOFS=y
CONFIG_UDF_FS=m
CONFIG_UDF_NLS=y

#
# DOS/FAT/NT Filesystems
#
CONFIG_FAT_FS=m
CONFIG_MSDOS_FS=m
CONFIG_VFAT_FS=m
CONFIG_FAT_DEFAULT_CODEPAGE=437
CONFIG_FAT_DEFAULT_IOCHARSET="iso8859-1"
CONFIG_NTFS_FS=m
# CONFIG_NTFS_DEBUG is not set
CONFIG_NTFS_RW=y

#
# Pseudo filesystems
#
CONFIG_PROC_FS=y
CONFIG_PROC_KCORE=y
CONFIG_PROC_SYSCTL=y
CONFIG_SYSFS=y
CONFIG_TMPFS=y
# CONFIG_TMPFS_POSIX_ACL is not set
# CONFIG_HUGETLBFS is not set
# CONFIG_HUGETLB_PAGE is not set
CONFIG_CONFIGFS_FS=m

#
# Miscellaneous filesystems
#
CONFIG_ADFS_FS=m
# CONFIG_ADFS_FS_RW is not set
CONFIG_AFFS_FS=m
# CONFIG_ECRYPT_FS is not set
CONFIG_HFS_FS=m
CONFIG_HFSPLUS_FS=m
CONFIG_BEFS_FS=m
# CONFIG_BEFS_DEBUG is not set
CONFIG_BFS_FS=m
CONFIG_EFS_FS=m
CONFIG_JFFS2_FS=m
CONFIG_JFFS2_FS_DEBUG=0
CONFIG_JFFS2_FS_WRITEBUFFER=y
# CONFIG_JFFS2_FS_WBUF_VERIFY is not set
# CONFIG_JFFS2_SUMMARY is not set
CONFIG_JFFS2_FS_XATTR=y
CONFIG_JFFS2_FS_POSIX_ACL=y
CONFIG_JFFS2_FS_SECURITY=y
# CONFIG_JFFS2_COMPRESSION_OPTIONS is not set
CONFIG_JFFS2_ZLIB=y
# CONFIG_JFFS2_LZO is not set
CONFIG_JFFS2_RTIME=y
# CONFIG_JFFS2_RUBIN is not set
CONFIG_CRAMFS=y
CONFIG_VXFS_FS=m
CONFIG_MINIX_FS=m
CONFIG_HPFS_FS=m
CONFIG_QNX4FS_FS=m
CONFIG_ROMFS_FS=m
CONFIG_SYSV_FS=m
CONFIG_UFS_FS=m
# CONFIG_UFS_FS_WRITE is not set
# CONFIG_UFS_DEBUG is not set
CONFIG_NETWORK_FILESYSTEMS=y
CONFIG_NFS_FS=m
CONFIG_NFS_V3=y
CONFIG_NFS_V3_ACL=y
CONFIG_NFS_V4=y
CONFIG_NFS_DIRECTIO=y
# CONFIG_NFSD is not set
CONFIG_LOCKD=m
CONFIG_LOCKD_V4=y
CONFIG_NFS_ACL_SUPPORT=m
CONFIG_NFS_COMMON=y
CONFIG_SUNRPC=m
CONFIG_SUNRPC_GSS=m
# CONFIG_SUNRPC_BIND34 is not set
CONFIG_RPCSEC_GSS_KRB5=m
CONFIG_RPCSEC_GSS_SPKM3=m
# CONFIG_SMB_FS is not set
CONFIG_CIFS=m
# CONFIG_CIFS_STATS is not set
# CONFIG_CIFS_WEAK_PW_HASH is not set
# CONFIG_CIFS_XATTR is not set
# CONFIG_CIFS_DEBUG2 is not set
# CONFIG_CIFS_EXPERIMENTAL is not set
# CONFIG_NCP_FS is not set
# CONFIG_CODA_FS is not set
# CONFIG_AFS_FS is not set

#
# Partition Types
#
CONFIG_PARTITION_ADVANCED=y
CONFIG_ACORN_PARTITION=y
# CONFIG_ACORN_PARTITION_CUMANA is not set
# CONFIG_ACORN_PARTITION_EESOX is not set
CONFIG_ACORN_PARTITION_ICS=y
# CONFIG_ACORN_PARTITION_ADFS is not set
# CONFIG_ACORN_PARTITION_POWERTEC is not set
CONFIG_ACORN_PARTITION_RISCIX=y
CONFIG_OSF_PARTITION=y
CONFIG_AMIGA_PARTITION=y
CONFIG_ATARI_PARTITION=y
CONFIG_MAC_PARTITION=y
CONFIG_MSDOS_PARTITION=y
CONFIG_BSD_DISKLABEL=y
CONFIG_MINIX_SUBPARTITION=y
CONFIG_SOLARIS_X86_PARTITION=y
CONFIG_UNIXWARE_DISKLABEL=y
CONFIG_LDM_PARTITION=y
# CONFIG_LDM_DEBUG is not set
CONFIG_SGI_PARTITION=y
CONFIG_ULTRIX_PARTITION=y
CONFIG_SUN_PARTITION=y
CONFIG_KARMA_PARTITION=y
CONFIG_EFI_PARTITION=y
# CONFIG_SYSV68_PARTITION is not set
CONFIG_NLS=y
CONFIG_NLS_DEFAULT="iso8859-1"
CONFIG_NLS_CODEPAGE_437=m
CONFIG_NLS_CODEPAGE_737=m
CONFIG_NLS_CODEPAGE_775=m
CONFIG_NLS_CODEPAGE_850=m
CONFIG_NLS_CODEPAGE_852=m
CONFIG_NLS_CODEPAGE_855=m
CONFIG_NLS_CODEPAGE_857=m
CONFIG_NLS_CODEPAGE_860=m
CONFIG_NLS_CODEPAGE_861=m
CONFIG_NLS_CODEPAGE_862=m
CONFIG_NLS_CODEPAGE_863=m
CONFIG_NLS_CODEPAGE_864=m
CONFIG_NLS_CODEPAGE_865=m
CONFIG_NLS_CODEPAGE_866=m
CONFIG_NLS_CODEPAGE_869=m
CONFIG_NLS_CODEPAGE_936=m
CONFIG_NLS_CODEPAGE_950=m
CONFIG_NLS_CODEPAGE_932=m
CONFIG_NLS_CODEPAGE_949=m
CONFIG_NLS_CODEPAGE_874=m
CONFIG_NLS_ISO8859_8=m
CONFIG_NLS_CODEPAGE_1250=m
CONFIG_NLS_CODEPAGE_1251=m
CONFIG_NLS_ASCII=m
CONFIG_NLS_ISO8859_1=m
CONFIG_NLS_ISO8859_2=m
CONFIG_NLS_ISO8859_3=m
CONFIG_NLS_ISO8859_4=m
CONFIG_NLS_ISO8859_5=m
CONFIG_NLS_ISO8859_6=m
CONFIG_NLS_ISO8859_7=m
CONFIG_NLS_ISO8859_9=m
CONFIG_NLS_ISO8859_13=m
CONFIG_NLS_ISO8859_14=m
CONFIG_NLS_ISO8859_15=m
CONFIG_NLS_KOI8_R=m
CONFIG_NLS_KOI8_U=m
CONFIG_NLS_UTF8=m
# CONFIG_DLM is not set

#
# Kernel hacking
#
CONFIG_TRACE_IRQFLAGS_SUPPORT=y
# CONFIG_PRINTK_TIME is not set
CONFIG_ENABLE_WARN_DEPRECATED=y
CONFIG_ENABLE_MUST_CHECK=y
CONFIG_MAGIC_SYSRQ=y
CONFIG_UNUSED_SYMBOLS=y
CONFIG_DEBUG_FS=y
# CONFIG_HEADERS_CHECK is not set
# CONFIG_DEBUG_KERNEL is not set
CONFIG_DEBUG_BUGVERBOSE=y
# CONFIG_LATENCYTOP is not set
# CONFIG_PROVIDE_OHCI1394_DMA_INIT is not set
# CONFIG_SAMPLES is not set
CONFIG_EARLY_PRINTK=y
CONFIG_X86_FIND_SMP_CONFIG=y
CONFIG_X86_MPPARSE=y
CONFIG_DOUBLEFAULT=y
CONFIG_IO_DELAY_TYPE_0X80=0
CONFIG_IO_DELAY_TYPE_0XED=1
CONFIG_IO_DELAY_TYPE_UDELAY=2
CONFIG_IO_DELAY_TYPE_NONE=3
CONFIG_IO_DELAY_0X80=y
# CONFIG_IO_DELAY_0XED is not set
# CONFIG_IO_DELAY_UDELAY is not set
# CONFIG_IO_DELAY_NONE is not set
CONFIG_DEFAULT_IO_DELAY_TYPE=0

#
# Security options
#
CONFIG_KEYS=y
# CONFIG_KEYS_DEBUG_PROC_KEYS is not set
CONFIG_SECURITY=y
CONFIG_SECURITY_NETWORK=y
CONFIG_SECURITY_NETWORK_XFRM=y
CONFIG_SECURITY_CAPABILITIES=y
# CONFIG_SECURITY_FILE_CAPABILITIES is not set
CONFIG_SECURITY_DEFAULT_MMAP_MIN_ADDR=0
CONFIG_SECURITY_SELINUX=y
CONFIG_SECURITY_SELINUX_BOOTPARAM=y
CONFIG_SECURITY_SELINUX_BOOTPARAM_VALUE=0
CONFIG_SECURITY_SELINUX_DISABLE=y
CONFIG_SECURITY_SELINUX_DEVELOP=y
CONFIG_SECURITY_SELINUX_AVC_STATS=y
CONFIG_SECURITY_SELINUX_CHECKREQPROT_VALUE=1
# CONFIG_SECURITY_SELINUX_ENABLE_SECMARK_DEFAULT is not set
# CONFIG_SECURITY_SELINUX_POLICYDB_VERSION_MAX is not set
CONFIG_XOR_BLOCKS=m
CONFIG_ASYNC_CORE=m
CONFIG_ASYNC_MEMCPY=m
CONFIG_ASYNC_XOR=m
CONFIG_CRYPTO=y
CONFIG_CRYPTO_ALGAPI=y
CONFIG_CRYPTO_AEAD=m
CONFIG_CRYPTO_BLKCIPHER=m
# CONFIG_CRYPTO_SEQIV is not set
CONFIG_CRYPTO_HASH=y
CONFIG_CRYPTO_MANAGER=y
CONFIG_CRYPTO_HMAC=y
# CONFIG_CRYPTO_XCBC is not set
CONFIG_CRYPTO_NULL=m
CONFIG_CRYPTO_MD4=m
CONFIG_CRYPTO_MD5=y
CONFIG_CRYPTO_SHA1=m
CONFIG_CRYPTO_SHA256=m
CONFIG_CRYPTO_SHA512=m
CONFIG_CRYPTO_WP512=m
CONFIG_CRYPTO_TGR192=m
# CONFIG_CRYPTO_GF128MUL is not set
CONFIG_CRYPTO_ECB=m
CONFIG_CRYPTO_CBC=m
# CONFIG_CRYPTO_PCBC is not set
# CONFIG_CRYPTO_LRW is not set
# CONFIG_CRYPTO_XTS is not set
# CONFIG_CRYPTO_CTR is not set
# CONFIG_CRYPTO_GCM is not set
# CONFIG_CRYPTO_CCM is not set
# CONFIG_CRYPTO_CRYPTD is not set
CONFIG_CRYPTO_DES=m
# CONFIG_CRYPTO_FCRYPT is not set
CONFIG_CRYPTO_BLOWFISH=m
CONFIG_CRYPTO_TWOFISH=m
CONFIG_CRYPTO_TWOFISH_COMMON=m
# CONFIG_CRYPTO_TWOFISH_586 is not set
CONFIG_CRYPTO_SERPENT=m
CONFIG_CRYPTO_AES=m
CONFIG_CRYPTO_AES_586=m
CONFIG_CRYPTO_CAST5=m
CONFIG_CRYPTO_CAST6=m
CONFIG_CRYPTO_TEA=m
CONFIG_CRYPTO_ARC4=m
CONFIG_CRYPTO_KHAZAD=m
CONFIG_CRYPTO_ANUBIS=m
# CONFIG_CRYPTO_SEED is not set
# CONFIG_CRYPTO_SALSA20 is not set
# CONFIG_CRYPTO_SALSA20_586 is not set
CONFIG_CRYPTO_DEFLATE=m
CONFIG_CRYPTO_MICHAEL_MIC=m
CONFIG_CRYPTO_CRC32C=m
# CONFIG_CRYPTO_CAMELLIA is not set
CONFIG_CRYPTO_TEST=m
CONFIG_CRYPTO_AUTHENC=m
# CONFIG_CRYPTO_LZO is not set
CONFIG_CRYPTO_HW=y
CONFIG_CRYPTO_DEV_PADLOCK=m
CONFIG_CRYPTO_DEV_PADLOCK_AES=m
# CONFIG_CRYPTO_DEV_PADLOCK_SHA is not set
# CONFIG_CRYPTO_DEV_GEODE is not set
# CONFIG_CRYPTO_DEV_HIFN_795X is not set
CONFIG_HAVE_KVM=y
CONFIG_VIRTUALIZATION=y
# CONFIG_KVM is not set
# CONFIG_LGUEST is not set
# CONFIG_VIRTIO_PCI is not set
# CONFIG_VIRTIO_BALLOON is not set

#
# Library routines
#
CONFIG_BITREVERSE=y
CONFIG_CRC_CCITT=m
CONFIG_CRC16=m
# CONFIG_CRC_ITU_T is not set
CONFIG_CRC32=y
# CONFIG_CRC7 is not set
CONFIG_LIBCRC32C=m
CONFIG_AUDIT_GENERIC=y
CONFIG_ZLIB_INFLATE=y
CONFIG_ZLIB_DEFLATE=m
CONFIG_REED_SOLOMON=m
CONFIG_REED_SOLOMON_DEC16=y
CONFIG_TEXTSEARCH=y
CONFIG_TEXTSEARCH_KMP=m
CONFIG_TEXTSEARCH_BM=m
CONFIG_TEXTSEARCH_FSM=m
CONFIG_PLIST=y
CONFIG_HAS_IOMEM=y
CONFIG_HAS_IOPORT=y
CONFIG_HAS_DMA=y
CONFIG_CHECK_SIGNATURE=y

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: HTB accuracy for high speed
  2009-05-18 10:01     ` Antonio Almeida
@ 2009-05-18 10:45       ` Jarek Poplawski
  2009-05-18 12:27         ` Antonio Almeida
  2009-05-18 16:13       ` Stephen Hemminger
  1 sibling, 1 reply; 104+ messages in thread
From: Jarek Poplawski @ 2009-05-18 10:45 UTC (permalink / raw)
  To: Antonio Almeida
  Cc: Stephen Hemminger, netdev, kaber, davem, devik, Vladimir Ivashchenko

On Mon, May 18, 2009 at 11:01:21AM +0100, Antonio Almeida wrote:
> Hi!
> 
> cat /sys/devices/system/clocksource/clocksource0/current_clocksource
> returns "jiffies"
> 
> With HFSC the accuracy is good. Also with packets of 800 bytes I got
> these values:
> received     configured    error (%)
> 904596519    900000000     0.51
> 804293658    800000000     0.54
> 703662853    700000000     0.52
> 603354059    600000000     0.56
> 502805411    500000000     0.56
> 402527055    400000000     0.63
> 301484904    300000000     0.49
> 201074301    200000000     0.54
> 100546656    100000000     0.55
> 

Looks great! But, since HFSC uses rates directly (without rate tables)
this seems logical. So it looks like the best choice for handling
configs above 100Mbit for now.
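
To illustrate why a rate table can be less accurate than working with
the rate directly, here is a minimal sketch. It assumes that a table
cell holds the transmission time of a packet as a whole number of clock
ticks, rounded down. The function name effective_rate_bps and the
1000 ns tick used in the note after it are hypothetical, chosen only for
illustration; they are not taken from the kernel or from this thread.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Effective rate (bit/s) when the per-packet transmission time is
 * stored as an integer number of ticks, rounded down.
 * tick_ns is an assumed tick length in nanoseconds.
 */
uint64_t effective_rate_bps(uint64_t rate_bps, unsigned pkt_bytes,
                            unsigned tick_ns)
{
    assert(rate_bps > 0 && tick_ns > 0);

    uint64_t bits = (uint64_t)pkt_bytes * 8;
    /* exact time would be bits / rate_bps seconds; keep whole ticks only */
    uint64_t ticks = bits * 1000000000ULL / (rate_bps * tick_ns);

    if (ticks == 0)
        return 0; /* a zero-tick cell: the packet is not charged at all */

    /* the shaper releases one packet every 'ticks' ticks */
    return bits * 1000000000ULL / (ticks * tick_ns);
}
```

With 800-byte packets and a 1000 ns tick, configured rates of 540Mbit
and 580Mbit both round down to 11 ticks per packet and so give the same
effective rate of about 582Mbit, while 590Mbit drops to 10 ticks and
gives 640Mbit. The real step positions depend on the actual tick length
and on the packet length the kernel accounts for.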

Thanks,
Jarek P.

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: HTB accuracy for high speed
  2009-05-18 10:39     ` Antonio Almeida
@ 2009-05-18 11:14       ` Jarek Poplawski
  2009-05-18 12:05         ` Antonio Almeida
  0 siblings, 1 reply; 104+ messages in thread
From: Jarek Poplawski @ 2009-05-18 11:14 UTC (permalink / raw)
  To: Antonio Almeida; +Cc: netdev, kaber, davem, devik

On Mon, May 18, 2009 at 11:39:02AM +0100, Antonio Almeida wrote:
> Hi!
> 
> Here the information you asked:

Very nice, but two remarks:
- if this analyser uses TCP, we definitely need TSO off as well during
  these tests,
- it would be nice to apply the two patches I've sent, to rule out the
  causes known so far.

With the above I expect the accuracy to be better, but definitely not
as good as with HFSC (and the reported rate should no longer go above
1000Mbit after the traffic stops).

Thanks,
Jarek P.

> 
> # ethtool -k eth0
> Offload parameters for eth0:
> rx-checksumming: on
> tx-checksumming: on
> scatter-gather: on
> tcp segmentation offload: on
> udp fragmentation offload: off
> generic segmentation offload: off
> 
> # ethtool -k eth1
> Offload parameters for eth1:
> rx-checksumming: on
> tx-checksumming: on
> scatter-gather: on
> tcp segmentation offload: on
> udp fragmentation offload: off
> generic segmentation offload: off
> 
> The bridge is between eth0 and eth1
...

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: HTB accuracy for high speed
  2009-05-18 11:14       ` Jarek Poplawski
@ 2009-05-18 12:05         ` Antonio Almeida
  0 siblings, 0 replies; 104+ messages in thread
From: Antonio Almeida @ 2009-05-18 12:05 UTC (permalink / raw)
  To: Jarek Poplawski; +Cc: netdev, kaber, davem, devik

The analyser traffic is TCP. With TSO off, the accuracy stays the same:

# ethtool -K eth0 tso off
# ethtool -K eth1 tso off

# ethtool -k eth0
Offload parameters for eth0:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp segmentation offload: off
udp fragmentation offload: off
generic segmentation offload: off

# ethtool -k eth1
Offload parameters for eth1:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp segmentation offload: off
udp fragmentation offload: off
generic segmentation offload: off


# tc -s -d class ls dev eth1 | head -24
class htb 1:10 parent 1:2 rate 900000Kbit ceil 900000Kbit burst
113962b/8 mpu 0b overhead 0b cburst 113962b/8 mpu 0b overhead 0b level
5
 Sent 164938012460 bytes 206824215 pkt (dropped 0, overlimits 0 requeues 0)
 rate 652715Kbit 97655pps backlog 0b 0p requeues 0
 lended: 0 borrowed: 0 giants: 0
 tokens: 402 ctokens: 402

class htb 1:1 root rate 900000Kbit ceil 900000Kbit burst 113962b/8 mpu
0b overhead 0b cburst 113962b/8 mpu 0b overhead 0b level 7
 Sent 164938012460 bytes 206824215 pkt (dropped 0, overlimits 0 requeues 0)
 rate 652715Kbit 97655pps backlog 0b 0p requeues 0
 lended: 0 borrowed: 0 giants: 0
 tokens: 402 ctokens: 402

class htb 1:2 parent 1:1 rate 900000Kbit ceil 900000Kbit burst
113962b/8 mpu 0b overhead 0b cburst 113962b/8 mpu 0b overhead 0b level
6
 Sent 164938012460 bytes 206824215 pkt (dropped 0, overlimits 0 requeues 0)
 rate 652715Kbit 97655pps backlog 0b 0p requeues 0
 lended: 0 borrowed: 0 giants: 0
 tokens: 402 ctokens: 402

class htb 1:108 parent 1:10 leaf 108: prio 7 quantum 1514 rate
555000Kbit ceil 555000Kbit burst 70901b/8 mpu 0b overhead 0b cburst
70901b/8 mpu 0b overhead 0b level 0
 Sent 164938040048 bytes 206824248 pkt (dropped 25827911, overlimits 0
requeues 0)
 rate 652715Kbit 97655pps backlog 0b 33p requeues 0
 lended: 206824215 borrowed: 0 giants: 0
 tokens: -6 ctokens: -6


I'm applying the patches now. I'll get back to you.

  Antonio Almeida



On Mon, May 18, 2009 at 12:14 PM, Jarek Poplawski <jarkao2@gmail.com> wrote:
> On Mon, May 18, 2009 at 11:39:02AM +0100, Antonio Almeida wrote:
>> Hi!
>>
>> Here the information you asked:
>
> Very nice, but there are some questions:
> - if this analyser uses tcp we definitely need tso off as well during
>  these tests,
> - it would be nice to use two patches I've sent to exclude known (now)
>  reasons.
>
> With the above I expect accuracy should be better, but definitely not
> like hfsc (plus no higher than 1000Mbit rate reported after stopping
> effect).
>
> Thanks,
> Jarek P.
>
>>
>> # ethtool -k eth0
>> Offload parameters for eth0:
>> rx-checksumming: on
>> tx-checksumming: on
>> scatter-gather: on
>> tcp segmentation offload: on
>> udp fragmentation offload: off
>> generic segmentation offload: off
>>
>> # ethtool -k eth1
>> Offload parameters for eth1:
>> rx-checksumming: on
>> tx-checksumming: on
>> scatter-gather: on
>> tcp segmentation offload: on
>> udp fragmentation offload: off
>> generic segmentation offload: off
>>
>> The bridge is between eth0 and eth1
> ...
>

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: HTB accuracy for high speed
  2009-05-18 10:45       ` Jarek Poplawski
@ 2009-05-18 12:27         ` Antonio Almeida
  2009-05-18 12:32           ` Jarek Poplawski
  0 siblings, 1 reply; 104+ messages in thread
From: Antonio Almeida @ 2009-05-18 12:27 UTC (permalink / raw)
  To: Jarek Poplawski
  Cc: Stephen Hemminger, netdev, kaber, davem, devik, Vladimir Ivashchenko

> Looks great! But, since HFSC uses rates directly (without rate tables)

I'm not very familiar with how rate tables are used.
In fact, I keep wondering about a lot of things the kernel does with
packets. Is there any documentation explaining how queueing disciplines
work and how they interact with netfilter and tc_core? What about
packet dispatching?

Thanks
  Antonio Almeida

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: HTB accuracy for high speed
  2009-05-18 12:27         ` Antonio Almeida
@ 2009-05-18 12:32           ` Jarek Poplawski
  0 siblings, 0 replies; 104+ messages in thread
From: Jarek Poplawski @ 2009-05-18 12:32 UTC (permalink / raw)
  To: Antonio Almeida
  Cc: Stephen Hemminger, netdev, kaber, davem, devik, Vladimir Ivashchenko

On Mon, May 18, 2009 at 01:27:08PM +0100, Antonio Almeida wrote:
> > Looks great! But, since HFSC uses rates directly (without rate tables)
> 
> This matter about the use of rate tables is not very familiar to me.
> In fact I keep wondering a lot of things about what kernel does with
> packets. Is there any documentation explaining how queue disciplines
> work and how it interacts with netfilter and tc_core? What about
> packets dispatching?

Here are a few links:
http://yesican.chsoft.biz/lartc/index.html

Jarek P.

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: HTB accuracy for high speed
  2009-05-16 14:14   ` Jarek Poplawski
@ 2009-05-18 14:36     ` Antonio Almeida
  2009-05-18 23:14       ` Vladimir Ivashchenko
  2009-05-18 16:40     ` HTB accuracy for high speed Eric Dumazet
  1 sibling, 1 reply; 104+ messages in thread
From: Antonio Almeida @ 2009-05-18 14:36 UTC (permalink / raw)
  To: Jarek Poplawski; +Cc: netdev, kaber, davem, devik

This patch works perfectly!
The rate (bits/s) now decreases along with pps when I stop the traffic
(it no longer grows as it used to for rates over 500Mbit/s).

# tc -s -d class ls dev eth1 | head -21 | tail -1
 rate 651960Kbit 97482pps backlog 0b 0p requeues 0
 rate 541134Kbit 80911pps backlog 0b 0p requeues 0
 rate 405850Kbit 60683pps backlog 0b 0p requeues 0
 rate 304388Kbit 45512pps backlog 0b 0p requeues 0
 rate 304388Kbit 45512pps backlog 0b 0p requeues 0
 rate 228291Kbit 34134pps backlog 0b 0p requeues 0
 rate 171218Kbit 25601pps backlog 0b 0p requeues 0
 rate 171218Kbit 25601pps backlog 0b 0p requeues 0
 rate 128414Kbit 19201pps backlog 0b 0p requeues 0
 rate 96310Kbit 14400pps backlog 0b 0p requeues 0
 rate 96310Kbit 14400pps backlog 0b 0p requeues 0
 rate 72233Kbit 10800pps backlog 0b 0p requeues 0
 rate 54174Kbit 8100pps backlog 0b 0p requeues 0


Thanks to you!
  Antonio Almeida




On Sat, May 16, 2009 at 3:14 PM, Jarek Poplawski <jarkao2@gmail.com> wrote:
> On Fri, May 15, 2009 at 03:49:31PM +0100, Antonio Almeida wrote:
> ...
>> I also note that, for HTB rate configurations over 500Mbit/s on leaf
>> class, when I stop the traffic, in the output of "tc -s -d class ls
>> dev eth1" command, I see that leaf's rate (in bits/s) is growing
>> instead of decreasing (as expected since I've stopped the traffic).
>> Rate in pps is ok and decreases until 0pps. Rate in bits/s increases
>> above 1000Mbit and stays there for a few minutes. After two or three
>> minutes it becomes 0bit. The same happens for it's ancestors (also for
>> root class).Here's tc output of my leaf class for this situation:
>>
>> class htb 1:108 parent 1:10 leaf 108: prio 7 quantum 1514 rate
>> 555000Kbit ceil 555000Kbit burst 70901b/8 mpu 0b overhead 0b cburst
>> 70901b/8 mpu 0b overhead 0b level 0
>>  Sent 120267768144 bytes 242475339 pkt (dropped 62272599, overlimits 0
>> requeues 0)
>>  rate 1074Mbit 0pps backlog 0b 0p requeues 0
>>  lended: 242475339 borrowed: 0 giants: 0
>>  tokens: 8 ctokens: 8
>
> This looks like a regular bug. I guess it's an overflow in
> gen_estimator(), but I'm not sure there is nothing more. Could you
> try the patch below? (An offset warning when patching 2.6.25 is OK)
>
> Thanks,
> Jarek P.
> ---
>
>  net/core/gen_estimator.c |    6 +++++-
>  1 files changed, 5 insertions(+), 1 deletions(-)
>
> diff --git a/net/core/gen_estimator.c b/net/core/gen_estimator.c
> index 9cc9f95..87f0ced 100644
> --- a/net/core/gen_estimator.c
> +++ b/net/core/gen_estimator.c
> @@ -127,7 +127,11 @@ static void est_timer(unsigned long arg)
>                npackets = e->bstats->packets;
>                rate = (nbytes - e->last_bytes)<<(7 - idx);
>                e->last_bytes = nbytes;
> -               e->avbps += ((long)rate - (long)e->avbps) >> e->ewma_log;
> +               if (rate > e->avbps)
> +                       e->avbps += (rate - e->avbps) >> e->ewma_log;
> +               else
> +                       e->avbps -= (e->avbps - rate) >> e->ewma_log;
> +
>                e->rate_est->bps = (e->avbps+0xF)>>5;
>
>                rate = (npackets - e->last_packets)<<(12 - idx);
>

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: HTB accuracy for high speed
  2009-05-18 10:01     ` Antonio Almeida
  2009-05-18 10:45       ` Jarek Poplawski
@ 2009-05-18 16:13       ` Stephen Hemminger
  2009-05-18 18:03         ` Antonio Almeida
  2009-05-18 22:02         ` Stephen Hemminger
  1 sibling, 2 replies; 104+ messages in thread
From: Stephen Hemminger @ 2009-05-18 16:13 UTC (permalink / raw)
  To: Antonio Almeida; +Cc: netdev, jarkao2, kaber, davem, devik

On Mon, 18 May 2009 11:01:21 +0100
Antonio Almeida <vexwek@gmail.com> wrote:

> Hi!
> 
> cat /sys/devices/system/clocksource/clocksource0/current_clocksource
> returns "jiffies"

That is the slowest of the choices. Better ones are hpet and tsc, but your
hardware doesn't support them.

You should compile your kernel with HZ=1000 and the resolution will be better
(but with some loss of performance).
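
As a rough illustration of what timer resolution means at these rates,
the sketch below computes how many bytes pass at a given rate during one
jiffy. It assumes the shaper can only observe time in whole jiffies,
which is what a "jiffies" clocksource implies; the helper name
bytes_per_jiffy is hypothetical, and the thread does not say which HZ
the kernel was built with.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes transmitted at rate_bit (bit/s) during one jiffy of 1/hz seconds. */
uint64_t bytes_per_jiffy(uint64_t rate_bit, unsigned hz)
{
    assert(hz > 0);
    return rate_bit / 8 / hz;
}
```

For a 555Mbit class this gives 277500 bytes per jiffy at HZ=250 and
69375 bytes per jiffy at HZ=1000, so a higher HZ lets the shaper correct
the rate four times as often.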

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: HTB accuracy for high speed
  2009-05-16 14:14   ` Jarek Poplawski
  2009-05-18 14:36     ` Antonio Almeida
@ 2009-05-18 16:40     ` Eric Dumazet
  2009-05-18 17:23       ` Jarek Poplawski
  1 sibling, 1 reply; 104+ messages in thread
From: Eric Dumazet @ 2009-05-18 16:40 UTC (permalink / raw)
  To: Jarek Poplawski; +Cc: Antonio Almeida, netdev, kaber, davem, devik

Jarek Poplawski wrote:
> On Fri, May 15, 2009 at 03:49:31PM +0100, Antonio Almeida wrote:
> ...
>> I also note that, for HTB rate configurations over 500Mbit/s on leaf
>> class, when I stop the traffic, in the output of "tc -s -d class ls
>> dev eth1" command, I see that leaf's rate (in bits/s) is growing
>> instead of decreasing (as expected since I've stopped the traffic).
>> Rate in pps is ok and decreases until 0pps. Rate in bits/s increases
>> above 1000Mbit and stays there for a few minutes. After two or three
>> minutes it becomes 0bit. The same happens for it's ancestors (also for
>> root class).Here's tc output of my leaf class for this situation:
>>
>> class htb 1:108 parent 1:10 leaf 108: prio 7 quantum 1514 rate
>> 555000Kbit ceil 555000Kbit burst 70901b/8 mpu 0b overhead 0b cburst
>> 70901b/8 mpu 0b overhead 0b level 0
>>  Sent 120267768144 bytes 242475339 pkt (dropped 62272599, overlimits 0
>> requeues 0)
>>  rate 1074Mbit 0pps backlog 0b 0p requeues 0
>>  lended: 242475339 borrowed: 0 giants: 0
>>  tokens: 8 ctokens: 8
> 
> This looks like a regular bug. I guess it's an overflow in
> gen_estimator(), but I'm not sure there is nothing more. Could you
> try the patch below? (An offset warning when patching 2.6.25 is OK)
> 
> Thanks,
> Jarek P.
> ---
> 
>  net/core/gen_estimator.c |    6 +++++-
>  1 files changed, 5 insertions(+), 1 deletions(-)
> 
> diff --git a/net/core/gen_estimator.c b/net/core/gen_estimator.c
> index 9cc9f95..87f0ced 100644
> --- a/net/core/gen_estimator.c
> +++ b/net/core/gen_estimator.c
> @@ -127,7 +127,11 @@ static void est_timer(unsigned long arg)
>  		npackets = e->bstats->packets;
>  		rate = (nbytes - e->last_bytes)<<(7 - idx);
>  		e->last_bytes = nbytes;
> -		e->avbps += ((long)rate - (long)e->avbps) >> e->ewma_log;
> +		if (rate > e->avbps)
> +			e->avbps += (rate - e->avbps) >> e->ewma_log;
> +		else
> +			e->avbps -= (e->avbps - rate) >> e->ewma_log;
> +
>  		e->rate_est->bps = (e->avbps+0xF)>>5;
>  
>  		rate = (npackets - e->last_packets)<<(12 - idx);

With a typical estimator "1sec 8sec", the ewma_log value is 3.

At gigabit speeds we are indeed very close to overflow, since
we only have 27 bits available, i.e. 134217728 bytes per second
or 1073741824 bits per second.

So the formula:
e->avbps += ((long)rate - (long)e->avbps) >> e->ewma_log;
is going to overflow.
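
A minimal user-space sketch of this overflow, assuming a 32-bit long
(modelled here with int32_t) and an average stored as bytes per second
scaled by 2^5, as in the quoted code. The function names ewma_signed32
and ewma_branch are hypothetical; the second one follows the if/else
form of the patch quoted above.

```c
#include <assert.h>
#include <stdint.h>

/* Update with signed 32-bit arithmetic, as in the original formula. */
uint32_t ewma_signed32(uint32_t avbps, uint32_t rate, int ewma_log)
{
    assert(ewma_log >= 0 && ewma_log < 32);
    /* the difference wraps once avbps or rate reaches 2^31 */
    int32_t diff = (int32_t)(rate - avbps);
    /* right shift of a negative value is arithmetic on common compilers */
    return avbps + (uint32_t)(diff >> ewma_log);
}

/* Update with unsigned arithmetic and an explicit branch. */
uint32_t ewma_branch(uint32_t avbps, uint32_t rate, int ewma_log)
{
    assert(ewma_log >= 0 && ewma_log < 32);
    if (rate > avbps)
        return avbps + ((rate - avbps) >> ewma_log);
    return avbps - ((avbps - rate) >> ewma_log);
}
```

With avbps = 0x90000000 (about 604Mbit/s after descaling) and rate = 0,
the signed version moves the average up to 0x9E000000, while the
branching version moves it down to 0x7E000000. Below 2^31 both give the
same result.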

One way to avoid the overflow would be to use a smaller estimator, like "500ms 4sec".

Or use a 64-bit rate & avbps; this is needed for 10Gb speeds, I suppose...

diff --git a/net/core/gen_estimator.c b/net/core/gen_estimator.c
index 9cc9f95..150e2f5 100644
--- a/net/core/gen_estimator.c
+++ b/net/core/gen_estimator.c
@@ -86,9 +86,9 @@ struct gen_estimator
 	spinlock_t		*stats_lock;
 	int			ewma_log;
 	u64			last_bytes;
+	u64			avbps;
 	u32			last_packets;
 	u32			avpps;
-	u32			avbps;
 	struct rcu_head		e_rcu;
 	struct rb_node		node;
 };
@@ -115,6 +115,7 @@ static void est_timer(unsigned long arg)
 	rcu_read_lock();
 	list_for_each_entry_rcu(e, &elist[idx].list, list) {
 		u64 nbytes;
+		u64 brate;
 		u32 npackets;
 		u32 rate;
 
@@ -125,9 +126,9 @@ static void est_timer(unsigned long arg)
 
 		nbytes = e->bstats->bytes;
 		npackets = e->bstats->packets;
-		rate = (nbytes - e->last_bytes)<<(7 - idx);
+		brate = (nbytes - e->last_bytes)<<(7 - idx);
 		e->last_bytes = nbytes;
-		e->avbps += ((long)rate - (long)e->avbps) >> e->ewma_log;
+		e->avbps += ((s64)(brate - e->avbps)) >> e->ewma_log;
 		e->rate_est->bps = (e->avbps+0xF)>>5;
 
 		rate = (npackets - e->last_packets)<<(12 - idx);
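
The effect of the wider types can be sketched in user space as follows.
This is only an illustration under the same assumption as the diff (the
average is kept as bytes per second scaled by 2^5); the names
ewma_step64 and scaled_to_bps are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* One averaging step with 64-bit state. */
uint64_t ewma_step64(uint64_t avbps, uint64_t brate, int ewma_log)
{
    assert(ewma_log >= 0 && ewma_log < 63);
    /* the signed difference cannot wrap for any realistic byte rate */
    int64_t diff = (int64_t)(brate - avbps);
    /* right shift of a negative value is arithmetic on common compilers */
    return avbps + (uint64_t)(diff >> ewma_log);
}

/* Convert the scaled average back to bytes per second. */
uint64_t scaled_to_bps(uint64_t avbps)
{
    return (avbps + 0xF) >> 5;
}
```

A 10Gbit/s stream is 1250000000 bytes per second, which scaled by 2^5
is 40000000000 and no longer fits in 32 bits. With 64-bit state the
average can both rise towards that value and decay from it without
wrapping.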


^ permalink raw reply related	[flat|nested] 104+ messages in thread

* Re: [PATCH iproute2] Re: HTB accuracy for high speed
  2009-05-18  6:56     ` [PATCH iproute2] " Jarek Poplawski
@ 2009-05-18 16:54       ` Antonio Almeida
  2009-05-18 17:16         ` Antonio Almeida
  2009-05-18 17:53         ` Jarek Poplawski
  0 siblings, 2 replies; 104+ messages in thread
From: Antonio Almeida @ 2009-05-18 16:54 UTC (permalink / raw)
  To: Jarek Poplawski; +Cc: Stephen Hemminger, netdev, kaber, davem, devik

I'm not sure I'm able to test this patch. What do you mean by
"smallest sizes"? Are you talking about packet sizes? What kind of
sizes?
When I feed my bridge with 950Mbit/s of 800-byte packets, that is
close to 150.000pps and the CPUs start to get busy. With 100-byte
packets, 150.000pps is only close to 125Mbit/s and the CPUs are
already getting busy, so I can't get close to 500Mbit/s. For rates
near 125Mbit/s the inaccuracy is not so pronounced. With 100-byte
packets, as I increase the traffic sent by the analyser, at some point
it is no longer HTB that limits the rate but the CPU, which can't
process that many packets. I might have misunderstood your point.
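
The packet-rate arithmetic above can be checked with a small sketch.
It ignores framing overhead, and the helper names packets_per_second
and bits_per_second are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Packets per second for a bit rate and a fixed packet size in bytes. */
uint64_t packets_per_second(uint64_t rate_bit, unsigned pkt_bytes)
{
    assert(pkt_bytes > 0);
    return rate_bit / (8ULL * pkt_bytes);
}

/* Bit rate for a packet rate and a fixed packet size in bytes. */
uint64_t bits_per_second(uint64_t pps, unsigned pkt_bytes)
{
    return pps * 8ULL * pkt_bytes;
}
```

950Mbit/s of 800-byte packets comes out at 148437 packets per second,
and 150000 packets per second of 100-byte packets at 120Mbit/s, in line
with the rounded figures in the text.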

I applied this tc_core.c patch and for packets of 800 bytes it had no
effect on HTB accuracy with rates over 500Mbit.
Anyway, I also tested it with packets of 100 bytes, generating 200Mbit/s,
and the result is the same as without this patch:

With the patch:
class htb 1:108 parent 1:10 leaf 108: prio 7 quantum 1514 rate
100000Kbit ceil 100000Kbit burst 14087b/8 mpu 0b overhead 0b cburst
14087b/8 mpu 0b overhead 0b level 0
 Sent 2187884640 bytes 22790465 pkt (dropped 8624566, overlimits 0 requeues 0)
 rate 124946Kbit 162691pps backlog 0b 0p requeues 0
 lended: 22790465 borrowed: 0 giants: 0
 tokens: 180 ctokens: 180


Without the patch:
class htb 1:108 parent 1:10 leaf 108: prio 7 quantum 1514 rate
100000Kbit ceil 100000Kbit burst 14087b/8 mpu 0b overhead 0b cburst
14087b/8 mpu 0b overhead 0b level 0
 Sent 1260235680 bytes 13127455 pkt (dropped 4531299, overlimits 0 requeues 0)
 rate 124575Kbit 162207pps backlog 0b 0p requeues 0
 lended: 13127455 borrowed: 0 giants: 0
 tokens: 123 ctokens: 123


Thanks
  Antonio Almeida


On Mon, May 18, 2009 at 7:56 AM, Jarek Poplawski <jarkao2@gmail.com> wrote:
> Return non-zero tc_calc_xmittime() for rate tables
>
> While looking at the problem of HTB accuracy for high speed (~500Mbit
> rates) I've found that rate tables have cells filled with zeros for
> the smallest sizes. It means such packets aren't accounted at all.
> Apart from the correctness of such configs, let's make it safe with
> rather overaccounting than living it unlimited.
>
> Reported-by: Antonio Almeida <vexwek@gmail.com>
> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
> ---
>
>  tc/tc_core.c |    4 +++-
>  1 files changed, 3 insertions(+), 1 deletions(-)
>
> diff --git a/tc/tc_core.c b/tc/tc_core.c
> index 9a0ff39..14f25bc 100644
> --- a/tc/tc_core.c
> +++ b/tc/tc_core.c
> @@ -58,7 +58,9 @@ unsigned tc_core_ktime2time(unsigned ktime)
>
>  unsigned tc_calc_xmittime(unsigned rate, unsigned size)
>  {
> -       return tc_core_time2tick(TIME_UNITS_PER_SEC*((double)size/rate));
> +       unsigned t;
> +       t = tc_core_time2tick(TIME_UNITS_PER_SEC*((double)size/rate));
> +       return t ? : 1;
>  }
>
>  unsigned tc_calc_xmitsize(unsigned rate, unsigned ticks)
>
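
To make the "cells filled with zeros" case concrete, here is a minimal
user-space sketch of the quoted change. It is not the iproute2 code:
time2tick is a hypothetical stand-in for tc_core_time2tick(), the
ticks_per_unit factor is an assumed parameter (the real factor comes
from the running kernel), the rate is assumed to be in bytes per
second, and one time unit is assumed to be one microsecond.

```c
#include <assert.h>

#define TIME_UNITS_PER_SEC 1000000.0 /* assumption: one time unit is 1 us */

/* Hypothetical stand-in for tc_core_time2tick(): scale and round down. */
static unsigned time2tick(double time_units, double ticks_per_unit)
{
    return (unsigned)(time_units * ticks_per_unit);
}

/* Transmission time in ticks, as before the patch. */
unsigned xmittime_unpatched(unsigned rate, unsigned size,
                            double ticks_per_unit)
{
    assert(rate > 0);
    return time2tick(TIME_UNITS_PER_SEC * ((double)size / rate),
                     ticks_per_unit);
}

/* Same, but never returns zero, so every packet costs at least one tick. */
unsigned xmittime_patched(unsigned rate, unsigned size,
                          double ticks_per_unit)
{
    unsigned t = xmittime_unpatched(rate, size, ticks_per_unit);
    return t ? t : 1;
}
```

At 555Mbit (69375000 bytes per second) and one tick per microsecond, an
8-byte cell takes about 0.12 us, which rounds down to 0 ticks without
the patch and becomes 1 tick with it. An 800-byte packet takes 11 ticks
either way, which fits the observation that the patch does not change
the results for 800-byte packets.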

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH iproute2] Re: HTB accuracy for high speed
  2009-05-18 16:54       ` Antonio Almeida
@ 2009-05-18 17:16         ` Antonio Almeida
  2009-05-21  8:51           ` Jarek Poplawski
  2009-05-18 17:53         ` Jarek Poplawski
  1 sibling, 1 reply; 104+ messages in thread
From: Antonio Almeida @ 2009-05-18 17:16 UTC (permalink / raw)
  To: Jarek Poplawski; +Cc: Stephen Hemminger, netdev, kaber, davem, devik

I forgot to tell you that I used tc source code from iproute2-2.6.16.
I couldn't use the newest version because I got errors when compiling.

  Antonio Almeida


On Mon, May 18, 2009 at 5:54 PM, Antonio Almeida <vexwek@gmail.com> wrote:
> I'm not sure if I'm able to test this patch. What do you mean with
> "smallest sizes"? Are you talking about packet's size? What kind of
> sizes?
> When I feed my bridge with 950Mbits/s of packets with 800 bytes that
> is close to 150.000pps and CPUs start to get busy. For packets 100
> bytes long, 150.000pps would be close to 125Mbits/s and CPUs start to
> get busy already, so I'm not able to get close to 500Mbits/s. For
> rates near 125bits/s the bad accuracy is not so expressive. For
> packets of 100 bytes increasing analyser sent traffic, at some point
> is not HTB shaping but the CPU that can't process so many packets. I
> might misunderstood your point.
>
> I applied this tc_core.c patch and for packets of 800 bytes it had no
> effect in HTB accuracy with rates over 500Mbit.
> Anyway I also test it with packets of 100 bytes, generating 200Mbits,
> and the result is the same as without this patch:
>
> With the patch:
> class htb 1:108 parent 1:10 leaf 108: prio 7 quantum 1514 rate
> 100000Kbit ceil 100000Kbit burst 14087b/8 mpu 0b overhead 0b cburst
> 14087b/8 mpu 0b overhead 0b level 0
>  Sent 2187884640 bytes 22790465 pkt (dropped 8624566, overlimits 0 requeues 0)
>  rate 124946Kbit 162691pps backlog 0b 0p requeues 0
>  lended: 22790465 borrowed: 0 giants: 0
>  tokens: 180 ctokens: 180
>
>
> Without the patch:
> class htb 1:108 parent 1:10 leaf 108: prio 7 quantum 1514 rate
> 100000Kbit ceil 100000Kbit burst 14087b/8 mpu 0b overhead 0b cburst
> 14087b/8 mpu 0b overhead 0b level 0
>  Sent 1260235680 bytes 13127455 pkt (dropped 4531299, overlimits 0 requeues 0)
>  rate 124575Kbit 162207pps backlog 0b 0p requeues 0
>  lended: 13127455 borrowed: 0 giants: 0
>  tokens: 123 ctokens: 123
>
>
> Thanks
>  Antonio Almeida
>
>
> On Mon, May 18, 2009 at 7:56 AM, Jarek Poplawski <jarkao2@gmail.com> wrote:
>> Return non-zero tc_calc_xmittime() for rate tables
>>
>> While looking at the problem of HTB accuracy for high speed (~500Mbit
>> rates) I've found that rate tables have cells filled with zeros for
>> the smallest sizes. It means such packets aren't accounted at all.
>> Apart from the correctness of such configs, let's make it safe with
>> rather overaccounting than living it unlimited.
>>
>> Reported-by: Antonio Almeida <vexwek@gmail.com>
>> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
>> ---
>>
>>  tc/tc_core.c |    4 +++-
>>  1 files changed, 3 insertions(+), 1 deletions(-)
>>
>> diff --git a/tc/tc_core.c b/tc/tc_core.c
>> index 9a0ff39..14f25bc 100644
>> --- a/tc/tc_core.c
>> +++ b/tc/tc_core.c
>> @@ -58,7 +58,9 @@ unsigned tc_core_ktime2time(unsigned ktime)
>>
>>  unsigned tc_calc_xmittime(unsigned rate, unsigned size)
>>  {
>> -       return tc_core_time2tick(TIME_UNITS_PER_SEC*((double)size/rate));
>> +       unsigned t;
>> +       t = tc_core_time2tick(TIME_UNITS_PER_SEC*((double)size/rate));
>> +       return t ? : 1;
>>  }
>>
>>  unsigned tc_calc_xmitsize(unsigned rate, unsigned ticks)
>>
>

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: HTB accuracy for high speed
  2009-05-18 16:40     ` HTB accuracy for high speed Eric Dumazet
@ 2009-05-18 17:23       ` Jarek Poplawski
  2009-05-18 21:52         ` David Miller
  0 siblings, 1 reply; 104+ messages in thread
From: Jarek Poplawski @ 2009-05-18 17:23 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Antonio Almeida, netdev, kaber, davem, devik

On Mon, May 18, 2009 at 06:40:56PM +0200, Eric Dumazet wrote:
> Jarek Poplawski wrote:
> > On Fri, May 15, 2009 at 03:49:31PM +0100, Antonio Almeida wrote:
> > ...
> >> I also note that, for HTB rate configurations over 500Mbit/s on leaf
> >> class, when I stop the traffic, in the output of "tc -s -d class ls
> >> dev eth1" command, I see that leaf's rate (in bits/s) is growing
> >> instead of decreasing (as expected since I've stopped the traffic).
> >> Rate in pps is ok and decreases until 0pps. Rate in bits/s increases
> >> above 1000Mbit and stays there for a few minutes. After two or three
> >> minutes it becomes 0bit. The same happens for it's ancestors (also for
> >> root class).Here's tc output of my leaf class for this situation:
> >>
> >> class htb 1:108 parent 1:10 leaf 108: prio 7 quantum 1514 rate
> >> 555000Kbit ceil 555000Kbit burst 70901b/8 mpu 0b overhead 0b cburst
> >> 70901b/8 mpu 0b overhead 0b level 0
> >>  Sent 120267768144 bytes 242475339 pkt (dropped 62272599, overlimits 0
> >> requeues 0)
> >>  rate 1074Mbit 0pps backlog 0b 0p requeues 0
> >>  lended: 242475339 borrowed: 0 giants: 0
> >>  tokens: 8 ctokens: 8
> > 
> > This looks like a regular bug. I guess it's an overflow in
> > gen_estimator(), but I'm not sure there is nothing more. Could you
> > try the patch below? (An offset warning when patching 2.6.25 is OK)
> > 
> > Thanks,
> > Jarek P.
> > ---
> > 
> >  net/core/gen_estimator.c |    6 +++++-
> >  1 files changed, 5 insertions(+), 1 deletions(-)
> > 
> > diff --git a/net/core/gen_estimator.c b/net/core/gen_estimator.c
> > index 9cc9f95..87f0ced 100644
> > --- a/net/core/gen_estimator.c
> > +++ b/net/core/gen_estimator.c
> > @@ -127,7 +127,11 @@ static void est_timer(unsigned long arg)
> >  		npackets = e->bstats->packets;
> >  		rate = (nbytes - e->last_bytes)<<(7 - idx);
> >  		e->last_bytes = nbytes;
> > -		e->avbps += ((long)rate - (long)e->avbps) >> e->ewma_log;
> > +		if (rate > e->avbps)
> > +			e->avbps += (rate - e->avbps) >> e->ewma_log;
> > +		else
> > +			e->avbps -= (e->avbps - rate) >> e->ewma_log;
> > +
> >  		e->rate_est->bps = (e->avbps+0xF)>>5;
> >  
> >  		rate = (npackets - e->last_packets)<<(12 - idx);
> 
> With a typical estimator "1sec 8sec", ewma_log value is 3
> 
> At gigabit speeds, we are very close to overflow yes, since
> we only have 27 bits available, so 134217728 bytes per second
> or 1073741824 bits per second.
> 
> So formula :
> e->avbps += ((long)rate - (long)e->avbps) >> e->ewma_log;
> is going to overflow.
> 
> One way to avoid the overflow would be to use a smaller estimator, like "500ms 4sec" 
> 
> Or use a 64bits rate & avbps, this is needed fo 10Gb speeds I suppose...

Yes, I considered this too, but because of the overhead I decided to
fix it as designed (according to the comment) for now. But you are
probably right, and we should go further, so I'm OK with your patch.

Jarek P.

> 
> diff --git a/net/core/gen_estimator.c b/net/core/gen_estimator.c
> index 9cc9f95..150e2f5 100644
> --- a/net/core/gen_estimator.c
> +++ b/net/core/gen_estimator.c
> @@ -86,9 +86,9 @@ struct gen_estimator
>  	spinlock_t		*stats_lock;
>  	int			ewma_log;
>  	u64			last_bytes;
> +	u64			avbps;
>  	u32			last_packets;
>  	u32			avpps;
> -	u32			avbps;
>  	struct rcu_head		e_rcu;
>  	struct rb_node		node;
>  };
> @@ -115,6 +115,7 @@ static void est_timer(unsigned long arg)
>  	rcu_read_lock();
>  	list_for_each_entry_rcu(e, &elist[idx].list, list) {
>  		u64 nbytes;
> +		u64 brate;
>  		u32 npackets;
>  		u32 rate;
>  
> @@ -125,9 +126,9 @@ static void est_timer(unsigned long arg)
>  
>  		nbytes = e->bstats->bytes;
>  		npackets = e->bstats->packets;
> -		rate = (nbytes - e->last_bytes)<<(7 - idx);
> +		brate = (nbytes - e->last_bytes)<<(7 - idx);
>  		e->last_bytes = nbytes;
> -		e->avbps += ((long)rate - (long)e->avbps) >> e->ewma_log;
> +		e->avbps += ((s64)(brate - e->avbps)) >> e->ewma_log;
>  		e->rate_est->bps = (e->avbps+0xF)>>5;
>  
>  		rate = (npackets - e->last_packets)<<(12 - idx);
> 
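
The symptom quoted above (bit rate climbing after the traffic stops, then
dropping to 0 after a few minutes) can be reproduced outside the kernel.
What follows is only a user-space model, not kernel code: int32_t stands
in for a 32-bit "long", the helper names are made up for illustration, and
it assumes the "1sec 8sec" estimator (ewma_log 3) plus the two's-complement
conversions and sign-extending >> that gcc provides. Whether the reporter's
kernel was a 32-bit build is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* One est_timer() step of the old formula, modelled for a 32-bit long.
 * Hypothetical helper: avbps and rate are bytes/sec scaled by 2^5. */
static uint32_t old_step(uint32_t avbps, uint32_t rate, int ewma_log)
{
	assert(ewma_log >= 0 && ewma_log < 31);
	/* (long)rate - (long)avbps, wrapped to 32 bits */
	int32_t diff = (int32_t)(rate - avbps);
	return avbps + (uint32_t)(diff >> ewma_log);
}

/* The same step with 64-bit state, as in the patch quoted above. */
static uint64_t new_step(uint64_t avbps, uint64_t brate, int ewma_log)
{
	assert(ewma_log >= 0 && ewma_log < 63);
	return avbps + (uint64_t)(((int64_t)(brate - avbps)) >> ewma_log);
}

/* Value reported to user space (bytes/sec) by the old 32-bit code. */
static uint32_t old_bps(uint32_t avbps)
{
	return (avbps + 0xF) >> 5;	/* the addition itself can wrap */
}

/* Run n idle ticks (no new bytes) starting from a scaled average. */
static uint32_t old_idle(uint32_t avbps, int n, int ewma_log)
{
	while (n-- > 0)
		avbps = old_step(avbps, 0, ewma_log);
	return avbps;
}
```

For 555000Kbit the scaled average is 69375000 bytes/s << 5 = 2220000000,
which is above 2^31. In this model one idle tick moves the old average up
to 2479370912, while the 64-bit version moves it down to 1942500000. The
old average keeps creeping towards 2^32, so the reported rate approaches
2^27 bytes/s (about 1074Mbit), and after at most about 160 ticks it settles
at 2^32 - 7, where avbps + 0xF wraps and the reported value becomes 0.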

* Re: [PATCH iproute2] Re: HTB accuracy for high speed
  2009-05-18 16:54       ` Antonio Almeida
  2009-05-18 17:16         ` Antonio Almeida
@ 2009-05-18 17:53         ` Jarek Poplawski
  2009-05-18 18:23           ` Antonio Almeida
  1 sibling, 1 reply; 104+ messages in thread
From: Jarek Poplawski @ 2009-05-18 17:53 UTC (permalink / raw)
  To: Antonio Almeida; +Cc: Stephen Hemminger, netdev, kaber, davem, devik

On Mon, May 18, 2009 at 05:54:18PM +0100, Antonio Almeida wrote:
> I'm not sure if I'm able to test this patch. What do you mean with
> "smallest sizes"? Are you talking about packet's size? What kind of
> sizes?
> When I feed my bridge with 950Mbits/s of packets with 800 bytes that
> is close to 150.000pps and CPUs start to get busy. For packets 100
> bytes long, 150.000pps would be close to 125Mbits/s and CPUs start to
> get busy already, so I'm not able to get close to 500Mbits/s. For
> rates near 125bits/s the bad accuracy is not so expressive. For
> packets of 100 bytes increasing analyser sent traffic, at some point
> is not HTB shaping but the CPU that can't process so many packets. I
> might misunderstood your point.
> 
> I applied this tc_core.c patch and for packets of 800 bytes it had no
> effect in HTB accuracy with rates over 500Mbit.
> Anyway I also test it with packets of 100 bytes, generating 200Mbits,
> and the result is the same as without this patch:

You're right: if there were only 800 byte packets this patch shouldn't
matter. It should matter e.g. if these 800 byte packets were mixed with
100 byte packets, rate 550Mbit, and HZ 1000. Btw., could you send your
.config (gzipped)? I guess I have to look for some other reason yet.

Thanks,
Jarek P.

> 
> With the patch:
> class htb 1:108 parent 1:10 leaf 108: prio 7 quantum 1514 rate
> 100000Kbit ceil 100000Kbit burst 14087b/8 mpu 0b overhead 0b cburst
> 14087b/8 mpu 0b overhead 0b level 0
>  Sent 2187884640 bytes 22790465 pkt (dropped 8624566, overlimits 0 requeues 0)
>  rate 124946Kbit 162691pps backlog 0b 0p requeues 0
>  lended: 22790465 borrowed: 0 giants: 0
>  tokens: 180 ctokens: 180
> 
> 
> Without the patch:
> class htb 1:108 parent 1:10 leaf 108: prio 7 quantum 1514 rate
> 100000Kbit ceil 100000Kbit burst 14087b/8 mpu 0b overhead 0b cburst
> 14087b/8 mpu 0b overhead 0b level 0
>  Sent 1260235680 bytes 13127455 pkt (dropped 4531299, overlimits 0 requeues 0)
>  rate 124575Kbit 162207pps backlog 0b 0p requeues 0
>  lended: 13127455 borrowed: 0 giants: 0
>  tokens: 123 ctokens: 123
> 
> 
> Thanks
>   Antonio Almeida
> 
> 
> On Mon, May 18, 2009 at 7:56 AM, Jarek Poplawski <jarkao2@gmail.com> wrote:
> > Return non-zero tc_calc_xmittime() for rate tables
> >
> > While looking at the problem of HTB accuracy for high speed (~500Mbit
> > rates) I've found that rate tables have cells filled with zeros for
> > the smallest sizes. It means such packets aren't accounted at all.
> > Apart from the correctness of such configs, let's make it safe with
> > rather overaccounting than living it unlimited.
> >
> > Reported-by: Antonio Almeida <vexwek@gmail.com>
> > Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
> > ---
> >
> >  tc/tc_core.c |    4 +++-
> >  1 files changed, 3 insertions(+), 1 deletions(-)
> >
> > diff --git a/tc/tc_core.c b/tc/tc_core.c
> > index 9a0ff39..14f25bc 100644
> > --- a/tc/tc_core.c
> > +++ b/tc/tc_core.c
> > @@ -58,7 +58,9 @@ unsigned tc_core_ktime2time(unsigned ktime)
> >
> >  unsigned tc_calc_xmittime(unsigned rate, unsigned size)
> >  {
> > -       return tc_core_time2tick(TIME_UNITS_PER_SEC*((double)size/rate));
> > +       unsigned t;
> > +       t = tc_core_time2tick(TIME_UNITS_PER_SEC*((double)size/rate));
> > +       return t ? : 1;
> >  }
> >
> >  unsigned tc_calc_xmitsize(unsigned rate, unsigned ticks)
> >
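
On the zero-cell point in the patch description quoted above: a rough
sketch of how truncation produces a zero cost for small sizes, and what
the clamp changes. The function names and the ticks_per_usec parameter are
hypothetical; the real scaling lives in tc_core_time2tick() and depends on
the kernel, so the numbers only illustrate the mechanism.

```c
#include <assert.h>

/* Hypothetical stand-in for tc_calc_xmittime(): rate in bytes/sec,
 * size in bytes, ticks_per_usec replaces tc_core_time2tick()'s scaling. */
static unsigned xmit_ticks(unsigned rate, unsigned size, double ticks_per_usec)
{
	assert(rate > 0);
	/* truncation towards zero: anything below one tick costs nothing */
	return (unsigned)(1000000.0 * ((double)size / rate) * ticks_per_usec);
}

/* Patched variant: never report a zero cost (the patch spells this
 * with the GNU "t ? : 1" shorthand). */
static unsigned xmit_ticks_nonzero(unsigned rate, unsigned size,
				   double ticks_per_usec)
{
	unsigned t = xmit_ticks(rate, size, ticks_per_usec);

	return t ? t : 1;
}
```

With an assumed resolution of one tick per microsecond and 550Mbit
(68750000 bytes/s), a 64 byte entry truncates to 0 ticks and is clamped to
1, while an 800 byte entry stays at 11 ticks either way. That matches the
remark that 800 byte packets alone should not be affected.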

* Re: HTB accuracy for high speed
  2009-05-18 16:13       ` Stephen Hemminger
@ 2009-05-18 18:03         ` Antonio Almeida
  2009-05-18 22:02         ` Stephen Hemminger
  1 sibling, 0 replies; 104+ messages in thread
From: Antonio Almeida @ 2009-05-18 18:03 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev, jarkao2, kaber, davem, devik

I have my kernel's timer frequency set to 1000Hz since the beginning.
I've got all these results with HZ_1000.

(I'm working on clocksource)

Thanks
  Antonio Almeida


On Mon, May 18, 2009 at 5:13 PM, Stephen Hemminger
<shemminger@vyatta.com> wrote:
> On Mon, 18 May 2009 11:01:21 +0100
> Antonio Almeida <vexwek@gmail.com> wrote:
>
>> Hi!
>>
>> cat /sys/devices/system/clocksource/clocksource0/current_clocksource
>> returns "jiffies"
>
> That is the slowest of the choices. Better ones are hpet and tsc, but you
> hardware doesn't support them.
>
> You should compile your kernel with HZ=1000 and the resolution will be better
> (but with some loss of performance).
>

* Re: [PATCH iproute2] Re: HTB accuracy for high speed
  2009-05-18 17:53         ` Jarek Poplawski
@ 2009-05-18 18:23           ` Antonio Almeida
  2009-05-18 18:32             ` Jarek Poplawski
  0 siblings, 1 reply; 104+ messages in thread
From: Antonio Almeida @ 2009-05-18 18:23 UTC (permalink / raw)
  To: Jarek Poplawski
  Cc: Stephen Hemminger, netdev, kaber, davem, devik, Eric Dumazet

[-- Attachment #1: Type: text/plain, Size: 3648 bytes --]

Here's my .config

  Antonio Almeida


On Mon, May 18, 2009 at 6:53 PM, Jarek Poplawski <jarkao2@gmail.com> wrote:
> On Mon, May 18, 2009 at 05:54:18PM +0100, Antonio Almeida wrote:
>> I'm not sure if I'm able to test this patch. What do you mean with
>> "smallest sizes"? Are you talking about packet's size? What kind of
>> sizes?
>> When I feed my bridge with 950Mbits/s of packets with 800 bytes that
>> is close to 150.000pps and CPUs start to get busy. For packets 100
>> bytes long, 150.000pps would be close to 125Mbits/s and CPUs start to
>> get busy already, so I'm not able to get close to 500Mbits/s. For
>> rates near 125bits/s the bad accuracy is not so expressive. For
>> packets of 100 bytes increasing analyser sent traffic, at some point
>> is not HTB shaping but the CPU that can't process so many packets. I
>> might misunderstood your point.
>>
>> I applied this tc_core.c patch and for packets of 800 bytes it had no
>> effect in HTB accuracy with rates over 500Mbit.
>> Anyway I also test it with packets of 100 bytes, generating 200Mbits,
>> and the result is the same as without this patch:
>
> You're right: if there were only 800 byte packets this patch shouldn't
> matter. It should matter e.g. if these 800 byte were mixed with 100
> byte packets, rate 550Mbit, and HZ 1000. Btw. if could you send your
> .config (gzipped)? I guess, I've to look for some other reason yet.
>
> Thanks,
> Jarek P.
>
>>
>> With the patch:
>> class htb 1:108 parent 1:10 leaf 108: prio 7 quantum 1514 rate
>> 100000Kbit ceil 100000Kbit burst 14087b/8 mpu 0b overhead 0b cburst
>> 14087b/8 mpu 0b overhead 0b level 0
>>  Sent 2187884640 bytes 22790465 pkt (dropped 8624566, overlimits 0 requeues 0)
>>  rate 124946Kbit 162691pps backlog 0b 0p requeues 0
>>  lended: 22790465 borrowed: 0 giants: 0
>>  tokens: 180 ctokens: 180
>>
>>
>> Without the patch:
>> class htb 1:108 parent 1:10 leaf 108: prio 7 quantum 1514 rate
>> 100000Kbit ceil 100000Kbit burst 14087b/8 mpu 0b overhead 0b cburst
>> 14087b/8 mpu 0b overhead 0b level 0
>>  Sent 1260235680 bytes 13127455 pkt (dropped 4531299, overlimits 0 requeues 0)
>>  rate 124575Kbit 162207pps backlog 0b 0p requeues 0
>>  lended: 13127455 borrowed: 0 giants: 0
>>  tokens: 123 ctokens: 123
>>
>>
>> Thanks
>>   Antonio Almeida
>>
>>
>> On Mon, May 18, 2009 at 7:56 AM, Jarek Poplawski <jarkao2@gmail.com> wrote:
>> > Return non-zero tc_calc_xmittime() for rate tables
>> >
>> > While looking at the problem of HTB accuracy for high speed (~500Mbit
>> > rates) I've found that rate tables have cells filled with zeros for
>> > the smallest sizes. It means such packets aren't accounted at all.
>> > Apart from the correctness of such configs, let's make it safe with
>> > rather overaccounting than living it unlimited.
>> >
>> > Reported-by: Antonio Almeida <vexwek@gmail.com>
>> > Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
>> > ---
>> >
>> >  tc/tc_core.c |    4 +++-
>> >  1 files changed, 3 insertions(+), 1 deletions(-)
>> >
>> > diff --git a/tc/tc_core.c b/tc/tc_core.c
>> > index 9a0ff39..14f25bc 100644
>> > --- a/tc/tc_core.c
>> > +++ b/tc/tc_core.c
>> > @@ -58,7 +58,9 @@ unsigned tc_core_ktime2time(unsigned ktime)
>> >
>> >  unsigned tc_calc_xmittime(unsigned rate, unsigned size)
>> >  {
>> > -       return tc_core_time2tick(TIME_UNITS_PER_SEC*((double)size/rate));
>> > +       unsigned t;
>> > +       t = tc_core_time2tick(TIME_UNITS_PER_SEC*((double)size/rate));
>> > +       return t ? : 1;
>> >  }
>> >
>> >  unsigned tc_calc_xmitsize(unsigned rate, unsigned ticks)
>> >
>

[-- Attachment #2: config.tar --]
[-- Type: application/x-tar, Size: 15707 bytes --]

* Re: [PATCH iproute2] Re: HTB accuracy for high speed
  2009-05-18 18:23           ` Antonio Almeida
@ 2009-05-18 18:32             ` Jarek Poplawski
  2009-05-18 18:56               ` Antonio Almeida
  0 siblings, 1 reply; 104+ messages in thread
From: Jarek Poplawski @ 2009-05-18 18:32 UTC (permalink / raw)
  To: Antonio Almeida
  Cc: Stephen Hemminger, netdev, kaber, davem, devik, Eric Dumazet

On Mon, May 18, 2009 at 07:23:14PM +0100, Antonio Almeida wrote:
> Here's my .config

Hmm... And if it's not a big problem I'd also ask you to try this test
with 555000Kbit rate for 850 and 900 byte packets. (It can wait.)

Thanks again,
Jarek P.

* Re: [PATCH iproute2] Re: HTB accuracy for high speed
  2009-05-18 18:32             ` Jarek Poplawski
@ 2009-05-18 18:56               ` Antonio Almeida
  2009-05-18 19:05                 ` Jarek Poplawski
  0 siblings, 1 reply; 104+ messages in thread
From: Antonio Almeida @ 2009-05-18 18:56 UTC (permalink / raw)
  To: Jarek Poplawski
  Cc: Stephen Hemminger, netdev, kaber, davem, devik, Eric Dumazet

Precise measurements:

800 bytes:
class htb 1:108 parent 1:10 leaf 108: prio 7 quantum 1514 rate
555000Kbit ceil 555000Kbit burst 70901b/8 mpu 0b overhead 0b cburst
70901b/8 mpu 0b overhead 0b level 0
 Sent 46793626324 bytes 57771194 pkt (dropped 29920019, overlimits 0 requeues 0)
 rate 621714Kbit 97631pps backlog 0b 126p requeues 0
 lended: 57771068 borrowed: 0 giants: 0
 tokens: -8 ctokens: -8


850 bytes:
class htb 1:108 parent 1:10 leaf 108: prio 7 quantum 1514 rate
555000Kbit ceil 555000Kbit burst 70901b/8 mpu 0b overhead 0b cburst
70901b/8 mpu 0b overhead 0b level 0
 Sent 63422144616 bytes 77714246 pkt (dropped 41012275, overlimits 0 requeues 0)
 rate 600699Kbit 88756pps backlog 0b 127p requeues 0
 lended: 77714119 borrowed: 0 giants: 0
 tokens: -11 ctokens: -11


900 bytes:
class htb 1:108 parent 1:10 leaf 108: prio 7 quantum 1514 rate
555000Kbit ceil 555000Kbit burst 70901b/8 mpu 0b overhead 0b cburst
70901b/8 mpu 0b overhead 0b level 0
 Sent 76868403562 bytes 92835297 pkt (dropped 48565133, overlimits 0 requeues 0)
 rate 636195Kbit 88755pps backlog 0b 126p requeues 0
 lended: 92835171 borrowed: 0 giants: 0
 tokens: -7 ctokens: -7


If you need more values you're free to ask.

  Antonio Almeida


On Mon, May 18, 2009 at 7:32 PM, Jarek Poplawski <jarkao2@gmail.com> wrote:
> On Mon, May 18, 2009 at 07:23:14PM +0100, Antonio Almeida wrote:
>> Here's my .config
>
> Hmm... And if it's not a big problem I'd also ask you to try this test
> with 555000Kbit rate for 850 and 900 byte packets. (It can wait.)
>
> Thanks again,
> Jarek P.
>
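
For comparing the three runs, the overshoot of the estimator rate over the
configured 555000Kbit can be worked out with a small helper. This is just
arithmetic on the figures in the tc output above; the function name is
made up for illustration.

```c
#include <assert.h>

/* Overshoot of the measured rate over the configured rate, in tenths
 * of a percent (both arguments in Kbit), rounded towards zero. */
static long overshoot_tenths_pct(long measured_kbit, long configured_kbit)
{
	assert(configured_kbit > 0);
	return (measured_kbit - configured_kbit) * 1000 / configured_kbit;
}
```

This gives about 12.0% for 800 byte packets (621714Kbit), 8.2% for 850
bytes (600699Kbit) and 14.6% for 900 bytes (636195Kbit), so the error does
not simply grow or shrink with packet size. The 850 and 900 byte runs also
show nearly the same packet rate (88756 and 88755 pps).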

* Re: [PATCH iproute2] Re: HTB accuracy for high speed
  2009-05-18 18:56               ` Antonio Almeida
@ 2009-05-18 19:05                 ` Jarek Poplawski
  2009-05-19 10:55                   ` Antonio Almeida
  2009-05-19 13:18                   ` Jesper Dangaard Brouer
  0 siblings, 2 replies; 104+ messages in thread
From: Jarek Poplawski @ 2009-05-18 19:05 UTC (permalink / raw)
  To: Antonio Almeida
  Cc: Stephen Hemminger, netdev, kaber, davem, devik, Eric Dumazet

On Mon, May 18, 2009 at 07:56:12PM +0100, Antonio Almeida wrote:
> Precise measurements:
> 
> 800 bytes:
> class htb 1:108 parent 1:10 leaf 108: prio 7 quantum 1514 rate
> 555000Kbit ceil 555000Kbit burst 70901b/8 mpu 0b overhead 0b cburst
> 70901b/8 mpu 0b overhead 0b level 0
>  Sent 46793626324 bytes 57771194 pkt (dropped 29920019, overlimits 0 requeues 0)
>  rate 621714Kbit 97631pps backlog 0b 126p requeues 0
>  lended: 57771068 borrowed: 0 giants: 0
>  tokens: -8 ctokens: -8
> 
> 
> 850 bytes:
> class htb 1:108 parent 1:10 leaf 108: prio 7 quantum 1514 rate
> 555000Kbit ceil 555000Kbit burst 70901b/8 mpu 0b overhead 0b cburst
> 70901b/8 mpu 0b overhead 0b level 0
>  Sent 63422144616 bytes 77714246 pkt (dropped 41012275, overlimits 0 requeues 0)
>  rate 600699Kbit 88756pps backlog 0b 127p requeues 0
>  lended: 77714119 borrowed: 0 giants: 0
>  tokens: -11 ctokens: -11
> 
> 
> 900 bytes:
> class htb 1:108 parent 1:10 leaf 108: prio 7 quantum 1514 rate
> 555000Kbit ceil 555000Kbit burst 70901b/8 mpu 0b overhead 0b cburst
> 70901b/8 mpu 0b overhead 0b level 0
>  Sent 76868403562 bytes 92835297 pkt (dropped 48565133, overlimits 0 requeues 0)
>  rate 636195Kbit 88755pps backlog 0b 126p requeues 0
>  lended: 92835171 borrowed: 0 giants: 0
>  tokens: -7 ctokens: -7
> 
> 
> If you need more values you're free to ask.

Since you're so kind... :-) There is a line in net/sched/sch_htb.c:

#define HTB_HYSTERESIS 1        /* whether to use mode hysteresis for speedup */

Could you change 1 to 0, and repeat these tests above after recompiling?

More thanks,
Jarek P.

* Re: HTB accuracy for high speed
  2009-05-18 17:23       ` Jarek Poplawski
@ 2009-05-18 21:52         ` David Miller
  2009-05-18 23:59           ` [PATCH] pkt_sched: gen_estimator: use 64 bits intermediate counters for bps Eric Dumazet
  0 siblings, 1 reply; 104+ messages in thread
From: David Miller @ 2009-05-18 21:52 UTC (permalink / raw)
  To: jarkao2; +Cc: dada1, vexwek, netdev, kaber, devik

From: Jarek Poplawski <jarkao2@gmail.com>
Date: Mon, 18 May 2009 19:23:49 +0200

> On Mon, May 18, 2009 at 06:40:56PM +0200, Eric Dumazet wrote:
>> With a typical estimator "1sec 8sec", ewma_log value is 3
>> 
>> At gigabit speeds, we are very close to overflow yes, since
>> we only have 27 bits available, so 134217728 bytes per second
>> or 1073741824 bits per second.
>> 
>> So formula :
>> e->avbps += ((long)rate - (long)e->avbps) >> e->ewma_log;
>> is going to overflow.
>> 
>> One way to avoid the overflow would be to use a smaller estimator, like "500ms 4sec" 
>> 
>> Or use a 64bits rate & avbps, this is needed fo 10Gb speeds I suppose...
> 
> Yes, I considered this too, but because of an overhead I decided to
> fix as designed (according to the comment) for now. But probably you
> are right, and we should go further, so I'm OK with your patch.

I like this patch too, Eric can you submit this formally with
proper signoffs etc.?

Thanks!

* Re: HTB accuracy for high speed
  2009-05-18 16:13       ` Stephen Hemminger
  2009-05-18 18:03         ` Antonio Almeida
@ 2009-05-18 22:02         ` Stephen Hemminger
  2009-05-19 11:48           ` Antonio Almeida
  1 sibling, 1 reply; 104+ messages in thread
From: Stephen Hemminger @ 2009-05-18 22:02 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: Antonio Almeida, netdev, jarkao2, kaber, davem, devik

On Mon, 18 May 2009 09:13:14 -0700
Stephen Hemminger <shemminger@vyatta.com> wrote:

> On Mon, 18 May 2009 11:01:21 +0100
> Antonio Almeida <vexwek@gmail.com> wrote:
> 
> > Hi!
> > 
> > cat /sys/devices/system/clocksource/clocksource0/current_clocksource
> > returns "jiffies"
> 
> That is the slowest of the choices. Better ones are hpet and tsc, but you
> hardware doesn't support them.
> 
> You should compile your kernel with HZ=1000 and the resolution will be better
> (but with some loss of performance).
> --

Are you using one of the AMD dual core machines?  That processor has the
design flaw that the TSC counter is not synced between cores, so the kernel
can't use it. You might even be better off running a non-SMP kernel on that box.
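
To put the HZ advice quoted above in numbers: if time were only observed
once per timer tick (which is what a jiffies clocksource suggests, though
how HTB actually consumes the clock is not shown in this thread), each
tick would cover many packets at these rates. A small sketch, with
hypothetical helper names:

```c
#include <assert.h>
#include <stdint.h>

/* Bytes that correspond to one timer tick at the given bit rate. */
static uint64_t bytes_per_tick(uint64_t rate_bit, unsigned hz)
{
	assert(hz > 0);
	return rate_bit / 8 / hz;
}

/* Whole packets of pkt_bytes that fit into one tick's worth of bytes. */
static uint64_t packets_per_tick(uint64_t rate_bit, unsigned hz,
				 unsigned pkt_bytes)
{
	assert(pkt_bytes > 0);
	return bytes_per_tick(rate_bit, hz) / pkt_bytes;
}
```

At 555000Kbit and HZ=1000 one tick is worth 69375 bytes, i.e. 86 packets
of 800 bytes; at HZ=250 it would be 277500 bytes.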

* Re: HTB accuracy for high speed
  2009-05-18 14:36     ` Antonio Almeida
@ 2009-05-18 23:14       ` Vladimir Ivashchenko
  2009-05-18 23:27         ` Vladimir Ivashchenko
  0 siblings, 1 reply; 104+ messages in thread
From: Vladimir Ivashchenko @ 2009-05-18 23:14 UTC (permalink / raw)
  To: Jarek Poplawski; +Cc: netdev, kaber, davem, devik, Antonio Almeida

On Mon, 2009-05-18 at 15:36 +0100, Antonio Almeida wrote:
> This patch works perfectly!
> rate (bits/s) is now decreasing along with pps when I stop the traffic
> (doesn't grow as it used to for rates over 500Mbtis/s).

I'm not able to reach full speed with bond + HTB + sfq on 2.6.29.1, both
with and without these patches. I seem to get a lot of drops on sfq
qdiscs, whatever quantum I set. Playing with IRQ affinity doesn't help.
I didn't check without bond.

With bond + HFSC + sfq, I'm able to reach the speed. It doesn't seem to
overspill with 580 Mbps load. Jarek, would your patches help with HFSC
overspill? I will check tomorrow under 750 Mbps load.

# ethtool -k eth0
Offload parameters for eth0:
Cannot get device flags: Operation not supported
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp segmentation offload: off
udp fragmentation offload: off
generic segmentation offload: off
large receive offload: off

# cat /sys/devices/system/clocksource/clocksource0/current_clocksource
tsc

-- 
Best Regards,
Vladimir Ivashchenko
Chief Technology Officer
PrimeTel PLC, Cyprus - www.prime-tel.com
Tel: +357 25 100100 Fax: +357 2210 2211



* Re: HTB accuracy for high speed
  2009-05-18 23:14       ` Vladimir Ivashchenko
@ 2009-05-18 23:27         ` Vladimir Ivashchenko
  2009-05-19 11:03           ` Jarek Poplawski
  0 siblings, 1 reply; 104+ messages in thread
From: Vladimir Ivashchenko @ 2009-05-18 23:27 UTC (permalink / raw)
  To: Jarek Poplawski; +Cc: netdev, kaber, davem, devik, Antonio Almeida


> With bond + HFSC + sfq, I'm able to reach the speed. It doesn't seem to
> overspill with 580 mbps load. Jarek, would your patches help with HSFC
> overspill ? I will check tomorrow under 750 mbps load. 

Please disregard my comment about HFSC. It still overspills heavily.

With a 400 Mbps limit, I'm getting 520 Mbps actual throughput.

-- 
Best Regards,
Vladimir Ivashchenko
Chief Technology Officer
PrimeTel PLC, Cyprus - www.prime-tel.com
Tel: +357 25 100100 Fax: +357 2210 2211



* [PATCH] pkt_sched: gen_estimator: use 64 bits intermediate counters for bps
  2009-05-18 21:52         ` David Miller
@ 2009-05-18 23:59           ` Eric Dumazet
  2009-05-19  2:27             ` David Miller
  2009-05-19  7:02             ` Jarek Poplawski
  0 siblings, 2 replies; 104+ messages in thread
From: Eric Dumazet @ 2009-05-18 23:59 UTC (permalink / raw)
  To: David Miller; +Cc: jarkao2, vexwek, netdev, kaber, devik

David Miller wrote:
> From: Jarek Poplawski <jarkao2@gmail.com>
> Date: Mon, 18 May 2009 19:23:49 +0200
> 
>> On Mon, May 18, 2009 at 06:40:56PM +0200, Eric Dumazet wrote:
>>> With a typical estimator "1sec 8sec", ewma_log value is 3
>>>
>>> At gigabit speeds, we are very close to overflow yes, since
>>> we only have 27 bits available, so 134217728 bytes per second
>>> or 1073741824 bits per second.
>>>
>>> So formula :
>>> e->avbps += ((long)rate - (long)e->avbps) >> e->ewma_log;
>>> is going to overflow.
>>>
>>> One way to avoid the overflow would be to use a smaller estimator, like "500ms 4sec" 
>>>
>>> Or use a 64bits rate & avbps, this is needed fo 10Gb speeds I suppose...
>> Yes, I considered this too, but because of an overhead I decided to
>> fix as designed (according to the comment) for now. But probably you
>> are right, and we should go further, so I'm OK with your patch.
> 
> I like this patch too, Eric can you submit this formally with
> proper signoffs etc.?
> 

Sure, here it is. We might need a similar patch to get a correct pps value
too, since we currently are limited to ~ 2^21 packets per second.

[PATCH] pkt_sched: gen_estimator: use 64 bit intermediate counters for bps

gen_estimator can overflow bps (bytes per second) with Gb links, while
it was designed with a u32 API, with a theoretical limit of 34360Mbit
(2^32 bytes per second)

Using 64 bit intermediate avbps/brate counters can allow us to reach this
theoretical limit.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
---

diff --git a/net/core/gen_estimator.c b/net/core/gen_estimator.c
index 9cc9f95..ea28659 100644
--- a/net/core/gen_estimator.c
+++ b/net/core/gen_estimator.c
@@ -66,9 +66,9 @@
 
    NOTES.
 
-   * The stored value for avbps is scaled by 2^5, so that maximal
-     rate is ~1Gbit, avpps is scaled by 2^10.
-
+   * avbps is scaled by 2^5, avpps is scaled by 2^10.
+   * both values are reported as 32 bit unsigned values. bps can
+     overflow for fast links : max speed being 34360Mbit/sec
    * Minimal interval is HZ/4=250msec (it is the greatest common divisor
      for HZ=100 and HZ=1024 8)), maximal interval
      is (HZ*2^EST_MAX_INTERVAL)/4 = 8sec. Shorter intervals
@@ -86,9 +86,9 @@ struct gen_estimator
 	spinlock_t		*stats_lock;
 	int			ewma_log;
 	u64			last_bytes;
+	u64			avbps;
 	u32			last_packets;
 	u32			avpps;
-	u32			avbps;
 	struct rcu_head		e_rcu;
 	struct rb_node		node;
 };
@@ -115,6 +115,7 @@ static void est_timer(unsigned long arg)
 	rcu_read_lock();
 	list_for_each_entry_rcu(e, &elist[idx].list, list) {
 		u64 nbytes;
+		u64 brate;
 		u32 npackets;
 		u32 rate;
 
@@ -125,9 +126,9 @@ static void est_timer(unsigned long arg)
 
 		nbytes = e->bstats->bytes;
 		npackets = e->bstats->packets;
-		rate = (nbytes - e->last_bytes)<<(7 - idx);
+		brate = (nbytes - e->last_bytes)<<(7 - idx);
 		e->last_bytes = nbytes;
-		e->avbps += ((long)rate - (long)e->avbps) >> e->ewma_log;
+		e->avbps += ((s64)(brate - e->avbps)) >> e->ewma_log;
 		e->rate_est->bps = (e->avbps+0xF)>>5;
 
 		rate = (npackets - e->last_packets)<<(12 - idx);


* Re: [PATCH] pkt_sched: gen_estimator: use 64 bits intermediate counters for bps
  2009-05-18 23:59           ` [PATCH] pkt_sched: gen_estimator: use 64 bits intermediate counters for bps Eric Dumazet
@ 2009-05-19  2:27             ` David Miller
  2009-05-19  7:02             ` Jarek Poplawski
  1 sibling, 0 replies; 104+ messages in thread
From: David Miller @ 2009-05-19  2:27 UTC (permalink / raw)
  To: dada1; +Cc: jarkao2, vexwek, netdev, kaber, devik

From: Eric Dumazet <dada1@cosmosbay.com>
Date: Tue, 19 May 2009 01:59:55 +0200

> Sure, here it is.

Applied, thanks!

> We might need a similar patch to get a correct pps value
> too, since we currently are limited to ~ 2^21 packets per second.

True, but it is a less urgent issue than bps overflow.

* Re: [PATCH] pkt_sched: gen_estimator: use 64 bits intermediate counters for bps
  2009-05-18 23:59           ` [PATCH] pkt_sched: gen_estimator: use 64 bits intermediate counters for bps Eric Dumazet
  2009-05-19  2:27             ` David Miller
@ 2009-05-19  7:02             ` Jarek Poplawski
  2009-05-19  7:31               ` Eric Dumazet
  1 sibling, 1 reply; 104+ messages in thread
From: Jarek Poplawski @ 2009-05-19  7:02 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, vexwek, netdev, kaber, devik

On Tue, May 19, 2009 at 01:59:55AM +0200, Eric Dumazet wrote:
...
> diff --git a/net/core/gen_estimator.c b/net/core/gen_estimator.c
...
> -		e->avbps += ((long)rate - (long)e->avbps) >> e->ewma_log;
> +		e->avbps += ((s64)(brate - e->avbps)) >> e->ewma_log;

Btw., I'm a bit concerned about the syntax here: isn't such shifting
of signed ints implementation dependent?

Jarek P.

* Re: [PATCH] pkt_sched: gen_estimator: use 64 bits intermediate counters for bps
  2009-05-19  7:02             ` Jarek Poplawski
@ 2009-05-19  7:31               ` Eric Dumazet
  2009-05-19  7:42                 ` Jarek Poplawski
  2009-05-19  8:18                 ` [PATCH] pkt_sched: gen_estimator: use 64 bits intermediate counters for bps David Miller
  0 siblings, 2 replies; 104+ messages in thread
From: Eric Dumazet @ 2009-05-19  7:31 UTC (permalink / raw)
  To: Jarek Poplawski; +Cc: David Miller, vexwek, netdev, kaber, devik

Jarek Poplawski wrote:
> On Tue, May 19, 2009 at 01:59:55AM +0200, Eric Dumazet wrote:
> ...
>> diff --git a/net/core/gen_estimator.c b/net/core/gen_estimator.c
> ...
>> -		e->avbps += ((long)rate - (long)e->avbps) >> e->ewma_log;
>> +		e->avbps += ((s64)(brate - e->avbps)) >> e->ewma_log;
> 
> Btw., I'm a bit concerned about the syntax here: isn't such shifting
> of signed ints implementation dependant?
> 

You are right Jarek, I very often forget to never ever use signed quantities
at all! (But also note that the original code has the same
implementation-defined behavior.)


Quoting Wikipedia (http://en.wikipedia.org/wiki/Arithmetic_shift):

The (1999) ISO standard for the C programming language defines the C language's
right shift operator in terms of divisions by powers of 2. Because of the
aforementioned non-equivalence, the standard explicitly excludes from that
definition the right shifts of signed numbers that have negative values.
It doesn't specify the behaviour of the right shift operator in such circumstances,
but instead requires each individual C compiler to specify the behaviour of shifting
negative values right.

Apparently gcc does the *right* thing on x86_32, but we probably want something
stronger here. I could not find a statement in the gcc documentation on right
shifts of negative values.


 436:   8b 4b 14                mov    0x14(%ebx),%ecx
 439:   89 73 18                mov    %esi,0x18(%ebx)
 43c:   89 7b 1c                mov    %edi,0x1c(%ebx)
 43f:   8b 73 20                mov    0x20(%ebx),%esi
 442:   8b 7b 24                mov    0x24(%ebx),%edi
 445:   29 f0                   sub    %esi,%eax
 447:   19 fa                   sbb    %edi,%edx
 449:   0f ad d0                shrd   %cl,%edx,%eax
 44c:   d3 fa                   sar    %cl,%edx         << good >>
 44e:   f6 c1 20                test   $0x20,%cl
 451:   74 05                   je     458 <est_timer+0xb8>
 453:   89 d0                   mov    %edx,%eax
 455:   c1 fa 1f                sar    $0x1f,%edx       
 458:   01 f0                   add    %esi,%eax
 45a:   8b 4b 0c                mov    0xc(%ebx),%ecx
 45d:   89 43 20                mov    %eax,0x20(%ebx)
 460:   11 fa                   adc    %edi,%edx
 462:   83 c0 0f                add    $0xf,%eax
 465:   89 53 24                mov    %edx,0x24(%ebx)
 468:   83 d2 00                adc    $0x0,%edx
 46b:   0f ac d0 05             shrd   $0x5,%edx,%eax
 46f:   89 01                   mov    %eax,(%ecx)
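
For reference, the gcc manual's section on implementation-defined integer
behaviour states that signed >> acts on negative numbers by sign
extension, which agrees with the sar in the listing above. If something
stronger than a compiler guarantee is wanted, the update can be written
with unsigned shifts only. The two helpers below are a sketch with
made-up names, not a proposed patch:

```c
#include <assert.h>
#include <stdint.h>

/* EWMA step using only unsigned shifts (the shape of the earlier
 * "if (rate > e->avbps)" patch, applied to 64-bit values). */
static uint64_t ewma_unsigned(uint64_t avg, uint64_t sample, int ewma_log)
{
	assert(ewma_log >= 0 && ewma_log < 64);
	if (sample > avg)
		avg += (sample - avg) >> ewma_log;
	else
		avg -= (avg - sample) >> ewma_log;
	return avg;
}

/* Arithmetic right shift spelled out with unsigned operations: for a
 * negative v, shift the complement and complement back (rounds down). */
static int64_t asr64(int64_t v, int n)
{
	assert(n >= 0 && n < 64);
	if (v >= 0)
		return (int64_t)((uint64_t)v >> n);
	return -(int64_t)((~(uint64_t)v) >> n) - 1;
}
```

The two forms are not bit-identical when the average is falling: the
branch version rounds the step towards zero, the arithmetic shift rounds
it down. For avg 100, sample 91 and ewma_log 3 the branch version yields
99, while 100 + asr64(-9, 3) yields 98. The difference is at most one
unit of the scaled value per step.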


* Re: [PATCH] pkt_sched: gen_estimator: use 64 bits intermediate counters for bps
  2009-05-19  7:31               ` Eric Dumazet
@ 2009-05-19  7:42                 ` Jarek Poplawski
  2009-05-19  7:57                   ` Jarek Poplawski
  2009-05-19  8:18                 ` [PATCH] pkt_sched: gen_estimator: use 64 bits intermediate counters for bps David Miller
  1 sibling, 1 reply; 104+ messages in thread
From: Jarek Poplawski @ 2009-05-19  7:42 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, vexwek, netdev, kaber, devik

On Tue, May 19, 2009 at 09:31:36AM +0200, Eric Dumazet wrote:
> Jarek Poplawski a écrit :
> > On Tue, May 19, 2009 at 01:59:55AM +0200, Eric Dumazet wrote:
> > ...
> >> diff --git a/net/core/gen_estimator.c b/net/core/gen_estimator.c
> > ...
> >> -		e->avbps += ((long)rate - (long)e->avbps) >> e->ewma_log;
> >> +		e->avbps += ((s64)(brate - e->avbps)) >> e->ewma_log;
> > 
> > Btw., I'm a bit concerned about the syntax here: isn't such shifting
> > of signed ints implementation dependant?
> > 
> 
> You are right Jarek, I very often forget to never ever use signed quantities
> at all ! (But also note original code has same undefined behavior)

Sure, I meant the original code, including the line 5 lines below.

> Apparently gcc does the *right* thing on x86_32, but we probably want something
> stronger here. I could not find gcc documentation statement on right shifts of 
> negative values.

I guess gcc and most of others do this "right"; but it looks
"unkosher" anyway.

Jarek P.

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] pkt_sched: gen_estimator: use 64 bits intermediate counters for bps
  2009-05-19  7:42                 ` Jarek Poplawski
@ 2009-05-19  7:57                   ` Jarek Poplawski
  2009-05-19 18:03                     ` Eric Dumazet
  0 siblings, 1 reply; 104+ messages in thread
From: Jarek Poplawski @ 2009-05-19  7:57 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, vexwek, netdev, kaber, devik

On Tue, May 19, 2009 at 07:42:47AM +0000, Jarek Poplawski wrote:
> On Tue, May 19, 2009 at 09:31:36AM +0200, Eric Dumazet wrote:
> > Jarek Poplawski a écrit :
> > > On Tue, May 19, 2009 at 01:59:55AM +0200, Eric Dumazet wrote:
> > > ...
> > >> diff --git a/net/core/gen_estimator.c b/net/core/gen_estimator.c
> > > ...
> > >> -		e->avbps += ((long)rate - (long)e->avbps) >> e->ewma_log;
> > >> +		e->avbps += ((s64)(brate - e->avbps)) >> e->ewma_log;
> > > 
> > > Btw., I'm a bit concerned about the syntax here: isn't such shifting
> > > of signed ints implementation dependant?
> > > 
> > 
> > You are right Jarek, I very often forget to never ever use signed quantities
> > at all ! (But also note original code has same undefined behavior)
> 
> Sure, I've meant the original code including 5 lines below.
> 
> > Apparently gcc does the *right* thing on x86_32, but we probably want something
> > stronger here. I could not find gcc documentation statement on right shifts of 
> > negative values.
> 
> I guess gcc and most of others do this "right"; but it looks
> "unkosher" anyway.

I might have missed your point here, but would it be so costly to do
these shifts separately here?

Jarek P.

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] pkt_sched: gen_estimator: use 64 bits intermediate counters for bps
  2009-05-19  7:31               ` Eric Dumazet
  2009-05-19  7:42                 ` Jarek Poplawski
@ 2009-05-19  8:18                 ` David Miller
  1 sibling, 0 replies; 104+ messages in thread
From: David Miller @ 2009-05-19  8:18 UTC (permalink / raw)
  To: dada1; +Cc: jarkao2, vexwek, netdev, kaber, devik

From: Eric Dumazet <dada1@cosmosbay.com>
Date: Tue, 19 May 2009 09:31:36 +0200

> Apparently gcc does the *right* thing on x86_32, but we probably
> want something stronger here. I could not find gcc documentation
> statement on right shifts of negative values.

It emits an "arithmetic shift right" for every CPU I've ever checked.

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH iproute2] Re: HTB accuracy for high speed
  2009-05-18 19:05                 ` Jarek Poplawski
@ 2009-05-19 10:55                   ` Antonio Almeida
  2009-05-19 11:04                     ` Denys Fedoryschenko
  2009-05-19 11:09                     ` Jarek Poplawski
  2009-05-19 13:18                   ` Jesper Dangaard Brouer
  1 sibling, 2 replies; 104+ messages in thread
From: Antonio Almeida @ 2009-05-19 10:55 UTC (permalink / raw)
  To: Jarek Poplawski
  Cc: Stephen Hemminger, netdev, kaber, davem, devik, Eric Dumazet

Setting HTB_HYSTERESIS to 0 doesn't seem to make any difference. Here are
the values using #define HTB_HYSTERESIS 0:

800 bytes:
class htb 1:108 parent 1:10 leaf 108: prio 7 quantum 1514 rate
555000Kbit ceil 555000Kbit burst 70901b/8 mpu 0b overhead 0b cburst
70901b/8 mpu 0b overhead 0b level 0
 Sent 9773257752 bytes 12277962 pkt (dropped 6292541, overlimits 0 requeues 0)
 rate 621796Kbit 97644pps backlog 0b 127p requeues 0
 lended: 12277835 borrowed: 0 giants: 0
 tokens: -7 ctokens: -7

850 bytes:
class htb 1:108 parent 1:10 leaf 108: prio 7 quantum 1514 rate
555000Kbit ceil 555000Kbit burst 70901b/8 mpu 0b overhead 0b cburst
70901b/8 mpu 0b overhead 0b level 0
 Sent 18225005732 bytes 22409017 pkt (dropped 11937269, overlimits 0 requeues 0)
 rate 600890Kbit 88796pps backlog 0b 43p requeues 0
 lended: 22408974 borrowed: 0 giants: 0
 tokens: -2 ctokens: -2

900 bytes:
class htb 1:108 parent 1:10 leaf 108: prio 7 quantum 1514 rate
555000Kbit ceil 555000Kbit burst 70901b/8 mpu 0b overhead 0b cburst
70901b/8 mpu 0b overhead 0b level 0
 Sent 29790867368 bytes 35400708 pkt (dropped 18399726, overlimits 0 requeues 0)
 rate 636361Kbit 88779pps backlog 0b 127p requeues 0
 lended: 35400581 borrowed: 0 giants: 0
 tokens: -2 ctokens: -2
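
The gap between the configured and the reported rate in the readings above
can be quantified with a small helper. This is only a sketch: the function
name overshoot_bp is hypothetical, and it works in basis points (1/100 of a
percent) to stay in integer arithmetic.

```c
#include <stdint.h>

/*
 * Overshoot of a measured rate over the configured rate, in basis points
 * (1/100 of a percent), using integer arithmetic only.
 * Assumes measured_kbit >= configured_kbit and configured_kbit != 0.
 */
static uint64_t overshoot_bp(uint64_t configured_kbit, uint64_t measured_kbit)
{
	return (measured_kbit - configured_kbit) * 10000 / configured_kbit;
}
```

Fed with the three rate lines above (621796, 600890 and 636361 Kbit against
a configured 555000 Kbit), this gives roughly 12%, 8% and 15% above the
configured rate.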


  Antonio Almeida



On Mon, May 18, 2009 at 8:05 PM, Jarek Poplawski <jarkao2@gmail.com> wrote:
> On Mon, May 18, 2009 at 07:56:12PM +0100, Antonio Almeida wrote:
>> Precise measurements:
>>
>> 800 bytes:
>> class htb 1:108 parent 1:10 leaf 108: prio 7 quantum 1514 rate
>> 555000Kbit ceil 555000Kbit burst 70901b/8 mpu 0b overhead 0b cburst
>> 70901b/8 mpu 0b overhead 0b level 0
>>  Sent 46793626324 bytes 57771194 pkt (dropped 29920019, overlimits 0 requeues 0)
>>  rate 621714Kbit 97631pps backlog 0b 126p requeues 0
>>  lended: 57771068 borrowed: 0 giants: 0
>>  tokens: -8 ctokens: -8
>>
>>
>> 850 bytes:
>> class htb 1:108 parent 1:10 leaf 108: prio 7 quantum 1514 rate
>> 555000Kbit ceil 555000Kbit burst 70901b/8 mpu 0b overhead 0b cburst
>> 70901b/8 mpu 0b overhead 0b level 0
>>  Sent 63422144616 bytes 77714246 pkt (dropped 41012275, overlimits 0 requeues 0)
>>  rate 600699Kbit 88756pps backlog 0b 127p requeues 0
>>  lended: 77714119 borrowed: 0 giants: 0
>>  tokens: -11 ctokens: -11
>>
>>
>> 900 bytes:
>> class htb 1:108 parent 1:10 leaf 108: prio 7 quantum 1514 rate
>> 555000Kbit ceil 555000Kbit burst 70901b/8 mpu 0b overhead 0b cburst
>> 70901b/8 mpu 0b overhead 0b level 0
>>  Sent 76868403562 bytes 92835297 pkt (dropped 48565133, overlimits 0 requeues 0)
>>  rate 636195Kbit 88755pps backlog 0b 126p requeues 0
>>  lended: 92835171 borrowed: 0 giants: 0
>>  tokens: -7 ctokens: -7
>>
>>
>> If you need more values you're free to ask.
>
> Since you're so kind... :-) There is a line in net/sched/sch_htb.c:
>
> #define HTB_HYSTERESIS 1        /* whether to use mode hysteresis for speedup */
>
> Could you change 1 to 0, and repeat these tests above after recompiling?
>
> More thanks,
> Jarek P.
>

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: HTB accuracy for high speed
  2009-05-18 23:27         ` Vladimir Ivashchenko
@ 2009-05-19 11:03           ` Jarek Poplawski
  2009-05-19 14:04             ` Vladimir Ivashchenko
  0 siblings, 1 reply; 104+ messages in thread
From: Jarek Poplawski @ 2009-05-19 11:03 UTC (permalink / raw)
  To: Vladimir Ivashchenko; +Cc: netdev, kaber, davem, devik, Antonio Almeida

On Tue, May 19, 2009 at 02:27:47AM +0300, Vladimir Ivashchenko wrote:
> 
> > With bond + HFSC + sfq, I'm able to reach the speed. It doesn't seem to
> > overspill with 580 mbps load. Jarek, would your patches help with HSFC
> > overspill ? I will check tomorrow under 750 mbps load. 

The gen_estimator patch should fix only the effect of rising rate
after flow stop, and maybe similar overflows while reporting rates
around 1Gbit. It would show on tc stats of HFSC or HTB, but doesn't
affect actual scheduling rates.

The iproute2 tc_core patch can matter for HTB scheduling rates if
there are a lot of small packets (e.g. 100 byte for rate 500Mbit)
possibly mixed with bigger ones. It doesn't matter for HFSC or
rates <100Mbit.

> Please disregard my comment about HFSC. It still overspills heavily.
> 
> On a 400 mbps limit, I'm getting 520 mbps actual throughput.

I guess you should send some logs. Your previous report seems to show
the sum of sc rates of the children could be too high. You seem to
expect the parent's sc and ul should limit this, but actually children
rates decide and parent's rates are mainly for lending/borrowing (at
least in HTB). So, it would be nice to try with one leaf class first
(similarly to Antonio) to see how well high rates are respected.

High drop should be OK if the flow is much faster than scheduling/
hardware send rate. It could be a bit higher than in older kernels
because of limited requeuing, but this could be corrected with
longer queue lengths (sfq has a very short queue: max 127).

Jarek P.

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH iproute2] Re: HTB accuracy for high speed
  2009-05-19 10:55                   ` Antonio Almeida
@ 2009-05-19 11:04                     ` Denys Fedoryschenko
  2009-05-19 11:18                       ` Jarek Poplawski
  2009-05-19 11:09                     ` Jarek Poplawski
  1 sibling, 1 reply; 104+ messages in thread
From: Denys Fedoryschenko @ 2009-05-19 11:04 UTC (permalink / raw)
  To: Antonio Almeida
  Cc: Jarek Poplawski, Stephen Hemminger, netdev, kaber, davem, devik,
	Eric Dumazet

On Tuesday 19 May 2009 13:55:43 Antonio Almeida wrote:
> Doesn't seem to make any diference seting HTB_HYSTERESIS to 0. Here're
> the values using #define HTB_HYSTERESIS 0
>
> 800 bytes:
> class htb 1:108 parent 1:10 leaf 108: prio 7 quantum 1514 rate
> 555000Kbit ceil 555000Kbit burst 70901b/8 mpu 0b overhead 0b cburst
> 70901b/8 mpu 0b overhead 0b level 0
>  Sent 9773257752 bytes 12277962 pkt (dropped 6292541, overlimits 0 requeues
> 0) rate 621796Kbit 97644pps backlog 0b 127p requeues 0
>  lended: 12277835 borrowed: 0 giants: 0
>  tokens: -7 ctokens: -7
6292541 dropped from 12277962 pkt, means 51% dropped. Maybe something fishy 
here?

Can you try instead of SFQ - BFIFO? For 100ms buffer, 550Mbit/s it will be 
~6875000 bytes bfifo.

By the way, IMHO the sfq queue is too short for this bandwidth; 127 packets
is not enough. 127 packets of 800 bytes can buffer 1 second only at
812Kbit/s, and at 550Mbit/s they will buffer data for only ~2ms.
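
The buffer arithmetic above can be checked with a short sketch. The function
names buffer_bytes and drain_time_us are hypothetical, and rates are taken
with decimal prefixes (1 Mbit/s = 1000000 bit/s), which is an assumption
about how the figures in this message were derived.

```c
#include <stdint.h>

/* Bytes needed to buffer `ms` milliseconds of traffic at `rate_bps` bit/s. */
static uint64_t buffer_bytes(uint64_t rate_bps, uint64_t ms)
{
	return rate_bps / 8 * ms / 1000;
}

/*
 * Time in microseconds needed to drain `pkts` packets of `pkt_bytes`
 * bytes each at `rate_bps` bit/s (rounded down).
 */
static uint64_t drain_time_us(uint64_t pkts, uint64_t pkt_bytes,
			      uint64_t rate_bps)
{
	return pkts * pkt_bytes * 8 * 1000000 / rate_bps;
}
```

With these, 100 ms at 550 Mbit/s comes out at 6875000 bytes, and 127 packets
of 800 bytes drain in about 1.5 ms at 550 Mbit/s, which the message rounds to
~2ms.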



^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH iproute2] Re: HTB accuracy for high speed
  2009-05-19 10:55                   ` Antonio Almeida
  2009-05-19 11:04                     ` Denys Fedoryschenko
@ 2009-05-19 11:09                     ` Jarek Poplawski
  1 sibling, 0 replies; 104+ messages in thread
From: Jarek Poplawski @ 2009-05-19 11:09 UTC (permalink / raw)
  To: Antonio Almeida
  Cc: Stephen Hemminger, netdev, kaber, davem, devik, Eric Dumazet

On Tue, May 19, 2009 at 11:55:43AM +0100, Antonio Almeida wrote:
> Doesn't seem to make any diference seting HTB_HYSTERESIS to 0. Here're
> the values using #define HTB_HYSTERESIS 0

OK, so it looks like there is still some hidden bug.

Many thanks for now,
Jarek P.

> 
> 800 bytes:
> class htb 1:108 parent 1:10 leaf 108: prio 7 quantum 1514 rate
> 555000Kbit ceil 555000Kbit burst 70901b/8 mpu 0b overhead 0b cburst
> 70901b/8 mpu 0b overhead 0b level 0
>  Sent 9773257752 bytes 12277962 pkt (dropped 6292541, overlimits 0 requeues 0)
>  rate 621796Kbit 97644pps backlog 0b 127p requeues 0
>  lended: 12277835 borrowed: 0 giants: 0
>  tokens: -7 ctokens: -7
> 
> 850 bytes:
> class htb 1:108 parent 1:10 leaf 108: prio 7 quantum 1514 rate
> 555000Kbit ceil 555000Kbit burst 70901b/8 mpu 0b overhead 0b cburst
> 70901b/8 mpu 0b overhead 0b level 0
>  Sent 18225005732 bytes 22409017 pkt (dropped 11937269, overlimits 0 requeues 0)
>  rate 600890Kbit 88796pps backlog 0b 43p requeues 0
>  lended: 22408974 borrowed: 0 giants: 0
>  tokens: -2 ctokens: -2
> 
> 900 bytes:
> class htb 1:108 parent 1:10 leaf 108: prio 7 quantum 1514 rate
> 555000Kbit ceil 555000Kbit burst 70901b/8 mpu 0b overhead 0b cburst
> 70901b/8 mpu 0b overhead 0b level 0
>  Sent 29790867368 bytes 35400708 pkt (dropped 18399726, overlimits 0 requeues 0)
>  rate 636361Kbit 88779pps backlog 0b 127p requeues 0
>  lended: 35400581 borrowed: 0 giants: 0
>  tokens: -2 ctokens: -2
> 
> 
>   Antonio Almeida
> 
> 
> 
> On Mon, May 18, 2009 at 8:05 PM, Jarek Poplawski <jarkao2@gmail.com> wrote:
> > On Mon, May 18, 2009 at 07:56:12PM +0100, Antonio Almeida wrote:
> >> Precise measurements:
> >>
> >> 800 bytes:
> >> class htb 1:108 parent 1:10 leaf 108: prio 7 quantum 1514 rate
> >> 555000Kbit ceil 555000Kbit burst 70901b/8 mpu 0b overhead 0b cburst
> >> 70901b/8 mpu 0b overhead 0b level 0
> >>  Sent 46793626324 bytes 57771194 pkt (dropped 29920019, overlimits 0 requeues 0)
> >>  rate 621714Kbit 97631pps backlog 0b 126p requeues 0
> >>  lended: 57771068 borrowed: 0 giants: 0
> >>  tokens: -8 ctokens: -8
> >>
> >>
> >> 850 bytes:
> >> class htb 1:108 parent 1:10 leaf 108: prio 7 quantum 1514 rate
> >> 555000Kbit ceil 555000Kbit burst 70901b/8 mpu 0b overhead 0b cburst
> >> 70901b/8 mpu 0b overhead 0b level 0
> >>  Sent 63422144616 bytes 77714246 pkt (dropped 41012275, overlimits 0 requeues 0)
> >>  rate 600699Kbit 88756pps backlog 0b 127p requeues 0
> >>  lended: 77714119 borrowed: 0 giants: 0
> >>  tokens: -11 ctokens: -11
> >>
> >>
> >> 900 bytes:
> >> class htb 1:108 parent 1:10 leaf 108: prio 7 quantum 1514 rate
> >> 555000Kbit ceil 555000Kbit burst 70901b/8 mpu 0b overhead 0b cburst
> >> 70901b/8 mpu 0b overhead 0b level 0
> >>  Sent 76868403562 bytes 92835297 pkt (dropped 48565133, overlimits 0 requeues 0)
> >>  rate 636195Kbit 88755pps backlog 0b 126p requeues 0
> >>  lended: 92835171 borrowed: 0 giants: 0
> >>  tokens: -7 ctokens: -7
> >>
> >>
> >> If you need more values you're free to ask.
> >
> > Since you're so kind... :-) There is a line in net/sched/sch_htb.c:
> >
> > #define HTB_HYSTERESIS 1        /* whether to use mode hysteresis for speedup */
> >
> > Could you change 1 to 0, and repeat these tests above after recompiling?
> >
> > More thanks,
> > Jarek P.
> >

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH iproute2] Re: HTB accuracy for high speed
  2009-05-19 11:04                     ` Denys Fedoryschenko
@ 2009-05-19 11:18                       ` Jarek Poplawski
  2009-05-19 11:21                         ` Denys Fedoryschenko
  0 siblings, 1 reply; 104+ messages in thread
From: Jarek Poplawski @ 2009-05-19 11:18 UTC (permalink / raw)
  To: Denys Fedoryschenko
  Cc: Antonio Almeida, Stephen Hemminger, netdev, kaber, davem, devik,
	Eric Dumazet

On Tue, May 19, 2009 at 02:04:50PM +0300, Denys Fedoryschenko wrote:
> On Tuesday 19 May 2009 13:55:43 Antonio Almeida wrote:
> > Doesn't seem to make any diference seting HTB_HYSTERESIS to 0. Here're
> > the values using #define HTB_HYSTERESIS 0
> >
> > 800 bytes:
> > class htb 1:108 parent 1:10 leaf 108: prio 7 quantum 1514 rate
> > 555000Kbit ceil 555000Kbit burst 70901b/8 mpu 0b overhead 0b cburst
> > 70901b/8 mpu 0b overhead 0b level 0
> >  Sent 9773257752 bytes 12277962 pkt (dropped 6292541, overlimits 0 requeues
> > 0) rate 621796Kbit 97644pps backlog 0b 127p requeues 0
> >  lended: 12277835 borrowed: 0 giants: 0
> >  tokens: -7 ctokens: -7
> 6292541 dropped from 12277962 pkt, means 51% dropped. Maybe something fishy 
> here?
> 
> Can you try instead of SFQ - BFIFO? For 100ms buffer, 550Mbit/s it will be 
> ~6875000 bytes bfifo.
> 
> It is by the way too short, IMHO, for this bandwidth, 127 packets is not 
> enough. 127 packets with 800 bytes can buffer 1 second for 812Kbit/s only, 
> and for 550Mbit/s it will buffer data for ~2ms only.
> 

Sure, if the queue is too short we could have a problem with reaching
the expected rate; but here it's all backwards - it could actually
"help" with the stats. ;-)

Jarek P.

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH iproute2] Re: HTB accuracy for high speed
  2009-05-19 11:18                       ` Jarek Poplawski
@ 2009-05-19 11:21                         ` Denys Fedoryschenko
  2009-05-19 11:28                           ` Jarek Poplawski
  0 siblings, 1 reply; 104+ messages in thread
From: Denys Fedoryschenko @ 2009-05-19 11:21 UTC (permalink / raw)
  To: Jarek Poplawski
  Cc: Antonio Almeida, Stephen Hemminger, netdev, kaber, davem, devik,
	Eric Dumazet

On Tuesday 19 May 2009 14:18:57 Jarek Poplawski wrote:
>
> Sure, if the queue is too short we could have a problem with reaching
> the expected rate; but here it's all backwards - it could actually
> "help" with the stats. ;-)
>
> Jarek P.
Well, I had real experience with HTB: when I set buffers that were too short
on my QoS qdiscs, the incoming rate jumped higher than the overall limit.
When I set larger buffers (and, by the way, dropped sfq and used bfifo), the
rate dropped. No idea why: a bug, or something specific to protocol
congestion control. Maybe worth a try...


^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH iproute2] Re: HTB accuracy for high speed
  2009-05-19 11:21                         ` Denys Fedoryschenko
@ 2009-05-19 11:28                           ` Jarek Poplawski
  2009-05-19 14:31                             ` Antonio Almeida
  0 siblings, 1 reply; 104+ messages in thread
From: Jarek Poplawski @ 2009-05-19 11:28 UTC (permalink / raw)
  To: Denys Fedoryschenko
  Cc: Antonio Almeida, Stephen Hemminger, netdev, kaber, davem, devik,
	Eric Dumazet

On Tue, May 19, 2009 at 02:21:28PM +0300, Denys Fedoryschenko wrote:
> On Tuesday 19 May 2009 14:18:57 Jarek Poplawski wrote:
> >
> > Sure, if the queue is too short we could have a problem with reaching
> > the expected rate; but here it's all backwards - it could actually
> > "help" with the stats. ;-)
> >
> > Jarek P.
> Well, i had real experience on HTB, when i set too short buffers on  my QoS 
> qdiscs, the incoming rate jumped too high than overall. When i set larger 
> buffers (and by the way dropped sfq and use bfifo) - it is dropped.  No idea 
> why, bug or specific things in  protocols congestion control. Maybe worth to 
> try...
> 

Very strange. Anyway, "overlimits 0" suggests HTB always got packets
when it needed...

Jarek P.

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: HTB accuracy for high speed
  2009-05-18 22:02         ` Stephen Hemminger
@ 2009-05-19 11:48           ` Antonio Almeida
  2009-05-19 13:08             ` Antonio Almeida
  0 siblings, 1 reply; 104+ messages in thread
From: Antonio Almeida @ 2009-05-19 11:48 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev, jarkao2, kaber, davem, devik, Eric Dumazet

> Are you using one of the AMD dual core machines?  That processor has the bad
> design flaw that the TSC counter is not synced between core's so the kernel can't
> use it. You might even be better off running a non SMP kernel on that box.

My machine has two dual-core AMD Opteron 280 processors. Do I have
that TSC problem?


# dmesg | grep AMD
OEM ID: AMD      Product ID: HAMMER       APIC at: 0xFEE00000
CPU0: AMD Dual Core AMD Opteron(tm) Processor 280 stepping 02
CPU1: AMD Dual Core AMD Opteron(tm) Processor 280 stepping 02
CPU2: AMD Dual Core AMD Opteron(tm) Processor 280 stepping 02
CPU3: AMD Dual Core AMD Opteron(tm) Processor 280 stepping 02


processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 15
model           : 33
model name      : Dual Core AMD Opteron(tm) Processor 280
stepping        : 2
cpu MHz         : 2394.039
cache size      : 1024 KB
physical id     : 0
siblings        : 2
core id         : 0
cpu cores       : 2
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext
fxsr_opt lm 3dnowext 3dnow pni lahf_lm cmp_legacy ts fid vid ttp
bogomips        : 4790.36
clflush size    : 64


  Antonio Almeida

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: HTB accuracy for high speed
  2009-05-19 11:48           ` Antonio Almeida
@ 2009-05-19 13:08             ` Antonio Almeida
  0 siblings, 0 replies; 104+ messages in thread
From: Antonio Almeida @ 2009-05-19 13:08 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev, jarkao2, kaber, davem, devik, Eric Dumazet

Do I have that TSC problem?

On Tue, May 19, 2009 at 12:48 PM, Antonio Almeida <vexwek@gmail.com> wrote:
>> Are you using one of the AMD dual core machines?  That processor has the bad
>> design flaw that the TSC counter is not synced between core's so the kernel can't
>> use it. You might even be better off running a non SMP kernel on that box.
>
> My machine has two dual cores AMD Opteron processor 280. Do I have
> that TSC problem.
>
>
> # dmesg | grep AMD
> OEM ID: AMD      Product ID: HAMMER       APIC at: 0xFEE00000
> CPU0: AMD Dual Core AMD Opteron(tm) Processor 280 stepping 02
> CPU1: AMD Dual Core AMD Opteron(tm) Processor 280 stepping 02
> CPU2: AMD Dual Core AMD Opteron(tm) Processor 280 stepping 02
> CPU3: AMD Dual Core AMD Opteron(tm) Processor 280 stepping 02
>
>
> processor       : 0
> vendor_id       : AuthenticAMD
> cpu family      : 15
> model           : 33
> model name      : Dual Core AMD Opteron(tm) Processor 280
> stepping        : 2
> cpu MHz         : 2394.039
> cache size      : 1024 KB
> physical id     : 0
> siblings        : 2
> core id         : 0
> cpu cores       : 2
> fdiv_bug        : no
> hlt_bug         : no
> f00f_bug        : no
> coma_bug        : no
> fpu             : yes
> fpu_exception   : yes
> cpuid level     : 1
> wp              : yes
> flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
> mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext
> fxsr_opt lm 3dnowext 3dnow pni lahf_lm cmp_legacy ts fid vid ttp
> bogomips        : 4790.36
> clflush size    : 64
>
>
>  Antonio Almeida
>

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH iproute2] Re: HTB accuracy for high speed
  2009-05-18 19:05                 ` Jarek Poplawski
  2009-05-19 10:55                   ` Antonio Almeida
@ 2009-05-19 13:18                   ` Jesper Dangaard Brouer
  2009-05-19 19:35                     ` Jarek Poplawski
  1 sibling, 1 reply; 104+ messages in thread
From: Jesper Dangaard Brouer @ 2009-05-19 13:18 UTC (permalink / raw)
  To: Jarek Poplawski
  Cc: Antonio Almeida, Stephen Hemminger, netdev, kaber, davem, devik,
	Eric Dumazet


On Mon, 18 May 2009, Jarek Poplawski wrote:

> Since you're so kind... :-) There is a line in net/sched/sch_htb.c:
>
> #define HTB_HYSTERESIS 1        /* whether to use mode hysteresis for speedup */
>
> Could you change 1 to 0, and repeat these tests above after recompiling?

Note that it's runtime adjustable via:
  /sys/module/sch_htb/parameters/htb_hysteresis

Since kernel version v2.6.26.


Cheers,
   Jesper Brouer

--
-------------------------------------------------------------------
MSc. Master of Computer Science
Dept. of Computer Science, University of Copenhagen
Author of http://www.adsl-optimizer.dk
-------------------------------------------------------------------

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: HTB accuracy for high speed
  2009-05-19 11:03           ` Jarek Poplawski
@ 2009-05-19 14:04             ` Vladimir Ivashchenko
  2009-05-19 20:10               ` Jarek Poplawski
  0 siblings, 1 reply; 104+ messages in thread
From: Vladimir Ivashchenko @ 2009-05-19 14:04 UTC (permalink / raw)
  To: Jarek Poplawski; +Cc: netdev, kaber, davem, devik, Antonio Almeida

> > Please disregard my comment about HFSC. It still overspills heavily.
> > 
> > On a 400 mbps limit, I'm getting 520 mbps actual throughput.
> 
> I guess you should send some logs. Your previous report seem to show

Can you give some hints on which logs you would like to see?

> the sum of sc rates of of children could be too high. You seem to
> expect the parent's sc and ul should limit this, but actually children
> rates decide and parent's rates are mainly for lending/borrowing (at

The children's ceil rate is 70% of the parent 1:2 class rate.

> least in HTB). So, it would be nice to try with one leaf class first,
> (similarly to Antonio) how high rates are respected.

Unfortunately it's difficult for me to play with classes as it's real traffic.
I'll try to get a traffic generator.

> High drop should be OK if the flow is much faster than scheduling/
> hardware send rate. It could be a bit higher than in older kernels
> because of limited requeuing, but this could be corrected with
> longer queue lenghts (sfq has a very short queue: max 127).

I don't think it's sfq, since I have the same sfq qdiscs with HFSC.

Also, I'm comparing this to my production HTB box, which has 2.6.21.5 with
esfq and no bond (just eth); esfq also has a 127p limit.

I tried to get rid of bond on the outbound traffic, I balanced traffic
via eth0 and eth2 manually by splitting routes going through them.

I still had the same issue with HTB not reaching the full speed.

I'm going to try testing exactly the same configuration on 2.6.29 as I have
on 2.6.21.5 tonight. The only difference would be that I use sfq(dst) instead of
esfq(dst) which is not available on 2.6.29.

-- 
Best Regards
Vladimir Ivashchenko
Chief Technology Officer
PrimeTel, Cyprus - www.prime-tel.com

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH iproute2] Re: HTB accuracy for high speed
  2009-05-19 11:28                           ` Jarek Poplawski
@ 2009-05-19 14:31                             ` Antonio Almeida
  0 siblings, 0 replies; 104+ messages in thread
From: Antonio Almeida @ 2009-05-19 14:31 UTC (permalink / raw)
  To: Jarek Poplawski, Denys Fedoryschenko
  Cc: Stephen Hemminger, netdev, kaber, davem, devik, Eric Dumazet

I tested it with BFIFO using limit 6875000. (The analyser keeps sending
950Mbit/s of 800-byte TCP packets - lots of drops for sure.)
The backlog is now huge but the throughput stays much higher than the
configured ceil.

# tc -s -d class ls dev eth1
class htb 1:10 parent 1:2 rate 900000Kbit ceil 900000Kbit burst
113962b/8 mpu 0b overhead 0b cburst 113962b/8 mpu 0b overhead 0b level
5
 Sent 9542831672 bytes 11988482 pkt (dropped 0, overlimits 0 requeues 0)
 rate 621765Kbit 97639pps backlog 0b 0p requeues 0
 lended: 0 borrowed: 0 giants: 0
 tokens: -186 ctokens: -186

class htb 1:1 root rate 900000Kbit ceil 900000Kbit burst 113962b/8 mpu
0b overhead 0b cburst 113962b/8 mpu 0b overhead 0b level 7
 Sent 9542831672 bytes 11988482 pkt (dropped 0, overlimits 0 requeues 0)
 rate 621765Kbit 97639pps backlog 0b 0p requeues 0
 lended: 0 borrowed: 0 giants: 0
 tokens: -186 ctokens: -186

class htb 1:2 parent 1:1 rate 900000Kbit ceil 900000Kbit burst
113962b/8 mpu 0b overhead 0b cburst 113962b/8 mpu 0b overhead 0b level
6
 Sent 9542831672 bytes 11988482 pkt (dropped 0, overlimits 0 requeues 0)
 rate 621765Kbit 97639pps backlog 0b 0p requeues 0
 lended: 0 borrowed: 0 giants: 0
 tokens: -186 ctokens: -186

class htb 1:108 parent 1:10 leaf 108: prio 7 quantum 1514 rate
555000Kbit ceil 555000Kbit burst 70901b/8 mpu 0b overhead 0b cburst
70901b/8 mpu 0b overhead 0b level 0
 Sent 9549705928 bytes 11997118 pkt (dropped 6092846, overlimits 0 requeues 0)
 rate 621764Kbit 97639pps backlog 0b 8636p requeues 0
 lended: 11988482 borrowed: 0 giants: 0
 tokens: -1008 ctokens: -1008



# tc -s -d qdisc ls dev eth1
qdisc htb 1: root r2q 10 default 0 direct_packets_stat 11955 ver 3.17
 Sent 9608660872 bytes 12071182 pkt (dropped 6124502, overlimits
18190041 requeues 0)
 rate 0bit 0pps backlog 0b 8636p requeues 0
qdisc bfifo 108: parent 1:108 limit 6875000b
 Sent 9599144692 bytes 12059227 pkt (dropped 6124502, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 6874256b 8636p requeues 0


  Antonio Almeida



On Tue, May 19, 2009 at 12:28 PM, Jarek Poplawski <jarkao2@gmail.com> wrote:
> On Tue, May 19, 2009 at 02:21:28PM +0300, Denys Fedoryschenko wrote:
>> On Tuesday 19 May 2009 14:18:57 Jarek Poplawski wrote:
>> >
>> > Sure, if the queue is too short we could have a problem with reaching
>> > the expected rate; but here it's all backwards - it could actually
>> > "help" with the stats. ;-)
>> >
>> > Jarek P.
>> Well, i had real experience on HTB, when i set too short buffers on  my QoS
>> qdiscs, the incoming rate jumped too high than overall. When i set larger
>> buffers (and by the way dropped sfq and use bfifo) - it is dropped.  No idea
>> why, bug or specific things in  protocols congestion control. Maybe worth to
>> try...
>>
>
> Very strange. Anyway, "overlimits 0" suggests HTB always got packets
> when it needed...
>
> Jarek P.
>

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH] pkt_sched: gen_estimator: use 64 bits intermediate counters for bps
  2009-05-19  7:57                   ` Jarek Poplawski
@ 2009-05-19 18:03                     ` Eric Dumazet
  2009-05-19 19:09                       ` [PATCH] pkt_sched: gen_estimator: Fix signed integers right-shifts Jarek Poplawski
  0 siblings, 1 reply; 104+ messages in thread
From: Eric Dumazet @ 2009-05-19 18:03 UTC (permalink / raw)
  To: Jarek Poplawski; +Cc: David Miller, vexwek, netdev, kaber, devik

Jarek Poplawski a écrit :
> On Tue, May 19, 2009 at 07:42:47AM +0000, Jarek Poplawski wrote:
>> On Tue, May 19, 2009 at 09:31:36AM +0200, Eric Dumazet wrote:
>>> Jarek Poplawski a écrit :
>>>> On Tue, May 19, 2009 at 01:59:55AM +0200, Eric Dumazet wrote:
>>>> ...
>>>>> diff --git a/net/core/gen_estimator.c b/net/core/gen_estimator.c
>>>> ...
>>>>> -		e->avbps += ((long)rate - (long)e->avbps) >> e->ewma_log;
>>>>> +		e->avbps += ((s64)(brate - e->avbps)) >> e->ewma_log;
>>>> Btw., I'm a bit concerned about the syntax here: isn't such shifting
>>>> of signed ints implementation dependant?
>>>>
>>> You are right Jarek, I very often forget to never ever use signed quantities
>>> at all ! (But also note original code has same undefined behavior)
>> Sure, I've meant the original code including 5 lines below.
>>
>>> Apparently gcc does the *right* thing on x86_32, but we probably want something
>>> stronger here. I could not find gcc documentation statement on right shifts of 
>>> negative values.
>> I guess gcc and most of others do this "right"; but it looks
>> "unkosher" anyway.
> 
> I might have missed your point here, but would it be so costly to do
> these shifts separately here?

You replied to yourself Jarek :)

As I said earlier, I found your concern right, so please submit a patch ?

I found many occurrences of a right shift on a signed int/long in kernel.
One example being :

arch/x86/mm/init_64.c

int kern_addr_valid(unsigned long addr)
{
	unsigned long above = ((long)addr) >> __VIRTUAL_MASK_SHIFT;


and another rate estimator in drivers/atm/idt77252.c

static void
idt77252_est_timer(unsigned long data)


We could also check net/netfilter/ipvs/ip_vs_est.c (estimation_timer())


^ permalink raw reply	[flat|nested] 104+ messages in thread

* [PATCH] pkt_sched: gen_estimator: Fix signed integers right-shifts.
  2009-05-19 18:03                     ` Eric Dumazet
@ 2009-05-19 19:09                       ` Jarek Poplawski
  2009-05-26  5:47                         ` David Miller
  0 siblings, 1 reply; 104+ messages in thread
From: Jarek Poplawski @ 2009-05-19 19:09 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, vexwek, netdev, kaber, devik

On Tue, May 19, 2009 at 08:03:24PM +0200, Eric Dumazet wrote:
...
> As I said earlier, I found your concern right, so please submit a patch ?

OK, thanks,
Jarek P.
----------------->
pkt_sched: gen_estimator: Fix signed integers right-shifts.

Right-shifts of negative signed integers are implementation-defined, so unportable.

With feedback from: Eric Dumazet <dada1@cosmosbay.com>

Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
---

diff -Nurp a/net/core/gen_estimator.c b/net/core/gen_estimator.c
--- a/net/core/gen_estimator.c	2009-05-19 20:33:47.000000000 +0200
+++ b/net/core/gen_estimator.c	2009-05-19 20:40:58.000000000 +0200
@@ -128,12 +128,12 @@ static void est_timer(unsigned long arg)
 		npackets = e->bstats->packets;
 		brate = (nbytes - e->last_bytes)<<(7 - idx);
 		e->last_bytes = nbytes;
-		e->avbps += ((s64)(brate - e->avbps)) >> e->ewma_log;
+		e->avbps += (brate >> e->ewma_log) - (e->avbps >> e->ewma_log);
 		e->rate_est->bps = (e->avbps+0xF)>>5;
 
 		rate = (npackets - e->last_packets)<<(12 - idx);
 		e->last_packets = npackets;
-		e->avpps += ((long)rate - (long)e->avpps) >> e->ewma_log;
+		e->avpps += (rate >> e->ewma_log) - (e->avpps >> e->ewma_log);
 		e->rate_est->pps = (e->avpps+0x1FF)>>10;
 skip:
 		read_unlock(&est_lock);
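
The update in the diff above can be tried outside the kernel. What follows is a minimal user-space sketch, assuming 64-bit unsigned values for the sample and the running average; the function name ewma_step_unsigned() is hypothetical and is not part of gen_estimator.c. The sample and the average are shifted separately, so a negative value is never right-shifted.

```c
#include <assert.h>
#include <stdint.h>

/*
 * One EWMA step using unsigned right shifts only (hypothetical helper).
 * avg      - current running average
 * sample   - new measurement
 * ewma_log - log2 of the averaging weight
 * Since avg >= (avg >> ewma_log), the result never underflows.
 */
static uint64_t ewma_step_unsigned(uint64_t avg, uint64_t sample,
				   unsigned int ewma_log)
{
	assert(ewma_log < 64);	/* shifting by >= the type width is undefined */
	return avg + (sample >> ewma_log) - (avg >> ewma_log);
}
```

Because each operand loses its low bits on its own, the result can differ by one unit from the single-shift form: with avg 15, sample 16 and ewma_log 3 this sketch yields 16, while shifting the difference (1 >> 3 == 0) would leave 15.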


* Re: [PATCH iproute2] Re: HTB accuracy for high speed
  2009-05-19 13:18                   ` Jesper Dangaard Brouer
@ 2009-05-19 19:35                     ` Jarek Poplawski
  0 siblings, 0 replies; 104+ messages in thread
From: Jarek Poplawski @ 2009-05-19 19:35 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Antonio Almeida, Stephen Hemminger, netdev, kaber, davem, devik,
	Eric Dumazet

On Tue, May 19, 2009 at 03:18:47PM +0200, Jesper Dangaard Brouer wrote:
>
> On Mon, 18 May 2009, Jarek Poplawski wrote:
>
>> Since you're so kind... :-) There is a line in net/sched/sch_htb.c:
>>
>> #define HTB_HYSTERESIS 1        /* whether to use mode hysteresis for speedup */
>>
>> Could you change 1 to 0, and repeat these tests above after recompiling?
>
> Notice its runtime adjustable via:
>  /sys/module/sch_htb/parameters/htb_hysteresis
>
> Since kernel version v2.6.26.

Yes, this should convince Antonio to try something newer.
(Alas it didn't seem to make much difference to his case ;-)

Cheers,
Jarek P.


* Re: HTB accuracy for high speed
  2009-05-19 14:04             ` Vladimir Ivashchenko
@ 2009-05-19 20:10               ` Jarek Poplawski
  2009-05-20 22:07                 ` Vladimir Ivashchenko
  0 siblings, 1 reply; 104+ messages in thread
From: Jarek Poplawski @ 2009-05-19 20:10 UTC (permalink / raw)
  To: Vladimir Ivashchenko; +Cc: netdev, kaber, davem, devik, Antonio Almeida

On Tue, May 19, 2009 at 05:04:16PM +0300, Vladimir Ivashchenko wrote:
> > > Please disregard my comment about HFSC. It still overspills heavily.
> > > 
> > > On a 400 mbps limit, I'm getting 520 mbps actual throughput.
> > 
> > I guess you should send some logs. Your previous report seem to show
> 
> Can you give some hints on which logs you would like to see?

Similarly to Antonio's: ifconfigs and tc -s for qdiscs and classes at
the beginning and at the end of testing.

> > the sum of sc rates of children could be too high. You seem to
> > expect the parent's sc and ul should limit this, but actually children
> > rates decide and parent's rates are mainly for lending/borrowing (at
> 
> The children's ceil rate is 70% of the parent 1:2 class rate.

How about children's main rates?

> > least in HTB). So, it would be nice to try with one leaf class first,
> > (similarly to Antonio) how high rates are respected.
> 
> Unfortunately it's difficult for me to play with classes as it's real traffic.
> I'll try to get a traffic generator.

Let it be the real traffic, but please re-check these rates sums.

> > High drop should be OK if the flow is much faster than scheduling/
> > hardware send rate. It could be a bit higher than in older kernels
> > because of limited requeuing, but this could be corrected with
> > longer queue lengths (sfq has a very short queue: max 127).
> 
> I don't think it's sfq, since I have the same sfq qdiscs with HFSC.
> 
> Also I'm comparing this to my production HTB box, which has 2.6.21.5 with esfq
> and no bond (just eth); esfq also has a 127p limit.
> 
> I tried to get rid of bond on the outbound traffic, I balanced traffic
> via eth0 and eth2 manually by splitting routes going through them.
> 
> I still had the same issue with HTB not reaching the full speed.
> 
> I'm going to try testing exactly the same configuration on 2.6.29 as I have
> on 2.6.21.5 tonight. The only difference would be that I use sfq(dst) instead of
> esfq(dst) which is not available on 2.6.29.

I'm a bit lost about your configs/results and not reaching vs.
overspilled, so please send some new data to compare (gzipped?).

Jarek P. 


* Re: HTB accuracy for high speed
  2009-05-19 20:10               ` Jarek Poplawski
@ 2009-05-20 22:07                 ` Vladimir Ivashchenko
  2009-05-20 22:46                   ` Eric Dumazet
  0 siblings, 1 reply; 104+ messages in thread
From: Vladimir Ivashchenko @ 2009-05-20 22:07 UTC (permalink / raw)
  To: Jarek Poplawski; +Cc: netdev, kaber, davem, devik, Antonio Almeida

[-- Attachment #1: Type: text/plain, Size: 1163 bytes --]


> > > 
> > > I guess you should send some logs. Your previous report seem to show
> > 
> > Can you give some hints on which logs you would like to see?
> 
> Similarly to Antonio's: ifconfigs and tc -s for qdiscs and classes at
> the beginning and at the end of testing.

Ok, it seems that I finally found what is causing my HTB on 2.6.29 not
to reach full throughput: dst hashing on sfq with high divisor value.

2.6.21 esfq divisor 13 depth 4096 hash dst - 680 mbps
2.6.29 sfq WITHOUT "flow hash keys dst ... " (default sfq) - 680 mbps
2.6.29 sfq + "flow hash keys dst divisor 64" filter - 680 mbps
2.6.29 sfq + "flow hash keys dst divisor 256" filter - 660 mbps
2.6.29 sfq + "flow hash keys dst divisor 2048" filters - 460 mbps

I'm using high sfq hash divisor in order to decrease the number of
collisions, there are several thousands of hosts behind each of the
classes. 

Any ideas why increasing the sfq divisor size results in drop of
throughput ?

Attached are diagnostics gathered in case of divisor 2048.

-- 
Best Regards,
Vladimir Ivashchenko
Chief Technology Officer
PrimeTel PLC, Cyprus - www.prime-tel.com
Tel: +357 25 100100 Fax: +357 2210 2211


[-- Attachment #2: sfqtest.tar.gz --]
[-- Type: application/x-compressed-tar, Size: 66428 bytes --]


* Re: HTB accuracy for high speed
  2009-05-20 22:07                 ` Vladimir Ivashchenko
@ 2009-05-20 22:46                   ` Eric Dumazet
  2009-05-21  7:20                     ` Jarek Poplawski
  0 siblings, 1 reply; 104+ messages in thread
From: Eric Dumazet @ 2009-05-20 22:46 UTC (permalink / raw)
  To: Vladimir Ivashchenko
  Cc: Jarek Poplawski, netdev, kaber, davem, devik, Antonio Almeida,
	Corey Hickey

Vladimir Ivashchenko a écrit :
>>>> I guess you should send some logs. Your previous report seem to show
>>> Can you give some hints on which logs you would like to see?
>> Similarly to Antonio's: ifconfigs and tc -s for qdiscs and classes at
>> the beginning and at the end of testing.
> 
> Ok, it seems that I finally found what is causing my HTB on 2.6.29 not
> to reach full throughput: dst hashing on sfq with high divisor value.
> 
> 2.6.21 esfq divisor 13 depth 4096 hash dst - 680 mbps
> 2.6.29 sfq WITHOUT "flow hash keys dst ... " (default sfq) - 680 mbps
> 2.6.29 sfq + "flow hash keys dst divisor 64" filter - 680 mbps
> 2.6.29 sfq + "flow hash keys dst divisor 256" filter - 660 mbps
> 2.6.29 sfq + "flow hash keys dst divisor 2048" filters - 460 mbps
> 
> I'm using high sfq hash divisor in order to decrease the number of
> collisions, there are several thousands of hosts behind each of the
> classes. 
> 
> Any ideas why increasing the sfq divisor size results in drop of
> throughput ?
> 
> Attached are diagnostics gathered in case of divisor 2048.
> 


But... it appears sfq currently supports a fixed divisor of 1024

net/sched/sch_sfq.c

 IMPLEMENTATION:
 This implementation limits maximal queue length to 128;
 maximal mtu to 2^15-1; number of hash buckets to 1024.
 The only goal of this restrictions was that all data
 fit into one 4K page :-). Struct sfq_sched_data is
 organized in anti-cache manner: all the data for a bucket
 are scattered over different locations. This is not good,
 but it allowed me to put it into 4K.

 It is easy to increase these values, but not in flight.  */

#define SFQ_DEPTH   128
#define SFQ_HASH_DIVISOR    1024


Apparently Corey Hickey's 2007 work on SFQ was not merged.

http://kerneltrap.org/mailarchive/linux-netdev/2007/9/28/325048
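
To put the fixed bucket count in perspective, here is a rough sketch of the arithmetic, assuming hosts hash uniformly into a power-of-two number of buckets. Both helper names are hypothetical, the code is not taken from sch_sfq.c, and the figure of 4000 hosts used below is only an illustrative stand-in for "several thousands".

```c
#include <assert.h>
#include <stdint.h>

/* Map a hash value to a bucket; divisor must be a power of two. */
static uint32_t bucket_index(uint32_t hash, uint32_t divisor)
{
	assert(divisor != 0 && (divisor & (divisor - 1)) == 0);
	return hash & (divisor - 1);
}

/*
 * Average number of hosts sharing one bucket, scaled by 10 and
 * truncated, assuming a uniform hash (an idealisation).
 */
static uint64_t hosts_per_bucket_x10(uint64_t hosts, uint64_t buckets)
{
	assert(buckets > 0);
	return hosts * 10 / buckets;
}
```

With 4000 hosts, hosts_per_bucket_x10() returns 39 for 1024 buckets (about 3.9 hosts per bucket) and 19 for 2048 buckets, so collisions are unavoidable either way once hosts outnumber buckets.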




* Re: HTB accuracy for high speed
  2009-05-20 22:46                   ` Eric Dumazet
@ 2009-05-21  7:20                     ` Jarek Poplawski
  2009-05-21  7:44                       ` Vladimir Ivashchenko
  0 siblings, 1 reply; 104+ messages in thread
From: Jarek Poplawski @ 2009-05-21  7:20 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Vladimir Ivashchenko, netdev, kaber, davem, devik,
	Antonio Almeida, Corey Hickey

On Thu, May 21, 2009 at 12:46:16AM +0200, Eric Dumazet wrote:
> Vladimir Ivashchenko a écrit :
> >>>> I guess you should send some logs. Your previous report seem to show
> >>> Can you give some hints on which logs you would like to see?
> >> Similarly to Antonio's: ifconfigs and tc -s for qdiscs and classes at
> >> the beginning and at the end of testing.
> > 
> > Ok, it seems that I finally found what is causing my HTB on 2.6.29 not
> > to reach full throughput: dst hashing on sfq with high divisor value.
> > 
> > 2.6.21 esfq divisor 13 depth 4096 hash dst - 680 mbps
> > 2.6.29 sfq WITHOUT "flow hash keys dst ... " (default sfq) - 680 mbps
> > 2.6.29 sfq + "flow hash keys dst divisor 64" filter - 680 mbps
> > 2.6.29 sfq + "flow hash keys dst divisor 256" filter - 660 mbps
> > 2.6.29 sfq + "flow hash keys dst divisor 2048" filters - 460 mbps
> > 
> > I'm using high sfq hash divisor in order to decrease the number of
> > collisions, there are several thousands of hosts behind each of the
> > classes. 
> > 
> > Any ideas why increasing the sfq divisor size results in drop of
> > throughput ?
> > 
> > Attached are diagnostics gathered in case of divisor 2048.
> > 
> 
> 
> But... it appears sfq currently supports a fixed divisor of 1024
> 
> net/sched/sch_sfq.c
> 
>  IMPLEMENTATION:
>  This implementation limits maximal queue length to 128;
>  maximal mtu to 2^15-1; number of hash buckets to 1024.
>  The only goal of this restrictions was that all data
>  fit into one 4K page :-). Struct sfq_sched_data is
>  organized in anti-cache manner: all the data for a bucket
>  are scattered over different locations. This is not good,
>  but it allowed me to put it into 4K.
> 
>  It is easy to increase these values, but not in flight.  */
> 
> #define SFQ_DEPTH   128
> #define SFQ_HASH_DIVISOR    1024
> 
> 
> Apparently Corey Hickey's 2007 work on SFQ was not merged.
> 
> http://kerneltrap.org/mailarchive/linux-netdev/2007/9/28/325048

Yes, sfq has its design limits, and as a matter of fact, because of
its max queue length (127) it should be treated as a toy or "personal" qdisc.

I don't know why more of esfq wasn't merged. Anyway, similar
functionality can be achieved in current kernels with sch_drr +
cls_flow, which alas are not documented well enough. Here is a hint:
http://markmail.org/message/h24627xkrxyqxn4k

Jarek P.

PS: I guess you weren't very consistent about whether your main problem
was exceeding or not reaching the htb rate, and there is quite a difference.

Vladimir Ivashchenko wrote, On 05/08/2009 10:46 PM:

> Exporting HZ=1000 doesn't help. However, even if I recompile the kernel
> to 1000 Hz and the burst is calculated correctly, for some reason HTB on
> 2.6.29 is still worse at rate control than 2.6.21.
> 
> With 2.6.21, ceil of 775 mbits, burst 99425b -> actual rate 825 mbits.
> With 2.6.29, same ceil/burst -> actual rate 890 mbits.
...

Vladimir Ivashchenko wrote, On 05/17/2009 10:29 PM:

> Hi Antonio,
> 
> FYI, these are exactly the same problems I get in real life.
> Check the later posts in "bond + tc regression" thread.
...


* Re: HTB accuracy for high speed
  2009-05-21  7:20                     ` Jarek Poplawski
@ 2009-05-21  7:44                       ` Vladimir Ivashchenko
  2009-05-21  8:28                         ` Jarek Poplawski
  0 siblings, 1 reply; 104+ messages in thread
From: Vladimir Ivashchenko @ 2009-05-21  7:44 UTC (permalink / raw)
  To: Jarek Poplawski
  Cc: Eric Dumazet, netdev, kaber, davem, devik, Antonio Almeida, Corey Hickey

> I don't know why more of esfq wasn't merged, anyway similar
> functionality could be achieved in current kernels with sch_drr +
> cls_flow, alas not enough documented. Here is some hint:
> http://markmail.org/message/h24627xkrxyqxn4k

Can I balance only by destination IP using this approach? 
Normal IP flow-based balancing is not good for me, I need 
to ensure equality between destination hosts.

> 
> Jarek P.
> 
> PS: I guess, you wasn't very consistent if your main problem was
> exceeding or not reaching htb rate, and there is quite a difference.

Yes indeed :(

I'm trying to migrate from 2.6.21 eth/htb/esfq to 2.6.29 
bond/htb/sfq, and that introduces a lot of changes.

Apparently at some point I changed the sfq divisor from 1024
to 2048 and forgot about it.

Now I realize that the problems I reported were as follows:

1) HTB exceeds target when I use HTB + sfq + divisor 1024
2) HFSC exceeds target when I use HFSC + sfq + divisor 1024
3) HTB does not reach target when I use HTB + sfq + divisor 2048

I will check again scenario 1) with the latest patches from
the list.

> Vladimir Ivashchenko wrote, On 05/08/2009 10:46 PM:
> 
> > Exporting HZ=1000 doesn't help. However, even if I recompile the kernel
> > to 1000 Hz and the burst is calculated correctly, for some reason HTB on
> > 2.6.29 is still worse at rate control than 2.6.21.
> > 
> > With 2.6.21, ceil of 775 mbits, burst 99425b -> actual rate 825 mbits.
> > With 2.6.29, same ceil/burst -> actual rate 890 mbits.
> ...
> 
> Vladimir Ivashchenko wrote, On 05/17/2009 10:29 PM:
> 
> > Hi Antonio,
> > 
> > FYI, these are exactly the same problems I get in real life.
> > Check the later posts in "bond + tc regression" thread.
> ...

-- 
Best Regards
Vladimir Ivashchenko
Chief Technology Officer
PrimeTel, Cyprus - www.prime-tel.com


* Re: HTB accuracy for high speed
  2009-05-21  7:44                       ` Vladimir Ivashchenko
@ 2009-05-21  8:28                         ` Jarek Poplawski
  2009-05-21  9:07                           ` Eric Dumazet
  2009-05-23 10:37                           ` HTB accuracy for high speed (and bonding) Vladimir Ivashchenko
  0 siblings, 2 replies; 104+ messages in thread
From: Jarek Poplawski @ 2009-05-21  8:28 UTC (permalink / raw)
  To: Vladimir Ivashchenko
  Cc: Eric Dumazet, netdev, kaber, davem, devik, Antonio Almeida, Corey Hickey

On Thu, May 21, 2009 at 10:44:00AM +0300, Vladimir Ivashchenko wrote:
> > I don't know why more of esfq wasn't merged, anyway similar
> > functionality could be achieved in current kernels with sch_drr +
> > cls_flow, alas not enough documented. Here is some hint:
> > http://markmail.org/message/h24627xkrxyqxn4k
> 
> Can I balance only by destination IP using this approach? 
> Normal IP flow-based balancing is not good for me, I need 
> to ensure equality between destination hosts.

Yes, you need to use flow "dst" key, I guess. (tc filter add flow help)

Jarek P.

> > PS: I guess, you wasn't very consistent if your main problem was
> > exceeding or not reaching htb rate, and there is quite a difference.
> 
> Yes indeed :(

Generally, the most common reasons are:
- too short (or zero) tx queue length and/or some disturbances in
  maintaining the flow - for not reaching the rate
- gso/tso or other non-standard packet sizes - for exceeding the
  rate.


* Re: [PATCH iproute2] Re: HTB accuracy for high speed
  2009-05-18 17:16         ` Antonio Almeida
@ 2009-05-21  8:51           ` Jarek Poplawski
  2009-05-22 17:42             ` Antonio Almeida
  0 siblings, 1 reply; 104+ messages in thread
From: Jarek Poplawski @ 2009-05-21  8:51 UTC (permalink / raw)
  To: Antonio Almeida; +Cc: Stephen Hemminger, netdev, kaber, davem, devik

On Mon, May 18, 2009 at 06:16:26PM +0100, Antonio Almeida wrote:
> I forgot to tell you that I used tc source code from iproute2-2.6.16.
> I couldn't use the newest version because I got errors when compiling.

I still have no clue about the reason, but it would be really nice to
do a short test with a more current kernel (>= 2.6.27) and iproute2
(to exclude the possibility of some incompatibility in configs, e.g.
rate tables passed to htb).

Thanks,
Jarek P.


* Re: HTB accuracy for high speed
  2009-05-21  8:28                         ` Jarek Poplawski
@ 2009-05-21  9:07                           ` Eric Dumazet
  2009-05-21  9:22                             ` Jarek Poplawski
  2009-05-23 10:37                           ` HTB accuracy for high speed (and bonding) Vladimir Ivashchenko
  1 sibling, 1 reply; 104+ messages in thread
From: Eric Dumazet @ 2009-05-21  9:07 UTC (permalink / raw)
  To: Jarek Poplawski
  Cc: Vladimir Ivashchenko, netdev, kaber, davem, devik,
	Antonio Almeida, Corey Hickey

Jarek Poplawski a écrit :
> On Thu, May 21, 2009 at 10:44:00AM +0300, Vladimir Ivashchenko wrote:
>>> I don't know why more of esfq wasn't merged, anyway similar
>>> functionality could be achieved in current kernels with sch_drr +
>>> cls_flow, alas not enough documented. Here is some hint:
>>> http://markmail.org/message/h24627xkrxyqxn4k
>> Can I balance only by destination IP using this approach? 
>> Normal IP flow-based balancing is not good for me, I need 
>> to ensure equality between destination hosts.
> 
> Yes, you need to use flow "dst" key, I guess. (tc filter add flow help)
> 
> Jarek P.
> 
>>> PS: I guess, you wasn't very consistent if your main problem was
>>> exceeding or not reaching htb rate, and there is quite a difference.
>> Yes indeed :(
> 
> Generally, the most common reasons are:
> - too short (or zero) tx queue length or/plus some disturbances in
>   maintaining the flow - for not reaching the rate


> - gso/tso or other non standard packets sizes - for exceeding the
>   rate.

Could we detect this at runtime and emit a warning (once) ?

Or should we assume guys using this stuff should be smart enough ?
I confess I made this error once and this was not so easy to spot...
	



* Re: HTB accuracy for high speed
  2009-05-21  9:07                           ` Eric Dumazet
@ 2009-05-21  9:22                             ` Jarek Poplawski
  0 siblings, 0 replies; 104+ messages in thread
From: Jarek Poplawski @ 2009-05-21  9:22 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Vladimir Ivashchenko, netdev, kaber, davem, devik,
	Antonio Almeida, Corey Hickey

On Thu, May 21, 2009 at 11:07:24AM +0200, Eric Dumazet wrote:
...
> > - gso/tso or other non standard packets sizes - for exceeding the
> >   rate.
> 
> Could we detect this at runtime and emit a warning (once) ?

I guess, it's a rhetorical question...

Jarek P.


* Re: [PATCH iproute2] Re: HTB accuracy for high speed
  2009-05-21  8:51           ` Jarek Poplawski
@ 2009-05-22 17:42             ` Antonio Almeida
  2009-05-23  7:32               ` Jarek Poplawski
  0 siblings, 1 reply; 104+ messages in thread
From: Antonio Almeida @ 2009-05-22 17:42 UTC (permalink / raw)
  To: Jarek Poplawski
  Cc: Stephen Hemminger, netdev, kaber, davem, devik, Eric Dumazet,
	Vladimir Ivashchenko

On Thu, May 21, 2009 at 9:51 AM, Jarek Poplawski wrote:
> I still have no clue about the reason, but it would be really nice to
> do some short test with more current kernel (>= 2.6.27) and iproute2
> (to exclude the possibility of some incompatibility in configs e.g.
> rate tables passed to htb).

I installed kernel 2.6.29 (finally! it wasn't easy... I couldn't set
memory split 2G/2G),
but the results are the same. I've already applied the gen_estimator.c
patches (works fine).

# tc -s -d class ls dev eth1 | head -24
class htb 1:1 root rate 900000Kbit ceil 900000Kbit burst 113962b/8 mpu 0b overhead 0b cburst 113962b/8 mpu 0b overhead 0b level 7
 Sent 119955303928 bytes 150697618 pkt (dropped 0, overlimits 0 requeues 0)
 rate 621844Kbit 97651pps backlog 0b 0p requeues 0
 lended: 0 borrowed: 0 giants: 0
 tokens: 402 ctokens: 402

class htb 1:10 parent 1:2 rate 900000Kbit ceil 900000Kbit burst 113962b/8 mpu 0b overhead 0b cburst 113962b/8 mpu 0b overhead 0b level 5
 Sent 119955303928 bytes 150697618 pkt (dropped 0, overlimits 0 requeues 0)
 rate 621844Kbit 97651pps backlog 0b 0p requeues 0
 lended: 0 borrowed: 0 giants: 0
 tokens: 402 ctokens: 402

class htb 1:108 parent 1:10 leaf 108: prio 7 quantum 1514 rate 555000Kbit ceil 555000Kbit burst 70901b/8 mpu 0b overhead 0b cburst 70901b/8 mpu 0b overhead 0b level 0
 Sent 119955366812 bytes 150697697 pkt (dropped 76696483, overlimits 0 requeues 0)
 rate 621847Kbit 97652pps backlog 0b 79p requeues 0
 lended: 150697618 borrowed: 0 giants: 0
 tokens: -5 ctokens: -5

class htb 1:2 parent 1:1 rate 900000Kbit ceil 900000Kbit burst 113962b/8 mpu 0b overhead 0b cburst 113962b/8 mpu 0b overhead 0b level 6
 Sent 119955303928 bytes 150697618 pkt (dropped 0, overlimits 0 requeues 0)
 rate 621844Kbit 97651pps backlog 0b 0p requeues 0
 lended: 0 borrowed: 0 giants: 0
 tokens: 402 ctokens: 402


# cat /sys/module/sch_htb/parameters/htb_hysteresis
0

# ethtool -k eth0
Offload parameters for eth0:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp segmentation offload: off
udp fragmentation offload: off
generic segmentation offload: off

# ethtool -k eth1
Offload parameters for eth1:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp segmentation offload: off
udp fragmentation offload: off
generic segmentation offload: off
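
To put a number on the overshoot in the tc output above: leaf class 1:108 is configured for 555000Kbit, but its rate estimator reports 621847Kbit. The arithmetic can be sketched as below; the helper name overshoot_permille() is hypothetical, and the result is truncated to whole tenths of a percent.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Difference between measured and configured rate, in tenths of a
 * percent of the configured rate (hypothetical helper). Positive means
 * the class sent faster than configured, negative means slower.
 */
static int64_t overshoot_permille(uint64_t configured_kbit,
				  uint64_t measured_kbit)
{
	assert(configured_kbit > 0);	/* avoid division by zero */
	return ((int64_t)measured_kbit - (int64_t)configured_kbit) * 1000
		/ (int64_t)configured_kbit;
}
```

Calling overshoot_permille(555000, 621847) gives 120, i.e. roughly 12.0% above the configured rate.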


I'm working on a newer iproute2.


  Antonio Almeida


* Re: [PATCH iproute2] Re: HTB accuracy for high speed
  2009-05-22 17:42             ` Antonio Almeida
@ 2009-05-23  7:32               ` Jarek Poplawski
  2009-05-28 18:13                 ` Antonio Almeida
  0 siblings, 1 reply; 104+ messages in thread
From: Jarek Poplawski @ 2009-05-23  7:32 UTC (permalink / raw)
  To: Antonio Almeida
  Cc: Stephen Hemminger, netdev, kaber, davem, devik, Eric Dumazet,
	Vladimir Ivashchenko

On Fri, May 22, 2009 at 06:42:16PM +0100, Antonio Almeida wrote:
> On Thu, May 21, 2009 at 9:51 AM, Jarek Poplawski wrote:
> > I still have no clue about the reason, but it would be really nice to
> > do some short test with more current kernel (>= 2.6.27) and iproute2
> > (to exclude the possibility of some incompatibility in configs e.g.
> > rate tables passed to htb).
> 
> I installed kernel 2.6.29 (finally! wasn't easy... I couldn't set
> memory split 2G/2G),
> but the results are the same. I've already applied gen_estimator.c
> patches (works fine).
...
> I'm working on a newer iproute2.

Actually, of these two I was more interested in an iproute2 that better
matches the kernel version. :-( (It should be enough to have at least
tc compiled properly, I guess.)

Btw., if at any point you find this testing too disruptive,
feel free to stop or postpone it as you like.

Thanks,
Jarek P.


* Re: HTB accuracy for high speed (and bonding)
  2009-05-21  8:28                         ` Jarek Poplawski
  2009-05-21  9:07                           ` Eric Dumazet
@ 2009-05-23 10:37                           ` Vladimir Ivashchenko
  2009-05-23 14:34                             ` Jarek Poplawski
  1 sibling, 1 reply; 104+ messages in thread
From: Vladimir Ivashchenko @ 2009-05-23 10:37 UTC (permalink / raw)
  To: Jarek Poplawski; +Cc: Eric Dumazet, netdev


> > > cls_flow, alas not enough documented. Here is some hint:
> > > http://markmail.org/message/h24627xkrxyqxn4k
> > 
> > Can I balance only by destination IP using this approach? 
> > Normal IP flow-based balancing is not good for me, I need 
> > to ensure equality between destination hosts.
> 
> Yes, you need to use flow "dst" key, I guess. (tc filter add flow
> help)

What is the number of DRR classes I need to create, a separate class for
each host? I have around 20000 hosts.

I figured out that WRR does what I want and it's documented, so I'm using
a 2.6.27 kernel with WRR now.

I was still hitting a wall with bonding. I played with a lot of
combinations and could not find a way to make it scale to multiple
cores. Cores which handle incoming traffic would get hit to 0-20% idle.

So, I got rid of bonding completely and instead configured PBR on Cisco
+ Linux routing in such a way so that packet gets received and
transmitted using NICs connected to the same pair of cores with common
cache. 65-70% idle on all cores now, compared to 0-30% idle in worst
case scenarios before.

> - gso/tso or other non standard packets sizes - for exceeding the
>   rate.

Just FYI, kernel 2.6.29.1, sub-classes with sfq divisor 1024, tso & gso
off, netdevice.h and tc_core.c patches applied:

class htb 1:2 root rate 775000Kbit ceil 775000Kbit burst 98328b cburst 98328b
Sent 64883444467 bytes 72261124 pkt (dropped 0, overlimits 0 requeues 0)
rate 821332Kbit 112572pps backlog 0b 0p requeues 0
lended: 21736738 borrowed: 0 giants: 0

In any case, exceeding the rate is not much of a problem for me.

Thanks a lot to everyone for their help.

-- 
Best Regards,
Vladimir Ivashchenko
Chief Technology Officer
PrimeTel PLC, Cyprus - www.prime-tel.com
Tel: +357 25 100100 Fax: +357 2210 2211




* Re: HTB accuracy for high speed (and bonding)
  2009-05-23 10:37                           ` HTB accuracy for high speed (and bonding) Vladimir Ivashchenko
@ 2009-05-23 14:34                             ` Jarek Poplawski
  2009-05-23 15:06                               ` Vladimir Ivashchenko
  0 siblings, 1 reply; 104+ messages in thread
From: Jarek Poplawski @ 2009-05-23 14:34 UTC (permalink / raw)
  To: Vladimir Ivashchenko; +Cc: Eric Dumazet, netdev

On Sat, May 23, 2009 at 01:37:32PM +0300, Vladimir Ivashchenko wrote:
> 
> > > > cls_flow, alas not enough documented. Here is some hint:
> > > > http://markmail.org/message/h24627xkrxyqxn4k
> > > 
> > > Can I balance only by destination IP using this approach? 
> > > Normal IP flow-based balancing is not good for me, I need 
> > > to ensure equality between destination hosts.
> > 
> > Yes, you need to use flow "dst" key, I guess. (tc filter add flow
> > help)
> 
> What is the number of DRR classes I need to create, a separate class for
> each host? I have around 20000 hosts.

One class per divisor.

> I figured out that WRR does what I want and its documented, so I'm using
> a 2.6.27 kernel with WRR now.

OK if it works for you.
 
> I was still hitting a wall with bonding. I played with a lot of
> combinations and could not find a way to make it scale to multiple
> cores. Cores which handle incoming traffic would get hit to 0-20% idle.
> 
> So, I got rid of bonding completely and instead configured PBR on Cisco
> + Linux routing in such a way so that packet gets received and
> transmitted using NICs connected to the same pair of cores with common
> cache. 65-70% idle on all cores now, compared to 0-30% idle in worst
> case scenarios before.

As a matter of fact I don't understand this bonding idea vs. smp: I
guess Eric Dumazet wrote why it's wrong wrt. locking. I'm not an smp
expert but I think the most efficient use is with separate NICs per
cpu (so with separate HTB qdiscs if possible), or multiqueue NICs -
but they would currently need a common HTB etc., so again a common
locking/cache problem.

> > - gso/tso or other non standard packets sizes - for exceeding the
> >   rate.
> 
> Just FYI, kernel 2.6.29.1, sub-classes with sfq divisor 1024, tso & gso
> off, netdevice.h and tc_core.c patches applied:
> 
> class htb 1:2 root rate 775000Kbit ceil 775000Kbit burst 98328b cburst 98328b
> Sent 64883444467 bytes 72261124 pkt (dropped 0, overlimits 0 requeues 0)
> rate 821332Kbit 112572pps backlog 0b 0p requeues 0
> lended: 21736738 borrowed: 0 giants: 0
> 
> In any case, exceeding the rate is not big of a problem for me.

Anyway, I'd be interested in the full tc -s class & qdisc report.

Thanks,
Jarek P.


* Re: HTB accuracy for high speed (and bonding)
  2009-05-23 14:34                             ` Jarek Poplawski
@ 2009-05-23 15:06                               ` Vladimir Ivashchenko
  2009-05-23 15:35                                 ` Jarek Poplawski
  0 siblings, 1 reply; 104+ messages in thread
From: Vladimir Ivashchenko @ 2009-05-23 15:06 UTC (permalink / raw)
  To: Jarek Poplawski; +Cc: Eric Dumazet, netdev

> > So, I got rid of bonding completely and instead configured PBR on Cisco
> > + Linux routing in such a way so that packet gets received and
> > transmitted using NICs connected to the same pair of cores with common
> > cache. 65-70% idle on all cores now, compared to 0-30% idle in worst
> > case scenarios before.
> 
> As a matter of fact I don't understand this bonding idea vs. smp: I
> guess Eric Dumazet wrote why it's wrong wrt. locking. I'm not an smp
> expert but I think the most efficient use is with separate NICs per
> cpu (so with separate HTB qdiscs if possible), or multiqueue NICs -

I tried the following scenario: 2 NICs used for receive + another 2 NICs 
used for transmit having HTB. Each NIC on a separate core. No bonding, 
just manual load balancing using IP routing.

The result was that RX cores would be 20% and 40% idle respectively, even 
though the amount of traffic they were receiving was roughly the same. 
The TX cores were idling at around 90%. 

I found this strange personally, but I'm completely ignorant of the internals of
kernel operation.

-- 
Best Regards
Vladimir Ivashchenko
Chief Technology Officer
PrimeTel, Cyprus - www.prime-tel.com


* Re: HTB accuracy for high speed (and bonding)
  2009-05-23 15:06                               ` Vladimir Ivashchenko
@ 2009-05-23 15:35                                 ` Jarek Poplawski
  2009-05-23 15:53                                   ` Vladimir Ivashchenko
  0 siblings, 1 reply; 104+ messages in thread
From: Jarek Poplawski @ 2009-05-23 15:35 UTC (permalink / raw)
  To: Vladimir Ivashchenko; +Cc: Eric Dumazet, netdev

On Sat, May 23, 2009 at 06:06:30PM +0300, Vladimir Ivashchenko wrote:
> > > So, I got rid of bonding completely and instead configured PBR on Cisco
> > > + Linux routing in such a way so that packet gets received and
> > > transmitted using NICs connected to the same pair of cores with common
> > > cache. 65-70% idle on all cores now, compared to 0-30% idle in worst
> > > case scenarios before.
> > 
> > As a matter of fact I don't understand this bonding idea vs. smp: I
> > guess Eric Dumazet wrote why it's wrong wrt. locking. I'm not an smp
> > expert but I think the most efficient use is with separate NICs per
> > cpu (so with separate HTB qdiscs if possible), or multiqueue NICs -
> 
> I tried the following scenario: 2 NICs used for receive + another 2 NICs 
> used for transmit having HTB. Each NIC on a separate core. No bonding, 
> just manual load balancing using IP routing.
> 
> The result was that RX cores would be 20% and 40% idle respectively, even 
> though the amount of traffic they were receiving was roughly the same. 
> The TX cores were idling at around 90%. 

There is not enough data to analyse this, but generally you should aim
at maintaining one flow (RX + TX) on the same cpu cache.

Jarek P.


* Re: HTB accuracy for high speed (and bonding)
  2009-05-23 15:35                                 ` Jarek Poplawski
@ 2009-05-23 15:53                                   ` Vladimir Ivashchenko
  2009-05-23 16:02                                     ` Jarek Poplawski
  0 siblings, 1 reply; 104+ messages in thread
From: Vladimir Ivashchenko @ 2009-05-23 15:53 UTC (permalink / raw)
  To: Jarek Poplawski; +Cc: Eric Dumazet, netdev

> > > > So, I got rid of bonding completely and instead configured PBR on Cisco
> > > > + Linux routing in such a way so that packet gets received and
> > > > transmitted using NICs connected to the same pair of cores with common
> > > > cache. 65-70% idle on all cores now, compared to 0-30% idle in worst
> > > > case scenarios before.
> > > 
> > > As a matter of fact I don't understand this bonding idea vs. smp: I
> > > guess Eric Dumazet wrote why it's wrong wrt. locking. I'm not an smp
> > > expert but I think the most efficient use is with separate NICs per
> > > cpu (so with separate HTB qdiscs if possible), or multiqueue NICs -
> > 
> > I tried the following scenario: 2 NICs used for receive + another 2 NICs 
> > used for transmit having HTB. Each NIC on a separate core. No bonding, 
> > just manual load balancing using IP routing.
> > 
> > The result was that RX cores would be 20% and 40% idle respectively, even 
> > though the amount of traffic they were receiving was roughly the same. 
> > The TX cores were idling at around 90%. 
> 
> There is not enough data to analyse this, but generally you should aim
> at maintaining one flow (RX + TX) on the same cpu cache.

Yep, that's what I did in the end (as per the top paragraph).

-- 
Best Regards
Vladimir Ivashchenko
Chief Technology Officer
PrimeTel, Cyprus - www.prime-tel.com


* Re: HTB accuracy for high speed (and bonding)
  2009-05-23 15:53                                   ` Vladimir Ivashchenko
@ 2009-05-23 16:02                                     ` Jarek Poplawski
  0 siblings, 0 replies; 104+ messages in thread
From: Jarek Poplawski @ 2009-05-23 16:02 UTC (permalink / raw)
  To: Vladimir Ivashchenko; +Cc: Eric Dumazet, netdev

On Sat, May 23, 2009 at 06:53:21PM +0300, Vladimir Ivashchenko wrote:
> > > > > So, I got rid of bonding completely and instead configured PBR on Cisco
> > > > > + Linux routing in such a way so that packet gets received and
> > > > > transmitted using NICs connected to the same pair of cores with common
> > > > > cache. 65-70% idle on all cores now, compared to 0-30% idle in worst
> > > > > case scenarios before.
...
> > There is not enough data to analyse this, but generally you should aim
> > at maintaining one flow (RX + TX) on the same cpu cache.
> 
> Yep, that's what I did in the end (as per the top paragraph).

So, stop writing: "I'm completely ignorant in internals of kernel
operation" because you're an smp expert now! ;-)

Jarek P.


* Re: [PATCH] pkt_sched: gen_estimator: Fix signed integers right-shifts.
  2009-05-19 19:09                       ` [PATCH] pkt_sched: gen_estimator: Fix signed integers right-shifts Jarek Poplawski
@ 2009-05-26  5:47                         ` David Miller
  0 siblings, 0 replies; 104+ messages in thread
From: David Miller @ 2009-05-26  5:47 UTC (permalink / raw)
  To: jarkao2; +Cc: dada1, vexwek, netdev, kaber, devik

From: Jarek Poplawski <jarkao2@gmail.com>
Date: Tue, 19 May 2009 21:09:15 +0200

> pkt_sched: gen_estimator: Fix signed integers right-shifts.
> 
> Right-shifts of signed integers are implementation-defined so unportable.
> 
> With feedback from: Eric Dumazet <dada1@cosmosbay.com>
> 
> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>

Applied to net-next-2.6, thanks!


* Re: [PATCH iproute2] Re: HTB accuracy for high speed
  2009-05-23  7:32               ` Jarek Poplawski
@ 2009-05-28 18:13                 ` Antonio Almeida
  2009-05-28 21:12                   ` Jarek Poplawski
  0 siblings, 1 reply; 104+ messages in thread
From: Antonio Almeida @ 2009-05-28 18:13 UTC (permalink / raw)
  To: Jarek Poplawski
  Cc: Stephen Hemminger, netdev, kaber, davem, devik, Eric Dumazet,
	Vladimir Ivashchenko

On Sat, May 23, 2009 at 8:32 AM, Jarek Poplawski wrote:
> Actually, from these two I was more interested in iproute2 more
> fitting the kernel version. :-((It should be enough to have at least
> tc compiled properly, I guess.)
I installed iproute2-ss090115 with the new patch, but the results are
the same for my test scenario. HTB keeps sending 620Mbit/s when I
configure its ceil to 555Mbit/s, with 800-byte packets.

> Btw.: if at any point you think this testing is too disturbing to you
> etc., feel free to stop this or delay in time as you like.
I'm working on this, don't worry. Since I have a traffic
generator/analyser, I can test any modification you make.
Feel free to ask.

I've been looking inside the htb source code. The granularity problem
could be in the use of qdisc_rate_table or near it.


  Antonio Almeida


* Re: [PATCH iproute2] Re: HTB accuracy for high speed
  2009-05-28 18:13                 ` Antonio Almeida
@ 2009-05-28 21:12                   ` Jarek Poplawski
  2009-05-29 17:02                     ` Antonio Almeida
  0 siblings, 1 reply; 104+ messages in thread
From: Jarek Poplawski @ 2009-05-28 21:12 UTC (permalink / raw)
  To: Antonio Almeida
  Cc: Stephen Hemminger, netdev, kaber, davem, devik, Eric Dumazet,
	Vladimir Ivashchenko

On Thu, May 28, 2009 at 07:13:40PM +0100, Antonio Almeida wrote:
> On Sat, May 23, 2009 at 8:32 AM, Jarek Poplawski wrote:
> > Actually, from these two I was more interested in iproute2 more
> > fitting the kernel version. :-((It should be enough to have at least
> > tc compiled properly, I guess.)
> I installed iproute2-ss090115 with the new patch but the results are
> the same for my test scenery. HTB keeps sending 620Mbit/s when I
> configure it's ceil to 555Mbit/s, with 800 bytes packets long.
> 
> > Btw.: if at any point you think this testing is too disturbing to you
> > etc., feel free to stop this or delay in time as you like.
> I'm working on this, don't worry. Since I have a traffic
> generator/analyser, any modification you would make I can test it.
> You're free to ask.
> 
> I've been looking inside htb source code. The granularity problem
> could be in the use qdisc_rate_table or near that.

Yes, but according to my assessment there should be "only" 50Mbit
difference for this rate/packet size. Anyway, could you try a testing
patch below, which should add some granularity to this rate table?

Thanks,
Jarek P.
---

 include/net/pkt_sched.h |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/net/pkt_sched.h b/include/net/pkt_sched.h
index e37fe31..f0faf03 100644
--- a/include/net/pkt_sched.h
+++ b/include/net/pkt_sched.h
@@ -42,8 +42,8 @@ typedef u64	psched_time_t;
 typedef long	psched_tdiff_t;
 
 /* Avoid doing 64 bit divide by 1000 */
-#define PSCHED_US2NS(x)			((s64)(x) << 10)
-#define PSCHED_NS2US(x)			((x) >> 10)
+#define PSCHED_US2NS(x)			((s64)(x) << 6)
+#define PSCHED_NS2US(x)			((x) >> 6)
 
 #define PSCHED_TICKS_PER_SEC		PSCHED_NS2US(NSEC_PER_SEC)
 #define PSCHED_PASTPERFECT		0
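
For reference, a rough sketch of what the two shift values mean for the
tick size. The helper names below are made up for illustration; only the
shifts 10 and 6 come from the patch above.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000LL

/* Length of one scheduler tick in nanoseconds for a given shift
 * (hypothetical helper, not a kernel function). */
int64_t psched_tick_ns(unsigned shift)
{
	assert(shift < 63);
	return (int64_t)1 << shift;
}

/* Ticks per second for a given shift; this mirrors
 * PSCHED_NS2US(NSEC_PER_SEC), i.e. PSCHED_TICKS_PER_SEC. */
int64_t psched_ticks_per_sec(unsigned shift)
{
	assert(shift < 63);
	return NSEC_PER_SEC >> shift;
}
```

With a shift of 10 one tick is 1024 ns (976562 ticks per second); with a
shift of 6 it is 64 ns (15625000 ticks per second), so a per-packet time
stored in ticks can be 16 times finer.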


* Re: [PATCH iproute2] Re: HTB accuracy for high speed
  2009-05-28 21:12                   ` Jarek Poplawski
@ 2009-05-29 17:02                     ` Antonio Almeida
  2009-05-29 17:28                       ` Stephen Hemminger
                                         ` (2 more replies)
  0 siblings, 3 replies; 104+ messages in thread
From: Antonio Almeida @ 2009-05-29 17:02 UTC (permalink / raw)
  To: Jarek Poplawski
  Cc: Stephen Hemminger, netdev, kaber, davem, devik, Eric Dumazet,
	Vladimir Ivashchenko

On Thu, May 28, 2009 at 10:12 PM, Jarek Poplawski wrote:
> Yes, but according to my assessment there should be "only" 50Mbit
> difference for this rate/packet size. Anyway, could you try a testing
> patch below, which should add some granularity to this rate table?
>
> Thanks,
> Jarek P.
> ---
>
>  include/net/pkt_sched.h |    4 ++--
>  1 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/include/net/pkt_sched.h b/include/net/pkt_sched.h
> index e37fe31..f0faf03 100644
> --- a/include/net/pkt_sched.h
> +++ b/include/net/pkt_sched.h
> @@ -42,8 +42,8 @@ typedef u64   psched_time_t;
>  typedef long   psched_tdiff_t;
>
>  /* Avoid doing 64 bit divide by 1000 */
> -#define PSCHED_US2NS(x)                        ((s64)(x) << 10)
> -#define PSCHED_NS2US(x)                        ((x) >> 10)
> +#define PSCHED_US2NS(x)                        ((s64)(x) << 6)
> +#define PSCHED_NS2US(x)                        ((x) >> 6)
>
>  #define PSCHED_TICKS_PER_SEC           PSCHED_NS2US(NSEC_PER_SEC)
>  #define PSCHED_PASTPERFECT             0

It's better! This patch gives more accuracy to HTB. Here are some values.
Note that these are boundary values, so, e.g., any HTB configuration
between 377000Kbit and 400000Kbit would fall in the same step - close
to 397977Kbit.
This test was made under the same conditions: generating 950Mbit/s of
unidirectional tcp traffic with 800-byte packets.

leaf class ceil	leaf class sent rate (tc -s values)
376000Kbit	375379Kbit
--
377000Kbit	397977Kbit
400000Kbit	397973Kbit
--
401000Kbit	425199Kbit
426000Kbit	425199Kbit
--
427000Kbit	456389Kbit
457000Kbit	456409Kbit
--
458000Kbit	490111Kbit
492000Kbit	490138Kbit
--
493000Kbit	531957Kbit
533000Kbit	532078Kbit
--
534000Kbit	581835Kbit
581000Kbit	581820Kbit
--
582000Kbit	637809Kbit
640000Kbit	637709Kbit
--
641000Kbit	710526Kbit
711000Kbit	710553Kbit
--
712000Kbit	795921Kbit
800000Kbit	795901Kbit
--
801000Kbit	912706Kbit
914000Kbit	912782Kbit
--
915000Kbit	--
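
One possible reading of these steps, sketched under an assumption that is
not confirmed in this message: the transmit time of a packet is still cut
down to whole microseconds somewhere before it is turned into ticks. The
function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Transmit time of one packet in whole microseconds, fraction dropped.
 * bits / (Kbit/s) gives milliseconds, hence the factor of 1000. */
uint64_t xmit_time_whole_usec(uint64_t pkt_bytes, uint64_t rate_kbit)
{
	assert(rate_kbit > 0);
	return pkt_bytes * 8 * 1000 / rate_kbit;
}

/* Rate in Kbit/s that a packet of pkt_bytes per usec microseconds
 * corresponds to, i.e. the rate actually allowed by that time value. */
uint64_t implied_rate_kbit(uint64_t pkt_bytes, uint64_t usec)
{
	assert(usec > 0);
	return pkt_bytes * 8 * 1000 / usec;
}
```

For 800-byte packets this puts every ceil from 534000Kbit to 581000Kbit at
11 us and 582000Kbit at 10 us, which lines up with the boundaries in the
table. The measured rates sit close to the implied figure (581818Kbit for
11 us, 640000Kbit for 10 us).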


Here are more values for an HTB ceil configuration of 555Mbit/s with different packet sizes:

800 bytes:
class htb 1:108 parent 1:10 leaf 108: prio 7 quantum 1514 rate
555000Kbit ceil 555000Kbit burst 70901b/8 mpu 0b overhead 0b cburst
70901b/8 mpu 0b overhead 0b level 0
 Sent 18731000768 bytes 23531408 pkt (dropped 15715520, overlimits 0 requeues 0)
 rate 581832Kbit 91368pps backlog 0b 110p requeues 0
 lended: 23531298 borrowed: 0 giants: 0
 tokens: -16091 ctokens: -16091


850 bytes:
class htb 1:108 parent 1:10 leaf 108: prio 7 quantum 1514 rate
555000Kbit ceil 555000Kbit burst 70901b/8 mpu 0b overhead 0b cburst
70901b/8 mpu 0b overhead 0b level 0
 Sent 30556163150 bytes 37645600 pkt (dropped 25746491, overlimits 0 requeues 0)
 rate 565509Kbit 83556pps backlog 0b 15p requeues 0
 lended: 37645585 borrowed: 0 giants: 0
 tokens: -16010 ctokens: -16010


950 bytes:
class htb 1:108 parent 1:10 leaf 108: prio 7 quantum 1514 rate
555000Kbit ceil 555000Kbit burst 70901b/8 mpu 0b overhead 0b cburst
70901b/8 mpu 0b overhead 0b level 0
 Sent 51363059854 bytes 60954074 pkt (dropped 40474346, overlimits 0 requeues 0)
 rate 598925Kbit 83555pps backlog 0b 112p requeues 0
 lended: 60953962 borrowed: 0 giants: 0
 tokens: 12446 ctokens: 12446

I'm using
# tc -V
tc utility, iproute2-ss090115

and keeping tso and gso off:
# ethtool -k eth0
Offload parameters for eth0:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp segmentation offload: off
udp fragmentation offload: off
generic segmentation offload: off

# ethtool -k eth1
Offload parameters for eth1:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp segmentation offload: off
udp fragmentation offload: off
generic segmentation offload: off


* Re: [PATCH iproute2] Re: HTB accuracy for high speed
  2009-05-29 17:02                     ` Antonio Almeida
@ 2009-05-29 17:28                       ` Stephen Hemminger
  2009-05-29 19:58                         ` Jarek Poplawski
  2009-05-29 19:46                       ` Jarek Poplawski
  2009-05-30 20:07                       ` Jarek Poplawski
  2 siblings, 1 reply; 104+ messages in thread
From: Stephen Hemminger @ 2009-05-29 17:28 UTC (permalink / raw)
  To: Antonio Almeida
  Cc: Jarek Poplawski, netdev, kaber, davem, devik, Eric Dumazet,
	Vladimir Ivashchenko

On Fri, 29 May 2009 18:02:39 +0100
Antonio Almeida <vexwek@gmail.com> wrote:

> On Thu, May 28, 2009 at 10:12 PM, Jarek Poplawski wrote:
> > Yes, but according to my assessment there should be "only" 50Mbit
> > difference for this rate/packet size. Anyway, could you try a testing
> > patch below, which should add some granularity to this rate table?
> >
> > Thanks,
> > Jarek P.
> > ---
> >
> >  include/net/pkt_sched.h |    4 ++--
> >  1 files changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/include/net/pkt_sched.h b/include/net/pkt_sched.h
> > index e37fe31..f0faf03 100644
> > --- a/include/net/pkt_sched.h
> > +++ b/include/net/pkt_sched.h
> > @@ -42,8 +42,8 @@ typedef u64   psched_time_t;
> >  typedef long   psched_tdiff_t;
> >
> >  /* Avoid doing 64 bit divide by 1000 */
> > -#define PSCHED_US2NS(x)                        ((s64)(x) << 10)
> > -#define PSCHED_NS2US(x)                        ((x) >> 10)
> > +#define PSCHED_US2NS(x)                        ((s64)(x) << 6)
> > +#define PSCHED_NS2US(x)                        ((x) >> 6)
> >
> >  #define PSCHED_TICKS_PER_SEC           PSCHED_NS2US(NSEC_PER_SEC)
> >  #define PSCHED_PASTPERFECT             0
> 
> It's better! This patch gives more accuracy to HTB. Here some values:
> Note that these are boundary values, so, e.g., any HTB configuration
> between 377000Kbit and 400000Kbit would fall in the same step - close
> to 397977Kbit.
> This test was made over the same conditions: generating 950Mbit/s of
> unidirectional tcp traffic of 800 bytes packets long.

You really need to get a better box than the dual core AMD.
There is only millisecond (or worse with HZ=100) resolution possible because
there is no working TSC on that hardware.


-- 


* Re: [PATCH iproute2] Re: HTB accuracy for high speed
  2009-05-29 17:02                     ` Antonio Almeida
  2009-05-29 17:28                       ` Stephen Hemminger
@ 2009-05-29 19:46                       ` Jarek Poplawski
  2009-05-29 20:49                         ` Stephen Hemminger
  2009-05-30 20:07                       ` Jarek Poplawski
  2 siblings, 1 reply; 104+ messages in thread
From: Jarek Poplawski @ 2009-05-29 19:46 UTC (permalink / raw)
  To: Antonio Almeida
  Cc: Stephen Hemminger, netdev, kaber, davem, devik, Eric Dumazet,
	Vladimir Ivashchenko

On Fri, May 29, 2009 at 06:02:39PM +0100, Antonio Almeida wrote:
> On Thu, May 28, 2009 at 10:12 PM, Jarek Poplawski wrote:
> > Yes, but according to my assessment there should be "only" 50Mbit
> > difference for this rate/packet size. Anyway, could you try a testing
> > patch below, which should add some granularity to this rate table?
> >
> > Thanks,
> > Jarek P.
> > ---
> >
> >  include/net/pkt_sched.h |    4 ++--
> >  1 files changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/include/net/pkt_sched.h b/include/net/pkt_sched.h
> > index e37fe31..f0faf03 100644
> > --- a/include/net/pkt_sched.h
> > +++ b/include/net/pkt_sched.h
> > @@ -42,8 +42,8 @@ typedef u64   psched_time_t;
> >  typedef long   psched_tdiff_t;
> >
> >  /* Avoid doing 64 bit divide by 1000 */
> > -#define PSCHED_US2NS(x)                        ((s64)(x) << 10)
> > -#define PSCHED_NS2US(x)                        ((x) >> 10)
> > +#define PSCHED_US2NS(x)                        ((s64)(x) << 6)
> > +#define PSCHED_NS2US(x)                        ((x) >> 6)
> >
> >  #define PSCHED_TICKS_PER_SEC           PSCHED_NS2US(NSEC_PER_SEC)
> >  #define PSCHED_PASTPERFECT             0
> 
> It's better! This patch gives more accuracy to HTB. Here some values:
> Note that these are boundary values, so, e.g., any HTB configuration
> between 377000Kbit and 400000Kbit would fall in the same step - close
> to 397977Kbit.

Good news! So it seems the only reason for this inaccuracy is too
coarse granularity, but I still have to check this. Alas, something
more than this patch is needed, because it probably breaks other
things like hfsc.

Thanks,
Jarek P.

> This test was made over the same conditions: generating 950Mbit/s of
> unidirectional tcp traffic of 800 bytes packets long.
> 
> leaf class ceil	leaf class sent rate (tc -s values)
> 376000Kbit	375379Kbit
> --
> 377000Kbit	397977Kbit
> 400000Kbit	397973Kbit
> --
> 401000Kbit	425199Kbit
> 426000Kbit	425199Kbit
> --
> 427000Kbit	456389Kbit
> 457000Kbit	456409Kbit
> --
> 458000Kbit	490111Kbit
> 492000Kbit	490138Kbit
> --
> 493000Kbit	531957Kbit
> 533000Kbit	532078Kbit
> --
> 534000Kbit	581835Kbit
> 581000Kbit	581820Kbit
> --
> 582000Kbit	637809Kbit
> 640000Kbit	637709Kbit
> --
> 641000Kbit	710526Kbit
> 711000Kbit	710553Kbit
> --
> 712000Kbit	795921Kbit
> 800000Kbit	795901Kbit
> --
> 801000Kbit	912706Kbit
> 914000Kbit	912782Kbit
> --
> 915000Kbit	--
> 
> 
> Here more values for a HTB ceil configuration of 555Mbit/s changing packet size:
> 
> 800 bytes:
> class htb 1:108 parent 1:10 leaf 108: prio 7 quantum 1514 rate
> 555000Kbit ceil 555000Kbit burst 70901b/8 mpu 0b overhead 0b cburst
> 70901b/8 mpu 0b overhead 0b level 0
>  Sent 18731000768 bytes 23531408 pkt (dropped 15715520, overlimits 0 requeues 0)
>  rate 581832Kbit 91368pps backlog 0b 110p requeues 0
>  lended: 23531298 borrowed: 0 giants: 0
>  tokens: -16091 ctokens: -16091
> 
> 
> 850 bytes:
> class htb 1:108 parent 1:10 leaf 108: prio 7 quantum 1514 rate
> 555000Kbit ceil 555000Kbit burst 70901b/8 mpu 0b overhead 0b cburst
> 70901b/8 mpu 0b overhead 0b level 0
>  Sent 30556163150 bytes 37645600 pkt (dropped 25746491, overlimits 0 requeues 0)
>  rate 565509Kbit 83556pps backlog 0b 15p requeues 0
>  lended: 37645585 borrowed: 0 giants: 0
>  tokens: -16010 ctokens: -16010
> 
> 
> 950 bytes	
> class htb 1:108 parent 1:10 leaf 108: prio 7 quantum 1514 rate
> 555000Kbit ceil 555000Kbit burst 70901b/8 mpu 0b overhead 0b cburst
> 70901b/8 mpu 0b overhead 0b level 0
>  Sent 51363059854 bytes 60954074 pkt (dropped 40474346, overlimits 0 requeues 0)
>  rate 598925Kbit 83555pps backlog 0b 112p requeues 0
>  lended: 60953962 borrowed: 0 giants: 0
>  tokens: 12446 ctokens: 12446
> 
> I'm using
> # tc -V
> tc utility, iproute2-ss090115
> 
> and keeping tso and gso off:
> # ethtool -k eth0
> Offload parameters for eth0:
> rx-checksumming: on
> tx-checksumming: on
> scatter-gather: on
> tcp segmentation offload: off
> udp fragmentation offload: off
> generic segmentation offload: off
> 
> # ethtool -k eth1
> Offload parameters for eth1:
> rx-checksumming: on
> tx-checksumming: on
> scatter-gather: on
> tcp segmentation offload: off
> udp fragmentation offload: off
> generic segmentation offload: off


* Re: [PATCH iproute2] Re: HTB accuracy for high speed
  2009-05-29 17:28                       ` Stephen Hemminger
@ 2009-05-29 19:58                         ` Jarek Poplawski
  0 siblings, 0 replies; 104+ messages in thread
From: Jarek Poplawski @ 2009-05-29 19:58 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Antonio Almeida, netdev, kaber, davem, devik, Eric Dumazet,
	Vladimir Ivashchenko

On Fri, May 29, 2009 at 10:28:45AM -0700, Stephen Hemminger wrote:
> On Fri, 29 May 2009 18:02:39 +0100
> Antonio Almeida <vexwek@gmail.com> wrote:
> 
> > On Thu, May 28, 2009 at 10:12 PM, Jarek Poplawski wrote:
> > > Yes, but according to my assessment there should be "only" 50Mbit
> > > difference for this rate/packet size. Anyway, could you try a testing
> > > patch below, which should add some granularity to this rate table?
> > >
> > > Thanks,
> > > Jarek P.
> > > ---
> > >
> > >  include/net/pkt_sched.h |    4 ++--
> > >  1 files changed, 2 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/include/net/pkt_sched.h b/include/net/pkt_sched.h
> > > index e37fe31..f0faf03 100644
> > > --- a/include/net/pkt_sched.h
> > > +++ b/include/net/pkt_sched.h
> > > @@ -42,8 +42,8 @@ typedef u64   psched_time_t;
> > >  typedef long   psched_tdiff_t;
> > >
> > >  /* Avoid doing 64 bit divide by 1000 */
> > > -#define PSCHED_US2NS(x)                        ((s64)(x) << 10)
> > > -#define PSCHED_NS2US(x)                        ((x) >> 10)
> > > +#define PSCHED_US2NS(x)                        ((s64)(x) << 6)
> > > +#define PSCHED_NS2US(x)                        ((x) >> 6)
> > >
> > >  #define PSCHED_TICKS_PER_SEC           PSCHED_NS2US(NSEC_PER_SEC)
> > >  #define PSCHED_PASTPERFECT             0
> > 
> > It's better! This patch gives more accuracy to HTB. Here some values:
> > Note that these are boundary values, so, e.g., any HTB configuration
> > between 377000Kbit and 400000Kbit would fall in the same step - close
> > to 397977Kbit.
> > This test was made over the same conditions: generating 950Mbit/s of
> > unidirectional tcp traffic of 800 bytes packets long.
> 
> You really need to get a better box than the dual core AMD.
> There is only millisecond (or worse with HZ=100) resolution possible because
> there is no working TSC on that hardware.

I think this could cause problems with peak rates, but IMHO there is
no reason for htb to miss per second (4s) estimations against the same
clock. Plus it mostly confirms the theoretical limits of the currently
used rate tables vs. microsecond time/tick accounting.

Jarek P.


* Re: [PATCH iproute2] Re: HTB accuracy for high speed
  2009-05-29 19:46                       ` Jarek Poplawski
@ 2009-05-29 20:49                         ` Stephen Hemminger
  2009-05-29 20:59                           ` Jarek Poplawski
  0 siblings, 1 reply; 104+ messages in thread
From: Stephen Hemminger @ 2009-05-29 20:49 UTC (permalink / raw)
  To: Jarek Poplawski
  Cc: Antonio Almeida, netdev, kaber, davem, devik, Eric Dumazet,
	Vladimir Ivashchenko

On Fri, 29 May 2009 21:46:43 +0200
Jarek Poplawski <jarkao2@gmail.com> wrote:

> On Fri, May 29, 2009 at 06:02:39PM +0100, Antonio Almeida wrote:
> > On Thu, May 28, 2009 at 10:12 PM, Jarek Poplawski wrote:
> > > Yes, but according to my assessment there should be "only" 50Mbit
> > > difference for this rate/packet size. Anyway, could you try a testing
> > > patch below, which should add some granularity to this rate table?
> > >
> > > Thanks,
> > > Jarek P.
> > > ---
> > >
> > >  include/net/pkt_sched.h |    4 ++--
> > >  1 files changed, 2 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/include/net/pkt_sched.h b/include/net/pkt_sched.h
> > > index e37fe31..f0faf03 100644
> > > --- a/include/net/pkt_sched.h
> > > +++ b/include/net/pkt_sched.h
> > > @@ -42,8 +42,8 @@ typedef u64   psched_time_t;
> > >  typedef long   psched_tdiff_t;
> > >
> > >  /* Avoid doing 64 bit divide by 1000 */
> > > -#define PSCHED_US2NS(x)                        ((s64)(x) << 10)
> > > -#define PSCHED_NS2US(x)                        ((x) >> 10)
> > > +#define PSCHED_US2NS(x)                        ((s64)(x) << 6)
> > > +#define PSCHED_NS2US(x)                        ((x) >> 6)
> > >
> > >  #define PSCHED_TICKS_PER_SEC           PSCHED_NS2US(NSEC_PER_SEC)
> > >  #define PSCHED_PASTPERFECT             0
> > 
> > It's better! This patch gives more accuracy to HTB. Here some values:
> > Note that these are boundary values, so, e.g., any HTB configuration
> > between 377000Kbit and 400000Kbit would fall in the same step - close
> > to 397977Kbit.
> 
> Good news! So it seems there are no other reasons of this inaccuracy
> than too coarse granularity, but I have to check this yet. Alas there
> is needed something more than this patch, because it probably breaks
> other things like hfsc.
> 
> Thanks,
> Jarek P.
> 

Why would it break hfsc, if it isn't already broken?


* Re: [PATCH iproute2] Re: HTB accuracy for high speed
  2009-05-29 20:49                         ` Stephen Hemminger
@ 2009-05-29 20:59                           ` Jarek Poplawski
  0 siblings, 0 replies; 104+ messages in thread
From: Jarek Poplawski @ 2009-05-29 20:59 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Antonio Almeida, netdev, kaber, davem, devik, Eric Dumazet,
	Vladimir Ivashchenko

On Fri, May 29, 2009 at 01:49:56PM -0700, Stephen Hemminger wrote:
> On Fri, 29 May 2009 21:46:43 +0200
> Jarek Poplawski <jarkao2@gmail.com> wrote:
...
> > > >  /* Avoid doing 64 bit divide by 1000 */
> > > > -#define PSCHED_US2NS(x)                        ((s64)(x) << 10)
> > > > -#define PSCHED_NS2US(x)                        ((x) >> 10)
> > > > +#define PSCHED_US2NS(x)                        ((s64)(x) << 6)
> > > > +#define PSCHED_NS2US(x)                        ((x) >> 6)
> > > >
> > > >  #define PSCHED_TICKS_PER_SEC           PSCHED_NS2US(NSEC_PER_SEC)
> > > >  #define PSCHED_PASTPERFECT             0
> > > 
> > > It's better! This patch gives more accuracy to HTB. Here some values:
> > > Note that these are boundary values, so, e.g., any HTB configuration
> > > between 377000Kbit and 400000Kbit would fall in the same step - close
> > > to 397977Kbit.
> > 
> > Good news! So it seems there are no other reasons of this inaccuracy
> > than too coarse granularity, but I have to check this yet. Alas there
> > is needed something more than this patch, because it probably breaks
> > other things like hfsc.
> > 
> > Thanks,
> > Jarek P.
> > 
> 
> Why would it break hfsc, if it isn't already broken.

I might be wrong but e.g. these usecs could be one reason:

/* convert d (us) into dx (psched us) */
static u64
d2dx(u32 d)
{
        u64 dx;

        dx = ((u64)d * PSCHED_TICKS_PER_SEC);
        dx += USEC_PER_SEC - 1;
        do_div(dx, USEC_PER_SEC);
        return dx;
}

And maybe these shifts need some adjustment:
m = (sm * PSCHED_TICKS_PER_SEC) >> SM_SHIFT;
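
To make the concern concrete, here is a userspace rendering of d2dx() with
the tick rate passed in as a parameter, so both values can be compared.
The name d2dx_sketch and the extra parameter are hypothetical, and plain
64-bit division stands in for do_div().

```c
#include <assert.h>
#include <stdint.h>

#define USEC_PER_SEC 1000000ULL

/* convert d (us) into dx (psched ticks), rounding up;
 * ticks_per_sec plays the role of PSCHED_TICKS_PER_SEC */
uint64_t d2dx_sketch(uint32_t d, uint64_t ticks_per_sec)
{
	uint64_t dx;

	assert(ticks_per_sec > 0);
	dx = (uint64_t)d * ticks_per_sec;
	dx += USEC_PER_SEC - 1;		/* round up */
	return dx / USEC_PER_SEC;	/* do_div() in the kernel */
}
```

With 976562 ticks per second (shift 10), 1000 us comes out as 977, which
is still roughly a microsecond count. With 15625000 ticks per second
(shift 6) it comes out as 15625, so anything that treats "psched us" as
approximately microseconds would be off by a factor of about 16.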

Jarek P.


* Re: [PATCH iproute2] Re: HTB accuracy for high speed
  2009-05-29 17:02                     ` Antonio Almeida
  2009-05-29 17:28                       ` Stephen Hemminger
  2009-05-29 19:46                       ` Jarek Poplawski
@ 2009-05-30 20:07                       ` Jarek Poplawski
  2009-06-02 10:12                         ` Antonio Almeida
  2 siblings, 1 reply; 104+ messages in thread
From: Jarek Poplawski @ 2009-05-30 20:07 UTC (permalink / raw)
  To: Antonio Almeida
  Cc: Stephen Hemminger, netdev, kaber, davem, devik, Eric Dumazet,
	Vladimir Ivashchenko

On Fri, May 29, 2009 at 06:02:39PM +0100, Antonio Almeida wrote:
> On Thu, May 28, 2009 at 10:12 PM, Jarek Poplawski wrote:
...
> > -#define PSCHED_US2NS(x)                        ((s64)(x) << 10)
> > -#define PSCHED_NS2US(x)                        ((x) >> 10)
> > +#define PSCHED_US2NS(x)                        ((s64)(x) << 6)
> > +#define PSCHED_NS2US(x)                        ((x) >> 6)
> >
> >  #define PSCHED_TICKS_PER_SEC           PSCHED_NS2US(NSEC_PER_SEC)
> >  #define PSCHED_PASTPERFECT             0
> 
> It's better! This patch gives more accuracy to HTB. Here some values:
> Note that these are boundary values, so, e.g., any HTB configuration
> between 377000Kbit and 400000Kbit would fall in the same step - close
> to 397977Kbit.
> This test was made over the same conditions: generating 950Mbit/s of
> unidirectional tcp traffic of 800 bytes packets long.

Here is a tc patch which should minimize these boundaries, so please
repeat this test with the previous patches/conditions plus this one.

Thanks,
Jarek P.
---

 tc/tc_core.c |   10 +++++-----
 tc/tc_core.h |    4 ++--
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/tc/tc_core.c b/tc/tc_core.c
index 9a0ff39..6d74287 100644
--- a/tc/tc_core.c
+++ b/tc/tc_core.c
@@ -27,18 +27,18 @@
 static double tick_in_usec = 1;
 static double clock_factor = 1;
 
-int tc_core_time2big(unsigned time)
+int tc_core_time2big(double time)
 {
-	__u64 t = time;
+	__u64 t;
 
-	t *= tick_in_usec;
+	t = time * tick_in_usec + 0.5;
 	return (t >> 32) != 0;
 }
 
 
-unsigned tc_core_time2tick(unsigned time)
+unsigned tc_core_time2tick(double time)
 {
-	return time*tick_in_usec;
+	return time * tick_in_usec + 0.5;
 }
 
 unsigned tc_core_tick2time(unsigned tick)
diff --git a/tc/tc_core.h b/tc/tc_core.h
index 5a693ba..0ac65aa 100644
--- a/tc/tc_core.h
+++ b/tc/tc_core.h
@@ -13,8 +13,8 @@ enum link_layer {
 };
 
 
-int  tc_core_time2big(unsigned time);
-unsigned tc_core_time2tick(unsigned time);
+int  tc_core_time2big(double time);
+unsigned tc_core_time2tick(double time);
 unsigned tc_core_tick2time(unsigned tick);
 unsigned tc_core_time2ktime(unsigned time);
 unsigned tc_core_ktime2time(unsigned ktime);
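
What the change of prototype does can be seen in isolation with a small
sketch. The value of tick_in_usec here is an assumption (64 ns ticks, as
with the shift-by-6 kernel test patch); tc normally sets it at start-up.
The _old/_new suffixes exist only for this illustration.

```c
#include <assert.h>

/* Assumed for illustration: 64 ns ticks, i.e. 1000 / 64 ticks per usec. */
static const double tick_in_usec = 15.625;

/* Old prototype: a fractional usec value is truncated when it is
 * converted to the unsigned parameter, before the multiplication. */
unsigned time2tick_old(unsigned time)
{
	return time * tick_in_usec;
}

/* Patched prototype: the fraction is kept and the result is rounded
 * to the nearest tick. */
unsigned time2tick_new(double time)
{
	assert(time >= 0);
	return time * tick_in_usec + 0.5;
}
```

For a packet time of 11.53 us the old variant sees 11 and returns 171
ticks, while the new one returns 180 ticks.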


* Re: [PATCH iproute2] Re: HTB accuracy for high speed
  2009-05-30 20:07                       ` Jarek Poplawski
@ 2009-06-02 10:12                         ` Antonio Almeida
  2009-06-02 11:45                           ` Antonio Almeida
  0 siblings, 1 reply; 104+ messages in thread
From: Antonio Almeida @ 2009-06-02 10:12 UTC (permalink / raw)
  To: Jarek Poplawski
  Cc: Stephen Hemminger, netdev, kaber, davem, devik, Eric Dumazet,
	Vladimir Ivashchenko

On Sat, May 30, 2009 at 9:07 PM, Jarek Poplawski wrote:
> Here is a tc patch, which should minimize these boundaries, so please,
> repeat this test with previous patches/conditions plus this one.
>
> Thanks,
> Jarek P.
> ---
>
>  tc/tc_core.c |   10 +++++-----
>  tc/tc_core.h |    4 ++--
>  2 files changed, 7 insertions(+), 7 deletions(-)
>
> diff --git a/tc/tc_core.c b/tc/tc_core.c
> index 9a0ff39..6d74287 100644
> --- a/tc/tc_core.c
> +++ b/tc/tc_core.c
> @@ -27,18 +27,18 @@
>  static double tick_in_usec = 1;
>  static double clock_factor = 1;
>
> -int tc_core_time2big(unsigned time)
> +int tc_core_time2big(double time)
>  {
> -       __u64 t = time;
> +       __u64 t;
>
> -       t *= tick_in_usec;
> +       t = time * tick_in_usec + 0.5;
>        return (t >> 32) != 0;
>  }
>
>
> -unsigned tc_core_time2tick(unsigned time)
> +unsigned tc_core_time2tick(double time)
>  {
> -       return time*tick_in_usec;
> +       return time * tick_in_usec + 0.5;
>  }
>
>  unsigned tc_core_tick2time(unsigned tick)
> diff --git a/tc/tc_core.h b/tc/tc_core.h
> index 5a693ba..0ac65aa 100644
> --- a/tc/tc_core.h
> +++ b/tc/tc_core.h
> @@ -13,8 +13,8 @@ enum link_layer {
>  };
>
>
> -int  tc_core_time2big(unsigned time);
> -unsigned tc_core_time2tick(unsigned time);
> +int  tc_core_time2big(double time);
> +unsigned tc_core_time2tick(double time);
>  unsigned tc_core_tick2time(unsigned tick);
>  unsigned tc_core_time2ktime(unsigned time);
>  unsigned tc_core_ktime2time(unsigned ktime);
>

I'm getting great values with this patch!

class htb 1:108 parent 1:10 leaf 108: prio 7 quantum 1514 rate
555000Kbit ceil 555000Kbit burst 70970b/8 mpu 0b overhead 0b cburst
70970b/8 mpu 0b overhead 0b level 0
 Sent 14270693572 bytes 17928007 pkt (dropped 12579262, overlimits 0 requeues 0)
 rate 552755Kbit 86802pps backlog 0b 127p requeues 0
 lended: 17927880 borrowed: 0 giants: 0
 tokens: -16095 ctokens: -16095

(for packets of 800 bytes)
I'll get back to you with more values.
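
For reference, the effect of the quoted tc patch can be sketched in a few lines of C. This is only an illustration: the function names are made up, the tick_in_usec value of 15.625 (a 64 ns tick) is an assumption rather than something read from a kernel, and it assumes the caller passes a fractional transmit time in usec.

```c
#include <stdio.h>

/* Assumed for illustration: 15.625 ticks per usec, i.e. a 64 ns tick.
 * tc obtains the real factor from the kernel at run time. */
static const double tick_in_usec = 15.625;

/* Time in usec needed to send size_bytes at rate_bits_per_sec. */
double xmit_time_usec(unsigned size_bytes, double rate_bits_per_sec)
{
	return 1e6 * size_bytes * 8.0 / rate_bits_per_sec;
}

/* Old signature: the argument is already cut down to whole usec,
 * and the scaled result is truncated again on return. */
unsigned time2tick_trunc(unsigned time_usec)
{
	return time_usec * tick_in_usec;
}

/* Patched signature: the fractional usec survive the call and the
 * result is rounded to the nearest tick. */
unsigned time2tick_round(double time_usec)
{
	return time_usec * tick_in_usec + 0.5;
}

/* Print both conversions for one packet size and rate. */
void show(unsigned size_bytes, double rate_bits_per_sec)
{
	double t = xmit_time_usec(size_bytes, rate_bits_per_sec);

	printf("%.3f usec: truncated %u ticks, rounded %u ticks\n",
	       t, time2tick_trunc((unsigned)t), time2tick_round(t));
}
```

Calling show(800, 555e6) from a small main() prints both tick counts for the 800-byte, 555000Kbit case above. Under these assumptions the truncating version charges noticeably fewer ticks per packet, which is the direction of the overshoot seen before the patch.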

  Antonio Almeida


* Re: [PATCH iproute2] Re: HTB accuracy for high speed
  2009-06-02 10:12                         ` Antonio Almeida
@ 2009-06-02 11:45                           ` Antonio Almeida
  2009-06-02 12:36                             ` Jarek Poplawski
  0 siblings, 1 reply; 104+ messages in thread
From: Antonio Almeida @ 2009-06-02 11:45 UTC (permalink / raw)
  To: Jarek Poplawski
  Cc: Stephen Hemminger, netdev, kaber, davem, devik, Eric Dumazet,
	Vladimir Ivashchenko

On Tue, Jun 2, 2009 at 11:12 AM, Antonio Almeida wrote:
> I'm getting great values with this patch!
>...
> I'll get back to you with more values.

The steps are much smaller and the error stays below 1%.
Injecting over 950Mbps of TCP packets of 800 bytes, I get these values:

Configured rate  Sent rate    Error (%)
498000Kbit       495023Kbit   0.60
499000Kbit       497456Kbit   0.31
500000Kbit       497498Kbit   0.50
501000Kbit       497496Kbit   0.70
502000Kbit       499986Kbit   0.40
503000Kbit       499978Kbit   0.60
504000Kbit       502520Kbit   0.29

696000Kbit       690964Kbit   0.72
697000Kbit       695782Kbit   0.17
698000Kbit       695783Kbit   0.32
699000Kbit       695783Kbit   0.46
700000Kbit       695795Kbit   0.60
701000Kbit       695786Kbit   0.74
702000Kbit       700703Kbit   0.18

896000Kbit       888383Kbit   0.85
897000Kbit       896289Kbit   0.08
904000Kbit       896389Kbit   0.84
905000Kbit       904542Kbit   0.05

  Antonio Almeida


* Re: [PATCH iproute2] Re: HTB accuracy for high speed
  2009-06-02 11:45                           ` Antonio Almeida
@ 2009-06-02 12:36                             ` Jarek Poplawski
  2009-06-02 12:45                               ` Patrick McHardy
  0 siblings, 1 reply; 104+ messages in thread
From: Jarek Poplawski @ 2009-06-02 12:36 UTC (permalink / raw)
  To: Antonio Almeida
  Cc: Stephen Hemminger, netdev, kaber, davem, devik, Eric Dumazet,
	Vladimir Ivashchenko

On Tue, Jun 02, 2009 at 12:45:28PM +0100, Antonio Almeida wrote:
> On Tue, Jun 2, 2009 at 11:12 AM, Antonio Almeida wrote:
> > I'm getting great values with this patch!
> >...
> > I'll get back to you with more values.
> 
> The steps are much smaller and the error keeps lower than 1%.
> Injecting over 950Mpbs of tcp packets of 800bytes I get these values:

Nice values - should be acceptable, I guess. Alas this is not all, and
I'll ask you soon for re-testing HFSC (after another patch) or maybe
even some simple CBQ setup ;-)

Thank you very much for testing,
Jarek P.

> 
> Configuration	Sent rate		error (%)
> 498000Kbit	495023Kbit	0,60
> 499000Kbit	497456Kbit	0,31
> 500000Kbit	497498Kbit	0,50
> 501000Kbit	497496Kbit	0,70
> 502000Kbit	499986Kbit	0,40
> 503000Kbit	499978Kbit	0,60
> 504000Kbit	502520Kbit	0,29
> 		
> 696000Kbit	690964Kbit	0,72
> 697000Kbit	695782Kbit	0,17
> 698000Kbit	695783Kbit	0,32
> 699000Kbit	695783Kbit	0,46
> 700000Kbit	695795Kbit	0,60
> 701000Kbit	695786Kbit	0,74
> 702000Kbit	700703Kbit	0,18
> 		
> 896000Kbit	888383Kbit	0,85
> 897000Kbit	896289Kbit	0,08
> 904000Kbit	896389Kbit	0,84
> 905000Kbit	904542Kbit	0,05
> 
>   Antonio Almeida


* Re: [PATCH iproute2] Re: HTB accuracy for high speed
  2009-06-02 12:36                             ` Jarek Poplawski
@ 2009-06-02 12:45                               ` Patrick McHardy
  2009-06-02 13:08                                 ` Jarek Poplawski
  0 siblings, 1 reply; 104+ messages in thread
From: Patrick McHardy @ 2009-06-02 12:45 UTC (permalink / raw)
  To: Jarek Poplawski
  Cc: Antonio Almeida, Stephen Hemminger, netdev, davem, devik,
	Eric Dumazet, Vladimir Ivashchenko

Jarek Poplawski wrote:
> On Tue, Jun 02, 2009 at 12:45:28PM +0100, Antonio Almeida wrote:
>> On Tue, Jun 2, 2009 at 11:12 AM, Antonio Almeida wrote:
>>> I'm getting great values with this patch!
>>> ...
>>> I'll get back to you with more values.
>> The steps are much smaller and the error keeps lower than 1%.
>> Injecting over 950Mpbs of tcp packets of 800bytes I get these values:
> 
> Nice values - should be acceptable, I guess. Alas this is not all, and
> I'll ask you soon for re-testing HFSC (after another patch) or maybe
> even some simple CBQ setup ;-)

I didn't follow the full discussion, so I'm not sure which kind of
arithmetic error you're attempting to cure. For the HFSC scaling
factors, please just keep in mind that it's also supposed to be
very accurate at low bandwidths.



* Re: [PATCH iproute2] Re: HTB accuracy for high speed
  2009-06-02 12:45                               ` Patrick McHardy
@ 2009-06-02 13:08                                 ` Jarek Poplawski
  2009-06-02 13:20                                   ` Patrick McHardy
  0 siblings, 1 reply; 104+ messages in thread
From: Jarek Poplawski @ 2009-06-02 13:08 UTC (permalink / raw)
  To: Patrick McHardy
  Cc: Antonio Almeida, Stephen Hemminger, netdev, davem, devik,
	Eric Dumazet, Vladimir Ivashchenko

On Tue, Jun 02, 2009 at 02:45:34PM +0200, Patrick McHardy wrote:
> Jarek Poplawski wrote:
>> On Tue, Jun 02, 2009 at 12:45:28PM +0100, Antonio Almeida wrote:
>>> On Tue, Jun 2, 2009 at 11:12 AM, Antonio Almeida wrote:
>>>> I'm getting great values with this patch!
>>>> ...
>>>> I'll get back to you with more values.
>>> The steps are much smaller and the error keeps lower than 1%.
>>> Injecting over 950Mpbs of tcp packets of 800bytes I get these values:
>>
>> Nice values - should be acceptable, I guess. Alas this is not all, and
>> I'll ask you soon for re-testing HFSC (after another patch) or maybe
>> even some simple CBQ setup ;-)
>
> I didn't follow the full discussion, so I'm not sure which kind of
> arithmetic error you're attempting to cure. For the HFSC scaling
> factors, please just keep in mind that its also supposed to be
> very accurate at low bandwidths.

It's all here:

http://permalink.gmane.org/gmane.linux.network/129301

Of course, I'd appreciate any suggestions.

Thanks,
Jarek P.


* Re: [PATCH iproute2] Re: HTB accuracy for high speed
  2009-06-02 13:08                                 ` Jarek Poplawski
@ 2009-06-02 13:20                                   ` Patrick McHardy
  2009-06-02 21:37                                     ` Jarek Poplawski
  0 siblings, 1 reply; 104+ messages in thread
From: Patrick McHardy @ 2009-06-02 13:20 UTC (permalink / raw)
  To: Jarek Poplawski
  Cc: Antonio Almeida, Stephen Hemminger, netdev, davem, devik,
	Eric Dumazet, Vladimir Ivashchenko

Jarek Poplawski wrote:
> On Tue, Jun 02, 2009 at 02:45:34PM +0200, Patrick McHardy wrote:
>> I didn't follow the full discussion, so I'm not sure which kind of
>> arithmetic error you're attempting to cure. For the HFSC scaling
>> factors, please just keep in mind that its also supposed to be
>> very accurate at low bandwidths.
> 
> It's all here:
> 
> http://permalink.gmane.org/gmane.linux.network/129301

I've read through the mails where you suggested changing the scaling
factors. I wasn't able to find the reasoning, though (IOW: where does
it overflow or lose precision, and in which case).

> Of course, I'd appreciate any suggestions.

The HFSC shifts would indeed need adjustments if the US<->NS conversion
factor were to change.



* Re: [PATCH iproute2] Re: HTB accuracy for high speed
  2009-06-02 13:20                                   ` Patrick McHardy
@ 2009-06-02 21:37                                     ` Jarek Poplawski
  2009-06-02 21:50                                       ` Jarek Poplawski
  0 siblings, 1 reply; 104+ messages in thread
From: Jarek Poplawski @ 2009-06-02 21:37 UTC (permalink / raw)
  To: Patrick McHardy
  Cc: Antonio Almeida, Stephen Hemminger, netdev, davem, devik,
	Eric Dumazet, Vladimir Ivashchenko

On Tue, Jun 02, 2009 at 03:20:20PM +0200, Patrick McHardy wrote:
> Jarek Poplawski wrote:
>> On Tue, Jun 02, 2009 at 02:45:34PM +0200, Patrick McHardy wrote:
>>> I didn't follow the full discussion, so I'm not sure which kind of
>>> arithmetic error you're attempting to cure. For the HFSC scaling
>>> factors, please just keep in mind that its also supposed to be
>>> very accurate at low bandwidths.
>>
>> It's all here:
>>
>> http://permalink.gmane.org/gmane.linux.network/129301
>
> I've read through the mails where you suggested to change the scaling
> factors. I wasn't able to find the reasoning (IOW: where does it
> overflow or loose precision in which case) though.

I described the reasoning here:
http://permalink.gmane.org/gmane.linux.network/128189

Of course, we could try some solution other than changing the scaling.
I considered doing it internally in htb, even skipping the rate tables,
but changing the scaling seems to be the most generic way (alas, there
are some odd compatibility issues in iproute/tc, like
TIME_UNITS_PER_SEC or "if (nom == 1000000)", that get in the way of
making it really consistent/readable).

Jarek P.


* Re: [PATCH iproute2] Re: HTB accuracy for high speed
  2009-06-02 21:37                                     ` Jarek Poplawski
@ 2009-06-02 21:50                                       ` Jarek Poplawski
  2009-06-03  7:06                                         ` Patrick McHardy
  0 siblings, 1 reply; 104+ messages in thread
From: Jarek Poplawski @ 2009-06-02 21:50 UTC (permalink / raw)
  To: Patrick McHardy
  Cc: Antonio Almeida, Stephen Hemminger, netdev, davem, devik,
	Eric Dumazet, Vladimir Ivashchenko

Jarek Poplawski wrote, On 06/02/2009 11:37 PM:
...

> I described the reasoning here:
> http://permalink.gmane.org/gmane.linux.network/128189

The link is stuck now, so here is a quote:

Jarek Poplawski wrote, On 05/17/2009 10:15 PM:

> Here is some additional explanation. It looks like these rates above
> 500Mbit hit the design limits of packet scheduling. The internal
> resolution currently used, PSCHED_TICKS_PER_SEC, is 1,000,000. A
> 550Mbit rate with 800-byte packets means 550M/8/800 = 85938 packets/s,
> so on average 1000000/85938 = 11.6 ticks per packet. Accounting for
> only 11 ticks means we leave 0.6*85938 = 51563 ticks per second
> unaccounted, which allows sending an additional 51563/11 = 4687
> packets/s, or 4687*800*8 = 30Mbit. Of course it could be worse (0.9
> tick/packet lost) depending on packet sizes vs. rates, and the effect
> grows for higher rates.


Jarek P.
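
The arithmetic in the quote above can be reproduced with a short C sketch. The function name and parameters are illustrative only, and the model is the simple one from the quote: every packet is charged a whole number of ticks and the fractional part is lost.

```c
#include <stdio.h>

/*
 * Extra throughput (bit/s) that slips through when each packet is
 * charged only the whole-tick part of its ideal cost.
 * Assumes the ideal cost is at least one tick per packet.
 */
double excess_bps(double rate_bps, unsigned pkt_bytes, double ticks_per_sec)
{
	double pps = rate_bps / 8.0 / pkt_bytes;          /* packets per second */
	double ideal = ticks_per_sec / pps;               /* ticks per packet   */
	double charged = (double)(unsigned long long)ideal; /* truncated cost   */
	double extra_pps;

	if (charged < 1.0)
		return 0.0; /* outside the model: less than one tick per packet */

	/* ticks left over per second, spent on extra packets at 'charged' each */
	extra_pps = (ideal - charged) * pps / charged;
	return extra_pps * pkt_bytes * 8.0;
}

/* Print the excess for one configuration. */
void report(double rate_bps, unsigned pkt_bytes, double ticks_per_sec)
{
	printf("%.0f bit/s configured, about %.1f Mbit/s extra\n",
	       rate_bps, excess_bps(rate_bps, pkt_bytes, ticks_per_sec) / 1e6);
}
```

For 550Mbit, 800-byte packets and 1,000,000 ticks per second this gives a little under 32Mbit of excess; the quote arrives at 30Mbit because it rounds the lost fraction down to 0.6 tick.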


* Re: [PATCH iproute2] Re: HTB accuracy for high speed
  2009-06-02 21:50                                       ` Jarek Poplawski
@ 2009-06-03  7:06                                         ` Patrick McHardy
  2009-06-03  7:40                                           ` Jarek Poplawski
  0 siblings, 1 reply; 104+ messages in thread
From: Patrick McHardy @ 2009-06-03  7:06 UTC (permalink / raw)
  To: Jarek Poplawski
  Cc: Antonio Almeida, Stephen Hemminger, netdev, davem, devik,
	Eric Dumazet, Vladimir Ivashchenko

Jarek Poplawski wrote:
> Jarek Poplawski wrote, On 06/02/2009 11:37 PM:
> ...
> 
>> I described the reasoning here:
>> http://permalink.gmane.org/gmane.linux.network/128189
> 
> The link is stuck now, so here is a quote:

Thanks.

> Jarek Poplawski wrote, On 05/17/2009 10:15 PM:
> 
>> Here is some additional explanation. It looks like these rates above
>> 500Mbit hit the design limits of packet scheduling. Currently used
>> internal resolution PSCHED_TICKS_PER_SEC is 1,000,000. 550Mbit rate
>> with 800byte packets means 550M/8/800 = 85938 packets/s, so on average
>> 1000000/85938 = 11.6 ticks per packet. Accounting only 11 ticks means
>> we leave 0.6*85938 = 51563 ticks per second, letting for additional
>> sending of 51563/11 = 4687 packets/s or 4687*800*8 = 30Mbit. Of course
>> it could be worse (0.9 tick/packet lost) depending on packet sizes vs.
>> rates, and the effect rises for higher rates.

I see. Unfortunately, changing the scaling factors pushes the lower
end towards overflowing. For example, when I changed the
iproute-internal factors a few years ago, Denys Fedoryshchenko reported
some breakage triggered by this command:

.. tbf buffer 1024kb latency 500ms rate 128kbit peakrate 256kbit minburst 16384

The burst size calculated by TBF with the current parameters is
64000000. Increasing it by a factor of 16, as in your patch, results
in 1024000000. That means we're getting dangerously close to
overflowing: a buffer size increase or a rate decrease by a factor
slightly bigger than 4 will already overflow.

Mid-term we really need to move to 64-bit values and ns resolution,
otherwise this problem is just going to reappear as soon as someone
tries 10Gbit. I'm not sure what the best short-term fix is; I feel a
bit uneasy about changing the current factors given how close this
brings us to overflowing.
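
The headroom argument above is easy to check in C. The sketch below only uses the two numbers given in this mail (64000000 and the factor of 16); the helper names are hypothetical.

```c
#include <stdint.h>
#include <stdio.h>

/* Does a tick count still fit into an unsigned 32-bit value? */
int fits_u32(uint64_t ticks)
{
	return ticks <= UINT32_MAX;
}

/* By what factor can 'ticks' grow before it no longer fits in 32 bits? */
double headroom_u32(uint64_t ticks)
{
	return (double)UINT32_MAX / (double)ticks;
}

/* Print the headroom before and after scaling by 'factor'. */
void report(uint64_t ticks, unsigned factor)
{
	printf("now: x%.2f headroom, scaled by %u: x%.2f headroom\n",
	       headroom_u32(ticks), factor, headroom_u32(ticks * factor));
}
```

With report(64000000, 16) the scaled value 1024000000 has a headroom of roughly 4.19, so multiplying it by 4 still fits in 32 bits while multiplying it by 5 does not.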


* Re: [PATCH iproute2] Re: HTB accuracy for high speed
  2009-06-03  7:06                                         ` Patrick McHardy
@ 2009-06-03  7:40                                           ` Jarek Poplawski
  2009-06-03  7:53                                             ` Patrick McHardy
  0 siblings, 1 reply; 104+ messages in thread
From: Jarek Poplawski @ 2009-06-03  7:40 UTC (permalink / raw)
  To: Patrick McHardy
  Cc: Antonio Almeida, Stephen Hemminger, netdev, davem, devik,
	Eric Dumazet, Vladimir Ivashchenko

On Wed, Jun 03, 2009 at 09:06:37AM +0200, Patrick McHardy wrote:
> Jarek Poplawski wrote:
>> Jarek Poplawski wrote, On 06/02/2009 11:37 PM:
>> ...
>>
>>> I described the reasoning here:
>>> http://permalink.gmane.org/gmane.linux.network/128189
>>
>> The link is stuck now, so here is a quote:
>
> Thanks.
>
>> Jarek Poplawski wrote, On 05/17/2009 10:15 PM:
>>
>>> Here is some additional explanation. It looks like these rates above
>>> 500Mbit hit the design limits of packet scheduling. Currently used
>>> internal resolution PSCHED_TICKS_PER_SEC is 1,000,000. 550Mbit rate
>>> with 800byte packets means 550M/8/800 = 85938 packets/s, so on average
>>> 1000000/85938 = 11.6 ticks per packet. Accounting only 11 ticks means
>>> we leave 0.6*85938 = 51563 ticks per second, letting for additional
>>> sending of 51563/11 = 4687 packets/s or 4687*800*8 = 30Mbit. Of course
>>> it could be worse (0.9 tick/packet lost) depending on packet sizes vs.
>>> rates, and the effect rises for higher rates.
>
> I see. Unfortunately changing the scaling factors is pushing the lower
> end towards overflowing. For example Denys Fedoryshchenko reported some
> breakage a few years ago when I changed the iproute-internal factors
> triggered by this command:
>
> .. tbf buffer 1024kb latency 500ms rate 128kbit peakrate 256kbit  
> minburst 16384
>
> The burst size calculated by TBF with the current parameters is
> 64000000. Increasing it by a factor of 16 as in your patch results
> in 1024000000. Which means we're getting dangerously close to
> overflowing, a buffer size increase or a rate decrease of slightly
> bigger than factor 4 will already overflow.
>
> Mid-term we really need to move to 64 bit values and ns resolution,
> otherwise this problem is just going to reappear as soon as someone
> tries 10gbit. Not sure what the best short term fix is, I feel a bit
> uneasy about changing the current factors given how close this brings
> us towards overflowing.

I completely agree it's on the verge of overflow, and it actually would
overflow for some insanely low (by today's standards) rates. So I
treat it as a temporary solution, until people start asking about
more than 1 or 2Gbit. And of course we will have to move to 64 bit
anyway. Or we can do it now...

Btw., I have some doubts about HFSC; it's really different from the
others wrt. rate tables/time accounting, and these PSCHED_TICKS look
like nothing more than unnecessary compatibility; it works OK with
usecs and doesn't need this change now, unless I'm missing something.
So maybe we could simply stop using the common psched_get_time() for
it, and only do a conversion for qdisc_watchdog_schedule() etc.?

Thanks,
Jarek P.


* Re: [PATCH iproute2] Re: HTB accuracy for high speed
  2009-06-03  7:40                                           ` Jarek Poplawski
@ 2009-06-03  7:53                                             ` Patrick McHardy
  2009-06-03  8:01                                               ` Jarek Poplawski
                                                                 ` (2 more replies)
  0 siblings, 3 replies; 104+ messages in thread
From: Patrick McHardy @ 2009-06-03  7:53 UTC (permalink / raw)
  To: Jarek Poplawski
  Cc: Antonio Almeida, Stephen Hemminger, netdev, davem, devik,
	Eric Dumazet, Vladimir Ivashchenko

Jarek Poplawski wrote:
> On Wed, Jun 03, 2009 at 09:06:37AM +0200, Patrick McHardy wrote:
>> Mid-term we really need to move to 64 bit values and ns resolution,
>> otherwise this problem is just going to reappear as soon as someone
>> tries 10gbit. Not sure what the best short term fix is, I feel a bit
>> uneasy about changing the current factors given how close this brings
>> us towards overflowing.
> 
> I completely agree it's on the verge of overflow, and actually would
> overflow for some insanely low (for today's standards) rates. So I
> treat it's as a temporary solution, until people start asking about
> more than 1 or 2Gbit. And of course we will have to move to 64 bit
> anyway. Or we can do it now...

That (now) would certainly be the best solution, but it's a non-trivial
task since all the ABIs use 32-bit values.

> Btw., I've some doubts about HFSC; it's really different than others
> wrt. rate tables/time accounting, and these PSCHED_TICKS look only
> like an unnecesary compatibility; it works OK with usecs and doesn't
> need this change now, unless I miss something. So maybe we would
> simply stop using common psched_get_time() for it, and only do a
> conversion for qdisc_watchdog_schedule() etc.?

Yes, it would work perfectly fine with usecs, which is actually (and
unfortunately) the unit it uses in its ABI. But I think it's better
to convert the values once during initialization, instead of again
and again when scheduling the watchdog. The necessary changes are
really trivial: all you need to do when changing the scaling factors
is to increase SM_MASK and decrease ISM_MASK accordingly.


* Re: [PATCH iproute2] Re: HTB accuracy for high speed
  2009-06-03  7:53                                             ` Patrick McHardy
@ 2009-06-03  8:01                                               ` Jarek Poplawski
  2009-06-03  8:29                                                 ` Patrick McHardy
  2009-06-03  9:54                                               ` Jarek Poplawski
  2009-06-04  4:53                                               ` David Miller
  2 siblings, 1 reply; 104+ messages in thread
From: Jarek Poplawski @ 2009-06-03  8:01 UTC (permalink / raw)
  To: Patrick McHardy
  Cc: Antonio Almeida, Stephen Hemminger, netdev, davem, devik,
	Eric Dumazet, Vladimir Ivashchenko

On Wed, Jun 03, 2009 at 09:53:11AM +0200, Patrick McHardy wrote:
...
> Yes, it would work perfectly fine with usecs, which is actually (and
> unfortunately) the unit it uses in its ABI. But I think its better
> to convert the values once during initialization, instead of again
> and again when scheduling the watchdog. The necessary changes are
> really trivial, all you need to do when changing the scaling factors
> is to increase SM_MASK and decrease ISM_MASK accordingly.

Right! (On the other hand we could consider a separate watchdog too...)

Jarek P.


* Re: [PATCH iproute2] Re: HTB accuracy for high speed
  2009-06-03  8:01                                               ` Jarek Poplawski
@ 2009-06-03  8:29                                                 ` Patrick McHardy
  2009-06-03  8:45                                                   ` Jarek Poplawski
  0 siblings, 1 reply; 104+ messages in thread
From: Patrick McHardy @ 2009-06-03  8:29 UTC (permalink / raw)
  To: Jarek Poplawski
  Cc: Antonio Almeida, Stephen Hemminger, netdev, davem, devik,
	Eric Dumazet, Vladimir Ivashchenko

Jarek Poplawski wrote:
> On Wed, Jun 03, 2009 at 09:53:11AM +0200, Patrick McHardy wrote:
> ...
>> Yes, it would work perfectly fine with usecs, which is actually (and
>> unfortunately) the unit it uses in its ABI. But I think its better
>> to convert the values once during initialization, instead of again
>> and again when scheduling the watchdog. The necessary changes are
>> really trivial, all you need to do when changing the scaling factors
>> is to increase SM_MASK and decrease ISM_MASK accordingly.
> 
> Right! (On the other hand we could consider a separate watchdog too...)

We could :) But I don't see any benefit doing that, especially given
that eventually everything should be using ns resolution anyways.


* Re: [PATCH iproute2] Re: HTB accuracy for high speed
  2009-06-03  8:29                                                 ` Patrick McHardy
@ 2009-06-03  8:45                                                   ` Jarek Poplawski
  0 siblings, 0 replies; 104+ messages in thread
From: Jarek Poplawski @ 2009-06-03  8:45 UTC (permalink / raw)
  To: Patrick McHardy
  Cc: Antonio Almeida, Stephen Hemminger, netdev, davem, devik,
	Eric Dumazet, Vladimir Ivashchenko

On Wed, Jun 03, 2009 at 10:29:58AM +0200, Patrick McHardy wrote:
> Jarek Poplawski wrote:
>> On Wed, Jun 03, 2009 at 09:53:11AM +0200, Patrick McHardy wrote:
>> ...
>>> Yes, it would work perfectly fine with usecs, which is actually (and
>>> unfortunately) the unit it uses in its ABI. But I think its better
>>> to convert the values once during initialization, instead of again
>>> and again when scheduling the watchdog. The necessary changes are
>>> really trivial, all you need to do when changing the scaling factors
>>> is to increase SM_MASK and decrease ISM_MASK accordingly.
>>
>> Right! (On the other hand we could consider a separate watchdog too...)
>
> We could :) But I don't see any benefit doing that, especially given
> that eventually everything should be using ns resolution anyways.

The main benefit would be readability... I guess it's no problem for
you, but I'm currently trying to make sure things like this are/will
be OK :-)

        dx = ((u64)d * PSCHED_TICKS_PER_SEC);
        dx += USEC_PER_SEC - 1;
 
Jarek P.
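
To make the rounding in the two lines above explicit, here is a stand-alone sketch of a round-up conversion from usec to ticks. It assumes the snippet is followed by a division by USEC_PER_SEC, which is not shown in this mail; the function name and the ticks_per_sec parameter are illustrative, not kernel API.

```c
#include <stdint.h>
#include <stdio.h>

#define USEC_PER_SEC 1000000ULL

/*
 * Convert d_usec to ticks, rounding up.
 * Adding (divisor - 1) before an integer division is the usual way to
 * get a ceiling instead of a floor.
 */
uint64_t usec_to_ticks_ceil(uint32_t d_usec, uint64_t ticks_per_sec)
{
	uint64_t dx = (uint64_t)d_usec * ticks_per_sec; /* widen before multiply */

	dx += USEC_PER_SEC - 1;   /* bias so the division rounds up */
	return dx / USEC_PER_SEC; /* assumed final step, not in the quoted lines */
}

/* Print one conversion. */
void report(uint32_t d_usec, uint64_t ticks_per_sec)
{
	printf("%u usec -> %llu ticks\n", (unsigned)d_usec,
	       (unsigned long long)usec_to_ticks_ceil(d_usec, ticks_per_sec));
}
```

The 64-bit intermediate matters once the tick rate grows: a 32-bit usec value multiplied by a tick rate in the millions no longer fits in 32 bits.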


* Re: [PATCH iproute2] Re: HTB accuracy for high speed
  2009-06-03  7:53                                             ` Patrick McHardy
  2009-06-03  8:01                                               ` Jarek Poplawski
@ 2009-06-03  9:54                                               ` Jarek Poplawski
  2009-06-03 10:01                                                 ` Patrick McHardy
  2009-06-04 13:50                                                 ` Antonio Almeida
  2009-06-04  4:53                                               ` David Miller
  2 siblings, 2 replies; 104+ messages in thread
From: Jarek Poplawski @ 2009-06-03  9:54 UTC (permalink / raw)
  To: Antonio Almeida
  Cc: Patrick McHardy, Stephen Hemminger, netdev, davem, devik,
	Eric Dumazet, Vladimir Ivashchenko

On Wed, Jun 03, 2009 at 09:53:11AM +0200, Patrick McHardy wrote:
...
> The necessary changes are
> really trivial, all you need to do when changing the scaling factors
> is to increase SM_MASK and decrease ISM_MASK accordingly.

OK, looks like it's really enough and I was confused with some
rounding, thanks Patrick.

Antonio, could you give this patch a try (with all the previous) and
repeat those HFSC tests you did before (plus maybe a few tries with
lower rates)?

Thanks,
Jarek P.
---

 net/sched/sch_hfsc.c |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/net/sched/sch_hfsc.c b/net/sched/sch_hfsc.c
index 5022f9c..7c53a36 100644
--- a/net/sched/sch_hfsc.c
+++ b/net/sched/sch_hfsc.c
@@ -384,8 +384,9 @@ cftree_update(struct hfsc_class *cl)
  *
  *  1.024us/byte  78.125     7.8125     0.78125    0.078125   0.0078125
  */
-#define	SM_SHIFT	20
-#define	ISM_SHIFT	18
+#define	PSCHED_SHIFT	6	/* TODO: move to pkt_sched.h */
+#define	SM_SHIFT	(30 - PSCHED_SHIFT)
+#define	ISM_SHIFT	(8 + PSCHED_SHIFT)
 
 #define	SM_MASK		((1ULL << SM_SHIFT) - 1)
 #define	ISM_MASK	((1ULL << ISM_SHIFT) - 1)


* Re: [PATCH iproute2] Re: HTB accuracy for high speed
  2009-06-03  9:54                                               ` Jarek Poplawski
@ 2009-06-03 10:01                                                 ` Patrick McHardy
  2009-06-03 10:05                                                   ` Patrick McHardy
  2009-06-04 13:50                                                 ` Antonio Almeida
  1 sibling, 1 reply; 104+ messages in thread
From: Patrick McHardy @ 2009-06-03 10:01 UTC (permalink / raw)
  To: Jarek Poplawski
  Cc: Antonio Almeida, Stephen Hemminger, netdev, davem, devik,
	Eric Dumazet, Vladimir Ivashchenko

Jarek Poplawski wrote:
> On Wed, Jun 03, 2009 at 09:53:11AM +0200, Patrick McHardy wrote:
> ...
>> The necessary changes are
>> really trivial, all you need to do when changing the scaling factors
>> is to increase SM_MASK and decrease ISM_MASK accordingly.
> 
> OK, looks like it's really enough and I was confused with some
> rounding, thanks Patrick.

Looks fine in principle, but considering your change to the generic
scaling factors:

> -#define PSCHED_US2NS(x)			((s64)(x) << 10)
> -#define PSCHED_NS2US(x)			((x) >> 10)
> +#define PSCHED_US2NS(x)			((s64)(x) << 6)
> +#define PSCHED_NS2US(x)			((x) >> 6)

PSCHED_SHIFT should be 4, right?

> +#define	PSCHED_SHIFT	6	/* TODO: move to pkt_sched.h */
> +#define	SM_SHIFT	(30 - PSCHED_SHIFT)
> +#define	ISM_SHIFT	(8 + PSCHED_SHIFT)


* Re: [PATCH iproute2] Re: HTB accuracy for high speed
  2009-06-03 10:01                                                 ` Patrick McHardy
@ 2009-06-03 10:05                                                   ` Patrick McHardy
  2009-06-03 10:06                                                     ` Patrick McHardy
  0 siblings, 1 reply; 104+ messages in thread
From: Patrick McHardy @ 2009-06-03 10:05 UTC (permalink / raw)
  To: Jarek Poplawski
  Cc: Antonio Almeida, Stephen Hemminger, netdev, davem, devik,
	Eric Dumazet, Vladimir Ivashchenko

Patrick McHardy wrote:
> Jarek Poplawski wrote:
>> On Wed, Jun 03, 2009 at 09:53:11AM +0200, Patrick McHardy wrote:
>> ...
>>> The necessary changes are
>>> really trivial, all you need to do when changing the scaling factors
>>> is to increase SM_MASK and decrease ISM_MASK accordingly.
>>
>> OK, looks like it's really enough and I was confused with some
>> rounding, thanks Patrick.
> 
> Looks fine in principle, but considering your change to the generic
> scaling factors:
> 
>> -#define PSCHED_US2NS(x)            ((s64)(x) << 10)
>> -#define PSCHED_NS2US(x)            ((x) >> 10)
>> +#define PSCHED_US2NS(x)            ((s64)(x) << 6)
>> +#define PSCHED_NS2US(x)            ((x) >> 6)
> 
> PSCHED_SHIFT should be 4, right?


> -#define	SM_SHIFT	20
> -#define	ISM_SHIFT	18
> +#define	PSCHED_SHIFT	6	/* TODO: move to pkt_sched.h */
> +#define	SM_SHIFT	(30 - PSCHED_SHIFT)
> +#define	ISM_SHIFT	(8 + PSCHED_SHIFT)

Actually I'm confused, why the additional change of 10?


* Re: [PATCH iproute2] Re: HTB accuracy for high speed
  2009-06-03 10:05                                                   ` Patrick McHardy
@ 2009-06-03 10:06                                                     ` Patrick McHardy
  2009-06-03 10:27                                                       ` Jarek Poplawski
  0 siblings, 1 reply; 104+ messages in thread
From: Patrick McHardy @ 2009-06-03 10:06 UTC (permalink / raw)
  To: Jarek Poplawski
  Cc: Antonio Almeida, Stephen Hemminger, netdev, davem, devik,
	Eric Dumazet, Vladimir Ivashchenko

Patrick McHardy wrote:
>> PSCHED_SHIFT should be 4, right?
> 
>> -#define    SM_SHIFT    20
>> -#define    ISM_SHIFT    18
>> +#define    PSCHED_SHIFT    6    /* TODO: move to pkt_sched.h */
>> +#define    SM_SHIFT    (30 - PSCHED_SHIFT)
>> +#define    ISM_SHIFT    (8 + PSCHED_SHIFT)
> 
> Actually I'm confused, why the additional change of 10?

OK, 10 - 6 = 4, got it :)
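
The shift arithmetic that caused the confusion above can be spelled out in C. The two formulas are taken from the quoted HFSC patch; the function names are only for illustration.

```c
#include <stdio.h>

/* SM_SHIFT as defined in the quoted patch, for a given PSCHED_SHIFT. */
int sm_shift(int psched_shift)
{
	return 30 - psched_shift;
}

/* ISM_SHIFT as defined in the quoted patch, for a given PSCHED_SHIFT. */
int ism_shift(int psched_shift)
{
	return 8 + psched_shift;
}

/* Print both shifts for one PSCHED_SHIFT value. */
void report(int psched_shift)
{
	printf("PSCHED_SHIFT %2d: SM_SHIFT %d, ISM_SHIFT %d\n",
	       psched_shift, sm_shift(psched_shift), ism_shift(psched_shift));
}
```

With the old shift of 10 the formulas give back the original 20 and 18; with 6 they give 24 and 14. Both values move by 10 - 6 = 4 bits, in opposite directions, so their sum stays at 38.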



* Re: [PATCH iproute2] Re: HTB accuracy for high speed
  2009-06-03 10:06                                                     ` Patrick McHardy
@ 2009-06-03 10:27                                                       ` Jarek Poplawski
  0 siblings, 0 replies; 104+ messages in thread
From: Jarek Poplawski @ 2009-06-03 10:27 UTC (permalink / raw)
  To: Patrick McHardy
  Cc: Antonio Almeida, Stephen Hemminger, netdev, davem, devik,
	Eric Dumazet, Vladimir Ivashchenko

On Wed, Jun 03, 2009 at 12:06:10PM +0200, Patrick McHardy wrote:
> Patrick McHardy wrote:
>>> PSCHED_SHIFT should be 4, right?
>>
>>> -#define    SM_SHIFT    20
>>> -#define    ISM_SHIFT    18
>>> +#define    PSCHED_SHIFT    6    /* TODO: move to pkt_sched.h */
>>> +#define    SM_SHIFT    (30 - PSCHED_SHIFT)
>>> +#define    ISM_SHIFT    (8 + PSCHED_SHIFT)
>>
>> Actually I'm confused, why the additional change of 10?
>
> OK, 10 - 6 = 4, got it :)
>

If you wanted to console me after my hfsc confusions, you did it!

Thanks again,
Jarek P.


* Re: [PATCH iproute2] Re: HTB accuracy for high speed
  2009-06-03  7:53                                             ` Patrick McHardy
  2009-06-03  8:01                                               ` Jarek Poplawski
  2009-06-03  9:54                                               ` Jarek Poplawski
@ 2009-06-04  4:53                                               ` David Miller
  2009-06-04  7:50                                                 ` Jarek Poplawski
  2 siblings, 1 reply; 104+ messages in thread
From: David Miller @ 2009-06-04  4:53 UTC (permalink / raw)
  To: kaber; +Cc: jarkao2, vexwek, shemminger, netdev, devik, dada1, hazard

From: Patrick McHardy <kaber@trash.net>
Date: Wed, 03 Jun 2009 09:53:11 +0200

> Jarek Poplawski wrote:
>> On Wed, Jun 03, 2009 at 09:06:37AM +0200, Patrick McHardy wrote:
>>> Mid-term we really need to move to 64 bit values and ns resolution,
>>> otherwise this problem is just going to reappear as soon as someone
>>> tries 10gbit. Not sure what the best short term fix is, I feel a bit
>>> uneasy about changing the current factors given how close this brings
>>> us towards overflowing.
>> I completely agree it's on the verge of overflow, and actually would
>> overflow for some insanely low (for today's standards) rates. So I
>> treat it's as a temporary solution, until people start asking about
>> more than 1 or 2Gbit. And of course we will have to move to 64 bit
>> anyway. Or we can do it now...
> 
> That (now) would certainly be the best solution, but its a non-trivial
> task since all the ABIs use 32 bit values.

We could pass in a new attribute which provides the upper-32bits
of the value.  I'm not sure if that works in this case but it's
an idea.
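
A minimal sketch of the upper-32-bits idea, in plain C and without any netlink code. Everything here is hypothetical: the function names, and the convention that a missing "high" attribute means zero, are assumptions made for illustration, not an existing ABI.

```c
#include <stdint.h>
#include <stdio.h>

/* Sender side: split a 64-bit value into the legacy low word and a
 * new, optional high word. */
void split_u64(uint64_t value, uint32_t *lo, uint32_t *hi)
{
	*lo = (uint32_t)(value & 0xffffffffULL);
	*hi = (uint32_t)(value >> 32);
}

/* Receiver side: rebuild the value. If the high word was not sent
 * (old sender), has_hi is 0 and the result is just the low word. */
uint64_t join_u64(uint32_t lo, uint32_t hi, int has_hi)
{
	return ((uint64_t)(has_hi ? hi : 0) << 32) | lo;
}

/* Round-trip one value and print the parts. */
void report(uint64_t value)
{
	uint32_t lo, hi;

	split_u64(value, &lo, &hi);
	printf("%llu -> lo %u, hi %u -> %llu\n",
	       (unsigned long long)value, (unsigned)lo, (unsigned)hi,
	       (unsigned long long)join_u64(lo, hi, 1));
}
```

In this sketch a receiver that never sees the high word gets only the low 32 bits, so a value that really needs the high word would be silently truncated by such a receiver.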



* Re: [PATCH iproute2] Re: HTB accuracy for high speed
  2009-06-04  4:53                                               ` David Miller
@ 2009-06-04  7:50                                                 ` Jarek Poplawski
  0 siblings, 0 replies; 104+ messages in thread
From: Jarek Poplawski @ 2009-06-04  7:50 UTC (permalink / raw)
  To: David Miller; +Cc: kaber, vexwek, shemminger, netdev, devik, dada1, hazard

On Wed, Jun 03, 2009 at 09:53:14PM -0700, David Miller wrote:
> From: Patrick McHardy <kaber@trash.net>
> Date: Wed, 03 Jun 2009 09:53:11 +0200
> 
> > Jarek Poplawski wrote:
> >> On Wed, Jun 03, 2009 at 09:06:37AM +0200, Patrick McHardy wrote:
> >>> Mid-term we really need to move to 64 bit values and ns resolution,
> >>> otherwise this problem is just going to reappear as soon as someone
> >>> tries 10gbit. Not sure what the best short term fix is, I feel a bit
> >>> uneasy about changing the current factors given how close this brings
> >>> us towards overflowing.
> >> I completely agree it's on the verge of overflow, and actually would
> >> overflow for some insanely low (for today's standards) rates. So I
> >> treat it as a temporary solution, until people start asking about
> >> more than 1 or 2Gbit. And of course we will have to move to 64 bit
> >> anyway. Or we can do it now...
> > 
> > That (now) would certainly be the best solution, but it's a non-trivial
> > task since all the ABIs use 32 bit values.
> 
> We could pass in a new attribute which provides the upper-32bits
> of the value.  I'm not sure if that works in this case but it's
> an idea.

I'm not sure it could be so simple: I guess Patrick is concerned with
a new tc talking to an old kernel (otherwise a kernel should recognize
an old format). Then it would need something reasonable in 32bits.

But I'm not even sure we need 64-bit rate tables. Alternatively
(after checking that the kernel can handle this), we could simply
pass a log and use it to shift these values up to u64 in the kernel:

- static inline u32 qdisc_l2t(struct qdisc_rate_table* rtab, unsigned int pktlen)
+ static inline u64 qdisc_l2t(struct qdisc_rate_table* rtab, unsigned int pktlen)
  {
	...
-        return rtab->data[slot];
+        return (u64)rtab->data[slot] << rtab->rate.rate_log;
  }

Since these overflows are for low rates, this rounding of lower bits
shouldn't matter here. So, IMHO, it's more about adding this overhead
of u64 to the kernel now.
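
A minimal, self-contained sketch of the shift idea above. The struct is a simplified stand-in for the kernel's rate table, rate_log is the hypothetical field from the proposal, and clamping the slot at 255 is a simplification of what the real lookup does for long packets.

```c
#include <stdint.h>

/* Simplified stand-in for a rate table (assumption, not the kernel
 * struct): 256 precomputed 32-bit time values plus two shift counts. */
struct rate_table_sketch {
	uint32_t data[256];
	uint8_t cell_log;	/* packet length -> slot shift */
	uint8_t rate_log;	/* hypothetical: table value -> u64 shift */
};

static inline uint64_t l2t_sketch(const struct rate_table_sketch *rtab,
				  unsigned int pktlen)
{
	unsigned int slot = pktlen >> rtab->cell_log;

	if (slot > 255)
		slot = 255;	/* simplification: clamp instead of extrapolating */

	/* Widen to 64 bits before shifting; shifting the 32-bit value
	 * first would drop the high bits the shift is meant to create. */
	return (uint64_t)rtab->data[slot] << rtab->rate_log;
}
```

With rate_log = 0 the sketch returns the table value unchanged; with a non-zero rate_log the low rate_log bits of the result are always zero, which is the rounding of lower bits mentioned above.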

Jarek P.

* Re: [PATCH iproute2] Re: HTB accuracy for high speed
  2009-06-03  9:54                                               ` Jarek Poplawski
  2009-06-03 10:01                                                 ` Patrick McHardy
@ 2009-06-04 13:50                                                 ` Antonio Almeida
       [not found]                                                   ` <20090604193013.GA2755@ami.dom.local>
  1 sibling, 1 reply; 104+ messages in thread
From: Antonio Almeida @ 2009-06-04 13:50 UTC (permalink / raw)
  To: Jarek Poplawski
  Cc: Patrick McHardy, Stephen Hemminger, netdev, davem, devik,
	Eric Dumazet, Vladimir Ivashchenko

On Wed, Jun 3, 2009 at 10:54 AM, Jarek Poplawski wrote:
> Antonio, could you give this patch a try (with all the previous) and
> repeat those HFSC tests you did before (plus maybe a few tries with
> lower rates)?

For me, HTB values are just perfect! I would say that they're better
than HFSC, since sent rate stays below the configured ceil (but that's
for me)
After applying the patch you sent (to sch_hfsc.c) I got these values for HFSC:

configured (bit/s)   analyser RX (bit/s)   error (%)
        10000000              10062688        0.63
        20000000              20096961        0.48
        30000000              30135028        0.45
        40000000              40186190        0.47
        50000000              50294890        0.59
        60000000              60294553        0.49
        70000000              70284220        0.41
        80000000              80414272        0.52
        90000000              90354675        0.39
       100000000             100453024        0.45
       200000000             200962041        0.48
       250000000             251467886        0.59
       300000000             301422613        0.47
       400000000             402123479        0.53
       500000000             502356820        0.47
       550000000             552988253        0.54
       600000000             602956905        0.49
       700000000             703405632        0.49
       750000000             753949085        0.53
       800000000             804315169        0.54
       900000000             904584208        0.51
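
The error column appears to be the relative deviation of the received rate from the configured one. A small sketch of that arithmetic, with a hypothetical helper name:

```c
/* Relative error, in percent, of a measured rate against the
 * configured rate. Both arguments must use the same unit. */
static inline double rate_error_percent(double configured, double measured)
{
	return (measured - configured) / configured * 100.0;
}
```

For the first row, rate_error_percent(10000000, 10062688) gives about 0.627, which matches the listed 0.63 after rounding.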

As usual, generating 970Mbit/s of TCP traffic with 800-byte packets.

Here's the setup picture:
# tc -s -d class ls dev eth1
class hfsc 1: root
 Sent 253924 bytes 319 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
 period 0 level 4

class hfsc 1:1 parent 1: sc m1 0bit d 0us m2 1000Mbit ul m1 0bit d 0us m2 1000Mbit
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
 period 2 work 299437688 bytes level 3

class hfsc 1:10 parent 1:2 sc m1 0bit d 0us m2 1000Mbit ul m1 0bit d 0us m2 1000Mbit
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
 period 2 work 299437688 bytes level 1

class hfsc 1:2 parent 1:1 sc m1 0bit d 0us m2 1000Mbit ul m1 0bit d 0us m2 1000Mbit
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
 period 2 work 299437688 bytes level 2

class hfsc 1:108 parent 1:10 sc m1 0bit d 50.0ms m2 500000Kbit ul m1 0bit d 0us m2 500000Kbit
 Sent 300178764 bytes 377109 pkt (dropped 349464, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 931p requeues 0
 period 2 work 299437688 bytes rtwork 299437688 bytes level 0


If you'd like any other values just ask. I'll be away till the fourteenth.
Thanks a lot! Good job!
  Antonio Almeida

* Re: [PATCH iproute2] Re: HTB accuracy for high speed
       [not found]                                                       ` <20090604194203.GB2755@ami.dom.local>
@ 2009-06-09  5:25                                                         ` Badalian Vyacheslav
  2009-06-09  5:49                                                           ` Jarek Poplawski
  0 siblings, 1 reply; 104+ messages in thread
From: Badalian Vyacheslav @ 2009-06-09  5:25 UTC (permalink / raw)
  To: Jarek Poplawski; +Cc: Patrick McHardy, Antonio Almeida, netdev

Hello!
Is there any progress on applying this patch set?
I'm very interested in seeing these patches in the mainline kernel tree. We
would like to use HTB for speeds above 1G (consolidating 10 servers x 1G
into a few with 10G Intel multi-queue network devices).

Thanks for your work!
Best regards, Slavon

> On Thu, Jun 04, 2009 at 09:35:50PM +0200, Patrick McHardy wrote:
>> Jarek Poplawski wrote:
> ...
>>> OK, I'll browse other schedulers, and if there is nothing suspicious
>>> I'll submit these patches.
>>
>> Please give me a day to have another look at this, I didn't find
>> any time today.
>>
>> In most areas the overflows are only occurring when crossing
>> IMO unreasonable boundaries (but I've been wrong about that
>> before), but tc_cbq_calc_maxidle() is still making me nervous.
>
> Sure, I planned similar time for browsing it yet, as well.
>
> Thanks,
> Jarek P.

* Re: [PATCH iproute2] Re: HTB accuracy for high speed
  2009-06-09  5:25                                                         ` Badalian Vyacheslav
@ 2009-06-09  5:49                                                           ` Jarek Poplawski
  0 siblings, 0 replies; 104+ messages in thread
From: Jarek Poplawski @ 2009-06-09  5:49 UTC (permalink / raw)
  To: Badalian Vyacheslav; +Cc: Patrick McHardy, Antonio Almeida, netdev

On Tue, Jun 09, 2009 at 09:25:48AM +0400, Badalian Vyacheslav wrote:
> Hello!
> Is there any progress on applying this patch set?
> I'm very interested in seeing these patches in the mainline kernel tree. We
> would like to use HTB for speeds above 1G (consolidating 10 servers x 1G
> into a few with 10G Intel multi-queue network devices).

Hi,

I'll try to send patches today, but they are expected to work with 1G
or maybe a little more. I'm not sure higher rates make sense without
tso/gso, which isn't properly handled by packet schedulers anyway, so
more time/feedback/testing will be needed to go further.

Regards,
Jarek P.

Thread overview: 104+ messages
     [not found] <298f5c050905150745p13dc226eia1ff50ffa8c4b300@mail.gmail.com>
2009-05-15 14:49 ` HTB accuracy for high speed Antonio Almeida
2009-05-15 18:12   ` Stephen Hemminger
2009-05-18 10:01     ` Antonio Almeida
2009-05-18 10:45       ` Jarek Poplawski
2009-05-18 12:27         ` Antonio Almeida
2009-05-18 12:32           ` Jarek Poplawski
2009-05-18 16:13       ` Stephen Hemminger
2009-05-18 18:03         ` Antonio Almeida
2009-05-18 22:02         ` Stephen Hemminger
2009-05-19 11:48           ` Antonio Almeida
2009-05-19 13:08             ` Antonio Almeida
2009-05-16  8:31   ` Jarek Poplawski
2009-05-18 10:39     ` Antonio Almeida
2009-05-18 11:14       ` Jarek Poplawski
2009-05-18 12:05         ` Antonio Almeida
2009-05-16 14:14   ` Jarek Poplawski
2009-05-18 14:36     ` Antonio Almeida
2009-05-18 23:14       ` Vladimir Ivashchenko
2009-05-18 23:27         ` Vladimir Ivashchenko
2009-05-19 11:03           ` Jarek Poplawski
2009-05-19 14:04             ` Vladimir Ivashchenko
2009-05-19 20:10               ` Jarek Poplawski
2009-05-20 22:07                 ` Vladimir Ivashchenko
2009-05-20 22:46                   ` Eric Dumazet
2009-05-21  7:20                     ` Jarek Poplawski
2009-05-21  7:44                       ` Vladimir Ivashchenko
2009-05-21  8:28                         ` Jarek Poplawski
2009-05-21  9:07                           ` Eric Dumazet
2009-05-21  9:22                             ` Jarek Poplawski
2009-05-23 10:37                           ` HTB accuracy for high speed (and bonding) Vladimir Ivashchenko
2009-05-23 14:34                             ` Jarek Poplawski
2009-05-23 15:06                               ` Vladimir Ivashchenko
2009-05-23 15:35                                 ` Jarek Poplawski
2009-05-23 15:53                                   ` Vladimir Ivashchenko
2009-05-23 16:02                                     ` Jarek Poplawski
2009-05-18 16:40     ` HTB accuracy for high speed Eric Dumazet
2009-05-18 17:23       ` Jarek Poplawski
2009-05-18 21:52         ` David Miller
2009-05-18 23:59           ` [PATCH] pkt_sched: gen_estimator: use 64 bits intermediate counters for bps Eric Dumazet
2009-05-19  2:27             ` David Miller
2009-05-19  7:02             ` Jarek Poplawski
2009-05-19  7:31               ` Eric Dumazet
2009-05-19  7:42                 ` Jarek Poplawski
2009-05-19  7:57                   ` Jarek Poplawski
2009-05-19 18:03                     ` Eric Dumazet
2009-05-19 19:09                       ` [PATCH] pkt_sched: gen_estimator: Fix signed integers right-shifts Jarek Poplawski
2009-05-26  5:47                         ` David Miller
2009-05-19  8:18                 ` [PATCH] pkt_sched: gen_estimator: use 64 bits intermediate counters for bps David Miller
2009-05-17 20:15   ` HTB accuracy for high speed Jarek Poplawski
2009-05-18  6:56     ` [PATCH iproute2] " Jarek Poplawski
2009-05-18 16:54       ` Antonio Almeida
2009-05-18 17:16         ` Antonio Almeida
2009-05-21  8:51           ` Jarek Poplawski
2009-05-22 17:42             ` Antonio Almeida
2009-05-23  7:32               ` Jarek Poplawski
2009-05-28 18:13                 ` Antonio Almeida
2009-05-28 21:12                   ` Jarek Poplawski
2009-05-29 17:02                     ` Antonio Almeida
2009-05-29 17:28                       ` Stephen Hemminger
2009-05-29 19:58                         ` Jarek Poplawski
2009-05-29 19:46                       ` Jarek Poplawski
2009-05-29 20:49                         ` Stephen Hemminger
2009-05-29 20:59                           ` Jarek Poplawski
2009-05-30 20:07                       ` Jarek Poplawski
2009-06-02 10:12                         ` Antonio Almeida
2009-06-02 11:45                           ` Antonio Almeida
2009-06-02 12:36                             ` Jarek Poplawski
2009-06-02 12:45                               ` Patrick McHardy
2009-06-02 13:08                                 ` Jarek Poplawski
2009-06-02 13:20                                   ` Patrick McHardy
2009-06-02 21:37                                     ` Jarek Poplawski
2009-06-02 21:50                                       ` Jarek Poplawski
2009-06-03  7:06                                         ` Patrick McHardy
2009-06-03  7:40                                           ` Jarek Poplawski
2009-06-03  7:53                                             ` Patrick McHardy
2009-06-03  8:01                                               ` Jarek Poplawski
2009-06-03  8:29                                                 ` Patrick McHardy
2009-06-03  8:45                                                   ` Jarek Poplawski
2009-06-03  9:54                                               ` Jarek Poplawski
2009-06-03 10:01                                                 ` Patrick McHardy
2009-06-03 10:05                                                   ` Patrick McHardy
2009-06-03 10:06                                                     ` Patrick McHardy
2009-06-03 10:27                                                       ` Jarek Poplawski
2009-06-04 13:50                                                 ` Antonio Almeida
     [not found]                                                   ` <20090604193013.GA2755@ami.dom.local>
     [not found]                                                     ` <4A282216.20203@trash.net>
     [not found]                                                       ` <20090604194203.GB2755@ami.dom.local>
2009-06-09  5:25                                                         ` Badalian Vyacheslav
2009-06-09  5:49                                                           ` Jarek Poplawski
2009-06-04  4:53                                               ` David Miller
2009-06-04  7:50                                                 ` Jarek Poplawski
2009-05-18 17:53         ` Jarek Poplawski
2009-05-18 18:23           ` Antonio Almeida
2009-05-18 18:32             ` Jarek Poplawski
2009-05-18 18:56               ` Antonio Almeida
2009-05-18 19:05                 ` Jarek Poplawski
2009-05-19 10:55                   ` Antonio Almeida
2009-05-19 11:04                     ` Denys Fedoryschenko
2009-05-19 11:18                       ` Jarek Poplawski
2009-05-19 11:21                         ` Denys Fedoryschenko
2009-05-19 11:28                           ` Jarek Poplawski
2009-05-19 14:31                             ` Antonio Almeida
2009-05-19 11:09                     ` Jarek Poplawski
2009-05-19 13:18                   ` Jesper Dangaard Brouer
2009-05-19 19:35                     ` Jarek Poplawski
2009-05-18  7:01     ` [PATCH iproute2 v2] " Jarek Poplawski
2009-05-17 20:29   ` Vladimir Ivashchenko
