* HFSC not working as expected
@ 2014-06-26 14:39 Alan Goodman
2014-07-01 12:25 ` Michal Soltys
` (21 more replies)
0 siblings, 22 replies; 23+ messages in thread
From: Alan Goodman @ 2014-06-26 14:39 UTC (permalink / raw)
To: lartc
Hi,
I currently operate a traffic management / QoS system at my premises with the following htb/sfq setup. No default handle is selected, for various reasons (VPN traffic is tagged and accounted for before flowing out, and I don't want to count VPN traffic here).
#QoS for Upload
tc qdisc del dev ppp0 root
tc qdisc add dev ppp0 root handle 1:0 htb r2q 1
tc class add dev ppp0 parent 1: classid 1:1 htb rate 900kbit ceil 900kbit
tc class add dev ppp0 parent 1:1 classid 1:10 htb rate 85kbit ceil 900kbit quantum 2824 mtu 1412 prio 1 #syn ack rst
tc class add dev ppp0 parent 1:1 classid 1:11 htb rate 410kbit ceil 900kbit quantum 2824 mtu 1412 prio 2 #VoIP/Ping/DNS/Time critical
tc class add dev ppp0 parent 1:1 classid 1:12 htb rate 300kbit ceil 900kbit quantum 2824 mtu 1412 prio 3 #Interactive/Web/etc
tc class add dev ppp0 parent 1:1 classid 1:13 htb rate 105kbit ceil 900kbit quantum 2824 mtu 1412 prio 4 #Bulk/Other
tc qdisc add dev ppp0 parent 1:10 handle 10: sfq perturb 10
tc qdisc add dev ppp0 parent 1:11 handle 11: sfq perturb 10
tc qdisc add dev ppp0 parent 1:12 handle 12: sfq perturb 10 limit 2
tc qdisc add dev ppp0 parent 1:13 handle 13: sfq perturb 10 limit 2
tc filter add dev ppp0 parent 1:0 protocol ip prio 1 handle 10 fw flowid 1:10
tc filter add dev ppp0 parent 1:0 protocol ip prio 2 handle 11 fw flowid 1:11
tc filter add dev ppp0 parent 1:0 protocol ip prio 3 handle 12 fw flowid 1:12
tc filter add dev ppp0 parent 1:0 protocol ip prio 4 handle 13 fw flowid 1:13
#QoS for Download
tc qdisc del dev eth1 root
tc qdisc add dev eth1 root handle 1:0 htb r2q 1
tc class add dev eth1 parent 1: classid 1:1 htb rate 15000kbit ceil 15000kbit
tc class add dev eth1 parent 1:1 classid 1:10 htb rate 85kbit ceil 15000kbit quantum 2824 mtu 1412 prio 1 #syn ack rst
tc class add dev eth1 parent 1:1 classid 1:11 htb rate 460kbit ceil 15000kbit quantum 2824 mtu 1412 prio 2 #VoIP/Ping/DNS/Time critical
tc class add dev eth1 parent 1:1 classid 1:12 htb rate 5955kbit ceil 15000kbit quantum 2824 mtu 1412 prio 3 #Interactive/Web/etc
tc class add dev eth1 parent 1:1 classid 1:13 htb rate 8500kbit ceil 15000kbit quantum 2824 mtu 1412 prio 4 #Bulk/Other
tc qdisc add dev eth1 parent 1:10 handle 10: sfq perturb 10
tc qdisc add dev eth1 parent 1:11 handle 11: sfq perturb 10
tc qdisc add dev eth1 parent 1:12 handle 12: sfq perturb 10 limit 10
tc qdisc add dev eth1 parent 1:13 handle 13: sfq perturb 10 limit 10
tc filter add dev eth1 parent 1:0 protocol ip prio 1 handle 10 fw flowid 1:10
tc filter add dev eth1 parent 1:0 protocol ip prio 2 handle 11 fw flowid 1:11
tc filter add dev eth1 parent 1:0 protocol ip prio 3 handle 12 fw flowid 1:12
tc filter add dev eth1 parent 1:0 protocol ip prio 4 handle 13 fw flowid 1:13
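For reference, the fw filters above only match marks that iptables must already have set; the actual rule set is not shown in this thread. A hedged sketch of the kind of marking rules involved (the match criteria here are hypothetical illustrations, only the mark numbers 10-13 follow from the flowids above):

```shell
# Hypothetical MARK rules of the kind the fw classifiers rely on;
# mark N sends a packet to hfsc/htb class 1:N via "handle N fw".
iptables -t mangle -A POSTROUTING -o ppp0 -p tcp --tcp-flags SYN,RST SYN -j MARK --set-mark 10  # syn/rst
iptables -t mangle -A POSTROUTING -o ppp0 -p udp --dport 53 -j MARK --set-mark 11               # DNS (time critical)
iptables -t mangle -A POSTROUTING -o ppp0 -p tcp --dport 80 -j MARK --set-mark 12               # web (interactive)
iptables -t mangle -A POSTROUTING -o ppp0 -p tcp --dport 6881:6999 -j MARK --set-mark 13        # bulk/p2p
```

Unmarked packets match no filter, which is why htb without a default leaves them unshaped while hfsc (with default 14) sends them to 1:14.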
The above system, coupled with a carefully crafted iptables rule set, is working quite well; however, when upload is being actively managed, it sometimes takes a while before jitter on time-critical flows settles down. I have therefore been investigating hfsc... Hfsc appears to work for low numbers of flows, but once flows increase and packets backlog in the system, overall latency climbs skywards, since the backlog gets too large before packets appear to be dropped. I have attempted to replicate my htb setup, where unmarked traffic flows out without restriction; hfsc, however, drops unclassified traffic - hence my attempted solution below (class 1:14), which seems to be working properly.
#QoS for Upload
tc qdisc del dev ppp0 root
tc qdisc add dev ppp0 root handle 1:0 hfsc default 14
tc class add dev ppp0 parent 1:0 classid 1:1 hfsc sc rate 100mbit ul rate 100mbit
tc class add dev ppp0 parent 1:1 classid 1:2 hfsc sc rate 900kbit ul rate 900kbit
tc class add dev ppp0 parent 1:2 classid 1:10 hfsc sc rate 85kbit #syn ack rst
tc class add dev ppp0 parent 1:2 classid 1:11 hfsc sc umax 1412b dmax 20ms rate 410kbit #Time critical
tc class add dev ppp0 parent 1:2 classid 1:12 hfsc sc rate 300kbit #Interactive
tc class add dev ppp0 parent 1:2 classid 1:13 hfsc sc rate 105kbit #Bulk
tc class add dev ppp0 parent 1:1 classid 1:14 hfsc sc rate 100mbit
tc qdisc add dev ppp0 parent 1:10 handle 1010: sfq
tc qdisc add dev ppp0 parent 1:11 handle 1011: sfq
tc qdisc add dev ppp0 parent 1:12 handle 1012: sfq limit 5
tc qdisc add dev ppp0 parent 1:13 handle 1013: sfq limit 5
tc qdisc add dev ppp0 parent 1:14 handle 1014: pfifo
tc filter add dev ppp0 parent 1:0 protocol ip prio 1 handle 10 fw flowid 1:10
tc filter add dev ppp0 parent 1:0 protocol ip prio 2 handle 11 fw flowid 1:11
tc filter add dev ppp0 parent 1:0 protocol ip prio 3 handle 12 fw flowid 1:12
tc filter add dev ppp0 parent 1:0 protocol ip prio 4 handle 13 fw flowid 1:13
#QoS for Download
tc qdisc del dev eth1 root
tc qdisc add dev eth1 root handle 1:0 hfsc default 14
tc class add dev eth1 parent 1:0 classid 1:1 hfsc sc rate 100mbit ul m2 100mbit
tc class add dev eth1 parent 1:1 classid 1:2 hfsc sc rate 18000kbit ul m2 18000kbit
tc class add dev eth1 parent 1:2 classid 1:10 hfsc sc rate 85kbit #syn ack rst
tc class add dev eth1 parent 1:2 classid 1:11 hfsc sc umax 1412b dmax 20ms rate 460kbit #Time critical
tc class add dev eth1 parent 1:2 classid 1:12 hfsc sc rate 5955kbit #Interactive
tc class add dev eth1 parent 1:2 classid 1:13 hfsc sc rate 11500kbit #Bulk
tc class add dev eth1 parent 1:1 classid 1:14 hfsc sc rate 100mbit
tc filter add dev eth1 parent 1:0 protocol ip prio 1 handle 10 fw flowid 1:10
tc filter add dev eth1 parent 1:0 protocol ip prio 2 handle 11 fw flowid 1:11
tc filter add dev eth1 parent 1:0 protocol ip prio 3 handle 12 fw flowid 1:12
tc filter add dev eth1 parent 1:0 protocol ip prio 4 handle 13 fw flowid 1:13
Does anybody have any tips for getting this working as well as, or better than, my existing htb-based system?
Alan
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: HFSC not working as expected
From: Michal Soltys @ 2014-07-01 12:25 UTC (permalink / raw)
To: lartc
On 2014-06-26 16:39, Alan Goodman wrote:
> Hi,
>
> tc qdisc del dev ppp0 root
>
> tc qdisc add dev ppp0 root handle 1:0 hfsc default 14
>
> #QoS for Download
>
> tc qdisc del dev eth1 root
>
What is your upstream/downstream bandwidth?
* Re: HFSC not working as expected
From: Alan Goodman @ 2014-07-01 13:19 UTC (permalink / raw)
To: lartc
Hi Michal,
Thank you for your response.
Sync speed is 21mbit down and 1.1mbit up.
I recognise that the examples I pasted in my original email included different rate limits - my original hfsc testing occurred before a fault afflicted the line. The htb paste shows my current restrictions, since the line fault (now resolved) caused the sync speed to drop; I am waiting for the ISP's dynamic line management system to allow the lower SNR margin I used to use...
Hope this makes sense,
Alan
On 01/07/14 13:25, Michal Soltys wrote:
> On 2014-06-26 16:39, Alan Goodman wrote:
>> Hi,
>>
>> tc qdisc del dev ppp0 root
>>
>> tc qdisc add dev ppp0 root handle 1:0 hfsc default 14
>>
>> #QoS for Download
>>
>> tc qdisc del dev eth1 root
>>
>
> What is your upstream/downstream bandwidth ?
>
* Re: HFSC not working as expected
From: Michal Soltys @ 2014-07-01 13:30 UTC (permalink / raw)
To: lartc
On 2014-07-01 15:19, Alan Goodman wrote:
> Hi Michael,
>
> Thank you for your response.
>
> Sync speed is 21mbit down and 1.1mbit up.
Ok,
>
> I recognise that I pasted examples in my original email with different
> rate limits included - this was due to my original hfsc testing
> occurring before a line fault afflicted the line. htb paste is my
> current restrictions since a line fault (now resolved) caused the speed
> to drop - waiting for the ISPs dynamic line management system to allow
> the lower SNR I used to use again...
>
Judging from the speed/info: is it adsl, and is your modem directly connected to your router?
PS.
Disregard the other mail; it went out without lartc in the header.
* Re: HFSC not working as expected
From: Alan Goodman @ 2014-07-01 14:33 UTC (permalink / raw)
To: lartc
Hi Michal,
Thanks for your reply. Please see my inline comments.
On 01/07/14 14:30, Michal Soltys wrote:
> Judging from the speed/info, is it adsl and is your modem directly
> connected to your router ?
>
Connection type is ADSL, PPPoA
Topology is this:
Phone line -> splitter -> ADSL Router in bridge mode -> 10/100 ethernet
-> eth0 on CentOS 6 server
CentOS 6 server is doing PPPoE, router is connected directly to eth0.
I shape internet upload on ppp0 and internet download as it passes out
of the eth1 interface. Eth1 is the uplink to my LAN.
Alan
* Re: HFSC not working as expected
From: Michal Soltys @ 2014-07-03 0:12 UTC (permalink / raw)
To: lartc
On 2014-07-01 16:33, Alan Goodman wrote:
> Hi Michael,
>
> Thanks for your reply. Please see my inline comments.
>
> On 01/07/14 14:30, Michal Soltys wrote:
>> Judging from the speed/info, is it adsl and is your modem directly
>> connected to your router ?
>>
>
> Connection type is ADSL, PPPoA
>
> Topology is this:
>
> Phone line -> splitter -> ADSL Router in bridge mode -> 10/100 ethernet
> -> eth0 on CentOS 6 server
>
> CentOS 6 server is doing PPPoE, router is connected directly to eth0.
>
> I shape internet upload on ppp0 and internet download as it passes out
> of the eth1 interface. Eth1 is the uplink to my LAN.
Ah, so essentially you have an adsl router (a Draytek, perhaps?) that translates pppoe and then sends using pppoa (from what I remember it still operates as a router, only mimicking a pppoe server).
Anyway, long story short: you should use tc-stab to match the adsl speed properly, and then the rules will need some adjustment as well. I'll post some suggestions later. (For one, 100mbit is generally a bad idea in the default class: if something fast lands there, it will instantly saturate the uplink - the realtime curve will make sure of that.)
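One way to act on that advice (a sketch only, with hypothetical numbers not taken from the thread): hang the default class under the rate-limited 1:2 and give it only a link-share curve, so it borrows leftover bandwidth instead of reserving 100mbit with a realtime guarantee:

```shell
# Default class under the 900kbit-limited 1:2 rather than directly under 1:1,
# with ls only (no sc/rt), so unclassified traffic gets no realtime reservation
# and simply shares whatever the other classes leave unused.
tc class add dev ppp0 parent 1:2 classid 1:14 hfsc ls rate 100kbit
tc qdisc add dev ppp0 parent 1:14 handle 1014: sfq
```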
* Re: HFSC not working as expected
2014-06-26 14:39 HFSC not working as expected Alan Goodman
` (4 preceding siblings ...)
2014-07-03 0:12 ` Michal Soltys
@ 2014-07-03 0:56 ` Alan Goodman
2014-07-06 1:18 ` Michal Soltys
` (15 subsequent siblings)
21 siblings, 0 replies; 23+ messages in thread
From: Alan Goodman @ 2014-07-03 0:56 UTC (permalink / raw)
To: lartc
Hi Michael,
Many thanks for your useful response.
Please see inline comments.
On 03/07/14 01:12, Michal Soltys wrote:
>> Connection type is ADSL, PPPoA
>>
>> Topology is this:
>>
>> Phone line -> splitter -> ADSL Router in bridge mode -> 10/100 ethernet
>> -> eth0 on CentOS 6 server
>>
>> CentOS 6 server is doing PPPoE, router is connected directly to eth0.
>>
>> I shape internet upload on ppp0 and internet download as it passes out
>> of the eth1 interface. Eth1 is the uplink to my LAN.
>
> Ah, so essentially you have adsl router (draytek perhaps ?) that
> translates pppoe and then sends using pppoa (it's still operates as a
> router from what I rememeber, only mimicking pppoe server).
It's a BT Business Hub 3, which I think is made by Huawei. I'm not sure of the technical underpinnings of how a router in bridge mode operates, beyond that I set it to bridge mode, connect it to a device that does PPPoE, and it 'just works'. The PPP session then runs on the connected device, and the router can handle as many packets/sec as the underlying link or the connected device can support. If you could explain how this works in more detail, or link me a related article, I'd be intrigued.
> Anyway, long story short you should use tc-stab to match adsl speed
> properly, then the rules will need some adjustment as well. I'll put
> some suggestions later (for once, 100mbit in default is generally bad
> idea in default class if something faster lands there, as it will
> instantly saturate uplink - realtime curve will make sure of that)
Do you have any examples (or links to correct examples online) of a good method of utilising 'tc-stab'? I have looked at the documentation and am feeling a little overwhelmed at the moment!
Alan
* Re: HFSC not working as expected
From: Michal Soltys @ 2014-07-06 1:18 UTC (permalink / raw)
To: lartc
On 2014-07-03 02:56, Alan Goodman wrote:
>
> Its a BT Business Hub 3, which I think is made by Huawei. Im not sure of
> the technical underpinnings of how a router in bridgemode operates
> beyond I set it to bridge mode and connect it to a device that does
> PPPoE and it 'just works'.
Hmmm, maybe the ISP handles both pppoa and pppoe?
>
> Do you have any examples (or links to correct examples online) of a good
> method of utilising 'tc-stab' ? I have looked at the documentation and
> am feeling a little overwhelmed at the moment!
>
tc-stab essentially answers "what is the real length of the packet being sent?". ATM cells are always 53 bytes long with 48 bytes of payload - so the data sent is divided into 48-byte chunks, with some atm/ethernet/ppp specific info (overhead) added, and padded to fit into a 48*n size.
Assuming your link is actually pppoe, the overhead would be 32 (vcmux) or 40 (llc) with the 'linklayer atm' option. Then you can use the speed at which the modem synchronizes (or a tiny bit less, to compensate for small variations) when setting rates in tc.
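That per-packet rounding can be sketched with a toy shell calculation (not part of the tc setup; a full-size 1492-byte IP packet and llc overhead of 40 are assumed):

```shell
#!/bin/sh
# ATM framing: the packet plus link overhead is split into 48-byte cell
# payloads (the last one padded), and each cell costs 53 bytes on the wire.
ip_len=1492      # full-size IP packet on a 1492-MTU pppoe link
overhead=40      # llc pppoe overhead (32 for vcmux)
cells=$(( (ip_len + overhead + 47) / 48 ))   # ceiling division into cells
wire=$(( cells * 53 ))                        # actual bytes sent on the wire
echo "$cells cells, $wire bytes on the wire"
```

With these assumptions it prints "32 cells, 1696 bytes on the wire", which is the figure Andy works from later in the thread.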
* Re: HFSC not working as expected
From: Alan Goodman @ 2014-07-06 15:34 UTC (permalink / raw)
To: lartc
Hi,
Once again many thanks for your informative response.
On 06/07/14 02:18, Michal Soltys wrote:
> On 2014-07-03 02:56, Alan Goodman wrote:
>>
>> Its a BT Business Hub 3, which I think is made by Huawei. Im not sure of
>> the technical underpinnings of how a router in bridgemode operates
>> beyond I set it to bridge mode and connect it to a device that does
>> PPPoE and it 'just works'.
>
> Hmmm, maybe the isp handles both pppoa and pppoe ?
Yeah, on further investigation it seems the ISP (BT Business) supports
both PPPoA and PPPoE. As is clear I am using PPPoE.
>> Do you have any examples (or links to correct examples online) of a good
>> method of utilising 'tc-stab' ? I have looked at the documentation and
>> am feeling a little overwhelmed at the moment!
>>
>
> tc-stab essentially answers "what is the real length of packet being
> sent ? (later)". ATM cells are always 53 bytes long with 48 bytes
> payload - so the data send is divided in 48 byte packs, some
> atm/ethernert/ppp specific info (overhead) and padded to fit into 48*n
> size.
>
> Assuming your link is actualle pppoe, the overhead would be 32 (vcmux)
> or 40 (llc) with 'linklayer atm' option. Then you can use the speed at
> which modem synchronizes (or well a tiny bit less to compensate for
> small variations) when using tc.
Thanks for your clear explanation here.
I added 'stab overhead 40 linklayer atm' to my root qdisc line, since I am confident my connection uses LLC multiplexing. This transformed the hfsc-based shaper into the most accurate I have so far experienced. I am able to set the tc upload limit to the sync speed, and the tc downstream limit to sync minus 12%, which accounts for some rate limiting BT do on all lines (they limit downstream to 88.2% of sync rate).
This works great in almost every test case except 'excessive p2p'. As a test I configured a 9mbit rate and a 10mbit upper-limit m2 on my bulk class. I then started downloading a CentOS torrent with a very high maximum connection limit set. I see 10mbit coming in on my ppp0 interface; however, the roundtrip in my priority queue (sc umax 1412b dmax 20ms rate 460kbit) is hitting 100+ms. Below is a clip from a ping session which shows what happens when I pause the torrent download.
64 bytes from ha01.multiplay.co.uk (85.236.96.26): icmp_seq=179 ttl=54 time=128 ms
64 bytes from ha01.multiplay.co.uk (85.236.96.26): icmp_seq=180 ttl=54 time=152 ms
64 bytes from ha01.multiplay.co.uk (85.236.96.26): icmp_seq=181 ttl=54 time=137 ms
64 bytes from ha01.multiplay.co.uk (85.236.96.26): icmp_seq=182 ttl=54 time=134 ms
64 bytes from ha01.multiplay.co.uk (85.236.96.26): icmp_seq=183 ttl=54 time=133 ms
64 bytes from ha01.multiplay.co.uk (85.236.96.26): icmp_seq=184 ttl=54 time=106 ms
64 bytes from ha01.multiplay.co.uk (85.236.96.26): icmp_seq=185 ttl=54 time=105 ms
64 bytes from ha01.multiplay.co.uk (85.236.96.26): icmp_seq=186 ttl=54 time=144 ms
64 bytes from ha01.multiplay.co.uk (85.236.96.26): icmp_seq=187 ttl=54 time=127 ms
64 bytes from ha01.multiplay.co.uk (85.236.96.26): icmp_seq=188 ttl=54 time=96.0 ms
64 bytes from ha01.multiplay.co.uk (85.236.96.26): icmp_seq=189 ttl=54 time=21.8 ms
64 bytes from ha01.multiplay.co.uk (85.236.96.26): icmp_seq=190 ttl=54 time=16.8 ms
64 bytes from ha01.multiplay.co.uk (85.236.96.26): icmp_seq=191 ttl=54 time=17.9 ms
64 bytes from ha01.multiplay.co.uk (85.236.96.26): icmp_seq=192 ttl=54 time=17.8 ms
64 bytes from ha01.multiplay.co.uk (85.236.96.26): icmp_seq=193 ttl=54 time=17.9 ms
64 bytes from ha01.multiplay.co.uk (85.236.96.26): icmp_seq=194 ttl=54 time=16.9 ms
64 bytes from ha01.multiplay.co.uk (85.236.96.26): icmp_seq=195 ttl=54 time=16.9 ms
Is it possible to iron this out, or is my unusual extreme test just too
much?
Many thanks,
Alan
* Re: HFSC not working as expected
From: Andy Furniss @ 2014-07-06 16:42 UTC (permalink / raw)
To: lartc
Alan Goodman wrote:
> I added 'stab overhead 40 linklayer atm' to my root qdisc line since
> I am confident my connection uses LLC multiplexing. This transformed
> the hfsc based shaper to being the most accurate I have so far
> experienced. I am able to set tc upload limit by the sync speed, and
> tc downstream by the sync minus 12% which accounts for some rate
> limiting BT do on all lines (they limit downstream to 88.2% of sync
> rate).
If you have the choice of pppoa vs pppoe, why not use pppoa, so you can use overhead 10 and be more efficient for upload?
The 88.2 thing is not an atm rate; they do limit slightly below sync, but that figure is a marketing (inexact) approximation of the ip rate.
If you were really matching their rate after allowing for overheads, your incoming shaping would do nothing at all.
>
> This works great in almost every test case except 'excessive p2p'. As
> a test I configured a 9mbit RATE and upper limit m2 10mbit on my bulk
> class. I then started downloading a CentOS torrent with very high
> maximum connection limit set. I see 10mbit coming in on my ppp0
> interface however latency in my priority queue (sc umax 1412b dmax
> 20ms rate 460kbit) however my priority queue roundtrip is hitting
> 100+ms. Below is a clip from a ping session which shows what happens
> when I pause the torrent download.
Shaping from the wrong end of the bottleneck is not ideal; if you really care about latency you need to set a lower limit for bulk and a short queue length.
As you have found, hitting it hard with many connections is the worst case.
I never really got into hfsc so can't comment on that aspect, but I have in the past done a lot of shaping on BT adsl. In the early days of 288/576 it was very hard (for downstream). As speeds get higher it gets easier WRT latency - 20/60mbit vdsl2 is easy :-)
* Re: HFSC not working as expected
From: Andy Furniss @ 2014-07-06 16:49 UTC (permalink / raw)
To: lartc
Andy Furniss wrote:
> If you have the choice of pppoa vs pppoe why not use a so you can
> use overhead 10 and be more efficient for upload.
Oops - that would assume you were shaping on the actual ppp; if you were shaping on eth, then tc already sees an overhead of 14, so you can set a negative overhead.
* Re: HFSC not working as expected
From: Alan Goodman @ 2014-07-06 16:49 UTC (permalink / raw)
To: lartc
Thanks Andy,
I have been playing around a bit and may have been slightly quick to comment with regard to download... With hfsc engaged and the total limit set to 17100kbit, the actual throughput I see is closer to 14mbit for some reason.
No traffic shaping:
http://www.thinkbroadband.com/speedtest/results.html?id=140466829057682641064
hfsc->sfq perturb 10
http://www.thinkbroadband.com/speedtest/results.html?id=140466829057682641064
On 06/07/14 17:42, Andy Furniss wrote:
> If you have the choice of pppoa vs pppoe why not use a so you can use
> overhead 10 and be more efficient for upload.
>
> THe 88.2 thing is not atm rate, they do limit slightly below sync,
> but that is a marketing (inexact) approximate ip rate.
>
> If you were really matching their rate after allowing for overheads your
> incoming shaping would do nothing at all.
My understanding is that they limit the BRAS profile to 88.2% of your
downstream sync to prevent traffic backing up in the exchange links.
>> This works great in almost every test case except 'excessive p2p'. As
>> a test I configured a 9mbit RATE and upper limit m2 10mbit on my bulk
>> class. I then started downloading a CentOS torrent with very high
>> maximum connection limit set. I see 10mbit coming in on my ppp0
>> interface however latency in my priority queue (sc umax 1412b dmax
>> 20ms rate 460kbit) however my priority queue roundtrip is hitting
>> 100+ms. Below is a clip from a ping session which shows what happens
>> when I pause the torrent download.
>
> Shaping from the wrong end of the bottleneck is not ideal, if you really
> care about latency you need to set lower limit for bulk and short queue
> length.
>
> As you have found hitting hard with many connections is the worse case.
Are you saying that, in addition to setting the 10mbit upper limit, I should also set the sfq limit to, say, 25 packets?
Alan
* Re: HFSC not working as expected
From: Alan Goodman @ 2014-07-06 16:54 UTC (permalink / raw)
To: lartc
Sorry to split this out into multiple replies, just noticed the below
comment...
On 06/07/14 17:42, Andy Furniss wrote:
> If you have the choice of pppoa vs pppoe why not use a so you can use
> overhead 10 and be more efficient for upload.
The answer is I don't really know how to use PPPoA with the Business Hub modem in bridge mode... On the CentOS box I just run pppoe-setup, punch in the details, and it 'just works'.
Alan
* Re: HFSC not working as expected
From: Andy Furniss @ 2014-07-06 20:42 UTC (permalink / raw)
To: lartc
Alan Goodman wrote:
> Thanks Andy,
>
> I have been playing around a bit and may have been slightly quick to
> comment in regard of download... With hfsc engaged and total limit
> set to 17100kbit the actual throughput I see is closer to 14mbit for
> some reason.
>
> No traffic shaping:
> http://www.thinkbroadband.com/speedtest/results.html?id=140466829057682641064
>
>
> hfsc->sfq perturb 10
> http://www.thinkbroadband.com/speedtest/results.html?id=140466829057682641064
>
Wrong link - but that's data throughput, which is < ip throughput, which is < atm-level throughput. Assuming you are using stab when setting the 17100kbit, 14mbit data throughput is only a bit below what's expected.
I assume your mtu is 1492, and also that you have the default linux setting of tcp timestamps on (costs 12 bytes), so with tcp + ip headers there are only 1440 bytes of data per packet, each of which, after allowing for ppp/aal5 overheads, will probably use 32 cells = 1696 bytes.
1440 / 1696 = 0.85; 0.85 * 17.1 = 14.5.
I am not sure what overhead you should add with stab for your pppoe, as tc already sees eth as ip + 14 - maybe adding 40 is too much and you are getting 33 cells per packet.
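Andy's arithmetic can be checked with a quick shell sketch (illustrative only; it assumes mtu 1492, tcp timestamps on, and 40 bytes of pppoe/llc overhead, integer math throughout):

```shell
#!/bin/sh
mtu=1492                            # pppoe mtu
hdrs=52                             # 20 ip + 20 tcp + 12 tcp timestamps
data=$(( mtu - hdrs ))              # tcp payload per full packet
cells=$(( (mtu + 40 + 47) / 48 ))   # ceiling to whole 48-byte ATM cell payloads
wire=$(( cells * 53 ))              # atm bytes on the wire per packet
goodput=$(( 17100 * data / wire ))  # data rate at a 17100kbit shaped atm rate
echo "data=$data wire=$wire goodput=${goodput}kbit"
```

With these assumptions it prints data=1440 wire=1696 goodput=14518kbit, i.e. roughly the 14.5mbit Andy estimates.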
> On 06/07/14 17:42, Andy Furniss wrote:
>> If you have the choice of pppoa vs pppoe why not use a so you can
>> use overhead 10 and be more efficient for upload.
>>
>> THe 88.2 thing is not atm rate, they do limit slightly below sync,
>> but that is a marketing (inexact) approximate ip rate.
>>
>> If you were really matching their rate after allowing for overheads
>> your incoming shaping would do nothing at all.
>
> My understanding is that they limit the BRAS profile to 88.2% of your
> downstream sync to prevent traffic backing up in the exchange
> links.
They do, but they also call it the "IP Profile", so in addition to limiting slightly below the sync rate, they are also allowing for atm overheads in presenting the figure that they do.
>>> This works great in almost every test case except 'excessive
>>> p2p'. As a test I configured a 9mbit RATE and upper limit m2
>>> 10mbit on my bulk class. I then started downloading a CentOS
>>> torrent with very high maximum connection limit set. I see
>>> 10mbit coming in on my ppp0 interface however latency in my
>>> priority queue (sc umax 1412b dmax 20ms rate 460kbit) however my
>>> priority queue roundtrip is hitting 100+ms. Below is a clip from
>>> a ping session which shows what happens when I pause the torrent
>>> download.
>>
>> Shaping from the wrong end of the bottleneck is not ideal, if you
>> really care about latency you need to set lower limit for bulk and
>> short queue length.
>>
>> As you have found hitting hard with many connections is the worse
>> case.
>
> Are you saying that in addition to setting the 10mbit upper limit I
> should also set sfq limit to say 25 packets?
Well, it's quite a fast link; maybe 25 is too short - I would test. IIRC 128 is the default for sfq.
Thinking more about it, there could be other reasons for the latency you saw.
As I said, I don't know HFSC, but I notice on both your setups you give very little bandwidth to "syn ack rst". I assume ack here means you classified by length to catch empty (s)acks, as almost every packet has ack set. Personally I would give those a lower prio than time critical, and you should be aware that on a highly asymmetric 20:1 adsl line they can eat a fair bit of your upstream (2 cells each; 1 for every 2 incoming in the best case, 1 per incoming in recovery after loss).
When using htb years ago, I found that latency was better if I way over-allocated bandwidth to my interactive class and gave the bulks a low rate so they had to borrow.
* Re: HFSC not working as expected
From: Alan Goodman @ 2014-07-06 22:18 UTC (permalink / raw)
To: lartc
Hi Andy/all,
Thanks for your further useful input.
On 06/07/14 21:42, Andy Furniss wrote:
> Wrong link - but that's data throughput which is < ip throughput which
> is < atm level throughput. Assuming you are using stab when setting the
> 17100kbit 14mbit date throughput is only a bit below expected.
>
> I assume your mtu is 1492, also assuming you have default linux settings
> of tcptimestamps on (costs 12 bytes), so with tcp + ip headers there is
> only 1440 bytes data /packet each of which after allowing for ppp/aal5
> overheads will probably use 32 cells = 1696 bytes.
>
> 1440 / 1696 = 0.85 * 17.1 = 14.5.
>
> I am not sure what overhead you should add with stab for your pppoe as
> tc already sees eth as ip + 14 - maybe adding 40 is too much and you are
> getting 33 cells per packet.
Sorry regarding the link mistake. What you would have seen, had I linked correctly, is that the connection without traffic shaping manages around 16.4mbit. With traffic shaping, the upper limit set to 17100kbit, and stab overhead 40, I only see around 14.5mbit - which means we are likely being overly conservative?
My download shaping is done on the outbound leg from the server... Traffic flows ADSL -> router in bridge mode -> eth0 -> ppp0 -> eth1 -> switch -> client device. My download shaping occurs on the eth1 device.
Quickly relating to mtu: by default CentOS has CLAMPMSS enabled on the pppoe connection, set to 1412 bytes. The MTU of the connection is 1492 according to ifconfig.
> Well, it's quite a fast link, maybe 25 is too short - I would test IIRC
> 128 is default for sfq.
I have experimented with loads of sfq limit settings now, and really it seems to make little difference, so I've decided to leave this at the default of 128 for now.
> Thinking more about it, there could be other reasons that you got the
> latency you saw.
>
> As I said I don't know HFSC, but I notice in both your setups you give
> very little bandwidth to "syn ack rst". I assume ack here means you
> classified by length to get empty (s)acks, as almost every packet has ack
> set. Personally I would give those a lower prio than time critical, and
> you should be aware that on a highly asymmetric 20:1 adsl line they can
> eat a fair bit of your upstream (2 cells each, 1 for every 2 incoming
> best case, 1 per incoming in recovery after loss).
That's a very valid point. I have decided to roll class 10 and class 11
together for the time being, which should mean time critical + syn/ack
etc. get roughly 50% of upload capacity.
I've been playing around with my worst case bittorrent scenario some
more. While troubleshooting I decided to set ul 15000kbit on the
download class 1:13 (which the torrent hits). With the torrent using
around 200 flows I immediately saw latency in the priority queue within
acceptable limits. So I thought bingo - perhaps I set my class 1:2 upper
limit too high overall? So I reduced 17100kbit to 15000kbit, adjusted
the sc rates so that the total was 15000kbit, deleted the upper limit on
class 1:13 and reloaded the rules. Unfortunately this behaved exactly
like the 17100kbit upper limit latency-wise. I don't understand why that
is. Could this be the crux of my issues - some hfsc misunderstanding?
While I had ul 15000kbit set on the bulk class I also played around with
getting traffic to hit the 'interactive' 1:12 class. I found this caused
similar overall behaviour to when I didn't limit the bulk class at all -
roundtrip hitting 100+ms. It's as though it doesn't like hitting the
branch linkshare/upper limit?
Below is my current script:
#QoS for Upload
tc qdisc del dev ppp0 root
tc qdisc add dev ppp0 stab mtu 1492 overhead 40 linklayer atm root handle 1:0 hfsc default 14
tc class add dev ppp0 parent 1:0 classid 1:1 hfsc sc rate 90mbit
tc class add dev ppp0 parent 1:1 classid 1:2 hfsc sc rate 1100kbit ul rate 1100kbit
tc class add dev ppp0 parent 1:2 classid 1:11 hfsc sc umax 1412b dmax 20ms rate 495kbit # Time critical
tc class add dev ppp0 parent 1:2 classid 1:12 hfsc sc rate 300kbit # Interactive
tc class add dev ppp0 parent 1:2 classid 1:13 hfsc sc rate 305kbit # Bulk
tc class add dev ppp0 parent 1:1 classid 1:14 hfsc sc rate 90mbit
tc qdisc add dev ppp0 parent 1:11 handle 11: sfq perturb 10
tc qdisc add dev ppp0 parent 1:12 handle 12: sfq perturb 10
tc qdisc add dev ppp0 parent 1:13 handle 13: sfq perturb 10
tc qdisc add dev ppp0 parent 1:14 handle 14: pfifo
tc filter add dev ppp0 parent 1:0 protocol ip prio 2 handle 11 fw flowid 1:11
tc filter add dev ppp0 parent 1:0 protocol ip prio 3 handle 12 fw flowid 1:12
tc filter add dev ppp0 parent 1:0 protocol ip prio 4 handle 13 fw flowid 1:13
#QoS for Download
tc qdisc del dev eth1 root
tc qdisc add dev eth1 stab overhead 40 linklayer atm root handle 1:0 hfsc default 14
tc class add dev eth1 parent 1:0 classid 1:1 hfsc sc rate 90mbit
tc class add dev eth1 parent 1:1 classid 1:2 hfsc sc rate 17100kbit ul rate 17100kbit
tc class add dev eth1 parent 1:2 classid 1:11 hfsc sc umax 1412b dmax 20ms rate 1545kbit # Time critical
tc class add dev eth1 parent 1:2 classid 1:12 hfsc sc rate 4955kbit # Interactive
tc class add dev eth1 parent 1:2 classid 1:13 hfsc sc rate 10600kbit #ul rate 15000kbit # Bulk
tc class add dev eth1 parent 1:1 classid 1:14 hfsc sc rate 90mbit
tc qdisc add dev eth1 parent 1:11 handle 11: sfq perturb 10
tc qdisc add dev eth1 parent 1:12 handle 12: sfq perturb 10
tc qdisc add dev eth1 parent 1:13 handle 13: sfq perturb 10
tc qdisc add dev eth1 parent 1:14 handle 14: pfifo
tc filter add dev eth1 parent 1:0 protocol ip prio 2 handle 11 fw flowid 1:11
tc filter add dev eth1 parent 1:0 protocol ip prio 3 handle 12 fw flowid 1:12
tc filter add dev eth1 parent 1:0 protocol ip prio 4 handle 13 fw flowid 1:13
A quick note regarding class 1:14... Class 1:14 only gets traffic which
I don't mark in iptables. My iptables ruleset guarantees that all
traffic destined for the internet leaving via ppp0, and all traffic
destined for a machine inside the NAT leaving via eth1, which hasn't
already been marked as more important, gets marked 13. Therefore traffic
not marked must be source local subnet, destination local subnet. Hope
this makes sense!
Alan
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: HFSC not working as expected
From: Andy Furniss @ 2014-07-06 22:24 UTC (permalink / raw)
To: lartc
Alan Goodman wrote:
> http://www.thinkbroadband.com/speedtest/results.html?id=140466829057682641064
A further thought, having just reminded myself how long the TBB speed
test upload takes -
If you were using sfq on upload, it looks like the spikes correspond well
with perturb 10. Historically sfq's hash was quite weak, so perturb was
useful, but it was changed to jhash ages ago so it is less needed, or at
least not at that short an interval, as it causes packet reordering
leading to a bit of disruption.
If you weren't using sfq for up then ignore the above :-) - it's possible
that I am out of date WRT sfq reordering and it has been fixed.
* Re: HFSC not working as expected
From: Alan Goodman @ 2014-07-07 0:01 UTC (permalink / raw)
To: lartc
On 06/07/14 23:24, Andy Furniss wrote:
> Alan Goodman wrote:
>
>> http://www.thinkbroadband.com/speedtest/results.html?id=140466829057682641064
>>
>
> A further thought, having just reminded myself how long the TBB speed
> test upload takes -
>
> If you were using sfq on upload, it looks like the spikes correspond well
> with perturb 10. Historically sfq's hash was quite weak, so perturb was
> useful, but it was changed to jhash ages ago so it is less needed, or at
> least not at that short an interval, as it causes packet reordering
> leading to a bit of disruption.
>
> If you weren't using sfq for up then ignore the above :-) - it's possible
> that I am out of date WRT sfq reordering and it has been fixed.
I had also noticed the rhythmic nature of the ping roundtrip. I have
tried removing perturb; it doesn't appear to make much difference on
either front.
My htb-based script also dies in a fiery mess under the excessive p2p test.
I have three gut feelings at the moment:
Firstly, with massive flow counts - at least for bittorrent - the sheer
number of small packets being uploaded is effectively saturating the upload.
Secondly, as flow counts continue to increase, upload is becoming
insufficient to keep up with the rate of small packets, which is causing
the massive latency spikes I am seeing.
Thirdly, my classifier might be missing some of the important ACK
packets... Should packets like the one listed below be getting prioritised?
IN=eth1 OUT= MAC=00:a0:c9:81:c9:d5:f0:de:f1:f8:83:52:08:00
SRC=192.168.25.41 DST=5.9.123.39 LEN=52 TOS=0x00 PREC=0x00 TTL=64
ID=54864 DF PROTO=TCP SPT=51413 DPT=39415 WINDOW=1379 RES=0x00 ACK URGP=0
Many thanks everybody for all the extremely valuable input received so
far, I think we're just about on the home straight now!
Alan
* Re: HFSC not working as expected
From: Michal Soltys @ 2014-07-07 9:54 UTC (permalink / raw)
To: lartc
On 2014-07-06 17:34, Alan Goodman wrote:
>
> Is it possible to iron this out, or is my unusual extreme test just too
> much?
>
Certainly - I have a 24/7 torrent with uplink limits done solely by hfsc,
so it's definitely possible; I can't really tell whether my torrent is
even on or off (core dumped ;) ). I have a few extra patches, though
they should make little/no difference (especially at those speeds).
Suggestions about your uplink rules (word wrapping disabled to make them
more readable):
> tc class add dev ppp0 parent 1:0 classid 1:1 hfsc sc rate 100mbit ul rate 100mbit
> tc class add dev ppp0 parent 1:1 classid 1:2 hfsc sc rate 900kbit ul rate 900kbit
Unless the above is a typo, this makes no sense for a ppp0 interface. You
should be covering the speed your uplink actually syncs at. If it, say,
synchronizes at 1112248 bit/s (with some variation, but e.g. never lower
than 1100000), set:
tc qdisc add dev ppp0 root handle 1:0 hfsc stab overhead 40 linklayer atm default 14
tc class add dev ppp0 parent 1:0 classid 1:1 hfsc ls m2 1100000 ul m2 1100000
Now why ls? sc is shorthand for ls+rt, and rt functions only on leaf
classes with qdiscs attached (and outside the class hierarchy). ul limits
the speed at which ls can send packets. ls is also only relative and
makes child classes send at ratios proportional to their values, e.g.
A 100mbit, B 200mbit on a 10mbit interface would mean that hfsc sends
data from those classes in a 1:2 ratio - not try to send 300mbit total
there (that would happen /if/ it was 'rt' and A & B were leaves).
Remaining part (just an example):
tc class add dev ppp0 parent 1:1 classid 1:10 hfsc sc m2 100kbit #syn ack rst
tc class add dev ppp0 parent 1:1 classid 1:11 hfsc sc m1 500kbit d 20ms m2 300kbit # Time critical
tc class add dev ppp0 parent 1:1 classid 1:12 hfsc sc m2 200kbit #Interactive
tc class add dev ppp0 parent 1:1 classid 1:13 hfsc sc m2 100kbit #bulk
tc class add dev ppp0 parent 1:1 classid 1:14 hfsc sc m1 100kbit d 20ms m2 300kbit # torrent and not-classified junk
'rt' sums to 1mbit; the implied 'ls' will cover the remaining bandwidth
proportionally.
Unless you have special needs (e.g. killing speed for some customer
under some hierarchy subtree), avoid using 'ul' on anything but the
uppermost class.
Note: you don't have to use sc - you can use rt and ls separately - as
long as they make sense w.r.t. each other. In many situations you don't
really need precise and large 'rt' values when 'ls' can nicely cover the
rest.
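A concrete leaf illustrating that last point, with separate rt and ls curves - the classid and numbers here are purely hypothetical, not a tuned configuration:

```shell
# Leaf 1:15 is guaranteed 300kbit (rt) regardless of siblings, but
# competes for spare bandwidth with an ls weight of 600kbit - i.e. it
# gets twice the excess share of a sibling configured with ls m2 300kbit.
tc class add dev ppp0 parent 1:1 classid 1:15 hfsc rt m2 300kbit ls m2 600kbit
```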
* Re: HFSC not working as expected
From: Michal Soltys @ 2014-07-07 9:58 UTC (permalink / raw)
To: lartc
On 2014-07-07 11:54, Michal Soltys wrote:
One more thing - pppd sets the queue length to 3 by default on the
interfaces it creates. If you don't override it with ip's txqueuelen
/and/ you use a qdisc which inherits that value (e.g. the default
pfifo_fast), then you can end up with a rather tiny queue.
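A minimal guard against that - the value 128 here is an arbitrary example, not a recommendation:

```shell
# Raise pppd's default txqueuelen of 3 before attaching a qdisc that
# inherits it (e.g. pfifo_fast); harmless if a larger value is already set.
ip link set dev ppp0 txqueuelen 128
```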
* Re: HFSC not working as expected
From: Michal Soltys @ 2014-07-07 10:08 UTC (permalink / raw)
To: lartc
On 2014-07-06 18:42, Andy Furniss wrote:
> Alan Goodman wrote:
>
>> I added 'stab overhead 40 linklayer atm' to my root qdisc line since
>> I am confident my connection uses LLC multiplexing. This transformed
>> the hfsc based shaper to being the most accurate I have so far
>> experienced. I am able to set tc upload limit by the sync speed, and
>> tc downstream by the sync minus 12% which accounts for some rate
>> limiting BT do on all lines (they limit downstream to 88.2% of sync
>> rate).
>
> If you have the choice of pppoa vs pppoe, why not use pppoa, so you can
> use overhead 10 and be more efficient for upload.
>
For the record - choosing PPPoE usually allows switching [very cheap]
adsl routers into bridge mode, avoiding their connection tracking
limits altogether (great for torrents) and easily getting the public IP
onto the linux machine.
* Re: HFSC not working as expected
From: Michal Soltys @ 2014-07-07 10:10 UTC (permalink / raw)
To: lartc
On 2014-07-07 11:58, Michal Soltys wrote:
> On 2014-07-07 11:54, Michal Soltys wrote:
>
> One more thing - pppd sets queue length to 3 by default on created
> interface. If you don't override with ip's txqueuelen /and/ use some
> qdisc which inherits that value (e.g. default pfifo_fast) - then you can
> end with rather tiny queue length.
>
In other words, make sure each qdisc attached to each leaf class has a
queue length sensible for its function.
* Re: HFSC not working as expected
From: Alan Goodman @ 2014-07-07 10:59 UTC (permalink / raw)
To: lartc
Once again thank you for your extremely useful input. I feel I am
finally on the home straight!
I am going to experiment with the details you provided below...
My reasoning regarding the 100mbit class is that in some cases I have
systems which classify VPN traffic before it is transmitted. With HTB I
avoid accidentally shaping the encrypted traffic by not having a default
queue configured. Since all traffic is accounted for with netfilter
marks, and that traffic passes through the queues, this works correctly.
When I came to hfsc I was shocked to find that hfsc DROPS all traffic
which isn't accounted for in a queue, so what you see is my attempt at
emulating the 'old' behaviour. Can you think of a better way to
accomplish this?
I am really confused about how I should set my downstream stab...
BT limits downstream throughput at the atm level to 88.2% of sync
speed. Therefore with a 19560kbit sync I would expect to see 17251kbit
throughput (atm). This matches up with what I see in the real world.
My download shaping is on the upload leg from my router to the network,
though. I am confident I need stab, so that the minimum atm cell size is
accounted for on small packets, but I am a bit lost over what to set
overhead and speed to. Overhead 0 with an overall rate of 17100kbit
results in only 14.4mbit peak throughput observed. Overhead 0 with an
overall rate of 19000kbit sees around 16.2mbit flowing - which is about
perfect. Except BT limit overall downstream throughput to 88.2% of sync,
which means the 19000kbit figure doesn't make sense to me? :-S
Alan
On 07/07/14 10:54, Michal Soltys wrote:
> On 2014-07-06 17:34, Alan Goodman wrote:
>
>> Is it possible to iron this out, or is my unusual extreme test just too
>> much?
>>
> Certainly, I have 24/7 torrent with uplink limits done solely by hfsc,
> so it's certainly possible - I can't really tell if my torrent is even
> on or off (core dumped ;) ). I have few extra patches, though
> they should make little/no difference (at those speeds especially).
>
> Suggestions about your uplink rules (disabled word wrapping to make it
> more readable):
>
>> tc class add dev ppp0 parent 1:0 classid 1:1 hfsc sc rate 100mbit ul rate 100mbit
>> tc class add dev ppp0 parent 1:1 classid 1:2 hfsc sc rate 900kbit ul rate 900kbit
> Unless the above is a typo, this makes no sense for ppp0 interface. You
> should be covering the speed for what your uplink sync is. If it say
> synchronizes at 1112248 bit/s (with some variation, but e.g. never lower
> than 1100000), set
>
> tc qdisc add dev ppp0 root handle 1:0 hfsc stab overhead 40 linklayer atm default 14
> tc class add dev ppp0 parent 1:0 classid 1:1 hfsc ls m2 1100000 ul m2 1100000
>
> Now why ls ? sc is shorthand for ls+rt, and rt functions only on leaf
> classes with qdiscs attached (and outside class hierarchy). ul limits
> the speed at which ls can send packets. ls is also relative only and
> makes child classes send at ratio proportional to their values, e.g.
>
> A 100mbit, B 200mbit on 10mbit interface would mean that hfsc would send
> data from those classes in 1:2 ratio - not try to send 300 mbit total
> there (that would happen /if/ it was 'rt' and A & B were leaves).
>
> Remaining part (just an example):
>
> tc class add dev ppp0 parent 1:1 classid 1:10 hfsc sc m2 100kbit #syn ack rst
> tc class add dev ppp0 parent 1:1 classid 1:11 hfsc sc m1 500kbit d 20ms m2 300kbit # Time critical
> tc class add dev ppp0 parent 1:1 classid 1:12 hfsc sc m2 200kbit #Interactive
> tc class add dev ppp0 parent 1:1 classid 1:13 hfsc sc m2 100kbit #bulk
> tc class add dev ppp0 parent 1:1 classid 1:14 hfsc sc m1 100kbit d 20ms m2 300kbit # torrent and not-classified junk
>
> 'rt' sums to 1mbit, implied 'ls' will cover remaining bandwidth
> proportionally.
>
> Unless you have special needs (aka killing speed for e.g. some customer
> under some hierarchy subtree), avoid using 'ul' on anything but uppermost
> class.
>
> Note: you don't have to use sc - you can use rt and ls separately - as
> long as they make sense w.r.t. each other. In many situations, you don't
> really need that precise and large 'rt' values when 'ls' can nicely
> cover the rest.
* Re: HFSC not working as expected
From: Alan Goodman @ 2014-07-07 15:38 UTC (permalink / raw)
To: lartc
On 07/07/14 10:54, Michal Soltys wrote:
> tc qdisc add dev ppp0 root handle 1:0 hfsc stab overhead 40 linklayer atm default 14
> tc class add dev ppp0 parent 1:0 classid 1:1 hfsc ls m2 1100000 ul m2 1100000
> tc class add dev ppp0 parent 1:1 classid 1:10 hfsc sc m2 100kbit #syn ack rst
> tc class add dev ppp0 parent 1:1 classid 1:11 hfsc sc m1 500kbit d 20ms m2 300kbit # Time critical
> tc class add dev ppp0 parent 1:1 classid 1:12 hfsc sc m2 200kbit #Interactive
> tc class add dev ppp0 parent 1:1 classid 1:13 hfsc sc m2 100kbit #bulk
> tc class add dev ppp0 parent 1:1 classid 1:14 hfsc sc m1 100kbit d 20ms m2 300kbit # torrent and not-classified junk
Hi,
Please note that the below relates to the updated traffic classes
detailed within this email...
I have been spending further time testing this morning/afternoon. I
noticed that when the download was busy and I started an upload, the
download speed fluctuated up and down in a rhythmic pattern. I theorise
that this is likely caused by the download's ACK packets (which are
small and categorised the same as the download traffic) not getting
sent. During a download at line rate with no downstream traffic shaping
enabled I see around 350kbit/sec flowing 'into' eth1 on the server.
This will be traffic that is destined for the internet.
I am categorising the traffic as follows (if a rule matches, we stop
processing further rules):
Destination port 80, packet length 0-512 bytes: class 1:12
Destination port 80, packet length 512+, AND bytes out LESS than 5MB: class 1:12
Destination port 80, packet length 512+, AND bytes out MORE than 5MB: class 1:13
The bytes-out rules look in conntrack to see how much data has flowed in
that connection since it started.
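For reference, rules of this shape can be expressed with iptables' length and connbytes matches. This is my hypothetical reconstruction, not the actual ruleset - the ULQOS chain name and marks 12/13 are taken from the script later in the mail, and the 5MB threshold is assumed to be 5242880 bytes:

```shell
# Short port-80 packets -> interactive (mark 12)
iptables -t mangle -A ULQOS -p tcp --dport 80 -m length --length 0:512 -j MARK --set-mark 12
# Long port-80 packets on connections under 5MB so far -> interactive
iptables -t mangle -A ULQOS -p tcp --dport 80 -m length --length 513:65535 \
    -m connbytes --connbytes 0:5242880 --connbytes-dir original --connbytes-mode bytes \
    -j MARK --set-mark 12
# Long port-80 packets on connections over 5MB -> bulk (mark 13)
iptables -t mangle -A ULQOS -p tcp --dport 80 -m length --length 513:65535 \
    -m connbytes --connbytes 5242881 --connbytes-dir original --connbytes-mode bytes \
    -j MARK --set-mark 13
# "Stop at first match": bail out of the chain once something is marked.
iptables -t mangle -A ULQOS -m mark ! --mark 0 -j RETURN
```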
I have confirmed that the upstream packets relating to this download are
being categorised by the first rule; they show as 52 bytes long and have
the ACK flag set.
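As a sanity check on those numbers - this is only a rough estimate, assuming ~2.0 MiB/s of download, 1448-byte data segments and one delayed ACK per two segments, none of which are measured values:

```shell
# Estimate upstream cost of pure ACKs generated by a ~2 MiB/s download.
# 52-byte ACK at IP level; ~2 ATM cells (106 wire bytes) each after
# ppp/aal5 overhead.
awk 'BEGIN {
    segs = 2.0 * 1048576 / 1448       # data segments arriving per second
    acks = segs / 2                   # delayed ACK: one per two segments
    printf "acks/s=%.0f ip_kbit=%.0f atm_kbit=%.0f\n",
           acks, acks * 52 * 8 / 1000, acks * 106 * 8 / 1000
}'
```

The IP-level figure lands close to the ~350kbit/sec I observe flowing into eth1, while the ATM-level cost on the uplink is considerably higher.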
With downstream traffic management disabled and the upstream idle I see
a steady line-rate transfer on my download... (Output is wget
--progress=dot:mega on a massive file hosted on a fast server.)
24576K ........ ........ ........ ........ ........ ........ 0% 2.03M 53m9s
27648K ........ ........ ........ ........ ........ ........ 0% 2.03M 51m16s
30720K ........ ........ ........ ........ ........ ........ 0% 2.00M 49m46s
33792K ........ ........ ........ ........ ........ ........ 0% 2.03M 48m28s
36864K ........ ........ ........ ........ ........ ........ 0% 2.03M 47m23s
However, once I start an sftp upload (which is categorised 1:13) my
download speed becomes erratic, with rhythmic slowdowns:
86016K ........ ........ ........ ........ ........ ........ 2% 2.00M 40m20s
89088K ........ ........ ........ ........ ........ ........ 2% 2.05M 40m5s
92160K ........ ........ ........ ........ ........ ........ 2% 2.03M 39m52s
95232K ........ ........ ........ ........ ........ ........ 2% 2.00M 39m41s
98304K ........ ........ ........ ........ ........ ........ 2% 1.21M 40m11s
101376K ........ ........ ........ ........ ........ ........ 2% 1.20M 40m41s
104448K ........ ........ ........ ........ ........ ........ 2% 1.39M 40m55s
107520K ........ ........ ........ ........ ........ ........ 2% 1.66M 40m55s
110592K ........ ........ ........ ........ ........ ........ 2% 1.66M 40m54s
113664K ........ ........ ........ ........ ........ ........ 2% 1.58M 40m57s
116736K ........ ........ ........ ........ ........ ........ 2% 1.75M 40m53s
119808K ........ ........ ........ ........ ........ ........ 2% 1.73M 40m50s
122880K ........ ........ ........ ........ ........ ........ 2% 1.86M 40m42s
125952K ........ ........ ........ ........ ........ ........ 2% 1.94M 40m33s
129024K ........ ........ ........ ........ ........ ........ 3% 1.89M 40m26s
132096K ........ ........ ........ ........ ........ ........ 3% 1.71M 40m24s
135168K ........ ........ ........ ........ ........ ........ 3% 1.70M 40m22s
138240K ........ ........ ........ ........ ........ ........ 3% 1.54M 40m26s
141312K ........ ........ ........ ........ ........ ........ 3% 1.17M 40m47s
144384K ........ ........ ........ ........ ........ ........ 3% 1.21M 41m5s
147456K ........ ........ ........ ........ ........ ........ 3% 1.54M 41m8s
150528K ........ ........ ........ ........ ........ ........ 3% 1.80M 41m2s
153600K ........ ........ ........ ........ ........ ........ 3% 1.77M 40m58s
156672K ........ ........ ........ ........ ........ ........ 3% 1.89M 40m51s
159744K ........ ........ ........ ........ ........ ........ 3% 1.80M 40m46s
162816K ........ ........ ........ ........ ........ ........ 3% 1.62M 40m45s
165888K ........ ........ ........ ........ ........ ........ 3% 1.84M 40m40s
168960K ........ ........ ........ ........ ........ ........ 3% 1.92M 40m32s
172032K ........ ........ ........ ........ ........ ........ 4% 1.71M 40m30s
175104K ........ ........ ........ ........ ........ ........ 4% 1.56M 40m32s
178176K ........ ........ ........ ........ ........ ........ 4% 1.70M 40m29s
181248K ........ ........ ........ ........ ........ ........ 4% 1.31M 40m39s
184320K ........ ........ ........ ........ ........ ........ 4% 1.21M 40m53s
187392K ........ ........ ........ ........ ........ ........ 4% 1.26M 41m4s
190464K ........ ........ ........ ........ ........ ........ 4% 1.74M 41m0s
193536K ........ ........ ........ ........ ........ ........ 4% 1.75M 40m56s
196608K ........ ........ ........ ........ ........ ........ 4% 1.87M 40m50s
199680K ........ ........ ........ ........ ........ ........ 4% 1.91M 40m43s
202752K ........ ........ ........ ........ ........ ........ 4% 1.89M 40m37s
205824K ........ ........ ........ ........ ........ ........ 4% 1.92M 40m30s
208896K ........ ........ ........ ........ ........ ........ 4% 1.96M 40m23s
Here is my tc structure:
#QoS for Upload
tc qdisc del dev ppp0 root
tc qdisc add dev ppp0 stab overhead 42 linklayer atm mtu 1458 root handle 1:0 hfsc default 14
tc class add dev ppp0 parent 1:0 classid 1:1 hfsc sc rate 98mbit
tc class add dev ppp0 parent 1:1 classid 1:2 hfsc ls m2 1100kbit ul m2 1100kbit
tc class add dev ppp0 parent 1:2 classid 1:11 hfsc sc m1 600kbit d 20ms m2 400kbit # Time critical
tc class add dev ppp0 parent 1:2 classid 1:12 hfsc sc m2 300kbit # Interactive
tc class add dev ppp0 parent 1:2 classid 1:13 hfsc sc m2 75kbit # Bulk
tc class add dev ppp0 parent 1:1 classid 1:14 hfsc sc rate 98mbit
tc qdisc add dev ppp0 parent 1:11 handle 11: sfq # (have also tried pfifo)
tc qdisc add dev ppp0 parent 1:12 handle 12: sfq limit 2 # (have also tried "pfifo limit 2" and "bfifo limit 2916b")
tc qdisc add dev ppp0 parent 1:13 handle 13: sfq limit 2 # (have also tried "pfifo limit 2" and "bfifo limit 2916b")
tc qdisc add dev ppp0 parent 1:14 handle 14: pfifo
tc filter add dev ppp0 parent 1:0 protocol ip prio 2 handle 11 fw flowid 1:11
tc filter add dev ppp0 parent 1:0 protocol ip prio 3 handle 12 fw flowid 1:12
tc filter add dev ppp0 parent 1:0 protocol ip prio 4 handle 13 fw flowid 1:13
My method for devising the above was:
Total rt traffic: 975kbit
Time critical (voip, ping etc) guaranteed 400kbit
Leftover bandwidth shared in a 3:1 ratio. E.g. if voip is using 400kbit
there is 700kbit left; if all classes were saturated this would mean
interactive got 466kbit and bulk got 233kbit.
My testing has around 64 bytes/sec hitting the time critical queue (just
a ping packet every second). This should mean that an http download
sending 350kbit of ACKs would be able to send all of its packets,
because it should be allocated 3/4 of the remaining upload (777kbit)
with the remaining 259kbit used by the bulk upload...
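One observation of mine, not from the thread: the m2 values actually configured in the script above give interactive:bulk = 300:75, i.e. a 4:1 link-share ratio rather than 3:1. A quick sketch of the saturated split, assuming ~400kbit in use by the time-critical class (an approximation - hfsc shares among backlogged classes by the ls curves, here taken from the sc values):

```shell
# Proportional link-share split of the leftover bandwidth between
# interactive (m2 300kbit) and bulk (m2 75kbit) under saturation.
awk 'BEGIN {
    left = 1100 - 400                 # kbit left after time-critical
    w12 = 300; w13 = 75               # ls weights from the sc m2 values
    printf "interactive=%.0fkbit bulk=%.0fkbit\n",
           left * w12 / (w12 + w13), left * w13 / (w12 + w13)
}'
```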
However I see more traffic dropping in my queue 1:12:
class hfsc 1:11 parent 1:2 leaf 11: sc m1 600000bit d 20.0ms m2 400000bit
Sent 703310 bytes 4124 pkt (dropped 56, overlimits 0 requeues 0)
rate 0bit 0pps backlog 0b 0p requeues 0
period 3987 work 703310 bytes rtwork 618669 bytes level 0
class hfsc 1:12 parent 1:2 leaf 13: sc m1 0bit d 0us m2 75000bit
Sent 5553128 bytes 5024 pkt (dropped 452, overlimits 0 requeues 0)
rate 0bit 0pps backlog 0b 0p requeues 0
period 4954 work 5553128 bytes rtwork 915469 bytes level 0
class hfsc 1:13 parent 1:2 leaf 12: sc m1 0bit d 0us m2 300000bit
Sent 6719923 bytes 60826 pkt (dropped 1176, overlimits 0 requeues 0)
rate 0bit 0pps backlog 0b 0p requeues 0
period 33210 work 6719923 bytes rtwork 3253617 bytes level 0
qdisc pfifo 11: parent 1:11 limit 3p
Sent 736170 bytes 4324 pkt (dropped 56, overlimits 0 requeues 0)
qdisc bfifo 12: parent 1:12 limit 2916b
Sent 6733385 bytes 60906 pkt (dropped 1176, overlimits 0 requeues 0)
rate 0bit 0pps backlog 0b 0p requeues 0
qdisc bfifo 13: parent 1:13 limit 2916b
Sent 5554877 bytes 5030 pkt (dropped 452, overlimits 0 requeues 0)
rate 0bit 0pps backlog 0b 0p requeues 0
I can't understand why queue 12 is dropping packets and not getting 3/4
of the bandwidth share? Maybe my stab is still incorrect and all the
small packets are throwing off the calculations?
Other information you might need:
I chose limit 2 based on my desire to have no more than 60ms jitter:
(1458/110000)*4 = 0.05301818, or 53ms.
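For comparison, the serialization delay at the full 1100kbit ul comes out slightly lower; the 53ms figure above corresponds to an assumed effective rate of 110000 bytes/s (880kbit):

```shell
# Per-packet serialization delay of a 1458-byte packet at 1100kbit,
# and the worst-case wait behind four such packets.
awk 'BEGIN {
    rate = 1100000                    # bit/s
    t = 1458 * 8 / rate               # seconds per full-size packet
    printf "per_packet_ms=%.1f four_packets_ms=%.1f\n", t * 1000, 4 * t * 1000
}'
```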
Queue 1:11 is for time critical traffic. I categorise small tcp
connection-related packets (see below), VoIP, online games, ping and DNS
here.
Queue 1:12 is for interactive traffic. I categorise short ssh packets,
short dport 80, short dport 443, RDP, email, and web flows less than 5MB
but with large packets here.
Queue 1:13 is for bulk traffic. I categorise web flows over 5MB and
everything else into this queue.
TCP-related packets hitting queue 1:11 are categorised by the following
iptables rules:
iptables -t mangle -A ULQOS -p tcp --syn -m length --length 40:68 -j MARK --set-mark 11
iptables -t mangle -A ULQOS -p tcp --tcp-flags ALL SYN,ACK -m length --length 40:68 -j MARK --set-mark 11
iptables -t mangle -A ULQOS -p tcp --tcp-flags ALL RST -j MARK --set-mark 11
iptables -t mangle -A ULQOS -p tcp --tcp-flags ALL ACK,RST -j MARK --set-mark 11
iptables -t mangle -A ULQOS -p tcp --tcp-flags ALL ACK,FIN -j MARK --set-mark 11
Thanks again for your patience and support. Sorry for continuing to be a
bit of a dunce!
Alan
2014-06-26 14:39 HFSC not working as expected Alan Goodman
2014-07-01 12:25 ` Michal Soltys
2014-07-01 13:19 ` Alan Goodman
2014-07-01 13:30 ` Michal Soltys
2014-07-01 14:33 ` Alan Goodman
2014-07-03 0:12 ` Michal Soltys
2014-07-03 0:56 ` Alan Goodman
2014-07-06 1:18 ` Michal Soltys
2014-07-06 15:34 ` Alan Goodman
2014-07-06 16:42 ` Andy Furniss
2014-07-06 16:49 ` Andy Furniss
2014-07-06 16:49 ` Alan Goodman
2014-07-06 16:54 ` Alan Goodman
2014-07-06 20:42 ` Andy Furniss
2014-07-06 22:18 ` Alan Goodman
2014-07-06 22:24 ` Andy Furniss
2014-07-07 0:01 ` Alan Goodman
2014-07-07 9:54 ` Michal Soltys
2014-07-07 9:58 ` Michal Soltys
2014-07-07 10:08 ` Michal Soltys
2014-07-07 10:10 ` Michal Soltys
2014-07-07 10:59 ` Alan Goodman
2014-07-07 15:38 ` Alan Goodman