* taprio testing - Any help?
@ 2019-10-11 19:35 Murali Karicheri
  2019-10-11 20:12 ` Vinicius Costa Gomes
  0 siblings, 1 reply; 17+ messages in thread

From: Murali Karicheri @ 2019-10-11 19:35 UTC (permalink / raw)
To: netdev

Hi,

I am testing taprio (the 802.1Q Time-Aware Shaper) as part of my
pre-work to implement and test taprio hardware offload.

I was able to configure taprio on my board and am looking to run some
traffic tests, and I am wondering how to use the tc command to direct
traffic to a specific queue. For example, I have set up taprio to
create 5 traffic classes as shown below.

Now I plan to create iperf streams that pass through different gates.
How do I use tc filters to mark the packets to go through these
gates/queues? I heard about the skbedit action in tc filters, which
changes the priority field of the skb to allow the above mapping to
happen. Is there an example someone can point me to?

Here is what I have tried so far:

tc qdisc replace dev eth0 parent root handle 100 taprio \
    num_tc 5 \
    map 0 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 \
    queues 1@0 1@1 1@2 1@3 1@4 \
    base-time 1564628923967325838 \
    sched-entry S 01 4000000 \
    sched-entry S 02 4000000 \
    sched-entry S 04 4000000 \
    sched-entry S 08 4000000 \
    sched-entry S 10 4000000 \
    clockid CLOCK_TAI

root@am57xx-evm:~# tc qdisc show dev eth0
qdisc taprio 100: root refcnt 9 tc 5 map 0 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2
queues offset 0 count 1 offset 1 count 1 offset 2 count 1 offset 3 count 1 offset 4 count 1
clockid TAI offload 0 base-time 1564628923967325838 cycle-time 20000000 cycle-time-extension 0
    index 0 cmd S gatemask 0x1 interval 4000000
    index 1 cmd S gatemask 0x2 interval 4000000
    index 2 cmd S gatemask 0x4 interval 4000000
    index 3 cmd S gatemask 0x8 interval 4000000
    index 4 cmd S gatemask 0x10 interval 4000000

qdisc pfifo 0: parent 100:5 limit 1000p
qdisc pfifo 0: parent 100:4 limit 1000p
qdisc pfifo 0: parent 100:3 limit 1000p
qdisc pfifo 0: parent 100:2 limit 1000p
qdisc pfifo 0: parent 100:1 limit 1000p

Murali

^ permalink raw reply	[flat|nested] 17+ messages in thread
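The `map` and `queues` arguments in the configuration above form a simple lookup table: `map` is indexed by skb->priority and yields a traffic class, and `queues` gives each traffic class a (count, offset) range of hardware queues. The following Python sketch is an illustrative model of that lookup (my own construction, not kernel code):

```python
# Model of how taprio resolves skb->priority to a traffic class and
# queue for the configuration above. This is a hedged sketch of the
# lookup, not the kernel implementation.

PRIO_MAP = [0, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2]
TC_QUEUES = [(1, 0), (1, 1), (1, 2), (1, 3), (1, 4)]  # (count, offset) per TC

def classify(skb_priority):
    """Return (traffic_class, first_queue) for a given skb->priority."""
    tc = PRIO_MAP[skb_priority & 0xF]  # 16-entry map, priorities 0..15
    count, offset = TC_QUEUES[tc]
    return tc, offset

# skb->priority 1 maps to TC 1, which owns the single queue at offset 1.
print(classify(1))
```

Note that with this particular `map`, only priorities 0, 1 and 2 select distinct traffic classes; everything else lands in TC 2, so TC 3 and TC 4 are unreachable by priority alone.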
* Re: taprio testing - Any help?
  2019-10-11 19:35 taprio testing - Any help? Murali Karicheri
@ 2019-10-11 20:12 ` Vinicius Costa Gomes
  2019-10-11 20:56   ` Murali Karicheri
  0 siblings, 1 reply; 17+ messages in thread

From: Vinicius Costa Gomes @ 2019-10-11 20:12 UTC (permalink / raw)
To: Murali Karicheri, netdev

Hi Murali,

Murali Karicheri <m-karicheri2@ti.com> writes:

> Hi,
>
> I am testing taprio (the 802.1Q Time-Aware Shaper) as part of my
> pre-work to implement and test taprio hardware offload.
>
> I was able to configure taprio on my board and am looking to run some
> traffic tests, and I am wondering how to use the tc command to direct
> traffic to a specific queue. For example, I have set up taprio to
> create 5 traffic classes as shown below.
>
> Now I plan to create iperf streams that pass through different gates.
> How do I use tc filters to mark the packets to go through these
> gates/queues? I heard about the skbedit action in tc filters, which
> changes the priority field of the skb to allow the above mapping to
> happen. Is there an example someone can point me to?

What I have been using for testing these kinds of use cases (like
iperf) is an iptables rule that sets the priority for some kinds of
traffic.

Something like this:

sudo iptables -t mangle -A POSTROUTING -p udp --dport 7788 -j CLASSIFY --set-class 0:3

This will set the skb->priority of UDP packets matching that rule to 3.

Another alternative is to create a net_prio cgroup; the sockets created
under that hierarchy would have that priority. I don't have an example
handy for this right now, sorry.

Is this what you were looking for?


Cheers,
--
Vinicius
* Re: taprio testing - Any help?
  2019-10-11 20:12 ` Vinicius Costa Gomes
@ 2019-10-11 20:56   ` Murali Karicheri
  2019-10-11 21:26     ` Vinicius Costa Gomes
  0 siblings, 1 reply; 17+ messages in thread

From: Murali Karicheri @ 2019-10-11 20:56 UTC (permalink / raw)
To: Vinicius Costa Gomes, netdev

Hi Vinicius,

On 10/11/2019 04:12 PM, Vinicius Costa Gomes wrote:
> Hi Murali,
>
> Murali Karicheri <m-karicheri2@ti.com> writes:
>
>> Hi,
>>
>> I am testing taprio (the 802.1Q Time-Aware Shaper) as part of my
>> pre-work to implement and test taprio hardware offload.
>>
>> I was able to configure taprio on my board and am looking to run some
>> traffic tests, and I am wondering how to use the tc command to direct
>> traffic to a specific queue. For example, I have set up taprio to
>> create 5 traffic classes as shown below.
>>
>> Now I plan to create iperf streams that pass through different gates.
>> How do I use tc filters to mark the packets to go through these
>> gates/queues? I heard about the skbedit action in tc filters, which
>> changes the priority field of the skb to allow the above mapping to
>> happen. Is there an example someone can point me to?
>
> What I have been using for testing these kinds of use cases (like
> iperf) is an iptables rule that sets the priority for some kinds of
> traffic.
>
> Something like this:
>
> sudo iptables -t mangle -A POSTROUTING -p udp --dport 7788 -j CLASSIFY --set-class 0:3

Let me try this. Yes, this is what I was looking for. I was trying
something like the following and was getting an error:

tc filter add dev eth0 parent 100: protocol ip prio 10 u32 match ip dport 10000 0xffff flowid 100:3
RTNETLINK answers: Operation not supported
We have an error talking to the kernel, -1

I am not sure why the above throws an error for me. If I understand it
right, the match rule should add a filter to the parent that sends
matching packets to 100:3, which is TC3 / queue 3.

My taprio configuration is as follows:

root@am57xx-evm:~# tc qdisc show dev eth0
qdisc taprio 100: root refcnt 9 tc 5 map 0 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2
queues offset 0 count 1 offset 1 count 1 offset 2 count 1 offset 3 count 1 offset 4 count 1
clockid TAI offload 0 base-time 0 cycle-time 0 cycle-time-extension 0
base-time 1564536613194451433 cycle-time 20000000 cycle-time-extension 0
    index 0 cmd S gatemask 0x1 interval 4000000
    index 1 cmd S gatemask 0x2 interval 4000000
    index 2 cmd S gatemask 0x4 interval 4000000
    index 3 cmd S gatemask 0x8 interval 4000000
    index 4 cmd S gatemask 0x10 interval 4000000

qdisc pfifo 0: parent 100:5 limit 1000p
qdisc pfifo 0: parent 100:4 limit 1000p
qdisc pfifo 0: parent 100:3 limit 1000p
qdisc pfifo 0: parent 100:2 limit 1000p
qdisc pfifo 0: parent 100:1 limit 1000p

Thanks for your response!

> This will set the skb->priority of UDP packets matching that rule to 3.
>
> Another alternative is to create a net_prio cgroup; the sockets
> created under that hierarchy would have that priority. I don't have
> an example handy for this right now, sorry.
>
> Is this what you were looking for?
>
> Cheers,
> --
> Vinicius
* Re: taprio testing - Any help?
  2019-10-11 20:56   ` Murali Karicheri
@ 2019-10-11 21:26     ` Vinicius Costa Gomes
  2019-10-13 21:10       ` Vladimir Oltean
  0 siblings, 1 reply; 17+ messages in thread

From: Vinicius Costa Gomes @ 2019-10-11 21:26 UTC (permalink / raw)
To: Murali Karicheri, netdev

Hi,

Murali Karicheri <m-karicheri2@ti.com> writes:

> Hi Vinicius,
>
> On 10/11/2019 04:12 PM, Vinicius Costa Gomes wrote:
>> Hi Murali,
>>
>> Murali Karicheri <m-karicheri2@ti.com> writes:
>>
>>> Hi,
>>>
>>> I am testing taprio (the 802.1Q Time-Aware Shaper) as part of my
>>> pre-work to implement and test taprio hardware offload.
>>>
>>> I was able to configure taprio on my board and am looking to run some
>>> traffic tests, and I am wondering how to use the tc command to direct
>>> traffic to a specific queue. For example, I have set up taprio to
>>> create 5 traffic classes as shown below.
>>>
>>> Now I plan to create iperf streams that pass through different gates.
>>> How do I use tc filters to mark the packets to go through these
>>> gates/queues? I heard about the skbedit action in tc filters, which
>>> changes the priority field of the skb to allow the above mapping to
>>> happen. Is there an example someone can point me to?
>>
>> What I have been using for testing these kinds of use cases (like
>> iperf) is an iptables rule that sets the priority for some kinds of
>> traffic.
>>
>> Something like this:
>>
>> sudo iptables -t mangle -A POSTROUTING -p udp --dport 7788 -j CLASSIFY --set-class 0:3
>
> Let me try this. Yes, this is what I was looking for. I was trying
> something like the following and was getting an error:
>
> tc filter add dev eth0 parent 100: protocol ip prio 10 u32 match ip
> dport 10000 0xffff flowid 100:3
> RTNETLINK answers: Operation not supported
> We have an error talking to the kernel, -1

Hmm, taprio (or mqprio, for that matter) doesn't support tc filter
blocks, so this won't work for those qdiscs.

I never thought about adding support for it; it looks very interesting.
Thanks for pointing this out. I will add it to my todo list, but anyone
should feel free to beat me to it :-)


Cheers,
--
Vinicius
* Re: taprio testing - Any help?
  2019-10-11 21:26     ` Vinicius Costa Gomes
@ 2019-10-13 21:10       ` Vladimir Oltean
  2019-10-14 15:33         ` Murali Karicheri
  2019-10-14 23:14         ` Vinicius Costa Gomes
  0 siblings, 2 replies; 17+ messages in thread

From: Vladimir Oltean @ 2019-10-13 21:10 UTC (permalink / raw)
To: Vinicius Costa Gomes; +Cc: Murali Karicheri, netdev

Hi Vinicius,

On Sat, 12 Oct 2019 at 00:28, Vinicius Costa Gomes
<vinicius.gomes@intel.com> wrote:
>
> Hi,
>
> Murali Karicheri <m-karicheri2@ti.com> writes:
>
>> Hi Vinicius,
>>
>> On 10/11/2019 04:12 PM, Vinicius Costa Gomes wrote:
>>> Hi Murali,
>>>
>>> Murali Karicheri <m-karicheri2@ti.com> writes:
>>>
>>>> Hi,
>>>>
>>>> I am testing taprio (the 802.1Q Time-Aware Shaper) as part of my
>>>> pre-work to implement and test taprio hardware offload.
>>>>
>>>> I was able to configure taprio on my board and am looking to run some
>>>> traffic tests, and I am wondering how to use the tc command to direct
>>>> traffic to a specific queue. For example, I have set up taprio to
>>>> create 5 traffic classes as shown below.
>>>>
>>>> Now I plan to create iperf streams that pass through different gates.
>>>> How do I use tc filters to mark the packets to go through these
>>>> gates/queues? I heard about the skbedit action in tc filters, which
>>>> changes the priority field of the skb to allow the above mapping to
>>>> happen. Is there an example someone can point me to?
>>>
>>> What I have been using for testing these kinds of use cases (like
>>> iperf) is an iptables rule that sets the priority for some kinds of
>>> traffic.
>>>
>>> Something like this:
>>>
>>> sudo iptables -t mangle -A POSTROUTING -p udp --dport 7788 -j CLASSIFY --set-class 0:3
>>
>> Let me try this. Yes, this is what I was looking for. I was trying
>> something like the following and was getting an error:
>>
>> tc filter add dev eth0 parent 100: protocol ip prio 10 u32 match ip
>> dport 10000 0xffff flowid 100:3
>> RTNETLINK answers: Operation not supported
>> We have an error talking to the kernel, -1
>
> Hmm, taprio (or mqprio, for that matter) doesn't support tc filter
> blocks, so this won't work for those qdiscs.
>
> I never thought about adding support for it; it looks very interesting.
> Thanks for pointing this out. I will add it to my todo list, but
> anyone should feel free to beat me to it :-)
>
> Cheers,
> --
> Vinicius

What do you mean taprio doesn't support tc filter blocks? What do you
think there is to do in taprio to support that?
I don't think Murali is asking for filter offloading, but merely for a
way to direct frames to a certain traffic class on xmit from Linux.
Something like this works perfectly fine:

sudo tc qdisc add dev swp2 root handle 1: taprio num_tc 2 map 0 1 \
    queues 1@0 1@1 base-time 1000 sched-entry S 03 300000 flags 2
# Add the qdisc holding the classifiers
sudo tc qdisc add dev swp2 clsact
# Steer L2 PTP to TC 1 (see with "tc filter show dev swp2 egress")
sudo tc filter add dev swp2 egress prio 1 u32 match u16 0x88f7 0xffff \
    at -2 action skbedit priority 1

However, the clsact qdisc and tc u32 egress filter can be replaced
with proper use of the SO_PRIORITY API, which is preferable for new
applications IMO.

I'm trying to send a demo application to tools/testing/selftests/
which sends cyclic traffic through a raw L2 socket at a configurable
base-time and cycle-time, along with the accompanying scripts to set
up the receiver and bandwidth reservation on an in-between switch. But
I have some trouble getting the sender application to work reliably at
100 us cycle-time, so it may take a while until I figure out with
kernelshark what's going on.

Regards,
-Vladimir
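The SO_PRIORITY approach Vladimir mentions can be sketched as follows: the application tags its own socket, so packets carry the desired skb->priority without any clsact/u32 filter or iptables rule. This is a hedged illustration in Python rather than the raw-L2 C demo he describes; note that unprivileged processes may only set priorities 0..6 (higher values need CAP_NET_ADMIN):

```python
# Sketch of the SO_PRIORITY alternative: tag a UDP socket so that every
# packet it sends carries skb->priority == 3, which the taprio/mqprio
# priority-to-TC map then steers to the corresponding traffic class.
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_PRIORITY, 3)

# Read the option back to confirm the kernel accepted it.
print(sock.getsockopt(socket.SOL_SOCKET, socket.SO_PRIORITY))
sock.close()
```

The same call works for raw AF_PACKET sockets; the priority applies per socket, so each traffic stream simply opens its own socket with its own priority.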
* Re: taprio testing - Any help?
  2019-10-13 21:10       ` Vladimir Oltean
@ 2019-10-14 15:33         ` Murali Karicheri
  2019-10-14 16:18           ` taprio testing with multiple streams Murali Karicheri
  2019-10-14 23:39           ` taprio testing - Any help? Vinicius Costa Gomes
  1 sibling, 2 replies; 17+ messages in thread

From: Murali Karicheri @ 2019-10-14 15:33 UTC (permalink / raw)
To: Vladimir Oltean, Vinicius Costa Gomes; +Cc: netdev

On 10/13/2019 05:10 PM, Vladimir Oltean wrote:
> Hi Vinicius,
>
> On Sat, 12 Oct 2019 at 00:28, Vinicius Costa Gomes
> <vinicius.gomes@intel.com> wrote:
>>
>> Hi,
>>
>> Murali Karicheri <m-karicheri2@ti.com> writes:
>>
>>> Hi Vinicius,
>>>
>>> On 10/11/2019 04:12 PM, Vinicius Costa Gomes wrote:
>>>> Hi Murali,
>>>>
>>>> Murali Karicheri <m-karicheri2@ti.com> writes:
>>>>
>>>>> Hi,
>>>>>
>>>>> I am testing taprio (the 802.1Q Time-Aware Shaper) as part of my
>>>>> pre-work to implement and test taprio hardware offload.
>>>>>
>>>>> I was able to configure taprio on my board and am looking to run some
>>>>> traffic tests, and I am wondering how to use the tc command to direct
>>>>> traffic to a specific queue. For example, I have set up taprio to
>>>>> create 5 traffic classes as shown below.
>>>>>
>>>>> Now I plan to create iperf streams that pass through different gates.
>>>>> How do I use tc filters to mark the packets to go through these
>>>>> gates/queues? I heard about the skbedit action in tc filters, which
>>>>> changes the priority field of the skb to allow the above mapping to
>>>>> happen. Is there an example someone can point me to?
>>>>
>>>> What I have been using for testing these kinds of use cases (like
>>>> iperf) is an iptables rule that sets the priority for some kinds of
>>>> traffic.
>>>>
>>>> Something like this:
>>>>
>>>> sudo iptables -t mangle -A POSTROUTING -p udp --dport 7788 -j CLASSIFY --set-class 0:3
>>>
>>> Let me try this. Yes, this is what I was looking for. I was trying
>>> something like the following and was getting an error:
>>>
>>> tc filter add dev eth0 parent 100: protocol ip prio 10 u32 match ip
>>> dport 10000 0xffff flowid 100:3
>>> RTNETLINK answers: Operation not supported
>>> We have an error talking to the kernel, -1
>>
>> Hmm, taprio (or mqprio, for that matter) doesn't support tc filter
>> blocks, so this won't work for those qdiscs.
>>
>> I never thought about adding support for it; it looks very interesting.
>> Thanks for pointing this out. I will add it to my todo list, but
>> anyone should feel free to beat me to it :-)
>>
>> Cheers,
>> --
>> Vinicius
>
> What do you mean taprio doesn't support tc filter blocks? What do you
> think there is to do in taprio to support that?
> I don't think Murali is asking for filter offloading, but merely for a
> way to direct frames to a certain traffic class on xmit from Linux.

Yes. Thanks, Vladimir, for clarifying this.

> Something like this works perfectly fine:
>
> sudo tc qdisc add dev swp2 root handle 1: taprio num_tc 2 map 0 1
> queues 1@0 1@1 base-time 1000 sched-entry S 03 300000 flags 2
> # Add the qdisc holding the classifiers
> sudo tc qdisc add dev swp2 clsact

Maybe that is the step I am missing. Is this a required step to enable
the classifier?

> # Steer L2 PTP to TC 1 (see with "tc filter show dev swp2 egress")
> sudo tc filter add dev swp2 egress prio 1 u32 match u16 0x88f7 0xffff
> at -2 action skbedit priority 1

Yes. Perfect, if this can be done currently.

> However, the clsact qdisc and tc u32 egress filter can be replaced
> with proper use of the SO_PRIORITY API, which is preferable for new
> applications IMO.

I have used this with success. But in our regular release flow, we
would like to use standard tools like tc, or something like that, to
direct packets to the appropriate queues/TCs.
> I'm trying to send a demo application to tools/testing/selftests/
> which sends cyclic traffic through a raw L2 socket at a configurable
> base-time and cycle-time, along with the accompanying scripts to set
> up the receiver and bandwidth reservation on an in-between switch. But
> I have some trouble getting the sender application to work reliably at
> 100 us cycle-time, so it may take a while until I figure out with
> kernelshark what's going on.

That will be great!

Using taprio, I am trying the testcase below and expect traffic for
4 msec from a specific traffic class (port, in this case), followed by
traffic from the others based on the taprio schedule. But a wireshark
capture shows inter-mixed traffic on the wire.

For example, I set up my taprio as follows:

tc qdisc replace dev eth0 parent root handle 100 taprio \
    num_tc 5 \
    map 0 1 2 3 4 4 4 4 4 4 4 4 4 4 4 4 \
    queues 1@0 1@1 1@2 1@3 1@4 \
    base-time 1564768921123459533 \
    sched-entry S 01 4000000 \
    sched-entry S 02 4000000 \
    sched-entry S 04 4000000 \
    sched-entry S 08 4000000 \
    sched-entry S 10 4000000 \
    clockid CLOCK_TAI

My taprio schedule shows a 20 msec cycle-time, as below:

root@am57xx-evm:~# tc qdisc show dev eth0
qdisc taprio 100: root refcnt 9 tc 5 map 0 1 2 3 4 4 4 4 4 4 4 4 4 4 4 4
queues offset 0 count 1 offset 1 count 1 offset 2 count 1 offset 3 count 1 offset 4 count 1
clockid TAI offload 0 base-time 0 cycle-time 0 cycle-time-extension 0
base-time 1564768921123459533 cycle-time 20000000 cycle-time-extension 0
    index 0 cmd S gatemask 0x1 interval 4000000
    index 1 cmd S gatemask 0x2 interval 4000000
    index 2 cmd S gatemask 0x4 interval 4000000
    index 3 cmd S gatemask 0x8 interval 4000000
    index 4 cmd S gatemask 0x10 interval 4000000

qdisc pfifo 0: parent 100:5 limit 1000p
qdisc pfifo 0: parent 100:4 limit 1000p
qdisc pfifo 0: parent 100:3 limit 1000p
qdisc pfifo 0: parent 100:2 limit 1000p
qdisc pfifo 0: parent 100:1 limit 1000p

Now I classify packets using the iptables commands from Vinicius, which
work to do the job for this test:

iptables -t mangle -A POSTROUTING -p udp --dport 10000 -j CLASSIFY --set-class 0:1
iptables -t mangle -A POSTROUTING -p udp --dport 20000 -j CLASSIFY --set-class 0:2
iptables -t mangle -A POSTROUTING -p udp --dport 30000 -j CLASSIFY --set-class 0:3
iptables -t mangle -A POSTROUTING -p udp --dport 40000 -j CLASSIFY --set-class 0:4

I set up 4 iperf UDP streams as follows:

iperf -c 192.168.2.10 -u -p 10000 -t60&
iperf -c 192.168.2.10 -u -p 20000 -t60&
iperf -c 192.168.2.10 -u -p 30000 -t60&
iperf -c 192.168.2.10 -u -p 40000 -t60&

My expectation is as follows:

AAAAAABBBBBCCCCCDDDDDEEEEE

where AAAAA is traffic from TC0, BBBBB is the UDP stream for port
10000, CCCCC is the stream for port 20000, DDDDD for 30000, and EEEEE
for 40000, each for a maximum of 4 msec. Is that expectation correct?
At least that is my understanding.

But what I see in the wireshark capture is alternating packets with
ports 10000/20000/30000/40000, and that doesn't make sense to me. If
you look at the timestamps, there is nothing showing that the gate is
honored for Tx. Am I missing something?
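To make the expected AAAAA/BBBBB/... pattern concrete, the schedule arithmetic can be modeled in a few lines of Python. This is my own sketch, not kernel code, and it assumes an ideal transmitter (zero queueing delay, gates honored exactly); real software taprio will blur the boundaries:

```python
# Model of the 5-entry schedule above: a 20 ms cycle in which each 4 ms
# window opens exactly one gate. Given a CLOCK_TAI timestamp, return the
# gatemask that should be in effect on the wire.

BASE_TIME = 1564768921123459533           # ns, CLOCK_TAI, from the config
ENTRIES = [(0x01, 4_000_000), (0x02, 4_000_000), (0x04, 4_000_000),
           (0x08, 4_000_000), (0x10, 4_000_000)]  # (gatemask, interval ns)
CYCLE_TIME = sum(interval for _, interval in ENTRIES)  # 20_000_000 ns

def open_gates(now_ns):
    """Return the gatemask in effect at TAI timestamp now_ns."""
    offset = (now_ns - BASE_TIME) % CYCLE_TIME
    for gatemask, interval in ENTRIES:
        if offset < interval:
            return gatemask
        offset -= interval
    raise AssertionError("unreachable: offset < CYCLE_TIME")

# 5 ms past base-time falls in the second entry, so only TC1's gate is open.
print(hex(open_gates(BASE_TIME + 5_000_000)))
```

Under this model, packets for the four UDP ports should appear on the wire in disjoint 4 ms bursts, one traffic class at a time, which is exactly the expectation stated above.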
The tc stats show that packets are going through the specific TC/gate:

root@am57xx-evm:~# tc -d -p -s qdisc show dev eth0
qdisc taprio 100: root refcnt 9 tc 5 map 0 1 2 3 4 4 4 4 4 4 4 4 4 4 4 4
queues offset 0 count 1 offset 1 count 1 offset 2 count 1 offset 3 count 1 offset 4 count 1
clockid TAI offload 0 base-time 0 cycle-time 0 cycle-time-extension 0
base-time 1564768921123459533 cycle-time 20000000 cycle-time-extension 0
    index 0 cmd S gatemask 0x1 interval 4000000
    index 1 cmd S gatemask 0x2 interval 4000000
    index 2 cmd S gatemask 0x4 interval 4000000
    index 3 cmd S gatemask 0x8 interval 4000000
    index 4 cmd S gatemask 0x10 interval 4000000

 Sent 80948029 bytes 53630 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc pfifo 0: parent 100:5 limit 1000p
 Sent 16184448 bytes 10704 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc pfifo 0: parent 100:4 limit 1000p
 Sent 16184448 bytes 10704 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc pfifo 0: parent 100:3 limit 1000p
 Sent 16184448 bytes 10704 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc pfifo 0: parent 100:2 limit 1000p
 Sent 16184448 bytes 10704 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc pfifo 0: parent 100:1 limit 1000p
 Sent 16210237 bytes 10814 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0

Also, my hardware queue stats show frames going through the correct
queues. Am I missing something?
root@am57xx-evm:~# ethtool -S eth0
NIC statistics:
    Good Rx Frames: 251
    Broadcast Rx Frames: 223
    Multicast Rx Frames: 2
    Pause Rx Frames: 0
    Rx CRC Errors: 0
    Rx Align/Code Errors: 0
    Oversize Rx Frames: 0
    Rx Jabbers: 0
    Undersize (Short) Rx Frames: 0
    Rx Fragments: 0
    Rx Octets: 99747
    Good Tx Frames: 75837
    Broadcast Tx Frames: 89
    Multicast Tx Frames: 97
    Pause Tx Frames: 0
    Deferred Tx Frames: 0
    Collisions: 0
    Single Collision Tx Frames: 0
    Multiple Collision Tx Frames: 0
    Excessive Collisions: 0
    Late Collisions: 0
    Tx Underrun: 0
    Carrier Sense Errors: 0
    Tx Octets: 114715759
    Rx + Tx 64 Octet Frames: 11
    Rx + Tx 65-127 Octet Frames: 89
    Rx + Tx 128-255 Octet Frames: 6
    Rx + Tx 256-511 Octet Frames: 326
    Rx + Tx 512-1023 Octet Frames: 0
    Rx + Tx 1024-Up Octet Frames: 75656
    Net Octets: 114815506
    Rx Start of Frame Overruns: 0
    Rx Middle of Frame Overruns: 0
    Rx DMA Overruns: 0
    Rx DMA chan 0: head_enqueue: 2
    Rx DMA chan 0: tail_enqueue: 505
    Rx DMA chan 0: pad_enqueue: 0
    Rx DMA chan 0: misqueued: 0
    Rx DMA chan 0: desc_alloc_fail: 0
    Rx DMA chan 0: pad_alloc_fail: 0
    Rx DMA chan 0: runt_receive_buf: 0
    Rx DMA chan 0: runt_transmit_bu: 0
    Rx DMA chan 0: empty_dequeue: 0
    Rx DMA chan 0: busy_dequeue: 244
    Rx DMA chan 0: good_dequeue: 252
    Rx DMA chan 0: requeue: 1
    Rx DMA chan 0: teardown_dequeue: 127
    Tx DMA chan 0: head_enqueue: 15631
    Tx DMA chan 0: tail_enqueue: 1
    Tx DMA chan 0: pad_enqueue: 0
    Tx DMA chan 0: misqueued: 1
    Tx DMA chan 0: desc_alloc_fail: 0
    Tx DMA chan 0: pad_alloc_fail: 0
    Tx DMA chan 0: runt_receive_buf: 0
    Tx DMA chan 0: runt_transmit_bu: 11
    Tx DMA chan 0: empty_dequeue: 15632
    Tx DMA chan 0: busy_dequeue: 0
    Tx DMA chan 0: good_dequeue: 15632
    Tx DMA chan 0: requeue: 0
    Tx DMA chan 0: teardown_dequeue: 0
    Tx DMA chan 1: head_enqueue: 15284
    Tx DMA chan 1: tail_enqueue: 0
    Tx DMA chan 1: pad_enqueue: 0
    Tx DMA chan 1: misqueued: 0
    Tx DMA chan 1: desc_alloc_fail: 0
    Tx DMA chan 1: pad_alloc_fail: 0
    Tx DMA chan 1: runt_receive_buf: 0
    Tx DMA chan 1: runt_transmit_bu: 0
    Tx DMA chan 1: empty_dequeue: 15284
    Tx DMA chan 1: busy_dequeue: 0
    Tx DMA chan 1: good_dequeue: 15284
    Tx DMA chan 1: requeue: 0
    Tx DMA chan 1: teardown_dequeue: 0
    Tx DMA chan 2: head_enqueue: 23513
    Tx DMA chan 2: tail_enqueue: 0
    Tx DMA chan 2: pad_enqueue: 0
    Tx DMA chan 2: misqueued: 0
    Tx DMA chan 2: desc_alloc_fail: 0
    Tx DMA chan 2: pad_alloc_fail: 0
    Tx DMA chan 2: runt_receive_buf: 0
    Tx DMA chan 2: runt_transmit_bu: 2
    Tx DMA chan 2: empty_dequeue: 23513
    Tx DMA chan 2: busy_dequeue: 0
    Tx DMA chan 2: good_dequeue: 23513
    Tx DMA chan 2: requeue: 0
    Tx DMA chan 2: teardown_dequeue: 0
    Tx DMA chan 3: head_enqueue: 10704
    Tx DMA chan 3: tail_enqueue: 0
    Tx DMA chan 3: pad_enqueue: 0
    Tx DMA chan 3: misqueued: 0
    Tx DMA chan 3: desc_alloc_fail: 0
    Tx DMA chan 3: pad_alloc_fail: 0
    Tx DMA chan 3: runt_receive_buf: 0
    Tx DMA chan 3: runt_transmit_bu: 0
    Tx DMA chan 3: empty_dequeue: 10704
    Tx DMA chan 3: busy_dequeue: 0
    Tx DMA chan 3: good_dequeue: 10704
    Tx DMA chan 3: requeue: 0
    Tx DMA chan 3: teardown_dequeue: 0
    Tx DMA chan 4: head_enqueue: 10704
    Tx DMA chan 4: tail_enqueue: 0
    Tx DMA chan 4: pad_enqueue: 0
    Tx DMA chan 4: misqueued: 0
    Tx DMA chan 4: desc_alloc_fail: 0
    Tx DMA chan 4: pad_alloc_fail: 0
    Tx DMA chan 4: runt_receive_buf: 0
    Tx DMA chan 4: runt_transmit_bu: 0
    Tx DMA chan 4: empty_dequeue: 10704
    Tx DMA chan 4: busy_dequeue: 0
    Tx DMA chan 4: good_dequeue: 10704
    Tx DMA chan 4: requeue: 0
    Tx DMA chan 4: teardown_dequeue: 0

I am on a 4.19.y kernel with the taprio-specific patches backported.
Am I missing anything related to taprio? I will try the latest master
branch as well, but if you can point out anything, that will be
helpful.
ce4ca3f9dd9b6fc9652d65f4a9ddf29b58f8db33 (HEAD -> LCPD-17228-v1) net: sched: sch_taprio: fix memleak in error path for sched list parse
11b521a046feff4373a499eadf4ecf884a9d8624 net: sched: sch_taprio: back reverted counterpart for rebase
372d2da4ce26ff832a5693e909689fe2b76712c6 taprio: Adjust timestamps for TCP packets
72039934f2f6c959010e2ba848e7e627a2432bfd taprio: make clock reference conversions easier
7529028441203b3d80a3700e3694fd4147ed16fd taprio: Add support for txtime-assist mode
9d96c8e4518b643b848de44986847408e9b6194a taprio: Remove inline directive
0294b3b4bc059427fceff20948f9b30fbb4e2d43 etf: Add skip_sock_check
e4eb6c594326ea9d4dfe26241fc78df0c765d994 taprio: calculate cycle_time when schedule is installed
a4ed85ac58c396d257b823003c5c9a9d684563d9 etf: Don't use BIT() in UAPI headers.
fc5e4f7fcc9d39c3581e0a5ae15a04e609260dc7 taprio: add null check on sched_nest to avoid potential null pointer dereference
bc0749127e5a423ed9ef95962a83f526352a38a6 taprio: Add support for cycle-time-extension
aad8879bfc49bfb7a259377f64a1d44809b97552 taprio: Add support for setting the cycle-time manually
768089594ced5386b748d9049f86355d620be92d taprio: Add support adding an admin schedule
1298a8a6fe373234f0d7506cc9dc6182133151ef net: sched: sch_taprio: align code the following patches to be applied
8c87fe004222625d79025f54eaa893b906f19cd4 taprio: Fix potencial use of invalid memory during dequeue()
26cdb540c971b20b9b68a5fe8fff45099991c081 net/sched: taprio: fix build without 64bit div
35e63bea024f00f0cb22e40b6e47d0c7290d588b net: sched: taprio: Fix taprio_dequeue()
50d8609b6a09a12f019728474449d0e4f19ed297 net: sched: taprio: Fix taprio_peek()
bd6ad2179e7d0371d7051155626e11a5ce8eeb40 net: sched: taprio: Remove should_restart_cycle()
4841094710044dd7f2556dd808e6c34ff0f40697 net: sched: taprio: Refactor taprio_get_start_time()
ec9b38a4a789cc82f1198f5135eb222852776d21 net/sched: taprio: fix picos_per_byte miscalculation
28b669c38bf8d03c8d800cf63e33861886ea0100 tc: Add support for configuring the taprio scheduler
eed00d0cfbc52d8899d935f3c6f740aa6dadfa39 net_sched: sch_fq: remove dead code dealing with retransmits
d9c6403ca03399062a219a61f86dd3c86ed573d8 tcp: switch tcp_internal_pacing() to tcp_wstamp_ns
1e7d920773743f58c3de660e47c3b72e5a50d912 tcp: switch tcp and sch_fq to new earliest departure time model
c0554b84146dd58893f0e0cb6ccdeadb0893e22c tcp: switch internal pacing timer to CLOCK_TAI
5ac72108dd580ba9ace028e7dd7e325347bcbe69 tcp: provide earliest departure time in skb->tstamp
a59ff4b92003169483b2f3548e6f8245b1ae1f28 tcp: add tcp_wstamp_ns socket field
fb534f9a4e8e96f3688491a901df363d14f6806d net_sched: sch_fq: switch to CLOCK_TAI
fb898b71da8caadb6221e3f8a71417389cb58c46 tcp: introduce tcp_skb_timestamp_us() helper
1948ca0e3cab893be3375b12e552e6c0751458b1 net: sched: rename qdisc_destroy() to qdisc_put()
0cabba2b47949524cacbb68678767307a4f0a23e (tag: ti2019.04-rc4, lcpd/ti-linux-4.19.y) Merged TI feature connectivity into ti-linux-4.19.y

> Regards,
> -Vladimir
* taprio testing with multiple streams
  2019-10-14 15:33         ` taprio testing - Any help? Murali Karicheri
@ 2019-10-14 16:18           ` Murali Karicheri
  2019-10-14 23:39           ` taprio testing - Any help? Vinicius Costa Gomes
  1 sibling, 0 replies; 17+ messages in thread

From: Murali Karicheri @ 2019-10-14 16:18 UTC (permalink / raw)
To: Vladimir Oltean, Vinicius Costa Gomes; +Cc: netdev

Hi Vinicius,

I would like to branch off into a separate discussion on taprio testing
with multiple streams. Below is what I discussed earlier in the context
of the other thread, 'Re: taprio testing - Any help?'.

Could you please review the below and let me know if I am doing
anything wrong in configuring taprio? I understand that gate open/close
won't be accurate for a software implementation, but the behavior I am
seeing is not correct IMO, so I am wondering if I missed any steps or
my understanding is wrong.

Additionally, I see a kernel crash when I configure 8 gates, where my
driver has support for 8 hardware queues. With 7 gates, it works fine
without a crash. I can provide more logs on this if needed.

Thanks and regards,

Murali

On 10/14/2019 11:33 AM, Murali Karicheri wrote:
> On 10/13/2019 05:10 PM, Vladimir Oltean wrote:
>> Hi Vinicius,
>>
>
> Using taprio, I am trying the testcase below and expect traffic for
> 4 msec from a specific traffic class (port, in this case), followed
> by traffic from the others based on the taprio schedule. But a
> wireshark capture shows inter-mixed traffic on the wire.
>
> For example, I set up my taprio as follows:
>
> tc qdisc replace dev eth0 parent root handle 100 taprio \
>     num_tc 5 \
>     map 0 1 2 3 4 4 4 4 4 4 4 4 4 4 4 4 \
>     queues 1@0 1@1 1@2 1@3 1@4 \
>     base-time 1564768921123459533 \
>     sched-entry S 01 4000000 \
>     sched-entry S 02 4000000 \
>     sched-entry S 04 4000000 \
>     sched-entry S 08 4000000 \
>     sched-entry S 10 4000000 \
>     clockid CLOCK_TAI
>
> My taprio schedule shows a 20 msec cycle-time, as below:
>
> root@am57xx-evm:~# tc qdisc show dev eth0
> qdisc taprio 100: root refcnt 9 tc 5 map 0 1 2 3 4 4 4 4 4 4 4 4 4 4 4 4
> queues offset 0 count 1 offset 1 count 1 offset 2 count 1 offset 3 count 1 offset 4 count 1
> clockid TAI offload 0 base-time 0 cycle-time 0 cycle-time-extension 0
> base-time 1564768921123459533 cycle-time 20000000 cycle-time-extension 0
> index 0 cmd S gatemask 0x1 interval 4000000
> index 1 cmd S gatemask 0x2 interval 4000000
> index 2 cmd S gatemask 0x4 interval 4000000
> index 3 cmd S gatemask 0x8 interval 4000000
> index 4 cmd S gatemask 0x10 interval 4000000
>
> qdisc pfifo 0: parent 100:5 limit 1000p
> qdisc pfifo 0: parent 100:4 limit 1000p
> qdisc pfifo 0: parent 100:3 limit 1000p
> qdisc pfifo 0: parent 100:2 limit 1000p
> qdisc pfifo 0: parent 100:1 limit 1000p
>
> Now I classify packets using the iptables commands from Vinicius,
> which work to do the job for this test.
>
> iptables -t mangle -A POSTROUTING -p udp --dport 10000 -j CLASSIFY --set-class 0:1
> iptables -t mangle -A POSTROUTING -p udp --dport 20000 -j CLASSIFY --set-class 0:2
> iptables -t mangle -A POSTROUTING -p udp --dport 30000 -j CLASSIFY --set-class 0:3
> iptables -t mangle -A POSTROUTING -p udp --dport 40000 -j CLASSIFY --set-class 0:4
>
> I set up 4 iperf UDP streams as follows:
>
> iperf -c 192.168.2.10 -u -p 10000 -t60&
> iperf -c 192.168.2.10 -u -p 20000 -t60&
> iperf -c 192.168.2.10 -u -p 30000 -t60&
> iperf -c 192.168.2.10 -u -p 40000 -t60&
>
> My expectation is as follows:
>
> AAAAAABBBBBCCCCCDDDDDEEEEE
>
> where AAAAA is traffic from TC0, BBBBB is the UDP stream for port
> 10000, CCCCC is the stream for port 20000, DDDDD for 30000, and EEEEE
> for 40000, each for a maximum of 4 msec. Is that expectation correct?
> At least that is my understanding.
>
> But what I see in the wireshark capture is alternating packets with
> ports 10000/20000/30000/40000, and that doesn't make sense to me. If
> you look at the timestamps, there is nothing showing that the gate is
> honored for Tx. Am I missing something?
>
> The tc stats show that packets are going through the specific TC/gate:
>
> root@am57xx-evm:~# tc -d -p -s qdisc show dev eth0
> qdisc taprio 100: root refcnt 9 tc 5 map 0 1 2 3 4 4 4 4 4 4 4 4 4 4 4 4
> queues offset 0 count 1 offset 1 count 1 offset 2 count 1 offset 3 count 1 offset 4 count 1
> clockid TAI offload 0 base-time 0 cycle-time 0 cycle-time-extension 0
> base-time 1564768921123459533 cycle-time 20000000 cycle-time-extension 0
> index 0 cmd S gatemask 0x1 interval 4000000
> index 1 cmd S gatemask 0x2 interval 4000000
> index 2 cmd S gatemask 0x4 interval 4000000
> index 3 cmd S gatemask 0x8 interval 4000000
> index 4 cmd S gatemask 0x10 interval 4000000
>
> Sent 80948029 bytes 53630 pkt (dropped 0, overlimits 0 requeues 0)
> backlog 0b 0p requeues 0
> qdisc pfifo 0: parent 100:5 limit 1000p
> Sent 16184448 bytes 10704 pkt (dropped 0, overlimits 0 requeues 0)
> backlog 0b 0p requeues 0
> qdisc pfifo 0: parent 100:4 limit 1000p
> Sent 16184448 bytes 10704 pkt (dropped 0, overlimits 0 requeues 0)
> backlog 0b 0p requeues 0
> qdisc pfifo 0: parent 100:3 limit 1000p
> Sent 16184448 bytes 10704 pkt (dropped 0, overlimits 0 requeues 0)
> backlog 0b 0p requeues 0
> qdisc pfifo 0: parent 100:2 limit 1000p
> Sent 16184448 bytes 10704 pkt (dropped 0, overlimits 0 requeues 0)
> backlog 0b 0p requeues 0
> qdisc pfifo 0: parent 100:1 limit 1000p
> Sent 16210237 bytes 10814 pkt (dropped 0, overlimits 0 requeues 0)
> backlog 0b 0p requeues 0
>
> Also, my hardware queue stats show frames going through the correct
> queues. Am I missing something?
> > root@am57xx-evm:~# ethtool -S eth0 > NIC statistics: > Good Rx Frames: 251 > Broadcast Rx Frames: 223 > Multicast Rx Frames: 2 > Pause Rx Frames: 0 > Rx CRC Errors: 0 > Rx Align/Code Errors: 0 > Oversize Rx Frames: 0 > Rx Jabbers: 0 > Undersize (Short) Rx Frames: 0 > Rx Fragments: 0 > Rx Octets: 99747 > Good Tx Frames: 75837 > Broadcast Tx Frames: 89 > Multicast Tx Frames: 97 > Pause Tx Frames: 0 > Deferred Tx Frames: 0 > Collisions: 0 > Single Collision Tx Frames: 0 > Multiple Collision Tx Frames: 0 > Excessive Collisions: 0 > Late Collisions: 0 > Tx Underrun: 0 > Carrier Sense Errors: 0 > Tx Octets: 114715759 > Rx + Tx 64 Octet Frames: 11 > Rx + Tx 65-127 Octet Frames: 89 > Rx + Tx 128-255 Octet Frames: 6 > Rx + Tx 256-511 Octet Frames: 326 > Rx + Tx 512-1023 Octet Frames: 0 > Rx + Tx 1024-Up Octet Frames: 75656 > Net Octets: 114815506 > Rx Start of Frame Overruns: 0 > Rx Middle of Frame Overruns: 0 > Rx DMA Overruns: 0 > Rx DMA chan 0: head_enqueue: 2 > Rx DMA chan 0: tail_enqueue: 505 > Rx DMA chan 0: pad_enqueue: 0 > Rx DMA chan 0: misqueued: 0 > Rx DMA chan 0: desc_alloc_fail: 0 > Rx DMA chan 0: pad_alloc_fail: 0 > Rx DMA chan 0: runt_receive_buf: 0 > Rx DMA chan 0: runt_transmit_bu: 0 > Rx DMA chan 0: empty_dequeue: 0 > Rx DMA chan 0: busy_dequeue: 244 > Rx DMA chan 0: good_dequeue: 252 > Rx DMA chan 0: requeue: 1 > Rx DMA chan 0: teardown_dequeue: 127 > Tx DMA chan 0: head_enqueue: 15631 > Tx DMA chan 0: tail_enqueue: 1 > Tx DMA chan 0: pad_enqueue: 0 > Tx DMA chan 0: misqueued: 1 > Tx DMA chan 0: desc_alloc_fail: 0 > Tx DMA chan 0: pad_alloc_fail: 0 > Tx DMA chan 0: runt_receive_buf: 0 > Tx DMA chan 0: runt_transmit_bu: 11 > Tx DMA chan 0: empty_dequeue: 15632 > Tx DMA chan 0: busy_dequeue: 0 > Tx DMA chan 0: good_dequeue: 15632 > Tx DMA chan 0: requeue: 0 > Tx DMA chan 0: teardown_dequeue: 0 > Tx DMA chan 1: head_enqueue: 15284 > Tx DMA chan 1: tail_enqueue: 0 > Tx DMA chan 1: pad_enqueue: 0 > Tx DMA chan 1: misqueued: 0 > Tx DMA chan 1: 
desc_alloc_fail: 0 > Tx DMA chan 1: pad_alloc_fail: 0 > Tx DMA chan 1: runt_receive_buf: 0 > Tx DMA chan 1: runt_transmit_bu: 0 > Tx DMA chan 1: empty_dequeue: 15284 > Tx DMA chan 1: busy_dequeue: 0 > Tx DMA chan 1: good_dequeue: 15284 > Tx DMA chan 1: requeue: 0 > Tx DMA chan 1: teardown_dequeue: 0 > Tx DMA chan 2: head_enqueue: 23513 > Tx DMA chan 2: tail_enqueue: 0 > Tx DMA chan 2: pad_enqueue: 0 > Tx DMA chan 2: misqueued: 0 > Tx DMA chan 2: desc_alloc_fail: 0 > Tx DMA chan 2: pad_alloc_fail: 0 > Tx DMA chan 2: runt_receive_buf: 0 > Tx DMA chan 2: runt_transmit_bu: 2 > Tx DMA chan 2: empty_dequeue: 23513 > Tx DMA chan 2: busy_dequeue: 0 > Tx DMA chan 2: good_dequeue: 23513 > Tx DMA chan 2: requeue: 0 > Tx DMA chan 2: teardown_dequeue: 0 > Tx DMA chan 3: head_enqueue: 10704 > Tx DMA chan 3: tail_enqueue: 0 > Tx DMA chan 3: pad_enqueue: 0 > Tx DMA chan 3: misqueued: 0 > Tx DMA chan 3: desc_alloc_fail: 0 > Tx DMA chan 3: pad_alloc_fail: 0 > Tx DMA chan 3: runt_receive_buf: 0 > Tx DMA chan 3: runt_transmit_bu: 0 > Tx DMA chan 3: empty_dequeue: 10704 > Tx DMA chan 3: busy_dequeue: 0 > Tx DMA chan 3: good_dequeue: 10704 > Tx DMA chan 3: requeue: 0 > Tx DMA chan 3: teardown_dequeue: 0 > Tx DMA chan 4: head_enqueue: 10704 > Tx DMA chan 4: tail_enqueue: 0 > Tx DMA chan 4: pad_enqueue: 0 > Tx DMA chan 4: misqueued: 0 > Tx DMA chan 4: desc_alloc_fail: 0 > Tx DMA chan 4: pad_alloc_fail: 0 > Tx DMA chan 4: runt_receive_buf: 0 > Tx DMA chan 4: runt_transmit_bu: 0 > Tx DMA chan 4: empty_dequeue: 10704 > Tx DMA chan 4: busy_dequeue: 0 > Tx DMA chan 4: good_dequeue: 10704 > Tx DMA chan 4: requeue: 0 > Tx DMA chan 4: teardown_dequeue: 0 > > > I am on a 4.19.y kernel with patches specific to taprio > backported. Am I missing anything related to taprio. I will > try on the latest master branch as well. But if you can point out > anything that will be helpful. > Just want to let you know that I see the same behavior with the v5.3 on upstream master branch. 
> ce4ca3f9dd9b6fc9652d65f4a9ddf29b58f8db33 (HEAD -> LCPD-17228-v1) net: > sched: sch_taprio: fix memleak in error path for sched list parse > 11b521a046feff4373a499eadf4ecf884a9d8624 net: sched: sch_taprio: back > reverted counterpart for rebase > 372d2da4ce26ff832a5693e909689fe2b76712c6 taprio: Adjust timestamps for > TCP packets > 72039934f2f6c959010e2ba848e7e627a2432bfd taprio: make clock reference > conversions easier > 7529028441203b3d80a3700e3694fd4147ed16fd taprio: Add support for > txtime-assist mode > 9d96c8e4518b643b848de44986847408e9b6194a taprio: Remove inline directive > 0294b3b4bc059427fceff20948f9b30fbb4e2d43 etf: Add skip_sock_check > e4eb6c594326ea9d4dfe26241fc78df0c765d994 taprio: calculate cycle_time > when schedule is installed > a4ed85ac58c396d257b823003c5c9a9d684563d9 etf: Don't use BIT() in UAPI > headers. > fc5e4f7fcc9d39c3581e0a5ae15a04e609260dc7 taprio: add null check on > sched_nest to avoid potential null pointer dereference > bc0749127e5a423ed9ef95962a83f526352a38a6 taprio: Add support for > cycle-time-extension > aad8879bfc49bfb7a259377f64a1d44809b97552 taprio: Add support for setting > the cycle-time manually > 768089594ced5386b748d9049f86355d620be92d taprio: Add support adding an > admin schedule > 1298a8a6fe373234f0d7506cc9dc6182133151ef net: sched: sch_taprio: align > code the following patches to be applied > 8c87fe004222625d79025f54eaa893b906f19cd4 taprio: Fix potencial use of > invalid memory during dequeue() > 26cdb540c971b20b9b68a5fe8fff45099991c081 net/sched: taprio: fix build > without 64bit div > 35e63bea024f00f0cb22e40b6e47d0c7290d588b net: sched: taprio: Fix > taprio_dequeue() > 50d8609b6a09a12f019728474449d0e4f19ed297 net: sched: taprio: Fix > taprio_peek() > bd6ad2179e7d0371d7051155626e11a5ce8eeb40 net: sched: taprio: Remove > should_restart_cycle() > 4841094710044dd7f2556dd808e6c34ff0f40697 net: sched: taprio: Refactor > taprio_get_start_time() > ec9b38a4a789cc82f1198f5135eb222852776d21 net/sched: taprio: fix > 
picos_per_byte miscalculation > 28b669c38bf8d03c8d800cf63e33861886ea0100 tc: Add support for configuring > the taprio scheduler > eed00d0cfbc52d8899d935f3c6f740aa6dadfa39 net_sched: sch_fq: remove dead > code dealing with retransmits > d9c6403ca03399062a219a61f86dd3c86ed573d8 tcp: switch > tcp_internal_pacing() to tcp_wstamp_ns > 1e7d920773743f58c3de660e47c3b72e5a50d912 tcp: switch tcp and sch_fq to > new earliest departure time model > c0554b84146dd58893f0e0cb6ccdeadb0893e22c tcp: switch internal pacing > timer to CLOCK_TAI > 5ac72108dd580ba9ace028e7dd7e325347bcbe69 tcp: provide earliest departure > time in skb->tstamp > a59ff4b92003169483b2f3548e6f8245b1ae1f28 tcp: add tcp_wstamp_ns socket > field > fb534f9a4e8e96f3688491a901df363d14f6806d net_sched: sch_fq: switch to > CLOCK_TAI > fb898b71da8caadb6221e3f8a71417389cb58c46 tcp: introduce > tcp_skb_timestamp_us() helper > 1948ca0e3cab893be3375b12e552e6c0751458b1 net: sched: rename > qdisc_destroy() to qdisc_put() > 0cabba2b47949524cacbb68678767307a4f0a23e (tag: ti2019.04-rc4, > lcpd/ti-linux-4.19.y) Merged TI feature connectivity into ti-linux-4.19.y > >> >> Regards, >> -Vladimir >> > > ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: taprio testing - Any help? 2019-10-14 15:33 ` Murali Karicheri 2019-10-14 16:18 ` taprio testing with multiple streams Murali Karicheri @ 2019-10-14 23:39 ` Vinicius Costa Gomes 2019-10-16 17:02 ` Murali Karicheri 1 sibling, 1 reply; 17+ messages in thread From: Vinicius Costa Gomes @ 2019-10-14 23:39 UTC (permalink / raw) To: Murali Karicheri, Vladimir Oltean; +Cc: netdev Murali Karicheri <m-karicheri2@ti.com> writes: > > My expectation is as follows > > AAAAAABBBBBCCCCCDDDDDEEEEE > > Where AAAAA is traffic from TC0, BBBBB is udp stream for port 10000 > CCCCC is stream for port 20000, DDDDD for 30000 and EEEEE for 40000. > Each can be max of 4 msec. Is the expection correct? At least that > is my understanding. Your expectation is correct. > > But what I see is alternating packets with port 10000/20000/30000/40000 > at the wireshark capture and it doesn't make sense to me. If you > look at the timestamp, there is nothing showing the Gate is honored > for Tx. Am I missing something? Remember that taprio (in software mode) has no control after the packet is delivered to the driver. So, even if taprio obeys your traffic schedule perfectly, the driver/controller may decide to send packets according to some other logic. 
> > The tc stats shows packets are going through specific TC/Gate > > root@am57xx-evm:~# tc -d -p -s qdisc show dev eth0 > qdisc taprio 100: root refcnt 9 tc 5 map 0 1 2 3 4 4 4 4 4 4 4 4 4 4 4 4 > queues offset 0 count 1 offset 1 count 1 offset 2 count 1 offset 3 count > 1 offset 4 count 1 > clockid TAI offload 0 base-time 0 cycle-time 0 cycle-time-extension 0 > base-time 1564768921123459533 cycle-time 20000000 cycle- > time-extension 0 > index 0 cmd S gatemask 0x1 interval 4000000 > index 1 cmd S gatemask 0x2 interval 4000000 > index 2 cmd S gatemask 0x4 interval 4000000 > index 3 cmd S gatemask 0x8 interval 4000000 > index 4 cmd S gatemask 0x10 interval 4000000 > > Sent 80948029 bytes 53630 pkt (dropped 0, overlimits 0 requeues 0) > backlog 0b 0p requeues 0 > qdisc pfifo 0: parent 100:5 limit 1000p > Sent 16184448 bytes 10704 pkt (dropped 0, overlimits 0 requeues 0) > backlog 0b 0p requeues 0 > qdisc pfifo 0: parent 100:4 limit 1000p > Sent 16184448 bytes 10704 pkt (dropped 0, overlimits 0 requeues 0) > backlog 0b 0p requeues 0 > qdisc pfifo 0: parent 100:3 limit 1000p > Sent 16184448 bytes 10704 pkt (dropped 0, overlimits 0 requeues 0) > backlog 0b 0p requeues 0 > qdisc pfifo 0: parent 100:2 limit 1000p > Sent 16184448 bytes 10704 pkt (dropped 0, overlimits 0 requeues 0) > backlog 0b 0p requeues 0 > qdisc pfifo 0: parent 100:1 limit 1000p > Sent 16210237 bytes 10814 pkt (dropped 0, overlimits 0 requeues 0) > backlog 0b 0p requeues 0 > > Also my hardware queue stats shows frames going through correct queues. > Am I missing something? > What I usually see in these cases, are that the borders (from A to B, for example) are usually messy, the middle of each entry are more well behaved. But there are things that could improve the behavior: reducing TX DMA coalescing, reducing the number of packet buffers in use in the controller, disabling power saving features, that kind of thing. 
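The knobs mentioned above are driver-dependent; where the driver supports them, they can be adjusted with ethtool along these lines (a sketch — check the output of `ethtool -c`, `ethtool -g` and `ethtool --show-eee` first to see what the hardware actually exposes):

```shell
# Minimize TX interrupt coalescing so packets hit the wire sooner.
ethtool -C eth0 tx-usecs 0 tx-frames 1

# Shrink the TX ring so fewer packets sit queued in the controller.
ethtool -G eth0 tx 64

# Disable Energy-Efficient Ethernet, which can add wakeup latency.
ethtool --set-eee eth0 eee off
```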
If you are already doing something like this, then I would like to know more, that could indicate a problem. [...] > I am on a 4.19.y kernel with patches specific to taprio > backported. Am I missing anything related to taprio. I will > try on the latest master branch as well. But if you can point out > anything that will be helpful. > [...] > lcpd/ti-linux-4.19.y) Merged TI feature connectivity into > ti-linux-4.19.y I can't think of anything else. > >> >> Regards, >> -Vladimir >> Cheers, -- Vinicius ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: taprio testing - Any help? 2019-10-14 23:39 ` taprio testing - Any help? Vinicius Costa Gomes @ 2019-10-16 17:02 ` Murali Karicheri 2019-10-16 17:14 ` Murali Karicheri 2019-10-16 20:32 ` Vinicius Costa Gomes 0 siblings, 2 replies; 17+ messages in thread From: Murali Karicheri @ 2019-10-16 17:02 UTC (permalink / raw) To: Vinicius Costa Gomes, Vladimir Oltean; +Cc: netdev Hi Vinicius, On 10/14/2019 07:39 PM, Vinicius Costa Gomes wrote: > Murali Karicheri <m-karicheri2@ti.com> writes: >> >> My expectation is as follows >> >> AAAAAABBBBBCCCCCDDDDDEEEEE >> >> Where AAAAA is traffic from TC0, BBBBB is udp stream for port 10000 >> CCCCC is stream for port 20000, DDDDD for 30000 and EEEEE for 40000. >> Each can be max of 4 msec. Is the expection correct? At least that >> is my understanding. > > Your expectation is correct. > >> >> But what I see is alternating packets with port 10000/20000/30000/40000 >> at the wireshark capture and it doesn't make sense to me. If you >> look at the timestamp, there is nothing showing the Gate is honored >> for Tx. Am I missing something? > > Remember that taprio (in software mode) has no control after the packet > is delivered to the driver. So, even if taprio obeys your traffic > schedule perfectly, the driver/controller may decide to send packets > according to some other logic. > That is true. I think I get why it can't work without ETF offload which is missing in our hardware. Here is what my understanding. Please correct it if wrong. Our hardware has priority queues implemented. So if there are no packets in the higher priority queue, it would send from the lower priority ones. Assuming packets gets dequeue-ed correctly by taprio and that packets are only in one of the lower priority TC. i.e in the above example, BBBBBB are present when TC1 Gate is open. 
Assuming there are more packets than can actually be sent out during the TC1 window, and assuming there are no packets in the TC0 queue (AAAAA is absent), then the hardware will continue to send from the TC1 queue. So that might be what is happening, right? So it is required to deliver frames to the driver only when the Gate for the specific traffic class is open. Is that what is done by the ETF qdisc? From the ETF description at http://man7.org/linux/man-pages/man8/tc-etf.8.html: 'The ETF (Earliest TxTime First) qdisc allows applications to control the instant when a packet should be dequeued from the traffic control layer into the netdevice'. So I assume that when I use iperf (there is no txtime information in the packet), I can still use ETF, and the packet time will be modified to match the schedule and then get dequeued at the correct time to arrive at the driver while the taprio Gate is open. Is this correct? If ETF can schedule packets to arrive at the driver just during the Gate open and work in sync with the taprio scheduler, that would do the work. I understand the border may be difficult to manage. However, if we add a guard band, by adding an extra entry with all Gates closed between schedules for the guard band duration, it should allow the hardware to flush out any remaining frames from the queue outside its Gate duration. If my understanding is correct, can I use the software ETF qdisc in this case? If so, how do I configure it? Any example?
>> >> The tc stats shows packets are going through specific TC/Gate >> >> root@am57xx-evm:~# tc -d -p -s qdisc show dev eth0 >> qdisc taprio 100: root refcnt 9 tc 5 map 0 1 2 3 4 4 4 4 4 4 4 4 4 4 4 4 >> queues offset 0 count 1 offset 1 count 1 offset 2 count 1 offset 3 count >> 1 offset 4 count 1 >> clockid TAI offload 0 base-time 0 cycle-time 0 cycle-time-extension 0 >> base-time 1564768921123459533 cycle-time 20000000 cycle- >> time-extension 0 >> index 0 cmd S gatemask 0x1 interval 4000000 >> index 1 cmd S gatemask 0x2 interval 4000000 >> index 2 cmd S gatemask 0x4 interval 4000000 >> index 3 cmd S gatemask 0x8 interval 4000000 >> index 4 cmd S gatemask 0x10 interval 4000000 >> >> Sent 80948029 bytes 53630 pkt (dropped 0, overlimits 0 requeues 0) >> backlog 0b 0p requeues 0 >> qdisc pfifo 0: parent 100:5 limit 1000p >> Sent 16184448 bytes 10704 pkt (dropped 0, overlimits 0 requeues 0) >> backlog 0b 0p requeues 0 >> qdisc pfifo 0: parent 100:4 limit 1000p >> Sent 16184448 bytes 10704 pkt (dropped 0, overlimits 0 requeues 0) >> backlog 0b 0p requeues 0 >> qdisc pfifo 0: parent 100:3 limit 1000p >> Sent 16184448 bytes 10704 pkt (dropped 0, overlimits 0 requeues 0) >> backlog 0b 0p requeues 0 >> qdisc pfifo 0: parent 100:2 limit 1000p >> Sent 16184448 bytes 10704 pkt (dropped 0, overlimits 0 requeues 0) >> backlog 0b 0p requeues 0 >> qdisc pfifo 0: parent 100:1 limit 1000p >> Sent 16210237 bytes 10814 pkt (dropped 0, overlimits 0 requeues 0) >> backlog 0b 0p requeues 0 >> >> Also my hardware queue stats shows frames going through correct queues. >> Am I missing something? >> > > What I usually see in these cases, are that the borders (from A to B, > for example) are usually messy, the middle of each entry are more well > behaved. > OK > But there are things that could improve the behavior: reducing TX DMA > coalescing, reducing the number of packet buffers in use in the > controller, disabling power saving features, that kind of thing. 
I can try playing with the number of descriptors used. But from my above response, I might have to use the software ETF qdisc along with taprio to have packets in the correct order on the wire. So I will wait on this for now. > > If you are already doing something like this, then I would like to know > more, that could indicate a problem. > No. The hardware just implements a priority queue scheme. I can control the number of buffers or descriptors. Murali > [...] > >> I am on a 4.19.y kernel with patches specific to taprio >> backported. Am I missing anything related to taprio. I will >> try on the latest master branch as well. But if you can point out >> anything that will be helpful. >> > > [...] > >> lcpd/ti-linux-4.19.y) Merged TI feature connectivity into >> ti-linux-4.19.y > > I can't think of anything else. > >> >>> >>> Regards, >>> -Vladimir >>> > > Cheers, > -- > Vinicius > ^ permalink raw reply [flat|nested] 17+ messages in thread
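The guard band described in the previous message is expressible in taprio itself: an extra sched-entry whose gate mask is all zeros closes every gate for its interval. Applied to the 5-class schedule from earlier in the thread, it might look like this (the 1 ms guard interval is an arbitrary choice):

```shell
tc qdisc replace dev eth0 parent root handle 100 taprio \
    num_tc 5 \
    map 0 1 2 3 4 4 4 4 4 4 4 4 4 4 4 4 \
    queues 1@0 1@1 1@2 1@3 1@4 \
    base-time 1564628923967325838 \
    sched-entry S 01 4000000 \
    sched-entry S 02 4000000 \
    sched-entry S 04 4000000 \
    sched-entry S 08 4000000 \
    sched-entry S 10 4000000 \
    sched-entry S 00 1000000 \
    clockid CLOCK_TAI
# The final entry (gate mask 00) closes all gates for 1 ms,
# giving the MAC time to drain its queues before the cycle restarts.
```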
* Re: taprio testing - Any help? 2019-10-16 17:02 ` Murali Karicheri @ 2019-10-16 17:14 ` Murali Karicheri 2019-10-16 17:22 ` Murali Karicheri 2019-10-16 20:32 ` Vinicius Costa Gomes 1 sibling, 1 reply; 17+ messages in thread From: Murali Karicheri @ 2019-10-16 17:14 UTC (permalink / raw) To: Vinicius Costa Gomes, Vladimir Oltean; +Cc: netdev On 10/16/2019 01:02 PM, Murali Karicheri wrote: > Hi Vinicius, > > On 10/14/2019 07:39 PM, Vinicius Costa Gomes wrote: >> Murali Karicheri <m-karicheri2@ti.com> writes: >>> >>> My expectation is as follows >>> >>> AAAAAABBBBBCCCCCDDDDDEEEEE >>> >>> Where AAAAA is traffic from TC0, BBBBB is udp stream for port 10000 >>> CCCCC is stream for port 20000, DDDDD for 30000 and EEEEE for 40000. >>> Each can be max of 4 msec. Is the expection correct? At least that >>> is my understanding. >> >> Your expectation is correct. >> >>> >>> But what I see is alternating packets with port 10000/20000/30000/40000 >>> at the wireshark capture and it doesn't make sense to me. If you >>> look at the timestamp, there is nothing showing the Gate is honored >>> for Tx. Am I missing something? >> >> Remember that taprio (in software mode) has no control after the packet >> is delivered to the driver. So, even if taprio obeys your traffic >> schedule perfectly, the driver/controller may decide to send packets >> according to some other logic. >> > That is true. > > I think I get why it can't work without ETF offload which is missing in > our hardware. Here is what my understanding. Please correct it if wrong. > > Our hardware has priority queues implemented. So if there are no > packets in the higher priority queue, it would send from the lower > priority ones. Assuming packets gets dequeue-ed correctly by > taprio and that packets are only in one of the lower priority TC. > i.e in the above example, BBBBBB are present when TC1 Gate is open. 
> Assuming there are more packets than actually sent out during TC1 > window, and assuming no packets in the TC0 queue (AAAAA is absent) > then hardware will continue to send from TC1 queue. So that might > be what is happening, right? > > So it is required to deliver frames to driver only when the Gate for > the specific traffic class is open. Plus, number of packets delivered should be based on available time in the current window. >Is that what is done by ETF qdisc? > From ETF description at > http://man7.org/linux/man-pages/man8/tc-etf.8.html > 'The ETF (Earliest TxTime First) qdisc allows applications to control > the instant when a packet should be dequeued from the traffic control > layer into the netdevice'. So I assume, when I use iperf (there is > no txtime information in the packet), I still can use ETF and > packet time will be modified to match with schedule and then get > dequeue-ed at correct time to arrive at the driver during the Gate > open of taprio. Is this correct? > > If ETF can schedule packet to arrive at the driver just during th > Gate open and work in sync with taprio scheduler, that would do the > work.I understand the border may be difficult to manage. However if we > add a guard band by adding an extra entry with all Gates closed > between schedules for guard band duration, it should allow hardware to > flush out any remaining frames from the queue outside its Gate duration. > If my understanding is correct, can I use software ETF qdisc in this > case? If so how do I configure it? Any example? 
> >>> >>> The tc stats shows packets are going through specific TC/Gate >>> >>> root@am57xx-evm:~# tc -d -p -s qdisc show dev eth0 >>> qdisc taprio 100: root refcnt 9 tc 5 map 0 1 2 3 4 4 4 4 4 4 4 4 4 4 4 4 >>> queues offset 0 count 1 offset 1 count 1 offset 2 count 1 offset 3 count >>> 1 offset 4 count 1 >>> clockid TAI offload 0 base-time 0 cycle-time 0 cycle-time-extension 0 >>> base-time 1564768921123459533 cycle-time 20000000 cycle- >>> time-extension 0 >>> index 0 cmd S gatemask 0x1 interval 4000000 >>> index 1 cmd S gatemask 0x2 interval 4000000 >>> index 2 cmd S gatemask 0x4 interval 4000000 >>> index 3 cmd S gatemask 0x8 interval 4000000 >>> index 4 cmd S gatemask 0x10 interval 4000000 >>> >>> Sent 80948029 bytes 53630 pkt (dropped 0, overlimits 0 requeues 0) >>> backlog 0b 0p requeues 0 >>> qdisc pfifo 0: parent 100:5 limit 1000p >>> Sent 16184448 bytes 10704 pkt (dropped 0, overlimits 0 requeues 0) >>> backlog 0b 0p requeues 0 >>> qdisc pfifo 0: parent 100:4 limit 1000p >>> Sent 16184448 bytes 10704 pkt (dropped 0, overlimits 0 requeues 0) >>> backlog 0b 0p requeues 0 >>> qdisc pfifo 0: parent 100:3 limit 1000p >>> Sent 16184448 bytes 10704 pkt (dropped 0, overlimits 0 requeues 0) >>> backlog 0b 0p requeues 0 >>> qdisc pfifo 0: parent 100:2 limit 1000p >>> Sent 16184448 bytes 10704 pkt (dropped 0, overlimits 0 requeues 0) >>> backlog 0b 0p requeues 0 >>> qdisc pfifo 0: parent 100:1 limit 1000p >>> Sent 16210237 bytes 10814 pkt (dropped 0, overlimits 0 requeues 0) >>> backlog 0b 0p requeues 0 >>> >>> Also my hardware queue stats shows frames going through correct queues. >>> Am I missing something? >>> >> >> What I usually see in these cases, are that the borders (from A to B, >> for example) are usually messy, the middle of each entry are more well >> behaved. 
>> > OK > >> But there are things that could improve the behavior: reducing TX DMA >> coalescing, reducing the number of packet buffers in use in the >> controller, disabling power saving features, that kind of thing. > I can try playing with the number if descriptors used. But from my above > response, I might have to use software ETF qdisc along with taprio to > have packets in the correct order on the wire. So will wait on this for > now. >> >> If you are already doing something like this, then I would like to know >> more, that could indicate a problem. >> > No. The hardware just implement a priority queue scheme. I can control > the number of buffers or descriptors. > > Murali >> [...] >> >>> I am on a 4.19.y kernel with patches specific to taprio >>> backported. Am I missing anything related to taprio. I will >>> try on the latest master branch as well. But if you can point out >>> anything that will be helpful. >>> >> >> [...] >> >>> lcpd/ti-linux-4.19.y) Merged TI feature connectivity into >>> ti-linux-4.19.y >> >> I can't think of anything else. >> >>> >>>> >>>> Regards, >>>> -Vladimir >>>> >> >> Cheers, >> -- >> Vinicius >> > > ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: taprio testing - Any help? 2019-10-16 17:14 ` Murali Karicheri @ 2019-10-16 17:22 ` Murali Karicheri 0 siblings, 0 replies; 17+ messages in thread From: Murali Karicheri @ 2019-10-16 17:22 UTC (permalink / raw) To: Vinicius Costa Gomes, Vladimir Oltean; +Cc: netdev On 10/16/2019 01:14 PM, Murali Karicheri wrote: > On 10/16/2019 01:02 PM, Murali Karicheri wrote: >> Hi Vinicius, >> >> On 10/14/2019 07:39 PM, Vinicius Costa Gomes wrote: >>> Murali Karicheri <m-karicheri2@ti.com> writes: >>>> >>>> My expectation is as follows >>>> >>>> AAAAAABBBBBCCCCCDDDDDEEEEE >>>> >>>> Where AAAAA is traffic from TC0, BBBBB is udp stream for port 10000 >>>> CCCCC is stream for port 20000, DDDDD for 30000 and EEEEE for 40000. >>>> Each can be max of 4 msec. Is the expection correct? At least that >>>> is my understanding. >>> >>> Your expectation is correct. >>> >>>> >>>> But what I see is alternating packets with port 10000/20000/30000/40000 >>>> at the wireshark capture and it doesn't make sense to me. If you >>>> look at the timestamp, there is nothing showing the Gate is honored >>>> for Tx. Am I missing something? >>> >>> Remember that taprio (in software mode) has no control after the packet >>> is delivered to the driver. So, even if taprio obeys your traffic >>> schedule perfectly, the driver/controller may decide to send packets >>> according to some other logic. >>> >> That is true. >> >> I think I get why it can't work without ETF offload which is missing in >> our hardware. Here is what my understanding. Please correct it if wrong. >> >> Our hardware has priority queues implemented. So if there are no >> packets in the higher priority queue, it would send from the lower >> priority ones. Assuming packets gets dequeue-ed correctly by >> taprio and that packets are only in one of the lower priority TC. >> i.e in the above example, BBBBBB are present when TC1 Gate is open. 
>> Assuming there are more packets than actually sent out during TC1 >> window, and assuming no packets in the TC0 queue (AAAAA is absent) >> then hardware will continue to send from TC1 queue. So that might >> be what is happening, right? >> >> So it is required to deliver frames to driver only when the Gate for >> the specific traffic class is open. > Plus, number of packets delivered should be based on available time > in the current window. > Also I see in the taprio code:

/* There are a few scenarios where we will have to modify the txtime from
 * what is read from next_txtime in sched_entry. They are:
 * 1. If txtime is in the past,
 *    a. The gate for the traffic class is currently open and packet can be
 *       transmitted before it closes, schedule the packet right away.
 *    b. If the gate corresponding to the traffic class is going to open later
 *       in the cycle, set the txtime of packet to the interval start.
 * 2. If txtime is in the future, there are packets corresponding to the
 *    current traffic class waiting to be transmitted. So, the following
 *    possibilities exist:
 *    a. We can transmit the packet before the window containing the txtime
 *       closes.
 *    b. The window might close before the transmission can be completed
 *       successfully. So, schedule the packet in the next open window.
 */
static long get_packet_txtime(struct sk_buff *skb, struct Qdisc *sch)
{

So if I enable ETF, it looks like packets get dequeued based on a txtime that matches the schedule entries, so packets would arrive at the driver at the correct time. Of course, I need to play with the delta of the ETF configuration. Thanks Murali >> Is that what is done by ETF qdisc? >> From ETF description at >> http://man7.org/linux/man-pages/man8/tc-etf.8.html >> 'The ETF (Earliest TxTime First) qdisc allows applications to control >> the instant when a packet should be dequeued from the traffic control >> layer into the netdevice'.
So I assume, when I use iperf (there is >> no txtime information in the packet), I still can use ETF and >> packet time will be modified to match with schedule and then get >> dequeue-ed at correct time to arrive at the driver during the Gate >> open of taprio. Is this correct? >> >> If ETF can schedule packet to arrive at the driver just during th >> Gate open and work in sync with taprio scheduler, that would do the >> work.I understand the border may be difficult to manage. However if we >> add a guard band by adding an extra entry with all Gates closed >> between schedules for guard band duration, it should allow hardware to >> flush out any remaining frames from the queue outside its Gate duration. >> If my understanding is correct, can I use software ETF qdisc in this >> case? If so how do I configure it? Any example? >> >>>> >>>> The tc stats shows packets are going through specific TC/Gate >>>> >>>> root@am57xx-evm:~# tc -d -p -s qdisc show dev eth0 >>>> qdisc taprio 100: root refcnt 9 tc 5 map 0 1 2 3 4 4 4 4 4 4 4 4 4 4 >>>> 4 4 >>>> queues offset 0 count 1 offset 1 count 1 offset 2 count 1 offset 3 >>>> count >>>> 1 offset 4 count 1 >>>> clockid TAI offload 0 base-time 0 cycle-time 0 cycle-time-extension 0 >>>> base-time 1564768921123459533 cycle-time 20000000 cycle- >>>> time-extension 0 >>>> index 0 cmd S gatemask 0x1 interval 4000000 >>>> index 1 cmd S gatemask 0x2 interval 4000000 >>>> index 2 cmd S gatemask 0x4 interval 4000000 >>>> index 3 cmd S gatemask 0x8 interval 4000000 >>>> index 4 cmd S gatemask 0x10 interval 4000000 >>>> >>>> Sent 80948029 bytes 53630 pkt (dropped 0, overlimits 0 requeues 0) >>>> backlog 0b 0p requeues 0 >>>> qdisc pfifo 0: parent 100:5 limit 1000p >>>> Sent 16184448 bytes 10704 pkt (dropped 0, overlimits 0 requeues 0) >>>> backlog 0b 0p requeues 0 >>>> qdisc pfifo 0: parent 100:4 limit 1000p >>>> Sent 16184448 bytes 10704 pkt (dropped 0, overlimits 0 requeues 0) >>>> backlog 0b 0p requeues 0 >>>> qdisc pfifo 0: parent 
100:3 limit 1000p >>>> Sent 16184448 bytes 10704 pkt (dropped 0, overlimits 0 requeues 0) >>>> backlog 0b 0p requeues 0 >>>> qdisc pfifo 0: parent 100:2 limit 1000p >>>> Sent 16184448 bytes 10704 pkt (dropped 0, overlimits 0 requeues 0) >>>> backlog 0b 0p requeues 0 >>>> qdisc pfifo 0: parent 100:1 limit 1000p >>>> Sent 16210237 bytes 10814 pkt (dropped 0, overlimits 0 requeues 0) >>>> backlog 0b 0p requeues 0 >>>> >>>> Also my hardware queue stats shows frames going through correct queues. >>>> Am I missing something? >>>> >>> >>> What I usually see in these cases, are that the borders (from A to B, >>> for example) are usually messy, the middle of each entry are more well >>> behaved. >>> >> OK >> >>> But there are things that could improve the behavior: reducing TX DMA >>> coalescing, reducing the number of packet buffers in use in the >>> controller, disabling power saving features, that kind of thing. >> I can try playing with the number if descriptors used. But from my above >> response, I might have to use software ETF qdisc along with taprio to >> have packets in the correct order on the wire. So will wait on this for >> now. >>> >>> If you are already doing something like this, then I would like to know >>> more, that could indicate a problem. >>> >> No. The hardware just implement a priority queue scheme. I can control >> the number of buffers or descriptors. >> >> Murali >>> [...] >>> >>>> I am on a 4.19.y kernel with patches specific to taprio >>>> backported. Am I missing anything related to taprio. I will >>>> try on the latest master branch as well. But if you can point out >>>> anything that will be helpful. >>>> >>> >>> [...] >>> >>>> lcpd/ti-linux-4.19.y) Merged TI feature connectivity into >>>> ti-linux-4.19.y >>> >>> I can't think of anything else. >>> >>>> >>>>> >>>>> Regards, >>>>> -Vladimir >>>>> >>> >>> Cheers, >>> -- >>> Vinicius >>> >> >> > > ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: taprio testing - Any help? 2019-10-16 17:02 ` Murali Karicheri 2019-10-16 17:14 ` Murali Karicheri @ 2019-10-16 20:32 ` Vinicius Costa Gomes 2019-10-17 13:56 ` Murali Karicheri 1 sibling, 1 reply; 17+ messages in thread From: Vinicius Costa Gomes @ 2019-10-16 20:32 UTC (permalink / raw) To: Murali Karicheri, Vladimir Oltean; +Cc: netdev Murali Karicheri <m-karicheri2@ti.com> writes: > Hi Vinicius, > > On 10/14/2019 07:39 PM, Vinicius Costa Gomes wrote: >> Murali Karicheri <m-karicheri2@ti.com> writes: >>> >>> My expectation is as follows >>> >>> AAAAAABBBBBCCCCCDDDDDEEEEE >>> >>> Where AAAAA is traffic from TC0, BBBBB is udp stream for port 10000 >>> CCCCC is stream for port 20000, DDDDD for 30000 and EEEEE for 40000. >>> Each can be max of 4 msec. Is the expection correct? At least that >>> is my understanding. >> >> Your expectation is correct. >> >>> >>> But what I see is alternating packets with port 10000/20000/30000/40000 >>> at the wireshark capture and it doesn't make sense to me. If you >>> look at the timestamp, there is nothing showing the Gate is honored >>> for Tx. Am I missing something? >> >> Remember that taprio (in software mode) has no control after the packet >> is delivered to the driver. So, even if taprio obeys your traffic >> schedule perfectly, the driver/controller may decide to send packets >> according to some other logic. >> > That is true. > > I think I get why it can't work without ETF offload which is missing in > our hardware. Here is what my understanding. Please correct it if > wrong. For taprio, to get good results, you have to have some kind of offloading, so right now, there are two alternatives for offloading: (1) full offloading, something similar to what Vladimir added for the SJA1105; (2) txtime-assisted mode, what Vedang added to support running Qbv-like schedules in controllers that only support controlling the transmission time of individual packets (the LaunchTime feature of the i210 controller, for example). 
If your hardware doesn't have any of those capabilities, then you are basically stuck with the software mode, or you can come up with some other "assisted mode" that might work for your hardware. > > Our hardware has priority queues implemented. So if there are no > packets in the higher priority queue, it would send from the lower > priority ones. Assuming packets get dequeued correctly by > taprio and that packets are only in one of the lower priority TCs. > i.e. in the above example, BBBBB are present when TC1 Gate is open. > Assuming there are more packets than actually sent out during the TC1 > window, and assuming no packets in the TC0 queue (AAAAA is absent), > then hardware will continue to send from the TC1 queue. So that might > be what is happening, right? > > So it is required to deliver frames to the driver only when the Gate for > the specific traffic class is open. Is that what is done by the ETF qdisc? > From the ETF description at > http://man7.org/linux/man-pages/man8/tc-etf.8.html > 'The ETF (Earliest TxTime First) qdisc allows applications to control > the instant when a packet should be dequeued from the traffic control > layer into the netdevice'. So I assume, when I use iperf (there is > no txtime information in the packet), I still can use ETF and > packet time will be modified to match with the schedule and then get > dequeued at the correct time to arrive at the driver during the Gate > open of taprio. Is this correct? > taprio in the txtime-assisted mode does exactly that: "packet time will be modified to match with schedule", but it needs ETF offloading to be supported to get good results; ETF has the same "problem" as taprio when running in the software mode (no offloading): it has no control after the packet is delivered to the driver. > If ETF can schedule packets to arrive at the driver just during the > Gate open and work in sync with the taprio scheduler, that would do the > work. I understand the border may be difficult to manage.
However if we > add a guard band by adding an extra entry with all Gates closed > between schedules for the guard band duration, it should allow hardware to > flush out any remaining frames from the queue outside its Gate duration. > If my understanding is correct, can I use the software ETF qdisc in this > case? If so how do I configure it? Any example? Without any offloading, I think you are better off running taprio standalone (i.e. without ETF, so you don't have yet another layer of packet scheduling based solely on hrtimers), and just adding the guard-bands, something like this: $ tc qdisc replace dev $IFACE parent root handle 100 taprio \ num_tc 3 \ map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 \ queues 1@0 1@1 2@2 \ base-time $BASE_TIME \ sched-entry S 01 2000000 \ sched-entry S 02 3000000 \ sched-entry S 04 4000000 \ sched-entry S 00 1000000 \ clockid CLOCK_TAI Thinking a bit more, taprio in txtime-assisted mode and ETF with no offloading *might* be better, if its limitation of only being able to use a single TX queue isn't a blocker. Something like this: $ tc qdisc replace dev $IFACE parent root handle 100 taprio \ num_tc 4 \ map 2 3 1 0 2 2 2 2 2 2 2 2 2 2 2 2 \ queues 1@0 1@0 1@0 1@0 \ base-time $BASE_TIME \ sched-entry S 0xC 250000 \ sched-entry S 0x1 250000 \ sched-entry S 0x2 250000 \ sched-entry S 0x4 250000 \ txtime-delay 300000 \ flags 0x1 \ clockid CLOCK_TAI $ tc qdisc replace dev $IFACE parent 100:1 etf \ delta 200000 clockid CLOCK_TAI skip_sock_check Cheers, -- Vinicius ^ permalink raw reply [flat|nested] 17+ messages in thread
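Whichever of the two variants above is installed, it can help to confirm what the kernel actually accepted; a hedged sketch (the interface name is illustrative):

```shell
IFACE=eth0   # substitute your interface
# Show the schedule taprio installed: base-time, sched-entries, flags,
# and (in txtime-assisted mode) the txtime-delay value:
tc qdisc show dev "$IFACE"
# Per-child qdisc packet counters, to confirm each stream is reaching
# the intended queue:
tc -s qdisc show dev "$IFACE"
```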
* Re: taprio testing - Any help? 2019-10-16 20:32 ` Vinicius Costa Gomes @ 2019-10-17 13:56 ` Murali Karicheri 2019-10-17 19:32 ` Vinicius Costa Gomes 0 siblings, 1 reply; 17+ messages in thread From: Murali Karicheri @ 2019-10-17 13:56 UTC (permalink / raw) To: Vinicius Costa Gomes, Vladimir Oltean; +Cc: netdev Hi Vinicius, On 10/16/2019 04:32 PM, Vinicius Costa Gomes wrote: > Murali Karicheri <m-karicheri2@ti.com> writes: > >> Hi Vinicius, >> >> On 10/14/2019 07:39 PM, Vinicius Costa Gomes wrote: >>> Murali Karicheri <m-karicheri2@ti.com> writes: >>>> >>>> My expectation is as follows >>>> >>>> AAAAAABBBBBCCCCCDDDDDEEEEE >>>> >>>> Where AAAAA is traffic from TC0, BBBBB is the UDP stream for port 10000 >>>> CCCCC is the stream for port 20000, DDDDD for 30000 and EEEEE for 40000. >>>> Each can be max of 4 msec. Is the expectation correct? At least that >>>> is my understanding. >>> >>> Your expectation is correct. >>> >>>> >>>> But what I see is alternating packets with port 10000/20000/30000/40000 >>>> at the wireshark capture and it doesn't make sense to me. If you >>>> look at the timestamp, there is nothing showing the Gate is honored >>>> for Tx. Am I missing something? >>> >>> Remember that taprio (in software mode) has no control after the packet >>> is delivered to the driver. So, even if taprio obeys your traffic >>> schedule perfectly, the driver/controller may decide to send packets >>> according to some other logic. >>> >> That is true. >> >> I think I get why it can't work without ETF offload, which is missing in >> our hardware. Here is my understanding. Please correct it if >> wrong.
> > For taprio, to get good results, you have to have some kind of > offloading, so right now, there are two alternatives for offloading: (1) > full offloading, something similar to what Vladimir added for the > SJA1105; (2) txtime-assisted mode, what Vedang added to support running > Qbv-like schedules in controllers that only support controlling the > transmission time of individual packets (the LaunchTime feature of the > i210 controller, for example). > > If your hardware doesn't have any of those capabilities, then you are > basically stuck with the software mode, or you can come up with some > other "assisted mode" that might work for your hardware. > >> >> Our hardware has priority queues implemented. So if there are no >> packets in the higher priority queue, it would send from the lower >> priority ones. Assuming packets get dequeued correctly by >> taprio and that packets are only in one of the lower priority TCs. >> i.e. in the above example, BBBBB are present when TC1 Gate is open. >> Assuming there are more packets than actually sent out during the TC1 >> window, and assuming no packets in the TC0 queue (AAAAA is absent), >> then hardware will continue to send from the TC1 queue. So that might >> be what is happening, right? >> >> So it is required to deliver frames to the driver only when the Gate for >> the specific traffic class is open. Is that what is done by the ETF qdisc? >> From the ETF description at >> http://man7.org/linux/man-pages/man8/tc-etf.8.html >> 'The ETF (Earliest TxTime First) qdisc allows applications to control >> the instant when a packet should be dequeued from the traffic control >> layer into the netdevice'. So I assume, when I use iperf (there is >> no txtime information in the packet), I still can use ETF and >> packet time will be modified to match with the schedule and then get >> dequeued at the correct time to arrive at the driver during the Gate >> open of taprio. Is this correct?
>> > > taprio in the txtime-assisted mode does exactly that: "packet time will > be modified to match with schedule", but it needs ETF offloading to be > supported to get good results; ETF has the same "problem" as taprio when > running in the software mode (no offloading): it has no control after > the packet is delivered to the driver. > >> If ETF can schedule packets to arrive at the driver just during the >> Gate open and work in sync with the taprio scheduler, that would do the >> work. I understand the border may be difficult to manage. However if we >> add a guard band by adding an extra entry with all Gates closed >> between schedules for the guard band duration, it should allow hardware to >> flush out any remaining frames from the queue outside its Gate duration. >> If my understanding is correct, can I use the software ETF qdisc in this >> case? If so how do I configure it? Any example? > > Without any offloading, I think you are better off running taprio > standalone (i.e. without ETF, so you don't have yet another layer of > packet scheduling based solely on hrtimers), and just adding the > guard-bands, something like this: > > $ tc qdisc replace dev $IFACE parent root handle 100 taprio \ > num_tc 3 \ > map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 \ > queues 1@0 1@1 2@2 \ > base-time $BASE_TIME \ > sched-entry S 01 2000000 \ > sched-entry S 02 3000000 \ > sched-entry S 04 4000000 \ > sched-entry S 00 1000000 \ > clockid CLOCK_TAI > > Thinking a bit more, taprio in txtime-assisted mode and ETF with no > offloading *might* be better, if its limitation of only being able to > use a single TX queue isn't a blocker. > I thought about the same and was playing with ETF with no success so far. After I add it, it drops all frames. But I was not using flags and txtime-delay for taprio. So that is the missing part. So there is a limitation that in this mode only one HW queue can be specified.
> Something like this: > > $ tc qdisc replace dev $IFACE parent root handle 100 taprio \ > num_tc 4 \ > map 2 3 1 0 2 2 2 2 2 2 2 2 2 2 2 2 \ > queues 1@0 1@0 1@0 1@0 \ So here you are assigning all TCs to the same queue (Q0). So that is the limitation you have mentioned above. > base-time $BASE_TIME \ > sched-entry S 0xC 250000 \ > sched-entry S 0x1 250000 \ > sched-entry S 0x2 250000 \ > sched-entry S 0x4 250000 \ > txtime-delay 300000 \ > flags 0x1 \ > clockid CLOCK_TAI > I get an error when I do this in my setup. root@am57xx-evm:~# tc qdisc replace dev eth0 parent root handle 100 taprio \ > num_tc 4 \ > map 2 3 1 0 2 2 2 2 2 2 2 2 2 2 2 2 \ > queues 1@0 1@0 1@0 1@0 \ > base-time 1564535762845777831 \ > sched-entry S 0xC 15000000 \ > sched-entry S 0x2 15000000 \ > sched-entry S 0x4 15000000 \ > sched-entry S 0x8 15000000 \ > txtime-delay 300000 \ > flags 0x1 \ > clockid CLOCK_TAI RTNETLINK answers: Invalid argument Anything wrong with the command syntax? Thanks and regards, Murali > $ tc qdisc replace dev $IFACE parent 100:1 etf \ > delta 200000 clockid CLOCK_TAI skip_sock_check > > Cheers, > -- > Vinicius > ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: taprio testing - Any help? 2019-10-17 13:56 ` Murali Karicheri @ 2019-10-17 19:32 ` Vinicius Costa Gomes 2019-10-17 21:02 ` Murali Karicheri 0 siblings, 1 reply; 17+ messages in thread From: Vinicius Costa Gomes @ 2019-10-17 19:32 UTC (permalink / raw) To: Murali Karicheri, Vladimir Oltean; +Cc: netdev Murali Karicheri <m-karicheri2@ti.com> writes: > > root@am57xx-evm:~# tc qdisc replace dev eth0 parent root handle 100 taprio \ > > num_tc 4 \ > > map 2 3 1 0 2 2 2 2 2 2 2 2 2 2 2 2 \ > > queues 1@0 1@0 1@0 1@0 \ > > base-time 1564535762845777831 \ > > sched-entry S 0xC 15000000 \ > > sched-entry S 0x2 15000000 \ > > sched-entry S 0x4 15000000 \ > > sched-entry S 0x8 15000000 \ > > txtime-delay 300000 \ > > flags 0x1 \ > > clockid CLOCK_TAI > RTNETLINK answers: Invalid argument > > Anything wrong with the command syntax? I tried this example here, and it got accepted ok. I am using the current net-next master. The first thing that comes to mind is that perhaps you backported some old version of some of the patches (so it's different than what's upstream now). Cheers, -- Vinicius ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: taprio testing - Any help? 2019-10-17 19:32 ` Vinicius Costa Gomes @ 2019-10-17 21:02 ` Murali Karicheri 2019-10-17 22:26 ` Murali Karicheri 0 siblings, 1 reply; 17+ messages in thread From: Murali Karicheri @ 2019-10-17 21:02 UTC (permalink / raw) To: Vinicius Costa Gomes, Vladimir Oltean; +Cc: netdev On 10/17/2019 03:32 PM, Vinicius Costa Gomes wrote: > Murali Karicheri <m-karicheri2@ti.com> writes: >> >> root@am57xx-evm:~# tc qdisc replace dev eth0 parent root handle 100 taprio \ >> > num_tc 4 \ >> > map 2 3 1 0 2 2 2 2 2 2 2 2 2 2 2 2 \ >> > queues 1@0 1@0 1@0 1@0 \ >> > base-time 1564535762845777831 \ >> > sched-entry S 0xC 15000000 \ >> > sched-entry S 0x2 15000000 \ >> > sched-entry S 0x4 15000000 \ >> > sched-entry S 0x8 15000000 \ >> > txtime-delay 300000 \ >> > flags 0x1 \ >> > clockid CLOCK_TAI >> RTNETLINK answers: Invalid argument >> >> Anything wrong with the command syntax? > > I tried this example here, and it got accepted ok. I am using the > current net-next master. The first thing that comes to mind is that > perhaps you backported some old version of some of the patches (so it's > different than what's upstream now). Was on master of kernel.org. Will try net-next master now. Murali > > > Cheers, > -- > Vinicius > ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: taprio testing - Any help? 2019-10-17 21:02 ` Murali Karicheri @ 2019-10-17 22:26 ` Murali Karicheri 0 siblings, 0 replies; 17+ messages in thread From: Murali Karicheri @ 2019-10-17 22:26 UTC (permalink / raw) To: Vinicius Costa Gomes, Vladimir Oltean; +Cc: netdev Vinicius, On 10/17/2019 05:02 PM, Murali Karicheri wrote: > On 10/17/2019 03:32 PM, Vinicius Costa Gomes wrote: >> Murali Karicheri <m-karicheri2@ti.com> writes: >>> >>> root@am57xx-evm:~# tc qdisc replace dev eth0 parent root handle 100 >>> taprio \ >>> > num_tc 4 \ >>> > map 2 3 1 0 2 2 2 2 2 2 2 2 2 2 2 2 \ >>> > queues 1@0 1@0 1@0 1@0 \ >>> > base-time 1564535762845777831 \ >>> > sched-entry S 0xC 15000000 \ >>> > sched-entry S 0x2 15000000 \ >>> > sched-entry S 0x4 15000000 \ >>> > sched-entry S 0x8 15000000 \ >>> > txtime-delay 300000 \ >>> > flags 0x1 \ >>> > clockid CLOCK_TAI >>> RTNETLINK answers: Invalid argument >>> >>> Anything wrong with the command syntax? >> >> I tried this example here, and it got accepted ok. I am using the >> current net-next master. The first thing that comes to mind is that >> perhaps you backported some old version of some of the patches (so it's >> different than what's upstream now). > Was on master of kernel.org. Will try net-next master now. > > Murali >> >> >> Cheers, >> -- >> Vinicius >> > > Today I have tried with the RT kernel and it looks great. I can see the frames in the correct order. Here is the complete set of commands used with the RT kernel. I just used taprio, no ETF. I introduced a guard band in between schedules. Not sure if that is needed. I will try without it and see if the capture looks the same.
ifconfig eth0 192.168.2.20 ethtool -L eth0 tx 4 tc qdisc replace dev eth0 parent root handle 100 taprio \ num_tc 4 \ map 0 0 2 3 1 1 0 0 0 0 0 0 0 0 0 0 \ queues 1@0 1@1 1@2 1@3 \ base-time 1564535762845777831 \ sched-entry S 0x3 30000000 \ sched-entry S 0x0 10000000 \ sched-entry S 0x4 30000000 \ sched-entry S 0x0 10000000 \ sched-entry S 0x8 30000000 \ clockid CLOCK_TAI iptables -t mangle -A POSTROUTING -p udp --dport 10000 -j CLASSIFY --set-class 0:0 iptables -t mangle -A POSTROUTING -p udp --dport 20000 -j CLASSIFY --set-class 0:2 iptables -t mangle -A POSTROUTING -p udp --dport 30000 -j CLASSIFY --set-class 0:3 iperf -c 192.168.2.10 -u -b50M -p 10000 -t20& iperf -c 192.168.2.10 -u -b50M -p 20000 -t20& iperf -c 192.168.2.10 -u -b100M -p 30000 -t20& Thanks for all your help. Regards, Murali ^ permalink raw reply [flat|nested] 17+ messages in thread
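To double-check the result on the wire, a hedged verification sketch (run on the receiving host; the interface name and ports are from the setup above):

```shell
# With the schedule above, packets should arrive in bursts grouped by
# destination port, separated by the 10 ms all-gates-closed guard windows.
# -ttt prints inter-packet deltas, which makes those gaps easy to spot:
tcpdump -i eth0 -ttt -n -c 300 \
    'udp and (port 10000 or port 20000 or port 30000)'
```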
* Re: taprio testing - Any help? 2019-10-13 21:10 ` Vladimir Oltean 2019-10-14 15:33 ` Murali Karicheri @ 2019-10-14 23:14 ` Vinicius Costa Gomes 1 sibling, 0 replies; 17+ messages in thread From: Vinicius Costa Gomes @ 2019-10-14 23:14 UTC (permalink / raw) To: Vladimir Oltean; +Cc: Murali Karicheri, netdev Vladimir Oltean <olteanv@gmail.com> writes: > > What do you mean taprio doesn't support tc filter blocks? What do you > think there is to do in taprio to support that? > I don't think Murali is asking for filter offloading, but merely for a > way to direct frames to a certain traffic class on xmit from Linux. > Something like this works perfectly fine: > > sudo tc qdisc add dev swp2 root handle 1: taprio num_tc 2 map 0 1 > queues 1@0 1@1 base-time 1000 sched-entry S 03 300000 flags 2 > # Add the qdisc holding the classifiers > sudo tc qdisc add dev swp2 clsact > # Steer L2 PTP to TC 1 (see with "tc filter show dev swp2 egress") > sudo tc filter add dev swp2 egress prio 1 u32 match u16 0x88f7 0xffff > at -2 action skbedit priority 1 > That's cool. Every day I'm learning something new :-) > However, the clsact qdisc and tc u32 egress filter can be replaced > with proper use of the SO_PRIORITY API, which is preferable for new > applications IMO. > > I'm trying to send a demo application to tools/testing/selftests/ > which sends cyclic traffic through a raw L2 socket at a configurable > base-time and cycle-time, along with the accompanying scripts to set > up the receiver and bandwidth reservation on an in-between switch. But > I have some trouble getting the sender application to work reliably at > 100 us cycle-time, so it may take a while until I figure out with > kernelshark what's going on. Yeah, a 100us cycle-time for software mode is kind of hard to make work reliably, i.e. without any offloading, I can only get something close to that to work with a PREEMPT_RT kernel and disabling all kinds of power saving features.
Cheers, -- Vinicius ^ permalink raw reply [flat|nested] 17+ messages in thread
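As a footnote to the steering example in this exchange: the same skbedit trick can also be written with a flower match instead of u32, which reads more naturally when classifying on UDP ports. A hedged sketch (interface name and port are illustrative, and it assumes a flower-capable kernel/iproute2):

```shell
# Attach the qdisc that holds egress classifiers:
tc qdisc add dev swp2 clsact
# Set skb->priority 1 on UDP packets to dst port 10000, so taprio's
# "map" argument places them in traffic class 1:
tc filter add dev swp2 egress protocol ip flower ip_proto udp dst_port 10000 \
    action skbedit priority 1
# Verify with: tc filter show dev swp2 egress
```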
end of thread, other threads:[~2019-10-17 22:20 UTC | newest] Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2019-10-11 19:35 taprio testing - Any help? Murali Karicheri 2019-10-11 20:12 ` Vinicius Costa Gomes 2019-10-11 20:56 ` Murali Karicheri 2019-10-11 21:26 ` Vinicius Costa Gomes 2019-10-13 21:10 ` Vladimir Oltean 2019-10-14 15:33 ` Murali Karicheri 2019-10-14 16:18 ` taprio testing with multiple streams Murali Karicheri 2019-10-14 23:39 ` taprio testing - Any help? Vinicius Costa Gomes 2019-10-16 17:02 ` Murali Karicheri 2019-10-16 17:14 ` Murali Karicheri 2019-10-16 17:22 ` Murali Karicheri 2019-10-16 20:32 ` Vinicius Costa Gomes 2019-10-17 13:56 ` Murali Karicheri 2019-10-17 19:32 ` Vinicius Costa Gomes 2019-10-17 21:02 ` Murali Karicheri 2019-10-17 22:26 ` Murali Karicheri 2019-10-14 23:14 ` Vinicius Costa Gomes