From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from Galois.linutronix.de ([146.0.238.70]:40302 "EHLO
	Galois.linutronix.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751548AbeCVWLS (ORCPT );
	Thu, 22 Mar 2018 18:11:18 -0400
Date: Thu, 22 Mar 2018 23:11:10 +0100 (CET)
From: Thomas Gleixner
To: Jesus Sanchez-Palencia
cc: netdev@vger.kernel.org, jhs@mojatatu.com, xiyou.wangcong@gmail.com,
	jiri@resnulli.us, vinicius.gomes@intel.com, richardcochran@gmail.com,
	anna-maria@linutronix.de, henrik@austad.us, john.stultz@linaro.org,
	levi.pearson@harman.com, edumazet@google.com, willemb@google.com,
	mlichvar@redhat.com
Subject: Re: [RFC v3 net-next 13/18] net/sched: Introduce the TBS Qdisc
In-Reply-To: <7c3f5a9f-cc16-8483-cb77-b5548d46cd5b@intel.com>
Message-ID:
References: <20180307011230.24001-1-jesus.sanchez-palencia@intel.com>
	<20180307011230.24001-14-jesus.sanchez-palencia@intel.com>
	<7c3f5a9f-cc16-8483-cb77-b5548d46cd5b@intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Sender: netdev-owner@vger.kernel.org
List-ID:

On Thu, 22 Mar 2018, Jesus Sanchez-Palencia wrote:
> On 03/21/2018 06:46 AM, Thomas Gleixner wrote:
> > If you look at the use cases of TDM in various fields then FIFO mode is
> > pretty much useless. In industrial/automotive fieldbus applications the
> > various time slices are filled by different threads or even processes.
> >
> > Sure, the rbtree queue/dequeue has overhead compared to a simple linked
> > list, but you pay for that with more indirections and lots of mostly
> > duplicated code. And in the worst case one of these code pathes is going to
> > be rarely used and prone to bitrot.
>
>
> Our initial version (on RFC v2) was performing the sorting for all modes. After
> all the feedback we got we decided to make it optional and provide FIFO modes as
> well. For the SW fallback we need the scheduled FIFO, and for "pure" hw offload
> we need the "raw" FIFO.

I don't see how FIFO ever works without the issue that a newly queued
packet which has an earlier time stamp than the head of the FIFO list will
lose. Why would you even want to have that mode? Just because some weird
existing application misdesign thinks it's required? That doesn't make it a
good idea.

With pure hardware offload the packets are immediately handed off to the
network card and that one is responsible for sending them on time. So there
is no FIFO at all. It's actually a bypass mode.

> This was a way to accommodate all the use cases without imposing too much of a
> burden onto anyone, regardless of their application's segment (i.e. industrial,
> pro a/v, automotive, etc).

I'm not buying that argument at all. That's all handwaving.

The whole approach is a burden on every application segment because it
pushes the whole schedule and time slice management out to user space,
which also requires that you route general traffic down to that user space
scheduling entity and then queue it back into the proper time slice. And
FIFO makes that even worse.

> Having the sorting always enabled requires that a valid static clockid is passed
> to the qdisc. For the hw offload mode, that means that the PHC and one of the
> system clocks must be synchronized since hrtimers do not support dynamic clocks.
> Not all systems do that or want to, and given that we do not want to perform
> crosstimestamping between the packets' clock reference and the qdisc's one, the
> only solution for these systems would be using the raw hw offload mode.

There are two variants of hardware offload:

1) Full hardware offload

   That bypasses the queue completely. You just stick the thing into the
   scatter gather buffers. Except when there is no room anymore, then you
   have to queue, but it does not make any difference if you queue in FIFO
   or in time order. The packets go out in time order anyway.

2) Single packet hardware offload

   What you do here is to schedule a hrtimer a bit earlier than the first
   packet tx time and when it fires stick the packet into the hardware and
   rearm the timer for the next one.

The whole point of TSN with hardware support is that you have:

   - Global network time

   and

   - Frequency adjustment of the system time base

PTP is TAI based and the kernel exposes clock TAI directly through
hrtimers. You don't need dynamic clocks for that.

You can even use clock MONOTONIC as it basically is just

   TAI - offset

If the network card uses anything other than TAI or a time stamp with a
strict correlation to TAI for actual TX scheduling then the whole thing is
broken to begin with.

Thanks,

	tglx
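
For illustration only, the FIFO ordering problem described above can be
sketched as a toy model in Python. This is not the qdisc code. The function
names and the launch times are made up for the example, and it assumes that
all packets are queued before the first launch time and that the sender
simply waits for the launch time of whatever packet is at the head.

```python
import heapq
from collections import deque


def late_packets_fifo(txtimes):
    """Dequeue in arrival order; the head blocks everything queued behind it."""
    queue = deque(txtimes)
    now = 0
    late = []
    while queue:
        txtime = queue.popleft()
        if txtime < now:
            late.append(txtime)   # launch time has already passed
        else:
            now = txtime          # wait until the head's launch time
    return late


def late_packets_sorted(txtimes):
    """Dequeue in launch-time order, as a sorted (rbtree-like) queue would."""
    heap = list(txtimes)
    heapq.heapify(heap)
    now = 0
    late = []
    while heap:
        txtime = heapq.heappop(heap)
        if txtime < now:
            late.append(txtime)
        else:
            now = txtime
    return late


arrivals = [300, 100, 200]  # hypothetical launch times, in arrival order
print(late_packets_fifo(arrivals))    # [100, 200]
print(late_packets_sorted(arrivals))  # []
```

In the FIFO case the packets with launch times 100 and 200 sit behind the
one with launch time 300 and miss their slots. With time-ordered dequeue
nothing is late, at the cost of the sorted insert.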