From mboxrd@z Thu Jan 1 00:00:00 1970
From: Thomas Gleixner
Subject: Re: [RFC v3 net-next 13/18] net/sched: Introduce the TBS Qdisc
Date: Tue, 10 Apr 2018 14:37:00 +0200 (CEST)
References: <20180307011230.24001-1-jesus.sanchez-palencia@intel.com>
 <20180307011230.24001-14-jesus.sanchez-palencia@intel.com>
 <65da0648-b835-a171-3986-2d1ddcb8ea10@intel.com>
 <2897b562-06e0-0fcc-4fb1-e8c4469c0faa@intel.com>
 <60799930-56a0-3692-9482-e733d7277152@intel.com>
 <0369f48c-b48e-ce27-1988-8bc0ec65bf13@intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Cc: netdev@vger.kernel.org, jhs@mojatatu.com, xiyou.wangcong@gmail.com,
 jiri@resnulli.us, vinicius.gomes@intel.com, richardcochran@gmail.com,
 anna-maria@linutronix.de, henrik@austad.us, John Stultz,
 levi.pearson@harman.com, edumazet@google.com, willemb@google.com,
 mlichvar@redhat.com
To: Jesus Sanchez-Palencia
Received: from Galois.linutronix.de ([146.0.238.70]:53726 "EHLO
 Galois.linutronix.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1753087AbeDJMhE (ORCPT );
 Tue, 10 Apr 2018 08:37:04 -0400
In-Reply-To: <0369f48c-b48e-ce27-1988-8bc0ec65bf13@intel.com>
Sender: netdev-owner@vger.kernel.org
List-ID:

Jesus,

On Mon, 9 Apr 2018, Jesus Sanchez-Palencia wrote:
> On 03/28/2018 12:48 AM, Thomas Gleixner wrote:
> > Yes, you have the root qdisc, which is in charge of the overall scheduling
> > plan, how complex or not it is defined does not matter. It exposes traffic
> > classes which have properties defined by the configuration.
>
> Perfect. Let's see if we can agree on an overall plan, then. Hopefully I'm not
> missing anything.
>
> For the above we'll develop a new qdisc, designed along the 'taprio' ideas, thus
> a Qbv style scheduler, to be used as root qdisc. It can run the schedule inside
> the kernel or just offload it to the NIC if supported. Similarly to the other
> multiqueue qdiscs, it will expose the HW Tx queues.
>
> What is new here from the ideas we shared last year is that this new root qdisc
> will be responsible for calling the attached qdiscs' dequeue functions during
> their timeslices, making it the only entity capable of enqueueing packets into
> the NIC.

Correct. Aside from that it's the entity which is in charge of the overall
scheduling.

> This is the "global scheduler", but we still need the txtime aware
> qdisc. For that, we'll modify tbs to accommodate the feedback from this
> thread. More below.
>
> > The qdiscs which are attached to those traffic classes can be anything
> > including:
> >
> >  - Simple feed through (Applications are time constraints aware and set the
> >    exact schedule). qdisc has admission control.
>
> This will be provided by the tbs qdisc. It will still provide a txtime sorted
> list and hw offload, but now there will be a per-socket option that tells the
> qdisc if the per-packet timestamp is the txtime (i.e. explicit mode, as you've
> called it) or a deadline. The drop_if_late flag will be removed.
>
> When in explicit mode, packets from that socket are dequeued from the qdisc
> during its time slice if their [(txtime - delta) < now].
>
> >
> >  - Deadline aware qdisc to handle e.g. A/V streams. Applications are aware
> >    of time constraints and provide the packet deadline. qdisc has admission
> >    control. This can be a simple first comes, first served scheduler or
> >    something like EDF which allows optimized utilization. The qdisc sets
> >    the TX time depending on the deadline and feeds into the root.
>
> This will be provided by tbs if the socket which is transmitting packets is
> configured for deadline mode.

You don't want the socket to decide that. The qdisc into which a socket
feeds defines the mode and the qdisc rejects requests with the wrong mode.

Making a qdisc do both and letting the user decide what he wants it to be
is not really going to fly. Especially if you have different users which
want a different mode.
It's clearly distinct functionality. Please stop trying to develop Swiss
army knives with integrated coffee machines.

> For the deadline -> txtime conversion, what I have in mind is: when dequeue is
> called tbs will just change the skbuff's timestamp from the deadline to 'now'
> (i.e. as soon as possible) and dequeue the packet. Would that be enough or
> should we use the delta parameter of the qdisc on this case and make [txtime =
> now + delta]? The only benefit of doing so would be to provide a configurable
> 'fudge' factor.

Well, that really depends on how your deadline scheduler works.

> Another question for this mode (but perhaps that applies to both modes) is, what
> if the qdisc misses the deadline for *any* reason? I'm assuming it should drop
> the packet during dequeue.

There the question is how user space is notified about that issue. The
application which queued the packet on time does rightfully assume that
it's going to be on the wire on time. This is a violation of the overall
scheduling plan, so you need to have a sane design to handle that.

> Putting it all together, we end up with:
>
> 1) a new txtime aware qdisc, tbs, to be used per queue. Its cli will look like:
> $ tc qdisc add (...) tbs clockid CLOCK_REALTIME delta 150000 offload sorting

Why CLOCK_REALTIME? The only interesting time in a TSN network is
CLOCK_TAI, really.

> 2) a new cmsg-interface for setting a per-packet timestamp that will be used
> either as a txtime or as deadline by tbs (and further the NIC driver for the
> offload case): SCM_TXTIME.
>
> 3) a new socket option: SO_TXTIME. It will be used to enable the feature for a
> socket, and will have as parameters a clockid and a txtime mode (deadline or
> explicit), that defines the semantics of the timestamp set on packets using
> SCM_TXTIME.
>
> 4) a new #define DYNAMIC_CLOCKID 15 added to include/uapi/linux/time.h .

Can you remind me why we would need that?

> 5) a new schedule-aware qdisc, 'tas' or 'taprio', to be used per port.
> Its cli will look like what was proposed for taprio (base time being an
> absolute timestamp).
>
> If we all agree with the above, we will start by closing on 1-4 asap and will
> focus on 5 next.
>
> How does that sound?

Backwards to be honest.

You should start with the NIC facing qdisc because that's the key part of
all this and the design might have implications on how the qdiscs which
feed into it need to be designed.

Thanks,

	tglx
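
For reference, two points from the discussion above can be put into a
small model: the explicit-mode dequeue rule [(txtime - delta) < now], and
the requirement that the qdisc, not the socket, defines the mode and
rejects requests with the wrong one. What follows is only a rough sketch
in Python, not kernel code. All names in it (TxMode, Packet, TbsSketch,
enqueue, dequeue) are hypothetical and do not come from the tbs patches.
The deadline branch uses the [txtime = now + delta] conversion floated in
the quoted text purely as an assumption, and missed-deadline handling is
left out because the thread leaves open how user space would be notified.

```python
import heapq
from dataclasses import dataclass, field
from enum import Enum


class TxMode(Enum):
    EXPLICIT = "explicit"  # timestamp is the exact transmission time
    DEADLINE = "deadline"  # timestamp is the latest acceptable time


@dataclass(order=True)
class Packet:
    timestamp: int                       # nanoseconds; the only sort key
    mode: TxMode = field(compare=False)  # mode requested by the sender
    payload: bytes = field(compare=False, default=b"")


class TbsSketch:
    """Hypothetical single-mode, timestamp-sorted queue (illustration only)."""

    def __init__(self, mode: TxMode, delta_ns: int):
        self.mode = mode          # fixed per qdisc instance
        self.delta_ns = delta_ns
        self._heap = []           # min-heap ordered by Packet.timestamp

    def enqueue(self, pkt: Packet) -> bool:
        # The qdisc defines the mode; a request with the wrong mode is rejected.
        if pkt.mode is not self.mode:
            return False
        heapq.heappush(self._heap, pkt)
        return True

    def dequeue(self, now_ns: int):
        if not self._heap:
            return None
        if self.mode is TxMode.EXPLICIT:
            # Rule from the thread: eligible once (txtime - delta) < now.
            if self._heap[0].timestamp - self.delta_ns < now_ns:
                return heapq.heappop(self._heap)
            return None
        # Deadline mode (assumption): hand out the earliest deadline first
        # and rewrite its timestamp to txtime = now + delta.
        pkt = heapq.heappop(self._heap)
        pkt.timestamp = now_ns + self.delta_ns
        return pkt
```

With delta_ns=150000, a packet whose txtime is 1000000 is held back at
now=800000 and released at now=900000. A deadline-mode packet offered to
an explicit-mode instance is refused at enqueue time, so two users who
want different modes need two separate instances.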