Date: Wed, 21 Mar 2018 23:29:26 +0100 (CET)
From: Thomas Gleixner <tglx@linutronix.de>
To: Jesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com>
Cc: netdev@vger.kernel.org, jhs@mojatatu.com, xiyou.wangcong@gmail.com,
    jiri@resnulli.us, vinicius.gomes@intel.com, richardcochran@gmail.com,
    anna-maria@linutronix.de, henrik@austad.us, John Stultz,
    levi.pearson@harman.com, edumazet@google.com, willemb@google.com,
    mlichvar@redhat.com
Subject: Re: [RFC v3 net-next 13/18] net/sched: Introduce the TBS Qdisc
References: <20180307011230.24001-1-jesus.sanchez-palencia@intel.com>
 <20180307011230.24001-14-jesus.sanchez-palencia@intel.com>

On Wed, 21 Mar 2018, Thomas Gleixner wrote:

> If you look at the use cases of TDM in various fields then FIFO mode is
> pretty much useless. In industrial/automotive fieldbus applications the
> various time slices are filled by different threads or even processes.

That brings me to a related question. The TDM cases I'm familiar with
which aim to use this utilize multiple periodic time slices, aka
802.1Qbv time-aware scheduling.

Simple example:

[1a][1b][1c][1d]                [1a][1b][1c][1d]                [.....
                [2a][2b]                        [2c][2d]
                        [3a]                            [3b]
                            [4a]                            [4b]
----------------------------------------------------------------------> t

where 1-4 is the slice level and a-d are network nodes.

In most cases the slice levels on a node are handled by different
applications or threads. Some of the protocols utilize dedicated time
slice levels - let's assume '4' in the above example - to run general
network traffic which might even be allowed to have collisions, i.e.
[4a-d] would become [4] and any node can send; the involved components
like switches are supposed to handle that.

I'm not seeing how TBS is going to assist with any of that. It requires
everything to be handled at the application level. Not really useful,
especially not for general traffic which does not know about the
scheduling bands at all.

If you look at an industrial control node, it basically does:

    queue_first_packet(tx, slice1);
    while (!stop) {
            if (wait_for_packet(rx) == ERROR)
                    goto errorhandling;
            tx = do_computation(rx);
            queue_next_tx(tx, slice1);
    }

That's a pretty common pattern for these kinds of applications. For
audio sources, queue_next() might be triggered by the input sampler,
which needs to be synchronized to the network slices anyway in order
to work properly.

TBS per the current implementation is nice as a proof of concept, but
it solves just a small portion of the complete problem space. I have
the suspicion that this was 'designed' to replace the user space hack
in the AVNU stack with something close to it. Not really a good plan,
to be honest.

I think what we really want is a strict periodic scheduler which
supports multiple slices as shown above, because that's what all
relevant TDM use cases need: A/V, industrial fieldbuses .....
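
To make that concrete: the configuration such a scheduler consumes is
essentially a cyclic slice table - in 802.1Qbv terms a gate control
list. A minimal sketch of what that could look like in C; the names
and the layout are made up purely for illustration, this is not a
proposed interface:

    #include <stdint.h>

    /*
     * One entry of the cyclic schedule: which slice level is open and
     * for how long. Illustrative only.
     */
    struct slice_entry {
            uint32_t slice;          /* slice level, 1-4 in the example */
            uint64_t duration_ns;    /* length of the slice */
    };

    struct slice_schedule {
            uint64_t base_time_ns;   /* cycle start in network time */
            uint64_t cycle_time_ns;  /* sum of all entry durations */
            unsigned int num_entries;
            struct slice_entry entries[];
    };

The base time pins the cycle to the synchronized network clock, so all
nodes agree on when a given slice opens.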
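
The dequeue side then boils down to mapping 'now' to the currently
open slice and only releasing packets queued for that slice. Again
just a sketch under the same made-up names, assuming a clock which is
synchronized to the network (e.g. via PTP):

    /*
     * Map the current network time to the open slice. Assumes that
     * now_ns >= base_time_ns and that the entry durations add up to
     * cycle_time_ns.
     */
    static uint32_t current_slice(const struct slice_schedule *s,
                                  uint64_t now_ns)
    {
            uint64_t offset = (now_ns - s->base_time_ns) % s->cycle_time_ns;
            unsigned int i;

            for (i = 0; i < s->num_entries; i++) {
                    if (offset < s->entries[i].duration_ns)
                            return s->entries[i].slice;
                    offset -= s->entries[i].duration_ns;
            }
            return 0;    /* not reached if the schedule is consistent */
    }

Packets are queued per slice and the scheduler only hands packets from
the queue whose slice is currently open to the NIC. Hanging those
per-slice queues off the scheduler gives a picture like this: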
   |---------------------------------------------------------|
   |                                                         |
   |                           TAS                           |<- Config
   |      1             2             3             4        |
   |---------------------------------------------------------|
          |             |             |             |
          |             |             |             |
          |             |             |             |
          |             |             |             |
   [DirectSocket] [Qdisc FIFO]  [Qdisc Prio]  [Qdisc FIFO]
                        |             |             |
                        |             |             |
                    [Socket]      [Socket]  [General traffic]

The interesting thing here is that it does not require any time stamp
information brought in from the application. That's especially good
for general network traffic which is routed through a dedicated time
slot. If we don't have that, then we need a user space scheduler which
does exactly the same thing, and we have to route the general traffic
out to user space and back into the kernel, which is obviously a
pointless exercise.

There are all kinds of TDM schemes out there which are not directly
driven by applications, but rather route categorized traffic like
VLANs through dedicated time slices. That works pretty well with the
above scheme because in that case the applications might be completely
oblivious of the tx time schedule.

Surely there are protocols which do not utilize every time slice they
could use, so we need a way to tell the number of empty slices between
two consecutive packets. There are also different policies for the
unused time slices, like sending dummy frames or sending nothing at
all, which needs to be addressed, but I don't think that changes the
general approach.

There might be some special cases for setup or node hotplug, but the
protocols I'm familiar with handle these in dedicated time slices or
through general traffic, so it should just fit in.

I'm surely missing some details, but from my knowledge about the
protocols which want to utilize this, the general direction should be
fine. Feel free to tell me that I'm missing the point completely
though :)

Thoughts?

Thanks,

	tglx