From: Thomas Gleixner <tglx@linutronix.de>
To: Jesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com>
Cc: netdev@vger.kernel.org, jhs@mojatatu.com,
	xiyou.wangcong@gmail.com, jiri@resnulli.us,
	vinicius.gomes@intel.com, richardcochran@gmail.com,
	intel-wired-lan@lists.osuosl.org, anna-maria@linutronix.de,
	henrik@austad.us, john.stultz@linaro.org,
	levi.pearson@harman.com, edumazet@google.com, willemb@google.com,
	mlichvar@redhat.com
Subject: Re: [RFC v3 net-next 14/18] net/sched: Add HW offloading capability to TBS
Date: Wed, 21 Mar 2018 15:22:11 +0100 (CET)
Message-ID: <alpine.DEB.2.21.1803211448310.3754@nanos.tec.linutronix.de>
In-Reply-To: <20180307011230.24001-15-jesus.sanchez-palencia@intel.com>

On Tue, 6 Mar 2018, Jesus Sanchez-Palencia wrote:
> $ tc qdisc replace dev enp2s0 parent root handle 100 mqprio num_tc 3 \
>            map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 queues 1@0 1@1 2@2 hw 0
> 
> $ tc qdisc add dev enp2s0 parent 100:1 tbs offload
> 
> In this example, the Qdisc will use HW offload for the control of the
> transmission time through the network adapter. It's assumed the timestamp
> in skbuffs are in reference to the interface's PHC and setting any other
> valid clockid would be treated as an error. Because there is no
> scheduling being performed in the qdisc, setting a delta != 0 would also
> be considered an error.

Which clockid will be handed in from the application? The network adapter
time has no fixed clockid. The only way you can get to it is via a fd based
posix clock and that does not work at all because the qdisc setup might
have a different FD than the application which queues packets.

I think this should look like this:

    clock_adapter:	1 = clock of the network adapter
    			0 = system clock selected by clock_system

    clock_system:	0 = CLOCK_REALTIME
    			1 = CLOCK_MONOTONIC

or something like that.
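
A minimal sketch of that two-field selector, for illustration only. The
struct name, the bitfield layout and the helper are assumptions and do not
come from the patch set:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>

/* The proposed clock_system values mirror the existing clockid numbers. */
static_assert(CLOCK_REALTIME == 0 && CLOCK_MONOTONIC == 1,
	      "clock_system values are expected to match the clockids");

/* Hypothetical two-bit selector; names and layout are assumptions. */
struct txtime_clock_sel {
	unsigned int clock_adapter:1;	/* 1 = clock of the network adapter */
	unsigned int clock_system:1;	/* 0 = CLOCK_REALTIME, 1 = CLOCK_MONOTONIC */
};

/*
 * Return the system clockid to use, or -1 when the adapter clock is
 * selected and clock_system is therefore ignored.
 */
static int txtime_system_clockid(struct txtime_clock_sel sel)
{
	if (sel.clock_adapter)
		return -1;
	return sel.clock_system ? CLOCK_MONOTONIC : CLOCK_REALTIME;
}
```

With such a selector the application never has to name the adapter clock
through a file descriptor, which is the problem described above.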

> Example 2:
> 
> $ tc qdisc replace dev enp2s0 parent root handle 100 mqprio num_tc 3 \
>            map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 queues 1@0 1@1 2@2 hw 0
> 
> $ tc qdisc add dev enp2s0 parent 100:1 tbs offload delta 100000 \
> 	   clockid CLOCK_REALTIME sorting
> 
> Here, the Qdisc will use HW offload for the txtime control again,
> but now sorting will be enabled, and thus there will be scheduling being
> performed by the qdisc. That is done based on the clockid CLOCK_REALTIME
> reference and packets leave the Qdisc "delta" (100000) nanoseconds before
> their transmission time. Because this will be using HW offload and
> since dynamic clocks are not supported by the hrtimer, the system clock
> and the PHC clock must be synchronized for this mode to behave as expected.

So what you do here is queue the packets in the qdisc and then schedule
them at some point ahead of actual transmission time for delivery to the
hardware. That delivery uses the same txtime as used for qdisc scheduling
to tell the hardware when the packet should go on the wire. That's needed
when the network adapter does not support queueing of multiple packets.
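
The timing relation can be sketched as follows. This is only an
illustration under the assumption that "now" and txtime are read from the
same clock; the helper names are hypothetical and not taken from the patch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: the time at which the qdisc hands a packet to the
 * hardware. delta_ns is the configured "delta" (100000 in Example 2).
 */
static int64_t tbs_release_time_ns(int64_t txtime_ns, int64_t delta_ns)
{
	assert(delta_ns >= 0);	/* delta is an offset ahead of txtime */
	return txtime_ns - delta_ns;
}

/* True once the packet should leave the qdisc towards the hardware. */
static bool tbs_packet_is_due(int64_t now_ns, int64_t txtime_ns,
			      int64_t delta_ns)
{
	return now_ns >= tbs_release_time_ns(txtime_ns, delta_ns);
}
```

The hardware is still handed the unmodified txtime; delta only moves the
point at which the packet leaves the qdisc.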

Bah, and probably there you need CLOCK_TAI because that's what PTP is based
on, so clock_system needs to accommodate that as well. Dammit, there goes
the simple 2-bit implementation. CLOCK_TAI is 11, so we'd need 4 clock
bits plus the adapter bit.

Though we could spare a bit. The fixed CLOCK_* space goes from 0 to 15. I
don't see us adding new fixed clocks, so we really can reserve #15 for
selecting the adapter clock if sparing that extra bit is truly required.
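
A rough sketch of the variant that reserves #15, assuming the clock
selection lives in the low four bits of some flags word. All macro and
function names here are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>

#define TXTIME_CLOCK_BITS	4	/* covers the fixed clockids 0..15 */
#define TXTIME_CLOCK_MASK	((1u << TXTIME_CLOCK_BITS) - 1)
#define TXTIME_CLOCK_TAI	11	/* value of CLOCK_TAI on Linux */
#define TXTIME_CLOCK_ADAPTER	15	/* hypothetical: reserved for the adapter clock */

static_assert(TXTIME_CLOCK_TAI <= TXTIME_CLOCK_MASK,
	      "CLOCK_TAI must fit into the clock bits");
static_assert(TXTIME_CLOCK_ADAPTER == TXTIME_CLOCK_MASK,
	      "the reserved value is the last slot of the fixed clock space");

/* Extract the clock selection from the flags word. */
static unsigned int txtime_clockid(unsigned int flags)
{
	return flags & TXTIME_CLOCK_MASK;
}

/* True when the reserved value selects the network adapter clock. */
static bool txtime_uses_adapter_clock(unsigned int flags)
{
	return txtime_clockid(flags) == TXTIME_CLOCK_ADAPTER;
}
```

Compared to a separate adapter bit this needs four bits instead of five, at
the price of making clockid 15 unavailable for any future fixed clock.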

Thanks,

	tglx
