linux-kernel.vger.kernel.org archive mirror
From: "Geva, Erez" <erez.geva.ext@siemens.com>
To: Vinicius Costa Gomes <vinicius.gomes@intel.com>,
	Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Cc: Network Development <netdev@vger.kernel.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	"linux-arch@vger.kernel.org" <linux-arch@vger.kernel.org>,
	Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>,
	Arnd Bergmann <arnd@arndb.de>,
	Cong Wang <xiyou.wangcong@gmail.com>,
	"David S . Miller" <davem@davemloft.net>,
	Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>,
	Jakub Kicinski <kuba@kernel.org>,
	Jamal Hadi Salim <jhs@mojatatu.com>,
	Jiri Pirko <jiri@resnulli.us>,
	Alexei Starovoitov <ast@kernel.org>,
	Colin Ian King <colin.king@canonical.com>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Eric Dumazet <edumazet@google.com>,
	Eyal Birger <eyal.birger@gmail.com>,
	"Gustavo A . R . Silva" <gustavoars@kernel.org>,
	Jakub Sitnicki <jakub@cloudflare.com>,
	John Ogness <john.ogness@linutronix.de>,
	Jon Rosen <jrosen@cisco.com>, Kees Cook <keescook@chromium.org>,
	Marc Kleine-Budde <mkl@pengutronix.de>,
	Martin KaFai Lau <kafai@fb.com>,
	Matthieu Baerts <matthieu.baerts@tessares.net>,
	Andrei Vagin <avagin@gmail.com>,
	Dmitry Safonov <0x7f454c46@gmail.com>,
	"Eric W . Biederman" <ebiederm@xmission.com>,
	Ingo Molnar <mingo@kernel.org>,
	John Stultz <john.stultz@linaro.org>,
	Miaohe Lin <linmiaohe@huawei.com>,
	Michal Kubecek <mkubecek@suse.cz>,
	Or Cohen <orcohen@paloaltonetworks.com>,
	Oleg Nesterov <oleg@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Richard Cochran <richardcochran@gmail.com>,
	Stefan Schmidt <stefan@datenfreihafen.org>,
	Xie He <xie.he.0141@gmail.com>, Stephen Boyd <sboyd@kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Vladis Dronov <vdronov@redhat.com>,
	Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	Frederic Weisbecker <frederic@kernel.org>,
	Vedang Patel <vedang.patel@intel.com>,
	"Sudler, Simon" <simon.sudler@siemens.com>,
	"Meisinger, Andreas" <andreas.meisinger@siemens.com>,
	"henning.schild@siemens.com" <henning.schild@siemens.com>,
	"jan.kiszka@siemens.com" <jan.kiszka@siemens.com>,
	"Zirkler, Andreas" <andreas.zirkler@siemens.com>
Subject: Re: [PATCH 1/3] Add TX sending hardware timestamp.
Date: Fri, 11 Dec 2020 14:44:21 +0000
Message-ID: <VI1PR10MB24469F42655B66B16DF25B6DABCA0@VI1PR10MB2446.EURPRD10.PROD.OUTLOOK.COM>
In-Reply-To: <87r1nxxk3u.fsf@intel.com>


On 11/12/2020 01:27, Vinicius Costa Gomes wrote:
> Willem de Bruijn <willemdebruijn.kernel@gmail.com> writes:
>
>>>> If I understand correctly, you are trying to achieve a single delivery time.
>>>> The need for two separate timestamps passed along is only because the
>>>> kernel is unable to do the time base conversion.
>>>
>>> Yes, a correct point.
>>>
>>>>
>>>> Else, ETF could program the qdisc watchdog in system time and later,
>>>> on dequeue, convert skb->tstamp to the h/w time base before
>>>> passing it to the device.
>>>
>>> Or skb->tstamp is the HW timestamp and ETF converts it to the system clock base.
>>>
>>>>
>>>> It's still not entirely clear to me why the packet has to be held by
>>>> ETF initially first, if it is held until delivery time by hardware
>>>> later. But more on that below.
>>>
>>> Let's plot a simple scenario.
>>> App A sends a packet with timestamp 100.
>>> Afterwards, a second packet arrives from App B with timestamp 90.
>>> Without ETF, the second packet has to wait until the interface hardware sends the first packet at 100,
>>> making the second packet late by 10 + the first packet's send time.
>>> Obviously other "normal" packets are sent to the non-ETF queue, so they do not block ETF packets.
>>> The ETF delta is a barrier: the application has to send the packet before it to ensure the packet is not tossed.
>>
>> Got it. The assumption here is that devices are FIFO. That is not
>> necessarily the case, but I do not know whether it is in practice,
>> e.g., on the i210.
>
> On the i210 and i225, that's indeed the case, i.e. only the launch time
> of the packet at the front of the queue is considered.
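
To make the FIFO scenario above concrete, here is a minimal sketch in plain
user-space C. It is not kernel or driver code. It assumes a queue where only
the launch time of the head packet is considered, and it ignores the wire
time of each packet. The names fifo_launch_times() and sort_by_txtime() are
hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Model of a FIFO hardware queue: a packet cannot launch before its own
 * txtime, nor before the packet ahead of it has launched.
 * Simplification: the time a packet spends on the wire is ignored.
 */
void fifo_launch_times(const uint64_t *txtime, size_t n, uint64_t *launch)
{
	uint64_t prev = 0;

	assert(txtime != NULL && launch != NULL);
	for (size_t i = 0; i < n; i++) {
		launch[i] = txtime[i] > prev ? txtime[i] : prev;
		prev = launch[i];
	}
}

/* Three-way comparison of two txtime values for qsort(). */
static int cmp_txtime(const void *a, const void *b)
{
	uint64_t x = *(const uint64_t *)a;
	uint64_t y = *(const uint64_t *)b;

	return (x > y) - (x < y);
}

/* Stand-in for the ordering a time-sorted qdisc provides before the FIFO. */
void sort_by_txtime(uint64_t *txtime, size_t n)
{
	assert(txtime != NULL);
	qsort(txtime, n, sizeof(*txtime), cmp_txtime);
}
```

With the queue {100, 90}, fifo_launch_times() yields launch times 100 and
100, so the second packet is late by 10. After sort_by_txtime() the same
packets launch at 90 and 100.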
>
> [...]
>
>>>>>>>> It only requires that pacing qdiscs, both sch_etf and sch_fq,
>>>>>>>> optionally skip queuing in their .enqueue callback and instead allow
>>>>>>>> the skb to pass to the device driver as is, with skb->tstamp set. Only
>>>>>>>> to devices that advertise support for h/w pacing offload.
>>>>>>>>
>>>>>>> I did not use "Fair Queue traffic policing".
>>>>>>> As for ETF, it is all about ordering packets from different applications.
>>>>>>> How can we achieve it while skipping queuing?
>>>>>>> Could you elaborate on this point?
>>>>>>
>>>>>> The qdisc can only defer pacing to hardware if hardware can ensure the
>>>>>> same invariants on ordering, of course.
>>>>>
>>>>> Yes, this is why we suggest that ETF order packets using the hardware timestamp
>>>>> and pass the packet on based on system time.
>>>>> So ETF queries the system clock only and not the PHC.
>>>>
>>>> On which note: with this patch set all applications have to agree to
>>>> use h/w time base in etf_enqueue_timesortedlist. In practice that
>>>> makes this h/w mode a qdisc used by a single process?
>>>
>>> A single process theoretically does not need ETF; it can just set skb->tstamp and use a pass-through queue.
>>> However, the only way now to set TC_SETUP_QDISC_ETF in the driver is by using ETF.
>>
>> Yes, and I'd like to eventually get rid of this constraint.
>>
>
> I'm interested in these kinds of ideas :-)
>
> What would be your end goal? Something like:
>   - Any application is able to set SO_TXTIME;
>   - We would have a best effort support for scheduling packets based on
>   their transmission time enabled by default;
>   - If the hardware supports it, there would be an "offload" flag that could
>   be enabled;
>
> More or less this?

Activating SO_TXTIME is what causes the SKB to enter the matching ETF qdisc.
If no ETF qdisc is set, or if the SO_TXTIME clock ID is not TAI, the SKB
passes directly to the driver.
So an application can use SO_TXTIME as is and set skb->tstamp.
There is no need to change anything for SO_TXTIME.
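
For reference, a minimal user-space sketch of how an application uses
SO_TXTIME today: enable the option once on the socket, then attach the
desired transmit time to each sendmsg() call as an SCM_TXTIME control
message. The helper names txtime_enable() and txtime_attach() are
hypothetical. The fallback value 61 is the asm-generic definition and is an
assumption for older libc headers; some architectures use a different number.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <time.h>
#include <sys/socket.h>
#include <linux/net_tstamp.h>	/* struct sock_txtime */

#ifndef SO_TXTIME
#define SO_TXTIME 61		/* asm-generic value; assumption for old headers */
#endif
#ifndef SCM_TXTIME
#define SCM_TXTIME SO_TXTIME
#endif

/* Enable SO_TXTIME on fd for the given clock, e.g. CLOCK_TAI. */
int txtime_enable(int fd, clockid_t clkid)
{
	struct sock_txtime cfg = { .clockid = clkid, .flags = 0 };

	return setsockopt(fd, SOL_SOCKET, SO_TXTIME, &cfg, sizeof(cfg));
}

/*
 * Attach txtime_ns to msg as an SCM_TXTIME control message.
 * ctrl must be suitably aligned and at least CMSG_SPACE(sizeof(uint64_t))
 * bytes long. Returns 0 on success, -1 if the buffer is too small.
 */
int txtime_attach(struct msghdr *msg, void *ctrl, size_t ctrl_len,
		  uint64_t txtime_ns)
{
	struct cmsghdr *cm;

	assert(msg != NULL && ctrl != NULL);
	if (ctrl_len < CMSG_SPACE(sizeof(txtime_ns)))
		return -1;

	memset(ctrl, 0, ctrl_len);
	msg->msg_control = ctrl;
	msg->msg_controllen = CMSG_SPACE(sizeof(txtime_ns));

	cm = CMSG_FIRSTHDR(msg);
	cm->cmsg_level = SOL_SOCKET;
	cm->cmsg_type = SCM_TXTIME;
	cm->cmsg_len = CMSG_LEN(sizeof(txtime_ns));
	memcpy(CMSG_DATA(cm), &txtime_ns, sizeof(txtime_ns));
	return 0;
}
```

A caller would run txtime_enable(fd, CLOCK_TAI) once after socket(), then
for each packet fill in msg_iov and msg_name, call txtime_attach() with the
launch time in nanoseconds, and pass the msghdr to sendmsg().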

As for setting TC_SETUP_QDISC_ETF on a driver queue:
we can add a netlink message for it.
What about the other TC_SETUP_QDISC_XXX types, such as CBS?

>
>
> Cheers.
>
