linux-kernel.vger.kernel.org archive mirror
From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
To: Fred Klassen <fklassen@appneta.com>
Cc: "David S. Miller" <davem@davemloft.net>,
	Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>,
	Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>,
	Shuah Khan <shuah@kernel.org>,
	Network Development <netdev@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	linux-kselftest@vger.kernel.org
Subject: Re: [PATCH net 1/4] net/udp_gso: Allow TX timestamp with UDP GSO
Date: Sun, 26 May 2019 21:09:03 -0500
Message-ID: <CAF=yD-+h2qJP0M5XQrcFVfyn3TP7Jd0UJ1zFf0kbUeC9uKKNxQ@mail.gmail.com>
In-Reply-To: <CAF=yD-KTJGYY-yf=+zwa8SyrCNAfZjqjomJ=B=yFcs+juDeShA@mail.gmail.com>

On Sun, May 26, 2019 at 8:30 PM Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
>
> On Sat, May 25, 2019 at 1:47 PM Fred Klassen <fklassen@appneta.com> wrote:
> >
> >
> >
> > > On May 25, 2019, at 8:20 AM, Willem de Bruijn <willemdebruijn.kernel@gmail.com> wrote:
> > >
> > > On Fri, May 24, 2019 at 6:01 PM Fred Klassen <fklassen@appneta.com> wrote:
> > >>
> > >>
> > >>
> > >>> On May 24, 2019, at 12:29 PM, Willem de Bruijn <willemdebruijn.kernel@gmail.com> wrote:
> > >>>
> > >>> It is the last moment that a timestamp can be generated for the last
> > >>> byte, I don't see how that is "neither the start nor the end of a GSO
> > >>> packet".
> > >>
> > >> My misunderstanding. I thought TCP did last segment timestamping, not
> > >> last byte. In that case, your statements make sense.
> > >>
> > >>>> It would be interesting if a practical case can be made for timestamping
> > >>>> the last segment. In my mind, I don’t see how that would be valuable.
> > >>>
> > >>> It depends whether you are interested in measuring network latency or
> > >>> host transmit path latency.
> > >>>
> > >>> For the latter, knowing the time from the start of the sendmsg call to
> > >>> the moment the last byte hits the wire is most relevant. Or in absence
> > >>> of (well defined) hardware support, the last byte being queued to the
> > >>> device is the next best thing.
> > >
> > > Sounds to me like both cases have a legitimate use case, and we want
> > > to support both.
> > >
> > > Implementation constraints are that storage for this timestamp
> > > information is scarce and we cannot add new cold cacheline accesses in
> > > the datapath.
> > >
> > > The simplest approach would be to unconditionally timestamp both the
> > > first and last segment. With the same ID. Not terribly elegant. But it
> > > works.
> > >
> > > If conditional, tx_flags has only one bit left. I think we can harvest
> > > some, as not all defined bits are in use at the same stages in the
> > > datapath, but that is not a trivial change. Some might also better be
> > > set in the skb, instead of skb_shinfo. Which would also avoid
> > > touching that cacheline. We could possibly repurpose bits from u32
> > > tskey.
> > >
> > > All that can come later. Initially, unless we can come up with
> > > something more elegant, I would suggest that UDP follows the rule
> > > established by TCP and timestamps the last byte. And we add an
> > > explicit SOF_TIMESTAMPING_OPT_FIRSTBYTE that is initially only
> > > supported for UDP, sets a new SKBTX_TX_FB_TSTAMP bit in
> > > __sock_tx_timestamp and is interpreted in __udp_gso_segment.
> > >
> >
> > I don’t see how to practically TX timestamp the last byte of any packet
> > (UDP GSO or otherwise). The best we could do is timestamp the last
> > segment, or rather the time that the last segment is queued. Let me
> > attempt to explain.
> >
> > First let’s look at software TX timestamps, which are generated
> > by skb_tx_timestamp() in nearly every network driver’s xmit routine. It
> > states:
> >
> > —————————— cut ————————————
> >  * Ethernet MAC Drivers should call this function in their hard_xmit()
> >  * function immediately before giving the sk_buff to the MAC hardware.
> > —————————— cut ————————————
> >
> > That means that the sk_buff will get timestamped just before rather
> > than just after it is sent. To truly capture the timestamp of the last
> > byte, this routine would have to be called a second time, right
> > after sending to MAC hardware. Then the user program would have to
> > sort out the 2 timestamps. My guess is that this isn’t something that
> > NIC vendors would be willing to implement in their drivers.
> >
> > So, the best we can do is timestamp just before the last segment.
> > Suppose UDP GSO sends 3000 bytes to a 1500 byte MTU adapter.
> > If we set SKBTX_HW_TSTAMP flag on the last segment, the timestamp
> > occurs half way through the burst. But it may not be exactly half way
> > because the segments may get queued much faster than wire rate.
> > Therefore the time between segment 1 and segment 2 may be much
> > smaller than their spacing on the wire. I would not find this
> > useful.
>
> For measuring host queueing latency, a timestamp at the existing
> skb_tx_timestamp() for the last segment is perfectly informative.

In most cases all segments will be sent in a single xmit_more train,
in which case the device doorbell is rung when the last segment is
queued.

A device may also pause in the middle of a train, causing the rest of
the list to be requeued and resent after a tx completion frees up
descriptors and wakes the device. This seems like a relevant exception
to be able to measure.

That said, I am not opposed to the first segment, if we have to make a
binary choice for a default. Either option has cons. See more specific
revision requests in the v2 patch.
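
To make the two cases above concrete, here is a toy userspace model in
C. It is only a sketch under simplifying assumptions, not kernel code:
the function name doorbells_for_train() and the parameters nsegs and
free_desc are made up for illustration, and it assumes that every tx
completion frees up the same number of descriptors.

```c
#include <assert.h>

/*
 * Toy model, not kernel code. All names here are hypothetical.
 *
 * nsegs:     number of segments produced from one GSO packet
 * free_desc: tx descriptors available each time the device is (re)started
 *
 * Returns how many times the doorbell is rung before the last segment
 * has been queued to the device.
 */
static unsigned int doorbells_for_train(unsigned int nsegs,
					unsigned int free_desc)
{
	unsigned int queued = 0;
	unsigned int doorbells = 0;

	assert(nsegs > 0 && free_desc > 0);

	while (queued < nsegs) {
		unsigned int batch = nsegs - queued;

		/* The ring is full: the rest of the list is requeued. */
		if (batch > free_desc)
			batch = free_desc;

		queued += batch;
		/* One doorbell per batch, rung after its last segment. */
		doorbells++;
	}

	return doorbells;
}
```

With nsegs = 2 and free_desc = 8 the model returns 1: the first and
last segment are handed to the device by the same doorbell. With
nsegs = 45 and free_desc = 16 it returns 3, i.e. the last segment has
to wait for two tx completions. That wait is what a last-segment
timestamp would expose and a first-segment timestamp would not.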
