* Extending socket timestamping API for NTP
@ 2017-02-07 14:01 Miroslav Lichvar
  2017-02-07 17:45 ` Keller, Jacob E
                   ` (6 more replies)
  0 siblings, 7 replies; 47+ messages in thread
From: Miroslav Lichvar @ 2017-02-07 14:01 UTC (permalink / raw)
  To: netdev
  Cc: Richard Cochran, Jiri Benc, Keller, Jacob E, Denny Page,
	Willem de Bruijn

I'd like to propose some changes and new options for the timestamping
interface that I think would be useful for NTP implementations and
maybe also other applications. Before I or someone else tries to
implement them, do you think they would actually make sense and fit
well in the current code?

1) new rx_filter for NTP

   Some NICs can't timestamp all received packets and are currently
   unusable for NTP with HW timestamping. The new filter would allow
   NTP support in new NICs and adding support to existing NICs with
   firmware/driver updates. The filter would apply to IPv4 and IPv6
   UDP packets received from or sent to the port number 123.

   Should the current drivers of HW that can timestamp all packets
   be updated to fall back to HWTSTAMP_FILTER_ALL?

2) new SO_TIMESTAMPING option to receive from the error queue only
   user data as was passed to sendmsg() instead of Ethernet frames

   Parsing Ethernet and IP headers (especially IPv6 options) is not
   fun and SOF_TIMESTAMPING_OPT_ID is not always practical, e.g. in
   applications which process messages from the error queue
   asynchronously and don't bind/connect their sockets.

3) target address in msg_name of messages from the error queue

   With 2) and unconnected sockets, there needs to be a way to get the
   address to which the packet was sent. Is it ok to always fill
   msg_name, or does it need to be a new option?

4) allow sockets to use both SW and HW TX timestamping at the same time

   When using a socket which is not bound to a specific interface, it
   would be nice to get transmit SW timestamps when HW timestamps are
   missing. I suspect it's difficult to predict if a HW timestamp will
   be available. Maybe it would be acceptable to get from the error
   queue two messages per transmission if the interface supports both
   SW and HW timestamping?

5) new SO_TIMESTAMPING options to get transposed RX timestamps

   PTP uses preamble RX timestamps, but NTP works with trailer RX
   timestamps. This means NTP implementations currently need to
   transpose HW RX timestamps. The calculation requires the link speed
   and the length of the packet at layer 2. It seems this can be
   reliably done only using raw sockets. It would be very nice if the
   kernel could transpose the timestamps automatically.

   The existing SOF_TIMESTAMPING_RX_HARDWARE flag could be aliased to
   SOF_TIMESTAMPING_RX_HARDWARE_PREAMBLE and the new flag could be
   SOF_TIMESTAMPING_RX_HARDWARE_TRAILER.

   PTP has a similar problem with SW RX timestamps, which are closer
   to the trailer timestamps rather than preamble timestamps. A new
   SOF_TIMESTAMPING_RX_SOFTWARE_PREAMBLE flag could be added for PTP
   implementations to get transposed timestamps in order to improve
   accuracy.

6) new SO_TIMESTAMPING option to get PHC index with HW timestamps

   With bridges, bonding and other things it's difficult to determine
   which PHC timestamped the packet. It would be very useful if the
   PHC index was provided with each HW timestamp.

   I'm not sure what would be the best place to put it. I guess the
   second timespec in scm_timestamping could be reused for this, but
   that sounds like a gross hack. Do we need to define a new struct?

Thoughts?

-- 
Miroslav Lichvar

^ permalink raw reply	[flat|nested] 47+ messages in thread

* RE: Extending socket timestamping API for NTP
  2017-02-07 14:01 Extending socket timestamping API for NTP Miroslav Lichvar
@ 2017-02-07 17:45 ` Keller, Jacob E
  2017-02-07 22:32   ` Willem de Bruijn
  2017-02-08  1:52   ` Denny Page
  2017-02-07 18:54 ` Soheil Hassas Yeganeh
                   ` (5 subsequent siblings)
  6 siblings, 2 replies; 47+ messages in thread
From: Keller, Jacob E @ 2017-02-07 17:45 UTC (permalink / raw)
  To: Miroslav Lichvar, netdev
  Cc: Richard Cochran, Jiri Benc, Denny Page, Willem de Bruijn

Hi Miroslav,

> -----Original Message-----
> From: Miroslav Lichvar [mailto:mlichvar@redhat.com]
> Sent: Tuesday, February 07, 2017 6:02 AM
> To: netdev@vger.kernel.org
> Cc: Richard Cochran <richardcochran@gmail.com>; Jiri Benc
> <jbenc@redhat.com>; Keller, Jacob E <jacob.e.keller@intel.com>; Denny Page
> <dennypage@me.com>; Willem de Bruijn <willemb@google.com>
> Subject: Extending socket timestamping API for NTP
> 
> I'd like to propose some changes and new options for the timestamping
> interface that I think would be useful for NTP implementations and
> maybe also other applications. Before I or someone else tries to
> implement them, do you think they would actually make sense and fit
> well in the current code?
> 
> 1) new rx_filter for NTP
> 
>    Some NICs can't timestamp all received packets and are currently
>    unusable for NTP with HW timestamping. The new filter would allow
>    NTP support in new NICs and adding support to existing NICs with
>    firmware/driver updates. The filter would apply to IPv4 and IPv6
>    UDP packets received from or sent to the port number 123.

The main problem here is that most hardware that *can't* timestamp all packets is pretty much limited to timestamping only PTP frames. It's possible that firmware upgrades could work around this, but I do not know if that would actually happen. Still, it can't really hurt too much to add a new filter, and it should be easy to implement in those drivers that can already support it.

> 
>    Should the current drivers of HW that can timestamp all packets
>    be updated to fall back to HWTSTAMP_FILTER_ALL?

Generally, the drivers I am aware of that support timestamping all packets do so for any filter request, rather than actually limiting the timestamping.

> 
> 2) new SO_TIMESTAMPING option to receive from the error queue only
>    user data as was passed to sendmsg() instead of Ethernet frames
> 
>    Parsing Ethernet and IP headers (especially IPv6 options) is not
>    fun and SOF_TIMESTAMPING_OPT_ID is not always practical, e.g. in
>    applications which process messages from the error queue
>    asynchronously and don't bind/connect their sockets.

This would be useful for application writing.

> 
> 3) target address in msg_name of messages from the error queue
> 
>    With 2) and unconnected sockets, there needs to be a way to get the
>    address to which the packet was sent. Is it ok to always fill
>    msg_name, or does it need to be a new option?


I'm not sure.

> 
> 4) allow sockets to use both SW and HW TX timestamping at the same time
> 
>    When using a socket which is not bound to a specific interface, it
>    would be nice to get transmit SW timestamps when HW timestamps are
>    missing. I suspect it's difficult to predict if a HW timestamp will
>    be available. Maybe it would be acceptable to get from the error
>    queue two messages per transmission if the interface supports both
>    SW and HW timestamping?


This seems useful, but not sure how best to implement it.

> Thoughts?
> 
> --
> Miroslav Lichvar

* Re: Extending socket timestamping API for NTP
  2017-02-07 14:01 Extending socket timestamping API for NTP Miroslav Lichvar
  2017-02-07 17:45 ` Keller, Jacob E
@ 2017-02-07 18:54 ` Soheil Hassas Yeganeh
  2017-02-08 10:14   ` Miroslav Lichvar
  2017-02-07 20:37 ` sdncurious
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 47+ messages in thread
From: Soheil Hassas Yeganeh @ 2017-02-07 18:54 UTC (permalink / raw)
  To: Miroslav Lichvar
  Cc: netdev, Richard Cochran, Jiri Benc, Keller, Jacob E, Denny Page,
	Willem de Bruijn

On Tue, Feb 7, 2017 at 6:01 AM, Miroslav Lichvar <mlichvar@redhat.com> wrote:
> 2) new SO_TIMESTAMPING option to receive from the error queue only
>    user data as was passed to sendmsg() instead of Ethernet frames
>
>    Parsing Ethernet and IP headers (especially IPv6 options) is not
>    fun and SOF_TIMESTAMPING_OPT_ID is not always practical, e.g. in
>    applications which process messages from the error queue
>    asynchronously and don't bind/connect their sockets.

This is going to be quite useful. However, I'm not sure if sending
back the original packet would be a proper API. Instead, one option is
to add a control message, so that applications can set the OPT_ID for
the timestamp. Perhaps something like this, from the user's perspective:

cmsg->cmsg_level             = SOL_SOCKET;
cmsg->cmsg_type              = SCM_TIMESTAMPING_OPT_ID;
cmsg->cmsg_len               = CMSG_LEN(sizeof(__u32));
*((__u32 *) CMSG_DATA(cmsg)) = my_id;
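
A fuller sketch of how an application might build such a message for sendmsg() follows. The cmsg type is only proposed in this thread, so the constant below is a hypothetical placeholder with an arbitrary value, and attach_tx_id() is an illustrative helper name, not an existing API:

```c
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Hypothetical cmsg type standing in for the option proposed above. */
#define SCM_TIMESTAMPING_OPT_ID_HYPOTHETICAL 0x7f01

/*
 * Prepare msg so that sendmsg() would carry the payload plus one control
 * message holding a 32-bit application-chosen ID. cbuf must be aligned for
 * struct cmsghdr and at least CMSG_SPACE(sizeof(uint32_t)) bytes long.
 */
static void attach_tx_id(struct msghdr *msg, struct iovec *iov,
                         void *payload, size_t len,
                         void *cbuf, size_t cbuf_len, uint32_t id)
{
        struct cmsghdr *cmsg;

        memset(msg, 0, sizeof(*msg));
        memset(cbuf, 0, cbuf_len);
        iov->iov_base = payload;
        iov->iov_len = len;
        msg->msg_iov = iov;
        msg->msg_iovlen = 1;
        msg->msg_control = cbuf;
        msg->msg_controllen = CMSG_SPACE(sizeof(id));

        cmsg = CMSG_FIRSTHDR(msg);
        cmsg->cmsg_level = SOL_SOCKET;
        cmsg->cmsg_type = SCM_TIMESTAMPING_OPT_ID_HYPOTHETICAL;
        cmsg->cmsg_len = CMSG_LEN(sizeof(id));
        /* memcpy avoids the unaligned/aliasing store of a pointer cast. */
        memcpy(CMSG_DATA(cmsg), &id, sizeof(id));
}
```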

Thanks,
Soheil

* Re: Extending socket timestamping API for NTP
  2017-02-07 14:01 Extending socket timestamping API for NTP Miroslav Lichvar
  2017-02-07 17:45 ` Keller, Jacob E
  2017-02-07 18:54 ` Soheil Hassas Yeganeh
@ 2017-02-07 20:37 ` sdncurious
  2017-02-08 10:26   ` Miroslav Lichvar
  2017-02-08  1:18 ` Denny Page
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 47+ messages in thread
From: sdncurious @ 2017-02-07 20:37 UTC (permalink / raw)
  To: Miroslav Lichvar
  Cc: netdev, Richard Cochran, Jiri Benc, Keller, Jacob E, Denny Page,
	Willem de Bruijn

On Tue, Feb 7, 2017 at 6:01 AM, Miroslav Lichvar <mlichvar@redhat.com> wrote:


> 5) new SO_TIMESTAMPING options to get transposed RX timestamps
>
>    PTP uses preamble RX timestamps, but NTP works with trailer RX
>    timestamps. This means NTP implementations currently need to
>    transpose HW RX timestamps. The calculation requires the link speed
>    and the length of the packet at layer 2. It seems this can be
>    reliably done only using raw sockets. It would be very nice if the
>    kernel could transpose the timestamps automatically.

Is this a requirement? RFC 5905 does not seem to imply this, and a
search does not show any issues being reported.
From RFC 5905:

   Reference Timestamp: Time when the system clock was last set or
   corrected, in NTP timestamp format.

   Origin Timestamp (org): Time at the client when the request departed
   for the server, in NTP timestamp format.

   Receive Timestamp (rec): Time at the server when the request arrived
   from the client, in NTP timestamp format.

   Transmit Timestamp (xmt): Time at the server when the response left
   for the client, in NTP timestamp format.

   Destination Timestamp (dst): Time at the client when the reply
   arrived from the server, in NTP timestamp format.

   Note: The Destination Timestamp field is not included as a header
   field; it is determined upon arrival of the packet and made available
   in the packet buffer data structure.

   If the NTP has access to the physical layer, then the timestamps are
   associated with the beginning of the symbol after the start of frame.
   Otherwise, implementations should attempt to associate the timestamp
   to the earliest accessible point in the frame.



>
>    The existing SOF_TIMESTAMPING_RX_HARDWARE flag could be aliased to
>    SOF_TIMESTAMPING_RX_HARDWARE_PREAMBLE and the new flag could be
>    SOF_TIMESTAMPING_RX_HARDWARE_TRAILER.
>
>    PTP has a similar problem with SW RX timestamps, which are closer
>    to the trailer timestamps rather than preamble timestamps. A new
>    SOF_TIMESTAMPING_RX_SOFTWARE_PREAMBLE flag could be added for PTP
>    implementations to get transposed timestamps in order to improve
>    accuracy.


>
> 6) new SO_TIMESTAMPING option to get PHC index with HW timestamps
>
>    With bridges, bonding and other things it's difficult to determine
>    which PHC timestamped the packet. It would be very useful if the
>    PHC index was provided with each HW timestamp.
>
>    I'm not sure what would be the best place to put it. I guess the
>    second timespec in scm_timestamping could be reused for this, but
>    that sounds like a gross hack. Do we need to define a new struct?

What is the use case for this? Even if the delay through the PHYs were
known, how would that be compensated?

RMS
>
> Thoughts?
>
> --
> Miroslav Lichvar

* Re: Extending socket timestamping API for NTP
  2017-02-07 17:45 ` Keller, Jacob E
@ 2017-02-07 22:32   ` Willem de Bruijn
  2017-02-08 14:18     ` Soheil Hassas Yeganeh
  2017-02-27 15:23     ` Miroslav Lichvar
  2017-02-08  1:52   ` Denny Page
  1 sibling, 2 replies; 47+ messages in thread
From: Willem de Bruijn @ 2017-02-07 22:32 UTC (permalink / raw)
  To: Keller, Jacob E
  Cc: Miroslav Lichvar, netdev, Richard Cochran, Jiri Benc, Denny Page,
	Willem de Bruijn

>> 2) new SO_TIMESTAMPING option to receive from the error queue only
>>    user data as was passed to sendmsg() instead of Ethernet frames
>>
>>    Parsing Ethernet and IP headers (especially IPv6 options) is not
>>    fun and SOF_TIMESTAMPING_OPT_ID is not always practical, e.g. in
>>    applications which process messages from the error queue
>>    asynchronously and don't bind/connect their sockets.
>
> This would be useful for application writing.

What kind of user data are you suggesting? Just a user-defined ID
passed as a cmsg? Allowing such metadata to override
skb_shinfo(skb)->tskey sounds fine.

>> 3) target address in msg_name of messages from the error queue
>>
>>    With 2) and unconnected sockets, there needs to be a way to get the
>>    address to which the packet was sent. Is it ok to always fill
>>    msg_name, or does it need to be a new option?
>
>
> I'm not sure.

This would be an argument to just loop the original packet.

>> 4) allow sockets to use both SW and HW TX timestamping at the same time
>>
>>    When using a socket which is not bound to a specific interface, it
>>    would be nice to get transmit SW timestamps when HW timestamps are
>>    missing. I suspect it's difficult to predict if a HW timestamp will
>>    be available. Maybe it would be acceptable to get from the error
>>    queue two messages per transmission if the interface supports both
>>    SW and HW timestamping?
>
>
> This seems useful,

Agreed, as long as it is optional so that it does not change the
behavior for existing applications.

> but not sure how best to implement it.

It might be sufficient to just remove the second line in sw_tx_timestamp

static inline void sw_tx_timestamp(struct sk_buff *skb)
{
        if (skb_shinfo(skb)->tx_flags & SKBTX_SW_TSTAMP &&
            !(skb_shinfo(skb)->tx_flags & SKBTX_IN_PROGRESS))
                skb_tstamp_tx(skb, NULL);
}
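
From the application side, asking for both timestamp sources is just a matter of setting both groups of flags in SO_TIMESTAMPING. Per the function quoted above, the software timestamp is skipped once the driver has marked the skb SKBTX_IN_PROGRESS, so with that check in place a socket would see one or the other per packet. A minimal sketch; the helper names are illustrative:

```c
#include <sys/socket.h>
#include <linux/net_tstamp.h>

/* Build an SO_TIMESTAMPING mask asking for TX timestamps from either source. */
static unsigned int tx_timestamping_flags(int want_hw, int want_sw)
{
        /* OPT_ID tags every timestamp with a per-socket key. */
        unsigned int flags = SOF_TIMESTAMPING_OPT_ID;

        if (want_hw)
                flags |= SOF_TIMESTAMPING_TX_HARDWARE |
                         SOF_TIMESTAMPING_RAW_HARDWARE;
        if (want_sw)
                flags |= SOF_TIMESTAMPING_TX_SOFTWARE |
                         SOF_TIMESTAMPING_SOFTWARE;
        return flags;
}

/* Apply the mask to a socket. Returns 0 on success, -1 on error. */
static int enable_tx_timestamping(int fd, int want_hw, int want_sw)
{
        unsigned int flags = tx_timestamping_flags(want_hw, want_sw);

        return setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING,
                          &flags, sizeof(flags));
}
```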

* Re: Extending socket timestamping API for NTP
  2017-02-07 14:01 Extending socket timestamping API for NTP Miroslav Lichvar
                   ` (2 preceding siblings ...)
  2017-02-07 20:37 ` sdncurious
@ 2017-02-08  1:18 ` Denny Page
       [not found] ` <CAHoNx58u=Fze4e5V2Wb_LiBhka1Mzny3zOVNfvuzjnmQ4wBO=Q@mail.gmail.com>
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 47+ messages in thread
From: Denny Page @ 2017-02-08  1:18 UTC (permalink / raw)
  To: Miroslav Lichvar
  Cc: netdev, Richard Cochran, Jiri Benc, Keller, Jacob E, Willem de Bruijn


> On Feb 07, 2017, at 06:01, Miroslav Lichvar <mlichvar@redhat.com> wrote:
> 
> 1) new rx_filter for NTP
> 
>   Some NICs can't timestamp all received packets and are currently
>   unusable for NTP with HW timestamping. The new filter would allow
>   NTP support in new NICs and adding support to existing NICs with
>   firmware/driver updates. The filter would apply to IPv4 and IPv6
>   UDP packets received from or sent to the port number 123.
> 

I think this is a good idea. Even if the hardware doesn’t support it, the filtering could be done in the kernel. That would save a huge number of context switches.



> 4) allow sockets to use both SW and HW TX timestamping at the same time
> 
>   When using a socket which is not bound to a specific interface, it
>   would be nice to get transmit SW timestamps when HW timestamps are
>   missing. I suspect it's difficult to predict if a HW timestamp will
>   be available. Maybe it would be acceptable to get from the error
>   queue two messages per transmission if the interface supports both
>   SW and HW timestamping?
> 

Highly agreed. The current interface pretty much forces a socket per physical interface, which should not be necessary.


> 5) new SO_TIMESTAMPING options to get transposed RX timestamps
> 
>   PTP uses preamble RX timestamps, but NTP works with trailer RX
>   timestamps. This means NTP implementations currently need to
>   transpose HW RX timestamps. The calculation requires the link speed
>   and the length of the packet at layer 2. It seems this can be
>   reliably done only using raw sockets. It would be very nice if the
>   kernel could transpose the timestamps automatically.
> 
>   The existing SOF_TIMESTAMPING_RX_HARDWARE flag could be aliased to
>   SOF_TIMESTAMPING_RX_HARDWARE_PREAMBLE and the new flag could be
>   SOF_TIMESTAMPING_RX_HARDWARE_TRAILER.
> 
>   PTP has a similar problem with SW RX timestamps, which are closer
>   to the trailer timestamps rather than preamble timestamps. A new
>   SOF_TIMESTAMPING_RX_SOFTWARE_PREAMBLE flag could be added for PTP
>   implementations to get transposed timestamps in order to improve
>   accuracy.
> 

Also highly agreed.
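
For illustration, the transposition described in 5) amounts to adding the frame's time on the wire to the preamble timestamp. A minimal userspace sketch, assuming the caller already knows the layer 2 length (destination MAC through FCS) and the link speed; transpose_to_trailer() is a made-up name:

```c
#include <stdint.h>
#include <time.h>

/*
 * Shift a preamble (start-of-frame) RX timestamp to the end of the frame.
 * l2_bytes counts the frame from the destination MAC through the FCS;
 * link_mbps is the link speed in megabits per second.
 */
static struct timespec transpose_to_trailer(struct timespec rx,
                                            uint32_t l2_bytes,
                                            uint32_t link_mbps)
{
        uint64_t ns = (uint64_t)rx.tv_nsec +
                      (uint64_t)l2_bytes * 8u * 1000u / link_mbps;

        /* Carry any overflow of the nanosecond field into seconds. */
        rx.tv_sec += (time_t)(ns / 1000000000u);
        rx.tv_nsec = (long)(ns % 1000000000u);
        return rx;
}
```

The hard part raised in the proposal is not this arithmetic but reliably obtaining l2_bytes and the link speed for each packet.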

Denny

* Re: Extending socket timestamping API for NTP
  2017-02-07 17:45 ` Keller, Jacob E
  2017-02-07 22:32   ` Willem de Bruijn
@ 2017-02-08  1:52   ` Denny Page
  2017-02-08  5:27     ` Richard Cochran
  1 sibling, 1 reply; 47+ messages in thread
From: Denny Page @ 2017-02-08  1:52 UTC (permalink / raw)
  To: Keller, Jacob E
  Cc: Miroslav Lichvar, netdev, Richard Cochran, Jiri Benc, Willem de Bruijn

[Resend without rich text]

> On Feb 07, 2017, at 09:45, Keller, Jacob E <jacob.e.keller@intel.com> wrote:
> 
> The main problem here is that most hardware that *can't* timestamp all packets is pretty limited to timestamping only PTP frames.


Most, but not all. The TI DP83630 doesn’t support timestamping for all packets, but it does support either PTP or NTP:

===
2.3.2.3 NTP Packet Timestamp
The DP83630 may be programmed to timestamp NTP packets instead of PTP packets. This operation is enabled by setting the NTP_TS_EN control in the PTP_TXCFG0 register. When configured for NTP timestamps, the DP83630 will timestamp packets with the NTP UDP port number rather than the PTP port number (note that the device cannot be configured to timestamp both PTP and NTP packets). One-Step operation is not supported for NTP timestamps, so transmit timestamps cannot be inserted directly into outgoing NTP packets. Timestamp insertion is available for receive timestamps but must use a single, fixed location. 
===

Right now, there is no API to signal to the driver that NTP timestamping is desired.

Even if the hardware does not directly support filtering, it can be implemented in the driver.
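
As a rough illustration of what such a software filter has to check, here is a userspace sketch of the match for the IPv4 case (UDP, source or destination port 123). A driver would do the equivalent on the skb, and IPv6 with its extension headers is left out; is_ipv4_ntp() is an illustrative name only:

```c
#include <stddef.h>
#include <stdint.h>

#define NTP_PORT 123

/*
 * Returns 1 if buf (starting at the IPv4 header) is a UDP packet with
 * source or destination port 123, else 0. Non-first fragments are
 * rejected because they carry no UDP header.
 */
static int is_ipv4_ntp(const uint8_t *buf, size_t len)
{
        size_t ihl;
        uint16_t sport, dport;

        if (len < 20 || (buf[0] >> 4) != 4)
                return 0;
        ihl = (size_t)(buf[0] & 0x0f) * 4;
        if (ihl < 20 || len < ihl + 8)
                return 0;
        if (buf[9] != 17)                       /* protocol: UDP */
                return 0;
        if ((buf[6] & 0x1f) != 0 || buf[7] != 0) /* fragment offset != 0 */
                return 0;
        sport = (uint16_t)(buf[ihl] << 8 | buf[ihl + 1]);
        dport = (uint16_t)(buf[ihl + 2] << 8 | buf[ihl + 3]);
        return sport == NTP_PORT || dport == NTP_PORT;
}
```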

Denny

* Re: Extending socket timestamping API for NTP
       [not found] ` <CAHoNx58u=Fze4e5V2Wb_LiBhka1Mzny3zOVNfvuzjnmQ4wBO=Q@mail.gmail.com>
@ 2017-02-08  3:06   ` Denny Page
  0 siblings, 0 replies; 47+ messages in thread
From: Denny Page @ 2017-02-08  3:06 UTC (permalink / raw)
  To: sdncurious
  Cc: Miroslav Lichvar, netdev, Richard Cochran, Jiri Benc, Keller,
	Jacob E, Willem de Bruijn

[Resend without rich text]

On Feb 07, 2017, at 12:17, sdncurious <sdncurious@gmail.com> wrote:
>  If the NTP has access to the physical layer, then the timestamps are
>    associated with the beginning of the symbol after the start of frame.
>    Otherwise, implementations should attempt to associate the timestamp
>    to the earliest accessible point in the frame.

The spec is unfortunately a bit ambiguous and probably should be clarified.

NTP is sensitive to transmission asymmetry. While using the SFD is appropriate for transmit timestamps, it is not appropriate for receive timestamps. A simple reason for this is port speed mismatch. Consider a 1Gb entity communicating with a 100Mb entity on a local switch: leaving aside internal switch delays, if SFD timestamping is used for both transmit and receive, then there is a baked in asymmetry of 6768ns between the forward and reverse paths; if SFD is used for transmit, and FCS end is used for receive, there is no asymmetry.

There is a good explanation of this written by David Mills (NTP's author) here: https://www.eecis.udel.edu/~mills/stamp.html#require
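
The 6768ns figure can be reproduced as the difference in wire time of one NTP frame at the two speeds. The sketch below assumes a 94-octet frame (Ethernet header, IPv4, UDP, 48-byte NTP payload, FCS); that layout is an assumption made here because it matches the number, it is not spelled out in the message:

```c
#include <stdint.h>

/* Time to serialize `bytes` octets on a link of `mbps` megabits per second. */
static uint64_t wire_time_ns(uint32_t bytes, uint32_t mbps)
{
        return (uint64_t)bytes * 8u * 1000u / mbps;
}

/* Assumed frame: Ethernet II + IPv4 + UDP + 48-byte NTP payload + FCS. */
enum { NTP_FRAME_BYTES = 14 + 20 + 8 + 48 + 4 };  /* 94 octets after the SFD */

/* Difference in SFD-to-FCS-end time between the slow and the fast link. */
static uint64_t sfd_asymmetry_ns(uint32_t slow_mbps, uint32_t fast_mbps)
{
        return wire_time_ns(NTP_FRAME_BYTES, slow_mbps) -
               wire_time_ns(NTP_FRAME_BYTES, fast_mbps);
}
```

With these assumptions the frame takes 7520ns at 100Mb and 752ns at 1Gb, so sfd_asymmetry_ns(100, 1000) gives 6768.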

Denny

* Re: Extending socket timestamping API for NTP
  2017-02-08  1:52   ` Denny Page
@ 2017-02-08  5:27     ` Richard Cochran
  2017-02-08  5:48       ` Denny Page
  2017-02-08 17:27       ` Denny Page
  0 siblings, 2 replies; 47+ messages in thread
From: Richard Cochran @ 2017-02-08  5:27 UTC (permalink / raw)
  To: Denny Page
  Cc: Keller, Jacob E, Miroslav Lichvar, netdev, Jiri Benc, Willem de Bruijn

On Tue, Feb 07, 2017 at 05:52:52PM -0800, Denny Page wrote:
> Most, but not all. The TI DP83630 doesn’t support timestamping for all packets, but it does support either PTP or NTP:

That is the one and only device that explicitly supports NTP. This is
a nice idea, of course, but it just did not take off among other
products.

Thanks,
Richard

* Re: Extending socket timestamping API for NTP
  2017-02-08  5:27     ` Richard Cochran
@ 2017-02-08  5:48       ` Denny Page
  2017-02-08 17:27       ` Denny Page
  1 sibling, 0 replies; 47+ messages in thread
From: Denny Page @ 2017-02-08  5:48 UTC (permalink / raw)
  To: Richard Cochran
  Cc: Keller, Jacob E, Miroslav Lichvar, netdev, Jiri Benc, Willem de Bruijn

On Feb 07, 2017, at 21:27, Richard Cochran <richardcochran@gmail.com> wrote:
> 
> On Tue, Feb 07, 2017 at 05:52:52PM -0800, Denny Page wrote:
>> Most, but not all. The TI DP83630 doesn’t support timestamping for all packets, but it does support either PTP or NTP:
> 
> That is the one and only device that explicitly supports NTP. This is
> a nice idea, of course, but it just did not take off among other
> products.

I have to say I haven’t gone looking for others, and will take your word regarding the DP83630 being the one and only. I only learned about the DP83630 because I have a couple of stratum 1 devices that use this phy and have been working with the vendor regarding integration of hardware timestamps for NTP.

Denny

* Re: Extending socket timestamping API for NTP
  2017-02-07 18:54 ` Soheil Hassas Yeganeh
@ 2017-02-08 10:14   ` Miroslav Lichvar
  0 siblings, 0 replies; 47+ messages in thread
From: Miroslav Lichvar @ 2017-02-08 10:14 UTC (permalink / raw)
  To: Soheil Hassas Yeganeh
  Cc: netdev, Richard Cochran, Jiri Benc, Keller, Jacob E, Denny Page,
	Willem de Bruijn

On Tue, Feb 07, 2017 at 10:54:22AM -0800, Soheil Hassas Yeganeh wrote:
> On Tue, Feb 7, 2017 at 6:01 AM, Miroslav Lichvar <mlichvar@redhat.com> wrote:
> > 2) new SO_TIMESTAMPING option to receive from the error queue only
> >    user data as was passed to sendmsg() instead of Ethernet frames
> >
> >    Parsing Ethernet and IP headers (especially IPv6 options) is not
> >    fun and SOF_TIMESTAMPING_OPT_ID is not always practical, e.g. in
> >    applications which process messages from the error queue
> >    asynchronously and don't bind/connect their sockets.
> 
> This is going to be quite useful. However, I'm not sure if sending
> back the original packet would be a proper API. Instead, one option is
> to add a control message, so that applications can set the OPT_ID for
> the timestamp. Perhaps, something like from user's perspective:
> 
> cmsg->cmsg_level             = SOL_SOCKET;
> cmsg->cmsg_type              = SCM_TIMESTAMPING_OPT_ID;
> cmsg->cmsg_len               = CMSG_LEN(sizeof(__u32));
> *((__u32 *) CMSG_DATA(cmsg)) = my_id;

That could be very useful. The question is if 32 bits worth of user
data would be good enough for all applications. In the case of the NTP
server, I currently save 128 bits per client in order to support the
interleaved mode. Half of that is the receive timestamp, which is
compared to the receive timestamp from messages received from the
error queue. Matching only lower 32 bits of the timestamp would
probably still work fine. However, if NTP supported follow up messages
like PTP, 32 bits would not be enough to create a valid message for
the client without saving some additional state. Getting the original
message would be very convenient here. NTP packets are normally very
short, so I'm not sure how much benefit there would be in using the
OPT_ID.

-- 
Miroslav Lichvar

* Re: Extending socket timestamping API for NTP
  2017-02-07 20:37 ` sdncurious
@ 2017-02-08 10:26   ` Miroslav Lichvar
  2017-02-08 23:27     ` sdncurious
  2017-02-08 23:34     ` sdncurious
  0 siblings, 2 replies; 47+ messages in thread
From: Miroslav Lichvar @ 2017-02-08 10:26 UTC (permalink / raw)
  To: sdncurious
  Cc: netdev, Richard Cochran, Jiri Benc, Keller, Jacob E, Denny Page,
	Willem de Bruijn

On Tue, Feb 07, 2017 at 12:37:15PM -0800, sdncurious wrote:
> On Tue, Feb 7, 2017 at 6:01 AM, Miroslav Lichvar <mlichvar@redhat.com> wrote:
> > 6) new SO_TIMESTAMPING option to get PHC index with HW timestamps
> >
> >    With bridges, bonding and other things it's difficult to determine
> >    which PHC timestamped the packet. It would be very useful if the
> >    PHC index was provided with each HW timestamp.
> >
> >    I'm not sure what would be the best place to put it. I guess the
> >    second timespec in scm_timestamping could be reused for this, but
> >    that sounds like a gross hack. Do we need to define a new struct?
> 
> What is the use case for this. even if the delay though the PHY's how
> would that be compensated ?

The idea was that applications like NTP servers and clients wouldn't
have to care about interfaces and how they map together with addresses
to PHCs over time. Currently, I use the interface index from
IP_PKTINFO to get the PHC, but that doesn't work with bridges and
other virtual interfaces. Another possibility would be an option to
modify the behavior of IP_PKTINFO to save the index of the real
interface. I'm not sure how that would compare in difficulty to
extending SCM_TIMESTAMPING with PHC index.

-- 
Miroslav Lichvar

* Re: Extending socket timestamping API for NTP
  2017-02-07 22:32   ` Willem de Bruijn
@ 2017-02-08 14:18     ` Soheil Hassas Yeganeh
  2017-02-27 15:23     ` Miroslav Lichvar
  1 sibling, 0 replies; 47+ messages in thread
From: Soheil Hassas Yeganeh @ 2017-02-08 14:18 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: Keller, Jacob E, Miroslav Lichvar, netdev, Richard Cochran,
	Jiri Benc, Denny Page, Willem de Bruijn

On Tue, Feb 7, 2017 at 2:32 PM, Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
>>> 2) new SO_TIMESTAMPING option to receive from the error queue only
>>>    user data as was passed to sendmsg() instead of Ethernet frames
>>>
>>>    Parsing Ethernet and IP headers (especially IPv6 options) is not
>>>    fun and SOF_TIMESTAMPING_OPT_ID is not always practical, e.g. in
>>>    applications which process messages from the error queue
>>>    asynchronously and don't bind/connect their sockets.
>>
>> This would be useful for application writing.
>
> What kind of user data are you suggesting? Just a user-defined ID
> passed as a cmsg? Allowing such metadata to override
> skb_shinfo(skb)->tskey sounds fine.

Yes, exactly. Just a user-defined ID that overrides the
skb_shinfo(skb)->tskey.

Thanks,
Soheil

* Re: Extending socket timestamping API for NTP
  2017-02-08  5:27     ` Richard Cochran
  2017-02-08  5:48       ` Denny Page
@ 2017-02-08 17:27       ` Denny Page
  1 sibling, 0 replies; 47+ messages in thread
From: Denny Page @ 2017-02-08 17:27 UTC (permalink / raw)
  To: Richard Cochran
  Cc: Keller, Jacob E, Miroslav Lichvar, netdev, Jiri Benc, Willem de Bruijn


> On Feb 07, 2017, at 21:27, Richard Cochran <richardcochran@gmail.com> wrote:
> 
> On Tue, Feb 07, 2017 at 05:52:52PM -0800, Denny Page wrote:
>> Most, but not all. The TI DP83630 doesn’t support timestamping for all packets, but it does support either PTP or NTP:
> 
> That is the one and only device that explicitly supports NTP. This is
> a nice idea, of course, but it just did not take off among other
> products.

With a commonly available NTP implementation supporting hardware timestamping, this might get a bit more traction with manufacturers. 

Denny

* Re: Extending socket timestamping API for NTP
  2017-02-08 10:26   ` Miroslav Lichvar
@ 2017-02-08 23:27     ` sdncurious
  2017-02-08 23:34     ` sdncurious
  1 sibling, 0 replies; 47+ messages in thread
From: sdncurious @ 2017-02-08 23:27 UTC (permalink / raw)
  To: Miroslav Lichvar
  Cc: netdev, Richard Cochran, Jiri Benc, Keller, Jacob E, Denny Page,
	Willem de Bruijn

Dealing with individual interfaces does not make sense. This seems to be a
case where the reciprocity property is violated, and hence it should be
handled as such. This is different from the case where the two sides each
have a single NIC but of different speeds. In this case the NIC used, and
the speed, can change with each packet. Although I am not sure if that is
possible, because the hash should always land the packet on the NIC of the
bond.

7. Reciprocity Errors

The above analysis assumes that the delays on the outbound and inbound
paths are the same; that is, the paths are reciprocal. This is assured if
the propagation delays are the same, the transmission rates are the same and
the packet lengths are the same. In the NTP on-wire protocol all packets
have the same length. If we assume the transmission rates are the same,
the only difference in path delays must be due to nonreciprocal
transmission paths. This often occurs if one way is via landline and the
other via satellite. It can also occur when the paths traverse tag-switched
core networks.

RMS

On Wed, Feb 8, 2017 at 2:26 AM, Miroslav Lichvar <mlichvar@redhat.com> wrote:
> On Tue, Feb 07, 2017 at 12:37:15PM -0800, sdncurious wrote:
>> On Tue, Feb 7, 2017 at 6:01 AM, Miroslav Lichvar <mlichvar@redhat.com> wrote:
>> > 6) new SO_TIMESTAMPING option to get PHC index with HW timestamps
>> >
>> >    With bridges, bonding and other things it's difficult to determine
>> >    which PHC timestamped the packet. It would be very useful if the
>> >    PHC index was provided with each HW timestamp.
>> >
>> >    I'm not sure what would be the best place to put it. I guess the
>> >    second timespec in scm_timestamping could be reused for this, but
>> >    that sounds like a gross hack. Do we need to define a new struct?
>>
>> What is the use case for this. even if the delay though the PHY's how
>> would that be compensated ?
>
> The idea was that applications like NTP servers and clients wouldn't
> have to care about interfaces and how they map together with addresses
> to PHCs over time. Currently, I use the interface index from
> IP_PKTINFO to get the PHC, but that doesn't work with bridges and
> other virtual interfaces. Another possibility would be an option to
> modify the behavior of IP_PKTINFO to save the index of the real
> interface. I'm not sure how that would compare in difficulty to
> extending SCM_TIMESTAMPING with PHC index.
>
> --
> Miroslav Lichvar


* Re: Extending socket timestamping API for NTP
  2017-02-08 10:26   ` Miroslav Lichvar
  2017-02-08 23:27     ` sdncurious
@ 2017-02-08 23:34     ` sdncurious
  1 sibling, 0 replies; 47+ messages in thread
From: sdncurious @ 2017-02-08 23:34 UTC (permalink / raw)
  To: Miroslav Lichvar
  Cc: netdev, Richard Cochran, Jiri Benc, Keller, Jacob E, Denny Page,
	Willem de Bruijn

On Wed, Feb 8, 2017 at 2:26 AM, Miroslav Lichvar <mlichvar@redhat.com> wrote:
> On Tue, Feb 07, 2017 at 12:37:15PM -0800, sdncurious wrote:
>> On Tue, Feb 7, 2017 at 6:01 AM, Miroslav Lichvar <mlichvar@redhat.com> wrote:
>> > 6) new SO_TIMESTAMPING option to get PHC index with HW timestamps
>> >
>> >    With bridges, bonding and other things it's difficult to determine
>> >    which PHC timestamped the packet. It would be very useful if the
>> >    PHC index was provided with each HW timestamp.
>> >
>> >    I'm not sure what would be the best place to put it. I guess the
>> >    second timespec in scm_timestamping could be reused for this, but
>> >    that sounds like a gross hack. Do we need to define a new struct?
>>
>> What is the use case for this. even if the delay though the PHY's how
>> would that be compensated ?
>
> The idea was that applications like NTP servers and clients wouldn't
> have to care about interfaces and how they map together with addresses
> to PHCs over time. Currently, I use the interface index from
> IP_PKTINFO to get the PHC, but that doesn't work with bridges and
> other virtual interfaces. Another possibility would be an option to
> modify the behavior of IP_PKTINFO to save the index of the real
> interface. I'm not sure how that would compare in difficulty to
> extending SCM_TIMESTAMPING with PHC index.

Why not just return the digest that is in the message?
Though I am not sure whether using only the least significant 32 bits
would result in too many collisions.

RMS

>
> --
> Miroslav Lichvar


* Re: Extending socket timestamping API for NTP
  2017-02-07 14:01 Extending socket timestamping API for NTP Miroslav Lichvar
                   ` (4 preceding siblings ...)
       [not found] ` <CAHoNx58u=Fze4e5V2Wb_LiBhka1Mzny3zOVNfvuzjnmQ4wBO=Q@mail.gmail.com>
@ 2017-02-09  0:45 ` Denny Page
  2017-02-09 11:15   ` Miroslav Lichvar
  2017-02-09 20:25   ` Denny Page
  2017-02-09  8:02 ` Richard Cochran
  6 siblings, 2 replies; 47+ messages in thread
From: Denny Page @ 2017-02-09  0:45 UTC (permalink / raw)
  To: Miroslav Lichvar
  Cc: netdev, Richard Cochran, Jiri Benc, Keller, Jacob E, Willem de Bruijn

[Resend as plain text]


> On Feb 07, 2017, at 06:01, Miroslav Lichvar <mlichvar@redhat.com> wrote:
> 
> 5) new SO_TIMESTAMPING options to get transposed RX timestamps
> 
>   PTP uses preamble RX timestamps, but NTP works with trailer RX
>   timestamps. This means NTP implementations currently need to
>   transpose HW RX timestamps. The calculation requires the link speed
>   and the length of the packet at layer 2. It seems this can be
>   reliably done only using raw sockets. It would be very nice if the
>   kernel could transpose the timestamps automatically.
> 
>   The existing SOF_TIMESTAMPING_RX_HARDWARE flag could be aliased to
>   SOF_TIMESTAMPING_RX_HARDWARE_PREAMBLE and the new flag could be
>   SOF_TIMESTAMPING_RX_HARDWARE_TRAILER.
> 
>   PTP has a similar problem with SW RX timestamps, which are closer
>   to the trailer timestamps rather than preamble timestamps. A new
>   SOF_TIMESTAMPING_RX_SOFTWARE_PREAMBLE flag could be added for PTP
>   implementations to get transposed timestamps in order to improve
>   accuracy.
> 
> 6) new SO_TIMESTAMPING option to get PHC index with HW timestamps
> 
>   With bridges, bonding and other things it's difficult to determine
>   which PHC timestamped the packet. It would be very useful if the
>   PHC index was provided with each HW timestamp.
> 
>   I'm not sure what would be the best place to put it. I guess the
>   second timespec in scm_timestamping could be reused for this, but
>   that sounds like a gross hack. Do we need to define a new struct?


Miroslav, if #5 were implemented, would #6 still be needed?

Denny


* Re: Extending socket timestamping API for NTP
  2017-02-07 14:01 Extending socket timestamping API for NTP Miroslav Lichvar
                   ` (5 preceding siblings ...)
  2017-02-09  0:45 ` Denny Page
@ 2017-02-09  8:02 ` Richard Cochran
  2017-02-09 11:09   ` Miroslav Lichvar
  6 siblings, 1 reply; 47+ messages in thread
From: Richard Cochran @ 2017-02-09  8:02 UTC (permalink / raw)
  To: Miroslav Lichvar
  Cc: netdev, Jiri Benc, Keller, Jacob E, Denny Page, Willem de Bruijn

On Tue, Feb 07, 2017 at 03:01:44PM +0100, Miroslav Lichvar wrote:
> 1) new rx_filter for NTP

This is an easy one.  No objections here.

>    Should the current drivers of HW that can timestamp all packets
>    be updated to fall back to HWTSTAMP_FILTER_ALL?

Yes, and the phyter, the only driver that can support this directly,
would need some work.
 
> 2) new SO_TIMESTAMPING option to receive from the error queue only
>    user data as was passed to sendmsg() instead of Ethernet frames
> 
>    Parsing Ethernet and IP headers (especially IPv6 options) is not
>    fun and SOF_TIMESTAMPING_OPT_ID is not always practical, e.g. in
>    applications which process messages from the error queue
>    asynchronously and don't bind/connect their sockets.

This doesn't seem justified to me.  From the application POV, it is
easier to hash the transmitted frames than to parse looped-back
packets.

> 3) target address in msg_name of messages from the error queue
> 
>    With 2) and unconnected sockets, there needs to be a way to get the
>    address to which the packet was sent. Is it ok to always fill
>    msg_name, or does it need to be a new option?

Again, a hash table cures this.

> 4) allow sockets to use both SW and HW TX timestamping at the same time
> 
>    When using a socket which is not bound to a specific interface, it
>    would be nice to get transmit SW timestamps when HW timestamps are
>    missing. I suspect it's difficult to predict if a HW timestamp will
>    be available.

Right.

>    Maybe it would be acceptable to get from the error
>    queue two messages per transmission if the interface supports both
>    SW and HW timestamping?

I like this idea better.

However, I doubt the utility of this.  If you provide SW time stamps
always and TX mostly, but not always, this forces the application to
keep two sets of filtered data or two servos, one designed for SW and
one for HW accuracy.
 
> 5) new SO_TIMESTAMPING options to get transposed RX timestamps
> 
>    PTP uses preamble RX timestamps, but NTP works with trailer RX
>    timestamps. This means NTP implementations currently need to
>    transpose HW RX timestamps. The calculation requires the link speed
>    and the length of the packet at layer 2. It seems this can be
>    reliably done only using raw sockets. It would be very nice if the
>    kernel could transpose the timestamps automatically.

Impossible, because the link speed may change between the time when
the MAC receives the data and the time when the kernel gets around to
calculating the time stamp.
 
> 6) new SO_TIMESTAMPING option to get PHC index with HW timestamps
> 
>    With bridges, bonding and other things it's difficult to determine
>    which PHC timestamped the packet. It would be very useful if the
>    PHC index was provided with each HW timestamp.

Again, this only makes writing the application harder, as it would be
forced to sort packets by PHC index.  It is much easier to open
multiple sockets, each bound to one physical interface.
 
>    I'm not sure what would be the best place to put it. I guess the
>    second timespec in scm_timestamping could be reused for this, but
>    that sounds like a gross hack. Do we need to define a new struct?

Yes, I think so.

Thanks,
Richard


* Re: Extending socket timestamping API for NTP
  2017-02-09  8:02 ` Richard Cochran
@ 2017-02-09 11:09   ` Miroslav Lichvar
  2017-02-09 19:42     ` sdncurious
  2017-03-23 16:21     ` Miroslav Lichvar
  0 siblings, 2 replies; 47+ messages in thread
From: Miroslav Lichvar @ 2017-02-09 11:09 UTC (permalink / raw)
  To: Richard Cochran
  Cc: netdev, Jiri Benc, Keller, Jacob E, Denny Page, Willem de Bruijn

On Thu, Feb 09, 2017 at 09:02:42AM +0100, Richard Cochran wrote:
> On Tue, Feb 07, 2017 at 03:01:44PM +0100, Miroslav Lichvar wrote:
> > 2) new SO_TIMESTAMPING option to receive from the error queue only
> >    user data as was passed to sendmsg() instead of Ethernet frames
> > 
> >    Parsing Ethernet and IP headers (especially IPv6 options) is not
> >    fun and SOF_TIMESTAMPING_OPT_ID is not always practical, e.g. in
> >    applications which process messages from the error queue
> >    asynchronously and don't bind/connect their sockets.
> 
> This doesn't seem justified to me.  From the application POV, it is
> easier to hash the transmitted frames than to parse looped-back
> packets.

At least in the case of the NTP implementation I'm working on that
would not be easier. I'm not saving transmitted packets. I think that
would be a waste of memory, complicating the code, and duplicating
work that the kernel is already doing. A public NTP server can handle
hundreds of thousands of requests per second, but not all of them may
get a SW/HW transmit timestamp. How would I know which will actually
get it and how long should I wait for it?

If the packet contains all data needed to process the TX timestamp,
it's much easier for me to use data from the kernel queue. If the
kernel drops it, it's not a problem. If the kernel loops it back, I
have everything I need.

> > 3) target address in msg_name of messages from the error queue
> > 
> >    With 2) and unconnected sockets, there needs to be a way to get the
> >    address to which the packet was sent. Is it ok to always fill
> >    msg_name, or does it need to be a new option?
> 
> Again, a hash table cures this.

It does, but I'm not sure it's always the best option.

> >    Maybe it would be acceptable to get from the error
> >    queue two messages per transmission if the interface supports both
> >    SW and HW timestamping?
> 
> I like this idea better.
> 
> However, I doubt the utility of this.  If you provide SW time stamps
> always and TX mostly, but not always, this forces the application to
> keep two sets of filtered data or two servos, one designed for SW and
> one for HW accuracy.

I think that depends on how the application is designed. In my case
each sample is using the best timestamps that were available (any
combination of daemon/SW/HW timestamps is possible) and they are all
mixed together. The NTP filtering algorithms then drop samples based
on their delay, not the timestamping source. Samples using SW
timestamps have a larger delay than samples using HW timestamps, so
they will be dropped unless HW timestamps are missing for a long time.
In my testing, and from what others have reported, this works well. An
occasional missing HW timestamp is not a problem.
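
To illustrate the point about delay-based selection, here is a minimal
sketch in C. It is a deliberate simplification, not the filter of any
particular NTP implementation: the struct, the enum and the function
name are hypothetical, and real filters also weigh sample age and
dispersion. It only shows that the timestamping source never has to be
consulted, since a sample with HW timestamps tends to win on delay alone.

```c
#include <stddef.h>

/* Hypothetical labels for where a sample's timestamps came from. */
enum ts_source { TS_DAEMON, TS_KERNEL_SW, TS_HW };

/* Hypothetical per-exchange sample; all times are in seconds. */
struct ntp_sample {
    double offset;         /* measured clock offset */
    double delay;          /* measured round-trip delay */
    enum ts_source source; /* informational only, not used for selection */
};

/*
 * Return the index of the sample with the smallest delay, or -1 if
 * there are no samples. On a tie the earliest sample is kept.
 */
int pick_min_delay_sample(const struct ntp_sample *samples, size_t n)
{
    size_t best = 0;
    size_t i;

    if (n == 0)
        return -1;
    for (i = 1; i < n; i++) {
        if (samples[i].delay < samples[best].delay)
            best = i;
    }
    return (int)best;
}
```

With a mix of SW and HW samples the HW one is normally picked; if only
SW samples are present, the best of those is used instead.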

> > 5) new SO_TIMESTAMPING options to get transposed RX timestamps
> > 
> >    PTP uses preamble RX timestamps, but NTP works with trailer RX
> >    timestamps. This means NTP implementations currently need to
> >    transpose HW RX timestamps. The calculation requires the link speed
> >    and the length of the packet at layer 2. It seems this can be
> >    reliably done only using raw sockets. It would be very nice if the
> >    kernel could transpose the timestamps automatically.
> 
> Impossible, because the link speed may change between the time when
> the MAC receives the data and the time when the kernel gets around to
> calculating the time stamp.

I think that would be an acceptable limitation. The application
certainly couldn't do a better job than the kernel and it won't have
to use raw sockets.

> > 6) new SO_TIMESTAMPING option to get PHC index with HW timestamps
> > 
> >    With bridges, bonding and other things it's difficult to determine
> >    which PHC timestamped the packet. It would be very useful if the
> >    PHC index was provided with each HW timestamp.
> 
> Again, this only makes writing the application harder, as it would be
> forced to sort packets by PHC index.  It is much easier to open
> multiple sockets, each bound to one physical interface.

With multiple sockets I'd have to know which packet belongs to which
socket and track routing changes. I'm not sure if that's even possible
with bonding. One socket for everything seems much easier to me. I
don't care about interfaces, I just need to know which clock
timestamped the packet.

-- 
Miroslav Lichvar


* Re: Extending socket timestamping API for NTP
  2017-02-09  0:45 ` Denny Page
@ 2017-02-09 11:15   ` Miroslav Lichvar
  2017-02-09 20:25   ` Denny Page
  1 sibling, 0 replies; 47+ messages in thread
From: Miroslav Lichvar @ 2017-02-09 11:15 UTC (permalink / raw)
  To: Denny Page
  Cc: netdev, Richard Cochran, Jiri Benc, Keller, Jacob E, Willem de Bruijn

On Wed, Feb 08, 2017 at 04:45:05PM -0800, Denny Page wrote:
> > On Feb 07, 2017, at 06:01, Miroslav Lichvar <mlichvar@redhat.com> wrote:
> > 
> > 5) new SO_TIMESTAMPING options to get transposed RX timestamps

> > 6) new SO_TIMESTAMPING option to get PHC index with HW timestamps
> > 
> >   With bridges, bonding and other things it's difficult to determine
> >   which PHC timestamped the packet. It would be very useful if the
> >   PHC index was provided with each HW timestamp.
> > 
> >   I'm not sure what would be the best place to put it. I guess the
> >   second timespec in scm_timestamping could be reused for this, but
> >   that sounds like a gross hack. Do we need to define a new struct?

> Miroslav, if #5 were implemented, would #6 still be needed?

Yes. With #5 we wouldn't have to know the link speed of the interface
and guess the length of received packets (or use raw sockets), but we
would still not know which clock has actually timestamped the packet.

-- 
Miroslav Lichvar


* Re: Extending socket timestamping API for NTP
  2017-02-09 11:09   ` Miroslav Lichvar
@ 2017-02-09 19:42     ` sdncurious
  2017-02-09 20:37       ` Denny Page
  2017-02-10  0:33       ` Denny Page
  2017-03-23 16:21     ` Miroslav Lichvar
  1 sibling, 2 replies; 47+ messages in thread
From: sdncurious @ 2017-02-09 19:42 UTC (permalink / raw)
  To: Miroslav Lichvar
  Cc: Richard Cochran, netdev, Jiri Benc, Keller, Jacob E, Denny Page,
	Willem de Bruijn

>
>> > 5) new SO_TIMESTAMPING options to get transposed RX timestamps
>> >
>> >    PTP uses preamble RX timestamps, but NTP works with trailer RX
>> >    timestamps. This means NTP implementations currently need to
>> >    transpose HW RX timestamps. The calculation requires the link speed
>> >    and the length of the packet at layer 2. It seems this can be
>> >    reliably done only using raw sockets. It would be very nice if the
>> >    kernel could transpose the timestamps automatically.
>>
>> Impossible, because the link speed may change between the time when
>> the MAC receives the data and the time when the kernel gets around to
>> calculating the time stamp.
>
> I think that would be an acceptable limitation. The application
> certainly couldn't do a better job than the kernel and it won't have
> to use raw sockets.
>

As you are using HW that supports NTP time stamping, won't it by
default time stamp the received packet correctly at the CRC? Or if
someone came out with such HW, then what?
I am still at a loss as to why transposition is required in the case of
HW time stamping. If STF is used for both Tx and Rx time stamping the
timing is absolutely correct. Any delay in the PHY is nothing more
than the usual kernel processing delay, which NTP should be able to deal
with when trying to calculate the round trip time, unless there is an
issue with the algorithm.

The application can even calculate the complete delay in kernel
processing if we provide another time stamp when the packet is read by
the application. That seems to provide more accuracy and seems like a
better idea.


RMS.


* Re: Extending socket timestamping API for NTP
  2017-02-09  0:45 ` Denny Page
  2017-02-09 11:15   ` Miroslav Lichvar
@ 2017-02-09 20:25   ` Denny Page
  1 sibling, 0 replies; 47+ messages in thread
From: Denny Page @ 2017-02-09 20:25 UTC (permalink / raw)
  To: Miroslav Lichvar
  Cc: netdev, Richard Cochran, Jiri Benc, Keller, Jacob E, Willem de Bruijn


> On Feb 08, 2017, at 16:45, Denny Page <dennypage@me.com> wrote:
> 
> [Resend as plain text]
> 
> 
>> On Feb 07, 2017, at 06:01, Miroslav Lichvar <mlichvar@redhat.com> wrote:
>> 
>> 5) new SO_TIMESTAMPING options to get transposed RX timestamps
>> 
>>  PTP uses preamble RX timestamps, but NTP works with trailer RX
>>  timestamps. This means NTP implementations currently need to
>>  transpose HW RX timestamps. The calculation requires the link speed
>>  and the length of the packet at layer 2. It seems this can be
>>  reliably done only using raw sockets. It would be very nice if the
>>  kernel could transpose the timestamps automatically.
>> 
>>  The existing SOF_TIMESTAMPING_RX_HARDWARE flag could be aliased to
>>  SOF_TIMESTAMPING_RX_HARDWARE_PREAMBLE and the new flag could be
>>  SOF_TIMESTAMPING_RX_HARDWARE_TRAILER.
>> 
>>  PTP has a similar problem with SW RX timestamps, which are closer
>>  to the trailer timestamps rather than preamble timestamps. A new
>>  SOF_TIMESTAMPING_RX_SOFTWARE_PREAMBLE flag could be added for PTP
>>  implementations to get transposed timestamps in order to improve
>>  accuracy.
>> 
>> 6) new SO_TIMESTAMPING option to get PHC index with HW timestamps
>> 
>>  With bridges, bonding and other things it's difficult to determine
>>  which PHC timestamped the packet. It would be very useful if the
>>  PHC index was provided with each HW timestamp.
>> 
>>  I'm not sure what would be the best place to put it. I guess the
>>  second timespec in scm_timestamping could be reused for this, but
>>  that sounds like a gross hack. Do we need to define a new struct?
> 
> 
> Miroslav, if #5 were implemented, would #6 still be needed?
> 
> Denny

Miroslav, please ignore this. Of course you still need the index in order to get the PHC offset. My bad.

Denny


* Re: Extending socket timestamping API for NTP
  2017-02-09 19:42     ` sdncurious
@ 2017-02-09 20:37       ` Denny Page
  2017-02-10  0:33       ` Denny Page
  1 sibling, 0 replies; 47+ messages in thread
From: Denny Page @ 2017-02-09 20:37 UTC (permalink / raw)
  To: sdncurious
  Cc: Miroslav Lichvar, Richard Cochran, netdev, Jiri Benc, Keller,
	Jacob E, Willem de Bruijn


> On Feb 09, 2017, at 11:42, sdncurious <sdncurious@gmail.com> wrote:
> 
> As you are using HW that supports NTP time stamping, won't it by
> default time stamp the received packet correctly at the CRC? Or if
> someone came out with such HW, then what?

As discussed in private email, all hardware operates at the end of the SFD, and it makes sense for the hardware to always do so regardless of what protocol is passing through.

Denny


* Re: Extending socket timestamping API for NTP
  2017-02-09 19:42     ` sdncurious
  2017-02-09 20:37       ` Denny Page
@ 2017-02-10  0:33       ` Denny Page
  2017-02-10 18:55         ` Denny Page
  1 sibling, 1 reply; 47+ messages in thread
From: Denny Page @ 2017-02-10  0:33 UTC (permalink / raw)
  To: sdncurious
  Cc: Miroslav Lichvar, Richard Cochran, netdev, Jiri Benc, Keller,
	Jacob E, Willem de Bruijn


> On Feb 09, 2017, at 11:42, sdncurious <sdncurious@gmail.com> wrote:
> 
> I am still at a loss as to why transpose is required in case of HW
> time stamping. If STF is used for both Tx and Rx time stamping the
> timing is absolutely correct.

Perhaps this will help. The specific transposition is:

  transposed_timestamp_ns = timestamp_ns + (frame_len_bits * 1000000000) / (interface_speed * 1000000)

The transposition is applied to received timestamps only.
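
As a rough sketch, the formula above could be written in C as follows.
The function name and parameter names are made up for illustration, and
it assumes the caller already knows the layer-2 frame length in bits
and the link speed in Mbit/s, which is the hard part discussed earlier
in the thread.

```c
#include <stdint.h>

/*
 * Move a preamble (start-of-frame) RX timestamp to the end of the frame.
 *
 * timestamp_ns:    HW RX timestamp in nanoseconds
 * frame_len_bits:  frame length at layer 2, in bits (assumed known)
 * link_speed_mbps: link speed in Mbit/s, e.g. 1000 for 1 Gbit/s
 *
 * If the speed is unknown (0), the timestamp is returned unchanged.
 */
int64_t transpose_rx_timestamp_ns(int64_t timestamp_ns,
                                  uint32_t frame_len_bits,
                                  uint32_t link_speed_mbps)
{
    if (link_speed_mbps == 0)
        return timestamp_ns;

    /* bits * 1e9 / (Mbit/s * 1e6) is the same as bits * 1000 / Mbit/s. */
    return timestamp_ns +
           ((int64_t)frame_len_bits * 1000) / (int64_t)link_speed_mbps;
}
```

For example, a 90-byte frame (720 bits) on a 1 Gbit/s link moves the
timestamp by 720 ns, and on a 100 Mbit/s link by 7200 ns.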

Denny


* Re: Extending socket timestamping API for NTP
  2017-02-10  0:33       ` Denny Page
@ 2017-02-10 18:55         ` Denny Page
  0 siblings, 0 replies; 47+ messages in thread
From: Denny Page @ 2017-02-10 18:55 UTC (permalink / raw)
  To: sdncurious
  Cc: Miroslav Lichvar, Richard Cochran, netdev, Jiri Benc, Keller,
	Jacob E, Willem de Bruijn


> On Feb 09, 2017, at 16:33, Denny Page <dennypage@me.com> wrote:
> 
> 
>> On Feb 09, 2017, at 11:42, sdncurious <sdncurious@gmail.com> wrote:
>> 
>> I am still at a loss as to why transpose is required in case of HW
>> time stamping. If STF is used for both Tx and Rx time stamping the
>> timing is absolutely correct.
> 
> Perhaps this will help. The specific transposition is:
> 
>  transposed_timestamp_ns = timestamp_ns + (frame_len_bits * 1000000000) / (interface_speed * 1000000)
> 
> The transposition is applied to received timestamps only.


Before anyone else asks, yes, I know this can be reduced. :)


* Re: Extending socket timestamping API for NTP
  2017-02-07 22:32   ` Willem de Bruijn
  2017-02-08 14:18     ` Soheil Hassas Yeganeh
@ 2017-02-27 15:23     ` Miroslav Lichvar
  2017-02-28  0:01       ` Willem de Bruijn
  1 sibling, 1 reply; 47+ messages in thread
From: Miroslav Lichvar @ 2017-02-27 15:23 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: Keller, Jacob E, netdev, Richard Cochran, Jiri Benc, Denny Page,
	Willem de Bruijn

On Tue, Feb 07, 2017 at 02:32:04PM -0800, Willem de Bruijn wrote:
> >> 4) allow sockets to use both SW and HW TX timestamping at the same time
> >>
> >>    When using a socket which is not bound to a specific interface, it
> >>    would be nice to get transmit SW timestamps when HW timestamps are
> >>    missing. I suspect it's difficult to predict if a HW timestamp will
> >>    be available. Maybe it would be acceptable to get from the error
> >>    queue two messages per transmission if the interface supports both
> >>    SW and HW timestamping?
> >
> >
> > This seems useful,
> 
> Agreed, as long as it is optional so that it does not change the
> behavior for existing applications.

Do you think it is safe to assume that no application enabled both SW
and HW TX timestamping? Do we need a new option for this?

> > but not sure how best to implement it.
> 
> It might be sufficient to just remove the second line in sw_tx_timestamp
> 
> static inline void sw_tx_timestamp(struct sk_buff *skb)
> {
>         if (skb_shinfo(skb)->tx_flags & SKBTX_SW_TSTAMP &&
>             !(skb_shinfo(skb)->tx_flags & SKBTX_IN_PROGRESS))
>                 skb_tstamp_tx(skb, NULL);
> }

With this change I'm getting two error messages per transmission, but
it looks like it may need some additional changes.

If the first error message is received after the HW timestamp was
captured, it contains both timestamps as the HW timestamp is in the
shared info of the skb. Is it possible it could contain a partially
updated HW timestamp? I'm not sure how locking works here. Is
scm_timestamping actually allowed to contain more than one timestamp?
The timestamping.txt document says "Only one field is non-zero at any
time.", but that wasn't true even before if both SW and HW RX
timestamping was enabled.

If SO_TIMESTAMP{,NS} is enabled, ts[0] in the second error message
will contain a bogus SW timestamp added by __sock_recv_timestamp() for
a "Race occurred between timestamp enabling and packet receiving". Is
there a guarantee applications will get a timestamp for all messages
after enabling SO_TIMESTAMP? The original code is older than the git
repo, so I'm not sure what was the reason for this. To me it would
make more sense to not add any SCM_TIMESTAMP (and SW timestamp in
SCM_TIMESTAMPING) when the timestamp is missing. If that's not
always acceptable, maybe it could be restricted to sockets that have
HW timestamping enabled?

Some drivers don't call skb_tx_timestamp() when HW timestamp was
requested. From a cursory look it is e1000e, xgbe, sxgbe, and stmmac.
This should hopefully be an easy fix.

Thoughts?

-- 
Miroslav Lichvar


* Re: Extending socket timestamping API for NTP
  2017-02-27 15:23     ` Miroslav Lichvar
@ 2017-02-28  0:01       ` Willem de Bruijn
  2017-02-28  8:26         ` Miroslav Lichvar
  0 siblings, 1 reply; 47+ messages in thread
From: Willem de Bruijn @ 2017-02-28  0:01 UTC (permalink / raw)
  To: Miroslav Lichvar
  Cc: Keller, Jacob E, netdev, Richard Cochran, Jiri Benc, Denny Page,
	Willem de Bruijn

On Mon, Feb 27, 2017 at 10:23 AM, Miroslav Lichvar <mlichvar@redhat.com> wrote:
> On Tue, Feb 07, 2017 at 02:32:04PM -0800, Willem de Bruijn wrote:
>> >> 4) allow sockets to use both SW and HW TX timestamping at the same time
>> >>
>> >>    When using a socket which is not bound to a specific interface, it
>> >>    would be nice to get transmit SW timestamps when HW timestamps are
>> >>    missing. I suspect it's difficult to predict if a HW timestamp will
>> >>    be available. Maybe it would be acceptable to get from the error
>> >>    queue two messages per transmission if the interface supports both
>> >>    SW and HW timestamping?
>> >
>> >
>> > This seems useful,
>>
>> Agreed, as long as it is optional so that it does not change the
>> behavior for existing applications.
>
> Do you think it is safe to assume that no application enabled both SW
> and HW TX timestamping?

We cannot rule out that a process set both flags.

> Do we need a new option for this?

Similar to OPT_TSONLY or OPT_ID, but to signal the intent of
receiving both timestamps. Yes, agreed.
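
For context, a sketch of how an application requests TX timestamps
today, using only flags that already exist in <linux/net_tstamp.h>.
The helper names are hypothetical. The opt-in flag being discussed is
not defined here; it would presumably be ORed into the same mask. Note
that HW timestamping additionally has to be enabled on the interface
with the SIOCSHWTSTAMP ioctl, which is not shown.

```c
#include <linux/net_tstamp.h>
#include <sys/socket.h>

/*
 * Build an SO_TIMESTAMPING mask for TX timestamps.
 * want_sw / want_hw select SW and/or HW transmit timestamps.
 */
unsigned int build_tx_tstamp_flags(int want_sw, int want_hw)
{
    unsigned int flags = 0;

    if (want_sw)  /* generate SW TX timestamps and report them */
        flags |= SOF_TIMESTAMPING_TX_SOFTWARE | SOF_TIMESTAMPING_SOFTWARE;
    if (want_hw)  /* generate HW TX timestamps and report them raw */
        flags |= SOF_TIMESTAMPING_TX_HARDWARE | SOF_TIMESTAMPING_RAW_HARDWARE;
    if (flags)    /* tag each send with an ID, loop back timestamps only */
        flags |= SOF_TIMESTAMPING_OPT_ID | SOF_TIMESTAMPING_OPT_TSONLY;

    return flags;
}

/* Apply the mask to a socket; returns 0 on success, -1 on error. */
int enable_tx_tstamp(int fd, unsigned int flags)
{
    return setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING,
                      &flags, sizeof(flags));
}
```

Setting both the SW and HW bits is already accepted by the kernel,
which is why existing behavior cannot be changed without a new option.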

>> > but not sure how best to implement it.
>>
>> It might be sufficient to just remove the second line in sw_tx_timestamp
>>
>> static inline void sw_tx_timestamp(struct sk_buff *skb)
>> {
>>         if (skb_shinfo(skb)->tx_flags & SKBTX_SW_TSTAMP &&
>>             !(skb_shinfo(skb)->tx_flags & SKBTX_IN_PROGRESS))
>>                 skb_tstamp_tx(skb, NULL);
>> }
>
> With this change I'm getting two error messages per transmission, but
> it looks like it may need some additional changes.
>
> If the first error message is received after the HW timestamp was
> captured,

When does this happen? The first timestamp is generated from
skb_tx_timestamp in the device driver's ndo_start_xmit before
passing the packet to the NIC, the second when the device
driver cleans the tx descriptor on completion.

Is this for drivers that do not have skb_tx_timestamp, as you
mention below? Then the solution is to add that call.

> it contains both timestamps as the HW timestamp is in the
> shared info of the skb. Is it possible it could contain a partially
> updated HW timestamp? I'm not sure how locking works here. Is
> scm_timestamping actually allowed to contain more than one timestamp?
> The timestamping.txt document says "Only one field is non-zero at any
> time.", but that wasn't true even before if both SW and HW RX
> timestamping was enabled.
>
> If SO_TIMESTAMP{,NS} is enabled, ts[0] in the second error message
> will contain a bogus SW timestamp added by __sock_recv_timestamp() for
> a "Race occurred between timestamp enabling and packet receiving".  Is

Good point. That should not be set on transmit timestamps.

> there a guarantee applications will get a timestamp for all messages
> after enabling SO_TIMESTAMP? The original code is older than the git
> repo, so I'm not sure what was the reason for this. To me it would
> make more sense to not add any SCM_TIMESTAMP (and SW timestamp in
> SCM_TIMESTAMPING) when the timestamp is missing. If that's not
> always acceptable, maybe it could be restricted to sockets that have
> HW timestamping enabled?

I would limit scope to tx timestamping and leave rx semantics as is.

> Some drivers don't call skb_tx_timestamp() when HW timestamp was
> requested. From a cursory look it is e1000e, xgbe, sxgbe, and stmmac.
> This should hopefully be an easy fix.

Indeed, that should be added, then.


* Re: Extending socket timestamping API for NTP
  2017-02-28  0:01       ` Willem de Bruijn
@ 2017-02-28  8:26         ` Miroslav Lichvar
  2017-02-28 21:05           ` Willem de Bruijn
  0 siblings, 1 reply; 47+ messages in thread
From: Miroslav Lichvar @ 2017-02-28  8:26 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: Keller, Jacob E, netdev, Richard Cochran, Jiri Benc, Denny Page,
	Willem de Bruijn

On Mon, Feb 27, 2017 at 07:01:54PM -0500, Willem de Bruijn wrote:
> On Mon, Feb 27, 2017 at 10:23 AM, Miroslav Lichvar <mlichvar@redhat.com> wrote:
> > On Tue, Feb 07, 2017 at 02:32:04PM -0800, Willem de Bruijn wrote:
> >> >> 4) allow sockets to use both SW and HW TX timestamping at the same time

> > Do we need a new option for this?
> 
> Similar to OPT_TSONLY or OPT_ID, but to signal the intent of
> receiving both timestamps. Yes, agreed.

Ok. Thanks.

> > With this change I'm getting two error messages per transmission, but
> > it looks like it may need some additional changes.
> >
> > If the first error message is received after the HW timestamp was
> > captured,
> 
> When does this happen? The first timestamp is generated from
> skb_tx_timestamp in the device driver's ndo_start_xmit before
> passing the packet to the NIC, the second when the device
> driver cleans the tx descriptor on completion.

As I understand it, it happens when the first skb (created by the
skb_tx_timestamp() call) is received by the application after the
driver called skb_tstamp_tx() with the HW timestamp. The SW timestamps
are separate, but the HW timestamp is shared between clones. It
probably doesn't happen with the TSONLY option as it allocates a new
skb. When I print timestamps from scm_timestamping I see a mix of two
cases:

TX 1488268812.193945472 0.000000000 1488286813.273760139
TX 0.000000000 0.000000000 1488286813.273760139
RX 1488268812.354356188 0.000000000 1488286813.434096389

TX 1488268816.364407934 0.000000000 0.000000000
TX 0.000000000 0.000000000 1488286817.444251014
RX 1488268816.525150589 0.000000000 1488286817.604749889

In the first case I assume the HW timestamp was saved before the first
error message was received, so both error messages have the same HW
timestamp, but only one has the SW timestamp. In the second case, the
HW timestamp was saved later, so there is one message with SW
timestamp and one message with HW timestamp.
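
A small C sketch of how an application might tell these cases apart.
The struct below is a local stand-in with the same three-timespec
layout as struct scm_timestamping (ts[0] is the SW timestamp, ts[2]
the raw HW timestamp); the enum and function names are hypothetical.

```c
#include <time.h>

/* Local mirror of the scm_timestamping layout, for illustration only. */
struct ts_triplet {
    struct timespec ts[3]; /* [0] = SW, [1] = unused here, [2] = raw HW */
};

enum tx_report { TX_NONE, TX_SW_ONLY, TX_HW_ONLY, TX_SW_AND_HW };

/* A timestamp counts as present if either field is non-zero. */
int ts_is_set(const struct timespec *t)
{
    return t->tv_sec != 0 || t->tv_nsec != 0;
}

/* Report which timestamps one error-queue message carries. */
enum tx_report classify_tx_report(const struct ts_triplet *r)
{
    int sw = ts_is_set(&r->ts[0]);
    int hw = ts_is_set(&r->ts[2]);

    if (sw && hw)
        return TX_SW_AND_HW;
    if (sw)
        return TX_SW_ONLY;
    if (hw)
        return TX_HW_ONLY;
    return TX_NONE;
}
```

In the first case printed above the two TX messages would classify as
TX_SW_AND_HW followed by TX_HW_ONLY, in the second case as TX_SW_ONLY
followed by TX_HW_ONLY.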

From the application point of view it would make sense if in the first
case there was only one error message containing both timestamps. I'm
not sure how easy/safe it would be to drop the second skb. The other
approach would be to not put HW timestamp in the first message when
this "dual TX timestamping" option is enabled, so each error message
has only one timestamp.

> Is this for drivers that do not have skb_tx_timestamp, as you
> mention below? Then the solution is to add that call.

FWIW, I saw that with the e1000e driver after I made the
skb_tx_timestamp() call unconditional.

> > it contains both timestamps as the HW timestamp is in the
> > shared info of the skb. Is it possible it could contain a partially
> > updated HW timestamp? I'm not sure how locking works here. Is
> > scm_timestamping actually allowed to contain more than one timestamp?
> > The timestamping.txt document says "Only one field is non-zero at any
> > time.", but that wasn't true even before if both SW and HW RX
> > timestamping was enabled.

-- 
Miroslav Lichvar


* Re: Extending socket timestamping API for NTP
  2017-02-28  8:26         ` Miroslav Lichvar
@ 2017-02-28 21:05           ` Willem de Bruijn
  0 siblings, 0 replies; 47+ messages in thread
From: Willem de Bruijn @ 2017-02-28 21:05 UTC (permalink / raw)
  To: Miroslav Lichvar
  Cc: Keller, Jacob E, netdev, Richard Cochran, Jiri Benc, Denny Page,
	Willem de Bruijn

>> > With this change I'm getting two error messages per transmission, but
>> > it looks like it may need some additional changes.
>> >
>> > If the first error message is received after the HW timestamp was
>> > captured,
>>
>> When does this happen? The first timestamp is generated from
>> skb_tx_timestamp in the device driver's ndo_start_xmit before
>> passing the packet to the NIC, the second when the device
>> driver cleans the tx descriptor on completion.
>
> As I understand it, it happens when the first skb (created by the
> skb_tx_timestamp() call) is received by the application after the
> driver called skb_tstamp_tx() with the HW timestamp. The SW timestamps
> are separate, but the HW timestamp is shared between clones. It

Oh right, the conversion to struct scm_timestamping only happens
on socket read in __sock_recv_timestamp.

> probably doesn't happen with the TSONLY option as it allocates a new
> skb. When I print timestamps from scm_timestamping I see a mix of two
> cases:
>
> TX 1488268812.193945472 0.000000000 1488286813.273760139
> TX 0.000000000 0.000000000 1488286813.273760139
> RX 1488268812.354356188 0.000000000 1488286813.434096389
>
> TX 1488268816.364407934 0.000000000 0.000000000
> TX 0.000000000 0.000000000 1488286817.444251014
> RX 1488268816.525150589 0.000000000 1488286817.604749889
>
> In the first case I assume the HW timestamp was saved before the first
> error message was received, so both error messages have the same HW
> timestamp, but only one has the SW timestamp. In the second case, the
> HW timestamp was saved later, so there is one message with SW
> timestamp and one message with HW timestamp.
>
> From the application point of view it would make sense if in the first
> case there was only one error message containing both timestamps. I'm

Agreed. I just proposed something similar on the error queue for
zerocopy notifications in http://patchwork.ozlabs.org/patch/731214/

> not sure how easy/safe it would be to drop the second skb. The other
> approach would be to not put HW timestamp in the first message when
> this "dual TX timestamping" option is enabled, so each error message
> has only one timestamp.

If it's possible to avoid one skb_clone completely, then that is preferable
over creating both and consuming one. If either approach becomes
complex, then queuing two separate messages is fine. A process
can recvmmsg(), after all. As long as the behavior is consistent.
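
If two separate messages stay queued, the application side could fold
them together itself. A minimal sketch, assuming the messages of one
transmission have already been grouped (for example by
SOF_TIMESTAMPING_OPT_ID) and are modelled as (sw, legacy, hw) tuples of
(sec, nsec) pairs; merge_tx() is a hypothetical helper, not an existing
API:

```python
ZERO = (0, 0)

def merge_tx(msgs):
    """Fold the error messages of one transmission into a single
    (sw, hw) pair, keeping the first non-zero value seen for each."""
    sw = hw = ZERO
    for m_sw, _legacy, m_hw in msgs:
        if sw == ZERO:
            sw = m_sw
        if hw == ZERO:
            hw = m_hw
    return sw, hw
```

With this, both orderings seen in the output quoted above give the same
result, one SW and one HW timestamp per transmission, regardless of
whether the first message already carried the HW timestamp.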

* Re: Extending socket timestamping API for NTP
  2017-02-09 11:09   ` Miroslav Lichvar
  2017-02-09 19:42     ` sdncurious
@ 2017-03-23 16:21     ` Miroslav Lichvar
  2017-03-23 18:54       ` Denny Page
                         ` (2 more replies)
  1 sibling, 3 replies; 47+ messages in thread
From: Miroslav Lichvar @ 2017-03-23 16:21 UTC (permalink / raw)
  To: Richard Cochran
  Cc: netdev, Jiri Benc, Keller, Jacob E, Denny Page, Willem de Bruijn

On Thu, Feb 09, 2017 at 12:09:41PM +0100, Miroslav Lichvar wrote:
> On Thu, Feb 09, 2017 at 09:02:42AM +0100, Richard Cochran wrote:
> > On Tue, Feb 07, 2017 at 03:01:44PM +0100, Miroslav Lichvar wrote:
> > > 5) new SO_TIMESTAMPING options to get transposed RX timestamps
> > > 
> > >    PTP uses preamble RX timestamps, but NTP works with trailer RX
> > >    timestamps. This means NTP implementations currently need to
> > >    transpose HW RX timestamps. The calculation requires the link speed
> > >    and the length of the packet at layer 2. It seems this can be
> > >    reliably done only using raw sockets. It would be very nice if the
> > >    kernel could transpose the timestamps automatically.
> > 
> > Impossible, because the link speed may change between the time when
> > the MAC receives the data and the kernel gets around to calculating the
> > time stamp.
> 
> I think that would be an acceptable limitation. The application
> certainly couldn't do a better job than the kernel and it won't have
> to use raw sockets.

After becoming a bit more familiar with the code I don't think this is
a good idea anymore :). I suspect there would be a noticeable
performance impact if each timestamped packet could trigger reading of
the current link speed. If the value had to be cached it would make
more sense to do it in the application.

With no option to get transposed timestamps, point 6 can be scratched
too.

A better approach might be a control message that would provide the
original interface index together with the length of the packet, so
the application could transpose the HW timestamp and map the HW
interface to the PHC.
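
For illustration, the transposition the application would do with those
two values could look roughly like this. It is a sketch under stated
assumptions: the HW timestamp marks the start of the frame, the
reported length is the layer-2 length without the 4-byte FCS, and the
link speed is known in Mb/s. The function and parameter names are
hypothetical.

```python
FCS_LEN = 4  # Ethernet frame check sequence, in bytes

def transpose_rx(preamble_ts_ns, l2_len, link_speed_mbps, fcs_included=False):
    """Move an RX timestamp from the start of the frame to its end."""
    if link_speed_mbps <= 0:
        raise ValueError("unknown link speed")
    frame_bits = (l2_len + (0 if fcs_included else FCS_LEN)) * 8
    # bits / (Mbit/s) gives microseconds; scale to nanoseconds.
    return preamble_ts_ns + frame_bits * 1000 // link_speed_mbps

# A 90-byte frame (48 NTP + 8 UDP + 20 IPv4 + 14 Ethernet):
print(transpose_rx(0, 90, 1000))  # 752 ns at 1 Gb/s
print(transpose_rx(0, 90, 100))   # 7520 ns at 100 Mb/s
```

The difference between the two results shows why the link speed has to
be right: the same frame needs a correction ten times larger at
100 Mb/s.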

The two values could be saved in the skb_shared_info structure. Now
my question is if they could be useful also for other things than
timestamping and if it should be a new socket option which would work
on any socket independently from timestamping, or if it should rather
be a new flag for the SO_TIMESTAMPING option. If the latter, would it
make sense to put them in the skb_shared_hwtstamps structure and
modify all drivers to set the values when a HW timestamp is captured
instead of adding more code to __netif_receive_skb_core() or similar?

What do you think?

> > > 6) new SO_TIMESTAMPING option to get PHC index with HW timestamps
> > > 
> > >    With bridges, bonding and other things it's difficult to determine
> > >    which PHC timestamped the packet. It would be very useful if the
> > >    PHC index was provided with each HW timestamp.

-- 
Miroslav Lichvar

* Re: Extending socket timestamping API for NTP
  2017-03-23 16:21     ` Miroslav Lichvar
@ 2017-03-23 18:54       ` Denny Page
  2017-03-23 19:07       ` Richard Cochran
       [not found]       ` <6121D504-288F-4C9B-9AB3-D1C8292965D5@me.com>
  2 siblings, 0 replies; 47+ messages in thread
From: Denny Page @ 2017-03-23 18:54 UTC (permalink / raw)
  To: Miroslav Lichvar
  Cc: Richard Cochran, netdev, Jiri Benc, Keller, Jacob E, Willem de Bruijn

[Resend as plain text for netdev]


> On Mar 23, 2017, at 09:21, Miroslav Lichvar <mlichvar@redhat.com> wrote:
> 
> After becoming a bit more familiar with the code I don't think this is
> a good idea anymore :). I suspect there would be a noticeable
> performance impact if each timestamped packet could trigger reading of
> the current link speed. If the value had to be cached it would make
> more sense to do it in the application.


I am very surprised at this. The application caching approach requires the application retrieve the value via a system call. The system call overhead is huge in comparison to everything else. More importantly, the application cached value may be wrong. If the application takes a sample every 5 seconds, there are 5 seconds of timestamps that can be wildly wrong.

At the driver level, if the speed check is done on packet receive, retrieving the link speed is a single register read which is a small overhead compared with processing the timestamp. The alternative approach of caching still makes more sense in the driver rather than the application. The driver receives an interrupt when negotiation happens, and it’s trivial to cache the value at that point. And a cached value by the driver will always be correct. Implementing it in the driver also allows for hardware to provide the functionality where available. Yes, there is only one chip that provides this currently, but if there is sufficient demand others will appear. There is no way to take advantage of this functionality unless this is handled by the driver.

I think it makes a lot of sense to leave this to the driver developer.

* Re: Extending socket timestamping API for NTP
  2017-03-23 16:21     ` Miroslav Lichvar
  2017-03-23 18:54       ` Denny Page
@ 2017-03-23 19:07       ` Richard Cochran
  2017-03-24  7:25         ` Miroslav Lichvar
       [not found]       ` <6121D504-288F-4C9B-9AB3-D1C8292965D5@me.com>
  2 siblings, 1 reply; 47+ messages in thread
From: Richard Cochran @ 2017-03-23 19:07 UTC (permalink / raw)
  To: Miroslav Lichvar
  Cc: netdev, Jiri Benc, Keller, Jacob E, Denny Page, Willem de Bruijn

On Thu, Mar 23, 2017 at 05:21:45PM +0100, Miroslav Lichvar wrote:
> A better approach might be a control message that would provide the
> original interface index together with the length of the packet, so
> the application could transpose the HW timestamp and map the HW
> interface to the PHC.

This sounds better than trying to auto-magically transpose and correct
for link speed.

BTW, isn't there already a control message for "original interface
index"?
 
> The two values could be saved in the skb_shared_info structure. Now
> my question is if they could be useful also for other things than
> timestamping

such as?

> and if it should be a new socket option which would work
> on any socket independently from timestamping, or if it should rather
> be a new flag for the SO_TIMESTAMPING option. If the latter, would it
> make sense to put them in the skb_shared_hwtstamps structure and
> modify all drivers to set the values when a HW timestamp is captured
> instead of adding more code to __netif_receive_skb_core() or similar?

This information is solely for a highly specialized NTP application.
No normal program would ever need this, AFAICT.  So, if possible,
getting the original frame length should be done in a way that doesn't
affect users that don't need it.

Thanks,
Richard

* Re: Extending socket timestamping API for NTP
  2017-03-23 19:07       ` Richard Cochran
@ 2017-03-24  7:25         ` Miroslav Lichvar
  0 siblings, 0 replies; 47+ messages in thread
From: Miroslav Lichvar @ 2017-03-24  7:25 UTC (permalink / raw)
  To: Richard Cochran
  Cc: netdev, Jiri Benc, Keller, Jacob E, Denny Page, Willem de Bruijn

On Thu, Mar 23, 2017 at 08:07:33PM +0100, Richard Cochran wrote:
> On Thu, Mar 23, 2017 at 05:21:45PM +0100, Miroslav Lichvar wrote:
> > A better approach might be a control message that would provide the
> > original interface index together with the length of the packet, so
> > the application could transpose the HW timestamp and map the HW
> > interface to the PHC.
> 
> This sounds better than trying to auto-magically transpose and correct
> for link speed.
> 
> BTW, isn't there already a control message for "original interface
> index"?

There is the PACKET_ORIGDEV option, but it works only with packet
sockets, and it doesn't look to me like it could be easily turned into
a SO_ORIGDEV option. If there was such an option and also a SO_ORIGLEN
option, I think that would work nicely for me.

> > The two values could be saved in the skb_shared_info structure. Now
> > my question is if they could be useful also for other things than
> > timestamping
> 
> such as?

I'm not sure. What do people do with PACKET_ORIGDEV, and would it make
sense with other sockets? Googling "PACKET_ORIGDEV" shows
implementations of some low-level protocols.

> > and if it should be a new socket option which would work
> > on any socket independently from timestamping, or if it should rather
> > be a new flag for the SO_TIMESTAMPING option. If the latter, would it
> > make sense to put them in the skb_shared_hwtstamps structure and
> > modify all drivers to set the values when a HW timestamp is captured
> > instead of adding more code to __netif_receive_skb_core() or similar?
> 
> This information is solely for a highly specialized NTP application.
> No normal program would ever need this, AFAICT.  So, if possible,
> getting the original frame length should be done in a way that doesn't
> affect users that don't need it.

Ok. I'll put the two fields in skb_shared_hwtstamps, taking the place
of the old syststamp field, and try to avoid adding any code to paths
not specific to timestamping.

Thanks,

-- 
Miroslav Lichvar

* Re: Extending socket timestamping API for NTP
       [not found]       ` <6121D504-288F-4C9B-9AB3-D1C8292965D5@me.com>
@ 2017-03-24  9:45         ` Miroslav Lichvar
  2017-03-24 17:17           ` Denny Page
  2017-03-24  9:55         ` Jiri Benc
  1 sibling, 1 reply; 47+ messages in thread
From: Miroslav Lichvar @ 2017-03-24  9:45 UTC (permalink / raw)
  To: Denny Page
  Cc: Richard Cochran, netdev, Jiri Benc, Keller, Jacob E, Willem de Bruijn

On Thu, Mar 23, 2017 at 10:08:00AM -0700, Denny Page wrote:
> > On Mar 23, 2017, at 09:21, Miroslav Lichvar <mlichvar@redhat.com> wrote:
> > 
> > After becoming a bit more familiar with the code I don't think this is
> > a good idea anymore :). I suspect there would be a noticeable
> > performance impact if each timestamped packet could trigger reading of
> > the current link speed. If the value had to be cached it would make
> > more sense to do it in the application.
> 
> I am very surprised at this. The application caching approach requires the application retrieve the value via a system call. The system call overhead is huge in comparison to everything else. More importantly, the application cached value may be wrong. If the application takes a sample every 5 seconds, there are 5 seconds of timestamps that can be wildly wrong.

I'm just trying to be practical and minimize the performance impact
and the amount of code that needs to be written, reviewed and
maintained.

How common is it to have link speed changing in normal operation on a LAN?

There are other problems with changing link speed. It does not affect
only the transposition.

- If the change happens during a measurement, anywhere on the path
  between the server and client, the measured offset will have an
  error due to the asymmetry in the network delay. An NTP measurement
  in the basic mode is short (it's just the round-trip time), so it's
  not very likely to be hit by a link speed change, but in the
  interleaved mode the probability is exactly the opposite.

- Even if the measurement is not hit and the measured offset is
  accurate, the change in the measured delay may confuse the NTP
  client (e.g. temporarily disrupt its sample filtering).

- The TX/RX compensation values depend on the link speed. If the
  application doesn't reliably know link speed for each packet, the
  compensation cannot be reliable either.

It seems to me for best timekeeping with NTP it is necessary to make
sure the link speed in the network is constant, even if the
transposition was always accurate.
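
The asymmetry error from the first point can be made concrete with the
standard NTP offset and delay calculation. The numbers below are
hypothetical: clocks are assumed perfectly in sync, only the
serialization delay of a 94-byte frame is counted, and the request is
assumed to travel at 100 Mb/s while the response travels at 1 Gb/s.

```python
def ntp_offset_delay(t1, t2, t3, t4):
    """Standard NTP on-wire calculation.

    t1: client transmit, t2: server receive,
    t3: server transmit, t4: client receive (all in ns here).
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay

# True offset is zero. Request takes 7520 ns, the server holds the
# packet for 1000 ns, response takes 752 ns.
t1 = 0
t2 = t1 + 7520
t3 = t2 + 1000
t4 = t3 + 752
offset, delay = ntp_offset_delay(t1, t2, t3, t4)
print(offset, delay)  # 3384.0 8272
```

The measured offset is half the difference between the two one-way
delays, so in this example the client would see about 3.4 microseconds
of error even though the clocks agree.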

-- 
Miroslav Lichvar

* Re: Extending socket timestamping API for NTP
       [not found]       ` <6121D504-288F-4C9B-9AB3-D1C8292965D5@me.com>
  2017-03-24  9:45         ` Miroslav Lichvar
@ 2017-03-24  9:55         ` Jiri Benc
  1 sibling, 0 replies; 47+ messages in thread
From: Jiri Benc @ 2017-03-24  9:55 UTC (permalink / raw)
  To: Denny Page
  Cc: Miroslav Lichvar, Richard Cochran, netdev, Keller, Jacob E,
	Willem de Bruijn

On Thu, 23 Mar 2017 10:08:00 -0700, Denny Page wrote:
> I am very surprised at this. The application caching approach
> requires the application retrieve the value via a system call. The
> system call overhead is huge in comparison to everything else. More
> importantly, the application cached value may be wrong. If the
> application takes a sample every 5 seconds, there are 5 seconds of
> timestamps that can be wildly wrong.

You can add a netlink event that is sent on speed change. No need for
polling then and the wrong timestamp window will be very tiny. (It
can't be zero, even with per-packet data.)
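
On the application side this boils down to an event-invalidated cache.
A minimal sketch, assuming some handler gets called when such a
notification arrives; the netlink subscription itself is left out, and
the class and method names are hypothetical:

```python
class LinkSpeedCache:
    """Caches the link speed and re-reads it only after a change event."""

    def __init__(self, read_speed):
        self._read_speed = read_speed  # callable returning Mb/s
        self._speed = None

    def on_link_event(self):
        # Would be called from the (hypothetical) notification handler.
        self._speed = None

    def speed(self):
        # Only the first call after an event pays for a real read.
        if self._speed is None:
            self._speed = self._read_speed()
        return self._speed
```

Between events every lookup is served from the cached value, so an
unchanged link costs no extra system calls.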

ethtool needs to be converted to netlink, anyway.

 Jiri

* Re: Extending socket timestamping API for NTP
  2017-03-24  9:45         ` Miroslav Lichvar
@ 2017-03-24 17:17           ` Denny Page
  2017-03-24 18:52             ` Keller, Jacob E
  2017-03-27 10:13             ` Miroslav Lichvar
  0 siblings, 2 replies; 47+ messages in thread
From: Denny Page @ 2017-03-24 17:17 UTC (permalink / raw)
  To: Miroslav Lichvar
  Cc: Richard Cochran, netdev, Jiri Benc, Keller, Jacob E, Willem de Bruijn


> On Mar 24, 2017, at 02:45, Miroslav Lichvar <mlichvar@redhat.com> wrote:
> 
> On Thu, Mar 23, 2017 at 10:08:00AM -0700, Denny Page wrote:
>>> On Mar 23, 2017, at 09:21, Miroslav Lichvar <mlichvar@redhat.com> wrote:
>>> 
>>> After becoming a bit more familiar with the code I don't think this is
>>> a good idea anymore :). I suspect there would be a noticeable
>>> performance impact if each timestamped packet could trigger reading of
>>> the current link speed. If the value had to be cached it would make
>>> more sense to do it in the application.
>> 
>> I am very surprised at this. The application caching approach requires the application retrieve the value via a system call. The system call overhead is huge in comparison to everything else. More importantly, the application cached value may be wrong. If the application takes a sample every 5 seconds, there are 5 seconds of timestamps that can be wildly wrong.
> 
> I'm just trying to be practical and minimize the performance impact
> and the amount of code that needs to be written, reviewed and
> maintained.
> 
> How common is it to have link speed changing in normal operation on a LAN?

In my case, it’s currently every few minutes because I’m doing hw timestamp testing. :)

But this does speak to my point. If it’s cached by the application, the application has to check it regularly to minimize the possibility of bad timestamps. If the link speed doesn’t change, every call by the application is wasted overhead. If it’s cached by the driver, there is no waste, and the stamps are always correct.

I should have remembered this yesterday... I went and looked at my favorite driver, Intel's igb. Not only is the igb driver already caching link speed, it is also performing timestamp correction based on that link speed. It appears that all Intel drivers are caching link speed. I looked at a few other popular manufacturers, and it appears that caching link speed is common. The only one I quickly found that didn’t cache was realtek.

I believe that timestamp correction, whether it be speed based latency, header -> trailer, or whatever else might be needed later down the line, is properly done in the driver. It’s a lot for the application to try and figure out if it should or should not be doing corrections and what correction to apply. The driver knows.

* RE: Extending socket timestamping API for NTP
  2017-03-24 17:17           ` Denny Page
@ 2017-03-24 18:52             ` Keller, Jacob E
  2017-03-27 10:13             ` Miroslav Lichvar
  1 sibling, 0 replies; 47+ messages in thread
From: Keller, Jacob E @ 2017-03-24 18:52 UTC (permalink / raw)
  To: Denny Page, Miroslav Lichvar
  Cc: Richard Cochran, netdev, Jiri Benc, Willem de Bruijn

> -----Original Message-----
> From: Denny Page [mailto:dennypage@me.com]
> Sent: Friday, March 24, 2017 10:18 AM
> To: Miroslav Lichvar <mlichvar@redhat.com>
> Cc: Richard Cochran <richardcochran@gmail.com>; netdev@vger.kernel.org; Jiri
> Benc <jbenc@redhat.com>; Keller, Jacob E <jacob.e.keller@intel.com>; Willem
> de Bruijn <willemb@google.com>
> Subject: Re: Extending socket timestamping API for NTP
> 
> 
> > On Mar 24, 2017, at 02:45, Miroslav Lichvar <mlichvar@redhat.com> wrote:
> >
> > On Thu, Mar 23, 2017 at 10:08:00AM -0700, Denny Page wrote:
> >>> On Mar 23, 2017, at 09:21, Miroslav Lichvar <mlichvar@redhat.com> wrote:
> >>>
> >>> After becoming a bit more familiar with the code I don't think this is
> >>> a good idea anymore :). I suspect there would be a noticeable
> >>> performance impact if each timestamped packet could trigger reading of
> >>> the current link speed. If the value had to be cached it would make
> >>> more sense to do it in the application.
> >>
> >> I am very surprised at this. The application caching approach requires the
> >> application retrieve the value via a system call. The system call overhead is huge
> >> in comparison to everything else. More importantly, the application cached value
> >> may be wrong. If the application takes a sample every 5 seconds, there are 5
> >> seconds of timestamps that can be wildly wrong.
> >
> > I'm just trying to be practical and minimize the performance impact
> > and the amount of code that needs to be written, reviewed and
> > maintained.
> >
> > How common is it to have link speed changing in normal operation on a LAN?
> 
> In my case, it’s currently every few minutes because I’m doing hw timestamp
> testing. :)
> 
> But this does speak to my point. If it’s cached by the application, the application
> has to check it regularly to minimize the possibility of bad timestamps. If the link
> speed doesn’t change, every call by the application is wasted overhead. If it’s
> cached by the driver, there is no waste, and the stamps are always correct.
> 
> I should have remembered this yesterday... I went and looked at my favorite
> driver, Intel's igb. Not only is the igb driver already caching link speed, it is also
> performing timestamp correction based on that link speed. It appears that all
> Intel drivers are caching link speed. I looked at a few other popular
> manufacturers, and it appears that caching link speed is common. The only one I
> quickly found that didn’t cache was realtek.
> 
> I believe that timestamp correction, whether it be speed based latency,
> header -> trailer, or whatever else might be needed later down the line, are
> properly done in the driver. It’s a lot for the application to try and figure
> out if it should or should not be doing corrections and what correction to
> apply. The driver knows.

I also believe the right place for these corrections is in the driver.

Thanks,
Jake

* Re: Extending socket timestamping API for NTP
  2017-03-24 17:17           ` Denny Page
  2017-03-24 18:52             ` Keller, Jacob E
@ 2017-03-27 10:13             ` Miroslav Lichvar
  2017-03-27 14:29               ` Richard Cochran
  1 sibling, 1 reply; 47+ messages in thread
From: Miroslav Lichvar @ 2017-03-27 10:13 UTC (permalink / raw)
  To: Denny Page
  Cc: Richard Cochran, netdev, Jiri Benc, Keller, Jacob E, Willem de Bruijn

On Fri, Mar 24, 2017 at 10:17:51AM -0700, Denny Page wrote:
> > On Mar 24, 2017, at 02:45, Miroslav Lichvar <mlichvar@redhat.com> wrote:
> > How common is it to have link speed changing in normal operation on a LAN?
> 
> In my case, it’s currently every few minutes because I’m doing hw timestamp testing. :)
> 
> But this does speak to my point. If it’s cached by the application, the application has to check it regularly to minimize the possibility of bad timestamps. If the link speed doesn’t change, every call by the application is wasted overhead. If it’s cached by the driver, there is no waste, and the stamps are always correct.

At least on the HW I'm testing, reading the link speed from user space
doesn't take much. It's about 10-15x faster than reading the PHC for
instance, which must be done periodically in any case.
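
For reference, one way to read the link speed from user space is the
sysfs "speed" attribute of the interface. This is only a sketch and not
necessarily the method behind the comparison above; the function name
and the sysfs_root parameter are illustrative.

```python
import os

def read_link_speed(iface, sysfs_root="/sys/class/net"):
    """Return the link speed in Mb/s, or None if it is not known."""
    path = os.path.join(sysfs_root, iface, "speed")
    try:
        with open(path) as f:
            speed = int(f.read().strip())
    except (OSError, ValueError):
        # Missing interface, link down, or unparsable content.
        return None
    # Non-positive values mean the speed is unknown.
    return speed if speed > 0 else None
```

A caller would use it as read_link_speed("eth0") and treat None as "do
not transpose or compensate this timestamp".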

> I should have remembered this yesterday... I went and looked at my favorite driver, Intel's igb. Not only is the igb driver already caching link speed, it is also performing timestamp correction based on that link speed.

Isn't the i210 the only NIC for which the correction is actually
implemented? Will this ever be done for all HW with timestamping
support, so that the applications wouldn't have to care about link
speed?

> I believe that timestamp correction, whether it be speed based latency, header -> trailer, or whatever else might be needed later down the line, are properly done in the driver. It’s a lot for the application to try and figure out if it should or should not be doing corrections and what correction to apply. The driver knows.

I agree, but I'm not sure how feasible that is.

-- 
Miroslav Lichvar

* Re: Extending socket timestamping API for NTP
  2017-03-27 10:13             ` Miroslav Lichvar
@ 2017-03-27 14:29               ` Richard Cochran
  2017-03-27 16:25                 ` Denny Page
  0 siblings, 1 reply; 47+ messages in thread
From: Richard Cochran @ 2017-03-27 14:29 UTC (permalink / raw)
  To: Miroslav Lichvar
  Cc: Denny Page, netdev, Jiri Benc, Keller, Jacob E, Willem de Bruijn

On Mon, Mar 27, 2017 at 12:13:24PM +0200, Miroslav Lichvar wrote:
> On Fri, Mar 24, 2017 at 10:17:51AM -0700, Denny Page wrote:
> > I should have remembered this yesterday... I went and looked at my favorite driver, Intel's igb. Not only is the igb driver already caching link speed, it is also performing timestamp correction based on that link speed.
> 
> Isn't the i210 the only NIC for which the correction is actually
> implemented?

Yes.

> Will this ever be done for all HW with timestamping
> support, so that the applications wouldn't have to care about link
> speed?

No.

At the end of the day, the correction in the igb driver is useless and
even harmful.  Why?  Because if the app cares about this level of
accuracy, it is going to have to implement special logic anyhow, and
having a special case for the igb is even more work for the app.

In addition, if you look into the igb data sheet, you will find a
range of correction values, with little indication of how they
measured the latency and what the ranges depend on.  In my
experiments, I have seen the igb consistently land on the extreme of
one of the ranges (who knows why), but the driver corrects using the
average, forcing me then to correct the remaining offset by hand.

> > I believe that timestamp correction, whether it be speed based latency, header -> trailer, or whatever else might be needed later down the line, are properly done in the driver. It’s a lot for the application to try and figure out if it should or should not be doing corrections and what correction to apply. The driver knows.
> 
> I agree, but I'm not sure how feasible that is.

+1

Thanks,
Richard

* Re: Extending socket timestamping API for NTP
  2017-03-27 14:29               ` Richard Cochran
@ 2017-03-27 16:25                 ` Denny Page
  2017-03-27 18:28                   ` Richard Cochran
  0 siblings, 1 reply; 47+ messages in thread
From: Denny Page @ 2017-03-27 16:25 UTC (permalink / raw)
  To: Richard Cochran
  Cc: Miroslav Lichvar, netdev, Jiri Benc, Keller, Jacob E, Willem de Bruijn


> On Mar 27, 2017, at 07:29, Richard Cochran <richardcochran@gmail.com> wrote:
> 
> At the end of the day, the correction in the igb driver is useless and
> even harmful.  Why?  Because if the app cares about this level of
> accuracy, it is going to have to implement special logic anyhow, and
> having a special case for the igb is even more work for the app.

If you are doing correction in the application, _every_ driver is a special case. A driver making an average (known) correction is no more special than any other.


> In addition, if you look into the igb data sheet, you will find a
> range of correction values, with little indication of how they
> measured the latency and what the ranges depend on.

You are looking at the 2.2 datasheet I expect. The values for 10Mb and 1Gb have been removed from subsequent datasheets, however they have added a bit more detail as to how the values are measured and what the values.


> In my
> experiments, I have seen the igb consistently land on the extreme of
> one of the ranges (who knows why), but the driver corrects using the
> average, forcing me then to correct the remaining offset by hand.

I agree that the values in the igb driver are incorrect. They were middle of the range values from the old tables. At least for 100Mb, Intel seems to know that the original table was incorrect. I’ve done extensive measurements of the i210 and i211 at both 100Mb and 1Gb. The “external link partner” numbers Intel currently publishes for the 100Mb appear accurate. I’m still finalizing the values for 1Gb, but one thing I will note is that the values for master mode and slave mode are quite different. FWIW, master/slave mode correction is also something that can only be corrected in the driver :)

I am curious to know any data you developed in your experiments and how you did the measurements. Please email me directly if you are willing to share.

Thanks,
Denny

* Re: Extending socket timestamping API for NTP
  2017-03-27 16:25                 ` Denny Page
@ 2017-03-27 18:28                   ` Richard Cochran
  2017-03-27 19:18                     ` Denny Page
                                       ` (3 more replies)
  0 siblings, 4 replies; 47+ messages in thread
From: Richard Cochran @ 2017-03-27 18:28 UTC (permalink / raw)
  To: Denny Page
  Cc: Miroslav Lichvar, netdev, Jiri Benc, Keller, Jacob E, Willem de Bruijn

On Mon, Mar 27, 2017 at 09:25:03AM -0700, Denny Page wrote:

> I agree that the values in the igb driver are incorrect. They were
> middle of the range values from the old tables. At least for 100Mb,
> Intel seems to know that the original table was incorrect. I’ve done
> extensive measurements of the i210 and i211 at both 100Mb and
> 1Gb. The “external link partner” numbers Intel currently publishes
> for the 100Mb appear accurate.

Well, after reading this, I am more convinced than ever that doing the
correction in user space is the right way.  If the one and only vendor
who publishes numbers can't even get them straight, how on earth will
we ever get the drivers right?

> I’m still finalizing the values for 1Gb, but one thing I will note
> is that the values for master mode and slave mode are quite
> different. FWIW, master/slave mode correction is also something that
> can only be corrected in the driver :)

Actually, adding ethtool support for SyncE (and consequently Gigabit
Ethernet slave/master status) is something we have discussed in the
past.  I would support expanding the interface to accommodate this...
 
> I am curious to know any data you developed in your experiments and
> how you did the measurements. Please email me directly if you are
> willing to share.

I didn't do anything super methodical, and I didn't keep notes, but I
had a phyter (whose delays were published by TI and independently
confirmed in an ISPCS paper by Christian Riesch) and an i210 with a 100
MBit link and with a PPS between them.  The phyter's numbers are
correct to within a nanosecond, and I saw that the i210 was repeatedly
landing at the published extreme of the range.  I don't remember which
extreme, and I didn't repeat more than a few times, however.

Thanks,
Richard

* Re: Extending socket timestamping API for NTP
  2017-03-27 18:28                   ` Richard Cochran
@ 2017-03-27 19:18                     ` Denny Page
  2017-03-27 20:58                       ` Richard Cochran
  2017-03-27 19:21                     ` Denny Page
                                       ` (2 subsequent siblings)
  3 siblings, 1 reply; 47+ messages in thread
From: Denny Page @ 2017-03-27 19:18 UTC (permalink / raw)
  To: Richard Cochran
  Cc: Miroslav Lichvar, netdev, Jiri Benc, Keller, Jacob E, Willem de Bruijn


> On Mar 27, 2017, at 11:28, Richard Cochran <richardcochran@gmail.com> wrote:
> 
> On Mon, Mar 27, 2017 at 09:25:03AM -0700, Denny Page wrote:
> 
>> I agree that the values in the igb driver are incorrect. They were
>> middle of the range values from the old tables. At least for 100Mb,
>> Intel seems to know that the original table was incorrect. I’ve done
>> extensive measurements of the i210 and i211 at both 100Mb and
>> 1Gb. The “external link partner” numbers Intel currently publishes
>> for the 100Mb appear accurate.
> 
> Well, after reading this, I am more convinced than ever that doing the
> correction in user space is the right way.  If the one and only vendor
> who publishes numbers can't even get them straight, how on earth will
> we ever get the drivers right?

I think that on average, the vendor’s numbers are likely to be more accurate than anyone else’s. The concept that independent software implementations are going to somehow obtain and maintain better numbers is too much of a stretch.

FWIW, my testing indicates that the 100Mb numbers that Intel currently publishes are quite accurate. I don’t believe that Intel did the driver corrections, btw; if memory serves, these values were lifted from the Mac.

Denny


* Re: Extending socket timestamping API for NTP
  2017-03-27 18:28                   ` Richard Cochran
  2017-03-27 19:18                     ` Denny Page
@ 2017-03-27 19:21                     ` Denny Page
  2017-03-27 19:21                     ` Denny Page
       [not found]                     ` <5FD283AB-39DE-4A9D-902A-BA5F0F0B62A3@me.com>
  3 siblings, 0 replies; 47+ messages in thread
From: Denny Page @ 2017-03-27 19:21 UTC (permalink / raw)
  To: Richard Cochran
  Cc: Miroslav Lichvar, netdev, Jiri Benc, Keller, Jacob E, Willem de Bruijn

[Resend in plain text for vger]

> On Mar 27, 2017, at 11:28, Richard Cochran <richardcochran@gmail.com> wrote:
> 
> On Mon, Mar 27, 2017 at 09:25:03AM -0700, Denny Page wrote:
> 
>> I agree that the values in the igb driver are incorrect. They were
>> middle of the range values from the old tables. At least for 100Mb,
>> Intel seems to know that the original table was incorrect. I’ve done
>> extensive measurements of the i210 and i211 at both 100Mb and
>> 1Gb. The “external link partner” numbers Intel currently publishes
>> for the 100Mb appear accurate.
> 
> Well, after reading this, I am more convinced than ever that doing the
> correction in user space is the right way.  If the one and only vendor
> who publishes numbers can't even get them straight, how on earth will
> we ever get the drivers right?

I think that on average, the vendor’s numbers are likely to be more accurate than anyone else’s. The concept that independent software implementations are going to somehow obtain and maintain better numbers is too much of a stretch.

FWIW, my testing indicates that the 100Mb numbers that Intel currently publishes are quite accurate. I don’t believe that Intel did the driver corrections, btw; if memory serves, these values were lifted from the Mac.

Denny


* Re: Extending socket timestamping API for NTP
  2017-03-27 18:28                   ` Richard Cochran
  2017-03-27 19:18                     ` Denny Page
  2017-03-27 19:21                     ` Denny Page
@ 2017-03-27 19:21                     ` Denny Page
       [not found]                     ` <5FD283AB-39DE-4A9D-902A-BA5F0F0B62A3@me.com>
  3 siblings, 0 replies; 47+ messages in thread
From: Denny Page @ 2017-03-27 19:21 UTC (permalink / raw)
  To: Richard Cochran
  Cc: Miroslav Lichvar, netdev, Jiri Benc, Keller, Jacob E, Willem de Bruijn

[Resend in plain text for vger]

> On Mar 27, 2017, at 11:28, Richard Cochran <richardcochran@gmail.com> wrote:
> 
> I didn't do anything super methodical, and I didn't keep notes, but I
> had a phyter (whose delays were published by TI and independently
> confirmed in an ISPCS paper by Christian Riesch) and an i210 with a 100
> MBit link and with a PPS between them.  The phyter's numbers are
> correct to within a nanosecond, and I saw that the i210 was repeatedly
> landing at the published extreme of the range.  I don't remember which
> extreme, and I didn't repeat more than a few times, however.

Do you still have the resulting correction values from this?

Thanks,
Denny


* Re: Extending socket timestamping API for NTP
  2017-03-27 19:18                     ` Denny Page
@ 2017-03-27 20:58                       ` Richard Cochran
  2017-03-27 21:20                         ` Denny Page
  0 siblings, 1 reply; 47+ messages in thread
From: Richard Cochran @ 2017-03-27 20:58 UTC (permalink / raw)
  To: Denny Page
  Cc: Miroslav Lichvar, netdev, Jiri Benc, Keller, Jacob E, Willem de Bruijn

On Mon, Mar 27, 2017 at 12:18:47PM -0700, Denny Page wrote:
> I think that on average, the Vendor’s numbers are likely to be more
> accurate than anyone else’s. The concept that independent software
> implementations are going to somehow obtain and maintain better
> numbers is too much of a stretch.

But you just said that Intel's first published numbers were wrong.  If
the vendors had published accurate information, then you would not
have had to make your own measurements, and the drivers could simply
use the correct values.

Sadly, this will never happen.  The vendor's track record is 100%
fail.  The apps will always need to implement their own, truly correct
values.  Having "almost correct" values hard coded into the drivers
only makes things worse.

> FWIW, my testing indicates that the 100Mb numbers that Intel
> currently publishes are quite accurate. I don’t believe that Intel
> did the driver corrections, btw; if memory serves, these values were
> lifted from the Mac.

Huh?  Mac?  -ENOPARSE.

Thanks,
Richard


* Re: Extending socket timestamping API for NTP
       [not found]                     ` <5FD283AB-39DE-4A9D-902A-BA5F0F0B62A3@me.com>
@ 2017-03-27 21:00                       ` Richard Cochran
  0 siblings, 0 replies; 47+ messages in thread
From: Richard Cochran @ 2017-03-27 21:00 UTC (permalink / raw)
  To: Denny Page
  Cc: Miroslav Lichvar, netdev, Jiri Benc, Keller, Jacob E, Willem de Bruijn

On Mon, Mar 27, 2017 at 12:19:57PM -0700, Denny Page wrote:
> Do you still have the resulting correction values from this?

No, I don't, but next time I drag out the phyter I will take another
look and let you know.

Thanks,
Richard


* Re: Extending socket timestamping API for NTP
  2017-03-27 20:58                       ` Richard Cochran
@ 2017-03-27 21:20                         ` Denny Page
  0 siblings, 0 replies; 47+ messages in thread
From: Denny Page @ 2017-03-27 21:20 UTC (permalink / raw)
  To: Richard Cochran
  Cc: Miroslav Lichvar, netdev, Jiri Benc, Keller, Jacob E, Willem de Bruijn


> On Mar 27, 2017, at 13:58, Richard Cochran <richardcochran@gmail.com> wrote:
> 
> On Mon, Mar 27, 2017 at 12:18:47PM -0700, Denny Page wrote:
>> I think that on average, the Vendor’s numbers are likely to be more
>> accurate than anyone else’s. The concept that independent software
>> implementations are going to somehow obtain and maintain better
>> numbers is too much of a stretch.
> 
> But you just said that Intel's first published numbers were wrong.  If
> the vendors had published accurate information, then you would not
> have had to make your own measurements, and the drivers could simply
> use the correct values.
> 
> Sadly, this will never happen.  The vendor's track record is 100%
> fail.  The apps will always need to implement their own, truly correct
> values.  Having "almost correct" values hard coded into the drivers
> only makes things worse.

Yes, Intel’s original numbers were wrong. But that doesn’t mean that other people’s numbers are going to be particularly better. Even Intel’s original numbers were far better than most will be able to achieve.

But let’s bring this back to the driver. If someone conducts tests and believes that they have better numbers than those currently used in the driver, let them come forward with their information and propose a kernel patch. No harm in that at all. And much easier than bringing a patch for dozens of applications.

Denny
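
[Editorial illustration, not part of the original message.] The disagreement above is about where the PHY delay correction should live: in the driver or in each application. As a rough sketch of the user-space variant, the correction can be a small table keyed by link speed and, for gigabit links, by master/slave role, since the thread notes that the two modes differ. Everything here is hypothetical: the names `Latency`, `CORRECTIONS`, `lookup`, `correct_tx` and `correct_rx`, and all of the numbers, which are invented placeholders rather than measured or vendor-published values. The sketch also assumes the hardware timestamp is captured before the packet reaches the wire on transmit and after it leaves the wire on receive.

```python
from typing import Dict, NamedTuple, Optional, Tuple


class Latency(NamedTuple):
    tx_ns: int  # assumed delay from the hardware TX timestamp to the wire
    rx_ns: int  # assumed delay from the wire to the hardware RX timestamp


# Hypothetical table keyed by (link speed in Mb/s, gigabit role or None).
# The numbers are invented placeholders, NOT measured or vendor values.
CORRECTIONS: Dict[Tuple[int, Optional[str]], Latency] = {
    (100, None): Latency(tx_ns=100, rx_ns=200),
    (1000, "master"): Latency(tx_ns=30, rx_ns=40),
    (1000, "slave"): Latency(tx_ns=50, rx_ns=60),
}


def lookup(speed_mbps: int, role: Optional[str] = None) -> Latency:
    """Return the correction for a link, or a zero correction if unknown."""
    return CORRECTIONS.get((speed_mbps, role), Latency(0, 0))


def correct_tx(ts_ns: int, speed_mbps: int, role: Optional[str] = None) -> int:
    """The packet hits the wire after the timestamp, so add the TX delay."""
    return ts_ns + lookup(speed_mbps, role).tx_ns


def correct_rx(ts_ns: int, speed_mbps: int, role: Optional[str] = None) -> int:
    """The packet left the wire before the timestamp, so subtract the RX delay."""
    return ts_ns - lookup(speed_mbps, role).rx_ns
```

An application would call `correct_tx()` or `correct_rx()` on each raw hardware timestamp after working out the current link speed and role by some other means. Putting the same table in the driver, as Denny argues, would fix the numbers once for every application; keeping it in user space, as Richard argues, lets each application replace numbers it believes are wrong.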


end of thread, other threads:[~2017-03-27 21:20 UTC | newest]

Thread overview: 47+ messages (download: mbox.gz / follow: Atom feed)
2017-02-07 14:01 Extending socket timestamping API for NTP Miroslav Lichvar
2017-02-07 17:45 ` Keller, Jacob E
2017-02-07 22:32   ` Willem de Bruijn
2017-02-08 14:18     ` Soheil Hassas Yeganeh
2017-02-27 15:23     ` Miroslav Lichvar
2017-02-28  0:01       ` Willem de Bruijn
2017-02-28  8:26         ` Miroslav Lichvar
2017-02-28 21:05           ` Willem de Bruijn
2017-02-08  1:52   ` Denny Page
2017-02-08  5:27     ` Richard Cochran
2017-02-08  5:48       ` Denny Page
2017-02-08 17:27       ` Denny Page
2017-02-07 18:54 ` Soheil Hassas Yeganeh
2017-02-08 10:14   ` Miroslav Lichvar
2017-02-07 20:37 ` sdncurious
2017-02-08 10:26   ` Miroslav Lichvar
2017-02-08 23:27     ` sdncurious
2017-02-08 23:34     ` sdncurious
2017-02-08  1:18 ` Denny Page
     [not found] ` <CAHoNx58u=Fze4e5V2Wb_LiBhka1Mzny3zOVNfvuzjnmQ4wBO=Q@mail.gmail.com>
2017-02-08  3:06   ` Denny Page
2017-02-09  0:45 ` Denny Page
2017-02-09 11:15   ` Miroslav Lichvar
2017-02-09 20:25   ` Denny Page
2017-02-09  8:02 ` Richard Cochran
2017-02-09 11:09   ` Miroslav Lichvar
2017-02-09 19:42     ` sdncurious
2017-02-09 20:37       ` Denny Page
2017-02-10  0:33       ` Denny Page
2017-02-10 18:55         ` Denny Page
2017-03-23 16:21     ` Miroslav Lichvar
2017-03-23 18:54       ` Denny Page
2017-03-23 19:07       ` Richard Cochran
2017-03-24  7:25         ` Miroslav Lichvar
     [not found]       ` <6121D504-288F-4C9B-9AB3-D1C8292965D5@me.com>
2017-03-24  9:45         ` Miroslav Lichvar
2017-03-24 17:17           ` Denny Page
2017-03-24 18:52             ` Keller, Jacob E
2017-03-27 10:13             ` Miroslav Lichvar
2017-03-27 14:29               ` Richard Cochran
2017-03-27 16:25                 ` Denny Page
2017-03-27 18:28                   ` Richard Cochran
2017-03-27 19:18                     ` Denny Page
2017-03-27 20:58                       ` Richard Cochran
2017-03-27 21:20                         ` Denny Page
2017-03-27 19:21                     ` Denny Page
2017-03-27 19:21                     ` Denny Page
     [not found]                     ` <5FD283AB-39DE-4A9D-902A-BA5F0F0B62A3@me.com>
2017-03-27 21:00                       ` Richard Cochran
2017-03-24  9:55         ` Jiri Benc
