linux-doc.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next 0/3] Document more PTP timestamping known quirks
@ 2020-07-17 16:10 Vladimir Oltean
  2020-07-17 16:10 ` [PATCH net-next 1/3] docs: networking: timestamping: rename last section to "Known bugs" Vladimir Oltean
                   ` (4 more replies)
  0 siblings, 5 replies; 26+ messages in thread
From: Vladimir Oltean @ 2020-07-17 16:10 UTC (permalink / raw)
  To: kuba, davem, netdev; +Cc: richardcochran, sorganov, linux-doc

I've tried to collect and summarize the conclusions of these discussions:
https://patchwork.ozlabs.org/project/netdev/patch/20200711120842.2631-1-sorganov@gmail.com/
https://patchwork.ozlabs.org/project/netdev/patch/20200710113611.3398-5-kurt@linutronix.de/
which were a bit surprising to me. Make sure they are present in the
documentation.

Vladimir Oltean (3):
  docs: networking: timestamping: rename last section to "Known bugs".
  docs: networking: timestamping: add one more known issue
  docs: networking: timestamping: add a set of frequently asked
    questions

 Documentation/networking/timestamping.rst | 77 ++++++++++++++++++++---
 1 file changed, 70 insertions(+), 7 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH net-next 1/3] docs: networking: timestamping: rename last section to "Known bugs".
  2020-07-17 16:10 [PATCH net-next 0/3] Document more PTP timestamping known quirks Vladimir Oltean
@ 2020-07-17 16:10 ` Vladimir Oltean
  2020-07-17 22:05   ` Jacob Keller
  2020-07-17 16:10 ` [PATCH net-next 2/3] docs: networking: timestamping: add one more known issue Vladimir Oltean
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 26+ messages in thread
From: Vladimir Oltean @ 2020-07-17 16:10 UTC (permalink / raw)
  To: kuba, davem, netdev; +Cc: richardcochran, sorganov, linux-doc

One more quirk of the timestamping infrastructure will be documented
shortly. Rename the section from "Other caveats for MAC drivers" to
simply "Known bugs". This uncovers some bad phrasing at the beginning of
the section, which is now corrected.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
---
 Documentation/networking/timestamping.rst | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/Documentation/networking/timestamping.rst b/Documentation/networking/timestamping.rst
index 5fa4e2274dd9..9a1f4cb4ce9e 100644
--- a/Documentation/networking/timestamping.rst
+++ b/Documentation/networking/timestamping.rst
@@ -711,14 +711,14 @@ discoverable and attachable to a ``struct phy_device`` through Device Tree, and
 for the rest, they use the same mii_ts infrastructure as those. See
 Documentation/devicetree/bindings/ptp/timestamper.txt for more details.
 
-3.2.4 Other caveats for MAC drivers
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-Stacked PHCs, especially DSA (but not only) - since that doesn't require any
-modification to MAC drivers, so it is more difficult to ensure correctness of
-all possible code paths - is that they uncover bugs which were impossible to
-trigger before the existence of stacked PTP clocks.  One example has to do with
-this line of code, already presented earlier::
+3.2.4 Known bugs
+^^^^^^^^^^^^^^^^
+
+One caveat with stacked PHCs, especially DSA (but not only) - since that
+doesn't require any modification to MAC drivers, so it is more difficult to
+ensure correctness of all possible code paths - is that they uncover bugs which
+were impossible to trigger before the existence of stacked PTP clocks.
+One example has to do with this line of code, already presented earlier::
 
       skb_shinfo(skb)->tx_flags |= SKBTX_IN_PROGRESS;
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH net-next 2/3] docs: networking: timestamping: add one more known issue
  2020-07-17 16:10 [PATCH net-next 0/3] Document more PTP timestamping known quirks Vladimir Oltean
  2020-07-17 16:10 ` [PATCH net-next 1/3] docs: networking: timestamping: rename last section to "Known bugs" Vladimir Oltean
@ 2020-07-17 16:10 ` Vladimir Oltean
  2020-07-17 23:08   ` Jacob Keller
  2020-07-17 16:10 ` [PATCH net-next 3/3] docs: networking: timestamping: add a set of frequently asked questions Vladimir Oltean
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 26+ messages in thread
From: Vladimir Oltean @ 2020-07-17 16:10 UTC (permalink / raw)
  To: kuba, davem, netdev; +Cc: richardcochran, sorganov, linux-doc

Document the fact that Ethernet PHY timestamping has a fundamentally
flawed corner case (which in fact hits the majority of networking
drivers): a PHY for which its host MAC driver doesn't forward the
phy_mii_ioctl for timestamping is still going to be presented to user
space as functional.

Fixing this inconsistency would require moving the phy_has_tsinfo()
check inside all MAC drivers which are capable of PHY timestamping, to
be in harmony with the existing design for phy_has_hwtstamp() checks.
Instead of doing that, document the preferable solution which is that
offending MAC drivers be fixed instead.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
---
 Documentation/networking/timestamping.rst | 37 +++++++++++++++++++++++
 1 file changed, 37 insertions(+)

diff --git a/Documentation/networking/timestamping.rst b/Documentation/networking/timestamping.rst
index 9a1f4cb4ce9e..4004c5d2771d 100644
--- a/Documentation/networking/timestamping.rst
+++ b/Documentation/networking/timestamping.rst
@@ -754,3 +754,40 @@ check in their "TX confirmation" portion, not only for
 that PTP timestamping is not enabled for anything other than the outermost PHC,
 this enhanced check will avoid delivering a duplicated TX timestamp to user
 space.
+
+Another known limitation is the design of the ``__ethtool_get_ts_info``
+function::
+
+  int __ethtool_get_ts_info(struct net_device *dev, struct ethtool_ts_info *info)
+  {
+          const struct ethtool_ops *ops = dev->ethtool_ops;
+          struct phy_device *phydev = dev->phydev;
+
+          memset(info, 0, sizeof(*info));
+          info->cmd = ETHTOOL_GET_TS_INFO;
+
+          if (phy_has_tsinfo(phydev))
+                  return phy_ts_info(phydev, info);
+          if (ops->get_ts_info)
+                  return ops->get_ts_info(dev, info);
+
+          info->so_timestamping = SOF_TIMESTAMPING_RX_SOFTWARE |
+                                  SOF_TIMESTAMPING_SOFTWARE;
+          info->phc_index = -1;
+
+          return 0;
+  }
+
+Because the generic function searches first for the timestamping capabilities
+of the attached PHY, and returns them directly without consulting the MAC
+driver, no checking is being done whether the requirements described in `3.2.2
+Ethernet PHYs`_ are implemented or not. Therefore, if the MAC driver does not
+satisfy the requirements for PHY timestamping, and
+``CONFIG_NETWORK_PHY_TIMESTAMPING`` is enabled, then a non-functional PHC index
+(the one corresponding to the PHY) will be reported to user space, via
+``ethtool -T``.
+
+The correct solution to this problem is to implement the PHY timestamping
+requirements in the MAC driver found broken, and submit as a bug fix patch to
+netdev@vger.kernel.org. See :ref:`Documentation/process/stable-kernel-rules.rst
+<stable_kernel_rules>` for more details.
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH net-next 3/3] docs: networking: timestamping: add a set of frequently asked questions
  2020-07-17 16:10 [PATCH net-next 0/3] Document more PTP timestamping known quirks Vladimir Oltean
  2020-07-17 16:10 ` [PATCH net-next 1/3] docs: networking: timestamping: rename last section to "Known bugs" Vladimir Oltean
  2020-07-17 16:10 ` [PATCH net-next 2/3] docs: networking: timestamping: add one more known issue Vladimir Oltean
@ 2020-07-17 16:10 ` Vladimir Oltean
  2020-07-17 23:12   ` Jacob Keller
  2020-07-17 21:13 ` [PATCH net-next 0/3] Document more PTP timestamping known quirks Sergey Organov
  2020-07-18 11:21 ` Sergey Organov
  4 siblings, 1 reply; 26+ messages in thread
From: Vladimir Oltean @ 2020-07-17 16:10 UTC (permalink / raw)
  To: kuba, davem, netdev; +Cc: richardcochran, sorganov, linux-doc

These are some questions I had while trying to explain the behavior of
some drivers with respect to software timestamping. Answered with the
help of Richard Cochran.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
---
 Documentation/networking/timestamping.rst | 26 +++++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/Documentation/networking/timestamping.rst b/Documentation/networking/timestamping.rst
index 4004c5d2771d..e01ec01179fe 100644
--- a/Documentation/networking/timestamping.rst
+++ b/Documentation/networking/timestamping.rst
@@ -791,3 +791,29 @@ The correct solution to this problem is to implement the PHY timestamping
 requirements in the MAC driver found broken, and submit as a bug fix patch to
 netdev@vger.kernel.org. See :ref:`Documentation/process/stable-kernel-rules.rst
 <stable_kernel_rules>` for more details.
+
+3.4 Frequently asked questions
+------------------------------
+
+Q: When should drivers set SKBTX_IN_PROGRESS?
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+When the interface they represent offers both ``SOF_TIMESTAMPING_TX_HARDWARE``
+and ``SOF_TIMESTAMPING_TX_SOFTWARE``.
+Originally, the network stack could deliver either a hardware or a software
+time stamp, but not both. This flag prevents software timestamp delivery.
+This restriction was eventually lifted via the ``SOF_TIMESTAMPING_OPT_TX_SWHW``
+option, but still the original behavior is preserved as the default.
+
+Q: Should drivers that don't offer SOF_TIMESTAMPING_TX_SOFTWARE call skb_tx_timestamp()?
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The ``skb_clone_tx_timestamp()`` function from its body helps with propagation
+of TX timestamps from PTP PHYs, and the required placement of this call is the
+same as for software TX timestamping.
+Additionally, since PTP is broken on ports with timestamping PHYs and unmet
+requirements, the consequence is that any driver which may be ever coupled to
+a timestamping-capable PHY in ``netdev->phydev`` should call at least
+``skb_clone_tx_timestamp()``. However, calling the higher-level
+``skb_tx_timestamp()`` instead achieves the same purpose, but also offers
+additional compliance to ``SOF_TIMESTAMPING_TX_SOFTWARE``.
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH net-next 0/3] Document more PTP timestamping known quirks
  2020-07-17 16:10 [PATCH net-next 0/3] Document more PTP timestamping known quirks Vladimir Oltean
                   ` (2 preceding siblings ...)
  2020-07-17 16:10 ` [PATCH net-next 3/3] docs: networking: timestamping: add a set of frequently asked questions Vladimir Oltean
@ 2020-07-17 21:13 ` Sergey Organov
  2020-07-17 21:57   ` Vladimir Oltean
  2020-07-18 11:21 ` Sergey Organov
  4 siblings, 1 reply; 26+ messages in thread
From: Sergey Organov @ 2020-07-17 21:13 UTC (permalink / raw)
  To: Vladimir Oltean; +Cc: kuba, davem, netdev, richardcochran, linux-doc

Vladimir Oltean <olteanv@gmail.com> writes:

> I've tried to collect and summarize the conclusions of these discussions:
> https://patchwork.ozlabs.org/project/netdev/patch/20200711120842.2631-1-sorganov@gmail.com/
> https://patchwork.ozlabs.org/project/netdev/patch/20200710113611.3398-5-kurt@linutronix.de/
> which were a bit surprising to me. Make sure they are present in the
> documentation.

As one of participants of these discussions, I'm afraid I incline to
alternative approach to solving the issues current design has than the one
you advocate in these patch series.

I believe its upper-level that should enforce common policies like
handling hw time stamping at outermost capable device, not random MAC
driver out there.

I'd argue that it's then upper-level that should check PHY features, and
then do not bother MAC with ioctl() requests that MAC should not handle
in given configuration. This way, the checks for phy_has_hwtstamp()
won't be spread over multiple MAC drivers and will happily sit in the
upper-level ioctl() handler.

In other words, I mean that it's approach taken in ethtool that I tend
to consider being the right one.

Thanks,
-- Sergey

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH net-next 0/3] Document more PTP timestamping known quirks
  2020-07-17 21:13 ` [PATCH net-next 0/3] Document more PTP timestamping known quirks Sergey Organov
@ 2020-07-17 21:57   ` Vladimir Oltean
  2020-07-17 23:13     ` Jacob Keller
  2020-07-18 10:54     ` Sergey Organov
  0 siblings, 2 replies; 26+ messages in thread
From: Vladimir Oltean @ 2020-07-17 21:57 UTC (permalink / raw)
  To: Sergey Organov; +Cc: kuba, davem, netdev, richardcochran, linux-doc

On Sat, Jul 18, 2020 at 12:13:42AM +0300, Sergey Organov wrote:
> Vladimir Oltean <olteanv@gmail.com> writes:
> 
> > I've tried to collect and summarize the conclusions of these discussions:
> > https://patchwork.ozlabs.org/project/netdev/patch/20200711120842.2631-1-sorganov@gmail.com/
> > https://patchwork.ozlabs.org/project/netdev/patch/20200710113611.3398-5-kurt@linutronix.de/
> > which were a bit surprising to me. Make sure they are present in the
> > documentation.
> 
> As one of participants of these discussions, I'm afraid I incline to
> alternative approach to solving the issues current design has than the one
> you advocate in these patch series.
> 
> I believe its upper-level that should enforce common policies like
> handling hw time stamping at outermost capable device, not random MAC
> driver out there.
> 
> I'd argue that it's then upper-level that should check PHY features, and
> then do not bother MAC with ioctl() requests that MAC should not handle
> in given configuration. This way, the checks for phy_has_hwtstamp()
> won't be spread over multiple MAC drivers and will happily sit in the
> upper-level ioctl() handler.
> 
> In other words, I mean that it's approach taken in ethtool that I tend
> to consider being the right one.
> 
> Thanks,
> -- Sergey

Concretely speaking, what are you going to do for
skb_defer_tx_timestamp() and skb_defer_rx_timestamp()? Not to mention
subtle bugs like SKBTX_IN_PROGRESS. If you don't address those, it's
pointless to move the phy_has_hwtstamp() check to net/core/dev_ioctl.c.

The only way I see to fix the bug is to introduce a new netdev flag,
NETIF_F_PHY_HWTSTAMP or something like that. Then I'd grep for all
occurrences of phy_has_hwtstamp() in the kernel (which currently amount
to a whopping 2 users, 3 with your FEC "fix"), and declare this
netdevice flag in their list of features. Then, phy_has_hwtstamp() and
phy_has_tsinfo() and what not can be moved to generic places (or at
least, I think they can), and those places could proceed to advertise
and enable PHY timestamping only if the MAC declared itself ready. But,
it is a bit strange to introduce a netdev flag just to fix a bug, I
think.

Thanks,
-Vladimir

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH net-next 1/3] docs: networking: timestamping: rename last section to "Known bugs".
  2020-07-17 16:10 ` [PATCH net-next 1/3] docs: networking: timestamping: rename last section to "Known bugs" Vladimir Oltean
@ 2020-07-17 22:05   ` Jacob Keller
  0 siblings, 0 replies; 26+ messages in thread
From: Jacob Keller @ 2020-07-17 22:05 UTC (permalink / raw)
  To: Vladimir Oltean, kuba, davem, netdev; +Cc: richardcochran, sorganov, linux-doc



On 7/17/2020 9:10 AM, Vladimir Oltean wrote:
> One more quirk of the timestamping infrastructure will be documented
> shortly. Rename the section from "Other caveats for MAC drivers" to
> simply "Known bugs". This uncovers some bad phrasing at the beginning of
> the section, which is now corrected.
> 
> Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
> ---
>  Documentation/networking/timestamping.rst | 16 ++++++++--------
>  1 file changed, 8 insertions(+), 8 deletions(-)
> 
> diff --git a/Documentation/networking/timestamping.rst b/Documentation/networking/timestamping.rst
> index 5fa4e2274dd9..9a1f4cb4ce9e 100644
> --- a/Documentation/networking/timestamping.rst
> +++ b/Documentation/networking/timestamping.rst
> @@ -711,14 +711,14 @@ discoverable and attachable to a ``struct phy_device`` through Device Tree, and
>  for the rest, they use the same mii_ts infrastructure as those. See
>  Documentation/devicetree/bindings/ptp/timestamper.txt for more details.
>  
> -3.2.4 Other caveats for MAC drivers
> -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> -
> -Stacked PHCs, especially DSA (but not only) - since that doesn't require any
> -modification to MAC drivers, so it is more difficult to ensure correctness of
> -all possible code paths - is that they uncover bugs which were impossible to
> -trigger before the existence of stacked PTP clocks.  One example has to do with
> -this line of code, already presented earlier::
> +3.2.4 Known bugs
> +^^^^^^^^^^^^^^^^
> +
> +One caveat with stacked PHCs, especially DSA (but not only) - since that
> +doesn't require any modification to MAC drivers, so it is more difficult to
> +ensure correctness of all possible code paths - is that they uncover bugs which
> +were impossible to trigger before the existence of stacked PTP clocks.
> +One example has to do with this line of code, already presented earlier::
>  

The interjection between - - is really long and made it difficult to
parse this statement. Maybe re-word it like

One caveat with stacked PHCs is that they uncover bugs which were
impossible to trigger otherwise, as it is more difficult to ensure
correctness of all possible code flows. This is especially true of DSA
since it does not require any modifications to the MAC drivers to setup.
One example has to do with this line of code, already presented earlier::


>        skb_shinfo(skb)->tx_flags |= SKBTX_IN_PROGRESS;
>  
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH net-next 2/3] docs: networking: timestamping: add one more known issue
  2020-07-17 16:10 ` [PATCH net-next 2/3] docs: networking: timestamping: add one more known issue Vladimir Oltean
@ 2020-07-17 23:08   ` Jacob Keller
  2020-07-18 11:36     ` Vladimir Oltean
  0 siblings, 1 reply; 26+ messages in thread
From: Jacob Keller @ 2020-07-17 23:08 UTC (permalink / raw)
  To: Vladimir Oltean, kuba, davem, netdev; +Cc: richardcochran, sorganov, linux-doc



On 7/17/2020 9:10 AM, Vladimir Oltean wrote:
> Document the fact that Ethernet PHY timestamping has a fundamentally
> flawed corner case (which in fact hits the majority of networking
> drivers): a PHY for which its host MAC driver doesn't forward the
> phy_mii_ioctl for timestamping is still going to be presented to user
> space as functional.
> 
> Fixing this inconsistency would require moving the phy_has_tsinfo()
> check inside all MAC drivers which are capable of PHY timestamping, to
> be in harmony with the existing design for phy_has_hwtstamp() checks.
> Instead of doing that, document the preferable solution which is that
> offending MAC drivers be fixed instead.

This statement feels weird. Aren't you documenting that the preferable
solution is? I.e. "Document this preferable solution: Fix the offending
MAC driver"

Or am I misunderstanding what the issue is here?

> 
> Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
> ---
>  Documentation/networking/timestamping.rst | 37 +++++++++++++++++++++++
>  1 file changed, 37 insertions(+)
> 
> diff --git a/Documentation/networking/timestamping.rst b/Documentation/networking/timestamping.rst
> index 9a1f4cb4ce9e..4004c5d2771d 100644
> --- a/Documentation/networking/timestamping.rst
> +++ b/Documentation/networking/timestamping.rst
> @@ -754,3 +754,40 @@ check in their "TX confirmation" portion, not only for
>  that PTP timestamping is not enabled for anything other than the outermost PHC,
>  this enhanced check will avoid delivering a duplicated TX timestamp to user
>  space.
> +
> +Another known limitation is the design of the ``__ethtool_get_ts_info``
> +function::
> +
> +  int __ethtool_get_ts_info(struct net_device *dev, struct ethtool_ts_info *info)
> +  {
> +          const struct ethtool_ops *ops = dev->ethtool_ops;
> +          struct phy_device *phydev = dev->phydev;
> +
> +          memset(info, 0, sizeof(*info));
> +          info->cmd = ETHTOOL_GET_TS_INFO;
> +
> +          if (phy_has_tsinfo(phydev))
> +                  return phy_ts_info(phydev, info);
> +          if (ops->get_ts_info)
> +                  return ops->get_ts_info(dev, info);
> +
> +          info->so_timestamping = SOF_TIMESTAMPING_RX_SOFTWARE |
> +                                  SOF_TIMESTAMPING_SOFTWARE;
> +          info->phc_index = -1;
> +
> +          return 0;
> +  }
> +
> +Because the generic function searches first for the timestamping capabilities
> +of the attached PHY, and returns them directly without consulting the MAC
> +driver, no checking is being done whether the requirements described in `3.2.2
> +Ethernet PHYs`_ are implemented or not. Therefore, if the MAC driver does not
> +satisfy the requirements for PHY timestamping, and
> +``CONFIG_NETWORK_PHY_TIMESTAMPING`` is enabled, then a non-functional PHC index
> +(the one corresponding to the PHY) will be reported to user space, via
> +``ethtool -T``.
> +
> +The correct solution to this problem is to implement the PHY timestamping
> +requirements in the MAC driver found broken, and submit as a bug fix patch to
> +netdev@vger.kernel.org. See :ref:`Documentation/process/stable-kernel-rules.rst
> +<stable_kernel_rules>` for more details.
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH net-next 3/3] docs: networking: timestamping: add a set of frequently asked questions
  2020-07-17 16:10 ` [PATCH net-next 3/3] docs: networking: timestamping: add a set of frequently asked questions Vladimir Oltean
@ 2020-07-17 23:12   ` Jacob Keller
  2020-07-18 11:35     ` Vladimir Oltean
  0 siblings, 1 reply; 26+ messages in thread
From: Jacob Keller @ 2020-07-17 23:12 UTC (permalink / raw)
  To: Vladimir Oltean, kuba, davem, netdev; +Cc: richardcochran, sorganov, linux-doc



On 7/17/2020 9:10 AM, Vladimir Oltean wrote:
> These are some questions I had while trying to explain the behavior of
> some drivers with respect to software timestamping. Answered with the
> help of Richard Cochran.
> 
> Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
> ---
>  Documentation/networking/timestamping.rst | 26 +++++++++++++++++++++++
>  1 file changed, 26 insertions(+)
> 
> diff --git a/Documentation/networking/timestamping.rst b/Documentation/networking/timestamping.rst
> index 4004c5d2771d..e01ec01179fe 100644
> --- a/Documentation/networking/timestamping.rst
> +++ b/Documentation/networking/timestamping.rst
> @@ -791,3 +791,29 @@ The correct solution to this problem is to implement the PHY timestamping
>  requirements in the MAC driver found broken, and submit as a bug fix patch to
>  netdev@vger.kernel.org. See :ref:`Documentation/process/stable-kernel-rules.rst
>  <stable_kernel_rules>` for more details.
> +
> +3.4 Frequently asked questions
> +------------------------------
> +
> +Q: When should drivers set SKBTX_IN_PROGRESS?
> +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> +
> +When the interface they represent offers both ``SOF_TIMESTAMPING_TX_HARDWARE``
> +and ``SOF_TIMESTAMPING_TX_SOFTWARE``.
> +Originally, the network stack could deliver either a hardware or a software
> +time stamp, but not both. This flag prevents software timestamp delivery.
> +This restriction was eventually lifted via the ``SOF_TIMESTAMPING_OPT_TX_SWHW``
> +option, but still the original behavior is preserved as the default.
> +

So, this implies that we set this only if both are supported? I thought
the intention was to set this flag whenever we start a HW timestamp.

> +Q: Should drivers that don't offer SOF_TIMESTAMPING_TX_SOFTWARE call skb_tx_timestamp()?
> +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> +
> +The ``skb_clone_tx_timestamp()`` function from its body helps with propagation
> +of TX timestamps from PTP PHYs, and the required placement of this call is the
> +same as for software TX timestamping.
> +Additionally, since PTP is broken on ports with timestamping PHYs and unmet
> +requirements, the consequence is that any driver which may be ever coupled to
> +a timestamping-capable PHY in ``netdev->phydev`` should call at least
> +``skb_clone_tx_timestamp()``. However, calling the higher-level
> +``skb_tx_timestamp()`` instead achieves the same purpose, but also offers
> +additional compliance to ``SOF_TIMESTAMPING_TX_SOFTWARE``.
> 

This makes sense.

Thanks,
Jake

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH net-next 0/3] Document more PTP timestamping known quirks
  2020-07-17 21:57   ` Vladimir Oltean
@ 2020-07-17 23:13     ` Jacob Keller
  2020-07-18 10:54     ` Sergey Organov
  1 sibling, 0 replies; 26+ messages in thread
From: Jacob Keller @ 2020-07-17 23:13 UTC (permalink / raw)
  To: Vladimir Oltean, Sergey Organov
  Cc: kuba, davem, netdev, richardcochran, linux-doc



On 7/17/2020 2:57 PM, Vladimir Oltean wrote:
> On Sat, Jul 18, 2020 at 12:13:42AM +0300, Sergey Organov wrote:
>> Vladimir Oltean <olteanv@gmail.com> writes:
>>
>>> I've tried to collect and summarize the conclusions of these discussions:
>>> https://patchwork.ozlabs.org/project/netdev/patch/20200711120842.2631-1-sorganov@gmail.com/
>>> https://patchwork.ozlabs.org/project/netdev/patch/20200710113611.3398-5-kurt@linutronix.de/
>>> which were a bit surprising to me. Make sure they are present in the
>>> documentation.
>>
>> As one of participants of these discussions, I'm afraid I incline to
>> alternative approach to solving the issues current design has than the one
>> you advocate in these patch series.
>>
>> I believe its upper-level that should enforce common policies like
>> handling hw time stamping at outermost capable device, not random MAC
>> driver out there.
>>
>> I'd argue that it's then upper-level that should check PHY features, and
>> then do not bother MAC with ioctl() requests that MAC should not handle
>> in given configuration. This way, the checks for phy_has_hwtstamp()
>> won't be spread over multiple MAC drivers and will happily sit in the
>> upper-level ioctl() handler.
>>
>> In other words, I mean that it's approach taken in ethtool that I tend
>> to consider being the right one.
>>
>> Thanks,
>> -- Sergey
> 
> Concretely speaking, what are you going to do for
> skb_defer_tx_timestamp() and skb_defer_rx_timestamp()? Not to mention
> subtle bugs like SKBTX_IN_PROGRESS. If you don't address those, it's
> pointless to move the phy_has_hwtstamp() check to net/core/dev_ioctl.c.
> 
> The only way I see to fix the bug is to introduce a new netdev flag,
> NETIF_F_PHY_HWTSTAMP or something like that. Then I'd grep for all
> occurrences of phy_has_hwtstamp() in the kernel (which currently amount
> to a whopping 2 users, 3 with your FEC "fix"), and declare this
> netdevice flag in their list of features. Then, phy_has_hwtstamp() and
> phy_has_tsinfo() and what not can be moved to generic places (or at
> least, I think they can), and those places could proceed to advertise
> and enable PHY timestamping only if the MAC declared itself ready. But,
> it is a bit strange to introduce a netdev flag just to fix a bug, I
> think.
> 

This approach doesn't seem bad to me. We then document that
NETIF_F_PHY_HWTSTAMP should only set of the correct conditions are met.

> Thanks,
> -Vladimir
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH net-next 0/3] Document more PTP timestamping known quirks
  2020-07-17 21:57   ` Vladimir Oltean
  2020-07-17 23:13     ` Jacob Keller
@ 2020-07-18 10:54     ` Sergey Organov
  2020-07-18 11:30       ` Vladimir Oltean
  1 sibling, 1 reply; 26+ messages in thread
From: Sergey Organov @ 2020-07-18 10:54 UTC (permalink / raw)
  To: Vladimir Oltean; +Cc: kuba, davem, netdev, richardcochran, linux-doc

Vladimir Oltean <olteanv@gmail.com> writes:

> On Sat, Jul 18, 2020 at 12:13:42AM +0300, Sergey Organov wrote:
>> Vladimir Oltean <olteanv@gmail.com> writes:
>> 
>> > I've tried to collect and summarize the conclusions of these discussions:
>> > https://patchwork.ozlabs.org/project/netdev/patch/20200711120842.2631-1-sorganov@gmail.com/
>> > https://patchwork.ozlabs.org/project/netdev/patch/20200710113611.3398-5-kurt@linutronix.de/
>> > which were a bit surprising to me. Make sure they are present in the
>> > documentation.
>> 
>> As one of participants of these discussions, I'm afraid I incline to
>> alternative approach to solving the issues current design has than the one
>> you advocate in these patch series.
>> 
>> I believe its upper-level that should enforce common policies like
>> handling hw time stamping at outermost capable device, not random MAC
>> driver out there.
>> 
>> I'd argue that it's then upper-level that should check PHY features, and
>> then do not bother MAC with ioctl() requests that MAC should not handle
>> in given configuration. This way, the checks for phy_has_hwtstamp()
>> won't be spread over multiple MAC drivers and will happily sit in the
>> upper-level ioctl() handler.
>> 
>> In other words, I mean that it's approach taken in ethtool that I tend
>> to consider being the right one.
>> 
>> Thanks,
>> -- Sergey
>
> Concretely speaking, what are you going to do for
> skb_defer_tx_timestamp() and skb_defer_rx_timestamp()? Not to mention
> subtle bugs like SKBTX_IN_PROGRESS.

I think that we have at least 2 problems here, and what I argue about
above addresses one of them, while you try to get solution for another
one.

> If you don't address those, it's pointless to move the
> phy_has_hwtstamp() check to net/core/dev_ioctl.c.

No, even though solving one problem could be considered pointless
without solving another, it doesn't mean that solving it is pointless. I
do hope you will solve another one.

I believe that logic in ethtool ioctl handling should be moved to clocks
subsystem ioctl handling, and then ethtool should simply forward
relevant calls to clocks subsystem. This will give us single
implementation point that defines which ioctls go to which clocks, and
single point where policy decisions are made, that, besides getting rid
of current inconsistencies, will allow for easier changes of policies in
the future.

That also could be the point that caches time stamping configuration and
gives it back to user space by ioctl request, freeing each driver from
implementing it, along with copying request structures to/from user
space that currently is done in every driver.

I believe such changes are valuable despite particular way the
SKBTX_IN_PROGRESS issue will be resolved.

> The only way I see to fix the bug is to introduce a new netdev flag,
> NETIF_F_PHY_HWTSTAMP or something like that. Then I'd grep for all
> occurrences of phy_has_hwtstamp() in the kernel (which currently amount
> to a whopping 2 users, 3 with your FEC "fix"), and declare this
> netdevice flag in their list of features. Then, phy_has_hwtstamp() and
> phy_has_tsinfo() and what not can be moved to generic places (or at
> least, I think they can), and those places could proceed to advertise
> and enable PHY timestamping only if the MAC declared itself ready. But,
> it is a bit strange to introduce a netdev flag just to fix a bug, I
> think.

To me this sounds like a plan.

In general (please don't take it as direct proposal to fix current
issues), the most flexible solution would be to allow for user space to
select which units will be time stamping (kernel clock being simply one
of them), and to deliver all the time stamps to the user space. This
will need clock IDs to be delivered along with time stamps (that is a
nice thing to have by itself, as I already mentioned elsewhere in
previous discussions.) For now it's just a raw idea, nevertheless to me
it sounds like a suitable goal for future design.

Thanks,
-- Sergey

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH net-next 0/3] Document more PTP timestamping known quirks
  2020-07-17 16:10 [PATCH net-next 0/3] Document more PTP timestamping known quirks Vladimir Oltean
                   ` (3 preceding siblings ...)
  2020-07-17 21:13 ` [PATCH net-next 0/3] Document more PTP timestamping known quirks Sergey Organov
@ 2020-07-18 11:21 ` Sergey Organov
  4 siblings, 0 replies; 26+ messages in thread
From: Sergey Organov @ 2020-07-18 11:21 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: kuba, davem, netdev, richardcochran, linux-doc, Eugene Syromiatnikov

Vladimir Oltean <olteanv@gmail.com> writes:

> I've tried to collect and summarize the conclusions of these discussions:
> https://patchwork.ozlabs.org/project/netdev/patch/20200711120842.2631-1-sorganov@gmail.com/
> https://patchwork.ozlabs.org/project/netdev/patch/20200710113611.3398-5-kurt@linutronix.de/
> which were a bit surprising to me. Make sure they are present in the
> documentation.

By the way, there is another somewhat related issue that needs to be
addressed. I believe kernel needs to free user space from this trick
found even in kernel sources themselves:

tools/testing/selftests/ptp/testptp.c:87:

static clockid_t get_clockid(int fd)
{
#define CLOCKFD 3
        return (((unsigned int) ~fd) << 3) | CLOCKFD;
}

Once upon a time there was a patch for that, but I don't think it
addressed the issue correctly, here is more background:

https://lore.kernel.org/lkml/87y2pxvsbr.fsf@osv.gnss.ru/

Thanks,
-- Sergey

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH net-next 0/3] Document more PTP timestamping known quirks
  2020-07-18 10:54     ` Sergey Organov
@ 2020-07-18 11:30       ` Vladimir Oltean
  2020-07-18 13:35         ` Sergey Organov
  0 siblings, 1 reply; 26+ messages in thread
From: Vladimir Oltean @ 2020-07-18 11:30 UTC (permalink / raw)
  To: Sergey Organov; +Cc: kuba, davem, netdev, richardcochran, linux-doc

On Sat, Jul 18, 2020 at 01:54:11PM +0300, Sergey Organov wrote:
> Vladimir Oltean <olteanv@gmail.com> writes:
>
> > On Sat, Jul 18, 2020 at 12:13:42AM +0300, Sergey Organov wrote:
> >> Vladimir Oltean <olteanv@gmail.com> writes:
> >>
> >> > I've tried to collect and summarize the conclusions of these discussions:
> >> > https://patchwork.ozlabs.org/project/netdev/patch/20200711120842.2631-1-sorganov@gmail.com/
> >> > https://patchwork.ozlabs.org/project/netdev/patch/20200710113611.3398-5-kurt@linutronix.de/
> >> > which were a bit surprising to me. Make sure they are present in the
> >> > documentation.
> >>
> >> As one of participants of these discussions, I'm afraid I incline to
> >> alternative approach to solving the issues current design has than the one
> >> you advocate in these patch series.
> >>
> >> I believe its upper-level that should enforce common policies like
> >> handling hw time stamping at outermost capable device, not random MAC
> >> driver out there.
> >>
> >> I'd argue that it's then upper-level that should check PHY features, and
> >> then do not bother MAC with ioctl() requests that MAC should not handle
> >> in given configuration. This way, the checks for phy_has_hwtstamp()
> >> won't be spread over multiple MAC drivers and will happily sit in the
> >> upper-level ioctl() handler.
> >>
> >> In other words, I mean that it's approach taken in ethtool that I tend
> >> to consider being the right one.
> >>
> >> Thanks,
> >> -- Sergey
> >
> > Concretely speaking, what are you going to do for
> > skb_defer_tx_timestamp() and skb_defer_rx_timestamp()? Not to mention
> > subtle bugs like SKBTX_IN_PROGRESS.
>
> I think that we have at least 2 problems here, and what I argue about
> above addresses one of them, while you try to get solution for another
> one.
>
> > If you don't address those, it's pointless to move the
> > phy_has_hwtstamp() check to net/core/dev_ioctl.c.
>
> No, even though solving one problem could be considered pointless
> without solving another, it doesn't mean that solving it is pointless. I
> do hope you will solve another one.
>
> I believe that logic in ethtool ioctl handling should be moved to clocks
> subsystem ioctl handling, and then ethtool should simply forward
> relevant calls to clocks subsystem. This will give us single
> implementation point that defines which ioctls go to which clocks, and
> single point where policy decisions are made, that, besides getting rid
> of current inconsistencies, will allow for easier changes of policies in
> the future.
>
> That also could be the point that caches time stamping configuration and
> gives it back to user space by ioctl request, freeing each driver from
> implementing it, along with copying request structures to/from user
> space that currently is done in every driver.
>
> I believe such changes are valuable despite particular way the
> SKBTX_IN_PROGRESS issue will be resolved.
>
> > The only way I see to fix the bug is to introduce a new netdev flag,
> > NETIF_F_PHY_HWTSTAMP or something like that. Then I'd grep for all
> > occurrences of phy_has_hwtstamp() in the kernel (which currently amount
> > to a whopping 2 users, 3 with your FEC "fix"), and declare this
> > netdevice flag in their list of features. Then, phy_has_hwtstamp() and
> > phy_has_tsinfo() and what not can be moved to generic places (or at
> > least, I think they can), and those places could proceed to advertise
> > and enable PHY timestamping only if the MAC declared itself ready. But,
> > it is a bit strange to introduce a netdev flag just to fix a bug, I
> > think.
>
> To me this sounds like a plan.
>
> In general (please don't take it as direct proposal to fix current
> issues), the most flexible solution would be to allow for user space to
> select which units will be time stamping (kernel clock being simply one
> of them), and to deliver all the time stamps to the user space. This
> will need clock IDs to be delivered along with time stamps (that is a
> nice thing to have by itself, as I already mentioned elsewhere in
> previous discussions.) For now it's just a raw idea, nevertheless to me
> it sounds like a suitable goal for future design.
>
> Thanks,
> -- Sergey

To me, there's one big inconsistency I see between your position when
you were coming from a 4.9 kernel where you wanted to fix a bug, and
your position now.

Your position when _you_ wanted to solve a problem for yourself was:

  |You see, I have a problem on kernel 4.9.146. After I apply this patch,
  |the problem goes away, at least for FEC/PHY combo that I care about, and
  |chances are high that for DSA as well, according to your own expertise.
  |Why should I care what is or is not ready for what to get a bug-fix
  |patch into the kernel? Why should I guess some vague "intentions" or
  |spend my time elsewhere?

As I said in that email thread, I can't contradict you. It is a design
limitation which right now I am simply documenting. That design
limitation is there to stay in stable kernels: I don't think there is
any way to backport a new flag to netdev_features_t to kernels as old as
4.9 and such. If you think there is, please say so, that would change
things quite a lot.

And now, you're arguing that I shouldn't be documenting the design
limitation, I should be fixing it. Maybe I will, but first of all,
you're asking me to effectively close the door for anybody else in your
position. On one side you proved that PHY timestamping is something that
should have been working, and now you're treating it as something which
shouldn't be.

You can argue that we can keep accepting bug fixes to this problem for
stable kernels, and in that case I don't see why you're arguing that we
shouldn't be documenting the design limitation.

Nobody said things are set in stone, I'm simply recording where we are
today and I'll be making further changes to the documentation as things
progress.

Thanks,
-Vladimir

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH net-next 3/3] docs: networking: timestamping: add a set of frequently asked questions
  2020-07-17 23:12   ` Jacob Keller
@ 2020-07-18 11:35     ` Vladimir Oltean
  2020-07-20 18:54       ` Jacob Keller
  0 siblings, 1 reply; 26+ messages in thread
From: Vladimir Oltean @ 2020-07-18 11:35 UTC (permalink / raw)
  To: Jacob Keller; +Cc: kuba, davem, netdev, richardcochran, sorganov, linux-doc

On Fri, Jul 17, 2020 at 04:12:07PM -0700, Jacob Keller wrote:
> 
> 
> On 7/17/2020 9:10 AM, Vladimir Oltean wrote:
> > These are some questions I had while trying to explain the behavior of
> > some drivers with respect to software timestamping. Answered with the
> > help of Richard Cochran.
> > 
> > Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
> > ---
> >  Documentation/networking/timestamping.rst | 26 +++++++++++++++++++++++
> >  1 file changed, 26 insertions(+)
> > 
> > diff --git a/Documentation/networking/timestamping.rst b/Documentation/networking/timestamping.rst
> > index 4004c5d2771d..e01ec01179fe 100644
> > --- a/Documentation/networking/timestamping.rst
> > +++ b/Documentation/networking/timestamping.rst
> > @@ -791,3 +791,29 @@ The correct solution to this problem is to implement the PHY timestamping
> >  requirements in the MAC driver found broken, and submit as a bug fix patch to
> >  netdev@vger.kernel.org. See :ref:`Documentation/process/stable-kernel-rules.rst
> >  <stable_kernel_rules>` for more details.
> > +
> > +3.4 Frequently asked questions
> > +------------------------------
> > +
> > +Q: When should drivers set SKBTX_IN_PROGRESS?
> > +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > +
> > +When the interface they represent offers both ``SOF_TIMESTAMPING_TX_HARDWARE``
> > +and ``SOF_TIMESTAMPING_TX_SOFTWARE``.
> > +Originally, the network stack could deliver either a hardware or a software
> > +time stamp, but not both. This flag prevents software timestamp delivery.
> > +This restriction was eventually lifted via the ``SOF_TIMESTAMPING_OPT_TX_SWHW``
> > +option, but still the original behavior is preserved as the default.
> > +
> 
> So, this implies that we set this only if both are supported? I thought
> the intention was to set this flag whenever we start a HW timestamp.
> 

It's only _required_ when SOF_TIMESTAMPING_TX_SOFTWARE is used, it
seems. I had also thought of setting 'SKBTX_IN_PROGRESS' as good
practice, but there are many situations where it can do more harm than
good.

> > +Q: Should drivers that don't offer SOF_TIMESTAMPING_TX_SOFTWARE call skb_tx_timestamp()?
> > +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > +
> > +The ``skb_clone_tx_timestamp()`` function from its body helps with propagation
> > +of TX timestamps from PTP PHYs, and the required placement of this call is the
> > +same as for software TX timestamping.
> > +Additionally, since PTP is broken on ports with timestamping PHYs and unmet
> > +requirements, the consequence is that any driver which may be ever coupled to
> > +a timestamping-capable PHY in ``netdev->phydev`` should call at least
> > +``skb_clone_tx_timestamp()``. However, calling the higher-level
> > +``skb_tx_timestamp()`` instead achieves the same purpose, but also offers
> > +additional compliance to ``SOF_TIMESTAMPING_TX_SOFTWARE``.
> > 
> 
> This makes sense.
> 
> Thanks,
> Jake

Thanks,
-Vladimir

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH net-next 2/3] docs: networking: timestamping: add one more known issue
  2020-07-17 23:08   ` Jacob Keller
@ 2020-07-18 11:36     ` Vladimir Oltean
  0 siblings, 0 replies; 26+ messages in thread
From: Vladimir Oltean @ 2020-07-18 11:36 UTC (permalink / raw)
  To: Jacob Keller; +Cc: kuba, davem, netdev, richardcochran, sorganov, linux-doc

On Fri, Jul 17, 2020 at 04:08:03PM -0700, Jacob Keller wrote:
> 
> 
> On 7/17/2020 9:10 AM, Vladimir Oltean wrote:
> > Document the fact that Ethernet PHY timestamping has a fundamentally
> > flawed corner case (which in fact hits the majority of networking
> > drivers): a PHY for which its host MAC driver doesn't forward the
> > phy_mii_ioctl for timestamping is still going to be presented to user
> > space as functional.
> > 
> > Fixing this inconsistency would require moving the phy_has_tsinfo()
> > check inside all MAC drivers which are capable of PHY timestamping, to
> > be in harmony with the existing design for phy_has_hwtstamp() checks.
> > Instead of doing that, document the preferable solution which is that
> > offending MAC drivers be fixed instead.
> 
> This statement feels weird. Aren't you documenting that the preferable
> solution is? I.e. "Document this preferable solution: Fix the offending
> MAC driver"
> 
> Or am I misunderstanding what the issue is here?
> 

You're right, it looks like I wasn't thinking in full sentences at that
particular time of day.

> > 
> > Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
> > ---
> >  Documentation/networking/timestamping.rst | 37 +++++++++++++++++++++++
> >  1 file changed, 37 insertions(+)
> > 
> > diff --git a/Documentation/networking/timestamping.rst b/Documentation/networking/timestamping.rst
> > index 9a1f4cb4ce9e..4004c5d2771d 100644
> > --- a/Documentation/networking/timestamping.rst
> > +++ b/Documentation/networking/timestamping.rst
> > @@ -754,3 +754,40 @@ check in their "TX confirmation" portion, not only for
> >  that PTP timestamping is not enabled for anything other than the outermost PHC,
> >  this enhanced check will avoid delivering a duplicated TX timestamp to user
> >  space.
> > +
> > +Another known limitation is the design of the ``__ethtool_get_ts_info``
> > +function::
> > +
> > +  int __ethtool_get_ts_info(struct net_device *dev, struct ethtool_ts_info *info)
> > +  {
> > +          const struct ethtool_ops *ops = dev->ethtool_ops;
> > +          struct phy_device *phydev = dev->phydev;
> > +
> > +          memset(info, 0, sizeof(*info));
> > +          info->cmd = ETHTOOL_GET_TS_INFO;
> > +
> > +          if (phy_has_tsinfo(phydev))
> > +                  return phy_ts_info(phydev, info);
> > +          if (ops->get_ts_info)
> > +                  return ops->get_ts_info(dev, info);
> > +
> > +          info->so_timestamping = SOF_TIMESTAMPING_RX_SOFTWARE |
> > +                                  SOF_TIMESTAMPING_SOFTWARE;
> > +          info->phc_index = -1;
> > +
> > +          return 0;
> > +  }
> > +
> > +Because the generic function searches first for the timestamping capabilities
> > +of the attached PHY, and returns them directly without consulting the MAC
> > +driver, no checking is being done whether the requirements described in `3.2.2
> > +Ethernet PHYs`_ are implemented or not. Therefore, if the MAC driver does not
> > +satisfy the requirements for PHY timestamping, and
> > +``CONFIG_NETWORK_PHY_TIMESTAMPING`` is enabled, then a non-functional PHC index
> > +(the one corresponding to the PHY) will be reported to user space, via
> > +``ethtool -T``.
> > +
> > +The correct solution to this problem is to implement the PHY timestamping
> > +requirements in the MAC driver found broken, and submit as a bug fix patch to
> > +netdev@vger.kernel.org. See :ref:`Documentation/process/stable-kernel-rules.rst
> > +<stable_kernel_rules>` for more details.
> > 

Thanks,
-Vladimir

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH net-next 0/3] Document more PTP timestamping known quirks
  2020-07-18 11:30       ` Vladimir Oltean
@ 2020-07-18 13:35         ` Sergey Organov
  0 siblings, 0 replies; 26+ messages in thread
From: Sergey Organov @ 2020-07-18 13:35 UTC (permalink / raw)
  To: Vladimir Oltean; +Cc: kuba, davem, netdev, richardcochran, linux-doc

Vladimir Oltean <olteanv@gmail.com> writes:

> On Sat, Jul 18, 2020 at 01:54:11PM +0300, Sergey Organov wrote:
>> Vladimir Oltean <olteanv@gmail.com> writes:
>>
>> > On Sat, Jul 18, 2020 at 12:13:42AM +0300, Sergey Organov wrote:
>> >> Vladimir Oltean <olteanv@gmail.com> writes:
>> >>
>> >> > I've tried to collect and summarize the conclusions of these
>> >> > discussions:
>> >> > https://patchwork.ozlabs.org/project/netdev/patch/20200711120842.2631-1-sorganov@gmail.com/
>> >> > https://patchwork.ozlabs.org/project/netdev/patch/20200710113611.3398-5-kurt@linutronix.de/
>> >> > which were a bit surprising to me. Make sure they are present in the
>> >> > documentation.
>> >>
>> >> As one of participants of these discussions, I'm afraid I incline to
>> >> alternative approach to solving the issues current design has than the one
>> >> you advocate in these patch series.
>> >>
>> >> I believe its upper-level that should enforce common policies like
>> >> handling hw time stamping at outermost capable device, not random MAC
>> >> driver out there.
>> >>
>> >> I'd argue that it's then upper-level that should check PHY features, and
>> >> then do not bother MAC with ioctl() requests that MAC should not handle
>> >> in given configuration. This way, the checks for phy_has_hwtstamp()
>> >> won't be spread over multiple MAC drivers and will happily sit in the
>> >> upper-level ioctl() handler.
>> >>
>> >> In other words, I mean that it's approach taken in ethtool that I tend
>> >> to consider being the right one.
>> >>
>> >> Thanks,
>> >> -- Sergey
>> >
>> > Concretely speaking, what are you going to do for
>> > skb_defer_tx_timestamp() and skb_defer_rx_timestamp()? Not to mention
>> > subtle bugs like SKBTX_IN_PROGRESS.
>>
>> I think that we have at least 2 problems here, and what I argue about
>> above addresses one of them, while you try to get solution for another
>> one.
>>
>> > If you don't address those, it's pointless to move the
>> > phy_has_hwtstamp() check to net/core/dev_ioctl.c.
>>
>> No, even though solving one problem could be considered pointless
>> without solving another, it doesn't mean that solving it is pointless. I
>> do hope you will solve another one.
>>
>> I believe that logic in ethtool ioctl handling should be moved to clocks
>> subsystem ioctl handling, and then ethtool should simply forward
>> relevant calls to clocks subsystem. This will give us single
>> implementation point that defines which ioctls go to which clocks, and
>> single point where policy decisions are made, that, besides getting rid
>> of current inconsistencies, will allow for easier changes of policies in
>> the future.
>>
>> That also could be the point that caches time stamping configuration and
>> gives it back to user space by ioctl request, freeing each driver from
>> implementing it, along with copying request structures to/from user
>> space that currently is done in every driver.
>>
>> I believe such changes are valuable despite particular way the
>> SKBTX_IN_PROGRESS issue will be resolved.
>>
>> > The only way I see to fix the bug is to introduce a new netdev flag,
>> > NETIF_F_PHY_HWTSTAMP or something like that. Then I'd grep for all
>> > occurrences of phy_has_hwtstamp() in the kernel (which currently amount
>> > to a whopping 2 users, 3 with your FEC "fix"), and declare this
>> > netdevice flag in their list of features. Then, phy_has_hwtstamp() and
>> > phy_has_tsinfo() and what not can be moved to generic places (or at
>> > least, I think they can), and those places could proceed to advertise
>> > and enable PHY timestamping only if the MAC declared itself ready. But,
>> > it is a bit strange to introduce a netdev flag just to fix a bug, I
>> > think.
>>
>> To me this sounds like a plan.
>>
>> In general (please don't take it as direct proposal to fix current
>> issues), the most flexible solution would be to allow for user space to
>> select which units will be time stamping (kernel clock being simply one
>> of them), and to deliver all the time stamps to the user space. This
>> will need clock IDs to be delivered along with time stamps (that is a
>> nice thing to have by itself, as I already mentioned elsewhere in
>> previous discussions.) For now it's just a raw idea, nevertheless to me
>> it sounds like a suitable goal for future design.
>>
>> Thanks,
>> -- Sergey
>
> To me, there's one big inconsistency I see between your position when
> you were coming from a 4.9 kernel where you wanted to fix a bug, and
> your position now.

[First of all, as it seems there is misunderstanding here, let me say
right from beginning that I have nothing against documenting current
status, I'm rather all in favor of it, and I do appreciate your work on
it very much, and I believe I already said this in a reply to your
previous documentation patch some time ago.]

Inconsistency? I don't think so. One thing is to fix a bug within
current design limitations, and another one -- suggest design
improvements that should make things better in the future. I just talk
about different matters.

>
> Your position when _you_ wanted to solve a problem for yourself was:
>
>   |You see, I have a problem on kernel 4.9.146. After I apply this patch,
>   |the problem goes away, at least for FEC/PHY combo that I care about, and
>   |chances are high that for DSA as well, according to your own expertise.
>   |Why should I care what is or is not ready for what to get a bug-fix
>   |patch into the kernel? Why should I guess some vague "intentions" or
>   |spend my time elsewhere?
>
> As I said in that email thread, I can't contradict you.

I fail to see what this has to do with current discussion. Apparently
I was not able to make my current intentions clear enough, but I try
again below.

> It is a design limitation which right now I am simply documenting.
> That design limitation is there to stay in stable kernels: I don't
> think there is any way to backport a new flag to netdev_features_t to
> kernels as old as 4.9 and such. If you think there is, please say so,
> that would change things quite a lot.

I didn't even think about it, sorry, don't know how you figured I have
an opinion about it.

> And now, you're arguing that I shouldn't be documenting the design
> limitation, I should be fixing it.

No, I'm not. In case you just document them, no. If you also suggest
some road for improvements that I don't agree on, only then I argue.

> Maybe I will, but first of all, you're asking me to effectively close
> the door for anybody else in your position. On one side you proved
> that PHY timestamping is something that should have been working, and
> now you're treating it as something which shouldn't be.

I'm not asking you for anything, no. I just shared my thoughts on the
issue of proper ways to improve overall design. You are free to disagree
or even to ignore them.

> You can argue that we can keep accepting bug fixes to this problem for
> stable kernels, and in that case I don't see why you're arguing that we
> shouldn't be documenting the design limitation.

You don't see because I'm not. I'm all for documenting.

I just got an impression that you also suggest ways to improve the
design, and, based on your comment that ethtool way is the bad design as
it doesn't consult MAC when handling ioctls, I was afraid the design
might went into direction I don't agree with.

Maybe I got your intentions wrong. Sorry if I did.

>
> Nobody said things are set in stone, I'm simply recording where we are
> today and I'll be making further changes to the documentation as things
> progress.

I have absolutely nothing against documenting current status, not at
all. Moreover, as I think I already said in this thread, your recent
idea about NETIF_F_PHY_HWTSTAMP seemed sound to me too.

I'm sorry I managed to provoke so much worry on your side!

Thanks,
-- Sergey

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH net-next 3/3] docs: networking: timestamping: add a set of frequently asked questions
  2020-07-18 11:35     ` Vladimir Oltean
@ 2020-07-20 18:54       ` Jacob Keller
  2020-07-20 21:05         ` Vladimir Oltean
  0 siblings, 1 reply; 26+ messages in thread
From: Jacob Keller @ 2020-07-20 18:54 UTC (permalink / raw)
  To: Vladimir Oltean; +Cc: kuba, davem, netdev, richardcochran, sorganov, linux-doc

On 7/18/2020 4:35 AM, Vladimir Oltean wrote:
> On Fri, Jul 17, 2020 at 04:12:07PM -0700, Jacob Keller wrote:
>> On 7/17/2020 9:10 AM, Vladimir Oltean wrote:
>>> +When the interface they represent offers both ``SOF_TIMESTAMPING_TX_HARDWARE``
>>> +and ``SOF_TIMESTAMPING_TX_SOFTWARE``.
>>> +Originally, the network stack could deliver either a hardware or a software
>>> +time stamp, but not both. This flag prevents software timestamp delivery.
>>> +This restriction was eventually lifted via the ``SOF_TIMESTAMPING_OPT_TX_SWHW``
>>> +option, but still the original behavior is preserved as the default.
>>> +
>>
>> So, this implies that we set this only if both are supported? I thought
>> the intention was to set this flag whenever we start a HW timestamp.
>>
> 
> It's only _required_ when SOF_TIMESTAMPING_TX_SOFTWARE is used, it
> seems. I had also thought of setting 'SKBTX_IN_PROGRESS' as good
> practice, but there are many situations where it can do more harm than
> good.
> 

I guess I've only ever implemented a driver with software timestamping
enabled as an option. What sort of issues arise when you have this set?
I'm guessing that it's some configuration of stacked devices as in the
other cases? If the issue can't be fixed I'd at least like more
explanation here, since the prevailing convention is that we set this
flag, so understanding when and why it's problematic would be useful.

Thanks,
Jake

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH net-next 3/3] docs: networking: timestamping: add a set of frequently asked questions
  2020-07-20 18:54       ` Jacob Keller
@ 2020-07-20 21:05         ` Vladimir Oltean
  2020-07-20 21:45           ` Jacob Keller
  2020-07-21  0:15           ` Richard Cochran
  0 siblings, 2 replies; 26+ messages in thread
From: Vladimir Oltean @ 2020-07-20 21:05 UTC (permalink / raw)
  To: Jacob Keller; +Cc: kuba, davem, netdev, richardcochran, sorganov, linux-doc

On Mon, Jul 20, 2020 at 11:54:30AM -0700, Jacob Keller wrote:
> On 7/18/2020 4:35 AM, Vladimir Oltean wrote:
> > On Fri, Jul 17, 2020 at 04:12:07PM -0700, Jacob Keller wrote:
> >> On 7/17/2020 9:10 AM, Vladimir Oltean wrote:
> >>> +When the interface they represent offers both ``SOF_TIMESTAMPING_TX_HARDWARE``
> >>> +and ``SOF_TIMESTAMPING_TX_SOFTWARE``.
> >>> +Originally, the network stack could deliver either a hardware or a software
> >>> +time stamp, but not both. This flag prevents software timestamp delivery.
> >>> +This restriction was eventually lifted via the ``SOF_TIMESTAMPING_OPT_TX_SWHW``
> >>> +option, but still the original behavior is preserved as the default.
> >>> +
> >>
> >> So, this implies that we set this only if both are supported? I thought
> >> the intention was to set this flag whenever we start a HW timestamp.
> >>
> > 
> > It's only _required_ when SOF_TIMESTAMPING_TX_SOFTWARE is used, it
> > seems. I had also thought of setting 'SKBTX_IN_PROGRESS' as good
> > practice, but there are many situations where it can do more harm than
> > good.
> > 
> 
> I guess I've only ever implemented a driver with software timestamping
> enabled as an option. What sort of issues arise when you have this set?
> I'm guessing that it's some configuration of stacked devices as in the
> other cases? If the issue can't be fixed I'd at least like more
> explanation here, since the prevailing convention is that we set this
> flag, so understanding when and why it's problematic would be useful.
> 
> Thanks,
> Jake

Yes, the problematic cases have to do with stacked PHCs (DSA, PHY). The
pattern is that:
- DSA sets SKBTX_IN_PROGRESS
- calls dev_queue_xmit towards the MAC driver
- MAC driver sees SKBTX_IN_PROGRESS, thinks it's the one who set it
- MAC driver delivers TX timestamp
- DSA ends poll or receives TX interrupt, collects its timestamp, and
  delivers a second TX timestamp
In fact this is explained in a bit more detail in the current
timestamping.rst file.
Not only are there existing in-tree drivers that do that (and various
subtle variations of it), but new code also has this tendency to take
shortcuts and interpret any SKBTX_IN_PROGRESS flag set as being set
locally. Good thing it's caught during review most of the time these
days. It's an error-prone design.
On the DSA front, 1 driver sets this flag (sja1105) and 3 don't (felix,
mv88e6xxx, hellcreek). The driver who had trouble because of this flag?
sja1105.
On the PHY front, 2 drivers set this flag (mscc_phy, dp83640) and 1
doesn't (ptp_ines). The driver who had trouble? dp83640.
So it's very far from obvious that setting this flag is 'the prevailing
convention'. For a MAC driver, that might well be, but for DSA/PHY,
there seem to be risks associated with doing that, and driver writers
should know what they're signing up for.

-Vladimir

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH net-next 3/3] docs: networking: timestamping: add a set of frequently asked questions
  2020-07-20 21:05         ` Vladimir Oltean
@ 2020-07-20 21:45           ` Jacob Keller
  2020-07-20 22:13             ` Vladimir Oltean
  2020-07-21  0:15           ` Richard Cochran
  1 sibling, 1 reply; 26+ messages in thread
From: Jacob Keller @ 2020-07-20 21:45 UTC (permalink / raw)
  To: Vladimir Oltean; +Cc: kuba, davem, netdev, richardcochran, sorganov, linux-doc



On 7/20/2020 2:05 PM, Vladimir Oltean wrote:
> On Mon, Jul 20, 2020 at 11:54:30AM -0700, Jacob Keller wrote:
>> On 7/18/2020 4:35 AM, Vladimir Oltean wrote:
>>> On Fri, Jul 17, 2020 at 04:12:07PM -0700, Jacob Keller wrote:
>>>> On 7/17/2020 9:10 AM, Vladimir Oltean wrote:
>>>>> +When the interface they represent offers both ``SOF_TIMESTAMPING_TX_HARDWARE``
>>>>> +and ``SOF_TIMESTAMPING_TX_SOFTWARE``.
>>>>> +Originally, the network stack could deliver either a hardware or a software
>>>>> +time stamp, but not both. This flag prevents software timestamp delivery.
>>>>> +This restriction was eventually lifted via the ``SOF_TIMESTAMPING_OPT_TX_SWHW``
>>>>> +option, but still the original behavior is preserved as the default.
>>>>> +
>>>>
>>>> So, this implies that we set this only if both are supported? I thought
>>>> the intention was to set this flag whenever we start a HW timestamp.
>>>>
>>>
>>> It's only _required_ when SOF_TIMESTAMPING_TX_SOFTWARE is used, it
>>> seems. I had also thought of setting 'SKBTX_IN_PROGRESS' as good
>>> practice, but there are many situations where it can do more harm than
>>> good.
>>>
>>
>> I guess I've only ever implemented a driver with software timestamping
>> enabled as an option. What sort of issues arise when you have this set?
>> I'm guessing that it's some configuration of stacked devices as in the
>> other cases? If the issue can't be fixed I'd at least like more
>> explanation here, since the prevailing convention is that we set this
>> flag, so understanding when and why it's problematic would be useful.
>>
>> Thanks,
>> Jake
> 
> Yes, the problematic cases have to do with stacked PHCs (DSA, PHY). The
> pattern is that:
> - DSA sets SKBTX_IN_PROGRESS
> - calls dev_queue_xmit towards the MAC driver
> - MAC driver sees SKBTX_IN_PROGRESS, thinks it's the one who set it
> - MAC driver delivers TX timestamp
> - DSA ends poll or receives TX interrupt, collects its timestamp, and
>   delivers a second TX timestamp
> In fact this is explained in a bit more detail in the current
> timestamping.rst file.
> Not only are there existing in-tree drivers that do that (and various
> subtle variations of it), but new code also has this tendency to take
> shortcuts and interpret any SKBTX_IN_PROGRESS flag set as being set
> locally. Good thing it's caught during review most of the time these
> days. It's an error-prone design.
> On the DSA front, 1 driver sets this flag (sja1105) and 3 don't (felix,
> mv88e6xxx, hellcreek). The driver who had trouble because of this flag?
> sja1105.
> On the PHY front, 2 drivers set this flag (mscc_phy, dp83640) and 1
> doesn't (ptp_ines). The driver who had trouble? dp83640.
> So it's very far from obvious that setting this flag is 'the prevailing
> convention'. For a MAC driver, that might well be, but for DSA/PHY,
> there seem to be risks associated with doing that, and driver writers
> should know what they're signing up for.
> 

Perhaps the issue is that the MAC driver using SKBTX_IN_PROGRESS as the
mechanism for telling if it should deliver a timestamp. Shouldn't they
be relying on SKBTX_HW_TSTAMP for the "please timestamp" notification,
and then using their own mechanism for forwarding that timestamp once
it's complete?

I see a handful of drivers do rely on checking this, but I think that's
the real bug here.

> -Vladimir
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH net-next 3/3] docs: networking: timestamping: add a set of frequently asked questions
  2020-07-20 21:45           ` Jacob Keller
@ 2020-07-20 22:13             ` Vladimir Oltean
  2020-07-21  0:21               ` Richard Cochran
  2020-07-21 17:16               ` Jacob Keller
  0 siblings, 2 replies; 26+ messages in thread
From: Vladimir Oltean @ 2020-07-20 22:13 UTC (permalink / raw)
  To: Jacob Keller; +Cc: kuba, davem, netdev, richardcochran, sorganov, linux-doc

On Mon, Jul 20, 2020 at 02:45:03PM -0700, Jacob Keller wrote:
> 
> 
> On 7/20/2020 2:05 PM, Vladimir Oltean wrote:
> > On Mon, Jul 20, 2020 at 11:54:30AM -0700, Jacob Keller wrote:
> >> On 7/18/2020 4:35 AM, Vladimir Oltean wrote:
> >>> On Fri, Jul 17, 2020 at 04:12:07PM -0700, Jacob Keller wrote:
> >>>> On 7/17/2020 9:10 AM, Vladimir Oltean wrote:
> >>>>> +When the interface they represent offers both ``SOF_TIMESTAMPING_TX_HARDWARE``
> >>>>> +and ``SOF_TIMESTAMPING_TX_SOFTWARE``.
> >>>>> +Originally, the network stack could deliver either a hardware or a software
> >>>>> +time stamp, but not both. This flag prevents software timestamp delivery.
> >>>>> +This restriction was eventually lifted via the ``SOF_TIMESTAMPING_OPT_TX_SWHW``
> >>>>> +option, but still the original behavior is preserved as the default.
> >>>>> +
> >>>>
> >>>> So, this implies that we set this only if both are supported? I thought
> >>>> the intention was to set this flag whenever we start a HW timestamp.
> >>>>
> >>>
> >>> It's only _required_ when SOF_TIMESTAMPING_TX_SOFTWARE is used, it
> >>> seems. I had also thought of setting 'SKBTX_IN_PROGRESS' as good
> >>> practice, but there are many situations where it can do more harm than
> >>> good.
> >>>
> >>
> >> I guess I've only ever implemented a driver with software timestamping
> >> enabled as an option. What sort of issues arise when you have this set?
> >> I'm guessing that it's some configuration of stacked devices as in the
> >> other cases? If the issue can't be fixed I'd at least like more
> >> explanation here, since the prevailing convention is that we set this
> >> flag, so understanding when and why it's problematic would be useful.
> >>
> >> Thanks,
> >> Jake
> > 
> > Yes, the problematic cases have to do with stacked PHCs (DSA, PHY). The
> > pattern is that:
> > - DSA sets SKBTX_IN_PROGRESS
> > - calls dev_queue_xmit towards the MAC driver
> > - MAC driver sees SKBTX_IN_PROGRESS, thinks it's the one who set it
> > - MAC driver delivers TX timestamp
> > - DSA ends poll or receives TX interrupt, collects its timestamp, and
> >   delivers a second TX timestamp
> > In fact this is explained in a bit more detail in the current
> > timestamping.rst file.
> > Not only are there existing in-tree drivers that do that (and various
> > subtle variations of it), but new code also has this tendency to take
> > shortcuts and interpret any SKBTX_IN_PROGRESS flag set as being set
> > locally. Good thing it's caught during review most of the time these
> > days. It's an error-prone design.
> > On the DSA front, 1 driver sets this flag (sja1105) and 3 don't (felix,
> > mv88e6xxx, hellcreek). The driver who had trouble because of this flag?
> > sja1105.
> > On the PHY front, 2 drivers set this flag (mscc_phy, dp83640) and 1
> > doesn't (ptp_ines). The driver who had trouble? dp83640.
> > So it's very far from obvious that setting this flag is 'the prevailing
> > convention'. For a MAC driver, that might well be, but for DSA/PHY,
> > there seem to be risks associated with doing that, and driver writers
> > should know what they're signing up for.
> > 
> 
> Perhaps the issue is that the MAC driver using SKBTX_IN_PROGRESS as the
> mechanism for telling if it should deliver a timestamp. Shouldn't they
> be relying on SKBTX_HW_TSTAMP for the "please timestamp" notification,
> and then using their own mechanism for forwarding that timestamp once
> it's complete?
> 
> I see a handful of drivers do rely on checking this, but I think that's
> the real bug here.
> 
> > -Vladimir
> > 

Yes, indeed, a lot of them are exclusively checking
"skb_shinfo(skb)->tx_flags & SKBTX_IN_PROGRESS", without any further
verification that they have hardware timestamping enabled in the first
place, a lot more than I remembered. Some of the occurrences are
actually new.

I think at least part of the reason why this keeps going on is that
there aren't any hard and fast rules that say you shouldn't do it. When
there isn't even a convincing percentage of DSA/PHY drivers that do set
SKBTX_HW_TSTAMP, the chances are pretty low that you'll get a stacked
PHC driver that sets the flag, plus a MAC driver that checks for it
incorrectly. So people tend to ignore this case. Even though, if stacked
DSA drivers started supporting software TX timestamping (which is not
unlikely, given the fact that this would also give you compatibility
with PHY timestamping), I'm sure things would change, because more
people would become aware of the issue once mv88e6xxx starts getting
affected.

What I've been trying to do is at least try to get people (especially
people who have a lot of XP with 1588 drivers) to agree on a common set
of guidelines that are explicitly written down. I think that's step #1.

-Vladimir

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH net-next 3/3] docs: networking: timestamping: add a set of frequently asked questions
  2020-07-20 21:05         ` Vladimir Oltean
  2020-07-20 21:45           ` Jacob Keller
@ 2020-07-21  0:15           ` Richard Cochran
  1 sibling, 0 replies; 26+ messages in thread
From: Richard Cochran @ 2020-07-21  0:15 UTC (permalink / raw)
  To: Vladimir Oltean; +Cc: Jacob Keller, kuba, davem, netdev, sorganov, linux-doc

On Tue, Jul 21, 2020 at 12:05:18AM +0300, Vladimir Oltean wrote:
> 
> Yes, the problematic cases have to do with stacked PHCs (DSA, PHY). The
> pattern is that:
> - DSA sets SKBTX_IN_PROGRESS

Nit: DSA should _not_ set this bit (but a PHY/MII device should).

> - calls dev_queue_xmit towards the MAC driver
> - MAC driver sees SKBTX_IN_PROGRESS, thinks it's the one who set it
> - MAC driver delivers TX timestamp
> - DSA ends poll or receives TX interrupt, collects its timestamp, and
>   delivers a second TX timestamp

> So it's very far from obvious that setting this flag is 'the prevailing
> convention'. For a MAC driver, that might well be, but for DSA/PHY,
> there seem to be risks associated with doing that, and driver writers
> should know what they're signing up for.

The flag only exists to prevent the stack from delivering SW time
stamps when HW time stamps are active.  If an interface doesn't
provide SW time stamps (like DSA interfaces), then there is no need to
set the flag.

For MAC and PHY/MII time stamping, they must co-operate, meaning that
the MAC driver must be prepared to deal with the fact that the PHY/MII
might set this flag.  Many MAC drivers don't do this correctly, but
there are very few PHY/MII actually in use, and so very few authors of
MAC drivers have paid attention to this.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH net-next 3/3] docs: networking: timestamping: add a set of frequently asked questions
  2020-07-20 22:13             ` Vladimir Oltean
@ 2020-07-21  0:21               ` Richard Cochran
  2020-07-21 19:51                 ` Vladimir Oltean
  2020-07-21 17:16               ` Jacob Keller
  1 sibling, 1 reply; 26+ messages in thread
From: Richard Cochran @ 2020-07-21  0:21 UTC (permalink / raw)
  To: Vladimir Oltean; +Cc: Jacob Keller, kuba, davem, netdev, sorganov, linux-doc

On Tue, Jul 21, 2020 at 01:13:14AM +0300, Vladimir Oltean wrote:
> I think at least part of the reason why this keeps going on is that
> there aren't any hard and fast rules that say you shouldn't do it. When
> there isn't even a convincing percentage of DSA/PHY drivers that do set
> SKBTX_HW_TSTAMP, the chances are pretty low that you'll get a stacked
> PHC driver that sets the flag, plus a MAC driver that checks for it
> incorrectly. So people tend to ignore this case.

Right.

> Even though, if stacked
> DSA drivers started supporting software TX timestamping (which is not
> unlikely, given the fact that this would also give you compatibility
> with PHY timestamping), I'm sure things would change, because more
> people would become aware of the issue once mv88e6xxx starts getting
> affected.

I really can't see the utility in having a SW time stamp from a DSA
interface.  The whole point of DSA time stamping is to get the ingress
and egress time of frames on the external ports, in order to correct
for the residence time within the switch.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH net-next 3/3] docs: networking: timestamping: add a set of frequently asked questions
  2020-07-20 22:13             ` Vladimir Oltean
  2020-07-21  0:21               ` Richard Cochran
@ 2020-07-21 17:16               ` Jacob Keller
  1 sibling, 0 replies; 26+ messages in thread
From: Jacob Keller @ 2020-07-21 17:16 UTC (permalink / raw)
  To: Vladimir Oltean; +Cc: kuba, davem, netdev, richardcochran, sorganov, linux-doc



On 7/20/2020 3:13 PM, Vladimir Oltean wrote:>
> Yes, indeed, a lot of them are exclusively checking
> "skb_shinfo(skb)->tx_flags & SKBTX_IN_PROGRESS", without any further
> verification that they have hardware timestamping enabled in the first
> place, a lot more than I remembered. Some of the occurrences are
> actually new.
> 
> I think at least part of the reason why this keeps going on is that
> there aren't any hard and fast rules that say you shouldn't do it. When
> there isn't even a convincing percentage of DSA/PHY drivers that do set
> SKBTX_HW_TSTAMP, the chances are pretty low that you'll get a stacked
> PHC driver that sets the flag, plus a MAC driver that checks for it
> incorrectly. So people tend to ignore this case. Even though, if stacked
> DSA drivers started supporting software TX timestamping (which is not
> unlikely, given the fact that this would also give you compatibility
> with PHY timestamping), I'm sure things would change, because more
> people would become aware of the issue once mv88e6xxx starts getting
> affected.
> 
> What I've been trying to do is at least try to get people (especially
> people who have a lot of XP with 1588 drivers) to agree on a common set
> of guidelines that are explicitly written down. I think that's step #1.
> 
> -Vladimir
> 

Right. I think the guideline should be:

This flag indicates to the stack whether or not a hardware Tx timestamp
has been started. It's primary purpose is to prevent sending software
timestamps if a hw timestamp is provided.

1) set the flag whenever you start a tx timestamp

2) do not assume you are the only driver that will set the flag for a
given skb. Use a separate mechanism to decide if that skb is supposed to
have a timestamp.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH net-next 3/3] docs: networking: timestamping: add a set of frequently asked questions
  2020-07-21  0:21               ` Richard Cochran
@ 2020-07-21 19:51                 ` Vladimir Oltean
  2020-07-22  3:25                   ` Richard Cochran
  0 siblings, 1 reply; 26+ messages in thread
From: Vladimir Oltean @ 2020-07-21 19:51 UTC (permalink / raw)
  To: Richard Cochran; +Cc: Jacob Keller, kuba, davem, netdev, sorganov, linux-doc

On Mon, Jul 20, 2020 at 05:21:50PM -0700, Richard Cochran wrote:
> On Tue, Jul 21, 2020 at 01:13:14AM +0300, Vladimir Oltean wrote:
> > I think at least part of the reason why this keeps going on is that
> > there aren't any hard and fast rules that say you shouldn't do it. When
> > there isn't even a convincing percentage of DSA/PHY drivers that do set
> > SKBTX_HW_TSTAMP, the chances are pretty low that you'll get a stacked
> > PHC driver that sets the flag, plus a MAC driver that checks for it
> > incorrectly. So people tend to ignore this case.
> 
> Right.
> 
> > Even though, if stacked
> > DSA drivers started supporting software TX timestamping (which is not
> > unlikely, given the fact that this would also give you compatibility
> > with PHY timestamping), I'm sure things would change, because more
> > people would become aware of the issue once mv88e6xxx starts getting
> > affected.
> 
> I really can't see the utility in having a SW time stamp from a DSA
> interface.  The whole point of DSA time stamping is to get the ingress
> and egress time of frames on the external ports, in order to correct
> for the residence time within the switch.
> 
> Thanks,
> Richard

I understand where this is coming from. The DSA software data path is
the mirror image of the hardware data path: first the net device
corresponding to the switch port, then the net device corresponding to
the host port, then the physical host port, then the physical switch
port. So, just as hardware timestamping makes the most sense on the
outermost PHC, software timestamping makes the most sense on the
innermost driver, the last frontier before the packet leaves software
hands. That is clear.

But I feel that going as far as saying that 'DSA shouldn't set
SKBTX_IN_PROGRESS because it already offers hardware timestamping' is
wrong. A software timestamp provided by a DSA net device is just as
valuable (or not, depending on your needs) as a software timestamp
provided by any other net device. For example, to the people doing TCP
time stamping, this software timestamp is just 'the driver timestamp',
so it makes perfect sense to have it just where it is: in DSA. Not only
that, but we shouldn't completely rule out the idea of software TX
timestamps in DSA _and_ in the host interface for the same packet,
either, since that could form the basis for some nice benchmarking.

Not only that, but with PHY timestamping, one popular way of handling TX
hardware timestamps from a PHY is to call skb_tx_timestamp(). It is
nonsense to me, and counterproductive, to end up having that in the
code, but not claim SOF_TIMESTAMPING_TX_SOFTWARE support. And PHY
timestamping with DSA is not a contradiction in terms by any means,
on the contrary, it makes just as much sense as PHY timestamping in
general.

So I think the position of "just don't have software timestamping code
in DSA and you'll be fine" won't be getting us anywhere. Either you can
or you can't, and there isn't anything absurd about it, so sooner or
later somebody will want to do it. The rules surrounding it, however,
are far from being ready, or clear.

Am I missing something?

Thanks,
-Vladimir

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH net-next 3/3] docs: networking: timestamping: add a set of frequently asked questions
  2020-07-21 19:51                 ` Vladimir Oltean
@ 2020-07-22  3:25                   ` Richard Cochran
  2020-07-25 21:32                     ` Vladimir Oltean
  0 siblings, 1 reply; 26+ messages in thread
From: Richard Cochran @ 2020-07-22  3:25 UTC (permalink / raw)
  To: Vladimir Oltean; +Cc: Jacob Keller, kuba, davem, netdev, sorganov, linux-doc

On Tue, Jul 21, 2020 at 10:51:27PM +0300, Vladimir Oltean wrote:
> So I think the position of "just don't have software timestamping code
> in DSA and you'll be fine" won't be getting us anywhere. Either you can
> or you can't, and there isn't anything absurd about it, so sooner or
> later somebody will want to do it. The rules surrounding it, however,
> are far from being ready, or clear.
> 
> Am I missing something?

I'm just trying to make things easy for you, as the author of DSA
drivers.  There is no need to set skb flags that have no purpose
within the stack.

Nobody is demanding software time stamps from any DSA devices yet, and
so I don't see the point in solving a problem that doesn't exist.

I'm sorry if the "rules" are not clear, but if you look around the
kernel internals, you will be hard pressed to find perfectly
documented rules anywhere!

Thanks,
Richard

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH net-next 3/3] docs: networking: timestamping: add a set of frequently asked questions
  2020-07-22  3:25                   ` Richard Cochran
@ 2020-07-25 21:32                     ` Vladimir Oltean
  0 siblings, 0 replies; 26+ messages in thread
From: Vladimir Oltean @ 2020-07-25 21:32 UTC (permalink / raw)
  To: Richard Cochran; +Cc: Jacob Keller, kuba, davem, netdev, sorganov, linux-doc

On Tue, Jul 21, 2020 at 08:25:53PM -0700, Richard Cochran wrote:
> On Tue, Jul 21, 2020 at 10:51:27PM +0300, Vladimir Oltean wrote:
> > So I think the position of "just don't have software timestamping code
> > in DSA and you'll be fine" won't be getting us anywhere. Either you can
> > or you can't, and there isn't anything absurd about it, so sooner or
> > later somebody will want to do it. The rules surrounding it, however,
> > are far from being ready, or clear.
> > 
> > Am I missing something?
> 
> I'm just trying to make things easy for you, as the author of DSA
> drivers.  There is no need to set skb flags that have no purpose
> within the stack.
> 
> Nobody is demanding software time stamps from any DSA devices yet, and
> so I don't see the point in solving a problem that doesn't exist.
> 
> I'm sorry if the "rules" are not clear, but if you look around the
> kernel internals, you will be hard pressed to find perfectly
> documented rules anywhere!
> 
> Thanks,
> Richard

Could we perhaps take a step back and see what can be improved about the
documentation updates?

Thanks,
-Vladimir

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2020-07-25 21:32 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-17 16:10 [PATCH net-next 0/3] Document more PTP timestamping known quirks Vladimir Oltean
2020-07-17 16:10 ` [PATCH net-next 1/3] docs: networking: timestamping: rename last section to "Known bugs" Vladimir Oltean
2020-07-17 22:05   ` Jacob Keller
2020-07-17 16:10 ` [PATCH net-next 2/3] docs: networking: timestamping: add one more known issue Vladimir Oltean
2020-07-17 23:08   ` Jacob Keller
2020-07-18 11:36     ` Vladimir Oltean
2020-07-17 16:10 ` [PATCH net-next 3/3] docs: networking: timestamping: add a set of frequently asked questions Vladimir Oltean
2020-07-17 23:12   ` Jacob Keller
2020-07-18 11:35     ` Vladimir Oltean
2020-07-20 18:54       ` Jacob Keller
2020-07-20 21:05         ` Vladimir Oltean
2020-07-20 21:45           ` Jacob Keller
2020-07-20 22:13             ` Vladimir Oltean
2020-07-21  0:21               ` Richard Cochran
2020-07-21 19:51                 ` Vladimir Oltean
2020-07-22  3:25                   ` Richard Cochran
2020-07-25 21:32                     ` Vladimir Oltean
2020-07-21 17:16               ` Jacob Keller
2020-07-21  0:15           ` Richard Cochran
2020-07-17 21:13 ` [PATCH net-next 0/3] Document more PTP timestamping known quirks Sergey Organov
2020-07-17 21:57   ` Vladimir Oltean
2020-07-17 23:13     ` Jacob Keller
2020-07-18 10:54     ` Sergey Organov
2020-07-18 11:30       ` Vladimir Oltean
2020-07-18 13:35         ` Sergey Organov
2020-07-18 11:21 ` Sergey Organov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).