netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net] net: dsa: sja1105: Fix double delivery of TX timestamps to socket error queue
@ 2019-12-16 22:33 Vladimir Oltean
  2019-12-17 19:57 ` Keller, Jacob E
  0 siblings, 1 reply; 10+ messages in thread
From: Vladimir Oltean @ 2019-12-16 22:33 UTC (permalink / raw)
  To: davem, jakub.kicinski
  Cc: richardcochran, f.fainelli, vivien.didelot, andrew, netdev,
	Vladimir Oltean

On the LS1021A-TSN board, it can be seen that under rare conditions,
ptp4l gets unexpected extra data in the event socket error queue.

This is because the DSA master driver (gianfar) has TX timestamping
logic along the lines of:

1. In gfar_start_xmit:
	do_tstamp = (skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP) &&
		    priv->hwts_tx_en;
	(...)
	if (unlikely(do_tstamp))
		skb_shinfo(skb)->tx_flags |= SKBTX_IN_PROGRESS;
2. Later in gfar_clean_tx_ring:
	if (unlikely(skb_shinfo(skb)->tx_flags & SKBTX_IN_PROGRESS))
		(...)
		skb_tstamp_tx(skb, &shhwtstamps);

That is to say, between the first and second check, it drops
priv->hwts_tx_en (it assumes that it is the only one who can set
SKBTX_IN_PROGRESS, disregarding stacked net devices such as DSA switches
or PTP-capable PHY drivers). Any such driver (like sja1105 in this case)
that would set SKBTX_IN_PROGRESS would trip up this check and gianfar
would deliver a garbage TX timestamp for this skb too, which can in turn
have unpredictable and surprising effects to user space.

In fact gianfar is by far not the only driver which uses
SKBTX_IN_PROGRESS to identify skb's that need special handling. The flag
used to have a historical purpose and is now evaluated in the networking
stack in a single place: in __skb_tstamp_tx, only on the software
timestamping path (hwtstamps == NULL) which is not relevant for us.

So do the wise thing and drop the unneeded assignment. Even though this
patch alone will not protect against all classes of Ethernet driver TX
timestamping bugs, it will circumvent those related to the incorrect
interpretation of this skb tx flag.

Fixes: 47ed985e97f5 ("net: dsa: sja1105: Add logic for TX timestamping")
Suggested-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
---
 drivers/net/dsa/sja1105/sja1105_ptp.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/net/dsa/sja1105/sja1105_ptp.c b/drivers/net/dsa/sja1105/sja1105_ptp.c
index 54258a25031d..038c83fbd9e8 100644
--- a/drivers/net/dsa/sja1105/sja1105_ptp.c
+++ b/drivers/net/dsa/sja1105/sja1105_ptp.c
@@ -668,8 +668,6 @@ void sja1105_ptp_txtstamp_skb(struct dsa_switch *ds, int slot,
 	u64 ticks, ts;
 	int rc;
 
-	skb_shinfo(skb)->tx_flags |= SKBTX_IN_PROGRESS;
-
 	mutex_lock(&ptp_data->lock);
 
 	rc = sja1105_ptpclkval_read(priv, &ticks, NULL);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* RE: [PATCH net] net: dsa: sja1105: Fix double delivery of TX timestamps to socket error queue
  2019-12-16 22:33 [PATCH net] net: dsa: sja1105: Fix double delivery of TX timestamps to socket error queue Vladimir Oltean
@ 2019-12-17 19:57 ` Keller, Jacob E
  2019-12-17 20:07   ` Vladimir Oltean
  0 siblings, 1 reply; 10+ messages in thread
From: Keller, Jacob E @ 2019-12-17 19:57 UTC (permalink / raw)
  To: Vladimir Oltean, davem, jakub.kicinski
  Cc: richardcochran, f.fainelli, vivien.didelot, andrew, netdev

> -----Original Message-----
> From: netdev-owner@vger.kernel.org <netdev-owner@vger.kernel.org> On
> Behalf Of Vladimir Oltean
> Sent: Monday, December 16, 2019 2:34 PM
> To: davem@davemloft.net; jakub.kicinski@netronome.com
> Cc: richardcochran@gmail.com; f.fainelli@gmail.com; vivien.didelot@gmail.com;
> andrew@lunn.ch; netdev@vger.kernel.org; Vladimir Oltean
> <olteanv@gmail.com>
> Subject: [PATCH net] net: dsa: sja1105: Fix double delivery of TX timestamps to
> socket error queue
> 
> On the LS1021A-TSN board, it can be seen that under rare conditions,
> ptp4l gets unexpected extra data in the event socket error queue.
> 
> This is because the DSA master driver (gianfar) has TX timestamping
> logic along the lines of:
> 
> 1. In gfar_start_xmit:
> 	do_tstamp = (skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP) &&
> 		    priv->hwts_tx_en;
> 	(...)
> 	if (unlikely(do_tstamp))
> 		skb_shinfo(skb)->tx_flags |= SKBTX_IN_PROGRESS;
> 2. Later in gfar_clean_tx_ring:
> 	if (unlikely(skb_shinfo(skb)->tx_flags & SKBTX_IN_PROGRESS))
> 		(...)
> 		skb_tstamp_tx(skb, &shhwtstamps);

I'm not sure I fully understand the problem.

> 
> That is to say, between the first and second check, it drops
> priv->hwts_tx_en (it assumes that it is the only one who can set
> SKBTX_IN_PROGRESS, disregarding stacked net devices such as DSA switches
> or PTP-capable PHY drivers). Any such driver (like sja1105 in this case)
> that would set SKBTX_IN_PROGRESS would trip up this check and gianfar
> would deliver a garbage TX timestamp for this skb too, which can in turn
> have unpredictable and surprising effects to user space.
> 
> In fact gianfar is by far not the only driver which uses
> SKBTX_IN_PROGRESS to identify skb's that need special handling. The flag
> used to have a historical purpose and is now evaluated in the networking
> stack in a single place: in __skb_tstamp_tx, only on the software
> timestamping path (hwtstamps == NULL) which is not relevant for us.
> 
> So do the wise thing and drop the unneeded assignment. Even though this
> patch alone will not protect against all classes of Ethernet driver TX
> timestamping bugs, it will circumvent those related to the incorrect
> interpretation of this skb tx flag.
>

I thought the point of SKBTX_IN_PROGRESS was to inform the stack that a timestamp was pending. By not setting it, you no longer do this.

Maybe that has changed since the original implementation? Or am I misunderstanding this patch..?

You're removing the sja1105 assignment, not the one from the gianfar. Hmm

Ok, so the issue is that sja1105_ptp.c was incorrectly setting the flag.

Would it make more sense for gianfar to set SKBTX_IN_PROGRESS, but then use some other indicator internally, so that other callers who set it don't cause the gianfar driver to behave incorrectly? I believe we handle it in the Intel drivers that way by storing the skb. Then we don't check the SKBTX_IN_PROGRESS later.

Thanks,
Jake 
 
> Fixes: 47ed985e97f5 ("net: dsa: sja1105: Add logic for TX timestamping")
> Suggested-by: Richard Cochran <richardcochran@gmail.com>
> Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
> ---
>  drivers/net/dsa/sja1105/sja1105_ptp.c | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/drivers/net/dsa/sja1105/sja1105_ptp.c
> b/drivers/net/dsa/sja1105/sja1105_ptp.c
> index 54258a25031d..038c83fbd9e8 100644
> --- a/drivers/net/dsa/sja1105/sja1105_ptp.c
> +++ b/drivers/net/dsa/sja1105/sja1105_ptp.c
> @@ -668,8 +668,6 @@ void sja1105_ptp_txtstamp_skb(struct dsa_switch *ds, int
> slot,
>  	u64 ticks, ts;
>  	int rc;
> 
> -	skb_shinfo(skb)->tx_flags |= SKBTX_IN_PROGRESS;
> -
>  	mutex_lock(&ptp_data->lock);
> 
>  	rc = sja1105_ptpclkval_read(priv, &ticks, NULL);
> --
> 2.17.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH net] net: dsa: sja1105: Fix double delivery of TX timestamps to socket error queue
  2019-12-17 19:57 ` Keller, Jacob E
@ 2019-12-17 20:07   ` Vladimir Oltean
  2019-12-17 22:21     ` Keller, Jacob E
  0 siblings, 1 reply; 10+ messages in thread
From: Vladimir Oltean @ 2019-12-17 20:07 UTC (permalink / raw)
  To: Keller, Jacob E
  Cc: davem, jakub.kicinski, richardcochran, f.fainelli,
	vivien.didelot, andrew, netdev

Hi Jake,

On Tue, 17 Dec 2019 at 21:57, Keller, Jacob E <jacob.e.keller@intel.com> wrote:
>
> > -----Original Message-----
> > From: netdev-owner@vger.kernel.org <netdev-owner@vger.kernel.org> On
> > Behalf Of Vladimir Oltean
> > Sent: Monday, December 16, 2019 2:34 PM
> > To: davem@davemloft.net; jakub.kicinski@netronome.com
> > Cc: richardcochran@gmail.com; f.fainelli@gmail.com; vivien.didelot@gmail.com;
> > andrew@lunn.ch; netdev@vger.kernel.org; Vladimir Oltean
> > <olteanv@gmail.com>
> > Subject: [PATCH net] net: dsa: sja1105: Fix double delivery of TX timestamps to
> > socket error queue
> >
> > On the LS1021A-TSN board, it can be seen that under rare conditions,
> > ptp4l gets unexpected extra data in the event socket error queue.
> >
> > This is because the DSA master driver (gianfar) has TX timestamping
> > logic along the lines of:
> >
> > 1. In gfar_start_xmit:
> >       do_tstamp = (skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP) &&
> >                   priv->hwts_tx_en;
> >       (...)
> >       if (unlikely(do_tstamp))
> >               skb_shinfo(skb)->tx_flags |= SKBTX_IN_PROGRESS;
> > 2. Later in gfar_clean_tx_ring:
> >       if (unlikely(skb_shinfo(skb)->tx_flags & SKBTX_IN_PROGRESS))
> >               (...)
> >               skb_tstamp_tx(skb, &shhwtstamps);
>
> I'm not sure I fully understand the problem.
>
> I thought the point of SKBTX_IN_PROGRESS was to inform the stack that a timestamp was pending. By not setting it, you no longer do this.
>
> Maybe that has changed since the original implementation? Or am I misunderstanding this patch..?
>

I am not quite sure what the point of SKBTX_IN_PROGRESS is. If you
search through the kernel you will find a single occurrence right now
that is supposed to do something (I don't understand what exactly)
when there is a concurrent sw and hw timestamp.

> You're removing the sja1105 assignment, not the one from the gianfar. Hmm
>
> Ok, so the issue is that sja1105_ptp.c was incorrectly setting the flag.
>
> Would it make more sense for gianfar to set SKBTX_IN_PROGRESS, but then use some other indicator internally, so that other callers who set it don't cause the gianfar driver to behave incorrectly? I believe we handle it in the Intel drivers that way by storing the skb. Then we don't check the SKBTX_IN_PROGRESS later.

Yes, the point is that it should check priv->hwts_tx_en &&
(skb_shinfo(skb)->tx_flags & SKBTX_IN_PROGRESS). That was my initial
fix for the bug, but Richard argued that setting SKBTX_IN_PROGRESS in
itself isn't needed.

There are many more drivers that are in principle broken with DSA PTP,
since they don't even have the equivalent check for priv->hwts_tx_en.

>
> Thanks,
> Jake
>
>

Thanks,
-Vladimir

^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: [PATCH net] net: dsa: sja1105: Fix double delivery of TX timestamps to socket error queue
  2019-12-17 20:07   ` Vladimir Oltean
@ 2019-12-17 22:21     ` Keller, Jacob E
  2019-12-24 19:05       ` Richard Cochran
  0 siblings, 1 reply; 10+ messages in thread
From: Keller, Jacob E @ 2019-12-17 22:21 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: davem, jakub.kicinski, richardcochran, f.fainelli,
	vivien.didelot, andrew, netdev

> -----Original Message-----
> From: netdev-owner@vger.kernel.org <netdev-owner@vger.kernel.org> On
> Behalf Of Vladimir Oltean
> Sent: Tuesday, December 17, 2019 12:07 PM
> To: Keller, Jacob E <jacob.e.keller@intel.com>
> Cc: davem@davemloft.net; jakub.kicinski@netronome.com;
> richardcochran@gmail.com; f.fainelli@gmail.com; vivien.didelot@gmail.com;
> andrew@lunn.ch; netdev@vger.kernel.org
> Subject: Re: [PATCH net] net: dsa: sja1105: Fix double delivery of TX timestamps
> to socket error queue
> 
> Hi Jake,
> 
> > I thought the point of SKBTX_IN_PROGRESS was to inform the stack that a
> timestamp was pending. By not setting it, you no longer do this.
> >
> > Maybe that has changed since the original implementation? Or am I
> misunderstanding this patch..?
> >
> 
> I am not quite sure what the point of SKBTX_IN_PROGRESS is. If you
> search through the kernel you will find a single occurrence right now
> that is supposed to do something (I don't understand what exactly)
> when there is a concurrent sw and hw timestamp.
> 

My understanding was that setting it prevented the stack from generating a SW timestamp if the hardware timestamp was going to be provided. Basically, this is because we would otherwise report the timestamp twice to applications that expect only one timestamp.

There were some patches from Miroslav that enabled optionally allowed the reporting of both SW and HW timestamps at the same time.

> > You're removing the sja1105 assignment, not the one from the gianfar. Hmm
> >
> > Ok, so the issue is that sja1105_ptp.c was incorrectly setting the flag.
> >
> > Would it make more sense for gianfar to set SKBTX_IN_PROGRESS, but then
> use some other indicator internally, so that other callers who set it don't cause
> the gianfar driver to behave incorrectly? I believe we handle it in the Intel drivers
> that way by storing the skb. Then we don't check the SKBTX_IN_PROGRESS later.
> 
> Yes, the point is that it should check priv->hwts_tx_en &&
> (skb_shinfo(skb)->tx_flags & SKBTX_IN_PROGRESS). That was my initial
> fix for the bug, but Richard argued that setting SKBTX_IN_PROGRESS in
> itself isn't needed.
> 

I'd trust Richard on this one for sure.

> There are many more drivers that are in principle broken with DSA PTP,
> since they don't even have the equivalent check for priv->hwts_tx_en.
> 

Right. I'm wondering what the correct fix would be so that we can fix the drivers and hopefully avoid introducing a similar issue in the future.

Thanks,
Jake

> >
> > Thanks,
> > Jake
> >
> >
> 
> Thanks,
> -Vladimir

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH net] net: dsa: sja1105: Fix double delivery of TX timestamps to socket error queue
  2019-12-17 22:21     ` Keller, Jacob E
@ 2019-12-24 19:05       ` Richard Cochran
  2019-12-26 18:24         ` Vladimir Oltean
  0 siblings, 1 reply; 10+ messages in thread
From: Richard Cochran @ 2019-12-24 19:05 UTC (permalink / raw)
  To: Keller, Jacob E
  Cc: Vladimir Oltean, davem, jakub.kicinski, f.fainelli,
	vivien.didelot, andrew, netdev

On Tue, Dec 17, 2019 at 10:21:29PM +0000, Keller, Jacob E wrote:
> > -----Original Message-----
> > From: netdev-owner@vger.kernel.org <netdev-owner@vger.kernel.org> On
> > Behalf Of Vladimir Oltean

> My understanding was that setting it prevented the stack from
> generating a SW timestamp if the hardware timestamp was going to be
> provided. Basically, this is because we would otherwise report the
> timestamp twice to applications that expect only one timestamp.

Correct. 

> There were some patches from Miroslav that enabled optionally
> allowed the reporting of both SW and HW timestamps at the same time.

Not quite.  If the user sets SOF_TIMESTAMPING_OPT_TX_SWHW, then there
will be two packets delivered to the error queue, one SW and one HW.

> > There are many more drivers that are in principle broken with DSA PTP,
> > since they don't even have the equivalent check for priv->hwts_tx_en.

Please stop saying that.  It is not true.

> Right. I'm wondering what the correct fix would be so that we can
> fix the drivers and hopefully avoid introducing a similar issue in
> the future.

No fix is needed.  MAC drivers must set SKBTX_IN_PROGRESS and call
skb_tstamp_tx() to deliver the transmit time stamp.  DSA drivers
should call skb_complete_tx_timestamp() to deliver the transmit time
stamp, and they should *not* set SKBTX_IN_PROGRESS.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH net] net: dsa: sja1105: Fix double delivery of TX timestamps to socket error queue
  2019-12-24 19:05       ` Richard Cochran
@ 2019-12-26 18:24         ` Vladimir Oltean
  2019-12-27  1:52           ` Richard Cochran
  0 siblings, 1 reply; 10+ messages in thread
From: Vladimir Oltean @ 2019-12-26 18:24 UTC (permalink / raw)
  To: Richard Cochran
  Cc: Keller, Jacob E, davem, jakub.kicinski, f.fainelli,
	vivien.didelot, andrew, netdev

Hi Richard,

On Tue, 24 Dec 2019 at 21:05, Richard Cochran <richardcochran@gmail.com> wrote:
>
> On Tue, Dec 17, 2019 at 10:21:29PM +0000, Keller, Jacob E wrote:
> > > -----Original Message-----
> > > From: netdev-owner@vger.kernel.org <netdev-owner@vger.kernel.org> On
> > > Behalf Of Vladimir Oltean
>
> > > There are many more drivers that are in principle broken with DSA PTP,
> > > since they don't even have the equivalent check for priv->hwts_tx_en.
>
> Please stop saying that.  It is not true.
>

Why isn't it true?
You mean to say that this code in cavium/liquidio/lio_main.c will ever
work with a PTP-cable DSA switch or PHY?

static netdev_tx_t liquidio_xmit(struct sk_buff *skb, struct net_device *netdev)
{
(...)
    if (unlikely(skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP)) {
        skb_shinfo(skb)->tx_flags |= SKBTX_IN_PROGRESS;
        cmdsetup.s.timestamp = 1;
    }

Or this one in cavium/octeon/octeon_mgmt.c?

    re.s.tstamp = ((skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP) != 0);

Or this one in mscc/ocelot.c:

    /* Check if timestamping is needed */
    if (ocelot->ptp && shinfo->tx_flags & SKBTX_HW_TSTAMP) {
        info.rew_op = ocelot_port->ptp_cmd;
        if (ocelot_port->ptp_cmd == IFH_REW_OP_TWO_STEP_PTP)
            info.rew_op |= (ocelot_port->ts_id  % 4) << 3;
    }

Or this one in mellanox/mlx5/core/en_tx.c:

            if (unlikely(skb_shinfo(skb)->tx_flags &
                     SKBTX_HW_TSTAMP)) {
                struct skb_shared_hwtstamps hwts = {};

                hwts.hwtstamp =
                    mlx5_timecounter_cyc2time(sq->clock,
                                  get_cqe_ts(cqe));
                skb_tstamp_tx(skb, &hwts);
            }

etc etc.

How will these drivers not transmit a second hw TX timestamp to the
stack, if they don't check whether TX timestamping is enabled for
their netdev?

Of course, at least that breakage is going to be much more binary and
obvious: PTP simply won't work at all for drivers stacked on top of
them until they are fixed.

> No fix is needed.  MAC drivers must set SKBTX_IN_PROGRESS and call
> skb_tstamp_tx() to deliver the transmit time stamp.  DSA drivers
> should call skb_complete_tx_timestamp() to deliver the transmit time
> stamp, and they should *not* set SKBTX_IN_PROGRESS.
>

Who says so, and why? How would it be better than fixing gianfar in
this case? How would it be better in avoiding compatibility with the
drivers mentioned above?

Does the TI PHYTER driver count?

commit e2e2f51dd0254fa0002bcd1c5fda180348163f09
Author: Stefan Sørensen <stefan.sorensen@spectralink.com>
Date:   Mon Feb 3 15:36:35 2014 +0100

    net:phy:dp83640: Declare that TX timestamping possible

    Set the SKBTX_IN_PROGRESS bit in tx_flags dp83640_txtstamp when doing
    tx timestamps as per Documentation/networking/timestamping.txt.

    Signed-off-by: Stefan Sørensen <stefan.sorensen@spectralink.com>
    Acked-by: Richard Cochran <richardcochran@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

diff --git a/drivers/net/phy/dp83640.c b/drivers/net/phy/dp83640.c
index 547725fa8671..3f882eea6e1d 100644
--- a/drivers/net/phy/dp83640.c
+++ b/drivers/net/phy/dp83640.c
@@ -1281,6 +1281,7 @@ static void dp83640_txtstamp(struct phy_device *phydev,
                }
                /* fall through */
        case HWTSTAMP_TX_ON:
+               skb_shinfo(skb)->tx_flags |= SKBTX_IN_PROGRESS;
                skb_queue_tail(&dp83640->tx_queue, skb);
                schedule_work(&dp83640->ts_work);
                break;

I think this rule is somewhat made-up and arbitrary.

> Thanks,
> Richard

Thanks,
-Vladimir

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH net] net: dsa: sja1105: Fix double delivery of TX timestamps to socket error queue
  2019-12-26 18:24         ` Vladimir Oltean
@ 2019-12-27  1:52           ` Richard Cochran
  2019-12-27 15:19             ` Richard Cochran
  0 siblings, 1 reply; 10+ messages in thread
From: Richard Cochran @ 2019-12-27  1:52 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Keller, Jacob E, davem, jakub.kicinski, f.fainelli,
	vivien.didelot, andrew, netdev

On Thu, Dec 26, 2019 at 08:24:19PM +0200, Vladimir Oltean wrote:
> How will these drivers not transmit a second hw TX timestamp to the
> stack, if they don't check whether TX timestamping is enabled for
> their netdev?

Ah, so they are checking SKBTX_HW_TSTAMP on the socket without
checking for HWTSTAMP first?  Yeah, that won't work with DSA time
stamping.  But who cares?  Most of the real world doesn't have gigabit
DSA switches in front of MACs that provide 10 gigabit links or higher.

> Of course, at least that breakage is going to be much more binary and
> obvious: PTP simply won't work at all for drivers stacked on top of
> them until they are fixed.

Right, you can fix individual MAC drivers to work with DSA, one by
one.  You are free to try to get the maintainers to ack your fixes.
(But, here is a friendly hint: don't start out by declaring their
drivers "broken").
 
> Does the TI PHYTER driver count?

No.  It really doesn't count.  It won't work together with MAC time
stamping, but this is a known limitation.  That part is 100 mbit only,
and there are very few implementations.  Unfortunately the
so_timestamping API did not foresee the possibility of simultaneous
MAC and PHY time stamping.  Too bad, that's life.  I didn't invent the
API, and I don't have to defend it, either.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH net] net: dsa: sja1105: Fix double delivery of TX timestamps to socket error queue
  2019-12-27  1:52           ` Richard Cochran
@ 2019-12-27 15:19             ` Richard Cochran
  2019-12-27 15:30               ` Vladimir Oltean
  0 siblings, 1 reply; 10+ messages in thread
From: Richard Cochran @ 2019-12-27 15:19 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Keller, Jacob E, davem, jakub.kicinski, f.fainelli,
	vivien.didelot, andrew, netdev

On Thu, Dec 26, 2019 at 05:52:32PM -0800, Richard Cochran wrote:
> On Thu, Dec 26, 2019 at 08:24:19PM +0200, Vladimir Oltean wrote:
> > How will these drivers not transmit a second hw TX timestamp to the
> > stack, if they don't check whether TX timestamping is enabled for
> > their netdev?
> 
> Ah, so they are checking SKBTX_HW_TSTAMP on the socket without
> checking for HWTSTAMP first?

Thinking a bit more about this, MAC drivers should not attempt any
time stamping unless that functionality has been enabled at the device
level using HWTSTAMP.  This is the user space API.

So, if you want to fix those drivers, you can submit patches with the
above justification.  That is a stronger argument than saying it
breaks DSA drivers!

Thanks,
Richard

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH net] net: dsa: sja1105: Fix double delivery of TX timestamps to socket error queue
  2019-12-27 15:19             ` Richard Cochran
@ 2019-12-27 15:30               ` Vladimir Oltean
  2019-12-27 17:39                 ` Richard Cochran
  0 siblings, 1 reply; 10+ messages in thread
From: Vladimir Oltean @ 2019-12-27 15:30 UTC (permalink / raw)
  To: Richard Cochran
  Cc: Keller, Jacob E, davem, jakub.kicinski, f.fainelli,
	vivien.didelot, andrew, netdev

On Fri, 27 Dec 2019 at 17:19, Richard Cochran <richardcochran@gmail.com> wrote:
>
> On Thu, Dec 26, 2019 at 05:52:32PM -0800, Richard Cochran wrote:
> > On Thu, Dec 26, 2019 at 08:24:19PM +0200, Vladimir Oltean wrote:
> > > How will these drivers not transmit a second hw TX timestamp to the
> > > stack, if they don't check whether TX timestamping is enabled for
> > > their netdev?
> >
> > Ah, so they are checking SKBTX_HW_TSTAMP on the socket without
> > checking for HWTSTAMP first?
>
> Thinking a bit more about this, MAC drivers should not attempt any
> time stamping unless that functionality has been enabled at the device
> level using HWTSTAMP.  This is the user space API.
>
> So, if you want to fix those drivers, you can submit patches with the
> above justification.  That is a stronger argument than saying it
> breaks DSA drivers!
>
> Thanks,
> Richard

But in the case that I _do_ care about, it _is_ caused by DSA. The
gianfar driver is not expecting anybody else to set SKBTX_IN_PROGRESS,
which in itself is not illegal, whether or not you argue that it is
needed or not.
For the rest of the drivers, sure, they commit even more flagrant API
breaches, such as not checking previous calls of the HWTSTAMP ioctl.
But the flagrant issues are at least easier to catch (aka PTP is never
going to work), so I'm not as concerned about them, as long as user
space (e.g. ptp4l) has the necessary checks in place to detect such
error conditions.

Thanks.
-Vladimir

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH net] net: dsa: sja1105: Fix double delivery of TX timestamps to socket error queue
  2019-12-27 15:30               ` Vladimir Oltean
@ 2019-12-27 17:39                 ` Richard Cochran
  0 siblings, 0 replies; 10+ messages in thread
From: Richard Cochran @ 2019-12-27 17:39 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Keller, Jacob E, davem, jakub.kicinski, f.fainelli,
	vivien.didelot, andrew, netdev

On Fri, Dec 27, 2019 at 05:30:09PM +0200, Vladimir Oltean wrote:
> But in the case that I _do_ care about, it _is_ caused by DSA. The
> gianfar driver is not expecting anybody else to set SKBTX_IN_PROGRESS,
> which in itself is not illegal, whether or not you argue that it is
> needed or not.

Ok, so I've reviewed the whole SKBTX_IN_PROGRESS thing again, and I'm
starting to see it your way.  (Or maybe not ;^)

The purpose of SKBTX_IN_PROGRESS is to prevent a SW transmit time
stamp from being delivered (unless SOF_TIMESTAMPING_OPT_TX_SWHW is set
on the socket).

So DSA drivers (and PHY drivers) should indeed set this flag.

However, most (or all?) MAC drivers do not expect anyone else to have
set this flag.

In the case of DSA, the DSA driver has the chance to set the flag
first, before the skb is passed to the MAC driver.  (PHY drivers don't
have first dibs, but let's concentrate on DSA for now.)

So, you could introduce a new rule that MAC drivers must check to see
if the flag is already set, and if so leave it alone.

MAC drivers should in any case respect their current SIOCSHWTSTAMP
configuration and not attempt time stamping when it is not enabled.

So I'm afraid that, as I've said before, you'll have to patch the MAC
drivers one by.  It will be lots of work.

But if and when you submit patches for the MAC drivers, please
remember that DSA time stamping is the new kid on the block, and
yelling that the MAC drivers are broken won't make you any friends.

Thanks,
Richard


^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2019-12-27 17:39 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-12-16 22:33 [PATCH net] net: dsa: sja1105: Fix double delivery of TX timestamps to socket error queue Vladimir Oltean
2019-12-17 19:57 ` Keller, Jacob E
2019-12-17 20:07   ` Vladimir Oltean
2019-12-17 22:21     ` Keller, Jacob E
2019-12-24 19:05       ` Richard Cochran
2019-12-26 18:24         ` Vladimir Oltean
2019-12-27  1:52           ` Richard Cochran
2019-12-27 15:19             ` Richard Cochran
2019-12-27 15:30               ` Vladimir Oltean
2019-12-27 17:39                 ` Richard Cochran

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).