netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 net 0/2] The DSA TX timestamping situation
@ 2019-12-28 13:30 Vladimir Oltean
  2019-12-28 13:30 ` [PATCH v2 net 1/2] gianfar: Fix TX timestamping with a stacked DSA driver Vladimir Oltean
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Vladimir Oltean @ 2019-12-28 13:30 UTC (permalink / raw)
  To: davem, jakub.kicinski
  Cc: richardcochran, f.fainelli, vivien.didelot, andrew,
	claudiu.manoil, yangbo.lu, netdev, Vladimir Oltean

This series is the moral v2 of "[PATCH net] net: dsa: sja1105: Fix
double delivery of TX timestamps to socket error queue" [0] which did
not manage to convince public opinion (actually it didn't convince me
neither).

This fixes PTP timestamping on one particular board, where the DSA
switch is sja1105 and the master is gianfar. Unfortunately there is no
way to make the fix more general without committing logical
inaccuracies: the SKBTX_IN_PROGRESS flag does serve a purpose, even if
the sja1105 driver is not using it now: it prevents delivering a SW
timestamp to the app socket when the HW timestamp will be provided. So
not setting this flag (the approach from v1) might create avoidable
complications in the future (not to mention that there isn't any
satisfactory explanation on why that would be the correct solution).

So the goal of this change set is to create a more strict framework for
DSA master devices when attached to PTP switches, and to fix the first
master driver that is overstepping its duties and is delivering
unsolicited TX timestamps.

[0]: https://www.spinics.net/lists/netdev/msg619699.html

Vladimir Oltean (2):
  gianfar: Fix TX timestamping with a stacked DSA driver
  net: dsa: Deny PTP on master if switch supports it

 drivers/net/ethernet/freescale/gianfar.c | 10 +++++---
 net/dsa/master.c                         | 30 ++++++++++++++++++++++++
 2 files changed, 37 insertions(+), 3 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH v2 net 1/2] gianfar: Fix TX timestamping with a stacked DSA driver
  2019-12-28 13:30 [PATCH v2 net 0/2] The DSA TX timestamping situation Vladimir Oltean
@ 2019-12-28 13:30 ` Vladimir Oltean
  2019-12-28 13:30 ` [PATCH v2 net 2/2] net: dsa: Deny PTP on master if switch supports it Vladimir Oltean
  2019-12-28 19:43 ` [PATCH v2 net 0/2] The DSA TX timestamping situation David Miller
  2 siblings, 0 replies; 4+ messages in thread
From: Vladimir Oltean @ 2019-12-28 13:30 UTC (permalink / raw)
  To: davem, jakub.kicinski
  Cc: richardcochran, f.fainelli, vivien.didelot, andrew,
	claudiu.manoil, yangbo.lu, netdev, Vladimir Oltean

The driver wrongly assumes that it is the only entity that can set the
SKBTX_IN_PROGRESS bit of the current skb. Therefore, in the
gfar_clean_tx_ring function, where the TX timestamp is collected if
necessary, the aforementioned bit is used to discriminate whether or not
the TX timestamp should be delivered to the socket's error queue.

But a stacked driver such as a DSA switch can also set the
SKBTX_IN_PROGRESS bit, which is actually exactly what it should do in
order to denote that the hardware timestamping process is undergoing.

Therefore, gianfar would misinterpret the "in progress" bit as being its
own, and deliver a second skb clone in the socket's error queue,
completely throwing off a PTP process which is not expecting to receive
it, _even though_ TX timestamping is not enabled for gianfar.

There have been discussions [0] as to whether non-MAC drivers need or
not to set SKBTX_IN_PROGRESS at all (whose purpose is to avoid sending 2
timestamps, a sw and a hw one, to applications which only expect one).
But as of this patch, there are at least 2 PTP drivers that would break
in conjunction with gianfar: the sja1105 DSA switch and the felix
switch, by way of its ocelot core driver.

So regardless of that conclusion, fix the gianfar driver to not do stuff
based on flags set by others and not intended for it.

[0]: https://www.spinics.net/lists/netdev/msg619699.html

Fixes: f0ee7acfcdd4 ("gianfar: Add hardware TX timestamping support")
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
---
Changes from v1:
- Reworded the commit message, dropped reference to the TI PHYTER.

 drivers/net/ethernet/freescale/gianfar.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/freescale/gianfar.c b/drivers/net/ethernet/freescale/gianfar.c
index 72868a28b621..7d08bf6370ae 100644
--- a/drivers/net/ethernet/freescale/gianfar.c
+++ b/drivers/net/ethernet/freescale/gianfar.c
@@ -2205,13 +2205,17 @@ static void gfar_clean_tx_ring(struct gfar_priv_tx_q *tx_queue)
 	skb_dirtytx = tx_queue->skb_dirtytx;
 
 	while ((skb = tx_queue->tx_skbuff[skb_dirtytx])) {
+		bool do_tstamp;
+
+		do_tstamp = (skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP) &&
+			    priv->hwts_tx_en;
 
 		frags = skb_shinfo(skb)->nr_frags;
 
 		/* When time stamping, one additional TxBD must be freed.
 		 * Also, we need to dma_unmap_single() the TxPAL.
 		 */
-		if (unlikely(skb_shinfo(skb)->tx_flags & SKBTX_IN_PROGRESS))
+		if (unlikely(do_tstamp))
 			nr_txbds = frags + 2;
 		else
 			nr_txbds = frags + 1;
@@ -2225,7 +2229,7 @@ static void gfar_clean_tx_ring(struct gfar_priv_tx_q *tx_queue)
 		    (lstatus & BD_LENGTH_MASK))
 			break;
 
-		if (unlikely(skb_shinfo(skb)->tx_flags & SKBTX_IN_PROGRESS)) {
+		if (unlikely(do_tstamp)) {
 			next = next_txbd(bdp, base, tx_ring_size);
 			buflen = be16_to_cpu(next->length) +
 				 GMAC_FCB_LEN + GMAC_TXPAL_LEN;
@@ -2235,7 +2239,7 @@ static void gfar_clean_tx_ring(struct gfar_priv_tx_q *tx_queue)
 		dma_unmap_single(priv->dev, be32_to_cpu(bdp->bufPtr),
 				 buflen, DMA_TO_DEVICE);
 
-		if (unlikely(skb_shinfo(skb)->tx_flags & SKBTX_IN_PROGRESS)) {
+		if (unlikely(do_tstamp)) {
 			struct skb_shared_hwtstamps shhwtstamps;
 			u64 *ns = (u64 *)(((uintptr_t)skb->data + 0x10) &
 					  ~0x7UL);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH v2 net 2/2] net: dsa: Deny PTP on master if switch supports it
  2019-12-28 13:30 [PATCH v2 net 0/2] The DSA TX timestamping situation Vladimir Oltean
  2019-12-28 13:30 ` [PATCH v2 net 1/2] gianfar: Fix TX timestamping with a stacked DSA driver Vladimir Oltean
@ 2019-12-28 13:30 ` Vladimir Oltean
  2019-12-28 19:43 ` [PATCH v2 net 0/2] The DSA TX timestamping situation David Miller
  2 siblings, 0 replies; 4+ messages in thread
From: Vladimir Oltean @ 2019-12-28 13:30 UTC (permalink / raw)
  To: davem, jakub.kicinski
  Cc: richardcochran, f.fainelli, vivien.didelot, andrew,
	claudiu.manoil, yangbo.lu, netdev, Vladimir Oltean

It is possible to kill PTP on a DSA switch completely and absolutely,
until a reboot, with a simple command:

tcpdump -i eth2 -j adapter_unsynced

where eth2 is the switch's DSA master.

Why? Well, in short, the PTP API in place today is a bit rudimentary and
relies on applications to retrieve the TX timestamps by polling the
error queue and looking at the cmsg structure. But there is no timestamp
identification of any sorts (except whether it's HW or SW), you don't
know how many more timestamps are there to come, which one is this one,
from whom it is, etc. In other words, the SO_TIMESTAMPING API is
fundamentally limited in that you can get a single HW timestamp from the
stack.

And the "-j adapter_unsynced" flag of tcpdump enables hardware
timestamping.

So let's imagine what happens when the DSA master decides it wants to
deliver TX timestamps to the skb's socket too:
- The timestamp that the user space sees is taken by the DSA master.
  Whereas the RX timestamp will eventually be overwritten by the DSA
  switch. So the RX and TX timestamps will be in different time bases
  (aka garbage).
- The user space applications have no way to deal with the second (real)
  TX timestamp finally delivered by the DSA switch, or even to know to
  wait for it.

Take ptp4l from the linuxptp project, for example. This is its behavior
after running tcpdump, before the patch:

ptp4l[172]: [6469.594] Unexpected data on socket err queue:
ptp4l[172]: [6469.693] rms    8 max   16 freq -21257 +/-  11 delay   748 +/-   0
ptp4l[172]: [6469.711] Unexpected data on socket err queue:
ptp4l[172]: 0020 00 00 00 1f 7b ff fe 63 02 48 00 03 aa 05 00 fd
ptp4l[172]: 0030 00 00 00 00 00 00 00 00 00 00
ptp4l[172]: [6469.721] Unexpected data on socket err queue:
ptp4l[172]: 0000 01 80 c2 00 00 0e 00 1f 7b 63 02 48 88 f7 10 02
ptp4l[172]: 0010 00 2c 00 00 02 00 00 00 00 00 00 00 00 00 00 00
ptp4l[172]: 0020 00 00 00 1f 7b ff fe 63 02 48 00 01 c6 b1 00 fd
ptp4l[172]: 0030 00 00 00 00 00 00 00 00 00 00
ptp4l[172]: [6469.838] Unexpected data on socket err queue:
ptp4l[172]: 0000 01 80 c2 00 00 0e 00 1f 7b 63 02 48 88 f7 10 02
ptp4l[172]: 0010 00 2c 00 00 02 00 00 00 00 00 00 00 00 00 00 00
ptp4l[172]: 0020 00 00 00 1f 7b ff fe 63 02 48 00 03 aa 06 00 fd
ptp4l[172]: 0030 00 00 00 00 00 00 00 00 00 00
ptp4l[172]: [6469.848] Unexpected data on socket err queue:
ptp4l[172]: 0000 01 80 c2 00 00 0e 00 1f 7b 63 02 48 88 f7 13 02
ptp4l[172]: 0010 00 36 00 00 02 00 00 00 00 00 00 00 00 00 00 00
ptp4l[172]: 0020 00 00 00 1f 7b ff fe 63 02 48 00 04 1a 45 05 7f
ptp4l[172]: 0030 00 00 5e 05 41 32 27 c2 1a 68 00 04 9f ff fe 05
ptp4l[172]: 0040 de 06 00 01
ptp4l[172]: [6469.855] Unexpected data on socket err queue:
ptp4l[172]: 0000 01 80 c2 00 00 0e 00 1f 7b 63 02 48 88 f7 10 02
ptp4l[172]: 0010 00 2c 00 00 02 00 00 00 00 00 00 00 00 00 00 00
ptp4l[172]: 0020 00 00 00 1f 7b ff fe 63 02 48 00 01 c6 b2 00 fd
ptp4l[172]: 0030 00 00 00 00 00 00 00 00 00 00
ptp4l[172]: [6469.974] Unexpected data on socket err queue:
ptp4l[172]: 0000 01 80 c2 00 00 0e 00 1f 7b 63 02 48 88 f7 10 02
ptp4l[172]: 0010 00 2c 00 00 02 00 00 00 00 00 00 00 00 00 00 00
ptp4l[172]: 0020 00 00 00 1f 7b ff fe 63 02 48 00 03 aa 07 00 fd
ptp4l[172]: 0030 00 00 00 00 00 00 00 00 00 00

The ptp4l program itself is heavily patched to show this (more details
here [0]). Otherwise, by default it just hangs.

On the other hand, with the DSA patch to disallow HW timestamping
applied:

tcpdump -i eth2 -j adapter_unsynced
tcpdump: SIOCSHWTSTAMP failed: Device or resource busy

So it is a fact of life that PTP timestamping on the DSA master is
incompatible with timestamping on the switch MAC, at least with the
current API. And if the switch supports PTP, taking the timestamps from
the switch MAC is highly preferable anyway, due to the fact that those
don't contain the queuing latencies of the switch. So just disallow PTP
on the DSA master if there is any PTP-capable switch attached.

[0]: https://sourceforge.net/p/linuxptp/mailman/message/36880648/

Fixes: 0336369d3a4d ("net: dsa: forward hardware timestamping ioctls to switch driver")
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
---
Changes in v2:
- Reworded the commit description, added some of Richard's
  clarifications.

 net/dsa/master.c | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/net/dsa/master.c b/net/dsa/master.c
index 3255dfc97f86..bd44bde272f4 100644
--- a/net/dsa/master.c
+++ b/net/dsa/master.c
@@ -197,6 +197,35 @@ static int dsa_master_get_phys_port_name(struct net_device *dev,
 	return 0;
 }
 
+static int dsa_master_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
+{
+	struct dsa_port *cpu_dp = dev->dsa_ptr;
+	struct dsa_switch *ds = cpu_dp->ds;
+	struct dsa_switch_tree *dst;
+	int err = -EOPNOTSUPP;
+	struct dsa_port *dp;
+
+	dst = ds->dst;
+
+	switch (cmd) {
+	case SIOCGHWTSTAMP:
+	case SIOCSHWTSTAMP:
+		/* Deny PTP operations on master if there is at least one
+		 * switch in the tree that is PTP capable.
+		 */
+		list_for_each_entry(dp, &dst->ports, list)
+			if (dp->ds->ops->port_hwtstamp_get ||
+			    dp->ds->ops->port_hwtstamp_set)
+				return -EBUSY;
+		break;
+	}
+
+	if (cpu_dp->orig_ndo_ops && cpu_dp->orig_ndo_ops->ndo_do_ioctl)
+		err = cpu_dp->orig_ndo_ops->ndo_do_ioctl(dev, ifr, cmd);
+
+	return err;
+}
+
 static int dsa_master_ethtool_setup(struct net_device *dev)
 {
 	struct dsa_port *cpu_dp = dev->dsa_ptr;
@@ -249,6 +278,7 @@ static int dsa_master_ndo_setup(struct net_device *dev)
 		memcpy(ops, cpu_dp->orig_ndo_ops, sizeof(*ops));
 
 	ops->ndo_get_phys_port_name = dsa_master_get_phys_port_name;
+	ops->ndo_do_ioctl = dsa_master_ioctl;
 
 	dev->netdev_ops  = ops;
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH v2 net 0/2] The DSA TX timestamping situation
  2019-12-28 13:30 [PATCH v2 net 0/2] The DSA TX timestamping situation Vladimir Oltean
  2019-12-28 13:30 ` [PATCH v2 net 1/2] gianfar: Fix TX timestamping with a stacked DSA driver Vladimir Oltean
  2019-12-28 13:30 ` [PATCH v2 net 2/2] net: dsa: Deny PTP on master if switch supports it Vladimir Oltean
@ 2019-12-28 19:43 ` David Miller
  2 siblings, 0 replies; 4+ messages in thread
From: David Miller @ 2019-12-28 19:43 UTC (permalink / raw)
  To: olteanv
  Cc: jakub.kicinski, richardcochran, f.fainelli, vivien.didelot,
	andrew, claudiu.manoil, yangbo.lu, netdev

From: Vladimir Oltean <olteanv@gmail.com>
Date: Sat, 28 Dec 2019 15:30:44 +0200

> This series is the moral v2 of "[PATCH net] net: dsa: sja1105: Fix
> double delivery of TX timestamps to socket error queue" [0] which did
> not manage to convince public opinion (actually it didn't convince me
> neither).
> 
> This fixes PTP timestamping on one particular board, where the DSA
> switch is sja1105 and the master is gianfar. Unfortunately there is no
> way to make the fix more general without committing logical
> inaccuracies: the SKBTX_IN_PROGRESS flag does serve a purpose, even if
> the sja1105 driver is not using it now: it prevents delivering a SW
> timestamp to the app socket when the HW timestamp will be provided. So
> not setting this flag (the approach from v1) might create avoidable
> complications in the future (not to mention that there isn't any
> satisfactory explanation on why that would be the correct solution).
> 
> So the goal of this change set is to create a more strict framework for
> DSA master devices when attached to PTP switches, and to fix the first
> master driver that is overstepping its duties and is delivering
> unsolicited TX timestamps.
> 
> [0]: https://www.spinics.net/lists/netdev/msg619699.html

Series applied, thank you.

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2019-12-28 19:44 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-12-28 13:30 [PATCH v2 net 0/2] The DSA TX timestamping situation Vladimir Oltean
2019-12-28 13:30 ` [PATCH v2 net 1/2] gianfar: Fix TX timestamping with a stacked DSA driver Vladimir Oltean
2019-12-28 13:30 ` [PATCH v2 net 2/2] net: dsa: Deny PTP on master if switch supports it Vladimir Oltean
2019-12-28 19:43 ` [PATCH v2 net 0/2] The DSA TX timestamping situation David Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).