[net-next,1/3] net: dsa: optimize tx timestamp request handling
diff mbox series

Message ID 20210416123655.42783-2-yangbo.lu@nxp.com
State New
Headers show
Series
  • Support ocelot PTP Sync one-step timestamping
Related show

Commit Message

Y.b. Lu April 16, 2021, 12:36 p.m. UTC
Optimization could be done on dsa_skb_tx_timestamp(), and dsa device
drivers should adapt to it.

- Check SKBTX_HW_TSTAMP request flag at the very beginning, instead of in
  port_txtstamp, so that most skbs not requiring tx timestamp just return.

- No longer to identify PTP packets, and limit tx timestamping only for PTP
  packets. If device driver likes, let device driver do.

- It is a waste to clone skb directly in dsa_skb_tx_timestamp().
  For one-step timestamping, a clone is not needed. For any failure of
  port_txtstamp (this may usually happen), the skb clone has to be freed.
  So put skb cloning into port_txtstamp where it really needs.

Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
---
 Documentation/networking/timestamping.rst     |  7 +++++--
 .../net/dsa/hirschmann/hellcreek_hwtstamp.c   | 20 ++++++++++++------
 .../net/dsa/hirschmann/hellcreek_hwtstamp.h   |  2 +-
 drivers/net/dsa/mv88e6xxx/hwtstamp.c          | 21 +++++++++++++------
 drivers/net/dsa/mv88e6xxx/hwtstamp.h          |  6 +++---
 drivers/net/dsa/ocelot/felix.c                | 11 ++++++----
 drivers/net/dsa/sja1105/sja1105_ptp.c         |  6 +++++-
 drivers/net/dsa/sja1105/sja1105_ptp.h         |  2 +-
 include/net/dsa.h                             |  2 +-
 net/dsa/slave.c                               | 20 +++++-------------
 10 files changed, 57 insertions(+), 40 deletions(-)

Comments

Vladimir Oltean April 18, 2021, 9:18 a.m. UTC | #1
On Fri, Apr 16, 2021 at 08:36:53PM +0800, Yangbo Lu wrote:
> Optimization could be done on dsa_skb_tx_timestamp(), and dsa device
> drivers should adapt to it.
> 
> - Check SKBTX_HW_TSTAMP request flag at the very beginning, instead of in
>   port_txtstamp, so that most skbs not requiring tx timestamp just return.

Agree that this is a trivial performance optimization with no downside
that we should be making.

> - No longer to identify PTP packets, and limit tx timestamping only for PTP
>   packets. If device driver likes, let device driver do.

Agree that DSA has a way too heavy hand in imposing upon the driver
which packets should be timestampable and which ones shouldn't.

For example, I have a latency measurement tool called isochron which is
based on hardware timestamping of non-PTP packets (in order to not
disturb the PTP state machines):
https://github.com/vladimiroltean/tsn-scripts

I can't use it on DSA interfaces, for rather artificial reasons.

> - It is a waste to clone skb directly in dsa_skb_tx_timestamp().
>   For one-step timestamping, a clone is not needed. For any failure of
>   port_txtstamp (this may usually happen), the skb clone has to be freed.
>   So put skb cloning into port_txtstamp where it really needs.

Mostly agree. For two-step timestamping, it is an operation which all
drivers need to do, so it is in the common potion. If we want to support
one-step, we need to avoid cloning the PTP packets.

> Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
> ---
>  Documentation/networking/timestamping.rst     |  7 +++++--
>  .../net/dsa/hirschmann/hellcreek_hwtstamp.c   | 20 ++++++++++++------
>  .../net/dsa/hirschmann/hellcreek_hwtstamp.h   |  2 +-
>  drivers/net/dsa/mv88e6xxx/hwtstamp.c          | 21 +++++++++++++------
>  drivers/net/dsa/mv88e6xxx/hwtstamp.h          |  6 +++---
>  drivers/net/dsa/ocelot/felix.c                | 11 ++++++----
>  drivers/net/dsa/sja1105/sja1105_ptp.c         |  6 +++++-
>  drivers/net/dsa/sja1105/sja1105_ptp.h         |  2 +-
>  include/net/dsa.h                             |  2 +-
>  net/dsa/slave.c                               | 20 +++++-------------
>  10 files changed, 57 insertions(+), 40 deletions(-)
> 
> diff --git a/Documentation/networking/timestamping.rst b/Documentation/networking/timestamping.rst
> index f682e88fa87e..7f04a699a5d1 100644
> --- a/Documentation/networking/timestamping.rst
> +++ b/Documentation/networking/timestamping.rst
> @@ -635,8 +635,8 @@ in generic code: a BPF classifier (``ptp_classify_raw``) is used to identify
>  PTP event messages (any other packets, including PTP general messages, are not
>  timestamped), and provides two hooks to drivers:
>  
> -- ``.port_txtstamp()``: The driver is passed a clone of the timestampable skb
> -  to be transmitted, before actually transmitting it. Typically, a switch will
> +- ``.port_txtstamp()``: A clone of the timestampable skb to be transmitted
> +  is needed, before actually transmitting it. Typically, a switch will
>    have a PTP TX timestamp register (or sometimes a FIFO) where the timestamp
>    becomes available. There may be an IRQ that is raised upon this timestamp's
>    availability, or the driver might have to poll after invoking
> @@ -645,6 +645,9 @@ timestamped), and provides two hooks to drivers:
>    later use (when the timestamp becomes available). Each skb is annotated with
>    a pointer to its clone, in ``DSA_SKB_CB(skb)->clone``, to ease the driver's
>    job of keeping track of which clone belongs to which skb.
> +  But one-step timestamping request is handled differently with above two-step
> +  timestamping. The skb clone is no longer needed since hardware will insert
> +  TX time information on packet during egress.

Bonus points for updating the documentation, but I don't quite like the
end result. Please feel free to restructure more, in order to have a
clearer and more coherent explanation.

Also, this paragraph from right above is no longer true:

	In code, DSA provides for most of the infrastructure for timestamping already,
	in generic code: a BPF classifier (``ptp_classify_raw``) is used to identify
	PTP event messages (any other packets, including PTP general messages, are not
	timestamped), and provides two hooks to drivers:

It's nothing like that anymore. It's more of a passthrough now with your
changes, the BPF classifier is not run by the DSA core but optionally by
individual taggers.

Here is my attempt of rewriting this documentation paragraph, feel free
to take which parts you consider relevant:

-----------------------------[cut here]-----------------------------

In the generic layer, DSA provides the following infrastructure for PTP
timestamping:

- ``.port_txtstamp()``: a hook called prior to the transmission of
  packets with a hardware TX timestamping request from user space.
  This is required for two-step timestamping, since the hardware
  timestamp becomes available after the actual MAC transmission, so the
  driver must be prepared to correlate the timestamp with the original
  packet so that it can re-enqueue the packet back into the socket's
  error queue. To save the packet for when the timestamp becomes
  available, the driver can call ``skb_clone_sk`` and save the resulting
  clone in ``DSA_SKB_CB(skb)->clone``. Typically, a switch will have a
  PTP TX timestamp register (or sometimes a FIFO) where the timestamp
  becomes available. In case of a FIFO, the hardware might store
  key-value pairs of PTP sequence ID/message type/domain number and the
  actual timestamp. To perform the correlation correctly between the
  packets in a queue waiting for timestamping and the actual timestamps,
  drivers can use a BPF classifier (``ptp_classify_raw``) to identify
  the PTP transport type, and ``ptp_parse_header`` to interpret the PTP
  header fields. There may be an IRQ that is raised upon this
  timestamp's availability, or the driver might have to poll after
  invoking ``dev_queue_xmit()`` towards the host interface.
  One-step TX timestamping do not require packet cloning, since there is
  no follow-up message required by the PTP protocol (because the
  TX timestamp is embedded into the packet by the MAC), and therefore
  user space does not expect the packet annotated with the TX timestamp
  to be re-enqueued into its socket's error queue.

- ``.port_rxtstamp()``: On RX, the BPF classifier is run by DSA to
  identify PTP event messages (any other packets, including PTP general
  messages, are not timestamped). The original (and only) timestampable
  skb is provided to the driver, for it to annotate it with a timestamp,
  if that is immediately available, or defer to later. On reception,
  timestamps might either be available in-band (through metadata in the
  DSA header, or attached in other ways to the packet), or out-of-band
  (through another RX timestamping FIFO). Deferral on RX is typically
  necessary when retrieving the timestamp needs a sleepable context. In
  that case, it is the responsibility of the DSA driver to call
  ``netif_rx_ni()`` on the freshly timestamped skb.

-----------------------------[cut here]-----------------------------

>  
>  - ``.port_rxtstamp()``: The original (and only) timestampable skb is provided
>    to the driver, for it to annotate it with a timestamp, if that is immediately
> diff --git a/drivers/net/dsa/hirschmann/hellcreek_hwtstamp.c b/drivers/net/dsa/hirschmann/hellcreek_hwtstamp.c
> index 69dd9a2e8bb6..2ff4b7c08b72 100644
> --- a/drivers/net/dsa/hirschmann/hellcreek_hwtstamp.c
> +++ b/drivers/net/dsa/hirschmann/hellcreek_hwtstamp.c
> @@ -374,31 +374,39 @@ long hellcreek_hwtstamp_work(struct ptp_clock_info *ptp)
>  }
>  
>  bool hellcreek_port_txtstamp(struct dsa_switch *ds, int port,
> -			     struct sk_buff *clone, unsigned int type)
> +			     struct sk_buff *skb, struct sk_buff **clone)
>  {
>  	struct hellcreek *hellcreek = ds->priv;
>  	struct hellcreek_port_hwtstamp *ps;
>  	struct ptp_header *hdr;
> +	unsigned int type;
>  
>  	ps = &hellcreek->ports[port].port_hwtstamp;
>  
> -	/* Check if the driver is expected to do HW timestamping */
> -	if (!(skb_shinfo(clone)->tx_flags & SKBTX_HW_TSTAMP))
> +	type = ptp_classify_raw(skb);
> +	if (type == PTP_CLASS_NONE)
>  		return false;
>  
>  	/* Make sure the message is a PTP message that needs to be timestamped
>  	 * and the interaction with the HW timestamping is enabled. If not, stop
>  	 * here
>  	 */
> -	hdr = hellcreek_should_tstamp(hellcreek, port, clone, type);
> +	hdr = hellcreek_should_tstamp(hellcreek, port, skb, type);
>  	if (!hdr)
>  		return false;
>  
> +	*clone = skb_clone_sk(skb);
> +	if (!(*clone))
> +		return false;
> +
>  	if (test_and_set_bit_lock(HELLCREEK_HWTSTAMP_TX_IN_PROGRESS,
> -				  &ps->state))
> +				  &ps->state)) {
> +		kfree_skb(*clone);
> +		*clone = NULL;
>  		return false;
> +	}
>  
> -	ps->tx_skb = clone;
> +	ps->tx_skb = *clone;
>  
>  	/* store the number of ticks occurred since system start-up till this
>  	 * moment
> diff --git a/drivers/net/dsa/hirschmann/hellcreek_hwtstamp.h b/drivers/net/dsa/hirschmann/hellcreek_hwtstamp.h
> index c0745ffa1ebb..58cc96642076 100644
> --- a/drivers/net/dsa/hirschmann/hellcreek_hwtstamp.h
> +++ b/drivers/net/dsa/hirschmann/hellcreek_hwtstamp.h
> @@ -45,7 +45,7 @@ int hellcreek_port_hwtstamp_get(struct dsa_switch *ds, int port,
>  bool hellcreek_port_rxtstamp(struct dsa_switch *ds, int port,
>  			     struct sk_buff *clone, unsigned int type);
>  bool hellcreek_port_txtstamp(struct dsa_switch *ds, int port,
> -			     struct sk_buff *clone, unsigned int type);
> +			     struct sk_buff *skb, struct sk_buff **clone);
>  
>  int hellcreek_get_ts_info(struct dsa_switch *ds, int port,
>  			  struct ethtool_ts_info *info);
> diff --git a/drivers/net/dsa/mv88e6xxx/hwtstamp.c b/drivers/net/dsa/mv88e6xxx/hwtstamp.c
> index 094d17a1d037..280a95962861 100644
> --- a/drivers/net/dsa/mv88e6xxx/hwtstamp.c
> +++ b/drivers/net/dsa/mv88e6xxx/hwtstamp.c
> @@ -469,24 +469,33 @@ long mv88e6xxx_hwtstamp_work(struct ptp_clock_info *ptp)
>  }
>  
>  bool mv88e6xxx_port_txtstamp(struct dsa_switch *ds, int port,
> -			     struct sk_buff *clone, unsigned int type)
> +			     struct sk_buff *skb, struct sk_buff **clone)
>  {
>  	struct mv88e6xxx_chip *chip = ds->priv;
>  	struct mv88e6xxx_port_hwtstamp *ps = &chip->port_hwtstamp[port];
>  	struct ptp_header *hdr;
> +	unsigned int type;
>  
> -	if (!(skb_shinfo(clone)->tx_flags & SKBTX_HW_TSTAMP))
> -		return false;
> +	type = ptp_classify_raw(skb);
> +	if (type == PTP_CLASS_NONE)
> +		return false
>  
> -	hdr = mv88e6xxx_should_tstamp(chip, port, clone, type);
> +	hdr = mv88e6xxx_should_tstamp(chip, port, skb, type);
>  	if (!hdr)
>  		return false;
>  
> +	*clone = skb_clone_sk(skb);
> +	if (!(*clone))
> +		return false;
> +
>  	if (test_and_set_bit_lock(MV88E6XXX_HWTSTAMP_TX_IN_PROGRESS,
> -				  &ps->state))
> +				  &ps->state)) {
> +		kfree_skb(*clone);
> +		*clone = NULL;
>  		return false;
> +	}
>  
> -	ps->tx_skb = clone;
> +	ps->tx_skb = *clone;
>  	ps->tx_tstamp_start = jiffies;
>  	ps->tx_seq_id = be16_to_cpu(hdr->sequence_id);
>  
> diff --git a/drivers/net/dsa/mv88e6xxx/hwtstamp.h b/drivers/net/dsa/mv88e6xxx/hwtstamp.h
> index 9da9f197ba02..da2b253334d0 100644
> --- a/drivers/net/dsa/mv88e6xxx/hwtstamp.h
> +++ b/drivers/net/dsa/mv88e6xxx/hwtstamp.h
> @@ -118,7 +118,7 @@ int mv88e6xxx_port_hwtstamp_get(struct dsa_switch *ds, int port,
>  bool mv88e6xxx_port_rxtstamp(struct dsa_switch *ds, int port,
>  			     struct sk_buff *clone, unsigned int type);
>  bool mv88e6xxx_port_txtstamp(struct dsa_switch *ds, int port,
> -			     struct sk_buff *clone, unsigned int type);
> +			     struct sk_buff *skb, struct sk_buff **clone);
>  
>  int mv88e6xxx_get_ts_info(struct dsa_switch *ds, int port,
>  			  struct ethtool_ts_info *info);
> @@ -152,8 +152,8 @@ static inline bool mv88e6xxx_port_rxtstamp(struct dsa_switch *ds, int port,
>  }
>  
>  static inline bool mv88e6xxx_port_txtstamp(struct dsa_switch *ds, int port,
> -					   struct sk_buff *clone,
> -					   unsigned int type)
> +					   struct sk_buff *skb,
> +					   struct sk_buff **clone)
>  {
>  	return false;
>  }
> diff --git a/drivers/net/dsa/ocelot/felix.c b/drivers/net/dsa/ocelot/felix.c
> index 6b5442be0230..cdec2f5e271c 100644
> --- a/drivers/net/dsa/ocelot/felix.c
> +++ b/drivers/net/dsa/ocelot/felix.c
> @@ -1396,14 +1396,17 @@ static bool felix_rxtstamp(struct dsa_switch *ds, int port,
>  }
>  
>  static bool felix_txtstamp(struct dsa_switch *ds, int port,
> -			   struct sk_buff *clone, unsigned int type)
> +			   struct sk_buff *skb, struct sk_buff **clone)
>  {
>  	struct ocelot *ocelot = ds->priv;
>  	struct ocelot_port *ocelot_port = ocelot->ports[port];
>  
> -	if (ocelot->ptp && (skb_shinfo(clone)->tx_flags & SKBTX_HW_TSTAMP) &&
> -	    ocelot_port->ptp_cmd == IFH_REW_OP_TWO_STEP_PTP) {
> -		ocelot_port_add_txtstamp_skb(ocelot, port, clone);
> +	if (ocelot->ptp && ocelot_port->ptp_cmd == IFH_REW_OP_TWO_STEP_PTP) {
> +		*clone = skb_clone_sk(skb);
> +		if (!(*clone))
> +			return false;
> +
> +		ocelot_port_add_txtstamp_skb(ocelot, port, *clone);
>  		return true;
>  	}
>  
> diff --git a/drivers/net/dsa/sja1105/sja1105_ptp.c b/drivers/net/dsa/sja1105/sja1105_ptp.c
> index 1b90570b257b..6a1f854a8c33 100644
> --- a/drivers/net/dsa/sja1105/sja1105_ptp.c
> +++ b/drivers/net/dsa/sja1105/sja1105_ptp.c
> @@ -436,7 +436,7 @@ bool sja1105_port_rxtstamp(struct dsa_switch *ds, int port,
>   * callback, where we will timestamp it synchronously.
>   */
>  bool sja1105_port_txtstamp(struct dsa_switch *ds, int port,
> -			   struct sk_buff *skb, unsigned int type)
> +			   struct sk_buff *skb, struct sk_buff **clone)
>  {
>  	struct sja1105_private *priv = ds->priv;
>  	struct sja1105_port *sp = &priv->ports[port];
> @@ -444,6 +444,10 @@ bool sja1105_port_txtstamp(struct dsa_switch *ds, int port,
>  	if (!sp->hwts_tx_en)
>  		return false;
>  
> +	*clone = skb_clone_sk(skb);
> +	if (!(*clone))
> +		return false;
> +
>  	return true;
>  }
>  
> diff --git a/drivers/net/dsa/sja1105/sja1105_ptp.h b/drivers/net/dsa/sja1105/sja1105_ptp.h
> index 3daa33e98e77..ab80b73219cb 100644
> --- a/drivers/net/dsa/sja1105/sja1105_ptp.h
> +++ b/drivers/net/dsa/sja1105/sja1105_ptp.h
> @@ -105,7 +105,7 @@ bool sja1105_port_rxtstamp(struct dsa_switch *ds, int port,
>  			   struct sk_buff *skb, unsigned int type);
>  
>  bool sja1105_port_txtstamp(struct dsa_switch *ds, int port,
> -			   struct sk_buff *skb, unsigned int type);
> +			   struct sk_buff *skb, struct sk_buff **clone);
>  
>  int sja1105_hwtstamp_get(struct dsa_switch *ds, int port, struct ifreq *ifr);
>  
> diff --git a/include/net/dsa.h b/include/net/dsa.h
> index 1259b0f40684..c8415c324e27 100644
> --- a/include/net/dsa.h
> +++ b/include/net/dsa.h
> @@ -734,7 +734,7 @@ struct dsa_switch_ops {
>  	int	(*port_hwtstamp_set)(struct dsa_switch *ds, int port,
>  				     struct ifreq *ifr);
>  	bool	(*port_txtstamp)(struct dsa_switch *ds, int port,
> -				 struct sk_buff *clone, unsigned int type);
> +				 struct sk_buff *skb, struct sk_buff **clone);

How about not passing "clone" back to DSA as an argument by reference,
but instead require the driver to populate DSA_SKB_CB(skb)->clone if it
needs to do so?

Also, how about changing the return type to void? Returning true or
false makes no difference.

>  	bool	(*port_rxtstamp)(struct dsa_switch *ds, int port,
>  				 struct sk_buff *skb, unsigned int type);
>  
> diff --git a/net/dsa/slave.c b/net/dsa/slave.c
> index 9300cb66e500..5b746a903ef4 100644
> --- a/net/dsa/slave.c
> +++ b/net/dsa/slave.c
> @@ -19,7 +19,6 @@
>  #include <linux/if_bridge.h>
>  #include <linux/if_hsr.h>
>  #include <linux/netpoll.h>
> -#include <linux/ptp_classify.h>
>  
>  #include "dsa_priv.h"
>  
> @@ -555,26 +554,19 @@ static void dsa_skb_tx_timestamp(struct dsa_slave_priv *p,
>  				 struct sk_buff *skb)
>  {
>  	struct dsa_switch *ds = p->dp->ds;
> -	struct sk_buff *clone;
> -	unsigned int type;
> +	struct sk_buff *clone = NULL;
>  
> -	type = ptp_classify_raw(skb);
> -	if (type == PTP_CLASS_NONE)
> +	if (!(skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP))
>  		return;
>  
>  	if (!ds->ops->port_txtstamp)
>  		return;
>  
> -	clone = skb_clone_sk(skb);
> -	if (!clone)
> +	if (!ds->ops->port_txtstamp(ds, p->dp->index, skb, &clone))
>  		return;
>  
> -	if (ds->ops->port_txtstamp(ds, p->dp->index, clone, type)) {
> +	if (clone)
>  		DSA_SKB_CB(skb)->clone = clone;
> -		return;
> -	}
> -
> -	kfree_skb(clone);
>  }
>  
>  netdev_tx_t dsa_enqueue_skb(struct sk_buff *skb, struct net_device *dev)
> @@ -628,9 +620,7 @@ static netdev_tx_t dsa_slave_xmit(struct sk_buff *skb, struct net_device *dev)
>  
>  	DSA_SKB_CB(skb)->clone = NULL;
>  
> -	/* Identify PTP protocol packets, clone them, and pass them to the
> -	 * switch driver
> -	 */
> +	/* Handle tx timestamp request if has */

s/if has/if any/

>  	dsa_skb_tx_timestamp(p, skb);
>  
>  	if (dsa_realloc_skb(skb, dev)) {
> -- 
> 2.25.1
>
Richard Cochran April 18, 2021, 3:06 p.m. UTC | #2
On Fri, Apr 16, 2021 at 08:36:53PM +0800, Yangbo Lu wrote:
> Optimization could be done on dsa_skb_tx_timestamp(), and dsa device
> drivers should adapt to it.
> 
> - Check SKBTX_HW_TSTAMP request flag at the very beginning, instead of in
>   port_txtstamp, so that most skbs not requiring tx timestamp just return.
> 
> - No longer to identify PTP packets, and limit tx timestamping only for PTP
>   packets. If device driver likes, let device driver do.
> 
> - It is a waste to clone skb directly in dsa_skb_tx_timestamp().
>   For one-step timestamping, a clone is not needed. For any failure of
>   port_txtstamp (this may usually happen), the skb clone has to be freed.
>   So put skb cloning into port_txtstamp where it really needs.

This patch changes three things ^^^ at once.  These are AFAICT
independent changes, and so please break this one patch into three for
easier review.

Thanks,
Richard
Richard Cochran April 18, 2021, 3:08 p.m. UTC | #3
On Fri, Apr 16, 2021 at 08:36:53PM +0800, Yangbo Lu wrote:
> @@ -628,9 +620,7 @@ static netdev_tx_t dsa_slave_xmit(struct sk_buff *skb, struct net_device *dev)
>  
>  	DSA_SKB_CB(skb)->clone = NULL;
>  
> -	/* Identify PTP protocol packets, clone them, and pass them to the
> -	 * switch driver
> -	 */
> +	/* Handle tx timestamp request if has */

"if has" what?

>  	dsa_skb_tx_timestamp(p, skb);
>  
>  	if (dsa_realloc_skb(skb, dev)) {
> -- 
> 2.25.1
> 

Thanks,
Richard
Richard Cochran April 18, 2021, 3:11 p.m. UTC | #4
On Sun, Apr 18, 2021 at 12:18:42PM +0300, Vladimir Oltean wrote:
> 
> How about not passing "clone" back to DSA as an argument by reference,
> but instead require the driver to populate DSA_SKB_CB(skb)->clone if it
> needs to do so?
> 
> Also, how about changing the return type to void? Returning true or
> false makes no difference.

+1

Thanks,
Richard
Kurt Kanzenbach April 19, 2021, 6:46 a.m. UTC | #5
On Fri Apr 16 2021, Yangbo Lu wrote:
> Optimization could be done on dsa_skb_tx_timestamp(), and dsa device
> drivers should adapt to it.
>
> - Check SKBTX_HW_TSTAMP request flag at the very beginning, instead of in
>   port_txtstamp, so that most skbs not requiring tx timestamp just return.
>
> - No longer to identify PTP packets, and limit tx timestamping only for PTP
>   packets. If device driver likes, let device driver do.
>
> - It is a waste to clone skb directly in dsa_skb_tx_timestamp().
>   For one-step timestamping, a clone is not needed. For any failure of
>   port_txtstamp (this may usually happen), the skb clone has to be freed.
>   So put skb cloning into port_txtstamp where it really needs.
>
> Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>

PTP still works.

Tested-by: Kurt Kanzenbach <kurt@linutronix.de> # hellcreek
Y.b. Lu April 20, 2021, 7:39 a.m. UTC | #6
> -----Original Message-----
> From: Richard Cochran <richardcochran@gmail.com>
> Sent: 2021年4月18日 23:11
> To: Vladimir Oltean <olteanv@gmail.com>
> Cc: Y.b. Lu <yangbo.lu@nxp.com>; netdev@vger.kernel.org; Vladimir Oltean
> <vladimir.oltean@nxp.com>; David S . Miller <davem@davemloft.net>; Jakub
> Kicinski <kuba@kernel.org>; Jonathan Corbet <corbet@lwn.net>; Kurt
> Kanzenbach <kurt@linutronix.de>; Andrew Lunn <andrew@lunn.ch>; Vivien
> Didelot <vivien.didelot@gmail.com>; Florian Fainelli <f.fainelli@gmail.com>;
> Claudiu Manoil <claudiu.manoil@nxp.com>; Alexandre Belloni
> <alexandre.belloni@bootlin.com>; UNGLinuxDriver@microchip.com;
> linux-doc@vger.kernel.org; linux-kernel@vger.kernel.org
> Subject: Re: [net-next 1/3] net: dsa: optimize tx timestamp request handling
> 
> On Sun, Apr 18, 2021 at 12:18:42PM +0300, Vladimir Oltean wrote:
> >
> > How about not passing "clone" back to DSA as an argument by reference,
> > but instead require the driver to populate DSA_SKB_CB(skb)->clone if
> > it needs to do so?
> >
> > Also, how about changing the return type to void? Returning true or
> > false makes no difference.

Thank you. That's good idea.
And how about letting driver store the skb clone pointer, or not? I copied my comments in patch #3 here,

" Can we totally drop dsa_skb_cb in dsa core? The only usage of it is holding a skb clone pointer, for only felix and sja1105.
Actually we can move such pointer in <device>_skb_cb, instead of reserving the space of skb for any drivers."

> 
> +1
> 
> Thanks,
> Richard
Y.b. Lu April 20, 2021, 7:40 a.m. UTC | #7
> -----Original Message-----
> From: Richard Cochran <richardcochran@gmail.com>
> Sent: 2021年4月18日 23:06
> To: Y.b. Lu <yangbo.lu@nxp.com>
> Cc: netdev@vger.kernel.org; Vladimir Oltean <vladimir.oltean@nxp.com>;
> David S . Miller <davem@davemloft.net>; Jakub Kicinski <kuba@kernel.org>;
> Jonathan Corbet <corbet@lwn.net>; Kurt Kanzenbach <kurt@linutronix.de>;
> Andrew Lunn <andrew@lunn.ch>; Vivien Didelot <vivien.didelot@gmail.com>;
> Florian Fainelli <f.fainelli@gmail.com>; Claudiu Manoil
> <claudiu.manoil@nxp.com>; Alexandre Belloni
> <alexandre.belloni@bootlin.com>; UNGLinuxDriver@microchip.com;
> linux-doc@vger.kernel.org; linux-kernel@vger.kernel.org
> Subject: Re: [net-next 1/3] net: dsa: optimize tx timestamp request handling
> 
> On Fri, Apr 16, 2021 at 08:36:53PM +0800, Yangbo Lu wrote:
> > Optimization could be done on dsa_skb_tx_timestamp(), and dsa device
> > drivers should adapt to it.
> >
> > - Check SKBTX_HW_TSTAMP request flag at the very beginning, instead of in
> >   port_txtstamp, so that most skbs not requiring tx timestamp just return.
> >
> > - No longer to identify PTP packets, and limit tx timestamping only for PTP
> >   packets. If device driver likes, let device driver do.
> >
> > - It is a waste to clone skb directly in dsa_skb_tx_timestamp().
> >   For one-step timestamping, a clone is not needed. For any failure of
> >   port_txtstamp (this may usually happen), the skb clone has to be freed.
> >   So put skb cloning into port_txtstamp where it really needs.
> 
> This patch changes three things ^^^ at once.  These are AFAICT independent
> changes, and so please break this one patch into three for easier review.

Will split it in next version. Thank you.

> 
> Thanks,
> Richard
Y.b. Lu April 20, 2021, 7:48 a.m. UTC | #8
> -----Original Message-----
> From: Vladimir Oltean <olteanv@gmail.com>
> Sent: 2021年4月18日 17:19
> To: Y.b. Lu <yangbo.lu@nxp.com>
> Cc: netdev@vger.kernel.org; Richard Cochran <richardcochran@gmail.com>;
> Vladimir Oltean <vladimir.oltean@nxp.com>; David S . Miller
> <davem@davemloft.net>; Jakub Kicinski <kuba@kernel.org>; Jonathan Corbet
> <corbet@lwn.net>; Kurt Kanzenbach <kurt@linutronix.de>; Andrew Lunn
> <andrew@lunn.ch>; Vivien Didelot <vivien.didelot@gmail.com>; Florian
> Fainelli <f.fainelli@gmail.com>; Claudiu Manoil <claudiu.manoil@nxp.com>;
> Alexandre Belloni <alexandre.belloni@bootlin.com>;
> UNGLinuxDriver@microchip.com; linux-doc@vger.kernel.org;
> linux-kernel@vger.kernel.org
> Subject: Re: [net-next 1/3] net: dsa: optimize tx timestamp request handling
> 
> On Fri, Apr 16, 2021 at 08:36:53PM +0800, Yangbo Lu wrote:
> > Optimization could be done on dsa_skb_tx_timestamp(), and dsa device
> > drivers should adapt to it.
> >
> > - Check SKBTX_HW_TSTAMP request flag at the very beginning, instead of in
> >   port_txtstamp, so that most skbs not requiring tx timestamp just return.
> 
> Agree that this is a trivial performance optimization with no downside that we
> should be making.
> 
> > - No longer to identify PTP packets, and limit tx timestamping only for PTP
> >   packets. If device driver likes, let device driver do.
> 
> Agree that DSA has a way too heavy hand in imposing upon the driver which
> packets should be timestampable and which ones shouldn't.
> 
> For example, I have a latency measurement tool called isochron which is based
> on hardware timestamping of non-PTP packets (in order to not disturb the PTP
> state machines):
> https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.
> com%2Fvladimiroltean%2Ftsn-scripts&amp;data=04%7C01%7Cyangbo.lu%40
> nxp.com%7C3772f63deb6f4491933208d9024af750%7C686ea1d3bc2b4c6fa9
> 2cd99c5c301635%7C0%7C0%7C637543343267914018%7CUnknown%7CTW
> FpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXV
> CI6Mn0%3D%7C1000&amp;sdata=deOn03j6biPEUwppNkv%2F9if1u5HIdLEtzz
> AkWW0p1rc%3D&amp;reserved=0
> 
> I can't use it on DSA interfaces, for rather artificial reasons.
> 
> > - It is a waste to clone skb directly in dsa_skb_tx_timestamp().
> >   For one-step timestamping, a clone is not needed. For any failure of
> >   port_txtstamp (this may usually happen), the skb clone has to be freed.
> >   So put skb cloning into port_txtstamp where it really needs.
> 
> Mostly agree. For two-step timestamping, it is an operation which all drivers
> need to do, so it is in the common potion. If we want to support one-step, we
> need to avoid cloning the PTP packets.
> 

Thanks a lot for your ACK.

> > Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
> > ---
> >  Documentation/networking/timestamping.rst     |  7 +++++--
> >  .../net/dsa/hirschmann/hellcreek_hwtstamp.c   | 20 ++++++++++++------
> >  .../net/dsa/hirschmann/hellcreek_hwtstamp.h   |  2 +-
> >  drivers/net/dsa/mv88e6xxx/hwtstamp.c          | 21
> +++++++++++++------
> >  drivers/net/dsa/mv88e6xxx/hwtstamp.h          |  6 +++---
> >  drivers/net/dsa/ocelot/felix.c                | 11 ++++++----
> >  drivers/net/dsa/sja1105/sja1105_ptp.c         |  6 +++++-
> >  drivers/net/dsa/sja1105/sja1105_ptp.h         |  2 +-
> >  include/net/dsa.h                             |  2 +-
> >  net/dsa/slave.c                               | 20 +++++-------------
> >  10 files changed, 57 insertions(+), 40 deletions(-)
> >
> > diff --git a/Documentation/networking/timestamping.rst
> > b/Documentation/networking/timestamping.rst
> > index f682e88fa87e..7f04a699a5d1 100644
> > --- a/Documentation/networking/timestamping.rst
> > +++ b/Documentation/networking/timestamping.rst
> > @@ -635,8 +635,8 @@ in generic code: a BPF classifier
> > (``ptp_classify_raw``) is used to identify  PTP event messages (any
> > other packets, including PTP general messages, are not  timestamped), and
> provides two hooks to drivers:
> >
> > -- ``.port_txtstamp()``: The driver is passed a clone of the
> > timestampable skb
> > -  to be transmitted, before actually transmitting it. Typically, a
> > switch will
> > +- ``.port_txtstamp()``: A clone of the timestampable skb to be
> > +transmitted
> > +  is needed, before actually transmitting it. Typically, a switch
> > +will
> >    have a PTP TX timestamp register (or sometimes a FIFO) where the
> timestamp
> >    becomes available. There may be an IRQ that is raised upon this
> timestamp's
> >    availability, or the driver might have to poll after invoking @@
> > -645,6 +645,9 @@ timestamped), and provides two hooks to drivers:
> >    later use (when the timestamp becomes available). Each skb is annotated
> with
> >    a pointer to its clone, in ``DSA_SKB_CB(skb)->clone``, to ease the driver's
> >    job of keeping track of which clone belongs to which skb.
> > +  But one-step timestamping request is handled differently with above
> > + two-step  timestamping. The skb clone is no longer needed since
> > + hardware will insert  TX time information on packet during egress.
> 
> Bonus points for updating the documentation, but I don't quite like the end
> result. Please feel free to restructure more, in order to have a clearer and more
> coherent explanation.
> 
> Also, this paragraph from right above is no longer true:
> 
> 	In code, DSA provides for most of the infrastructure for timestamping
> already,
> 	in generic code: a BPF classifier (``ptp_classify_raw``) is used to identify
> 	PTP event messages (any other packets, including PTP general messages,
> are not
> 	timestamped), and provides two hooks to drivers:
> 
> It's nothing like that anymore. It's more of a passthrough now with your
> changes, the BPF classifier is not run by the DSA core but optionally by
> individual taggers.
> 
> Here is my attempt of rewriting this documentation paragraph, feel free to
> take which parts you consider relevant:
> 
> -----------------------------[cut here]-----------------------------
> 
> In the generic layer, DSA provides the following infrastructure for PTP
> timestamping:
> 
> - ``.port_txtstamp()``: a hook called prior to the transmission of
>   packets with a hardware TX timestamping request from user space.
>   This is required for two-step timestamping, since the hardware
>   timestamp becomes available after the actual MAC transmission, so the
>   driver must be prepared to correlate the timestamp with the original
>   packet so that it can re-enqueue the packet back into the socket's
>   error queue. To save the packet for when the timestamp becomes
>   available, the driver can call ``skb_clone_sk`` and save the resulting
>   clone in ``DSA_SKB_CB(skb)->clone``. Typically, a switch will have a
>   PTP TX timestamp register (or sometimes a FIFO) where the timestamp
>   becomes available. In case of a FIFO, the hardware might store
>   key-value pairs of PTP sequence ID/message type/domain number and the
>   actual timestamp. To perform the correlation correctly between the
>   packets in a queue waiting for timestamping and the actual timestamps,
>   drivers can use a BPF classifier (``ptp_classify_raw``) to identify
>   the PTP transport type, and ``ptp_parse_header`` to interpret the PTP
>   header fields. There may be an IRQ that is raised upon this
>   timestamp's availability, or the driver might have to poll after
>   invoking ``dev_queue_xmit()`` towards the host interface.
>   One-step TX timestamping do not require packet cloning, since there is
>   no follow-up message required by the PTP protocol (because the
>   TX timestamp is embedded into the packet by the MAC), and therefore
>   user space does not expect the packet annotated with the TX timestamp
>   to be re-enqueued into its socket's error queue.
> 
> - ``.port_rxtstamp()``: On RX, the BPF classifier is run by DSA to
>   identify PTP event messages (any other packets, including PTP general
>   messages, are not timestamped). The original (and only) timestampable
>   skb is provided to the driver, for it to annotate it with a timestamp,
>   if that is immediately available, or defer to later. On reception,
>   timestamps might either be available in-band (through metadata in the
>   DSA header, or attached in other ways to the packet), or out-of-band
>   (through another RX timestamping FIFO). Deferral on RX is typically
>   necessary when retrieving the timestamp needs a sleepable context. In
>   that case, it is the responsibility of the DSA driver to call
>   ``netif_rx_ni()`` on the freshly timestamped skb.
> 
> -----------------------------[cut here]-----------------------------
> 

That's really considerate and accuracy description. I will update with this. (Will adjust if we can drop dsa_skb_cb in dsa core.)
Thanks.

> >
> >  - ``.port_rxtstamp()``: The original (and only) timestampable skb is provided
> >    to the driver, for it to annotate it with a timestamp, if that is
> > immediately diff --git
> > a/drivers/net/dsa/hirschmann/hellcreek_hwtstamp.c
> > b/drivers/net/dsa/hirschmann/hellcreek_hwtstamp.c
> > index 69dd9a2e8bb6..2ff4b7c08b72 100644
> > --- a/drivers/net/dsa/hirschmann/hellcreek_hwtstamp.c
> > +++ b/drivers/net/dsa/hirschmann/hellcreek_hwtstamp.c
> > @@ -374,31 +374,39 @@ long hellcreek_hwtstamp_work(struct
> > ptp_clock_info *ptp)  }
> >
> >  bool hellcreek_port_txtstamp(struct dsa_switch *ds, int port,
> > -			     struct sk_buff *clone, unsigned int type)
> > +			     struct sk_buff *skb, struct sk_buff **clone)
> >  {
> >  	struct hellcreek *hellcreek = ds->priv;
> >  	struct hellcreek_port_hwtstamp *ps;
> >  	struct ptp_header *hdr;
> > +	unsigned int type;
> >
> >  	ps = &hellcreek->ports[port].port_hwtstamp;
> >
> > -	/* Check if the driver is expected to do HW timestamping */
> > -	if (!(skb_shinfo(clone)->tx_flags & SKBTX_HW_TSTAMP))
> > +	type = ptp_classify_raw(skb);
> > +	if (type == PTP_CLASS_NONE)
> >  		return false;
> >
> >  	/* Make sure the message is a PTP message that needs to be
> timestamped
> >  	 * and the interaction with the HW timestamping is enabled. If not, stop
> >  	 * here
> >  	 */
> > -	hdr = hellcreek_should_tstamp(hellcreek, port, clone, type);
> > +	hdr = hellcreek_should_tstamp(hellcreek, port, skb, type);
> >  	if (!hdr)
> >  		return false;
> >
> > +	*clone = skb_clone_sk(skb);
> > +	if (!(*clone))
> > +		return false;
> > +
> >  	if (test_and_set_bit_lock(HELLCREEK_HWTSTAMP_TX_IN_PROGRESS,
> > -				  &ps->state))
> > +				  &ps->state)) {
> > +		kfree_skb(*clone);
> > +		*clone = NULL;
> >  		return false;
> > +	}
> >
> > -	ps->tx_skb = clone;
> > +	ps->tx_skb = *clone;
> >
> >  	/* store the number of ticks occurred since system start-up till this
> >  	 * moment
> > diff --git a/drivers/net/dsa/hirschmann/hellcreek_hwtstamp.h
> > b/drivers/net/dsa/hirschmann/hellcreek_hwtstamp.h
> > index c0745ffa1ebb..58cc96642076 100644
> > --- a/drivers/net/dsa/hirschmann/hellcreek_hwtstamp.h
> > +++ b/drivers/net/dsa/hirschmann/hellcreek_hwtstamp.h
> > @@ -45,7 +45,7 @@ int hellcreek_port_hwtstamp_get(struct dsa_switch
> > *ds, int port,  bool hellcreek_port_rxtstamp(struct dsa_switch *ds, int port,
> >  			     struct sk_buff *clone, unsigned int type);  bool
> > hellcreek_port_txtstamp(struct dsa_switch *ds, int port,
> > -			     struct sk_buff *clone, unsigned int type);
> > +			     struct sk_buff *skb, struct sk_buff **clone);
> >
> >  int hellcreek_get_ts_info(struct dsa_switch *ds, int port,
> >  			  struct ethtool_ts_info *info);
> > diff --git a/drivers/net/dsa/mv88e6xxx/hwtstamp.c
> > b/drivers/net/dsa/mv88e6xxx/hwtstamp.c
> > index 094d17a1d037..280a95962861 100644
> > --- a/drivers/net/dsa/mv88e6xxx/hwtstamp.c
> > +++ b/drivers/net/dsa/mv88e6xxx/hwtstamp.c
> > @@ -469,24 +469,33 @@ long mv88e6xxx_hwtstamp_work(struct
> > ptp_clock_info *ptp)  }
> >
> >  bool mv88e6xxx_port_txtstamp(struct dsa_switch *ds, int port,
> > -			     struct sk_buff *clone, unsigned int type)
> > +			     struct sk_buff *skb, struct sk_buff **clone)
> >  {
> >  	struct mv88e6xxx_chip *chip = ds->priv;
> >  	struct mv88e6xxx_port_hwtstamp *ps = &chip->port_hwtstamp[port];
> >  	struct ptp_header *hdr;
> > +	unsigned int type;
> >
> > -	if (!(skb_shinfo(clone)->tx_flags & SKBTX_HW_TSTAMP))
> > -		return false;
> > +	type = ptp_classify_raw(skb);
> > +	if (type == PTP_CLASS_NONE)
> > +		return false
> >
> > -	hdr = mv88e6xxx_should_tstamp(chip, port, clone, type);
> > +	hdr = mv88e6xxx_should_tstamp(chip, port, skb, type);
> >  	if (!hdr)
> >  		return false;
> >
> > +	*clone = skb_clone_sk(skb);
> > +	if (!(*clone))
> > +		return false;
> > +
> >  	if (test_and_set_bit_lock(MV88E6XXX_HWTSTAMP_TX_IN_PROGRESS,
> > -				  &ps->state))
> > +				  &ps->state)) {
> > +		kfree_skb(*clone);
> > +		*clone = NULL;
> >  		return false;
> > +	}
> >
> > -	ps->tx_skb = clone;
> > +	ps->tx_skb = *clone;
> >  	ps->tx_tstamp_start = jiffies;
> >  	ps->tx_seq_id = be16_to_cpu(hdr->sequence_id);
> >
> > diff --git a/drivers/net/dsa/mv88e6xxx/hwtstamp.h
> > b/drivers/net/dsa/mv88e6xxx/hwtstamp.h
> > index 9da9f197ba02..da2b253334d0 100644
> > --- a/drivers/net/dsa/mv88e6xxx/hwtstamp.h
> > +++ b/drivers/net/dsa/mv88e6xxx/hwtstamp.h
> > @@ -118,7 +118,7 @@ int mv88e6xxx_port_hwtstamp_get(struct
> dsa_switch
> > *ds, int port,  bool mv88e6xxx_port_rxtstamp(struct dsa_switch *ds, int
> port,
> >  			     struct sk_buff *clone, unsigned int type);  bool
> > mv88e6xxx_port_txtstamp(struct dsa_switch *ds, int port,
> > -			     struct sk_buff *clone, unsigned int type);
> > +			     struct sk_buff *skb, struct sk_buff **clone);
> >
> >  int mv88e6xxx_get_ts_info(struct dsa_switch *ds, int port,
> >  			  struct ethtool_ts_info *info);
> > @@ -152,8 +152,8 @@ static inline bool mv88e6xxx_port_rxtstamp(struct
> > dsa_switch *ds, int port,  }
> >
> >  static inline bool mv88e6xxx_port_txtstamp(struct dsa_switch *ds, int port,
> > -					   struct sk_buff *clone,
> > -					   unsigned int type)
> > +					   struct sk_buff *skb,
> > +					   struct sk_buff **clone)
> >  {
> >  	return false;
> >  }
> > diff --git a/drivers/net/dsa/ocelot/felix.c
> > b/drivers/net/dsa/ocelot/felix.c index 6b5442be0230..cdec2f5e271c
> > 100644
> > --- a/drivers/net/dsa/ocelot/felix.c
> > +++ b/drivers/net/dsa/ocelot/felix.c
> > @@ -1396,14 +1396,17 @@ static bool felix_rxtstamp(struct dsa_switch
> > *ds, int port,  }
> >
> >  static bool felix_txtstamp(struct dsa_switch *ds, int port,
> > -			   struct sk_buff *clone, unsigned int type)
> > +			   struct sk_buff *skb, struct sk_buff **clone)
> >  {
> >  	struct ocelot *ocelot = ds->priv;
> >  	struct ocelot_port *ocelot_port = ocelot->ports[port];
> >
> > -	if (ocelot->ptp && (skb_shinfo(clone)->tx_flags & SKBTX_HW_TSTAMP)
> &&
> > -	    ocelot_port->ptp_cmd == IFH_REW_OP_TWO_STEP_PTP) {
> > -		ocelot_port_add_txtstamp_skb(ocelot, port, clone);
> > +	if (ocelot->ptp && ocelot_port->ptp_cmd ==
> IFH_REW_OP_TWO_STEP_PTP) {
> > +		*clone = skb_clone_sk(skb);
> > +		if (!(*clone))
> > +			return false;
> > +
> > +		ocelot_port_add_txtstamp_skb(ocelot, port, *clone);
> >  		return true;
> >  	}
> >
> > diff --git a/drivers/net/dsa/sja1105/sja1105_ptp.c
> > b/drivers/net/dsa/sja1105/sja1105_ptp.c
> > index 1b90570b257b..6a1f854a8c33 100644
> > --- a/drivers/net/dsa/sja1105/sja1105_ptp.c
> > +++ b/drivers/net/dsa/sja1105/sja1105_ptp.c
> > @@ -436,7 +436,7 @@ bool sja1105_port_rxtstamp(struct dsa_switch *ds,
> int port,
> >   * callback, where we will timestamp it synchronously.
> >   */
> >  bool sja1105_port_txtstamp(struct dsa_switch *ds, int port,
> > -			   struct sk_buff *skb, unsigned int type)
> > +			   struct sk_buff *skb, struct sk_buff **clone)
> >  {
> >  	struct sja1105_private *priv = ds->priv;
> >  	struct sja1105_port *sp = &priv->ports[port]; @@ -444,6 +444,10 @@
> > bool sja1105_port_txtstamp(struct dsa_switch *ds, int port,
> >  	if (!sp->hwts_tx_en)
> >  		return false;
> >
> > +	*clone = skb_clone_sk(skb);
> > +	if (!(*clone))
> > +		return false;
> > +
> >  	return true;
> >  }
> >
> > diff --git a/drivers/net/dsa/sja1105/sja1105_ptp.h
> > b/drivers/net/dsa/sja1105/sja1105_ptp.h
> > index 3daa33e98e77..ab80b73219cb 100644
> > --- a/drivers/net/dsa/sja1105/sja1105_ptp.h
> > +++ b/drivers/net/dsa/sja1105/sja1105_ptp.h
> > @@ -105,7 +105,7 @@ bool sja1105_port_rxtstamp(struct dsa_switch *ds,
> int port,
> >  			   struct sk_buff *skb, unsigned int type);
> >
> >  bool sja1105_port_txtstamp(struct dsa_switch *ds, int port,
> > -			   struct sk_buff *skb, unsigned int type);
> > +			   struct sk_buff *skb, struct sk_buff **clone);
> >
> >  int sja1105_hwtstamp_get(struct dsa_switch *ds, int port, struct
> > ifreq *ifr);
> >
> > diff --git a/include/net/dsa.h b/include/net/dsa.h index
> > 1259b0f40684..c8415c324e27 100644
> > --- a/include/net/dsa.h
> > +++ b/include/net/dsa.h
> > @@ -734,7 +734,7 @@ struct dsa_switch_ops {
> >  	int	(*port_hwtstamp_set)(struct dsa_switch *ds, int port,
> >  				     struct ifreq *ifr);
> >  	bool	(*port_txtstamp)(struct dsa_switch *ds, int port,
> > -				 struct sk_buff *clone, unsigned int type);
> > +				 struct sk_buff *skb, struct sk_buff **clone);
> 
> How about not passing "clone" back to DSA as an argument by reference, but
> instead require the driver to populate DSA_SKB_CB(skb)->clone if it needs to do
> so?
> 
> Also, how about changing the return type to void? Returning true or false
> makes no difference.
> 
> >  	bool	(*port_rxtstamp)(struct dsa_switch *ds, int port,
> >  				 struct sk_buff *skb, unsigned int type);
> >
> > diff --git a/net/dsa/slave.c b/net/dsa/slave.c index
> > 9300cb66e500..5b746a903ef4 100644
> > --- a/net/dsa/slave.c
> > +++ b/net/dsa/slave.c
> > @@ -19,7 +19,6 @@
> >  #include <linux/if_bridge.h>
> >  #include <linux/if_hsr.h>
> >  #include <linux/netpoll.h>
> > -#include <linux/ptp_classify.h>
> >
> >  #include "dsa_priv.h"
> >
> > @@ -555,26 +554,19 @@ static void dsa_skb_tx_timestamp(struct
> dsa_slave_priv *p,
> >  				 struct sk_buff *skb)
> >  {
> >  	struct dsa_switch *ds = p->dp->ds;
> > -	struct sk_buff *clone;
> > -	unsigned int type;
> > +	struct sk_buff *clone = NULL;
> >
> > -	type = ptp_classify_raw(skb);
> > -	if (type == PTP_CLASS_NONE)
> > +	if (!(skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP))
> >  		return;
> >
> >  	if (!ds->ops->port_txtstamp)
> >  		return;
> >
> > -	clone = skb_clone_sk(skb);
> > -	if (!clone)
> > +	if (!ds->ops->port_txtstamp(ds, p->dp->index, skb, &clone))
> >  		return;
> >
> > -	if (ds->ops->port_txtstamp(ds, p->dp->index, clone, type)) {
> > +	if (clone)
> >  		DSA_SKB_CB(skb)->clone = clone;
> > -		return;
> > -	}
> > -
> > -	kfree_skb(clone);
> >  }
> >
> >  netdev_tx_t dsa_enqueue_skb(struct sk_buff *skb, struct net_device
> > *dev) @@ -628,9 +620,7 @@ static netdev_tx_t dsa_slave_xmit(struct
> > sk_buff *skb, struct net_device *dev)
> >
> >  	DSA_SKB_CB(skb)->clone = NULL;
> >
> > -	/* Identify PTP protocol packets, clone them, and pass them to the
> > -	 * switch driver
> > -	 */
> > +	/* Handle tx timestamp request if has */
> 
> s/if has/if any/

Sorry. Will fix.

> 
> >  	dsa_skb_tx_timestamp(p, skb);
> >
> >  	if (dsa_realloc_skb(skb, dev)) {
> > --
> > 2.25.1
> >

Patch
diff mbox series

diff --git a/Documentation/networking/timestamping.rst b/Documentation/networking/timestamping.rst
index f682e88fa87e..7f04a699a5d1 100644
--- a/Documentation/networking/timestamping.rst
+++ b/Documentation/networking/timestamping.rst
@@ -635,8 +635,8 @@  in generic code: a BPF classifier (``ptp_classify_raw``) is used to identify
 PTP event messages (any other packets, including PTP general messages, are not
 timestamped), and provides two hooks to drivers:
 
-- ``.port_txtstamp()``: The driver is passed a clone of the timestampable skb
-  to be transmitted, before actually transmitting it. Typically, a switch will
+- ``.port_txtstamp()``: A clone of the timestampable skb to be transmitted
+  is needed, before actually transmitting it. Typically, a switch will
   have a PTP TX timestamp register (or sometimes a FIFO) where the timestamp
   becomes available. There may be an IRQ that is raised upon this timestamp's
   availability, or the driver might have to poll after invoking
@@ -645,6 +645,9 @@  timestamped), and provides two hooks to drivers:
   later use (when the timestamp becomes available). Each skb is annotated with
   a pointer to its clone, in ``DSA_SKB_CB(skb)->clone``, to ease the driver's
   job of keeping track of which clone belongs to which skb.
+  But one-step timestamping request is handled differently with above two-step
+  timestamping. The skb clone is no longer needed since hardware will insert
+  TX time information on packet during egress.
 
 - ``.port_rxtstamp()``: The original (and only) timestampable skb is provided
   to the driver, for it to annotate it with a timestamp, if that is immediately
diff --git a/drivers/net/dsa/hirschmann/hellcreek_hwtstamp.c b/drivers/net/dsa/hirschmann/hellcreek_hwtstamp.c
index 69dd9a2e8bb6..2ff4b7c08b72 100644
--- a/drivers/net/dsa/hirschmann/hellcreek_hwtstamp.c
+++ b/drivers/net/dsa/hirschmann/hellcreek_hwtstamp.c
@@ -374,31 +374,39 @@  long hellcreek_hwtstamp_work(struct ptp_clock_info *ptp)
 }
 
 bool hellcreek_port_txtstamp(struct dsa_switch *ds, int port,
-			     struct sk_buff *clone, unsigned int type)
+			     struct sk_buff *skb, struct sk_buff **clone)
 {
 	struct hellcreek *hellcreek = ds->priv;
 	struct hellcreek_port_hwtstamp *ps;
 	struct ptp_header *hdr;
+	unsigned int type;
 
 	ps = &hellcreek->ports[port].port_hwtstamp;
 
-	/* Check if the driver is expected to do HW timestamping */
-	if (!(skb_shinfo(clone)->tx_flags & SKBTX_HW_TSTAMP))
+	type = ptp_classify_raw(skb);
+	if (type == PTP_CLASS_NONE)
 		return false;
 
 	/* Make sure the message is a PTP message that needs to be timestamped
 	 * and the interaction with the HW timestamping is enabled. If not, stop
 	 * here
 	 */
-	hdr = hellcreek_should_tstamp(hellcreek, port, clone, type);
+	hdr = hellcreek_should_tstamp(hellcreek, port, skb, type);
 	if (!hdr)
 		return false;
 
+	*clone = skb_clone_sk(skb);
+	if (!(*clone))
+		return false;
+
 	if (test_and_set_bit_lock(HELLCREEK_HWTSTAMP_TX_IN_PROGRESS,
-				  &ps->state))
+				  &ps->state)) {
+		kfree_skb(*clone);
+		*clone = NULL;
 		return false;
+	}
 
-	ps->tx_skb = clone;
+	ps->tx_skb = *clone;
 
 	/* store the number of ticks occurred since system start-up till this
 	 * moment
diff --git a/drivers/net/dsa/hirschmann/hellcreek_hwtstamp.h b/drivers/net/dsa/hirschmann/hellcreek_hwtstamp.h
index c0745ffa1ebb..58cc96642076 100644
--- a/drivers/net/dsa/hirschmann/hellcreek_hwtstamp.h
+++ b/drivers/net/dsa/hirschmann/hellcreek_hwtstamp.h
@@ -45,7 +45,7 @@  int hellcreek_port_hwtstamp_get(struct dsa_switch *ds, int port,
 bool hellcreek_port_rxtstamp(struct dsa_switch *ds, int port,
 			     struct sk_buff *clone, unsigned int type);
 bool hellcreek_port_txtstamp(struct dsa_switch *ds, int port,
-			     struct sk_buff *clone, unsigned int type);
+			     struct sk_buff *skb, struct sk_buff **clone);
 
 int hellcreek_get_ts_info(struct dsa_switch *ds, int port,
 			  struct ethtool_ts_info *info);
diff --git a/drivers/net/dsa/mv88e6xxx/hwtstamp.c b/drivers/net/dsa/mv88e6xxx/hwtstamp.c
index 094d17a1d037..280a95962861 100644
--- a/drivers/net/dsa/mv88e6xxx/hwtstamp.c
+++ b/drivers/net/dsa/mv88e6xxx/hwtstamp.c
@@ -469,24 +469,33 @@  long mv88e6xxx_hwtstamp_work(struct ptp_clock_info *ptp)
 }
 
 bool mv88e6xxx_port_txtstamp(struct dsa_switch *ds, int port,
-			     struct sk_buff *clone, unsigned int type)
+			     struct sk_buff *skb, struct sk_buff **clone)
 {
 	struct mv88e6xxx_chip *chip = ds->priv;
 	struct mv88e6xxx_port_hwtstamp *ps = &chip->port_hwtstamp[port];
 	struct ptp_header *hdr;
+	unsigned int type;
 
-	if (!(skb_shinfo(clone)->tx_flags & SKBTX_HW_TSTAMP))
-		return false;
+	type = ptp_classify_raw(skb);
+	if (type == PTP_CLASS_NONE)
+		return false
 
-	hdr = mv88e6xxx_should_tstamp(chip, port, clone, type);
+	hdr = mv88e6xxx_should_tstamp(chip, port, skb, type);
 	if (!hdr)
 		return false;
 
+	*clone = skb_clone_sk(skb);
+	if (!(*clone))
+		return false;
+
 	if (test_and_set_bit_lock(MV88E6XXX_HWTSTAMP_TX_IN_PROGRESS,
-				  &ps->state))
+				  &ps->state)) {
+		kfree_skb(*clone);
+		*clone = NULL;
 		return false;
+	}
 
-	ps->tx_skb = clone;
+	ps->tx_skb = *clone;
 	ps->tx_tstamp_start = jiffies;
 	ps->tx_seq_id = be16_to_cpu(hdr->sequence_id);
 
diff --git a/drivers/net/dsa/mv88e6xxx/hwtstamp.h b/drivers/net/dsa/mv88e6xxx/hwtstamp.h
index 9da9f197ba02..da2b253334d0 100644
--- a/drivers/net/dsa/mv88e6xxx/hwtstamp.h
+++ b/drivers/net/dsa/mv88e6xxx/hwtstamp.h
@@ -118,7 +118,7 @@  int mv88e6xxx_port_hwtstamp_get(struct dsa_switch *ds, int port,
 bool mv88e6xxx_port_rxtstamp(struct dsa_switch *ds, int port,
 			     struct sk_buff *clone, unsigned int type);
 bool mv88e6xxx_port_txtstamp(struct dsa_switch *ds, int port,
-			     struct sk_buff *clone, unsigned int type);
+			     struct sk_buff *skb, struct sk_buff **clone);
 
 int mv88e6xxx_get_ts_info(struct dsa_switch *ds, int port,
 			  struct ethtool_ts_info *info);
@@ -152,8 +152,8 @@  static inline bool mv88e6xxx_port_rxtstamp(struct dsa_switch *ds, int port,
 }
 
 static inline bool mv88e6xxx_port_txtstamp(struct dsa_switch *ds, int port,
-					   struct sk_buff *clone,
-					   unsigned int type)
+					   struct sk_buff *skb,
+					   struct sk_buff **clone)
 {
 	return false;
 }
diff --git a/drivers/net/dsa/ocelot/felix.c b/drivers/net/dsa/ocelot/felix.c
index 6b5442be0230..cdec2f5e271c 100644
--- a/drivers/net/dsa/ocelot/felix.c
+++ b/drivers/net/dsa/ocelot/felix.c
@@ -1396,14 +1396,17 @@  static bool felix_rxtstamp(struct dsa_switch *ds, int port,
 }
 
 static bool felix_txtstamp(struct dsa_switch *ds, int port,
-			   struct sk_buff *clone, unsigned int type)
+			   struct sk_buff *skb, struct sk_buff **clone)
 {
 	struct ocelot *ocelot = ds->priv;
 	struct ocelot_port *ocelot_port = ocelot->ports[port];
 
-	if (ocelot->ptp && (skb_shinfo(clone)->tx_flags & SKBTX_HW_TSTAMP) &&
-	    ocelot_port->ptp_cmd == IFH_REW_OP_TWO_STEP_PTP) {
-		ocelot_port_add_txtstamp_skb(ocelot, port, clone);
+	if (ocelot->ptp && ocelot_port->ptp_cmd == IFH_REW_OP_TWO_STEP_PTP) {
+		*clone = skb_clone_sk(skb);
+		if (!(*clone))
+			return false;
+
+		ocelot_port_add_txtstamp_skb(ocelot, port, *clone);
 		return true;
 	}
 
diff --git a/drivers/net/dsa/sja1105/sja1105_ptp.c b/drivers/net/dsa/sja1105/sja1105_ptp.c
index 1b90570b257b..6a1f854a8c33 100644
--- a/drivers/net/dsa/sja1105/sja1105_ptp.c
+++ b/drivers/net/dsa/sja1105/sja1105_ptp.c
@@ -436,7 +436,7 @@  bool sja1105_port_rxtstamp(struct dsa_switch *ds, int port,
  * callback, where we will timestamp it synchronously.
  */
 bool sja1105_port_txtstamp(struct dsa_switch *ds, int port,
-			   struct sk_buff *skb, unsigned int type)
+			   struct sk_buff *skb, struct sk_buff **clone)
 {
 	struct sja1105_private *priv = ds->priv;
 	struct sja1105_port *sp = &priv->ports[port];
@@ -444,6 +444,10 @@  bool sja1105_port_txtstamp(struct dsa_switch *ds, int port,
 	if (!sp->hwts_tx_en)
 		return false;
 
+	*clone = skb_clone_sk(skb);
+	if (!(*clone))
+		return false;
+
 	return true;
 }
 
diff --git a/drivers/net/dsa/sja1105/sja1105_ptp.h b/drivers/net/dsa/sja1105/sja1105_ptp.h
index 3daa33e98e77..ab80b73219cb 100644
--- a/drivers/net/dsa/sja1105/sja1105_ptp.h
+++ b/drivers/net/dsa/sja1105/sja1105_ptp.h
@@ -105,7 +105,7 @@  bool sja1105_port_rxtstamp(struct dsa_switch *ds, int port,
 			   struct sk_buff *skb, unsigned int type);
 
 bool sja1105_port_txtstamp(struct dsa_switch *ds, int port,
-			   struct sk_buff *skb, unsigned int type);
+			   struct sk_buff *skb, struct sk_buff **clone);
 
 int sja1105_hwtstamp_get(struct dsa_switch *ds, int port, struct ifreq *ifr);
 
diff --git a/include/net/dsa.h b/include/net/dsa.h
index 1259b0f40684..c8415c324e27 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -734,7 +734,7 @@  struct dsa_switch_ops {
 	int	(*port_hwtstamp_set)(struct dsa_switch *ds, int port,
 				     struct ifreq *ifr);
 	bool	(*port_txtstamp)(struct dsa_switch *ds, int port,
-				 struct sk_buff *clone, unsigned int type);
+				 struct sk_buff *skb, struct sk_buff **clone);
 	bool	(*port_rxtstamp)(struct dsa_switch *ds, int port,
 				 struct sk_buff *skb, unsigned int type);
 
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 9300cb66e500..5b746a903ef4 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -19,7 +19,6 @@ 
 #include <linux/if_bridge.h>
 #include <linux/if_hsr.h>
 #include <linux/netpoll.h>
-#include <linux/ptp_classify.h>
 
 #include "dsa_priv.h"
 
@@ -555,26 +554,19 @@  static void dsa_skb_tx_timestamp(struct dsa_slave_priv *p,
 				 struct sk_buff *skb)
 {
 	struct dsa_switch *ds = p->dp->ds;
-	struct sk_buff *clone;
-	unsigned int type;
+	struct sk_buff *clone = NULL;
 
-	type = ptp_classify_raw(skb);
-	if (type == PTP_CLASS_NONE)
+	if (!(skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP))
 		return;
 
 	if (!ds->ops->port_txtstamp)
 		return;
 
-	clone = skb_clone_sk(skb);
-	if (!clone)
+	if (!ds->ops->port_txtstamp(ds, p->dp->index, skb, &clone))
 		return;
 
-	if (ds->ops->port_txtstamp(ds, p->dp->index, clone, type)) {
+	if (clone)
 		DSA_SKB_CB(skb)->clone = clone;
-		return;
-	}
-
-	kfree_skb(clone);
 }
 
 netdev_tx_t dsa_enqueue_skb(struct sk_buff *skb, struct net_device *dev)
@@ -628,9 +620,7 @@  static netdev_tx_t dsa_slave_xmit(struct sk_buff *skb, struct net_device *dev)
 
 	DSA_SKB_CB(skb)->clone = NULL;
 
-	/* Identify PTP protocol packets, clone them, and pass them to the
-	 * switch driver
-	 */
+	/* Handle tx timestamp request if has */
 	dsa_skb_tx_timestamp(p, skb);
 
 	if (dsa_realloc_skb(skb, dev)) {