* [PATCH net-next 0/1] mlx5: Add netdev-genl queue stats
@ 2024-05-03  2:25 Joe Damato
  2024-05-03  2:25 ` [PATCH net-next 1/1] net/mlx5e: Add per queue netdev-genl stats Joe Damato
  2024-05-03 10:55 ` [PATCH net-next 0/1] mlx5: Add netdev-genl queue stats Zhu Yanjun
  0 siblings, 2 replies; 21+ messages in thread
From: Joe Damato @ 2024-05-03  2:25 UTC
  To: linux-kernel, netdev, tariqt, saeedm
  Cc: gal, nalramli, Joe Damato, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Leon Romanovsky,
	open list:MELLANOX MLX5 core VPI driver, Paolo Abeni

Hi:

This is only 1 patch, so I know a cover letter isn't necessary, but it
seems there are a few things to mention.

This change adds support for the per queue netdev-genl API to mlx5,
which seems to output stats:

./cli.py --spec ../../../Documentation/netlink/specs/netdev.yaml \
         --dump qstats-get --json '{"scope": "queue"}'

...snip
 {'ifindex': 7,
  'queue-id': 28,
  'queue-type': 'tx',
  'tx-bytes': 399462,
  'tx-packets': 3311},
...snip

I've tried to use the tooling suggested to verify that the per queue
stats match the rtnl stats by doing this:

  NETIF=eth0 tools/testing/selftests/drivers/net/stats.py

And the tool outputs that there is a failure:

  # Exception| Exception: Qstats are lower, fetched later
  not ok 3 stats.pkt_byte_sum

The other tests all pass (including stats.qstat_by_ifindex).

This appears to mean that the netdev-genl queue stats have lower numbers
than the rtnl stats even though the rtnl stats are fetched first. I
added some debugging and found that both rx and tx bytes and packets are
slightly lower.

The only explanations I can think of for this are:

1. tx_ptp_opened and rx_ptp_opened are both true, in which case
   mlx5e_fold_sw_stats64 adds bytes and packets to the rtnl struct and
   might account for the difference. I skip this case in my
   implementation, so that could certainly explain it.
2. Maybe I'm just misunderstanding how stats aggregation works in mlx5,
   and that's why the numbers are slightly off?

It appears that the driver uses a workqueue to queue stats updates which
happen periodically.

 0. the driver occasionally calls queue_work on the update_stats_work
    workqueue.
 1. This eventually calls MLX5E_DECLARE_STATS_GRP_OP_UPDATE_STATS(sw),
    in drivers/net/ethernet/mellanox/mlx5/core/en_stats.c, which appears
    to begin by first memsetting the internal stats struct where stats are
    aggregated to zero. This would mean, I think, the get_base_stats
    netdev-genl API implementation that I have is correct: simply set
    everything to 0.... otherwise we'd end up double counting in the
    netdev-genl RX and TX handlers.
 2. Next, each of the stats helpers are called to collect stats into the
    freshly 0'd internal struct (for example:
    mlx5e_stats_grp_sw_update_stats_rq_stats).

That seems to be how stats are aggregated, which would suggest that if I
simply .... do what I'm doing in this change the numbers should line up.

But they don't, and it's either because of PTP or because I am
misunderstanding/doing something wrong.
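
To make the double-counting concern concrete, here is a toy user-space
sketch of the memset-then-accumulate pattern described above (the names
and numbers are mine for illustration, not mlx5's):

#include <stdio.h>
#include <string.h>

#define NCH 4

struct sw_stats { unsigned long packets; };

static struct sw_stats ring[NCH];  /* per-ring counters, never reset */
static struct sw_stats aggregate;  /* the device-wide aggregate      */

/* The periodic worker: zero the aggregate, then re-sum every ring.
 * Because the aggregate is rebuilt from scratch each time, a
 * get_base_stats that reported anything other than 0 would be added
 * on top of the per-queue values and double count. */
static void update_stats(void)
{
	memset(&aggregate, 0, sizeof(aggregate));
	for (int i = 0; i < NCH; i++)
		aggregate.packets += ring[i].packets;
}

int main(void)
{
	unsigned long base = 0, per_queue = 0;

	for (int i = 0; i < NCH; i++)
		ring[i].packets = 100 * (i + 1);  /* pretend traffic */

	update_stats();

	for (int i = 0; i < NCH; i++)  /* the per-queue callbacks */
		per_queue += ring[i].packets;

	/* both print 1000: a base of 0 keeps the sums in agreement */
	printf("aggregate=%lu base+queues=%lu\n",
	       aggregate.packets, base + per_queue);
	return 0;
}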

Maybe the MLNX folks can suggest a hint?

Thanks,
Joe

Joe Damato (1):
  net/mlx5e: Add per queue netdev-genl stats

 .../net/ethernet/mellanox/mlx5/core/en_main.c | 68 +++++++++++++++++++
 1 file changed, 68 insertions(+)

-- 
2.25.1



* [PATCH net-next 1/1] net/mlx5e: Add per queue netdev-genl stats
  2024-05-03  2:25 [PATCH net-next 0/1] mlx5: Add netdev-genl queue stats Joe Damato
@ 2024-05-03  2:25 ` Joe Damato
  2024-05-03 10:55 ` [PATCH net-next 0/1] mlx5: Add netdev-genl queue stats Zhu Yanjun
  1 sibling, 0 replies; 21+ messages in thread
From: Joe Damato @ 2024-05-03  2:25 UTC
  To: linux-kernel, netdev, tariqt, saeedm
  Cc: gal, nalramli, Joe Damato, Leon Romanovsky, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	open list:MELLANOX MLX5 core VPI driver

Add functions to support the netdev-genl per queue stats API.

./cli.py --spec netlink/specs/netdev.yaml \
--dump qstats-get --json '{"scope": "queue"}'

...snip

 {'ifindex': 7,
  'queue-id': 62,
  'queue-type': 'rx',
  'rx-alloc-fail': 0,
  'rx-bytes': 105965251,
  'rx-packets': 179790},
 {'ifindex': 7,
  'queue-id': 0,
  'queue-type': 'tx',
  'tx-bytes': 9402665,
  'tx-packets': 17551},

...snip

Signed-off-by: Joe Damato <jdamato@fastly.com>
---
 .../net/ethernet/mellanox/mlx5/core/en_main.c | 68 +++++++++++++++++++
 1 file changed, 68 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 3bd0695845c7..3bd85d5a3686 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -39,6 +39,7 @@
 #include <linux/debugfs.h>
 #include <linux/if_bridge.h>
 #include <linux/filter.h>
+#include <net/netdev_queues.h>
 #include <net/page_pool/types.h>
 #include <net/pkt_sched.h>
 #include <net/xdp_sock_drv.h>
@@ -5282,6 +5283,72 @@ static bool mlx5e_tunnel_any_tx_proto_supported(struct mlx5_core_dev *mdev)
 	return (mlx5_vxlan_allowed(mdev->vxlan) || mlx5_geneve_tx_allowed(mdev));
 }
 
+static void mlx5e_get_queue_stats_rx(struct net_device *dev, int i,
+				     struct netdev_queue_stats_rx *stats)
+{
+	struct mlx5e_priv *priv = netdev_priv(dev);
+
+	if (mlx5e_is_uplink_rep(priv))
+		return;
+
+	if (i < priv->stats_nch) {
+		struct mlx5e_channel_stats *channel_stats = priv->channel_stats[i];
+		struct mlx5e_rq_stats *xskrq_stats = &channel_stats->xskrq;
+		struct mlx5e_rq_stats *rq_stats = &channel_stats->rq;
+
+		stats->packets = rq_stats->packets + xskrq_stats->packets;
+		stats->bytes = rq_stats->bytes + xskrq_stats->bytes;
+		stats->alloc_fail = rq_stats->buff_alloc_err +
+				    xskrq_stats->buff_alloc_err;
+	}
+}
+
+static void mlx5e_get_queue_stats_tx(struct net_device *dev, int i,
+				     struct netdev_queue_stats_tx *stats)
+{
+	struct mlx5e_priv *priv = netdev_priv(dev);
+	int j;
+
+	if (mlx5e_is_uplink_rep(priv))
+		return;
+
+	if (i < priv->stats_nch)  {
+		struct mlx5e_channel_stats *channel_stats = priv->channel_stats[i];
+
+		stats->packets = 0;
+		stats->bytes = 0;
+
+		for (j = 0; j < priv->max_opened_tc; j++) {
+			struct mlx5e_sq_stats *sq_stats = &channel_stats->sq[j];
+
+			stats->packets += sq_stats->packets;
+			stats->bytes += sq_stats->bytes;
+		}
+	}
+}
+
+static void mlx5e_get_base_stats(struct net_device *dev,
+				 struct netdev_queue_stats_rx *rx,
+				 struct netdev_queue_stats_tx *tx)
+{
+	struct mlx5e_priv *priv = netdev_priv(dev);
+
+	if (!mlx5e_is_uplink_rep(priv)) {
+		rx->packets = 0;
+		rx->bytes = 0;
+		rx->alloc_fail = 0;
+	}
+
+	tx->packets = 0;
+	tx->bytes = 0;
+}
+
+static const struct netdev_stat_ops mlx5e_stat_ops = {
+	.get_queue_stats_rx     = mlx5e_get_queue_stats_rx,
+	.get_queue_stats_tx     = mlx5e_get_queue_stats_tx,
+	.get_base_stats         = mlx5e_get_base_stats,
+};
+
 static void mlx5e_build_nic_netdev(struct net_device *netdev)
 {
 	struct mlx5e_priv *priv = netdev_priv(netdev);
@@ -5299,6 +5366,7 @@ static void mlx5e_build_nic_netdev(struct net_device *netdev)
 
 	netdev->watchdog_timeo    = 15 * HZ;
 
+	netdev->stat_ops          = &mlx5e_stat_ops;
 	netdev->ethtool_ops	  = &mlx5e_ethtool_ops;
 
 	netdev->vlan_features    |= NETIF_F_SG;
-- 
2.25.1



* Re: [PATCH net-next 0/1] mlx5: Add netdev-genl queue stats
  2024-05-03  2:25 [PATCH net-next 0/1] mlx5: Add netdev-genl queue stats Joe Damato
  2024-05-03  2:25 ` [PATCH net-next 1/1] net/mlx5e: Add per queue netdev-genl stats Joe Damato
@ 2024-05-03 10:55 ` Zhu Yanjun
  2024-05-03 18:43   ` Joe Damato
  1 sibling, 1 reply; 21+ messages in thread
From: Zhu Yanjun @ 2024-05-03 10:55 UTC
  To: Joe Damato, linux-kernel, netdev, tariqt, saeedm
  Cc: gal, nalramli, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Leon Romanovsky, open list:MELLANOX MLX5 core VPI driver,
	Paolo Abeni

On 03.05.24 04:25, Joe Damato wrote:
> Hi:
> 
> This is only 1 patch, so I know a cover letter isn't necessary, but it
> seems there are a few things to mention.
> 
> This change adds support for the per queue netdev-genl API to mlx5,
> which seems to output stats:
> 
> ./cli.py --spec ../../../Documentation/netlink/specs/netdev.yaml \
>           --dump qstats-get --json '{"scope": "queue"}'
> 
> ...snip
>   {'ifindex': 7,
>    'queue-id': 28,
>    'queue-type': 'tx',
>    'tx-bytes': 399462,
>    'tx-packets': 3311},
> ...snip

Ethtool -S ethx can get the above information
"
...
      tx-0.packets: 2094
      tx-0.bytes: 294141
      rx-0.packets: 2200
      rx-0.bytes: 267673
...
"

> 
> I've tried to use the tooling suggested to verify that the per queue
> stats match the rtnl stats by doing this:
> 
>    NETIF=eth0 tools/testing/selftests/drivers/net/stats.py
> 
> And the tool outputs that there is a failure:
> 
>    # Exception| Exception: Qstats are lower, fetched later
>    not ok 3 stats.pkt_byte_sum

With ethtool, does the above problem still occur?

Zhu Yanjun

> 
> The other tests all pass (including stats.qstat_by_ifindex).
> 
> This appears to mean that the netdev-genl queue stats have lower numbers
> than the rtnl stats even though the rtnl stats are fetched first. I
> added some debugging and found that both rx and tx bytes and packets are
> slightly lower.
> 
> The only explanations I can think of for this are:
> 
> 1. tx_ptp_opened and rx_ptp_opened are both true, in which case
>     mlx5e_fold_sw_stats64 adds bytes and packets to the rtnl struct and
>     might account for the difference. I skip this case in my
>     implementation, so that could certainly explain it.
> 2. Maybe I'm just misunderstanding how stats aggregation works in mlx5,
>     and that's why the numbers are slightly off?
> 
> It appears that the driver uses a workqueue to queue stats updates which
> happen periodically.
> 
>   0. the driver occasionally calls queue_work on the update_stats_work
>      workqueue.
>   1. This eventually calls MLX5E_DECLARE_STATS_GRP_OP_UPDATE_STATS(sw),
>      in drivers/net/ethernet/mellanox/mlx5/core/en_stats.c, which appears
>      to begin by first memsetting the internal stats struct where stats are
>      aggregated to zero. This would mean, I think, the get_base_stats
>      netdev-genl API implementation that I have is correct: simply set
>      everything to 0.... otherwise we'd end up double counting in the
>      netdev-genl RX and TX handlers.
>   2. Next, each of the stats helpers are called to collect stats into the
>      freshly 0'd internal struct (for example:
>      mlx5e_stats_grp_sw_update_stats_rq_stats).
> 
> That seems to be how stats are aggregated, which would suggest that if I
> simply .... do what I'm doing in this change the numbers should line up.
> 
> But they don't, and it's either because of PTP or because I am
> misunderstanding/doing something wrong.
> 
> Maybe the MLNX folks can suggest a hint?
> 
> Thanks,
> Joe
> 
> Joe Damato (1):
>    net/mlx5e: Add per queue netdev-genl stats
> 
>   .../net/ethernet/mellanox/mlx5/core/en_main.c | 68 +++++++++++++++++++
>   1 file changed, 68 insertions(+)
> 



* Re: [PATCH net-next 0/1] mlx5: Add netdev-genl queue stats
  2024-05-03 10:55 ` [PATCH net-next 0/1] mlx5: Add netdev-genl queue stats Zhu Yanjun
@ 2024-05-03 18:43   ` Joe Damato
  2024-05-03 21:58     ` Jakub Kicinski
  0 siblings, 1 reply; 21+ messages in thread
From: Joe Damato @ 2024-05-03 18:43 UTC
  To: Zhu Yanjun
  Cc: linux-kernel, netdev, tariqt, saeedm, gal, nalramli,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Leon Romanovsky,
	open list:MELLANOX MLX5 core VPI driver, Paolo Abeni

On Fri, May 03, 2024 at 12:55:41PM +0200, Zhu Yanjun wrote:
> On 03.05.24 04:25, Joe Damato wrote:
> > Hi:
> > 
> > This is only 1 patch, so I know a cover letter isn't necessary, but it
> > seems there are a few things to mention.
> > 
> > This change adds support for the per queue netdev-genl API to mlx5,
> > which seems to output stats:
> > 
> > ./cli.py --spec ../../../Documentation/netlink/specs/netdev.yaml \
> >           --dump qstats-get --json '{"scope": "queue"}'
> > 
> > ...snip
> >   {'ifindex': 7,
> >    'queue-id': 28,
> >    'queue-type': 'tx',
> >    'tx-bytes': 399462,
> >    'tx-packets': 3311},
> > ...snip
> 
> Ethtool -S ethx can get the above information
> "
> ...
>      tx-0.packets: 2094
>      tx-0.bytes: 294141
>      rx-0.packets: 2200
>      rx-0.bytes: 267673
> ...
> "
> 
> > 
> > I've tried to use the tooling suggested to verify that the per queue
> > stats match the rtnl stats by doing this:
> > 
> >    NETIF=eth0 tools/testing/selftests/drivers/net/stats.py
> > 
> > And the tool outputs that there is a failure:
> > 
> >    # Exception| Exception: Qstats are lower, fetched later
> >    not ok 3 stats.pkt_byte_sum
> 
> With ethtool, does the above problem still occur?

Thanks for the suggestion. With ethtool it seems correct using the same
logic as the test, if I understand correctly.

The failing test fetches rtnl first then qstats, but sees lower qstats - the
test expects qstats to be equal or higher since they are read later. In order to
reproduce this with ethtool, I'd need to fetch with ethtool first and then
fetch qstats and compare.

A correct output would show equal or higher stats from qstats than ethtool
because there is a minor delay in running the commands.

Here's a quick example using ethtool of what I get (note that in the output of
cli.py the bytes are printed before the packets):

$ ethtool -S eth0 | egrep '(rx[0-3]_(bytes|packets))' && \
  echo "======" && \
  ./cli.py --spec ../../../Documentation/netlink/specs/netdev.yaml \
           --dump qstats-get --json '{"scope": "queue", "ifindex": 7}' \
  |  egrep '(rx-(packets|bytes))'

     rx0_packets: 10799916
     rx0_bytes: 4949724904
     rx1_packets: 26170804
     rx1_bytes: 12694250232
     rx2_packets: 11901885
     rx2_bytes: 5593129387
     rx3_packets: 13219784
     rx3_bytes: 6151431963
======
  'rx-bytes': 4949927222,
  'rx-packets': 10800354},
  'rx-bytes': 12694488803,
  'rx-packets': 26171309},
  'rx-bytes': 5593321247,
  'rx-packets': 11902360},
  'rx-bytes': 6151735533,
  'rx-packets': 13220389},


So you can see that the numbers "look right" -- the qstats (collected by cli.py)
are collected later and are slightly larger, as expected:

rx0_packets from ethtool: 10799916
rx0_packets from cli.py:  10800354

rx0_bytes from ethtool: 4949724904
rx0_bytes from cli.py:  4949927222

So this looks correct to me, and in this case I'd be more inclined to assume
that RTNL on mlx5 is "overcounting" because:

1. it includes the PTP stats that I don't include in my qstats, and/or
2. some other reason I don't understand

> > 
> > The other tests all pass (including stats.qstat_by_ifindex).
> > 
> > This appears to mean that the netdev-genl queue stats have lower numbers
> > than the rtnl stats even though the rtnl stats are fetched first. I
> > added some debugging and found that both rx and tx bytes and packets are
> > slightly lower.
> > 
> > The only explanations I can think of for this are:
> > 
> > 1. tx_ptp_opened and rx_ptp_opened are both true, in which case
> >     mlx5e_fold_sw_stats64 adds bytes and packets to the rtnl struct and
> >     might account for the difference. I skip this case in my
> >     implementation, so that could certainly explain it.
> > 2. Maybe I'm just misunderstanding how stats aggregation works in mlx5,
> >     and that's why the numbers are slightly off?
> > 
> > It appears that the driver uses a workqueue to queue stats updates which
> > happen periodically.
> > 
> >   0. the driver occasionally calls queue_work on the update_stats_work
> >      workqueue.
> >   1. This eventually calls MLX5E_DECLARE_STATS_GRP_OP_UPDATE_STATS(sw),
> >      in drivers/net/ethernet/mellanox/mlx5/core/en_stats.c, which appears
> >      to begin by first memsetting the internal stats struct where stats are
> >      aggregated to zero. This would mean, I think, the get_base_stats
> >      netdev-genl API implementation that I have is correct: simply set
> >      everything to 0.... otherwise we'd end up double counting in the
> >      netdev-genl RX and TX handlers.
> >   2. Next, each of the stats helpers are called to collect stats into the
> >      freshly 0'd internal struct (for example:
> >      mlx5e_stats_grp_sw_update_stats_rq_stats).
> > 
> > That seems to be how stats are aggregated, which would suggest that if I
> > simply .... do what I'm doing in this change the numbers should line up.
> > 
> > But they don't, and it's either because of PTP or because I am
> > misunderstanding/doing something wrong.
> > 
> > Maybe the MLNX folks can suggest a hint?
> > 
> > Thanks,
> > Joe
> > 
> > Joe Damato (1):
> >    net/mlx5e: Add per queue netdev-genl stats
> > 
> >   .../net/ethernet/mellanox/mlx5/core/en_main.c | 68 +++++++++++++++++++
> >   1 file changed, 68 insertions(+)
> > 
> 


* Re: [PATCH net-next 0/1] mlx5: Add netdev-genl queue stats
  2024-05-03 18:43   ` Joe Damato
@ 2024-05-03 21:58     ` Jakub Kicinski
  2024-05-03 23:53       ` Joe Damato
  0 siblings, 1 reply; 21+ messages in thread
From: Jakub Kicinski @ 2024-05-03 21:58 UTC
  To: Joe Damato
  Cc: Zhu Yanjun, linux-kernel, netdev, tariqt, saeedm, gal, nalramli,
	David S. Miller, Eric Dumazet, Leon Romanovsky,
	open list:MELLANOX MLX5 core VPI driver, Paolo Abeni

On Fri, 3 May 2024 11:43:27 -0700 Joe Damato wrote:
> 1. it includes the PTP stats that I don't include in my qstats, and/or
> 2. some other reason I don't understand

Can you add the PTP stats to the "base" values? 
I.e. inside mlx5e_get_base_stats()?

We should probably touch up the kdoc a little bit, but it sounds like
the sort of thing which would fall into the realm of "misc delta"
values:

diff --git a/include/net/netdev_queues.h b/include/net/netdev_queues.h
index c7ac4539eafc..f5d9f3ad5b66 100644
--- a/include/net/netdev_queues.h
+++ b/include/net/netdev_queues.h
@@ -59,6 +59,8 @@ struct netdev_queue_stats_tx {
  * statistics will not generally add up to the total number of events for
  * the device. The @get_base_stats callback allows filling in the delta
  * between events for currently live queues and overall device history.
+ * @get_base_stats can also be used to report any miscellaneous packets
+ * transferred outside of the main set of queues used by the networking stack.
  * When the statistics for the entire device are queried, first @get_base_stats
  * is issued to collect the delta, and then a series of per-queue callbacks.
  * Only statistics which are set in @get_base_stats will be reported


SG?


* Re: [PATCH net-next 0/1] mlx5: Add netdev-genl queue stats
  2024-05-03 21:58     ` Jakub Kicinski
@ 2024-05-03 23:53       ` Joe Damato
  2024-05-04  0:34         ` Jakub Kicinski
  0 siblings, 1 reply; 21+ messages in thread
From: Joe Damato @ 2024-05-03 23:53 UTC
  To: Jakub Kicinski
  Cc: Zhu Yanjun, linux-kernel, netdev, tariqt, saeedm, gal, nalramli,
	David S. Miller, Eric Dumazet, Leon Romanovsky,
	open list:MELLANOX MLX5 core VPI driver, Paolo Abeni

On Fri, May 03, 2024 at 02:58:08PM -0700, Jakub Kicinski wrote:
> On Fri, 3 May 2024 11:43:27 -0700 Joe Damato wrote:
> > 1. it includes the PTP stats that I don't include in my qstats, and/or
> > 2. some other reason I don't understand
> 
> Can you add the PTP stats to the "base" values? 
> I.e. inside mlx5e_get_base_stats()?

I tried adding them to rx and tx in mlx5e_get_base_stats (similar to what
mlx5e_fold_sw_stats64 does) and the test still fails.

Maybe something about the rtnl stats is what's off here and the queue
stats are fine?

FWIW: I spoke with the Mellanox folks off list several weeks ago and they
seemed to suggest skipping the PTP stats made the most sense.

I think at that time I didn't really understand get_base_stats that well,
so maybe we'd have come to a different conclusion then.

FWIW, here's what I tried and the rtnl vs qstat test still failed in
exactly the same way:

--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -5337,10 +5337,25 @@ static void mlx5e_get_base_stats(struct net_device *dev,
                rx->packets = 0;
                rx->bytes = 0;
                rx->alloc_fail = 0;
+               if (priv->rx_ptp_opened) {
+                       struct mlx5e_rq_stats *rq_stats = &priv->ptp_stats.rq;
+                       rx->packets = rq_stats->packets;
+                       rx->bytes = rq_stats->bytes;
+               }
        }

        tx->packets = 0;
        tx->bytes = 0;
+
+       if (priv->tx_ptp_opened) {
+               int i;
+               for (i = 0; i < priv->max_opened_tc; i++) {
+                       struct mlx5e_sq_stats *sq_stats = &priv->ptp_stats.sq[i];
+
+                       tx->packets    += sq_stats->packets;
+                       tx->bytes      += sq_stats->bytes;
+               }
+       }
 }

> We should probably touch up the kdoc a little bit, but it sounds like
> the sort of thing which would fall into the realm of "misc delta"
> values:
> 
> diff --git a/include/net/netdev_queues.h b/include/net/netdev_queues.h
> index c7ac4539eafc..f5d9f3ad5b66 100644
> --- a/include/net/netdev_queues.h
> +++ b/include/net/netdev_queues.h
> @@ -59,6 +59,8 @@ struct netdev_queue_stats_tx {
>   * statistics will not generally add up to the total number of events for
>   * the device. The @get_base_stats callback allows filling in the delta
>   * between events for currently live queues and overall device history.
> + * @get_base_stats can also be used to report any miscellaneous packets
> + * transferred outside of the main set of queues used by the networking stack.
>   * When the statistics for the entire device are queried, first @get_base_stats
>   * is issued to collect the delta, and then a series of per-queue callbacks.
>   * Only statistics which are set in @get_base_stats will be reported
> 
> 
> SG?

I think that sounds good and makes sense, yea. By that definition, then I
should leave the PTP stats as shown above. If you agree, I'll add that
to the v2.

I feel like I should probably wait before sending a v2 with PTP included in
get_base_stats to see if the Mellanox folks have any hints about why rtnl
!= queue stats on mlx5?

What do you think?


* Re: [PATCH net-next 0/1] mlx5: Add netdev-genl queue stats
  2024-05-03 23:53       ` Joe Damato
@ 2024-05-04  0:34         ` Jakub Kicinski
  2024-05-06 18:04           ` Joe Damato
  0 siblings, 1 reply; 21+ messages in thread
From: Jakub Kicinski @ 2024-05-04  0:34 UTC
  To: Joe Damato
  Cc: Zhu Yanjun, linux-kernel, netdev, tariqt, saeedm, gal, nalramli,
	David S. Miller, Eric Dumazet, Leon Romanovsky,
	open list:MELLANOX MLX5 core VPI driver, Paolo Abeni

On Fri, 3 May 2024 16:53:40 -0700 Joe Damato wrote:
> > diff --git a/include/net/netdev_queues.h b/include/net/netdev_queues.h
> > index c7ac4539eafc..f5d9f3ad5b66 100644
> > --- a/include/net/netdev_queues.h
> > +++ b/include/net/netdev_queues.h
> > @@ -59,6 +59,8 @@ struct netdev_queue_stats_tx {
> >   * statistics will not generally add up to the total number of events for
> >   * the device. The @get_base_stats callback allows filling in the delta
> >   * between events for currently live queues and overall device history.
> > + * @get_base_stats can also be used to report any miscellaneous packets
> > + * transferred outside of the main set of queues used by the networking stack.
> >   * When the statistics for the entire device are queried, first @get_base_stats
> >   * is issued to collect the delta, and then a series of per-queue callbacks.
> >   * Only statistics which are set in @get_base_stats will be reported
> > 
> > 
> > SG?  
> 
> I think that sounds good and makes sense, yea. By that definition, then I
> should leave the PTP stats as shown above. If you agree, I'll add that
> to the v2.

Yup, agreed.

> I feel like I should probably wait before sending a v2 with PTP included in
> get_base_stats to see if the Mellanox folks have any hints about why rtnl
> != queue stats on mlx5?
> 
> What do you think?

Very odd, the code doesn't appear to be doing any magic :S Did you try
to print what the delta in values is? Does bringing the interface up and
down affect the size of it?


* Re: [PATCH net-next 0/1] mlx5: Add netdev-genl queue stats
  2024-05-04  0:34         ` Jakub Kicinski
@ 2024-05-06 18:04           ` Joe Damato
  2024-05-08 21:40             ` Tariq Toukan
  0 siblings, 1 reply; 21+ messages in thread
From: Joe Damato @ 2024-05-06 18:04 UTC
  To: Jakub Kicinski
  Cc: Zhu Yanjun, linux-kernel, netdev, tariqt, saeedm, gal, nalramli,
	David S. Miller, Eric Dumazet, Leon Romanovsky,
	open list:MELLANOX MLX5 core VPI driver, Paolo Abeni

On Fri, May 03, 2024 at 05:34:29PM -0700, Jakub Kicinski wrote:
> On Fri, 3 May 2024 16:53:40 -0700 Joe Damato wrote:
> > > diff --git a/include/net/netdev_queues.h b/include/net/netdev_queues.h
> > > index c7ac4539eafc..f5d9f3ad5b66 100644
> > > --- a/include/net/netdev_queues.h
> > > +++ b/include/net/netdev_queues.h
> > > @@ -59,6 +59,8 @@ struct netdev_queue_stats_tx {
> > >   * statistics will not generally add up to the total number of events for
> > >   * the device. The @get_base_stats callback allows filling in the delta
> > >   * between events for currently live queues and overall device history.
> > > + * @get_base_stats can also be used to report any miscellaneous packets
> > > + * transferred outside of the main set of queues used by the networking stack.
> > >   * When the statistics for the entire device are queried, first @get_base_stats
> > >   * is issued to collect the delta, and then a series of per-queue callbacks.
> > >   * Only statistics which are set in @get_base_stats will be reported
> > > 
> > > 
> > > SG?  
> > 
> > I think that sounds good and makes sense, yea. By that definition, then I
> > should leave the PTP stats as shown above. If you agree, I'll add that
> > to the v2.
> 
> Yup, agreed.
> 
> > I feel like I should probably wait before sending a v2 with PTP included in
> > get_base_stats to see if the Mellanox folks have any hints about why rtnl
> > != queue stats on mlx5?
> > 
> > What do you think?
> 
> Very odd, the code doesn't appear to be doing any magic :S Did you try
> to print what the delta in values is? Does bringing the interface up and
> down affect the size of it?

I booted the kernel which includes PTP stats in the base stats as you've
suggested (as shown in the diff in this thread) and I've brought the
interface down and back up:

$ sudo ip link set dev eth0 down
$ sudo ip link set dev eth0 up

Re-ran the test script, which includes some mild debugging printout I
added to show the delta for rx-packets (but I think all stats are off):

  # Exception| Exception: Qstats are lower, fetched later

key: rx-packets rstat: 1192281902 qstat: 1186755777
key: rx-packets rstat: 1192281902 qstat: 1186755781

So qstat is lower by (1192281902 - 1186755781) = 5,526,121

Not really sure why, but I'll take another look at the code this morning to
see if I can figure out what's going on.

I'm clearly doing something wrong or misunderstanding something about the
accounting that will seem extremely obvious in retrospect.


* Re: [PATCH net-next 0/1] mlx5: Add netdev-genl queue stats
  2024-05-06 18:04           ` Joe Damato
@ 2024-05-08 21:40             ` Tariq Toukan
  2024-05-08 23:24               ` Joe Damato
  0 siblings, 1 reply; 21+ messages in thread
From: Tariq Toukan @ 2024-05-08 21:40 UTC
  To: Joe Damato, Jakub Kicinski
  Cc: Zhu Yanjun, linux-kernel, netdev, saeedm, gal, nalramli,
	David S. Miller, Eric Dumazet, Leon Romanovsky,
	open list:MELLANOX MLX5 core VPI driver, Paolo Abeni,
	Tariq Toukan



On 06/05/2024 21:04, Joe Damato wrote:
> On Fri, May 03, 2024 at 05:34:29PM -0700, Jakub Kicinski wrote:
>> On Fri, 3 May 2024 16:53:40 -0700 Joe Damato wrote:
>>>> diff --git a/include/net/netdev_queues.h b/include/net/netdev_queues.h
>>>> index c7ac4539eafc..f5d9f3ad5b66 100644
>>>> --- a/include/net/netdev_queues.h
>>>> +++ b/include/net/netdev_queues.h
>>>> @@ -59,6 +59,8 @@ struct netdev_queue_stats_tx {
>>>>    * statistics will not generally add up to the total number of events for
>>>>    * the device. The @get_base_stats callback allows filling in the delta
>>>>    * between events for currently live queues and overall device history.
>>>> + * @get_base_stats can also be used to report any miscellaneous packets
>>>> + * transferred outside of the main set of queues used by the networking stack.
>>>>    * When the statistics for the entire device are queried, first @get_base_stats
>>>>    * is issued to collect the delta, and then a series of per-queue callbacks.
>>>>    * Only statistics which are set in @get_base_stats will be reported
>>>>
>>>>
>>>> SG?
>>>
>>> I think that sounds good and makes sense, yea. By that definition, then I
>>> should leave the PTP stats as shown above. If you agree, I'll add that
>>> to the v2.
>>
>> Yup, agreed.
>>
>>> I feel like I should probably wait before sending a v2 with PTP included in
>>> get_base_stats to see if the Mellanox folks have any hints about why rtnl
>>> != queue stats on mlx5?
>>>
>>> What do you think?
>>
>> Very odd, the code doesn't appear to be doing any magic :S Did you try
>> to print what the delta in values is? Does bringing the interface up and
>> down affect the size of it?
> 
> I booted the kernel which includes PTP stats in the base stats as you've
> suggested (as shown in the diff in this thread) and I've brought the
> interface down and back up:
> 
> $ sudo ip link set dev eth0 down
> $ sudo ip link set dev eth0 up
> 
> Re-ran the test script, which includes some mild debugging printout I
> added to show the delta for rx-packets (but I think all stats are off):
> 
>    # Exception| Exception: Qstats are lower, fetched later
> 
> key: rx-packets rstat: 1192281902 qstat: 1186755777
> key: rx-packets rstat: 1192281902 qstat: 1186755781
> 
> So qstat is lower by (1192281902 - 1186755781) = 5,526,121
> 
> Not really sure why, but I'll take another look at the code this morning to
> see if I can figure out what's going on.
> 
> I'm clearly doing something wrong or misunderstanding something about the
> accounting that will seem extremely obvious in retrospect.

Hi Joe,

Thanks for your patch.
Apologies for the late response. I was on PTO for some time.

From a first look, the patch looks okay. The overall approach seems correct.

The off-channel queues (like PTP) do not exist by default, so they are
out of the game unless you explicitly enable them.

A possible reason for this difference is the queues included in the sum.
Our stats are persistent across configuration changes, so they don't
reset when the number of channels changes, for example.

We keep stats entries for all ring indices that ever existed. Our driver
loops and sums up the stats for all of them, while the stack loops only 
up to the current netdev->real_num_rx_queues.
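
As a toy illustration (made-up counts; the two loops mirror our driver's
sum vs. the stack's per-queue sum):

#include <stdio.h>

#define EVER_OPENED 64 /* stats_nch: ring indices that ever existed */
#define ACTIVE      32 /* real_num_rx_queues after a reconfiguration */

int main(void)
{
	unsigned long pkts[EVER_OPENED];
	unsigned long rtnl = 0, qstats = 0;

	for (int i = 0; i < EVER_OPENED; i++)
		pkts[i] = 1000; /* history from when the ring was live */

	for (int i = 0; i < EVER_OPENED; i++) /* driver: sums them all */
		rtnl += pkts[i];

	for (int i = 0; i < ACTIVE; i++) /* stack: only live queues */
		qstats += pkts[i];

	printf("rtnl=%lu qstats=%lu missing=%lu\n",
	       rtnl, qstats, rtnl - qstats);
	return 0;
}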

Can this explain the diff here?


* Re: [PATCH net-next 0/1] mlx5: Add netdev-genl queue stats
  2024-05-08 21:40             ` Tariq Toukan
@ 2024-05-08 23:24               ` Joe Damato
  2024-05-09  0:56                 ` Jakub Kicinski
  2024-05-09  9:42                 ` Tariq Toukan
  0 siblings, 2 replies; 21+ messages in thread
From: Joe Damato @ 2024-05-08 23:24 UTC
  To: Tariq Toukan
  Cc: Jakub Kicinski, Zhu Yanjun, linux-kernel, netdev, saeedm, gal,
	nalramli, David S. Miller, Eric Dumazet, Leon Romanovsky,
	open list:MELLANOX MLX5 core VPI driver, Paolo Abeni,
	Tariq Toukan

On Thu, May 09, 2024 at 12:40:01AM +0300, Tariq Toukan wrote:
> 
> 
> On 06/05/2024 21:04, Joe Damato wrote:
> > On Fri, May 03, 2024 at 05:34:29PM -0700, Jakub Kicinski wrote:
> > > On Fri, 3 May 2024 16:53:40 -0700 Joe Damato wrote:
> > > > > diff --git a/include/net/netdev_queues.h b/include/net/netdev_queues.h
> > > > > index c7ac4539eafc..f5d9f3ad5b66 100644
> > > > > --- a/include/net/netdev_queues.h
> > > > > +++ b/include/net/netdev_queues.h
> > > > > @@ -59,6 +59,8 @@ struct netdev_queue_stats_tx {
> > > > >    * statistics will not generally add up to the total number of events for
> > > > >    * the device. The @get_base_stats callback allows filling in the delta
> > > > >    * between events for currently live queues and overall device history.
> > > > > + * @get_base_stats can also be used to report any miscellaneous packets
> > > > > + * transferred outside of the main set of queues used by the networking stack.
> > > > >    * When the statistics for the entire device are queried, first @get_base_stats
> > > > >    * is issued to collect the delta, and then a series of per-queue callbacks.
> > > > >    * Only statistics which are set in @get_base_stats will be reported
> > > > > 
> > > > > 
> > > > > SG?
> > > > 
> > > > I think that sounds good and makes sense, yea. By that definition, then I
> > > > should leave the PTP stats as shown above. If you agree, I'll add that
> > > > to the v2.
> > > 
> > > Yup, agreed.
> > > 
> > > > I feel like I should probably wait before sending a v2 with PTP included in
> > > > get_base_stats to see if the Mellanox folks have any hints about why rtnl
> > > > != queue stats on mlx5?
> > > > 
> > > > What do you think?
> > > 
> > > Very odd, the code doesn't appear to be doing any magic :S Did you try
> > > to print what the delta in values is? Does bringing the interface up and
> > > down affect the size of it?
> > 
> > I booted the kernel which includes PTP stats in the base stats as you've
> > suggested (as shown in the diff in this thread) and I've brought the
> > interface down and back up:
> > 
> > $ sudo ip link set dev eth0 down
> > $ sudo ip link set dev eth0 up
> > 
> > Re-ran the test script, which includes some mild debugging printout I
> > added to show the delta for rx-packets (but I think all stats are off):
> > 
> >    # Exception| Exception: Qstats are lower, fetched later
> > 
> > key: rx-packets rstat: 1192281902 qstat: 1186755777
> > key: rx-packets rstat: 1192281902 qstat: 1186755781
> > 
> > So qstat is lower by (1192281902 - 1186755781) = 5,526,121
> > 
> > Not really sure why, but I'll take another look at the code this morning to
> > see if I can figure out what's going on.
> > 
> > I'm clearly doing something wrong or misunderstanding something about the
> > accounting that will seem extremely obvious in retrospect.
> 
> Hi Joe,
> 
> Thanks for your patch.
> Apologies for the late response. I was on PTO for some time.

No worries, I hope you enjoyed your PTO. I appreciate your response, time,
and energy.

> From a first look, the patch looks okay. The overall approach seems correct.

Sounds good to me!
 
> The off-channel queues (like PTP) do not exist by default, so they are out
> of the game unless you explicitly enable them.

I did not enable them, but if you saw the thread, it sounds like Jakub's
preference is that in the v2 I include the PTP stats in get_base_stats.

Are you OK with that?
Are there other queue stats I should include as well?

> A possible reason for this difference is the queues included in the sum.
> Our stats are persistent across configuration changes, so they don't reset
> when the number of channels changes, for example.
> 
> We keep stats entries for all ring indices that ever existed. Our driver
> loops and sums up the stats for all of them, while the stack loops only up
> to the current netdev->real_num_rx_queues.
> 
> Can this explain the diff here?

Yes, that was it. Sorry I didn't realize this case. My lab machine runs a
script to adjust the queue count shortly after booting.

I disabled that and re-ran:

  NETIF=eth0 tools/testing/selftests/drivers/net/stats.py

and all tests pass.


* Re: [PATCH net-next 0/1] mlx5: Add netdev-genl queue stats
  2024-05-08 23:24               ` Joe Damato
@ 2024-05-09  0:56                 ` Jakub Kicinski
  2024-05-09  1:57                   ` Joe Damato
  2024-05-09  9:42                 ` Tariq Toukan
  1 sibling, 1 reply; 21+ messages in thread
From: Jakub Kicinski @ 2024-05-09  0:56 UTC
  To: Joe Damato
  Cc: Tariq Toukan, Zhu Yanjun, linux-kernel, netdev, saeedm, gal,
	nalramli, David S. Miller, Eric Dumazet, Leon Romanovsky,
	open list:MELLANOX MLX5 core VPI driver, Paolo Abeni,
	Tariq Toukan

On Wed, 8 May 2024 16:24:08 -0700 Joe Damato wrote:
> > A possible reason for this difference is the queues included in the sum.
> > Our stats are persistent across configuration changes, so they don't reset
> > when the number of channels changes, for example.
> > 
> > We keep stats entries for all ring indices that ever existed. Our driver
> > loops and sums up the stats for all of them, while the stack loops only up
> > to the current netdev->real_num_rx_queues.
> > 
> > Can this explain the diff here?  
> 
> Yes, that was it. Sorry I didn't realize this case. My lab machine runs a
> script to adjust the queue count shortly after booting.
> 
> I disabled that and re-ran:
> 
>   NETIF=eth0 tools/testing/selftests/drivers/net/stats.py
> 
> and all tests pass.

Stating the obvious, perhaps, but in this case we should add the stats
from inactive queues to the base (which when the NIC is down means all
queues).


* Re: [PATCH net-next 0/1] mlx5: Add netdev-genl queue stats
  2024-05-09  0:56                 ` Jakub Kicinski
@ 2024-05-09  1:57                   ` Joe Damato
  2024-05-09  2:08                     ` Jakub Kicinski
  0 siblings, 1 reply; 21+ messages in thread
From: Joe Damato @ 2024-05-09  1:57 UTC
  To: Jakub Kicinski
  Cc: Tariq Toukan, Zhu Yanjun, linux-kernel, netdev, saeedm, gal,
	nalramli, David S. Miller, Eric Dumazet, Leon Romanovsky,
	open list:MELLANOX MLX5 core VPI driver, Paolo Abeni,
	Tariq Toukan

On Wed, May 08, 2024 at 05:56:38PM -0700, Jakub Kicinski wrote:
> On Wed, 8 May 2024 16:24:08 -0700 Joe Damato wrote:
> > > A possible reason for this difference is the queues included in the sum.
> > > Our stats are persistent across configuration changes, so they don't reset
> > > when the number of channels changes, for example.
> > > 
> > > We keep stats entries for all ring indices that ever existed. Our driver
> > > loops and sums up the stats for all of them, while the stack loops only up
> > > to the current netdev->real_num_rx_queues.
> > > 
> > > Can this explain the diff here?  
> > 
> > Yes, that was it. Sorry I didn't realize this case. My lab machine runs a
> > script to adjust the queue count shortly after booting.
> > 
> > I disabled that and re-ran:
> > 
> >   NETIF=eth0 tools/testing/selftests/drivers/net/stats.py
> > 
> > and all tests pass.
> 
> Stating the obvious, perhaps, but in this case we should add the stats
> from inactive queues to the base (which when the NIC is down means all
> queues).

If I'm following that right and understanding mlx5 (two things I am
unlikely to do simultaneously), that sounds to me like:

- mlx5e_get_queue_stats_rx and mlx5e_get_queue_stats_tx check if i <
  priv->channels.params.num_channels (instead of priv->stats_nch), and when
  summing mlx5e_sq_stats in the latter function, it's up to
  priv->channels.params.mqprio.num_tc instead of priv->max_opened_tc.

- mlx5e_get_base_stats accumulates and outputs stats for everything from
  priv->channels.params.num_channels to priv->stats_nch, and
  priv->channels.params.mqprio.num_tc to priv->max_opened_tc... which
  should cover the inactive queues, I think.

Just writing that all out to avoid hacking up the wrong thing for the v2
and to reduce overall noise on the list :)


* Re: [PATCH net-next 0/1] mlx5: Add netdev-genl queue stats
  2024-05-09  1:57                   ` Joe Damato
@ 2024-05-09  2:08                     ` Jakub Kicinski
  2024-05-09  4:11                       ` Joe Damato
  2024-05-09  6:30                       ` Joe Damato
  0 siblings, 2 replies; 21+ messages in thread
From: Jakub Kicinski @ 2024-05-09  2:08 UTC
  To: Joe Damato
  Cc: Tariq Toukan, Zhu Yanjun, linux-kernel, netdev, saeedm, gal,
	nalramli, David S. Miller, Eric Dumazet, Leon Romanovsky,
	open list:MELLANOX MLX5 core VPI driver, Paolo Abeni,
	Tariq Toukan

On Thu, 9 May 2024 01:57:52 +0000 Joe Damato wrote:
> If I'm following that right and understanding mlx5 (two things I am
> unlikely to do simultaneously), that sounds to me like:
> 
> - mlx5e_get_queue_stats_rx and mlx5e_get_queue_stats_tx check if i <
>   priv->channels.params.num_channels (instead of priv->stats_nch),

Yes, tho, not sure whether the "if i < ...num_channels" is even
necessary, as core already checks against real_num_rx_queues.

>   and when
>   summing mlx5e_sq_stats in the latter function, it's up to
>   priv->channels.params.mqprio.num_tc instead of priv->max_opened_tc.
> 
> - mlx5e_get_base_stats accumulates and outputs stats for everything from
>   priv->channels.params.num_channels to priv->stats_nch, and

I'm not sure num_channels gets set to 0 when device is down so possibly
from "0 if down else ...num_channels" to stats_nch.

>   priv->channels.params.mqprio.num_tc to priv->max_opened_tc... which
>   should cover the inactive queues, I think.
> 
> Just writing that all out to avoid hacking up the wrong thing for the v2
> and to reduce overall noise on the list :)


* Re: [PATCH net-next 0/1] mlx5: Add netdev-genl queue stats
  2024-05-09  2:08                     ` Jakub Kicinski
@ 2024-05-09  4:11                       ` Joe Damato
  2024-05-09  6:30                       ` Joe Damato
  1 sibling, 0 replies; 21+ messages in thread
From: Joe Damato @ 2024-05-09  4:11 UTC
  To: Jakub Kicinski
  Cc: Tariq Toukan, Zhu Yanjun, linux-kernel, netdev, saeedm, gal,
	nalramli, David S. Miller, Eric Dumazet, Leon Romanovsky,
	open list:MELLANOX MLX5 core VPI driver, Paolo Abeni,
	Tariq Toukan

On Wed, May 08, 2024 at 07:08:39PM -0700, Jakub Kicinski wrote:
> On Thu, 9 May 2024 01:57:52 +0000 Joe Damato wrote:
> > If I'm following that right and understanding mlx5 (two things I am
> > unlikely to do simultaneously), that sounds to me like:
> > 
> > - mlx5e_get_queue_stats_rx and mlx5e_get_queue_stats_tx check if i <
> >   priv->channels.params.num_channels (instead of priv->stats_nch),
> 
> Yes, tho, not sure whether the "if i < ...num_channels" is even
> necessary, as core already checks against real_num_rx_queues.

OK, I'll omit the i < ... check in the v2, then.

> >   and when
> >   summing mlx5e_sq_stats in the latter function, it's up to
> >   priv->channels.params.mqprio.num_tc instead of priv->max_opened_tc.
> > 
> > - mlx5e_get_base_stats accumulates and outputs stats for everything from
> >   priv->channels.params.num_channels to priv->stats_nch, and
> 
> I'm not sure num_channels gets set to 0 when device is down so possibly
> from "0 if down else ...num_channels" to stats_nch.

Yea, it looks like priv->channels.params.num_channels is untouched on
ndo_stop, but:

mlx5e_close (ndo_stop)
  mlx5e_close_locked
    mlx5e_close_channels
      priv->channels.num = 0;

and on open priv->channels.num is restored from 0 to
priv->channels.params.num_channels.

So, priv->channels.num to priv->stats_nch would be, I think, the inactive
queues. I'll give it a try locally real quick.

> >   priv->channels.params.mqprio.num_tc to priv->max_opened_tc... which
> >   should cover the inactive queues, I think.
> > 
> > Just writing that all out to avoid hacking up the wrong thing for the v2
> > and to reduce overall noise on the list :)


* Re: [PATCH net-next 0/1] mlx5: Add netdev-genl queue stats
  2024-05-09  2:08                     ` Jakub Kicinski
  2024-05-09  4:11                       ` Joe Damato
@ 2024-05-09  6:30                       ` Joe Damato
  2024-05-09 10:16                         ` Tariq Toukan
  1 sibling, 1 reply; 21+ messages in thread
From: Joe Damato @ 2024-05-09  6:30 UTC
  To: Jakub Kicinski
  Cc: Tariq Toukan, Zhu Yanjun, linux-kernel, netdev, saeedm, gal,
	nalramli, David S. Miller, Eric Dumazet, Leon Romanovsky,
	open list:MELLANOX MLX5 core VPI driver, Paolo Abeni,
	Tariq Toukan

On Wed, May 08, 2024 at 07:08:39PM -0700, Jakub Kicinski wrote:
> On Thu, 9 May 2024 01:57:52 +0000 Joe Damato wrote:
> > If I'm following that right and understanding mlx5 (two things I am
> > unlikely to do simultaneously), that sounds to me like:
> > 
> > - mlx5e_get_queue_stats_rx and mlx5e_get_queue_stats_tx check if i <
> >   priv->channels.params.num_channels (instead of priv->stats_nch),
> 
> Yes, tho, not sure whether the "if i < ...num_channels" is even
> necessary, as core already checks against real_num_rx_queues.
> 
> >   and when
> >   summing mlx5e_sq_stats in the latter function, it's up to
> >   priv->channels.params.mqprio.num_tc instead of priv->max_opened_tc.
> > 
> > - mlx5e_get_base_stats accumulates and outputs stats for everything from
> >   priv->channels.params.num_channels to priv->stats_nch, and
> 
> I'm not sure num_channels gets set to 0 when device is down so possibly
> from "0 if down else ...num_channels" to stats_nch.

Yea, you were right:

  if (priv->channels.num == 0)
          i = 0;
  else
          i = priv->channels.params.num_channels;
  
  for (; i < priv->stats_nch; i++) {

Seems to be working now when I adjust the queue count and the test is
passing as I adjust the queue count up or down. Cool.

Adding TCs to the NIC triggers the test to fail, so there's still some bug
in how I'm accumulating stats from the hw TCs.


* Re: [PATCH net-next 0/1] mlx5: Add netdev-genl queue stats
  2024-05-08 23:24               ` Joe Damato
  2024-05-09  0:56                 ` Jakub Kicinski
@ 2024-05-09  9:42                 ` Tariq Toukan
  2024-05-09 23:06                   ` Joe Damato
  1 sibling, 1 reply; 21+ messages in thread
From: Tariq Toukan @ 2024-05-09  9:42 UTC
  To: Joe Damato
  Cc: Jakub Kicinski, Zhu Yanjun, linux-kernel, netdev, saeedm, gal,
	nalramli, David S. Miller, Eric Dumazet, Leon Romanovsky,
	open list:MELLANOX MLX5 core VPI driver, Paolo Abeni,
	Tariq Toukan

..

>> The off-channel queues (like PTP) do not exist by default, so they are out
>> of the game unless you explicitly enable them.
> 
> I did not enable them, but if you saw the thread, it sounds like Jakub's
> preference is that in the v2 I include the PTP stats in get_base_stats.
> 
> Are you OK with that?

Sounds good.

> Are there other queue stats I should include as well?
> 

The QOS/HTB queues.
See mlx5e_stats_grp_sw_update_stats_qos.

>> A possible reason for this difference is the queues included in the sum.
>> Our stats are persistent across configuration changes, so they don't reset
>> when the number of channels changes, for example.
>>
>> We keep stats entries for all ring indices that ever existed. Our driver
>> loops and sums up the stats for all of them, while the stack loops only up
>> to the current netdev->real_num_rx_queues.
>>
>> Can this explain the diff here?
> 
> Yes, that was it. Sorry I didn't realize this case. My lab machine runs a
> script to adjust the queue count shortly after booting.
> 
> I disabled that and re-ran:
> 
>    NETIF=eth0 tools/testing/selftests/drivers/net/stats.py
> 
> and all tests pass.
> 

Great!


* Re: [PATCH net-next 0/1] mlx5: Add netdev-genl queue stats
  2024-05-09  6:30                       ` Joe Damato
@ 2024-05-09 10:16                         ` Tariq Toukan
  2024-05-09 23:14                           ` Joe Damato
  2024-05-10  0:31                           ` Joe Damato
  0 siblings, 2 replies; 21+ messages in thread
From: Tariq Toukan @ 2024-05-09 10:16 UTC
  To: Joe Damato, Jakub Kicinski
  Cc: Tariq Toukan, Zhu Yanjun, linux-kernel, netdev, saeedm, gal,
	nalramli, David S. Miller, Eric Dumazet, Leon Romanovsky,
	open list:MELLANOX MLX5 core VPI driver, Paolo Abeni,
	Tariq Toukan



On 09/05/2024 9:30, Joe Damato wrote:
> On Wed, May 08, 2024 at 07:08:39PM -0700, Jakub Kicinski wrote:
>> On Thu, 9 May 2024 01:57:52 +0000 Joe Damato wrote:
>>> If I'm following that right and understanding mlx5 (two things I am
>>> unlikely to do simultaneously), that sounds to me like:
>>>
>>> - mlx5e_get_queue_stats_rx and mlx5e_get_queue_stats_tx check if i <
>>>    priv->channels.params.num_channels (instead of priv->stats_nch),
>>
>> Yes, tho, not sure whether the "if i < ...num_channels" is even
>> necessary, as core already checks against real_num_rx_queues.
>>
>>>    and when
>>>    summing mlx5e_sq_stats in the latter function, it's up to
>>>    priv->channels.params.mqprio.num_tc instead of priv->max_opened_tc.
>>>
>>> - mlx5e_get_base_stats accumulates and outputs stats for everything from
>>>    priv->channels.params.num_channels to priv->stats_nch, and
>>
>> I'm not sure num_channels gets set to 0 when device is down so possibly
>> from "0 if down else ...num_channels" to stats_nch.
> 
> Yea, you were right:
> 
>    if (priv->channels.num == 0)
>            i = 0;
>    else
>            i = priv->channels.params.num_channels;
>    
>    for (; i < priv->stats_nch; i++) {
> 
> Seems to be working now when I adjust the queue count and the test is
> passing as I adjust the queue count up or down. Cool.
> 

I agree that get_base should include all inactive queue stats,
but it's not straightforward to implement.

A few guiding points:

Use mlx5e_get_dcb_num_tc(params) for current num_tc.

txq_ix (within the real_num_tx_queues) is calculated by c->ix + tc * 
params->num_channels.

The txqsq stats struct is chosen by channel_stats[c->ix]->sq[tc].

It means, in the base stats you should include SQ stats for:
1. all SQs of non-active channels, i.e. ch in [params.num_channels, 
priv->stats_nch), tc in [0, priv->max_opened_tc).
2. all SQs of non-active TCs in active channels [0, 
params.num_channels), tc in [mlx5e_get_dcb_num_tc(params), 
priv->max_opened_tc).

Now I actually see that the patch has issues in mlx5e_get_queue_stats_tx.
You should not loop over all TCs of channel index i.
You must do a reverse mapping from "i" to the pair/tuple [ch_ix, tc],
and then access a single TXQ's stats by priv->channel_stats[ch_ix]->sq[tc].
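
A toy C program sketching both points -- the reverse mapping and which
[ch, tc] pairs belong in the base -- with made-up sizes:

#include <stdio.h>

int main(void)
{
	int num_channels = 4; /* params.num_channels (active)       */
	int num_tc       = 2; /* mlx5e_get_dcb_num_tc(params)       */
	int stats_nch    = 6; /* channel stats entries ever created */
	int max_tc       = 3; /* priv->max_opened_tc                */

	/* per-queue callback: txq_ix = ch_ix + tc * num_channels,
	 * so the reverse mapping is a mod/div pair */
	for (int i = 0; i < num_channels * num_tc; i++)
		printf("txq %d -> channel_stats[%d]->sq[%d]\n",
		       i, i % num_channels, i / num_channels);

	/* base stats: every SQ outside the active [ch, tc] rectangle,
	 * i.e. the two ranges listed above */
	for (int ch = 0; ch < stats_nch; ch++)
		for (int tc = 0; tc < max_tc; tc++)
			if (ch >= num_channels || tc >= num_tc)
				printf("base: channel_stats[%d]->sq[%d]\n",
				       ch, tc);
	return 0;
}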

> Adding TCs to the NIC triggers the test to fail, so there's still some bug
> in how I'm accumulating stats from the hw TCs.


* Re: [PATCH net-next 0/1] mlx5: Add netdev-genl queue stats
  2024-05-09  9:42                 ` Tariq Toukan
@ 2024-05-09 23:06                   ` Joe Damato
  0 siblings, 0 replies; 21+ messages in thread
From: Joe Damato @ 2024-05-09 23:06 UTC (permalink / raw)
  To: Tariq Toukan
  Cc: Jakub Kicinski, Zhu Yanjun, linux-kernel, netdev, saeedm, gal,
	nalramli, David S. Miller, Eric Dumazet, Leon Romanovsky,
	open list:MELLANOX MLX5 core VPI driver, Paolo Abeni,
	Tariq Toukan

On Thu, May 09, 2024 at 12:42:11PM +0300, Tariq Toukan wrote:
> ..
> 
> > > The off-channel queues (like PTP) do not exist by default, so they are out
> > > of the game unless you explicitly enable them.
> > 
> > I did not enable them, but if you saw the thread, it sounds like Jakub's
> > preference is that in the v2 I include the PTP stats in get_base_stats.
> > 
> > Are you OK with that?
> 
> Sounds good.
> 
> > Are there other queue stats I should include as well?
> > 
> 
> The QOS/HTB queues.
> See mlx5e_stats_grp_sw_update_stats_qos.

Sure, thanks, I can take a look. I think an issue might be that if
I include QOS/HTB queues then tools/testing/selftests/drivers/net/stats.py
will start to fail.

I could be mistaken, but it seems that QOS/HTB are not included in the rtnl
stats, is that right?

If the goal is for queue stats to match rtnl then maybe I should leave
QOS/HTB out and they can be added to both RTNL and queue stats together at
a later time.


* Re: [PATCH net-next 0/1] mlx5: Add netdev-genl queue stats
  2024-05-09 10:16                         ` Tariq Toukan
@ 2024-05-09 23:14                           ` Joe Damato
  2024-05-10  0:31                           ` Joe Damato
  1 sibling, 0 replies; 21+ messages in thread
From: Joe Damato @ 2024-05-09 23:14 UTC
  To: Tariq Toukan
  Cc: Jakub Kicinski, Zhu Yanjun, linux-kernel, netdev, saeedm, gal,
	nalramli, David S. Miller, Eric Dumazet, Leon Romanovsky,
	open list:MELLANOX MLX5 core VPI driver, Paolo Abeni,
	Tariq Toukan

On Thu, May 09, 2024 at 01:16:15PM +0300, Tariq Toukan wrote:
> 
> 
> On 09/05/2024 9:30, Joe Damato wrote:
> > On Wed, May 08, 2024 at 07:08:39PM -0700, Jakub Kicinski wrote:
> > > On Thu, 9 May 2024 01:57:52 +0000 Joe Damato wrote:
> > > > If I'm following that right and understanding mlx5 (two things I am
> > > > unlikely to do simultaneously), that sounds to me like:
> > > > 
> > > > - mlx5e_get_queue_stats_rx and mlx5e_get_queue_stats_tx check if i <
> > > >    priv->channels.params.num_channels (instead of priv->stats_nch),
> > > 
> > > Yes, tho, not sure whether the "if i < ...num_channels" is even
> > > necessary, as core already checks against real_num_rx_queues.
> > > 
> > > >    and when
> > > >    summing mlx5e_sq_stats in the latter function, it's up to
> > > >    priv->channels.params.mqprio.num_tc instead of priv->max_opened_tc.
> > > > 
> > > > - mlx5e_get_base_stats accumulates and outputs stats for everything from
> > > >    priv->channels.params.num_channels to priv->stats_nch, and
> > > 
> > > I'm not sure num_channels gets set to 0 when device is down so possibly
> > > from "0 if down else ...num_channels" to stats_nch.
> > 
> > Yea, you were right:
> > 
> >    if (priv->channels.num == 0)
> >            i = 0;
> >    else
> >            i = priv->channels.params.num_channels;
> >    for (; i < priv->stats_nch; i++) {
> > 
> > Seems to be working now when I adjust the queue count and the test is
> > passing as I adjust the queue count up or down. Cool.
> > 
> 
> I agree that get_base should include all inactive queue stats,
> but it's not straightforward to implement.
> 
> A few guiding points:

Thanks for the guiding points - it is very helpful.

> Use mlx5e_get_dcb_num_tc(params) for current num_tc.
> 
> txq_ix (within the real_num_tx_queues) is calculated by c->ix + tc *
> params->num_channels.
> 
> The txqsq stats struct is chosen by channel_stats[c->ix]->sq[tc].
> 
> It means, in the base stats you should include SQ stats for:
> 1. all SQs of non-active channels, i.e. ch in [params.num_channels,
> priv->stats_nch), tc in [0, priv->max_opened_tc).
> 2. all SQs of non-active TCs in active channels [0, params.num_channels), tc
> in [mlx5e_get_dcb_num_tc(params), priv->max_opened_tc).

Thanks, yea, this is what I was working on last night -- I realized that I
need to include the non-active TCs on the active channels too, and have
some code that does that.

I'm still off slightly, but am giving it another look now.

> Now I actually see that the patch has issues in mlx5e_get_queue_stats_tx.
> You should not loop over all TCs of channel index i.
> You must do a reverse mapping from "i" to the pair/tuple [ch_ix, tc], and
> then access a single TXQ's stats by priv->channel_stats[ch_ix]->sq[tc].

OK, thanks for explaining that, I'll take a closer look at this as well. 

> > Adding TCs to the NIC triggers the test to fail, so there's still some bug
> > in how I'm accumulating stats from the hw TCs.


* Re: [PATCH net-next 0/1] mlx5: Add netdev-genl queue stats
  2024-05-09 10:16                         ` Tariq Toukan
  2024-05-09 23:14                           ` Joe Damato
@ 2024-05-10  0:31                           ` Joe Damato
  2024-05-10  4:27                             ` Joe Damato
  1 sibling, 1 reply; 21+ messages in thread
From: Joe Damato @ 2024-05-10  0:31 UTC (permalink / raw)
  To: Tariq Toukan
  Cc: Jakub Kicinski, Zhu Yanjun, linux-kernel, netdev, saeedm, gal,
	nalramli, David S. Miller, Eric Dumazet, Leon Romanovsky,
	open list:MELLANOX MLX5 core VPI driver, Paolo Abeni,
	Tariq Toukan

On Thu, May 09, 2024 at 01:16:15PM +0300, Tariq Toukan wrote:
> 
> 
> On 09/05/2024 9:30, Joe Damato wrote:
> > On Wed, May 08, 2024 at 07:08:39PM -0700, Jakub Kicinski wrote:
> > > On Thu, 9 May 2024 01:57:52 +0000 Joe Damato wrote:
> > > > If I'm following that right and understanding mlx5 (two things I am
> > > > unlikely to do simultaneously), that sounds to me like:
> > > > 
> > > > - mlx5e_get_queue_stats_rx and mlx5e_get_queue_stats_tx check if i <
> > > >    priv->channels.params.num_channels (instead of priv->stats_nch),
> > > 
> > > Yes, tho, not sure whether the "if i < ...num_channels" is even
> > > necessary, as core already checks against real_num_rx_queues.
> > > 
> > > >    and when
> > > >    summing mlx5e_sq_stats in the latter function, it's up to
> > > >    priv->channels.params.mqprio.num_tc instead of priv->max_opened_tc.
> > > > 
> > > > - mlx5e_get_base_stats accumulates and outputs stats for everything from
> > > >    priv->channels.params.num_channels to priv->stats_nch, and
> > > 
> > > I'm not sure num_channels gets set to 0 when device is down so possibly
> > > from "0 if down else ...num_channels" to stats_nch.
> > 
> > Yea, you were right:
> > 
> >    if (priv->channels.num == 0)
> >            i = 0;
> >    else
> >            i = priv->channels.params.num_channels;
> >    for (; i < priv->stats_nch; i++) {
> > 
> > Seems to be working now when I adjust the queue count and the test is
> > passing as I adjust the queue count up or down. Cool.
> > 
> 
> I agree that get_base should include all inactive queues' stats.
> But it's not straightforward to implement.
> 
> A few guiding points:
> 
> Use mlx5e_get_dcb_num_tc(params) for current num_tc.
> 
> txq_ix (within the real_num_tx_queues) is calculated by c->ix + tc *
> params->num_channels.
> 
> The txqsq stats struct is chosen by channel_stats[c->ix]->sq[tc].
> 
> It means, in the base stats you should include SQ stats for:
> 1. all SQs of non-active channels, i.e. ch in [params.num_channels,
> priv->stats_nch), tc in [0, priv->max_opened_tc).
> 2. all SQs of non-active TCs in active channels [0, params.num_channels), tc
> in [mlx5e_get_dcb_num_tc(params), priv->max_opened_tc).
> 
> Now I actually see that the patch has issues in mlx5e_get_queue_stats_tx.
> You should not loop over all TCs of channel index i.
> You must do a reverse mapping from "i" to the pair/tuple [ch_ix, tc], and
> then access a single TXQ stats by priv->channel_stats[ch_ix].sq[tc].
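
Just to check my understanding of the mapping with a concrete example:
with num_channels = 4 and 2 TCs, txq_ix 5 would be [ch_ix 1, tc 1]
(since 5 = 1 + 1 * 4), so its stats live in channel_stats[1]->sq[1].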

It looks like txq2sq probably will help with this?

Something like:

for (j = 0; j < mlx5e_get_dcb_num_tc(&priv->channels.params); j++) {
  sq = priv->txq2sq[j];
  if (sq->ch_ix == i) {
    /* this sq->stats is what I need */
  }
}

Is that right?

Not sure if I'm missing something obvious here, sorry, I've been puzzling
over the mlx5 source for a bit.

BTW: kind of related, but in mlx5e_alloc_txqsq the int tc param is unused (I
think). It might be helpful for struct mlx5e_txqsq to have a tc field, set
in mlx5e_alloc_txqsq:

  sq->tc = tc;

Not sure if that'd be helpful in general, but I could send that as a
separate patch.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH net-next 0/1] mlx5: Add netdev-genl queue stats
  2024-05-10  0:31                           ` Joe Damato
@ 2024-05-10  4:27                             ` Joe Damato
  0 siblings, 0 replies; 21+ messages in thread
From: Joe Damato @ 2024-05-10  4:27 UTC (permalink / raw)
  To: Tariq Toukan
  Cc: Jakub Kicinski, Zhu Yanjun, linux-kernel, netdev, saeedm, gal,
	nalramli, David S. Miller, Eric Dumazet, Leon Romanovsky,
	open list:MELLANOX MLX5 core VPI driver, Paolo Abeni,
	Tariq Toukan

On Thu, May 09, 2024 at 05:31:06PM -0700, Joe Damato wrote:
> On Thu, May 09, 2024 at 01:16:15PM +0300, Tariq Toukan wrote:
> > 
> > 
> > On 09/05/2024 9:30, Joe Damato wrote:
> > > On Wed, May 08, 2024 at 07:08:39PM -0700, Jakub Kicinski wrote:
> > > > On Thu, 9 May 2024 01:57:52 +0000 Joe Damato wrote:
> > > > > If I'm following that right and understanding mlx5 (two things I am
> > > > > unlikely to do simultaneously), that sounds to me like:
> > > > > 
> > > > > - mlx5e_get_queue_stats_rx and mlx5e_get_queue_stats_tx check if i <
> > > > >    priv->channels.params.num_channels (instead of priv->stats_nch),
> > > > 
> > > > Yes, tho, not sure whether the "if i < ...num_channels" is even
> > > > necessary, as core already checks against real_num_rx_queues.
> > > > 
> > > > >    and when
> > > > >    summing mlx5e_sq_stats in the latter function, it's up to
> > > > >    priv->channels.params.mqprio.num_tc instead of priv->max_opened_tc.
> > > > > 
> > > > > - mlx5e_get_base_stats accumulates and outputs stats for everything from
> > > > >    priv->channels.params.num_channels to priv->stats_nch, and
> > > > 
> > > > I'm not sure num_channels gets set to 0 when device is down so possibly
> > > > from "0 if down else ...num_channels" to stats_nch.
> > > 
> > > Yea, you were right:
> > > 
> > >    if (priv->channels.num == 0)
> > >            i = 0;
> > >    else
> > >            i = priv->channels.params.num_channels;
> > >    for (; i < priv->stats_nch; i++) {
> > > 
> > > Seems to be working now when I adjust the queue count and the test is
> > > passing as I adjust the queue count up or down. Cool.
> > > 
> > 
> > I agree that get_base should include all inactive queues' stats.
> > But it's not straightforward to implement.
> > 
> > A few guiding points:
> > 
> > Use mlx5e_get_dcb_num_tc(params) for current num_tc.
> > 
> > txq_ix (within the real_num_tx_queues) is calculated by c->ix + tc *
> > params->num_channels.
> > 
> > The txqsq stats struct is chosen by channel_stats[c->ix]->sq[tc].
> > 
> > It means, in the base stats you should include SQ stats for:
> > 1. all SQs of non-active channels, i.e. ch in [params.num_channels,
> > priv->stats_nch), tc in [0, priv->max_opened_tc).
> > 2. all SQs of non-active TCs in active channels [0, params.num_channels), tc
> > in [mlx5e_get_dcb_num_tc(params), priv->max_opened_tc).
> > 
> > Now I actually see that the patch has issues in mlx5e_get_queue_stats_tx.
> > You should not loop over all TCs of channel index i.
> > You must do a reverse mapping from "i" to the pair/tuple [ch_ix, tc], and
> > then access a single TXQ stats by priv->channel_stats[ch_ix].sq[tc].
> 
> It looks like txq2sq probably will help with this?
> 
> Something like:
> 
> for (j = 0; j < mlx5e_get_dcb_num_tc(&priv->channels.params); j++) {
>   sq = priv->txq2sq[j];
>   if (sq->ch_ix == i) {
>     /* this sq->stats is what I need */
>   }
> }
> 
> Is that right?

This was incorrect, but I think I got it in the v2 I just sent out. When
you have the time, please take a look at that version.

Thanks for the guidance, it was very helpful.
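
For reference, the reverse mapping you described can be sketched like this
(illustrative only -- it inverts txq_ix = ch_ix + tc * num_channels, uses
field names from my reading of the driver, and is not necessarily what v2
does verbatim, e.g. v2 may go through txq2sq instead):

  static void mlx5e_get_queue_stats_tx(struct net_device *dev, int i,
                                       struct netdev_queue_stats_tx *stats)
  {
          struct mlx5e_priv *priv = netdev_priv(dev);
          unsigned int nch = priv->channels.params.num_channels;
          struct mlx5e_sq_stats *sq_stats;

          /* assumes the device is up (nch > 0) and the core has already
           * bounds-checked i against real_num_tx_queues; invert
           * txq_ix = ch_ix + tc * num_channels to get [ch_ix, tc]
           */
          sq_stats = &priv->channel_stats[i % nch]->sq[i / nch];

          stats->packets = sq_stats->packets;
          stats->bytes = sq_stats->bytes;
  }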
 
> Not sure if I'm missing something obvious here, sorry, I've been puzzling
> over the mlx5 source for a bit.
> 
> BTW: kind of related, but in mlx5e_alloc_txqsq the int tc param is unused (I
> think). It might be helpful for struct mlx5e_txqsq to have a tc field, set
> in mlx5e_alloc_txqsq:
> 
>   sq->tc = tc;
> 
> Not sure if that'd be helpful in general, but I could send that as a
> separate patch.

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2024-05-10  4:27 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-05-03  2:25 [PATCH net-next 0/1] mlx5: Add netdev-genl queue stats Joe Damato
2024-05-03  2:25 ` [PATCH net-next 1/1] net/mlx5e: Add per queue netdev-genl stats Joe Damato
2024-05-03 10:55 ` [PATCH net-next 0/1] mlx5: Add netdev-genl queue stats Zhu Yanjun
2024-05-03 18:43   ` Joe Damato
2024-05-03 21:58     ` Jakub Kicinski
2024-05-03 23:53       ` Joe Damato
2024-05-04  0:34         ` Jakub Kicinski
2024-05-06 18:04           ` Joe Damato
2024-05-08 21:40             ` Tariq Toukan
2024-05-08 23:24               ` Joe Damato
2024-05-09  0:56                 ` Jakub Kicinski
2024-05-09  1:57                   ` Joe Damato
2024-05-09  2:08                     ` Jakub Kicinski
2024-05-09  4:11                       ` Joe Damato
2024-05-09  6:30                       ` Joe Damato
2024-05-09 10:16                         ` Tariq Toukan
2024-05-09 23:14                           ` Joe Damato
2024-05-10  0:31                           ` Joe Damato
2024-05-10  4:27                             ` Joe Damato
2024-05-09  9:42                 ` Tariq Toukan
2024-05-09 23:06                   ` Joe Damato
