linux-rdma.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next 0/3] mlx4: Add support for netdev-genl API
@ 2024-04-23 19:49 Joe Damato
  2024-04-23 19:49 ` [PATCH net-next 1/3] net/mlx4: Track RX allocation failures in a stat Joe Damato
                   ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Joe Damato @ 2024-04-23 19:49 UTC (permalink / raw)
  To: linux-kernel, netdev, tariqt, saeedm
  Cc: mkarsten, gal, nalramli, Joe Damato, David S. Miller,
	Eric Dumazet, Jakub Kicinski,
	open list:MELLANOX MLX4 core VPI driver, Paolo Abeni

Hi:

This series adds support to mlx4 for the netdev-genl API which makes it
much easier for users and user programs to map NAPI IDs back to
ifindexes, queues, and IRQs. This is extremely useful for a number of
use cases, including epoll-based busy poll.

In addition, this series includes a patch to generate per-queue
statistics using the netlink API, as well.

To facilitate the stats, patch 1/3 makes use of an existing field,
"dropped" which was already being exported in the ethtool stats by the
driver, but was never incremented. As of patch 1/3, it is now being
incremented by the driver in an appropriate place and used in patch 3/3
as alloc_fail.

Please note: I do not have access to mlx4 hardware, but I've been
working closely with Martin Karsten from University of Waterloo (CC'd)
who has very graciously tested my patches on their mlx4 hardware (hence
his Tested-by attribution in each commit). His latest research work is
particularly interesting [1] and this series helps to support that (and
future) work.

[1]: https://dl.acm.org/doi/pdf/10.1145/3626780

Thanks,
Joe

Joe Damato (3):
  net/mlx4: Track RX allocation failures in a stat
  net/mlx4: link NAPI instances to queues and IRQs
  net/mlx4: support per-queue statistics via netlink

 drivers/net/ethernet/mellanox/mlx4/en_cq.c    | 14 +++
 .../net/ethernet/mellanox/mlx4/en_netdev.c    | 91 +++++++++++++++++++
 drivers/net/ethernet/mellanox/mlx4/en_port.c  |  4 +-
 drivers/net/ethernet/mellanox/mlx4/en_rx.c    |  4 +-
 drivers/net/ethernet/mellanox/mlx4/mlx4_en.h  |  1 +
 5 files changed, 112 insertions(+), 2 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH net-next 1/3] net/mlx4: Track RX allocation failures in a stat
  2024-04-23 19:49 [PATCH net-next 0/3] mlx4: Add support for netdev-genl API Joe Damato
@ 2024-04-23 19:49 ` Joe Damato
  2024-04-23 22:38   ` Joe Damato
  2024-04-24  6:27   ` kernel test robot
  2024-04-23 19:49 ` [PATCH net-next 2/3] net/mlx4: link NAPI instances to queues and IRQs Joe Damato
  2024-04-23 19:49 ` [PATCH net-next 3/3] net/mlx4: support per-queue statistics via netlink Joe Damato
  2 siblings, 2 replies; 16+ messages in thread
From: Joe Damato @ 2024-04-23 19:49 UTC (permalink / raw)
  To: linux-kernel, netdev, tariqt, saeedm
  Cc: mkarsten, gal, nalramli, Joe Damato, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	open list:MELLANOX MLX4 core VPI driver

mlx4_en_alloc_frags currently returns -ENOMEM when mlx4_alloc_page
fails but does not increment a stat field when this occurs.

struct mlx4_en_rx_ring has a dropped field which is tabulated in
mlx4_en_DUMP_ETH_STATS, but never incremented by the driver.

This change modifies mlx4_en_alloc_frags to increment mlx4_en_rx_ring's
dropped field for the -ENOMEM case.

Signed-off-by: Joe Damato <jdamato@fastly.com>
Tested-by: Martin Karsten <mkarsten@uwaterloo.ca>
---
 drivers/net/ethernet/mellanox/mlx4/en_port.c | 4 +++-
 drivers/net/ethernet/mellanox/mlx4/en_rx.c   | 4 +++-
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_port.c b/drivers/net/ethernet/mellanox/mlx4/en_port.c
index 532997eba698..c4b1062158e1 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_port.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_port.c
@@ -151,7 +151,7 @@ void mlx4_en_fold_software_stats(struct net_device *dev)
 {
 	struct mlx4_en_priv *priv = netdev_priv(dev);
 	struct mlx4_en_dev *mdev = priv->mdev;
-	unsigned long packets, bytes;
+	unsigned long packets, bytes, dropped;
 	int i;
 
 	if (!priv->port_up || mlx4_is_master(mdev->dev))
@@ -164,9 +164,11 @@ void mlx4_en_fold_software_stats(struct net_device *dev)
 
 		packets += READ_ONCE(ring->packets);
 		bytes   += READ_ONCE(ring->bytes);
+		dropped += READ_ONCE(ring->dropped);
 	}
 	dev->stats.rx_packets = packets;
 	dev->stats.rx_bytes = bytes;
+	dev->stats.rx_missed_errors = dropped;
 
 	packets = 0;
 	bytes = 0;
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
index 8328df8645d5..573ae10300c7 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
@@ -82,8 +82,10 @@ static int mlx4_en_alloc_frags(struct mlx4_en_priv *priv,
 
 	for (i = 0; i < priv->num_frags; i++, frags++) {
 		if (!frags->page) {
-			if (mlx4_alloc_page(priv, frags, gfp))
+			if (mlx4_alloc_page(priv, frags, gfp)) {
+				ring->dropped++;
 				return -ENOMEM;
+			}
 			ring->rx_alloc_pages++;
 		}
 		rx_desc->data[i].addr = cpu_to_be64(frags->dma +
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH net-next 2/3] net/mlx4: link NAPI instances to queues and IRQs
  2024-04-23 19:49 [PATCH net-next 0/3] mlx4: Add support for netdev-genl API Joe Damato
  2024-04-23 19:49 ` [PATCH net-next 1/3] net/mlx4: Track RX allocation failures in a stat Joe Damato
@ 2024-04-23 19:49 ` Joe Damato
  2024-04-24  7:41   ` kernel test robot
  2024-04-23 19:49 ` [PATCH net-next 3/3] net/mlx4: support per-queue statistics via netlink Joe Damato
  2 siblings, 1 reply; 16+ messages in thread
From: Joe Damato @ 2024-04-23 19:49 UTC (permalink / raw)
  To: linux-kernel, netdev, tariqt, saeedm
  Cc: mkarsten, gal, nalramli, Joe Damato, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	open list:MELLANOX MLX4 core VPI driver

Make mlx4 compatible with the newly added netlink queue GET APIs.

Signed-off-by: Joe Damato <jdamato@fastly.com>
Tested-by: Martin Karsten <mkarsten@uwaterloo.ca>
---
 drivers/net/ethernet/mellanox/mlx4/en_cq.c   | 14 ++++++++++++++
 drivers/net/ethernet/mellanox/mlx4/mlx4_en.h |  1 +
 2 files changed, 15 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_cq.c b/drivers/net/ethernet/mellanox/mlx4/en_cq.c
index 1184ac5751e1..d212ed9bfd60 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_cq.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_cq.c
@@ -126,6 +126,7 @@ int mlx4_en_activate_cq(struct mlx4_en_priv *priv, struct mlx4_en_cq *cq,
 		cq_idx = cq_idx % priv->rx_ring_num;
 		rx_cq = priv->rx_cq[cq_idx];
 		cq->vector = rx_cq->vector;
+		irq = mlx4_eq_get_irq(mdev->dev, cq->vector);
 	}
 
 	if (cq->type == RX)
@@ -142,18 +143,23 @@ int mlx4_en_activate_cq(struct mlx4_en_priv *priv, struct mlx4_en_cq *cq,
 	if (err)
 		goto free_eq;
 
+	cq->cq_idx = cq_idx;
 	cq->mcq.event = mlx4_en_cq_event;
 
 	switch (cq->type) {
 	case TX:
 		cq->mcq.comp = mlx4_en_tx_irq;
 		netif_napi_add_tx(cq->dev, &cq->napi, mlx4_en_poll_tx_cq);
+		netif_napi_set_irq(&cq->napi, irq);
 		napi_enable(&cq->napi);
+		netif_queue_set_napi(cq->dev, cq_idx, NETDEV_QUEUE_TYPE_TX, &cq->napi);
 		break;
 	case RX:
 		cq->mcq.comp = mlx4_en_rx_irq;
 		netif_napi_add(cq->dev, &cq->napi, mlx4_en_poll_rx_cq);
+		netif_napi_set_irq(&cq->napi, irq);
 		napi_enable(&cq->napi);
+		netif_queue_set_napi(cq->dev, cq_idx, NETDEV_QUEUE_TYPE_RX, &cq->napi);
 		break;
 	case TX_XDP:
 		/* nothing regarding napi, it's shared with rx ring */
@@ -188,7 +194,15 @@ void mlx4_en_destroy_cq(struct mlx4_en_priv *priv, struct mlx4_en_cq **pcq)
 
 void mlx4_en_deactivate_cq(struct mlx4_en_priv *priv, struct mlx4_en_cq *cq)
 {
+	enum netdev_queue_type qtype;
+
 	if (cq->type != TX_XDP) {
+		if (cq->type == RX)
+			qtype = NETDEV_QUEUE_TYPE_RX;
+		else if (cq->type == TX)
+			qtype = NETDEV_QUEUE_TYPE_TX;
+
+		netif_queue_set_napi(cq->dev, cq->cq_idx, qtype, NULL);
 		napi_disable(&cq->napi);
 		netif_napi_del(&cq->napi);
 	}
diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
index efe3f97b874f..896f985549a4 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
@@ -379,6 +379,7 @@ struct mlx4_en_cq {
 #define MLX4_EN_OPCODE_ERROR	0x1e
 
 	const struct cpumask *aff_mask;
+	int cq_idx;
 };
 
 struct mlx4_en_port_profile {
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH net-next 3/3] net/mlx4: support per-queue statistics via netlink
  2024-04-23 19:49 [PATCH net-next 0/3] mlx4: Add support for netdev-genl API Joe Damato
  2024-04-23 19:49 ` [PATCH net-next 1/3] net/mlx4: Track RX allocation failures in a stat Joe Damato
  2024-04-23 19:49 ` [PATCH net-next 2/3] net/mlx4: link NAPI instances to queues and IRQs Joe Damato
@ 2024-04-23 19:49 ` Joe Damato
  2024-04-23 22:42   ` Joe Damato
  2 siblings, 1 reply; 16+ messages in thread
From: Joe Damato @ 2024-04-23 19:49 UTC (permalink / raw)
  To: linux-kernel, netdev, tariqt, saeedm
  Cc: mkarsten, gal, nalramli, Joe Damato, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	open list:MELLANOX MLX4 core VPI driver

Make mlx4 compatible with the newly added netlink queue stats API.

Signed-off-by: Joe Damato <jdamato@fastly.com>
Tested-by: Martin Karsten <mkarsten@uwaterloo.ca>
---
 .../net/ethernet/mellanox/mlx4/en_netdev.c    | 91 +++++++++++++++++++
 1 file changed, 91 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
index 5d3fde63b273..c7f04d4820c6 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
@@ -43,6 +43,7 @@
 #include <net/vxlan.h>
 #include <net/devlink.h>
 #include <net/rps.h>
+#include <net/netdev_queues.h>
 
 #include <linux/mlx4/driver.h>
 #include <linux/mlx4/device.h>
@@ -3099,6 +3100,95 @@ void mlx4_en_set_stats_bitmap(struct mlx4_dev *dev,
 	last_i += NUM_PHY_STATS;
 }
 
+static void mlx4_get_queue_stats_rx(struct net_device *dev, int i,
+				    struct netdev_queue_stats_rx *stats)
+{
+	struct mlx4_en_priv *priv = netdev_priv(dev);
+	const struct mlx4_en_rx_ring *ring;
+
+	stats->packets = 0xff;
+	stats->bytes = 0xff;
+	stats->alloc_fail = 0xff;
+
+	spin_lock_bh(&priv->stats_lock);
+
+	if (!priv->port_up || mlx4_is_master(priv->mdev->dev))
+		goto out_unlock;
+
+	ring = priv->rx_ring[i];
+	stats->packets = READ_ONCE(ring->packets);
+	stats->bytes   = READ_ONCE(ring->bytes);
+	stats->alloc_fail = READ_ONCE(ring->dropped);
+
+out_unlock:
+	spin_unlock_bh(&priv->stats_lock);
+}
+
+static void mlx4_get_queue_stats_tx(struct net_device *dev, int i,
+				    struct netdev_queue_stats_tx *stats)
+{
+	struct mlx4_en_priv *priv = netdev_priv(dev);
+	const struct mlx4_en_tx_ring *ring;
+
+	stats->packets = 0xff;
+	stats->bytes = 0xff;
+
+	spin_lock_bh(&priv->stats_lock);
+
+	if (!priv->port_up || mlx4_is_master(priv->mdev->dev))
+		goto out_unlock;
+
+	ring = priv->tx_ring[TX][i];
+	stats->packets = READ_ONCE(ring->packets);
+	stats->bytes   = READ_ONCE(ring->bytes);
+
+out_unlock:
+	spin_unlock_bh(&priv->stats_lock);
+}
+
+static void mlx4_get_base_stats(struct net_device *dev,
+				struct netdev_queue_stats_rx *rx,
+				struct netdev_queue_stats_tx *tx)
+{
+	struct mlx4_en_priv *priv = netdev_priv(dev);
+	int i;
+
+	rx->packets = 0xff;
+	rx->bytes = 0xff;
+	rx->alloc_fail = 0xff;
+	tx->packets = 0xff;
+	tx->bytes = 0xff;
+
+	spin_lock_bh(&priv->stats_lock);
+
+	if (!priv->port_up || mlx4_is_master(priv->mdev->dev))
+		goto out_unlock;
+
+	for (i = 0; i < priv->rx_ring_num; i++) {
+		const struct mlx4_en_rx_ring *ring = priv->rx_ring[i];
+
+		rx->packets += READ_ONCE(ring->packets);
+		rx->bytes += READ_ONCE(ring->bytes);
+		rx->alloc_fail += READ_ONCE(ring->dropped);
+	}
+
+	for (i = 0; i < priv->tx_ring_num[TX]; i++) {
+		const struct mlx4_en_tx_ring *ring = priv->tx_ring[TX][i];
+
+		tx->packets += READ_ONCE(ring->packets);
+		tx->bytes   += READ_ONCE(ring->bytes);
+	}
+
+out_unlock:
+	spin_unlock_bh(&priv->stats_lock);
+}
+
+static const struct netdev_stat_ops mlx4_stat_ops = {
+	.get_queue_stats_rx     = mlx4_get_queue_stats_rx,
+	.get_queue_stats_tx     = mlx4_get_queue_stats_tx,
+	.get_base_stats         = mlx4_get_base_stats,
+};
+
 int mlx4_en_init_netdev(struct mlx4_en_dev *mdev, int port,
 			struct mlx4_en_port_profile *prof)
 {
@@ -3262,6 +3352,7 @@ int mlx4_en_init_netdev(struct mlx4_en_dev *mdev, int port,
 	netif_set_real_num_tx_queues(dev, priv->tx_ring_num[TX]);
 	netif_set_real_num_rx_queues(dev, priv->rx_ring_num);
 
+	dev->stat_ops = &mlx4_stat_ops;
 	dev->ethtool_ops = &mlx4_en_ethtool_ops;
 
 	/*
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 1/3] net/mlx4: Track RX allocation failures in a stat
  2024-04-23 19:49 ` [PATCH net-next 1/3] net/mlx4: Track RX allocation failures in a stat Joe Damato
@ 2024-04-23 22:38   ` Joe Damato
  2024-04-24  6:27   ` kernel test robot
  1 sibling, 0 replies; 16+ messages in thread
From: Joe Damato @ 2024-04-23 22:38 UTC (permalink / raw)
  To: linux-kernel, netdev, tariqt, saeedm
  Cc: mkarsten, gal, nalramli, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni,
	open list:MELLANOX MLX4 core VPI driver

On Tue, Apr 23, 2024 at 07:49:28PM +0000, Joe Damato wrote:
> mlx4_en_alloc_frags currently returns -ENOMEM when mlx4_alloc_page
> fails but does not increment a stat field when this occurs.
> 
> struct mlx4_en_rx_ring has a dropped field which is tabulated in
> mlx4_en_DUMP_ETH_STATS, but never incremented by the driver.
> 
> This change modifies mlx4_en_alloc_frags to increment mlx4_en_rx_ring's
> dropped field for the -ENOMEM case.
> 
> Signed-off-by: Joe Damato <jdamato@fastly.com>
> Tested-by: Martin Karsten <mkarsten@uwaterloo.ca>
> ---
>  drivers/net/ethernet/mellanox/mlx4/en_port.c | 4 +++-
>  drivers/net/ethernet/mellanox/mlx4/en_rx.c   | 4 +++-
>  2 files changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/ethernet/mellanox/mlx4/en_port.c b/drivers/net/ethernet/mellanox/mlx4/en_port.c
> index 532997eba698..c4b1062158e1 100644
> --- a/drivers/net/ethernet/mellanox/mlx4/en_port.c
> +++ b/drivers/net/ethernet/mellanox/mlx4/en_port.c
> @@ -151,7 +151,7 @@ void mlx4_en_fold_software_stats(struct net_device *dev)
>  {
>  	struct mlx4_en_priv *priv = netdev_priv(dev);
>  	struct mlx4_en_dev *mdev = priv->mdev;
> -	unsigned long packets, bytes;
> +	unsigned long packets, bytes, dropped;

Err, sorry forgot to initialize this to 0 below with the rest of the init.
Will fix this in v2 (and whatever other feedback there is). Apologies on
the error here on my part.

>  	int i;
>  
>  	if (!priv->port_up || mlx4_is_master(mdev->dev))
> @@ -164,9 +164,11 @@ void mlx4_en_fold_software_stats(struct net_device *dev)
>  
>  		packets += READ_ONCE(ring->packets);
>  		bytes   += READ_ONCE(ring->bytes);
> +		dropped += READ_ONCE(ring->dropped);
>  	}
>  	dev->stats.rx_packets = packets;
>  	dev->stats.rx_bytes = bytes;
> +	dev->stats.rx_missed_errors = dropped;
>  
>  	packets = 0;
>  	bytes = 0;
> diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
> index 8328df8645d5..573ae10300c7 100644
> --- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
> +++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
> @@ -82,8 +82,10 @@ static int mlx4_en_alloc_frags(struct mlx4_en_priv *priv,
>  
>  	for (i = 0; i < priv->num_frags; i++, frags++) {
>  		if (!frags->page) {
> -			if (mlx4_alloc_page(priv, frags, gfp))
> +			if (mlx4_alloc_page(priv, frags, gfp)) {
> +				ring->dropped++;
>  				return -ENOMEM;
> +			}
>  			ring->rx_alloc_pages++;
>  		}
>  		rx_desc->data[i].addr = cpu_to_be64(frags->dma +
> -- 
> 2.25.1
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 3/3] net/mlx4: support per-queue statistics via netlink
  2024-04-23 19:49 ` [PATCH net-next 3/3] net/mlx4: support per-queue statistics via netlink Joe Damato
@ 2024-04-23 22:42   ` Joe Damato
  2024-04-24  0:57     ` Jakub Kicinski
  0 siblings, 1 reply; 16+ messages in thread
From: Joe Damato @ 2024-04-23 22:42 UTC (permalink / raw)
  To: linux-kernel, netdev, tariqt, saeedm
  Cc: mkarsten, gal, nalramli, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni,
	open list:MELLANOX MLX4 core VPI driver

On Tue, Apr 23, 2024 at 07:49:30PM +0000, Joe Damato wrote:
> Make mlx4 compatible with the newly added netlink queue stats API.
> 
> Signed-off-by: Joe Damato <jdamato@fastly.com>
> Tested-by: Martin Karsten <mkarsten@uwaterloo.ca>
> ---
>  .../net/ethernet/mellanox/mlx4/en_netdev.c    | 91 +++++++++++++++++++
>  1 file changed, 91 insertions(+)
> 
> diff --git a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
> index 5d3fde63b273..c7f04d4820c6 100644
> --- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
> +++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
> @@ -43,6 +43,7 @@
>  #include <net/vxlan.h>
>  #include <net/devlink.h>
>  #include <net/rps.h>
> +#include <net/netdev_queues.h>
>  
>  #include <linux/mlx4/driver.h>
>  #include <linux/mlx4/device.h>
> @@ -3099,6 +3100,95 @@ void mlx4_en_set_stats_bitmap(struct mlx4_dev *dev,
>  	last_i += NUM_PHY_STATS;
>  }
>  
> +static void mlx4_get_queue_stats_rx(struct net_device *dev, int i,
> +				    struct netdev_queue_stats_rx *stats)
> +{
> +	struct mlx4_en_priv *priv = netdev_priv(dev);
> +	const struct mlx4_en_rx_ring *ring;
> +
> +	stats->packets = 0xff;
> +	stats->bytes = 0xff;
> +	stats->alloc_fail = 0xff;
> +
> +	spin_lock_bh(&priv->stats_lock);
> +
> +	if (!priv->port_up || mlx4_is_master(priv->mdev->dev))
> +		goto out_unlock;
> +
> +	ring = priv->rx_ring[i];
> +	stats->packets = READ_ONCE(ring->packets);
> +	stats->bytes   = READ_ONCE(ring->bytes);
> +	stats->alloc_fail = READ_ONCE(ring->dropped);
> +
> +out_unlock:
> +	spin_unlock_bh(&priv->stats_lock);
> +}
> +
> +static void mlx4_get_queue_stats_tx(struct net_device *dev, int i,
> +				    struct netdev_queue_stats_tx *stats)
> +{
> +	struct mlx4_en_priv *priv = netdev_priv(dev);
> +	const struct mlx4_en_tx_ring *ring;
> +
> +	stats->packets = 0xff;
> +	stats->bytes = 0xff;
> +
> +	spin_lock_bh(&priv->stats_lock);
> +
> +	if (!priv->port_up || mlx4_is_master(priv->mdev->dev))
> +		goto out_unlock;
> +
> +	ring = priv->tx_ring[TX][i];
> +	stats->packets = READ_ONCE(ring->packets);
> +	stats->bytes   = READ_ONCE(ring->bytes);
> +
> +out_unlock:
> +	spin_unlock_bh(&priv->stats_lock);
> +}
> +
> +static void mlx4_get_base_stats(struct net_device *dev,
> +				struct netdev_queue_stats_rx *rx,
> +				struct netdev_queue_stats_tx *tx)
> +{
> +	struct mlx4_en_priv *priv = netdev_priv(dev);
> +	int i;
> +
> +	rx->packets = 0xff;
> +	rx->bytes = 0xff;
> +	rx->alloc_fail = 0xff;
> +	tx->packets = 0xff;
> +	tx->bytes = 0xff;
> +
> +	spin_lock_bh(&priv->stats_lock);
> +
> +	if (!priv->port_up || mlx4_is_master(priv->mdev->dev))
> +		goto out_unlock;

I realized in this case, I'll need to set the fields initialized to 0xff
above to 0 before doing the increments below.

Sorry about that; just realized that now and will fix that in the v2 (along
with any other feedback I get), probably something:

  if (priv->rx_ring_num) {
          rx->packets = 0;
          rx->bytes = 0;
          rx->alloc_fail = 0;
  }

Here for the RX side and see below for the TX side.

> +	for (i = 0; i < priv->rx_ring_num; i++) {
> +		const struct mlx4_en_rx_ring *ring = priv->rx_ring[i];
> +
> +		rx->packets += READ_ONCE(ring->packets);
> +		rx->bytes += READ_ONCE(ring->bytes);
> +		rx->alloc_fail += READ_ONCE(ring->dropped);
> +	}

Similar to above, probably will fix with something like this here:

  if (priv->tx_ring_num[TX]) {
          tx->packets = 0;
          tx->bytes = 0;
  }

Sorry for the noise, I should have noticed this before sending it out.

> +	for (i = 0; i < priv->tx_ring_num[TX]; i++) {
> +		const struct mlx4_en_tx_ring *ring = priv->tx_ring[TX][i];
> +
> +		tx->packets += READ_ONCE(ring->packets);
> +		tx->bytes   += READ_ONCE(ring->bytes);
> +	}
> +
> +out_unlock:
> +	spin_unlock_bh(&priv->stats_lock);
> +}
> +
> +static const struct netdev_stat_ops mlx4_stat_ops = {
> +	.get_queue_stats_rx     = mlx4_get_queue_stats_rx,
> +	.get_queue_stats_tx     = mlx4_get_queue_stats_tx,
> +	.get_base_stats         = mlx4_get_base_stats,
> +};
> +
>  int mlx4_en_init_netdev(struct mlx4_en_dev *mdev, int port,
>  			struct mlx4_en_port_profile *prof)
>  {
> @@ -3262,6 +3352,7 @@ int mlx4_en_init_netdev(struct mlx4_en_dev *mdev, int port,
>  	netif_set_real_num_tx_queues(dev, priv->tx_ring_num[TX]);
>  	netif_set_real_num_rx_queues(dev, priv->rx_ring_num);
>  
> +	dev->stat_ops = &mlx4_stat_ops;
>  	dev->ethtool_ops = &mlx4_en_ethtool_ops;
>  
>  	/*
> -- 
> 2.25.1
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 3/3] net/mlx4: support per-queue statistics via netlink
  2024-04-23 22:42   ` Joe Damato
@ 2024-04-24  0:57     ` Jakub Kicinski
  2024-04-24  5:54       ` Joe Damato
  0 siblings, 1 reply; 16+ messages in thread
From: Jakub Kicinski @ 2024-04-24  0:57 UTC (permalink / raw)
  To: Joe Damato
  Cc: linux-kernel, netdev, tariqt, saeedm, mkarsten, gal, nalramli,
	David S. Miller, Eric Dumazet, Paolo Abeni,
	open list:MELLANOX MLX4 core VPI driver

On Tue, 23 Apr 2024 12:42:13 -1000 Joe Damato wrote:
> I realized in this case, I'll need to set the fields initialized to 0xff
> above to 0 before doing the increments below.

I don't know mlx4 very well, but glancing at the code - are you sure we
need to loop over the queues is the "base" callbacks?

The base callbacks are for getting "historical" data, i.e. info which
was associated with queues which are no longer present. You seem to
sweep all queues, so I'd have expected "base" to just set the values 
to 0. And the real values to come from the per-queue callbacks.

The init to 0xff looks quite sus.

Also what does this:

>	if (!priv->port_up || mlx4_is_master(priv->mdev->dev))

do? 🤔️ what's a "master" in this context?

> Sorry about that; just realized that now and will fix that in the v2 (along
> with any other feedback I get), probably something:
> 
>   if (priv->rx_ring_num) {
>           rx->packets = 0;
>           rx->bytes = 0;
>           rx->alloc_fail = 0;
>   }
> 
> Here for the RX side and see below for the TX side.

FWIW I added a simple test for making sure queue stats match interface
stats, it's tools/testing/selftests/drivers/net/stats.py

You have to export NETIF=$name to make it run on a real interface.

To copy the tests to a remote machine I do:

make -C tools/testing/selftests/ TARGETS="net drivers/net drivers/net/hw" install INSTALL_PATH=/tmp/ksft-net-drv
rsync -ra --delete /tmp/ksft-net-drv root@${machine}:/root/

HTH

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 3/3] net/mlx4: support per-queue statistics via netlink
  2024-04-24  0:57     ` Jakub Kicinski
@ 2024-04-24  5:54       ` Joe Damato
  2024-04-24 14:28         ` Jakub Kicinski
  0 siblings, 1 reply; 16+ messages in thread
From: Joe Damato @ 2024-04-24  5:54 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: linux-kernel, netdev, tariqt, saeedm, mkarsten, gal, nalramli,
	David S. Miller, Eric Dumazet, Paolo Abeni,
	open list:MELLANOX MLX4 core VPI driver

On Tue, Apr 23, 2024 at 05:57:18PM -0700, Jakub Kicinski wrote:
> On Tue, 23 Apr 2024 12:42:13 -1000 Joe Damato wrote:
> > I realized in this case, I'll need to set the fields initialized to 0xff
> > above to 0 before doing the increments below.
> 
> I don't know mlx4 very well, but glancing at the code - are you sure we
> need to loop over the queues is the "base" callbacks?
> 
> The base callbacks are for getting "historical" data, i.e. info which
> was associated with queues which are no longer present. You seem to
> sweep all queues, so I'd have expected "base" to just set the values 
> to 0. And the real values to come from the per-queue callbacks.

Hmm. Sorry I must have totally misunderstood what the purpose of "base"
was. I've just now more closely looked at bnxt which (maybe?) is the only
driver that implements base and I think maybe I kind of get it now.

For some reason, I thought it meant "the total stats of all queues"; I didn't
know it was intended to provide "historical" data as you say.

Making it set everything to 0 makes sense to me. I suppose I could also simply
omit it? What do you think?

> The init to 0xff looks quite sus.

Yes the init to 0xff is wrong, too. I noticed that, as well.

Here's what I have listed so far in my changelog for the v2 (which I haven't
sent yet), but perhaps the maintainers of mlx4 can weigh in?

v1 -> v2:
 - Patch 1/3 now initializes dropped to 0.
 - Patch 3/3 includes several changes:
   - mlx4_get_queue_stats_rx and mlx4_get_queue_stats_tx check if i is
     valid before proceeding.
   - All initialization to 0xff for stats fields has been omit. The
     network stack does this before calling into the driver functions, so
     I've adjusted the driver functions to only set values if there is
     data to set, leaving the network stack's 0xff in place if not.
   - mlx4_get_base_stats sets all stats to 0 (no locking etc needed).

Let me know if that sounds vaguely correct?

> Also what does this:
> 
> >	if (!priv->port_up || mlx4_is_master(priv->mdev->dev))
> 
> do? 🤔️ what's a "master" in this context?

I have a guess, but I'd rather let the Mellanox folks provide the official
answer :)

> > Sorry about that; just realized that now and will fix that in the v2 (along
> > with any other feedback I get), probably something:
> > 
> >   if (priv->rx_ring_num) {
> >           rx->packets = 0;
> >           rx->bytes = 0;
> >           rx->alloc_fail = 0;
> >   }
> > 
> > Here for the RX side and see below for the TX side.
> 
> FWIW I added a simple test for making sure queue stats match interface
> stats, it's tools/testing/selftests/drivers/net/stats.py
> 
> You have to export NETIF=$name to make it run on a real interface.
> 
> To copy the tests to a remote machine I do:
> 
> make -C tools/testing/selftests/ TARGETS="net drivers/net drivers/net/hw" install INSTALL_PATH=/tmp/ksft-net-drv
> rsync -ra --delete /tmp/ksft-net-drv root@${machine}:/root/
> 
> HTH

Thanks, this is a great help actually.

I have a similar changeset for mlx5 (which is hardware I do have access to)
that adds the per-queue stats stuff so I'll definitely give your test a try.

Seeing as I made a lot of errors in this series, I'll hold off on sending the
mlx5 series until this mlx4 series is fixed and accepted, that way I can
produce a much better v1 for mlx5.

Thanks,
Joe

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 1/3] net/mlx4: Track RX allocation failures in a stat
  2024-04-23 19:49 ` [PATCH net-next 1/3] net/mlx4: Track RX allocation failures in a stat Joe Damato
  2024-04-23 22:38   ` Joe Damato
@ 2024-04-24  6:27   ` kernel test robot
  1 sibling, 0 replies; 16+ messages in thread
From: kernel test robot @ 2024-04-24  6:27 UTC (permalink / raw)
  To: Joe Damato, linux-kernel, netdev, tariqt, saeedm
  Cc: llvm, oe-kbuild-all, mkarsten, gal, nalramli, Joe Damato,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, linux-rdma

Hi Joe,

kernel test robot noticed the following build warnings:

[auto build test WARNING on net-next/main]

url:    https://github.com/intel-lab-lkp/linux/commits/Joe-Damato/net-mlx4-Track-RX-allocation-failures-in-a-stat/20240424-035224
base:   net-next/main
patch link:    https://lore.kernel.org/r/20240423194931.97013-2-jdamato%40fastly.com
patch subject: [PATCH net-next 1/3] net/mlx4: Track RX allocation failures in a stat
config: s390-defconfig (https://download.01.org/0day-ci/archive/20240424/202404241411.3d5OJHs6-lkp@intel.com/config)
compiler: clang version 19.0.0git (https://github.com/llvm/llvm-project 5ef5eb66fb428aaf61fb51b709f065c069c11242)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240424/202404241411.3d5OJHs6-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202404241411.3d5OJHs6-lkp@intel.com/

All warnings (new ones prefixed by >>):

   In file included from include/net/net_namespace.h:43:
   In file included from include/linux/skbuff.h:17:
   In file included from include/linux/bvec.h:10:
   In file included from include/linux/highmem.h:10:
   In file included from include/linux/mm.h:2208:
   include/linux/vmstat.h:508:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion]
     508 |         return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~ ^
     509 |                            item];
         |                            ~~~~
   include/linux/vmstat.h:515:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion]
     515 |         return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~ ^
     516 |                            NR_VM_NUMA_EVENT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~~
   include/linux/vmstat.h:522:36: warning: arithmetic between different enumeration types ('enum node_stat_item' and 'enum lru_list') [-Wenum-enum-conversion]
     522 |         return node_stat_name(NR_LRU_BASE + lru) + 3; // skip "nr_"
         |                               ~~~~~~~~~~~ ^ ~~~
   include/linux/vmstat.h:527:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion]
     527 |         return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~ ^
     528 |                            NR_VM_NUMA_EVENT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~~
   include/linux/vmstat.h:536:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion]
     536 |         return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~ ^
     537 |                            NR_VM_NUMA_EVENT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~~
   In file included from drivers/net/ethernet/mellanox/mlx4/en_port.c:35:
   In file included from include/linux/if_vlan.h:10:
   In file included from include/linux/netdevice.h:38:
   In file included from include/net/net_namespace.h:43:
   In file included from include/linux/skbuff.h:28:
   In file included from include/linux/dma-mapping.h:11:
   In file included from include/linux/scatterlist.h:9:
   In file included from arch/s390/include/asm/io.h:78:
   include/asm-generic/io.h:547:31: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     547 |         val = __raw_readb(PCI_IOBASE + addr);
         |                           ~~~~~~~~~~ ^
   include/asm-generic/io.h:560:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     560 |         val = __le16_to_cpu((__le16 __force)__raw_readw(PCI_IOBASE + addr));
         |                                                         ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/big_endian.h:37:59: note: expanded from macro '__le16_to_cpu'
      37 | #define __le16_to_cpu(x) __swab16((__force __u16)(__le16)(x))
         |                                                           ^
   include/uapi/linux/swab.h:102:54: note: expanded from macro '__swab16'
     102 | #define __swab16(x) (__u16)__builtin_bswap16((__u16)(x))
         |                                                      ^
   In file included from drivers/net/ethernet/mellanox/mlx4/en_port.c:35:
   In file included from include/linux/if_vlan.h:10:
   In file included from include/linux/netdevice.h:38:
   In file included from include/net/net_namespace.h:43:
   In file included from include/linux/skbuff.h:28:
   In file included from include/linux/dma-mapping.h:11:
   In file included from include/linux/scatterlist.h:9:
   In file included from arch/s390/include/asm/io.h:78:
   include/asm-generic/io.h:573:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     573 |         val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr));
         |                                                         ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/big_endian.h:35:59: note: expanded from macro '__le32_to_cpu'
      35 | #define __le32_to_cpu(x) __swab32((__force __u32)(__le32)(x))
         |                                                           ^
   include/uapi/linux/swab.h:115:54: note: expanded from macro '__swab32'
     115 | #define __swab32(x) (__u32)__builtin_bswap32((__u32)(x))
         |                                                      ^
   In file included from drivers/net/ethernet/mellanox/mlx4/en_port.c:35:
   In file included from include/linux/if_vlan.h:10:
   In file included from include/linux/netdevice.h:38:
   In file included from include/net/net_namespace.h:43:
   In file included from include/linux/skbuff.h:28:
   In file included from include/linux/dma-mapping.h:11:
   In file included from include/linux/scatterlist.h:9:
   In file included from arch/s390/include/asm/io.h:78:
   include/asm-generic/io.h:584:33: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     584 |         __raw_writeb(value, PCI_IOBASE + addr);
         |                             ~~~~~~~~~~ ^
   include/asm-generic/io.h:594:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     594 |         __raw_writew((u16 __force)cpu_to_le16(value), PCI_IOBASE + addr);
         |                                                       ~~~~~~~~~~ ^
   include/asm-generic/io.h:604:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     604 |         __raw_writel((u32 __force)cpu_to_le32(value), PCI_IOBASE + addr);
         |                                                       ~~~~~~~~~~ ^
   include/asm-generic/io.h:692:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     692 |         readsb(PCI_IOBASE + addr, buffer, count);
         |                ~~~~~~~~~~ ^
   include/asm-generic/io.h:700:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     700 |         readsw(PCI_IOBASE + addr, buffer, count);
         |                ~~~~~~~~~~ ^
   include/asm-generic/io.h:708:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     708 |         readsl(PCI_IOBASE + addr, buffer, count);
         |                ~~~~~~~~~~ ^
   include/asm-generic/io.h:717:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     717 |         writesb(PCI_IOBASE + addr, buffer, count);
         |                 ~~~~~~~~~~ ^
   include/asm-generic/io.h:726:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     726 |         writesw(PCI_IOBASE + addr, buffer, count);
         |                 ~~~~~~~~~~ ^
   include/asm-generic/io.h:735:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     735 |         writesl(PCI_IOBASE + addr, buffer, count);
         |                 ~~~~~~~~~~ ^
>> drivers/net/ethernet/mellanox/mlx4/en_port.c:167:3: warning: variable 'dropped' is uninitialized when used here [-Wuninitialized]
     167 |                 dropped += READ_ONCE(ring->dropped);
         |                 ^~~~~~~
   drivers/net/ethernet/mellanox/mlx4/en_port.c:154:39: note: initialize the variable 'dropped' to silence this warning
     154 |         unsigned long packets, bytes, dropped;
         |                                              ^
         |                                               = 0
   18 warnings generated.


vim +/dropped +167 drivers/net/ethernet/mellanox/mlx4/en_port.c

   149	
   150	void mlx4_en_fold_software_stats(struct net_device *dev)
   151	{
   152		struct mlx4_en_priv *priv = netdev_priv(dev);
   153		struct mlx4_en_dev *mdev = priv->mdev;
   154		unsigned long packets, bytes, dropped;
   155		int i;
   156	
   157		if (!priv->port_up || mlx4_is_master(mdev->dev))
   158			return;
   159	
   160		packets = 0;
   161		bytes = 0;
   162		for (i = 0; i < priv->rx_ring_num; i++) {
   163			const struct mlx4_en_rx_ring *ring = priv->rx_ring[i];
   164	
   165			packets += READ_ONCE(ring->packets);
   166			bytes   += READ_ONCE(ring->bytes);
 > 167			dropped += READ_ONCE(ring->dropped);
   168		}
   169		dev->stats.rx_packets = packets;
   170		dev->stats.rx_bytes = bytes;
   171		dev->stats.rx_missed_errors = dropped;
   172	
   173		packets = 0;
   174		bytes = 0;
   175		for (i = 0; i < priv->tx_ring_num[TX]; i++) {
   176			const struct mlx4_en_tx_ring *ring = priv->tx_ring[TX][i];
   177	
   178			packets += READ_ONCE(ring->packets);
   179			bytes   += READ_ONCE(ring->bytes);
   180		}
   181		dev->stats.tx_packets = packets;
   182		dev->stats.tx_bytes = bytes;
   183	}
   184	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 2/3] net/mlx4: link NAPI instances to queues and IRQs
  2024-04-23 19:49 ` [PATCH net-next 2/3] net/mlx4: link NAPI instances to queues and IRQs Joe Damato
@ 2024-04-24  7:41   ` kernel test robot
  0 siblings, 0 replies; 16+ messages in thread
From: kernel test robot @ 2024-04-24  7:41 UTC (permalink / raw)
  To: Joe Damato, linux-kernel, netdev, tariqt, saeedm
  Cc: llvm, oe-kbuild-all, mkarsten, gal, nalramli, Joe Damato,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, linux-rdma

Hi Joe,

kernel test robot noticed the following build warnings:

[auto build test WARNING on net-next/main]

url:    https://github.com/intel-lab-lkp/linux/commits/Joe-Damato/net-mlx4-Track-RX-allocation-failures-in-a-stat/20240424-035224
base:   net-next/main
patch link:    https://lore.kernel.org/r/20240423194931.97013-3-jdamato%40fastly.com
patch subject: [PATCH net-next 2/3] net/mlx4: link NAPI instances to queues and IRQs
config: s390-defconfig (https://download.01.org/0day-ci/archive/20240424/202404241518.4TvjoN07-lkp@intel.com/config)
compiler: clang version 19.0.0git (https://github.com/llvm/llvm-project 5ef5eb66fb428aaf61fb51b709f065c069c11242)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240424/202404241518.4TvjoN07-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202404241518.4TvjoN07-lkp@intel.com/

All warnings (new ones prefixed by >>):

   In file included from include/linux/elf.h:6:
   In file included from arch/s390/include/asm/elf.h:173:
   In file included from arch/s390/include/asm/mmu_context.h:11:
   In file included from arch/s390/include/asm/pgalloc.h:18:
   In file included from include/linux/mm.h:2208:
   include/linux/vmstat.h:508:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion]
     508 |         return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~ ^
     509 |                            item];
         |                            ~~~~
   include/linux/vmstat.h:515:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion]
     515 |         return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~ ^
     516 |                            NR_VM_NUMA_EVENT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~~
   include/linux/vmstat.h:522:36: warning: arithmetic between different enumeration types ('enum node_stat_item' and 'enum lru_list') [-Wenum-enum-conversion]
     522 |         return node_stat_name(NR_LRU_BASE + lru) + 3; // skip "nr_"
         |                               ~~~~~~~~~~~ ^ ~~~
   include/linux/vmstat.h:527:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion]
     527 |         return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~ ^
     528 |                            NR_VM_NUMA_EVENT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~~
   include/linux/vmstat.h:536:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion]
     536 |         return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~ ^
     537 |                            NR_VM_NUMA_EVENT_ITEMS +
         |                            ~~~~~~~~~~~~~~~~~~~~~~
   In file included from drivers/net/ethernet/mellanox/mlx4/en_cq.c:34:
   In file included from include/linux/mlx4/cq.h:39:
   In file included from include/linux/mlx4/device.h:37:
   In file included from include/linux/if_ether.h:19:
   In file included from include/linux/skbuff.h:28:
   In file included from include/linux/dma-mapping.h:11:
   In file included from include/linux/scatterlist.h:9:
   In file included from arch/s390/include/asm/io.h:78:
   include/asm-generic/io.h:547:31: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     547 |         val = __raw_readb(PCI_IOBASE + addr);
         |                           ~~~~~~~~~~ ^
   include/asm-generic/io.h:560:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     560 |         val = __le16_to_cpu((__le16 __force)__raw_readw(PCI_IOBASE + addr));
         |                                                         ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/big_endian.h:37:59: note: expanded from macro '__le16_to_cpu'
      37 | #define __le16_to_cpu(x) __swab16((__force __u16)(__le16)(x))
         |                                                           ^
   include/uapi/linux/swab.h:102:54: note: expanded from macro '__swab16'
     102 | #define __swab16(x) (__u16)__builtin_bswap16((__u16)(x))
         |                                                      ^
   In file included from drivers/net/ethernet/mellanox/mlx4/en_cq.c:34:
   In file included from include/linux/mlx4/cq.h:39:
   In file included from include/linux/mlx4/device.h:37:
   In file included from include/linux/if_ether.h:19:
   In file included from include/linux/skbuff.h:28:
   In file included from include/linux/dma-mapping.h:11:
   In file included from include/linux/scatterlist.h:9:
   In file included from arch/s390/include/asm/io.h:78:
   include/asm-generic/io.h:573:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     573 |         val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr));
         |                                                         ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/big_endian.h:35:59: note: expanded from macro '__le32_to_cpu'
      35 | #define __le32_to_cpu(x) __swab32((__force __u32)(__le32)(x))
         |                                                           ^
   include/uapi/linux/swab.h:115:54: note: expanded from macro '__swab32'
     115 | #define __swab32(x) (__u32)__builtin_bswap32((__u32)(x))
         |                                                      ^
   In file included from drivers/net/ethernet/mellanox/mlx4/en_cq.c:34:
   In file included from include/linux/mlx4/cq.h:39:
   In file included from include/linux/mlx4/device.h:37:
   In file included from include/linux/if_ether.h:19:
   In file included from include/linux/skbuff.h:28:
   In file included from include/linux/dma-mapping.h:11:
   In file included from include/linux/scatterlist.h:9:
   In file included from arch/s390/include/asm/io.h:78:
   include/asm-generic/io.h:584:33: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     584 |         __raw_writeb(value, PCI_IOBASE + addr);
         |                             ~~~~~~~~~~ ^
   include/asm-generic/io.h:594:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     594 |         __raw_writew((u16 __force)cpu_to_le16(value), PCI_IOBASE + addr);
         |                                                       ~~~~~~~~~~ ^
   include/asm-generic/io.h:604:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     604 |         __raw_writel((u32 __force)cpu_to_le32(value), PCI_IOBASE + addr);
         |                                                       ~~~~~~~~~~ ^
   include/asm-generic/io.h:692:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     692 |         readsb(PCI_IOBASE + addr, buffer, count);
         |                ~~~~~~~~~~ ^
   include/asm-generic/io.h:700:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     700 |         readsw(PCI_IOBASE + addr, buffer, count);
         |                ~~~~~~~~~~ ^
   include/asm-generic/io.h:708:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     708 |         readsl(PCI_IOBASE + addr, buffer, count);
         |                ~~~~~~~~~~ ^
   include/asm-generic/io.h:717:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     717 |         writesb(PCI_IOBASE + addr, buffer, count);
         |                 ~~~~~~~~~~ ^
   include/asm-generic/io.h:726:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     726 |         writesw(PCI_IOBASE + addr, buffer, count);
         |                 ~~~~~~~~~~ ^
   include/asm-generic/io.h:735:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     735 |         writesl(PCI_IOBASE + addr, buffer, count);
         |                 ~~~~~~~~~~ ^
>> drivers/net/ethernet/mellanox/mlx4/en_cq.c:202:12: warning: variable 'qtype' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized]
     202 |                 else if (cq->type == TX)
         |                          ^~~~~~~~~~~~~~
   drivers/net/ethernet/mellanox/mlx4/en_cq.c:205:45: note: uninitialized use occurs here
     205 |                 netif_queue_set_napi(cq->dev, cq->cq_idx, qtype, NULL);
         |                                                           ^~~~~
   drivers/net/ethernet/mellanox/mlx4/en_cq.c:202:8: note: remove the 'if' if its condition is always true
     202 |                 else if (cq->type == TX)
         |                      ^~~~~~~~~~~~~~~~~~~
     203 |                         qtype = NETDEV_QUEUE_TYPE_TX;
   drivers/net/ethernet/mellanox/mlx4/en_cq.c:197:2: note: variable 'qtype' is declared here
     197 |         enum netdev_queue_type qtype;
         |         ^
   18 warnings generated.


vim +202 drivers/net/ethernet/mellanox/mlx4/en_cq.c

   194	
   195	void mlx4_en_deactivate_cq(struct mlx4_en_priv *priv, struct mlx4_en_cq *cq)
   196	{
   197		enum netdev_queue_type qtype;
   198	
   199		if (cq->type != TX_XDP) {
   200			if (cq->type == RX)
   201				qtype = NETDEV_QUEUE_TYPE_RX;
 > 202			else if (cq->type == TX)
   203				qtype = NETDEV_QUEUE_TYPE_TX;
   204	
   205			netif_queue_set_napi(cq->dev, cq->cq_idx, qtype, NULL);
   206			napi_disable(&cq->napi);
   207			netif_napi_del(&cq->napi);
   208		}
   209	
   210		mlx4_cq_free(priv->mdev->dev, &cq->mcq);
   211	}
   212	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 3/3] net/mlx4: support per-queue statistics via netlink
  2024-04-24  5:54       ` Joe Damato
@ 2024-04-24 14:28         ` Jakub Kicinski
  2024-04-24 16:39           ` Joe Damato
  0 siblings, 1 reply; 16+ messages in thread
From: Jakub Kicinski @ 2024-04-24 14:28 UTC (permalink / raw)
  To: Joe Damato
  Cc: linux-kernel, netdev, tariqt, saeedm, mkarsten, gal, nalramli,
	David S. Miller, Eric Dumazet, Paolo Abeni,
	open list:MELLANOX MLX4 core VPI driver

On Tue, 23 Apr 2024 22:54:50 -0700 Joe Damato wrote:
> On Tue, Apr 23, 2024 at 05:57:18PM -0700, Jakub Kicinski wrote:
> > On Tue, 23 Apr 2024 12:42:13 -1000 Joe Damato wrote:  
> > > I realized in this case, I'll need to set the fields initialized to 0xff
> > > above to 0 before doing the increments below.  
> > 
> > I don't know mlx4 very well, but glancing at the code - are you sure we
> > need to loop over the queues is the "base" callbacks?
> > 
> > The base callbacks are for getting "historical" data, i.e. info which
> > was associated with queues which are no longer present. You seem to
> > sweep all queues, so I'd have expected "base" to just set the values 
> > to 0. And the real values to come from the per-queue callbacks.  
> 
> Hmm. Sorry I must have totally misunderstood what the purpose of "base"
> was. I've just now more closely looked at bnxt which (maybe?) is the only
> driver that implements base and I think maybe I kind of get it now.
> 
> For some reason, I thought it meant "the total stats of all queues"; I didn't
> know it was intended to provide "historical" data as you say.
> 
> Making it set everything to 0 makes sense to me. I suppose I could also simply
> omit it? What do you think?

The base is used to figure out which stats are reported when we dump 
a summary for the whole device. So you gotta set them to 0.

> > The init to 0xff looks quite sus.  
> 
> Yes the init to 0xff is wrong, too. I noticed that, as well.
> 
> Here's what I have listed so far in my changelog for the v2 (which I haven't
> sent yet), but perhaps the maintainers of mlx4 can weigh in?
> 
> v1 -> v2:
>  - Patch 1/3 now initializes dropped to 0.
>  - Patch 3/3 includes several changes:
>    - mlx4_get_queue_stats_rx and mlx4_get_queue_stats_tx check if i is
>      valid before proceeding.
>    - All initialization to 0xff for stats fields has been omit. The
>      network stack does this before calling into the driver functions, so
>      I've adjusted the driver functions to only set values if there is
>      data to set, leaving the network stack's 0xff in place if not.
>    - mlx4_get_base_stats sets all stats to 0 (no locking etc needed).

All the ones you report right? Not just zero the struct.
Any day now (tm) someone will add a lot more stats to the struct
so the init should be selective only to the stats that are actually
supported.

> Let me know if that sounds vaguely correct?
> 
> > Also what does this:
> >   
> > >	if (!priv->port_up || mlx4_is_master(priv->mdev->dev))  
> > 
> > do? 🤔️ what's a "master" in this context?  
> 
> I have a guess, but I'd rather let the Mellanox folks provide the official
> answer :)

My guess is that on multi-port only one of the netdevs is "in charge"
of the PCIe function. But these are queue stat, PCIe ownership may
matter for refresh but not for having the stats. So my guess must be
wrong..

> > > Sorry about that; just realized that now and will fix that in the v2 (along
> > > with any other feedback I get), probably something:
> > > 
> > >   if (priv->rx_ring_num) {
> > >           rx->packets = 0;
> > >           rx->bytes = 0;
> > >           rx->alloc_fail = 0;
> > >   }
> > > 
> > > Here for the RX side and see below for the TX side.  
> > 
> > FWIW I added a simple test for making sure queue stats match interface
> > stats, it's tools/testing/selftests/drivers/net/stats.py
> > 
> > You have to export NETIF=$name to make it run on a real interface.
> > 
> > To copy the tests to a remote machine I do:
> > 
> > make -C tools/testing/selftests/ TARGETS="net drivers/net drivers/net/hw" install INSTALL_PATH=/tmp/ksft-net-drv
> > rsync -ra --delete /tmp/ksft-net-drv root@${machine}:/root/
> > 
> > HTH  
> 
> Thanks, this is a great help actually.
> 
> I have a similar changeset for mlx5 (which is hardware I do have access to)
> that adds the per-queue stats stuff so I'll definitely give your test a try.
> 
> Seeing as I made a lot of errors in this series, I'll hold off on sending the
> mlx5 series until this mlx4 series is fixed and accepted, that way I can
> produce a much better v1 for mlx5.

mlx5 would be awesome! But no pressure on sending ASAP. I was writing a
test for page pool allocation failures, which depends on qstat to check
if driver actually saw the errors (I'll send it later today), and I
couldn't confirm it working on mlx5 :(

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 3/3] net/mlx4: support per-queue statistics via netlink
  2024-04-24 14:28         ` Jakub Kicinski
@ 2024-04-24 16:39           ` Joe Damato
  2024-04-25  3:46             ` Jakub Kicinski
  2024-04-25 16:31             ` Joe Damato
  0 siblings, 2 replies; 16+ messages in thread
From: Joe Damato @ 2024-04-24 16:39 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: linux-kernel, netdev, tariqt, saeedm, mkarsten, gal, nalramli,
	David S. Miller, Eric Dumazet, Paolo Abeni,
	open list:MELLANOX MLX4 core VPI driver

On Wed, Apr 24, 2024 at 07:28:18AM -0700, Jakub Kicinski wrote:
> On Tue, 23 Apr 2024 22:54:50 -0700 Joe Damato wrote:
> > On Tue, Apr 23, 2024 at 05:57:18PM -0700, Jakub Kicinski wrote:
> > > On Tue, 23 Apr 2024 12:42:13 -1000 Joe Damato wrote:  
> > > > I realized in this case, I'll need to set the fields initialized to 0xff
> > > > above to 0 before doing the increments below.  
> > > 
> > > I don't know mlx4 very well, but glancing at the code - are you sure we
> > > need to loop over the queues is the "base" callbacks?
> > > 
> > > The base callbacks are for getting "historical" data, i.e. info which
> > > was associated with queues which are no longer present. You seem to
> > > sweep all queues, so I'd have expected "base" to just set the values 
> > > to 0. And the real values to come from the per-queue callbacks.  
> > 
> > Hmm. Sorry I must have totally misunderstood what the purpose of "base"
> > was. I've just now more closely looked at bnxt which (maybe?) is the only
> > driver that implements base and I think maybe I kind of get it now.
> > 
> > For some reason, I thought it meant "the total stats of all queues"; I didn't
> > know it was intended to provide "historical" data as you say.
> > 
> > Making it set everything to 0 makes sense to me. I suppose I could also simply
> > omit it? What do you think?
> 
> The base is used to figure out which stats are reported when we dump 
> a summary for the whole device. So you gotta set them to 0.

OK, thanks for your patience and the explanation. Will do.

> > > The init to 0xff looks quite sus.  
> > 
> > Yes the init to 0xff is wrong, too. I noticed that, as well.
> > 
> > Here's what I have listed so far in my changelog for the v2 (which I haven't
> > sent yet), but perhaps the maintainers of mlx4 can weigh in?
> > 
> > v1 -> v2:
> >  - Patch 1/3 now initializes dropped to 0.
> >  - Patch 3/3 includes several changes:
> >    - mlx4_get_queue_stats_rx and mlx4_get_queue_stats_tx check if i is
> >      valid before proceeding.
> >    - All initialization to 0xff for stats fields has been omit. The
> >      network stack does this before calling into the driver functions, so
> >      I've adjusted the driver functions to only set values if there is
> >      data to set, leaving the network stack's 0xff in place if not.
> >    - mlx4_get_base_stats sets all stats to 0 (no locking etc needed).
> 
> All the ones you report right? Not just zero the struct.
> Any day now (tm) someone will add a lot more stats to the struct
> so the init should be selective only to the stats that are actually
> supported.

Yes, not just zero the struct. Since I am reporting bytes packets for both
RX and TX and alloc_fail for RX I'll be setting those fields to 0
explicitly.

And there's also a warning about unused qtype (oops) in patch 2/3.

So, the revised v2 list (pending anything Mellanox wants) is:

  v1 -> v2:
   - Patch 1/3 now initializes dropped to 0.
   - Patch 2/3 fix use of unitialized qtype warning.
   - Patch 3/3 includes several changes:
     - mlx4_get_queue_stats_rx and mlx4_get_queue_stats_tx check if i is
       valid before proceeding.
     - All initialization to 0xff for stats fields has been omit. The
       network stack does this before calling into the driver functions, so
       I've adjusted the driver functions to only set values if there is
       data to set, leaving the network stack's 0xff in place if not.
     - mlx4_get_base_stats set all stat fields to 0 individually (no locking etc needed).

I'll hold off on sending this v2 until we hear back from Mellanox to be
sure there isn't anything else I'm missing.

> > Let me know if that sounds vaguely correct?
> > 
> > > Also what does this:
> > >   
> > > >	if (!priv->port_up || mlx4_is_master(priv->mdev->dev))  
> > > 
> > > do? 🤔️ what's a "master" in this context?  
> > 
> > I have a guess, but I'd rather let the Mellanox folks provide the official
> > answer :)
> 
> My guess is that on multi-port only one of the netdevs is "in charge"
> of the PCIe function. But these are queue stat, PCIe ownership may
> matter for refresh but not for having the stats. So my guess must be
> wrong..
>
> > > > Sorry about that; just realized that now and will fix that in the v2 (along
> > > > with any other feedback I get), probably something:
> > > > 
> > > >   if (priv->rx_ring_num) {
> > > >           rx->packets = 0;
> > > >           rx->bytes = 0;
> > > >           rx->alloc_fail = 0;
> > > >   }
> > > > 
> > > > Here for the RX side and see below for the TX side.  
> > > 
> > > FWIW I added a simple test for making sure queue stats match interface
> > > stats, it's tools/testing/selftests/drivers/net/stats.py
> > > 
> > > You have to export NETIF=$name to make it run on a real interface.
> > > 
> > > To copy the tests to a remote machine I do:
> > > 
> > > make -C tools/testing/selftests/ TARGETS="net drivers/net drivers/net/hw" install INSTALL_PATH=/tmp/ksft-net-drv
> > > rsync -ra --delete /tmp/ksft-net-drv root@${machine}:/root/
> > > 
> > > HTH  
> > 
> > Thanks, this is a great help actually.
> > 
> > I have a similar changeset for mlx5 (which is hardware I do have access to)
> > that adds the per-queue stats stuff so I'll definitely give your test a try.
> > 
> > Seeing as I made a lot of errors in this series, I'll hold off on sending the
> > mlx5 series until this mlx4 series is fixed and accepted, that way I can
> > produce a much better v1 for mlx5.
> 
> mlx5 would be awesome! But no pressure on sending ASAP. I was writing a
> test for page pool allocation failures, which depends on qstat to check
> if driver actually saw the errors (I'll send it later today), and I
> couldn't confirm it working on mlx5 :(

My hesitation with sending mlx5 changes is that I didn't want to add more
noise than I already have with my existing errors. I figured by getting
mlx4 working/accepted I'd know enough about the API to send a much better
v1 for mlx5 reducing the work for you (and everyone else) involved.

FWIW, I also attempted to implement this API for i40e (hardware I also
have):

  https://lore.kernel.org/lkml/20240410043936.206169-1-jdamato@fastly.com/

But there are some complications I haven't resolved, so I'm focusing on
mlx4 and mlx5, first, and will have to come back to i40e later.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 3/3] net/mlx4: support per-queue statistics via netlink
  2024-04-24 16:39           ` Joe Damato
@ 2024-04-25  3:46             ` Jakub Kicinski
  2024-04-25 16:31             ` Joe Damato
  1 sibling, 0 replies; 16+ messages in thread
From: Jakub Kicinski @ 2024-04-25  3:46 UTC (permalink / raw)
  To: Joe Damato
  Cc: linux-kernel, netdev, tariqt, saeedm, mkarsten, gal, nalramli,
	David S. Miller, Eric Dumazet, Paolo Abeni,
	open list:MELLANOX MLX4 core VPI driver

On Wed, 24 Apr 2024 09:39:40 -0700 Joe Damato wrote:
> FWIW, I also attempted to implement this API for i40e (hardware I also
> have):
> 
>   https://lore.kernel.org/lkml/20240410043936.206169-1-jdamato@fastly.com/

Ah, missed the second patch on that thread initially!

> But there are some complications I haven't resolved, so I'm focusing on
> mlx4 and mlx5, first, and will have to come back to i40e later.

FWIW I hope this series will get ironed out soon and we'll have far
more qstats defined:
https://lore.kernel.org/all/20240423113141.1752-1-xuanzhuo@linux.alibaba.com/

The drop counts in particular could be useful in production, not so
sure about the rest :)

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 3/3] net/mlx4: support per-queue statistics via netlink
  2024-04-24 16:39           ` Joe Damato
  2024-04-25  3:46             ` Jakub Kicinski
@ 2024-04-25 16:31             ` Joe Damato
  2024-04-30 12:33               ` Leon Romanovsky
  1 sibling, 1 reply; 16+ messages in thread
From: Joe Damato @ 2024-04-25 16:31 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: linux-kernel, netdev, tariqt, saeedm, mkarsten, gal, nalramli,
	David S. Miller, Eric Dumazet, Paolo Abeni,
	open list:MELLANOX MLX4 core VPI driver

On Wed, Apr 24, 2024 at 09:39:43AM -0700, Joe Damato wrote:
> On Wed, Apr 24, 2024 at 07:28:18AM -0700, Jakub Kicinski wrote:
> > On Tue, 23 Apr 2024 22:54:50 -0700 Joe Damato wrote:
> > > On Tue, Apr 23, 2024 at 05:57:18PM -0700, Jakub Kicinski wrote:
> > > > On Tue, 23 Apr 2024 12:42:13 -1000 Joe Damato wrote:  
> > > > > I realized in this case, I'll need to set the fields initialized to 0xff
> > > > > above to 0 before doing the increments below.  
> > > > 
> > > > I don't know mlx4 very well, but glancing at the code - are you sure we
> > > > need to loop over the queues is the "base" callbacks?
> > > > 
> > > > The base callbacks are for getting "historical" data, i.e. info which
> > > > was associated with queues which are no longer present. You seem to
> > > > sweep all queues, so I'd have expected "base" to just set the values 
> > > > to 0. And the real values to come from the per-queue callbacks.  
> > > 
> > > Hmm. Sorry I must have totally misunderstood what the purpose of "base"
> > > was. I've just now more closely looked at bnxt which (maybe?) is the only
> > > driver that implements base and I think maybe I kind of get it now.
> > > 
> > > For some reason, I thought it meant "the total stats of all queues"; I didn't
> > > know it was intended to provide "historical" data as you say.
> > > 
> > > Making it set everything to 0 makes sense to me. I suppose I could also simply
> > > omit it? What do you think?
> > 
> > The base is used to figure out which stats are reported when we dump 
> > a summary for the whole device. So you gotta set them to 0.
> 
> OK, thanks for your patience and the explanation. Will do.
> 
> > > > The init to 0xff looks quite sus.  
> > > 
> > > Yes the init to 0xff is wrong, too. I noticed that, as well.
> > > 
> > > Here's what I have listed so far in my changelog for the v2 (which I haven't
> > > sent yet), but perhaps the maintainers of mlx4 can weigh in?
> > > 
> > > v1 -> v2:
> > >  - Patch 1/3 now initializes dropped to 0.
> > >  - Patch 3/3 includes several changes:
> > >    - mlx4_get_queue_stats_rx and mlx4_get_queue_stats_tx check if i is
> > >      valid before proceeding.
> > >    - All initialization to 0xff for stats fields has been omit. The
> > >      network stack does this before calling into the driver functions, so
> > >      I've adjusted the driver functions to only set values if there is
> > >      data to set, leaving the network stack's 0xff in place if not.
> > >    - mlx4_get_base_stats sets all stats to 0 (no locking etc needed).
> > 
> > All the ones you report right? Not just zero the struct.
> > Any day now (tm) someone will add a lot more stats to the struct
> > so the init should be selective only to the stats that are actually
> > supported.
> 
> Yes, not just zero the struct. Since I am reporting bytes packets for both
> RX and TX and alloc_fail for RX I'll be setting those fields to 0
> explicitly.
> 
> And there's also a warning about unused qtype (oops) in patch 2/3.
> 
> So, the revised v2 list (pending anything Mellanox wants) is:
> 
>   v1 -> v2:
>    - Patch 1/3 now initializes dropped to 0.
>    - Patch 2/3 fix use of unitialized qtype warning.
>    - Patch 3/3 includes several changes:
>      - mlx4_get_queue_stats_rx and mlx4_get_queue_stats_tx check if i is
>        valid before proceeding.
>      - All initialization to 0xff for stats fields has been omit. The
>        network stack does this before calling into the driver functions, so
>        I've adjusted the driver functions to only set values if there is
>        data to set, leaving the network stack's 0xff in place if not.
>      - mlx4_get_base_stats set all stat fields to 0 individually (no locking etc needed).
> 
> I'll hold off on sending this v2 until we hear back from Mellanox to be
> sure there isn't anything else I'm missing.

It's been a few days and I haven't heard back from the mlx4 folks, so I
think I'll probably send my v2 later today which, hopefully, will fix most
of the above issues.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 3/3] net/mlx4: support per-queue statistics via netlink
  2024-04-25 16:31             ` Joe Damato
@ 2024-04-30 12:33               ` Leon Romanovsky
  2024-04-30 16:18                 ` Joe Damato
  0 siblings, 1 reply; 16+ messages in thread
From: Leon Romanovsky @ 2024-04-30 12:33 UTC (permalink / raw)
  To: Joe Damato
  Cc: Jakub Kicinski, linux-kernel, netdev, tariqt, saeedm, mkarsten,
	gal, nalramli, David S. Miller, Eric Dumazet, Paolo Abeni,
	open list:MELLANOX MLX4 core VPI driver

On Thu, Apr 25, 2024 at 09:31:14AM -0700, Joe Damato wrote:
> On Wed, Apr 24, 2024 at 09:39:43AM -0700, Joe Damato wrote:
> > On Wed, Apr 24, 2024 at 07:28:18AM -0700, Jakub Kicinski wrote:
> > > On Tue, 23 Apr 2024 22:54:50 -0700 Joe Damato wrote:
> > > > On Tue, Apr 23, 2024 at 05:57:18PM -0700, Jakub Kicinski wrote:
> > > > > On Tue, 23 Apr 2024 12:42:13 -1000 Joe Damato wrote:  
> > > > > > I realized in this case, I'll need to set the fields initialized to 0xff
> > > > > > above to 0 before doing the increments below.  
> > > > > 
> > > > > I don't know mlx4 very well, but glancing at the code - are you sure we
> > > > > need to loop over the queues is the "base" callbacks?
> > > > > 
> > > > > The base callbacks are for getting "historical" data, i.e. info which
> > > > > was associated with queues which are no longer present. You seem to
> > > > > sweep all queues, so I'd have expected "base" to just set the values 
> > > > > to 0. And the real values to come from the per-queue callbacks.  
> > > > 
> > > > Hmm. Sorry I must have totally misunderstood what the purpose of "base"
> > > > was. I've just now more closely looked at bnxt which (maybe?) is the only
> > > > driver that implements base and I think maybe I kind of get it now.
> > > > 
> > > > For some reason, I thought it meant "the total stats of all queues"; I didn't
> > > > know it was intended to provide "historical" data as you say.
> > > > 
> > > > Making it set everything to 0 makes sense to me. I suppose I could also simply
> > > > omit it? What do you think?
> > > 
> > > The base is used to figure out which stats are reported when we dump 
> > > a summary for the whole device. So you gotta set them to 0.
> > 
> > OK, thanks for your patience and the explanation. Will do.
> > 
> > > > > The init to 0xff looks quite sus.  
> > > > 
> > > > Yes the init to 0xff is wrong, too. I noticed that, as well.
> > > > 
> > > > Here's what I have listed so far in my changelog for the v2 (which I haven't
> > > > sent yet), but perhaps the maintainers of mlx4 can weigh in?
> > > > 
> > > > v1 -> v2:
> > > >  - Patch 1/3 now initializes dropped to 0.
> > > >  - Patch 3/3 includes several changes:
> > > >    - mlx4_get_queue_stats_rx and mlx4_get_queue_stats_tx check if i is
> > > >      valid before proceeding.
> > > >    - All initialization to 0xff for stats fields has been omit. The
> > > >      network stack does this before calling into the driver functions, so
> > > >      I've adjusted the driver functions to only set values if there is
> > > >      data to set, leaving the network stack's 0xff in place if not.
> > > >    - mlx4_get_base_stats sets all stats to 0 (no locking etc needed).
> > > 
> > > All the ones you report right? Not just zero the struct.
> > > Any day now (tm) someone will add a lot more stats to the struct
> > > so the init should be selective only to the stats that are actually
> > > supported.
> > 
> > Yes, not just zero the struct. Since I am reporting bytes packets for both
> > RX and TX and alloc_fail for RX I'll be setting those fields to 0
> > explicitly.
> > 
> > And there's also a warning about unused qtype (oops) in patch 2/3.
> > 
> > So, the revised v2 list (pending anything Mellanox wants) is:
> > 
> >   v1 -> v2:
> >    - Patch 1/3 now initializes dropped to 0.
> >    - Patch 2/3 fix use of unitialized qtype warning.
> >    - Patch 3/3 includes several changes:
> >      - mlx4_get_queue_stats_rx and mlx4_get_queue_stats_tx check if i is
> >        valid before proceeding.
> >      - All initialization to 0xff for stats fields has been omit. The
> >        network stack does this before calling into the driver functions, so
> >        I've adjusted the driver functions to only set values if there is
> >        data to set, leaving the network stack's 0xff in place if not.
> >      - mlx4_get_base_stats set all stat fields to 0 individually (no locking etc needed).
> > 
> > I'll hold off on sending this v2 until we hear back from Mellanox to be
> > sure there isn't anything else I'm missing.
> 
> It's been a few days and I haven't heard back from the mlx4 folks, so I
> think I'll probably send my v2 later today which, hopefully, will fix most
> of the above issues.

MLNX folks were in long vacation in last two weeks.

Thanks

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 3/3] net/mlx4: support per-queue statistics via netlink
  2024-04-30 12:33               ` Leon Romanovsky
@ 2024-04-30 16:18                 ` Joe Damato
  0 siblings, 0 replies; 16+ messages in thread
From: Joe Damato @ 2024-04-30 16:18 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Jakub Kicinski, linux-kernel, netdev, tariqt, saeedm, mkarsten,
	gal, nalramli, David S. Miller, Eric Dumazet, Paolo Abeni,
	open list:MELLANOX MLX4 core VPI driver

On Tue, Apr 30, 2024 at 03:33:14PM +0300, Leon Romanovsky wrote:
> On Thu, Apr 25, 2024 at 09:31:14AM -0700, Joe Damato wrote:
> > On Wed, Apr 24, 2024 at 09:39:43AM -0700, Joe Damato wrote:
> > > On Wed, Apr 24, 2024 at 07:28:18AM -0700, Jakub Kicinski wrote:
> > > > On Tue, 23 Apr 2024 22:54:50 -0700 Joe Damato wrote:
> > > > > On Tue, Apr 23, 2024 at 05:57:18PM -0700, Jakub Kicinski wrote:
> > > > > > On Tue, 23 Apr 2024 12:42:13 -1000 Joe Damato wrote:  
> > > > > > > I realized in this case, I'll need to set the fields initialized to 0xff
> > > > > > > above to 0 before doing the increments below.  
> > > > > > 
> > > > > > I don't know mlx4 very well, but glancing at the code - are you sure we
> > > > > > need to loop over the queues is the "base" callbacks?
> > > > > > 
> > > > > > The base callbacks are for getting "historical" data, i.e. info which
> > > > > > was associated with queues which are no longer present. You seem to
> > > > > > sweep all queues, so I'd have expected "base" to just set the values 
> > > > > > to 0. And the real values to come from the per-queue callbacks.  
> > > > > 
> > > > > Hmm. Sorry I must have totally misunderstood what the purpose of "base"
> > > > > was. I've just now more closely looked at bnxt which (maybe?) is the only
> > > > > driver that implements base and I think maybe I kind of get it now.
> > > > > 
> > > > > For some reason, I thought it meant "the total stats of all queues"; I didn't
> > > > > know it was intended to provide "historical" data as you say.
> > > > > 
> > > > > Making it set everything to 0 makes sense to me. I suppose I could also simply
> > > > > omit it? What do you think?
> > > > 
> > > > The base is used to figure out which stats are reported when we dump 
> > > > a summary for the whole device. So you gotta set them to 0.
> > > 
> > > OK, thanks for your patience and the explanation. Will do.
> > > 
> > > > > > The init to 0xff looks quite sus.  
> > > > > 
> > > > > Yes the init to 0xff is wrong, too. I noticed that, as well.
> > > > > 
> > > > > Here's what I have listed so far in my changelog for the v2 (which I haven't
> > > > > sent yet), but perhaps the maintainers of mlx4 can weigh in?
> > > > > 
> > > > > v1 -> v2:
> > > > >  - Patch 1/3 now initializes dropped to 0.
> > > > >  - Patch 3/3 includes several changes:
> > > > >    - mlx4_get_queue_stats_rx and mlx4_get_queue_stats_tx check if i is
> > > > >      valid before proceeding.
> > > > >    - All initialization to 0xff for stats fields has been omit. The
> > > > >      network stack does this before calling into the driver functions, so
> > > > >      I've adjusted the driver functions to only set values if there is
> > > > >      data to set, leaving the network stack's 0xff in place if not.
> > > > >    - mlx4_get_base_stats sets all stats to 0 (no locking etc needed).
> > > > 
> > > > All the ones you report right? Not just zero the struct.
> > > > Any day now (tm) someone will add a lot more stats to the struct
> > > > so the init should be selective only to the stats that are actually
> > > > supported.
> > > 
> > > Yes, not just zero the struct. Since I am reporting bytes packets for both
> > > RX and TX and alloc_fail for RX I'll be setting those fields to 0
> > > explicitly.
> > > 
> > > And there's also a warning about unused qtype (oops) in patch 2/3.
> > > 
> > > So, the revised v2 list (pending anything Mellanox wants) is:
> > > 
> > >   v1 -> v2:
> > >    - Patch 1/3 now initializes dropped to 0.
> > >    - Patch 2/3 fix use of unitialized qtype warning.
> > >    - Patch 3/3 includes several changes:
> > >      - mlx4_get_queue_stats_rx and mlx4_get_queue_stats_tx check if i is
> > >        valid before proceeding.
> > >      - All initialization to 0xff for stats fields has been omit. The
> > >        network stack does this before calling into the driver functions, so
> > >        I've adjusted the driver functions to only set values if there is
> > >        data to set, leaving the network stack's 0xff in place if not.
> > >      - mlx4_get_base_stats set all stat fields to 0 individually (no locking etc needed).
> > > 
> > > I'll hold off on sending this v2 until we hear back from Mellanox to be
> > > sure there isn't anything else I'm missing.
> > 
> > It's been a few days and I haven't heard back from the mlx4 folks, so I
> > think I'll probably send my v2 later today which, hopefully, will fix most
> > of the above issues.
> 
> MLNX folks were in long vacation in last two weeks.

OK, thanks for letting me know.

If any of those folks are following this thread, I hope they'll take a look
at the v2 of this series.

I was prepared to send the v3 today, but given that they've been out for a
bit I will hold off on sending the v3 for another day or two.

Thanks,
Joe

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2024-04-30 16:18 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-04-23 19:49 [PATCH net-next 0/3] mlx4: Add support for netdev-genl API Joe Damato
2024-04-23 19:49 ` [PATCH net-next 1/3] net/mlx4: Track RX allocation failures in a stat Joe Damato
2024-04-23 22:38   ` Joe Damato
2024-04-24  6:27   ` kernel test robot
2024-04-23 19:49 ` [PATCH net-next 2/3] net/mlx4: link NAPI instances to queues and IRQs Joe Damato
2024-04-24  7:41   ` kernel test robot
2024-04-23 19:49 ` [PATCH net-next 3/3] net/mlx4: support per-queue statistics via netlink Joe Damato
2024-04-23 22:42   ` Joe Damato
2024-04-24  0:57     ` Jakub Kicinski
2024-04-24  5:54       ` Joe Damato
2024-04-24 14:28         ` Jakub Kicinski
2024-04-24 16:39           ` Joe Damato
2024-04-25  3:46             ` Jakub Kicinski
2024-04-25 16:31             ` Joe Damato
2024-04-30 12:33               ` Leon Romanovsky
2024-04-30 16:18                 ` Joe Damato

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).