* [PATCH net-next v3 0/3] mlx4: Add support for netdev-genl API
@ 2024-05-02 21:26 Joe Damato
2024-05-02 21:26 ` [PATCH net-next v3 1/3] net/mlx4: Track RX allocation failures in a stat Joe Damato
` (2 more replies)
0 siblings, 3 replies; 6+ messages in thread
From: Joe Damato @ 2024-05-02 21:26 UTC
To: linux-kernel, netdev, tariqt, saeedm
Cc: mkarsten, gal, nalramli, Joe Damato, David S. Miller,
Eric Dumazet, Jakub Kicinski,
open list:MELLANOX MLX4 core VPI driver, Paolo Abeni
Greetings:
Welcome to v3.
This series adds support to mlx4 for the netdev-genl API which makes it
much easier for users and user programs to map NAPI IDs back to
ifindexes, queues, and IRQs. This is extremely useful for a number of
use cases, including epoll-based busy poll.
In addition, this series includes a patch to report per-queue
statistics via the netlink API.
To facilitate the stats, patch 1/3 makes use of an existing field,
"dropped", which was already being exported in the ethtool stats by the
driver, but was never incremented. As of patch 1/3, it is now being
incremented by the driver in an appropriate place and used in patch 3/3
as alloc_fail.
Please note: I do not have access to mlx4 hardware, but I've been
working closely with Martin Karsten from University of Waterloo (CC'd)
who has very graciously tested my patches on their mlx4 hardware (hence
his Tested-by attribution in each commit). His latest research work is
particularly interesting [1] and this series helps to support that (and
future) work.
Martin re-tested v2 using Jakub's tool [2] and the stats.pkt_byte_sum
and stats.qstat_by_ifindex tests passed. The v3 is only a very minor
change from v2 and was not re-tested.
[1]: https://dl.acm.org/doi/pdf/10.1145/3626780
[2]: https://lore.kernel.org/lkml/20240423175718.4ad4dc5a@kernel.org/
Thanks,
Joe
v2 -> v3:
- Patch 1/3 no longer sets rx_missed_errors. dropped is still
incremented on -ENOMEM, though, and reported as alloc_fail in the
stats API introduced in patch 3/3.
- Patch 2/3: Added Jakub's Acked-by to the commit message, no
functional changes.
- Patch 3/3: Added Jakub's Reviewed-by to the commit message, no
functional changes.
v1 -> v2:
- Patch 1/3 now initializes dropped to 0.
- Patch 2/3 fixes the use of uninitialized qtype warning.
- Patch 3/3 includes several changes:
- mlx4_get_queue_stats_rx and mlx4_get_queue_stats_tx check if i is
valid before proceeding.
- All initialization to 0xff for stats fields has been omitted. The
network stack does this before calling into the driver functions, so
I've adjusted the driver functions to only set values if there is
data to set, leaving the network stack's 0xff in place if not.
- mlx4_get_base_stats sets all stat fields to 0 individually if there
are RX and TX queues.
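The 0xff handling described in the v1 -> v2 notes can be illustrated with a
small userspace sketch. This is not the driver code: the demo_* types and
functions below are hypothetical stand-ins for struct netdev_queue_stats_rx,
struct mlx4_en_rx_ring, and mlx4_get_queue_stats_rx, and locking and the
port_up check are left out. The assumption, taken from the notes above, is
that the caller pre-fills the stats with 0xff and the driver callback writes
a field only when it has data for it.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct netdev_queue_stats_rx. */
struct demo_queue_stats_rx {
	uint64_t packets;
	uint64_t bytes;
	uint64_t alloc_fail;
};

/* Hypothetical stand-in for the per-ring counters kept by a driver. */
struct demo_rx_ring {
	uint64_t packets;
	uint64_t bytes;
	uint64_t dropped;
};

/* Caller side: mark every field as "not reported" (all bits set). */
static void demo_stats_init(struct demo_queue_stats_rx *stats)
{
	memset(stats, 0xff, sizeof(*stats));
}

/* Driver side: fill in fields only when idx names a valid ring. */
static void demo_get_queue_stats_rx(const struct demo_rx_ring *rings,
				    int num_rings, int idx,
				    struct demo_queue_stats_rx *stats)
{
	assert(stats != NULL);

	if (idx < 0 || idx >= num_rings)
		return;	/* leave the caller's 0xff sentinel untouched */

	stats->packets = rings[idx].packets;
	stats->bytes = rings[idx].bytes;
	stats->alloc_fail = rings[idx].dropped;
}
```

With one ring and idx 0, the three fields carry the ring's counters; with an
out-of-range idx, every field still reads as all bits set, which the caller
can treat as "not reported".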
Joe Damato (3):
net/mlx4: Track RX allocation failures in a stat
net/mlx4: link NAPI instances to queues and IRQs
net/mlx4: support per-queue statistics via netlink
drivers/net/ethernet/mellanox/mlx4/en_cq.c | 14 ++++
.../net/ethernet/mellanox/mlx4/en_netdev.c | 79 +++++++++++++++++++
drivers/net/ethernet/mellanox/mlx4/en_rx.c | 4 +-
drivers/net/ethernet/mellanox/mlx4/mlx4_en.h | 1 +
4 files changed, 97 insertions(+), 1 deletion(-)
--
2.25.1
* [PATCH net-next v3 1/3] net/mlx4: Track RX allocation failures in a stat
2024-05-02 21:26 [PATCH net-next v3 0/3] mlx4: Add support for netdev-genl API Joe Damato
@ 2024-05-02 21:26 ` Joe Damato
2024-05-07 15:44 ` Larysa Zaremba
2024-05-02 21:26 ` [PATCH net-next v3 2/3] net/mlx4: link NAPI instances to queues and IRQs Joe Damato
2024-05-02 21:26 ` [PATCH net-next v3 3/3] net/mlx4: support per-queue statistics via netlink Joe Damato
2 siblings, 1 reply; 6+ messages in thread
From: Joe Damato @ 2024-05-02 21:26 UTC
To: linux-kernel, netdev, tariqt, saeedm
Cc: mkarsten, gal, nalramli, Joe Damato, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni,
open list:MELLANOX MLX4 core VPI driver
mlx4_en_alloc_frags currently returns -ENOMEM when mlx4_alloc_page
fails but does not increment a stat field when this occurs.
struct mlx4_en_rx_ring has a dropped field which is tabulated in
mlx4_en_DUMP_ETH_STATS, but never incremented by the driver.
This change modifies mlx4_en_alloc_frags to increment mlx4_en_rx_ring's
dropped field for the -ENOMEM case.
Signed-off-by: Joe Damato <jdamato@fastly.com>
Tested-by: Martin Karsten <mkarsten@uwaterloo.ca>
---
drivers/net/ethernet/mellanox/mlx4/en_rx.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
index 8328df8645d5..573ae10300c7 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
@@ -82,8 +82,10 @@ static int mlx4_en_alloc_frags(struct mlx4_en_priv *priv,
for (i = 0; i < priv->num_frags; i++, frags++) {
if (!frags->page) {
- if (mlx4_alloc_page(priv, frags, gfp))
+ if (mlx4_alloc_page(priv, frags, gfp)) {
+ ring->dropped++;
return -ENOMEM;
+ }
ring->rx_alloc_pages++;
}
rx_desc->data[i].addr = cpu_to_be64(frags->dma +
--
2.25.1
* [PATCH net-next v3 2/3] net/mlx4: link NAPI instances to queues and IRQs
2024-05-02 21:26 [PATCH net-next v3 0/3] mlx4: Add support for netdev-genl API Joe Damato
2024-05-02 21:26 ` [PATCH net-next v3 1/3] net/mlx4: Track RX allocation failures in a stat Joe Damato
@ 2024-05-02 21:26 ` Joe Damato
2024-05-02 21:26 ` [PATCH net-next v3 3/3] net/mlx4: support per-queue statistics via netlink Joe Damato
2 siblings, 0 replies; 6+ messages in thread
From: Joe Damato @ 2024-05-02 21:26 UTC
To: linux-kernel, netdev, tariqt, saeedm
Cc: mkarsten, gal, nalramli, Joe Damato, Jakub Kicinski,
David S. Miller, Eric Dumazet, Paolo Abeni,
open list:MELLANOX MLX4 core VPI driver
Make mlx4 compatible with the newly added netlink queue GET APIs.
Signed-off-by: Joe Damato <jdamato@fastly.com>
Tested-by: Martin Karsten <mkarsten@uwaterloo.ca>
Acked-by: Jakub Kicinski <kuba@kernel.org>
---
drivers/net/ethernet/mellanox/mlx4/en_cq.c | 14 ++++++++++++++
drivers/net/ethernet/mellanox/mlx4/mlx4_en.h | 1 +
2 files changed, 15 insertions(+)
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_cq.c b/drivers/net/ethernet/mellanox/mlx4/en_cq.c
index 1184ac5751e1..461cc2c79c71 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_cq.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_cq.c
@@ -126,6 +126,7 @@ int mlx4_en_activate_cq(struct mlx4_en_priv *priv, struct mlx4_en_cq *cq,
cq_idx = cq_idx % priv->rx_ring_num;
rx_cq = priv->rx_cq[cq_idx];
cq->vector = rx_cq->vector;
+ irq = mlx4_eq_get_irq(mdev->dev, cq->vector);
}
if (cq->type == RX)
@@ -142,18 +143,23 @@ int mlx4_en_activate_cq(struct mlx4_en_priv *priv, struct mlx4_en_cq *cq,
if (err)
goto free_eq;
+ cq->cq_idx = cq_idx;
cq->mcq.event = mlx4_en_cq_event;
switch (cq->type) {
case TX:
cq->mcq.comp = mlx4_en_tx_irq;
netif_napi_add_tx(cq->dev, &cq->napi, mlx4_en_poll_tx_cq);
+ netif_napi_set_irq(&cq->napi, irq);
napi_enable(&cq->napi);
+ netif_queue_set_napi(cq->dev, cq_idx, NETDEV_QUEUE_TYPE_TX, &cq->napi);
break;
case RX:
cq->mcq.comp = mlx4_en_rx_irq;
netif_napi_add(cq->dev, &cq->napi, mlx4_en_poll_rx_cq);
+ netif_napi_set_irq(&cq->napi, irq);
napi_enable(&cq->napi);
+ netif_queue_set_napi(cq->dev, cq_idx, NETDEV_QUEUE_TYPE_RX, &cq->napi);
break;
case TX_XDP:
/* nothing regarding napi, it's shared with rx ring */
@@ -189,6 +195,14 @@ void mlx4_en_destroy_cq(struct mlx4_en_priv *priv, struct mlx4_en_cq **pcq)
void mlx4_en_deactivate_cq(struct mlx4_en_priv *priv, struct mlx4_en_cq *cq)
{
if (cq->type != TX_XDP) {
+ enum netdev_queue_type qtype;
+
+ if (cq->type == RX)
+ qtype = NETDEV_QUEUE_TYPE_RX;
+ else
+ qtype = NETDEV_QUEUE_TYPE_TX;
+
+ netif_queue_set_napi(cq->dev, cq->cq_idx, qtype, NULL);
napi_disable(&cq->napi);
netif_napi_del(&cq->napi);
}
diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
index efe3f97b874f..896f985549a4 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
@@ -379,6 +379,7 @@ struct mlx4_en_cq {
#define MLX4_EN_OPCODE_ERROR 0x1e
const struct cpumask *aff_mask;
+ int cq_idx;
};
struct mlx4_en_port_profile {
--
2.25.1
* [PATCH net-next v3 3/3] net/mlx4: support per-queue statistics via netlink
2024-05-02 21:26 [PATCH net-next v3 0/3] mlx4: Add support for netdev-genl API Joe Damato
2024-05-02 21:26 ` [PATCH net-next v3 1/3] net/mlx4: Track RX allocation failures in a stat Joe Damato
2024-05-02 21:26 ` [PATCH net-next v3 2/3] net/mlx4: link NAPI instances to queues and IRQs Joe Damato
@ 2024-05-02 21:26 ` Joe Damato
2 siblings, 0 replies; 6+ messages in thread
From: Joe Damato @ 2024-05-02 21:26 UTC
To: linux-kernel, netdev, tariqt, saeedm
Cc: mkarsten, gal, nalramli, Joe Damato, Jakub Kicinski,
David S. Miller, Eric Dumazet, Paolo Abeni,
open list:MELLANOX MLX4 core VPI driver
Make mlx4 compatible with the newly added netlink queue stats API.
Signed-off-by: Joe Damato <jdamato@fastly.com>
Tested-by: Martin Karsten <mkarsten@uwaterloo.ca>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
---
.../net/ethernet/mellanox/mlx4/en_netdev.c | 79 +++++++++++++++++++
1 file changed, 79 insertions(+)
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
index 5d3fde63b273..6875f8c5103a 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
@@ -43,6 +43,7 @@
#include <net/vxlan.h>
#include <net/devlink.h>
#include <net/rps.h>
+#include <net/netdev_queues.h>
#include <linux/mlx4/driver.h>
#include <linux/mlx4/device.h>
@@ -3099,6 +3100,83 @@ void mlx4_en_set_stats_bitmap(struct mlx4_dev *dev,
last_i += NUM_PHY_STATS;
}
+static void mlx4_get_queue_stats_rx(struct net_device *dev, int i,
+ struct netdev_queue_stats_rx *stats)
+{
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+ const struct mlx4_en_rx_ring *ring;
+
+ spin_lock_bh(&priv->stats_lock);
+
+ if (!priv->port_up || mlx4_is_master(priv->mdev->dev))
+ goto out_unlock;
+
+ if (i < 0 || i >= priv->rx_ring_num)
+ goto out_unlock;
+
+ ring = priv->rx_ring[i];
+ stats->packets = READ_ONCE(ring->packets);
+ stats->bytes = READ_ONCE(ring->bytes);
+ stats->alloc_fail = READ_ONCE(ring->dropped);
+
+out_unlock:
+ spin_unlock_bh(&priv->stats_lock);
+}
+
+static void mlx4_get_queue_stats_tx(struct net_device *dev, int i,
+ struct netdev_queue_stats_tx *stats)
+{
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+ const struct mlx4_en_tx_ring *ring;
+
+ spin_lock_bh(&priv->stats_lock);
+
+ if (!priv->port_up || mlx4_is_master(priv->mdev->dev))
+ goto out_unlock;
+
+ if (i < 0 || i >= priv->tx_ring_num[TX])
+ goto out_unlock;
+
+ ring = priv->tx_ring[TX][i];
+ stats->packets = READ_ONCE(ring->packets);
+ stats->bytes = READ_ONCE(ring->bytes);
+
+out_unlock:
+ spin_unlock_bh(&priv->stats_lock);
+}
+
+static void mlx4_get_base_stats(struct net_device *dev,
+ struct netdev_queue_stats_rx *rx,
+ struct netdev_queue_stats_tx *tx)
+{
+ struct mlx4_en_priv *priv = netdev_priv(dev);
+
+ spin_lock_bh(&priv->stats_lock);
+
+ if (!priv->port_up || mlx4_is_master(priv->mdev->dev))
+ goto out_unlock;
+
+ if (priv->rx_ring_num) {
+ rx->packets = 0;
+ rx->bytes = 0;
+ rx->alloc_fail = 0;
+ }
+
+ if (priv->tx_ring_num[TX]) {
+ tx->packets = 0;
+ tx->bytes = 0;
+ }
+
+out_unlock:
+ spin_unlock_bh(&priv->stats_lock);
+}
+
+static const struct netdev_stat_ops mlx4_stat_ops = {
+ .get_queue_stats_rx = mlx4_get_queue_stats_rx,
+ .get_queue_stats_tx = mlx4_get_queue_stats_tx,
+ .get_base_stats = mlx4_get_base_stats,
+};
+
int mlx4_en_init_netdev(struct mlx4_en_dev *mdev, int port,
struct mlx4_en_port_profile *prof)
{
@@ -3262,6 +3340,7 @@ int mlx4_en_init_netdev(struct mlx4_en_dev *mdev, int port,
netif_set_real_num_tx_queues(dev, priv->tx_ring_num[TX]);
netif_set_real_num_rx_queues(dev, priv->rx_ring_num);
+ dev->stat_ops = &mlx4_stat_ops;
dev->ethtool_ops = &mlx4_en_ethtool_ops;
/*
--
2.25.1
* Re: [PATCH net-next v3 1/3] net/mlx4: Track RX allocation failures in a stat
2024-05-02 21:26 ` [PATCH net-next v3 1/3] net/mlx4: Track RX allocation failures in a stat Joe Damato
@ 2024-05-07 15:44 ` Larysa Zaremba
2024-05-07 15:53 ` Joe Damato
0 siblings, 1 reply; 6+ messages in thread
From: Larysa Zaremba @ 2024-05-07 15:44 UTC
To: Joe Damato
Cc: linux-kernel, netdev, tariqt, saeedm, mkarsten, gal, nalramli,
David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
open list:MELLANOX MLX4 core VPI driver
On Thu, May 02, 2024 at 09:26:25PM +0000, Joe Damato wrote:
> mlx4_en_alloc_frags currently returns -ENOMEM when mlx4_alloc_page
> fails but does not increment a stat field when this occurs.
>
> struct mlx4_en_rx_ring has a dropped field which is tabulated in
> mlx4_en_DUMP_ETH_STATS, but never incremented by the driver.
>
> This change modifies mlx4_en_alloc_frags to increment mlx4_en_rx_ring's
> dropped field for the -ENOMEM case.
>
> Signed-off-by: Joe Damato <jdamato@fastly.com>
> Tested-by: Martin Karsten <mkarsten@uwaterloo.ca>
> ---
> drivers/net/ethernet/mellanox/mlx4/en_rx.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
> index 8328df8645d5..573ae10300c7 100644
> --- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
> +++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
> @@ -82,8 +82,10 @@ static int mlx4_en_alloc_frags(struct mlx4_en_priv *priv,
>
> for (i = 0; i < priv->num_frags; i++, frags++) {
> if (!frags->page) {
> - if (mlx4_alloc_page(priv, frags, gfp))
> + if (mlx4_alloc_page(priv, frags, gfp)) {
> + ring->dropped++;
> return -ENOMEM;
> + }
Correct me if I'm wrong, but ring->dropped is added to rx_dropped stats in
mlx4_en_DUMP_ETH_STATS(). You have already established with Jakub that
allocation error does not mean dropped packet, but the counter contributes to
dropped packets stats.
Also, I think that using a `dropped` counter for something that does not
necessarily result in a dropped packet is plain confusing.
> ring->rx_alloc_pages++;
> }
> rx_desc->data[i].addr = cpu_to_be64(frags->dma +
> --
> 2.25.1
>
>
* Re: [PATCH net-next v3 1/3] net/mlx4: Track RX allocation failures in a stat
2024-05-07 15:44 ` Larysa Zaremba
@ 2024-05-07 15:53 ` Joe Damato
0 siblings, 0 replies; 6+ messages in thread
From: Joe Damato @ 2024-05-07 15:53 UTC
To: Larysa Zaremba
Cc: linux-kernel, netdev, tariqt, saeedm, mkarsten, gal, nalramli,
David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
open list:MELLANOX MLX4 core VPI driver
On Tue, May 07, 2024 at 05:44:27PM +0200, Larysa Zaremba wrote:
> On Thu, May 02, 2024 at 09:26:25PM +0000, Joe Damato wrote:
> > mlx4_en_alloc_frags currently returns -ENOMEM when mlx4_alloc_page
> > fails but does not increment a stat field when this occurs.
> >
> > struct mlx4_en_rx_ring has a dropped field which is tabulated in
> > mlx4_en_DUMP_ETH_STATS, but never incremented by the driver.
> >
> > This change modifies mlx4_en_alloc_frags to increment mlx4_en_rx_ring's
> > dropped field for the -ENOMEM case.
> >
> > Signed-off-by: Joe Damato <jdamato@fastly.com>
> > Tested-by: Martin Karsten <mkarsten@uwaterloo.ca>
> > ---
> > drivers/net/ethernet/mellanox/mlx4/en_rx.c | 4 +++-
> > 1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
> > index 8328df8645d5..573ae10300c7 100644
> > --- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
> > +++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
> > @@ -82,8 +82,10 @@ static int mlx4_en_alloc_frags(struct mlx4_en_priv *priv,
> >
> > for (i = 0; i < priv->num_frags; i++, frags++) {
> > if (!frags->page) {
> > - if (mlx4_alloc_page(priv, frags, gfp))
> > + if (mlx4_alloc_page(priv, frags, gfp)) {
> > + ring->dropped++;
> > return -ENOMEM;
> > + }
>
> Correct me if I'm wrong, but ring->dropped is added to rx_dropped stats in
> mlx4_en_DUMP_ETH_STATS(). You have already established with Jakub that
> allocation error does not mean dropped packet, but the counter contributes to
> dropped packets stats.
>
> Also, I do not think that using a `dropped` counter for something that does not
> neccessarily result in a dropped packet is plain confusing.
Fair enough; I could add a new field called "alloc_fail" to
mlx4_en_rx_ring and increment that instead as it is, according to the
earlier thread, an alloc_fail as far as netdev-genl is concerned, I
think.
FWIW: I had spoken to Mellanox about this off list many weeks ago and
they had agreed at that time to this general approach. I haven't heard
from them in some time, but I am open to something else if that's the
blocker here.