From: Ido Schimmel <idosch@idosch.org>
To: netdev@vger.kernel.org
Cc: davem@davemloft.net, kuba@kernel.org, jiri@nvidia.com,
	amcohen@nvidia.com, danieller@nvidia.com, mlxsw@nvidia.com,
	roopa@nvidia.com, dsahern@gmail.com, andrew@lunn.ch,
	f.fainelli@gmail.com, vivien.didelot@gmail.com,
	saeedm@nvidia.com, tariqt@nvidia.com, ayal@nvidia.com,
	eranbe@nvidia.com, mkubecek@suse.cz,
	Ido Schimmel <idosch@nvidia.com>
Subject: [RFC PATCH net-next 0/6] devlink: Add device metric support
Date: Mon, 17 Aug 2020 15:50:53 +0300
Message-ID: <20200817125059.193242-1-idosch@idosch.org>

From: Ido Schimmel <idosch@nvidia.com>

This patch set extends devlink to allow device drivers to expose device
metrics to user space in a standard and extensible fashion, as opposed
to the driver-specific debugfs approach.

This is joint work with Amit Cohen and Danielle Ratson, done during a
two-day company hackathon.

Motivation
==========

Certain devices have metrics (e.g., diagnostic counters, histograms)
that are useful to expose to user space for testing and debugging
purposes.

Currently, there is no standardized interface through which these
metrics can be exposed, and most drivers resort to debugfs, which is not
very welcome in the networking subsystem, for good reasons. First, it
is not a stable interface on which users can rely. Second, it results
in duplicated code and inconsistent interfaces when drivers implement
similar functionality.

While Ethernet drivers can expose per-port counters to user space via
ethtool, they cannot expose device-wide metrics or configurable metrics
such as counters that can be enabled / disabled or histogram agents.

Solution overview
=================

Currently, the only supported metric type is a counter, but histograms
will be added in the future. The current interface is:

devlink dev metric show [ DEV metric METRIC | group GROUP ]
devlink dev metric set DEV metric METRIC [ group GROUP ]

Device drivers can dynamically register or unregister their metrics with
devlink by calling devlink_metric_counter_create() /
devlink_metric_destroy().

Grouping allows user space to group certain metrics together so that
they can be queried from the kernel using one request and retrieved in a
single response filtered by the kernel (i.e., kernel sets
NLM_F_DUMP_FILTERED).
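
To make the register / group / filtered-dump flow concrete, here is a
minimal user-space model in Python. It is only a sketch of the semantics
described above, not the kernel implementation: the MetricRegistry class
and its method names are hypothetical, the read callback stands in for
whatever the driver uses to fetch a counter, and the initial group of 0
is taken from the example output in this cover letter.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Metric:
    name: str
    read: Callable[[], int]  # stand-in for the driver's counter read
    group: int = 0           # assumed initial group, as in the example output


class MetricRegistry:
    """Toy model of per-device metric registration (hypothetical names)."""

    def __init__(self) -> None:
        self._metrics: Dict[Tuple[str, str], Metric] = {}

    def counter_create(self, dev: str, name: str,
                       read: Callable[[], int]) -> Metric:
        key = (dev, name)
        if key in self._metrics:
            raise ValueError(f"{dev}: metric {name!r} already registered")
        metric = Metric(name=name, read=read)
        self._metrics[key] = metric
        return metric

    def destroy(self, dev: str, name: str) -> None:
        del self._metrics[(dev, name)]

    def set_group(self, dev: str, name: str, group: int) -> None:
        self._metrics[(dev, name)].group = group

    def show(self, dev: Optional[str] = None, name: Optional[str] = None,
             group: Optional[int] = None) -> List[dict]:
        """Dump metrics, filtering on the registry side like a filtered dump."""
        rows = []
        for (d, n), m in sorted(self._metrics.items()):
            if dev is not None and d != dev:
                continue
            if name is not None and n != name:
                continue
            if group is not None and m.group != group:
                continue
            rows.append({"dev": d, "metric": n, "type": "counter",
                         "group": m.group, "value": m.read()})
        return rows
```

Calling show(group=10) after set_group() on one metric returns only that
metric, which mirrors the single filtered response described above.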

Example
=======

Instantiate two netdevsim devices:

# echo "10 1" > /sys/bus/netdevsim/new_device
# echo "20 1" > /sys/bus/netdevsim/new_device

Dump all available metrics:

# devlink -s dev metric show
netdevsim/netdevsim10:
   metric dummy_counter type counter group 0 value 2
netdevsim/netdevsim20:
   metric dummy_counter type counter group 0 value 2

Dump a specific metric:

# devlink -s dev metric show netdevsim/netdevsim10 metric dummy_counter
netdevsim/netdevsim10:
   metric dummy_counter type counter group 0 value 3

Set a metric to a different group:

# devlink dev metric set netdevsim/netdevsim10 metric dummy_counter group 10

Dump all metrics in a specific group:

# devlink -s dev metric show group 10
netdevsim/netdevsim10:
   metric dummy_counter type counter group 10 value 4
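
For scripts or tests that consume this output, a small parser can turn it
into records. The following is a sketch that assumes exactly the
plain-text layout shown in the examples above (a "DEV:" header line
followed by indented "metric ... type ... group ... value ..." lines);
the function name parse_metric_show is hypothetical and the real tool's
output may differ.

```python
import re
from typing import Dict, List, Optional

# Matches an indented line such as:
#    metric dummy_counter type counter group 0 value 2
_METRIC_RE = re.compile(
    r"^\s+metric (?P<metric>\S+) type (?P<type>\S+) "
    r"group (?P<group>\d+) value (?P<value>\d+)\s*$"
)


def parse_metric_show(text: str) -> List[Dict[str, object]]:
    """Parse the text layout from the examples into a list of records."""
    rows: List[Dict[str, object]] = []
    dev: Optional[str] = None
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        match = _METRIC_RE.match(line)
        if match:
            if dev is None:
                raise ValueError("metric line before any device header")
            rows.append({
                "dev": dev,
                "metric": match["metric"],
                "type": match["type"],
                "group": int(match["group"]),
                "value": int(match["value"]),
            })
        elif not line[0].isspace() and line.rstrip().endswith(":"):
            dev = line.rstrip()[:-1]  # device header, e.g. "netdevsim/netdevsim10:"
        else:
            raise ValueError(f"unrecognized line: {line!r}")
    return rows
```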

Future extensions
=================

1. Enabling and disabling metrics. This is useful when a metric adds
latency while enabled or consumes limited resources (e.g., counters or
histogram agents). It is up to the device driver to decide whether a
metric is enabled by default. Proposed interface:

devlink dev metric set DEV metric METRIC [ group GROUP ]
	[ enable { true | false } ]

2. Histogram metrics. Some devices have the ability to calculate
histograms in hardware by sampling a specific parameter multiple times
per second, for example the transmission queue depth of a port. This
enables the debugging of microbursts, which would otherwise be invisible.
While this can be achieved in software using BPF, BPF is not applicable
when the data plane is offloaded, as the CPU does not see the traffic.
Proposed interface:

devlink dev metric set DEV metric METRIC [ group GROUP ]
	[ enable { true | false } ] [ hist_type { linear | exp } ]
	[ hist_sample_interval SAMPLE ] [ hist_min MIN ] [ hist_max MAX ]
	[ hist_buckets BUCKETS ]

3. Per-port metrics. While every metric could be exposed as a global
metric and tied to a port only through its name, there is value in
allowing user space to dump all metrics related to a certain port.
Proposed interface:

devlink port metric set DEV/PORT_INDEX metric METRIC [ group GROUP ]
	[ enable { true | false } ] [ hist_type { linear | exp } ]
	[ hist_sample_interval SAMPLE ] [ hist_min MIN ] [ hist_max MAX ]
	[ hist_buckets BUCKETS ]
devlink port metric show [ DEV/PORT_INDEX metric METRIC | group GROUP ]

To avoid duplicating ethtool functionality, we can decide to expose only
the following via this interface:

1. Configurable metrics
2. Metrics that are not only relevant to Ethernet ports
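
As an illustration of the histogram parameters proposed in item 2 of the
future extensions (hist_type, hist_min, hist_max, hist_buckets), the
sketch below shows one possible way the bucket boundaries could be
derived. The semantics are an assumption, since the cover letter does
not define them: "linear" is taken to mean equal-width buckets, and
"exp" is taken to mean bucket widths that double from one bucket to the
next. The function names are hypothetical.

```python
from bisect import bisect_right
from typing import List


def bucket_edges(hist_type: str, hist_min: float, hist_max: float,
                 hist_buckets: int) -> List[float]:
    """Return hist_buckets + 1 edges covering [hist_min, hist_max]."""
    if hist_buckets < 1 or hist_max <= hist_min:
        raise ValueError("need hist_buckets >= 1 and hist_max > hist_min")
    span = hist_max - hist_min
    if hist_type == "linear":
        # Assumed meaning: every bucket has the same width.
        step = span / hist_buckets
        return [hist_min + i * step for i in range(hist_buckets + 1)]
    if hist_type == "exp":
        # Assumed meaning: widths are w, 2w, 4w, ... and sum to the span.
        width = span / (2 ** hist_buckets - 1)
        edges = [hist_min]
        for i in range(hist_buckets):
            edges.append(edges[-1] + width * 2 ** i)
        return edges
    raise ValueError(f"unknown hist_type: {hist_type!r}")


def bucket_index(edges: List[float], sample: float) -> int:
    """Map a sample to its bucket, clamping out-of-range samples."""
    idx = bisect_right(edges, sample) - 1
    return max(0, min(idx, len(edges) - 2))
```

With a queue depth sampled into four buckets between 0 and 15, the "exp"
variant under these assumptions gives finer resolution near zero and
coarser resolution near the maximum.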

TODO
====

1. Add devlink-metric man page
2. Add selftests over mlxsw

Ido Schimmel (6):
  devlink: Add device metric infrastructure
  netdevsim: Add devlink metric support
  selftests: netdevsim: Add devlink metric tests
  mlxsw: reg: Add Tunneling NVE Counters Register
  mlxsw: reg: Add Tunneling NVE Counters Register Version 2
  mlxsw: spectrum_nve: Expose VXLAN counters via devlink-metric

 .../networking/devlink/devlink-metric.rst     |  37 ++
 Documentation/networking/devlink/index.rst    |   1 +
 Documentation/networking/devlink/mlxsw.rst    |  36 ++
 drivers/net/ethernet/mellanox/mlxsw/reg.h     | 104 ++++++
 .../ethernet/mellanox/mlxsw/spectrum_nve.h    |  10 +
 .../mellanox/mlxsw/spectrum_nve_vxlan.c       | 285 +++++++++++++++
 drivers/net/netdevsim/dev.c                   |  92 ++++-
 drivers/net/netdevsim/netdevsim.h             |   1 +
 include/net/devlink.h                         |  18 +
 include/uapi/linux/devlink.h                  |  19 +
 net/core/devlink.c                            | 346 ++++++++++++++++++
 .../drivers/net/netdevsim/devlink.sh          |  49 ++-
 12 files changed, 995 insertions(+), 3 deletions(-)
 create mode 100644 Documentation/networking/devlink/devlink-metric.rst

-- 
2.26.2

