netdev.vger.kernel.org archive mirror
* [PATCH RESEND net-next v3 00/18] devlink: rate objects API
@ 2021-06-02 12:17 dlinkin
  2021-06-02 12:17 ` [PATCH RESEND net-next v3 01/18] netdevsim: Add max_vfs to bus_dev dlinkin
                   ` (19 more replies)
  0 siblings, 20 replies; 28+ messages in thread
From: dlinkin @ 2021-06-02 12:17 UTC
  To: netdev
  Cc: davem, kuba, jiri, stephen, dsahern, vladbu, parav, huyn, Dmytro Linkin

From: Dmytro Linkin <dlinkin@nvidia.com>

Resending without RFC.

Currently the kernel provides a way to change the tx rate of a single VF
in switchdev mode via the tc-police action. When lots of VFs are
configured, managing their rates becomes a non-trivial task and some
grouping mechanism is required. Implementing such grouping in tc-police
would bring flow-related limitations and unwanted complications, like:
- tc-police is a policer and there is a user request for a traffic
  shaper, so a shared tc-police action is not suitable;
- flows require a net device to be placed on, which means "groups"
  wouldn't have a net device instance of their own. Taking the previous
  point into account, a solution was reviewed in which the representor
  has a policer and the driver uses a shaper if the qdisc contains a
  group of VFs; such an approach is ugly, complicated and misleading;
- TC is ingress only, while configuring the "other" side of the wire looks
  more like a "real" picture where shaping is outside of the steering
  world, similar to the "ip link" command.

Given the above, devlink is the most appropriate place.

This series introduces a devlink API for managing the tx rate of a single
devlink port or of a group by invoking callbacks (see below) of the
corresponding driver. A devlink port or a group can also be added to a
parent group, where the driver is responsible for handling the rates of
the group's elements. To achieve all of that, a new rate object is added.
It can be one of two types:
- leaf - represents a single devlink port; created/destroyed by the
  driver and bound to the devlink port. For example, a driver may
  create a leaf rate object for every devlink port associated with a VF.
  Since a leaf has a 1:1 mapping to its devlink port, in userspace it is
  referred to as pci/<bus_addr>/<port_index>;
- node - represents a group of rate objects; created/deleted by request
  from userspace; initially empty (no rate objects added). In userspace
  it is referred to as pci/<bus_addr>/<node_name>, where the node name
  can be anything except a decimal number, to avoid collisions with leafs.
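
The naming rule for leafs and nodes can be illustrated with a small
userspace sketch. It is not part of the series: classify_rate_handle is a
hypothetical helper, and the only rule it encodes is the one stated above,
that a purely decimal last component is a port index (leaf) and anything
else is a node name.

```python
def classify_rate_handle(handle):
    """Split '<bus>/<bus_addr>/<name>' and decide leaf vs node.

    Hypothetical helper for illustration only; it mirrors the naming
    rule from the cover letter, not the actual devlink parser.
    """
    parts = handle.split("/")
    if len(parts) != 3 or not all(parts):
        raise ValueError("expected <bus>/<bus_addr>/<port_index|node_name>")
    bus, addr, name = parts
    # A purely decimal last component is a port index, hence a leaf.
    if name.isascii() and name.isdigit():
        return ("leaf", bus, addr, int(name))
    return ("node", bus, addr, name)


print(classify_rate_handle("pci/0000:03:00.0/1"))
print(classify_rate_handle("pci/0000:03:00.0/1st_group"))
```

A name such as "1st_group" starts with a digit but is not a decimal
number, so it is treated as a node.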

devlink_ops is extended with the following callbacks:
- rate_{leaf|node}_tx_{share|max}_set
- rate_node_{new|del}
- rate_{leaf|node}_parent_set

KAPI provides:
- creation/destruction of the leaf rate object associated with a devlink
  port
- destruction of rate nodes, to allow a vendor driver to free allocated
  resources on driver removal or for other reasons that require node
  destruction

UAPI provides:
- dumping all or single rate objects
- setting tx_{share|max} of rate object of any type
- creating/deleting node rate object
- setting/unsetting parent of any rate object
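
The operations listed above can be pictured with a toy in-memory model.
This is only a sketch under stated assumptions, not the kernel code: the
class and method names are made up for illustration, rates are plain
integers defaulting to 0, and refusing to delete a node that still has
children is an assumption of the sketch.

```python
class RateObjects:
    """Toy model of the rate objects of one devlink device.

    Illustration only: names, defaults and error handling are
    assumptions, not the kernel implementation.
    """

    def __init__(self):
        self.objs = {}  # name -> {"type", "tx_share", "tx_max", "parent"}

    def _add(self, name, kind):
        if name in self.objs:
            raise ValueError(f"{name} already exists")
        self.objs[name] = {"type": kind, "tx_share": 0, "tx_max": 0,
                           "parent": None}

    def leaf_create(self, port_index):
        # Driver side: one leaf per devlink port.
        self._add(str(port_index), "leaf")

    def node_new(self, name):
        # Userspace side: node names must not be decimal numbers.
        if name.isdigit():
            raise ValueError("node name must not be a decimal number")
        self._add(name, "node")

    def set_rate(self, name, tx_share=None, tx_max=None):
        obj = self.objs[name]
        if tx_share is not None:
            obj["tx_share"] = tx_share
        if tx_max is not None:
            obj["tx_max"] = tx_max

    def set_parent(self, name, parent=None):
        # parent=None unsets the parent; only another node can be a parent.
        if parent is not None:
            if parent == name or self.objs[parent]["type"] != "node":
                raise ValueError("parent must be another node")
        self.objs[name]["parent"] = parent

    def node_del(self, name):
        # Assumption: a node that still has children cannot be deleted.
        if self.objs[name]["type"] != "node":
            raise ValueError("only nodes can be deleted from userspace")
        if any(o["parent"] == name for o in self.objs.values()):
            raise ValueError("node still has children")
        del self.objs[name]

    def dump(self, name=None):
        return dict(self.objs) if name is None else {name: self.objs[name]}


rates = RateObjects()
rates.leaf_create(1)
rates.node_new("some_group")
rates.set_rate("some_group", tx_share=10**9, tx_max=5 * 10**9)
rates.set_parent("1", "some_group")
print(rates.dump("1"))
```

In this model a parent has to be emptied (each child's parent unset)
before it can be deleted, which is the situation the flush question
under "Issues/open questions" is about.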

Added devlink rate object support for the netdevsim driver.

Issues/open questions:
- Does the user need a DEVLINK_CMD_RATE_DEL_ALL_CHILD command to clean
  all children of a particular parent node? For example:
  $ devlink port function rate flush netdevsim/netdevsim10/group
- the priv pointer passed to the callbacks is a source of bugs; in the
  leaf case the driver can embed the rate object into an internal
  structure and use container_of() on it; in the node case this cannot
  be done since nodes are created from userspace

v1->v2:
- fixed kernel-doc for devlink_rate_leaf_{create|destroy}()
- s/func/function/ for all devlink port command occurrences

v2->v3:
- devlink:
  - added devlink_rate_nodes_destroy() function
- netdevsim:
  - added call of devlink_rate_nodes_destroy() function

Dmytro Linkin (18):
  netdevsim: Add max_vfs to bus_dev
  netdevsim: Disable VFs on nsim_dev_reload_destroy() call
  netdevsim: Implement port types and indexing
  netdevsim: Implement VFs
  netdevsim: Implement legacy/switchdev mode for VFs
  devlink: Introduce rate object
  netdevsim: Register devlink rate leaf objects per VF
  selftest: netdevsim: Add devlink rate test
  devlink: Allow setting tx rate for devlink rate leaf objects
  netdevsim: Implement devlink rate leafs tx rate support
  selftest: netdevsim: Add devlink port shared/max tx rate test
  devlink: Introduce rate nodes
  netdevsim: Implement support for devlink rate nodes
  selftest: netdevsim: Add devlink rate nodes test
  devlink: Allow setting parent node of rate objects
  netdevsim: Allow setting parent node of rate objects
  selftest: netdevsim: Add devlink rate grouping test
  Documentation: devlink rate objects

 Documentation/networking/devlink/devlink-port.rst  |  35 ++
 Documentation/networking/devlink/netdevsim.rst     |  26 +
 drivers/net/netdevsim/bus.c                        | 131 +++-
 drivers/net/netdevsim/dev.c                        | 396 ++++++++++++-
 drivers/net/netdevsim/netdev.c                     |  95 ++-
 drivers/net/netdevsim/netdevsim.h                  |  48 ++
 include/net/devlink.h                              |  48 ++
 include/uapi/linux/devlink.h                       |  17 +
 net/core/devlink.c                                 | 660 ++++++++++++++++++++-
 .../selftests/drivers/net/netdevsim/devlink.sh     | 167 +++++-
 10 files changed, 1565 insertions(+), 58 deletions(-)

-- 
1.8.3.1


* [PATCH RESEND iproute2 net-next 0/4] devlink rate support
@ 2021-06-08 11:22 dlinkin
  2021-06-08 11:22 ` [PATCH RESEND iproute2 net-next 4/4] devlink: Add ISO/IEC switch dlinkin
  0 siblings, 1 reply; 28+ messages in thread
From: dlinkin @ 2021-06-08 11:22 UTC
  To: netdev; +Cc: davem, kuba, jiri, dsahern, stephen, vladbu, parav, huyn, dlinkin

From: Dmytro Linkin <dlinkin@nvidia.com>

Resend rebased on top of net-next.

The series implements devlink rate commands, which are:
- Dump particular or all rate objects (JSON or non-JSON)
- Add/Delete node rate object
- Set tx rate share/max values for rate object
- Set/Unset parent rate object for other rate object

Examples:

Display all rate objects:

    # devlink port function rate show
    pci/0000:03:00.0/1 type leaf parent some_group
    pci/0000:03:00.0/2 type leaf tx_share 12Mbit
    pci/0000:03:00.0/some_group type node tx_share 1Gbps tx_max 5Gbps

Display leaf rate object bound to the 1st devlink port of the
pci/0000:03:00.0 device:

    # devlink port function rate show pci/0000:03:00.0/1
    pci/0000:03:00.0/1 type leaf

Display node rate object with name some_group of the pci/0000:03:00.0
device:

    # devlink port function rate show pci/0000:03:00.0/some_group
    pci/0000:03:00.0/some_group type node

Display leaf rate object rate values using IEC units:

    # devlink -i port function rate show pci/0000:03:00.0/2
    pci/0000:03:00.0/2 type leaf tx_share 11718Kibit

Display pci/0000:03:00.0/2 leaf rate object as pretty JSON output:

    # devlink -jp port function rate show pci/0000:03:00.0/2
    {
        "rate": {
            "pci/0000:03:00.0/2": {
                "type": "leaf",
                "tx_share": 1500000
            }
        }
    }

Create node rate object with name "1st_group" on pci/0000:03:00.0 device:

    # devlink port function rate add pci/0000:03:00.0/1st_group

Create node rate object with specified parameters:

    # devlink port function rate add pci/0000:03:00.0/2nd_group \
        tx_share 10Mbit tx_max 30Mbit parent 1st_group

Set parameters to the specified leaf rate object:

    # devlink port function rate set pci/0000:03:00.0/1 \
        tx_share 2Mbit tx_max 10Mbit

Set leaf's parent to "1st_group":

    # devlink port function rate set pci/0000:03:00.0/1 parent 1st_group

Unset leaf's parent:

    # devlink port function rate set pci/0000:03:00.0/1 noparent

Delete node rate object:

    # devlink port function rate del pci/0000:03:00.0/2nd_group

Rate values can be specified in bits or bytes per second (bit|bps), with
any SI (k, m, g, t) or IEC (ki, mi, gi, ti) prefix. A bare number means
bits per second. Units are also printed in "show" command output, but
not necessarily the same ones that were specified with the "set" or "add"
command. The -i/--iec switch forces output in IEC units. JSON output
always prints values as bytes per second.
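
The unit rules can be checked against the examples above with a short
sketch. It is not the devlink parser: parse_rate and to_kibit are
hypothetical helpers, the accepted grammar is only what the paragraph
above describes, and truncating to whole Kibit is an assumption chosen to
match the "show" examples.

```python
import re

SI = {"": 1, "k": 10**3, "m": 10**6, "g": 10**9, "t": 10**12}
IEC = {"ki": 2**10, "mi": 2**20, "gi": 2**30, "ti": 2**40}
_RATE_RE = re.compile(r"^(\d+)(ki|mi|gi|ti|k|m|g|t)?(bit|bps)?$",
                      re.IGNORECASE)


def parse_rate(text):
    """Return the rate in bytes per second (hypothetical helper)."""
    m = _RATE_RE.match(text.strip())
    if not m:
        raise ValueError(f"bad rate: {text!r}")
    value = int(m.group(1))
    prefix = (m.group(2) or "").lower()
    unit = (m.group(3) or "bit").lower()  # a bare number means bits/s
    scaled = value * {**SI, **IEC}[prefix]
    # "bps" is bytes per second; "bit" is bits per second.
    return scaled if unit == "bps" else scaled // 8


def to_kibit(bytes_per_sec):
    """Whole Kibit per second, truncated (the rounding is an assumption)."""
    return bytes_per_sec * 8 // 1024


rate = parse_rate("12Mbit")
print(rate)            # 1500000, the tx_share value in the JSON example
print(to_kibit(rate))  # 11718, as in the -i example
```

This ties the three views of the same leaf together: 12Mbit in the
default output, 11718Kibit with -i, and 1500000 bytes per second in JSON.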

Dmytro Linkin (4):
  uapi: update devlink kernel header
  devlink: Add helper function to validate object handler
  devlink: Add port func rate support
  devlink: Add ISO/IEC switch

 devlink/devlink.c            | 527 ++++++++++++++++++++++++++++++++++++++++---
 include/uapi/linux/devlink.h |  17 ++
 man/man8/devlink-port.8      |   8 +
 man/man8/devlink-rate.8      | 270 ++++++++++++++++++++++
 man/man8/devlink.8           |   4 +
 5 files changed, 797 insertions(+), 29 deletions(-)
 create mode 100644 man/man8/devlink-rate.8

-- 
1.8.3.1



Thread overview: 28+ messages
2021-06-02 12:17 [PATCH RESEND net-next v3 00/18] devlink: rate objects API dlinkin
2021-06-02 12:17 ` [PATCH RESEND net-next v3 01/18] netdevsim: Add max_vfs to bus_dev dlinkin
2021-06-02 12:17 ` [PATCH RESEND net-next v3 02/18] netdevsim: Disable VFs on nsim_dev_reload_destroy() call dlinkin
2021-06-02 12:17 ` [PATCH RESEND net-next v3 03/18] netdevsim: Implement port types and indexing dlinkin
2021-06-02 12:17 ` [PATCH RESEND net-next v3 04/18] netdevsim: Implement VFs dlinkin
2021-06-02 12:17 ` [PATCH RESEND net-next v3 05/18] netdevsim: Implement legacy/switchdev mode for VFs dlinkin
2021-06-02 12:17 ` [PATCH RESEND net-next v3 06/18] devlink: Introduce rate object dlinkin
2021-06-02 12:17 ` [PATCH RESEND net-next v3 07/18] netdevsim: Register devlink rate leaf objects per VF dlinkin
2021-06-02 12:17 ` [PATCH RESEND net-next v3 08/18] selftest: netdevsim: Add devlink rate test dlinkin
2021-06-02 12:17 ` [PATCH RESEND net-next v3 09/18] devlink: Allow setting tx rate for devlink rate leaf objects dlinkin
2021-06-02 12:17 ` [PATCH RESEND net-next v3 10/18] netdevsim: Implement devlink rate leafs tx rate support dlinkin
2021-06-02 12:17 ` [PATCH RESEND net-next v3 11/18] selftest: netdevsim: Add devlink port shared/max tx rate test dlinkin
2021-06-02 12:17 ` [PATCH RESEND net-next v3 12/18] devlink: Introduce rate nodes dlinkin
2021-06-02 12:17 ` [PATCH RESEND net-next v3 13/18] netdevsim: Implement support for devlink " dlinkin
2021-06-02 12:17 ` [PATCH RESEND net-next v3 14/18] selftest: netdevsim: Add devlink rate nodes test dlinkin
2021-06-02 12:17 ` [PATCH RESEND net-next v3 15/18] devlink: Allow setting parent node of rate objects dlinkin
2021-06-02 12:17 ` [PATCH RESEND net-next v3 16/18] netdevsim: " dlinkin
2021-06-02 12:17 ` [PATCH RESEND net-next v3 17/18] selftest: netdevsim: Add devlink rate grouping test dlinkin
2021-06-02 12:17 ` [PATCH RESEND net-next v3 18/18] Documentation: devlink rate objects dlinkin
2021-06-02 12:31 ` [PATCH RESEND iproute2 net-next 0/4] devlink rate support Dmytro Linkin
2021-06-02 12:31   ` [PATCH RESEND iproute2 net-next 1/4] uapi: update devlink kernel header Dmytro Linkin
2021-06-02 12:31   ` [PATCH RESEND iproute2 net-next 2/4] devlink: Add helper function to validate object handler Dmytro Linkin
2021-06-02 12:31   ` [PATCH RESEND iproute2 net-next 3/4] devlink: Add port func rate support Dmytro Linkin
2021-06-02 12:31   ` [PATCH RESEND iproute2 net-next 4/4] devlink: Add ISO/IEC switch Dmytro Linkin
2021-06-02 16:58 ` [PATCH RESEND net-next v3 00/18] devlink: rate objects API Jakub Kicinski
2021-06-03  8:53   ` Dmytro Linkin
2021-06-04  1:59   ` Yunsheng Lin
2021-06-08 11:22 [PATCH RESEND iproute2 net-next 0/4] devlink rate support dlinkin
2021-06-08 11:22 ` [PATCH RESEND iproute2 net-next 4/4] devlink: Add ISO/IEC switch dlinkin
