From: Simon Horman <horms@kernel.org>
To: netdev@vger.kernel.org
Cc: Jakub Kicinski <kuba@kernel.org>, Jiri Pirko <jiri@resnulli.us>,
	Madhu Chittim <madhu.chittim@intel.com>,
	Sridhar Samudrala <sridhar.samudrala@intel.com>,
	Paolo Abeni <pabeni@redhat.com>
Subject: [RFC] HW TX Rate Limiting Driver API
Date: Fri, 5 Apr 2024 11:23:13 +0100
Message-ID: <20240405102313.GA310894@kernel.org>

Hi,

This is a follow-up to the ongoing discussion started by Intel to extend the
support for TX shaping H/W offload [1].

The goal is to allow user-space to configure TX shaping offload on a
per-queue basis with min guaranteed B/W, max B/W limit and burst size on a
VF device.


In the past few months several different solutions were attempted and
discussed, without finding a perfect fit:

- devlink_rate APIs are not appropriate for controlling TX shaping on netdevs
- No existing TC qdisc offload covers the required feature set
- HTB does not allow direct queue configuration
- MQPRIO imposes a constraint on the maximum number of TX queues
- TBF does not support max B/W limit
- ndo_set_tx_maxrate() only controls the max B/W limit

A new H/W offload API is needed, but offload API proliferation should be
avoided.

The following proposal intends to cover the requirements specified above and
provide a possible base to unify all the shaping offload APIs mentioned above.

The following only defines the in-kernel interface between the core and
drivers. The intention is to expose the feature to user-space via Netlink.
Hopefully the latter part will be straightforward once the in-kernel
interface is agreed upon.

All feedback and comments are more than welcome!

[1] https://lore.kernel.org/netdev/20230808015734.1060525-1-wenjun1.wu@intel.com/

Regards,
Simon with much assistance from Paolo

--- 
/* SPDX-License-Identifier: GPL-2.0-or-later */

#ifndef _NET_SHAPER_H_
#define _NET_SHAPER_H_

#include <linux/limits.h>
#include <linux/types.h>

struct net_device;
struct netlink_ext_ack;

/**
 * enum shaper_metric - the metric of the shaper
 * @SHAPER_METRIC_PPS: Shaper operates on a packets per second basis
 * @SHAPER_METRIC_BPS: Shaper operates on a bits per second basis
 */
enum shaper_metric {
	SHAPER_METRIC_PPS,
	SHAPER_METRIC_BPS,
};

#define SHAPER_ROOT_ID 0
#define SHAPER_NONE_ID UINT_MAX

/**
 * struct shaper_info - represent a node of the shaper hierarchy
 * @id: Unique identifier inside the shaper tree.
 * @parent_id: ID of parent shaper, or SHAPER_NONE_ID if the shaper has
 *             no parent. Only the root shaper has no parent.
 * @metric: Specifies whether the bw limits refer to PPS or BPS
 * @bw_min: Minimum guaranteed rate for this shaper
 * @bw_max: Maximum peak bw allowed for this shaper
 * @burst: Maximum burst for the peak rate of this shaper
 * @priority: Scheduling priority for this shaper
 * @weight: Scheduling weight for this shaper
 *
 * The full shaper hierarchy is maintained only by the
 * NIC driver (or firmware), possibly in a NIC-specific format
 * and/or in H/W tables.
 * The kernel uses this representation and the shaper_ops to
 * access, traverse, and update it.
 */
struct shaper_info {
	/* The following fields allow the full traversal of the whole
	 * hierarchy.
	 */
	u32 id;
	u32 parent_id;

	/* The following fields define the behavior of the shaper. */
	enum shaper_metric metric;
	u64 bw_min;
	u64 bw_max;
	u32 burst;
	u32 priority;
	u32 weight;
};

/**
 * enum shaper_lookup_mode - Lookup method used to access a shaper
 * @SHAPER_LOOKUP_BY_PORT: The root shaper for the whole H/W, @id is unused
 * @SHAPER_LOOKUP_BY_NETDEV: The main shaper for the given network device,
 *                           @id is unused
 * @SHAPER_LOOKUP_BY_VF: @id is a virtual function number.
 * @SHAPER_LOOKUP_BY_QUEUE: @id is a queue identifier.
 * @SHAPER_LOOKUP_BY_TREE_ID: @id is the unique shaper identifier inside the
 *                            shaper hierarchy as in shaper_info.id
 *
 * SHAPER_LOOKUP_BY_PORT, SHAPER_LOOKUP_BY_VF and SHAPER_LOOKUP_BY_TREE_ID are
 * only available on PF devices, usually inside the host/hypervisor.
 * SHAPER_LOOKUP_BY_NETDEV is available on both PF and VF devices, but
 * only if the latter are privileged ones.
 * The same shaper can be reached with different lookup mode/id pairs,
 * mapping network visible objects (devices, VFs, queues) to the scheduler
 * hierarchy and vice-versa.
 */
enum shaper_lookup_mode {
	SHAPER_LOOKUP_BY_PORT,
	SHAPER_LOOKUP_BY_NETDEV,
	SHAPER_LOOKUP_BY_VF,
	SHAPER_LOOKUP_BY_QUEUE,
	SHAPER_LOOKUP_BY_TREE_ID,
};


/**
 * struct shaper_ops - Operations on shaper hierarchy
 * @get: Access the specified shaper.
 * @set: Modify the specified shaper.
 * @move: Move the specified shaper inside the hierarchy.
 * @add: Add a shaper inside the shaper hierarchy.
 * @delete: Delete the specified shaper.
 *
 * The netdevice exposes a pointer to these ops.
 *
 * It’s up to the driver or firmware to create the default shaper hierarchy,
 * according to the H/W capabilities.
 */
struct shaper_ops {
	/* get - Fetch the specified shaper, if it exists
	 * @dev: Netdevice to operate on.
	 * @lookup_mode: How to perform the shaper lookup
	 * @id: ID of the specified shaper,
	 *      relative to the specified @lookup_mode.
	 * @shaper: Object to return shaper.
	 * @extack: Netlink extended ACK for reporting errors.
	 *
	 * Multiple lookup mode/id pairs can refer to the same shaper.
	 * And multiple entities (e.g. VF and PF) can try to access the same
	 * shaper concurrently.
	 *
	 * Values of @id depend on the @lookup_mode:
	 * * If @lookup_mode is SHAPER_LOOKUP_BY_PORT or
	 *   SHAPER_LOOKUP_BY_NETDEV, then @id is unused.
	 * * If @lookup_mode is SHAPER_LOOKUP_BY_VF,
	 *   then @id is a virtual function number, relative to @dev
	 *   which should be a physical function.
	 * * If @lookup_mode is SHAPER_LOOKUP_BY_QUEUE,
	 *   then @id represents the queue number, relative to @dev.
	 * * If @lookup_mode is SHAPER_LOOKUP_BY_TREE_ID,
	 *   then @id is a @shaper_info.id and any shaper inside the
	 *   hierarchy can be accessed directly.
	 *
	 * Return:
	 * * %0 - Success
	 * * %-EOPNOTSUPP - Operation is not supported by hardware, driver,
	 *		    or core for any reason. @extack should be set
	 *		    to text describing the reason.
	 * * Other negative error value on failure.
	 */
	int (*get)(struct net_device *dev,
		   enum shaper_lookup_mode lookup_mode, u32 id,
		   struct shaper_info *shaper, struct netlink_ext_ack *extack);

	/* set - Update the specified shaper, if it exists
	 * @dev: Netdevice to operate on.
	 * @lookup_mode: How to perform the shaper lookup
	 * @id: ID of the specified shaper,
	 *      relative to the specified @lookup_mode.
	 * @shaper: Configuration of shaper.
	 * @extack: Netlink extended ACK for reporting errors.
	 *
	 * Configure the parameters of @shaper according to values supplied
	 * in the following fields:
	 * * @shaper.metric
	 * * @shaper.bw_min
	 * * @shaper.bw_max
	 * * @shaper.burst
	 * * @shaper.priority
	 * * @shaper.weight
	 * Values supplied in other fields of @shaper must be zero and,
	 * other than verifying that, are ignored.
	 *
	 * Return:
	 * * %0 - Success
	 * * %-EOPNOTSUPP - Operation is not supported by hardware, driver,
	 *		    or core for any reason. @extack should be set
	 *		    to text describing the reason.
	 * * Other negative error values on failure.
	 */
	int (*set)(struct net_device *dev,
		   enum shaper_lookup_mode lookup_mode, u32 id,
		   const struct shaper_info *shaper,
		   struct netlink_ext_ack *extack);

	/* move - Change the parent ID of the specified shaper
	 * @dev: Netdevice to operate on.
	 * @lookup_mode: How to perform the shaper lookup
	 * @id: ID of the specified shaper,
	 *      relative to the specified @lookup_mode.
	 * @new_parent_id: New ID of the parent shaper,
	 *                 always relative to the SHAPER_LOOKUP_BY_TREE_ID
	 *                 lookup mode
	 * @extack: Netlink extended ACK for reporting errors.
	 *
	 * Move the specified shaper in the hierarchy replacing its
	 * current parent shaper with @new_parent_id
	 *
	 * Return:
	 * * %0 - Success
	 * * %-EOPNOTSUPP - Operation is not supported by hardware, driver,
	 *		    or core for any reason. @extack should be set
	 *		    to text describing the reason.
	 * * Other negative error values on failure.
	 */
	int (*move)(struct net_device *dev,
		    enum shaper_lookup_mode lookup_mode, u32 id,
		    u32 new_parent_id, struct netlink_ext_ack *extack);

	/* add - Add a shaper inside the shaper hierarchy
	 * @dev: netdevice to operate on.
	 * @shaper: configuration of shaper.
	 * @extack: Netlink extended ACK for reporting errors.
	 *
	 * @shaper.id must be set to SHAPER_NONE_ID as
	 * the id for the shaper will be automatically allocated.
	 * @shaper.parent_id determines where inside the shaper's tree
	 * this node is inserted.
	 *
	 * Return:
	 * * non-negative shaper id on success
	 * * %-EOPNOTSUPP - Operation is not supported by hardware, driver,
	 *		    or core for any reason. @extack should be set
	 *		    to text describing the reason.
	 * * Other negative error values on failure.
	 *
	 * Examples or reasons this operation may fail include:
	 * * H/W resources limits.
	 * * The parent is a ‘leaf’ node - attached to a queue.
	 * * Can’t respect the requested bw limits.
	 */
	int (*add)(struct net_device *dev, const struct shaper_info *shaper,
		   struct netlink_ext_ack *extack);

	/* delete - Remove a shaper from the shaper hierarchy
	 * @dev: netdevice to operate on.
	 * @lookup_mode: how to perform the shaper lookup
	 * @id: ID of the specified shaper,
	 *      relative to the specified @lookup_mode.
	 * @shaper: Object to return the deleted shaper configuration.
	 *              Ignored if NULL.
	 * @extack: Netlink extended ACK for reporting errors.
	 *
	 * Return:
	 * * %0 - Success
	 * * %-EOPNOTSUPP - Operation is not supported by hardware, driver,
	 *		    or core for any reason. @extack should be set
	 *		    to text describing the reason.
	 * * Other negative error values on failure.
	 */
	int (*delete)(struct net_device *dev,
		      enum shaper_lookup_mode lookup_mode,
		      u32 id, struct shaper_info *shaper,
		      struct netlink_ext_ack *extack);
};

/*
 * Examples:
 * - set shaping on a given queue
 *   struct shaper_info info = {
 *           // fill this
 *   };
 *   dev->shaper_ops->set(dev, SHAPER_LOOKUP_BY_QUEUE, queue_id, &info, NULL);
 *
 * - create a queue group with queue group shaping limits.
 *   Assuming the following topology already exists:
 *                    < netdev shaper >
 *                      /           \
 *         <queue 0 shaper> . . .  <queue N shaper>
 *
 *   struct shaper_info pinfo, ginfo;
 *   dev->shaper_ops->get(dev, SHAPER_LOOKUP_BY_NETDEV, 0, &pinfo, NULL);
 *
 *   ginfo.id = SHAPER_NONE_ID;
 *   ginfo.parent_id = pinfo.id;
 *   // fill-in other shaper params...
 *   new_node_id = dev->shaper_ops->add(dev, &ginfo, NULL);
 *
 *   // now topology is:
 *   //                  <    netdev shaper    >
 *   //                  /            |        \
 *   //                 /             |        <newly created shaper>
 *   //                /              |
 *   // <queue 0 shaper> . . . <queue N shaper>
 *
 *   // move the shapers for queues 0..2 into the new queue group
 *   for (i = 0; i <= 2; ++i)
 *           dev->shaper_ops->move(dev, SHAPER_LOOKUP_BY_QUEUE, i,
 *                                 new_node_id, NULL);
 *
 *   // now topology is:
 *   //                   < netdev shaper >
 *   //                   /              \
 *   //        <newly created shaper>   <queue 3 shaper> ... <queue n shaper>
 *   //         /                   \
 *   // <queue 0 shaper> ... <queue 2 shaper>
 */
#endif
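
For illustration only, and not part of the proposed header: a minimal
userspace sketch of how a driver might resolve lookup mode/id pairs to
shapers, so that the same shaper is reachable both by queue number and by
tree ID. Everything prefixed toy_ is made up for this example, as are the
fixed four-queue layout and the use of -ENOENT for an unknown ID. The real
ops take a struct net_device and an extack; both are dropped here so that the
code builds outside the kernel.

```c
/* Userspace mock-up; kernel types are replaced by stand-ins. */
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t u32;
typedef uint64_t u64;

enum shaper_lookup_mode {
	SHAPER_LOOKUP_BY_PORT,
	SHAPER_LOOKUP_BY_NETDEV,
	SHAPER_LOOKUP_BY_VF,
	SHAPER_LOOKUP_BY_QUEUE,
	SHAPER_LOOKUP_BY_TREE_ID,
};

#define SHAPER_NONE_ID UINT_MAX

/* Trimmed copy of struct shaper_info: only the fields used here. */
struct shaper_info {
	u32 id;
	u32 parent_id;
	u64 bw_min;
	u64 bw_max;
	u32 burst;
};

/* Hypothetical toy device: tree id 0 is the netdev shaper and
 * tree ids 1..TOY_QUEUES are the per-queue leaf shapers.
 */
#define TOY_QUEUES 4
struct toy_dev {
	struct shaper_info nodes[TOY_QUEUES + 1];
};

void toy_init(struct toy_dev *dev)
{
	for (u32 i = 0; i <= TOY_QUEUES; i++)
		dev->nodes[i] = (struct shaper_info){ .id = i, .parent_id = 0 };
	dev->nodes[0].parent_id = SHAPER_NONE_ID;
}

/* Resolve a (lookup_mode, id) pair to a node of the toy tree. */
int toy_lookup(struct toy_dev *dev, enum shaper_lookup_mode mode, u32 id,
	       struct shaper_info **node)
{
	switch (mode) {
	case SHAPER_LOOKUP_BY_NETDEV:	/* @id is unused */
		*node = &dev->nodes[0];
		return 0;
	case SHAPER_LOOKUP_BY_QUEUE:	/* @id is a queue number */
		if (id >= TOY_QUEUES)
			return -ENOENT;
		*node = &dev->nodes[id + 1];
		return 0;
	case SHAPER_LOOKUP_BY_TREE_ID:	/* @id is shaper_info.id */
		if (id > TOY_QUEUES)
			return -ENOENT;
		*node = &dev->nodes[id];
		return 0;
	default:			/* no port or VF shapers in this toy */
		return -EOPNOTSUPP;
	}
}

int toy_get(struct toy_dev *dev, enum shaper_lookup_mode mode, u32 id,
	    struct shaper_info *shaper)
{
	struct shaper_info *node;
	int err = toy_lookup(dev, mode, id, &node);

	if (!err)
		*shaper = *node;
	return err;
}

int toy_set(struct toy_dev *dev, enum shaper_lookup_mode mode, u32 id,
	    const struct shaper_info *shaper)
{
	struct shaper_info *node;
	int err = toy_lookup(dev, mode, id, &node);

	if (err)
		return err;
	/* Only the behavioural fields are taken from @shaper. */
	node->bw_min = shaper->bw_min;
	node->bw_max = shaper->bw_max;
	node->burst = shaper->burst;
	return 0;
}
```

With this layout, a toy_set() via SHAPER_LOOKUP_BY_QUEUE on queue 2 and a
toy_get() via SHAPER_LOOKUP_BY_TREE_ID on ID 3 touch the same node.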

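The rule for set() that the non-configurable fields of @shaper must be zero
could be checked along these lines. This is only a sketch:
shaper_set_validate() is a hypothetical helper name, and -EINVAL is an
assumed choice, since the proposal only says "other negative error values".
Kernel types are replaced by userspace stand-ins.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

typedef uint32_t u32;
typedef uint64_t u64;

enum shaper_metric {
	SHAPER_METRIC_PPS,
	SHAPER_METRIC_BPS,
};

struct shaper_info {
	u32 id;
	u32 parent_id;
	enum shaper_metric metric;
	u64 bw_min;
	u64 bw_max;
	u32 burst;
	u32 priority;
	u32 weight;
};

/* Hypothetical helper: check a configuration passed to set().
 * @id and @parent_id are not configurable through set(), so they
 * must be zero; the -EINVAL return value is an assumption.
 */
int shaper_set_validate(const struct shaper_info *shaper)
{
	if (shaper->id || shaper->parent_id)
		return -EINVAL;
	return 0;
}
```

A driver or the core could run such a check before touching the H/W, so that
a caller who fills in parent_id by mistake gets an error instead of a
silently ignored field.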

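The queue group example from the header comment can also be tried out in
userspace. The sketch below is an assumption-laden toy rather than a driver:
struct toy_node, struct toy_dev, the toy_* functions, the eight-node limit
and the specific errno values are all invented for illustration. It tracks
only the parent links, which is enough to show add() allocating a new ID and
move() re-parenting queue shapers.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdint.h>

typedef uint32_t u32;

#define SHAPER_NONE_ID UINT_MAX
#define TOY_QUEUES 4
#define TOY_MAX_NODES 8

/* Trimmed node: only the fields needed to track the hierarchy. */
struct toy_node {
	u32 id;
	u32 parent_id;
	int is_leaf;	/* non-zero if attached to a queue */
};

/* Hypothetical toy device: node 0 is the netdev shaper, nodes
 * 1..TOY_QUEUES are queue shapers, the remaining slots are free.
 */
struct toy_dev {
	struct toy_node nodes[TOY_MAX_NODES];
	u32 nr_nodes;
};

void toy_init(struct toy_dev *dev)
{
	dev->nr_nodes = TOY_QUEUES + 1;
	for (u32 i = 0; i < dev->nr_nodes; i++)
		dev->nodes[i] = (struct toy_node){
			.id = i, .parent_id = 0, .is_leaf = i > 0,
		};
	dev->nodes[0].parent_id = SHAPER_NONE_ID;
}

/* add: allocate the next free ID under @parent_id, return the new ID. */
int toy_add(struct toy_dev *dev, u32 parent_id)
{
	u32 id;

	if (parent_id >= dev->nr_nodes)
		return -ENOENT;
	if (dev->nodes[parent_id].is_leaf)	/* parent is a queue shaper */
		return -EINVAL;
	if (dev->nr_nodes == TOY_MAX_NODES)	/* "H/W" resource limit */
		return -ENOSPC;
	id = dev->nr_nodes++;
	dev->nodes[id] = (struct toy_node){ .id = id, .parent_id = parent_id };
	return (int)id;
}

/* move: re-parent the shaper of @queue under tree ID @new_parent_id. */
int toy_move_queue(struct toy_dev *dev, u32 queue, u32 new_parent_id)
{
	if (queue >= TOY_QUEUES || new_parent_id >= dev->nr_nodes)
		return -ENOENT;
	if (dev->nodes[new_parent_id].is_leaf)
		return -EINVAL;
	dev->nodes[queue + 1].parent_id = new_parent_id;
	return 0;
}

/* Build the queue group: queues 0..2 go under a newly added node. */
int toy_make_group(struct toy_dev *dev)
{
	int group = toy_add(dev, 0);

	if (group < 0)
		return group;
	for (u32 q = 0; q <= 2; q++) {
		int err = toy_move_queue(dev, q, (u32)group);

		if (err)
			return err;
	}
	return group;
}
```

After toy_init() and toy_make_group(), the shapers for queues 0..2 hang off
the new node while the queue 3 shaper stays directly under the netdev shaper,
matching the final topology drawn in the header comment.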