netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Saeed Mahameed <saeed@kernel.org>
To: "David S. Miller" <davem@davemloft.net>,
	Jakub Kicinski <kuba@kernel.org>,
	Jason Gunthorpe <jgg@nvidia.com>
Cc: Leon Romanovsky <leonro@nvidia.com>,
	netdev@vger.kernel.org, linux-rdma@vger.kernel.org,
	David Ahern <dsahern@kernel.org>,
	Jacob Keller <jacob.e.keller@intel.com>,
	Sridhar Samudrala <sridhar.samudrala@intel.com>,
	david.m.ertman@intel.com, dan.j.williams@intel.com,
	kiran.patil@intel.com, gregkh@linuxfoundation.org,
	Parav Pandit <parav@nvidia.com>,
	Saeed Mahameed <saeedm@nvidia.com>
Subject: [net-next v5 15/15] net/mlx5: Add devlink subfunction port documentation
Date: Tue, 15 Dec 2020 01:03:58 -0800	[thread overview]
Message-ID: <20201215090358.240365-16-saeed@kernel.org> (raw)
In-Reply-To: <20201215090358.240365-1-saeed@kernel.org>

From: Parav Pandit <parav@nvidia.com>

Add documentation for subfunction management using devlink
port.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 .../device_drivers/ethernet/mellanox/mlx5.rst | 204 ++++++++++++++++++
 1 file changed, 204 insertions(+)

diff --git a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5.rst b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5.rst
index a5eb22793bb9..07e38c044355 100644
--- a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5.rst
+++ b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5.rst
@@ -12,6 +12,8 @@ Contents
 - `Enabling the driver and kconfig options`_
 - `Devlink info`_
 - `Devlink parameters`_
+- `mlx5 subfunction`_
+- `mlx5 port function`_
 - `Devlink health reporters`_
 - `mlx5 tracepoints`_
 
@@ -181,6 +183,208 @@ User command examples:
       values:
          cmode driverinit value true
 
+mlx5 subfunction
+================
+mlx5 supports subfunctions management using devlink port (see :ref:`Documentation/networking/devlink/devlink-port.rst <devlink_port>`) interface.
+
+A Subfunction has its own function capabilities and its own resources. This
+means a subfunction has its own dedicated queues(txq, rxq, cq, eq). These queues
+are neither shared nor stolen from the parent PCI function.
+
+When subfunction is RDMA capable, it has its own QP1, GID table and rdma
+resources neither shared nor stolen from the parent PCI function.
+
+A subfunction has dedicated window in PCI BAR space that is not shared
+with the other subfunctions or parent PCI function. This ensures that all
+class devices of the subfunction accesses only assigned PCI BAR space.
+
+A Subfunction supports eswitch representation through which it supports tc
+offloads. User must configure eswitch to send/receive packets from/to
+subfunction port.
+
+Subfunctions share PCI level resources such as PCI MSI-X IRQs with
+the other subfunctions and/or with its parent PCI function.
+
+Example mlx5 software, system and device view::
+
+       _______
+      | admin |
+      | user  |----------
+      |_______|         |
+          |             |
+      ____|____       __|______            _________________
+     |         |     |         |          |                 |
+     | devlink |     | tc tool |          |    user         |
+     | tool    |     |_________|          | applications    |
+     |_________|         |                |_________________|
+           |             |                   |          |
+           |             |                   |          |         Userspace
+ +---------|-------------|-------------------|----------|--------------------+
+           |             |           +----------+   +----------+   Kernel
+           |             |           |  netdev  |   | rdma dev |
+           |             |           +----------+   +----------+
+   (devlink port add/del |              ^               ^
+    port function set)   |              |               |
+           |             |              +---------------|
+      _____|___          |              |        _______|_______
+     |         |         |              |       | mlx5 class    |
+     | devlink |   +------------+       |       |   drivers     |
+     | kernel  |   | rep netdev |       |       |(mlx5_core,ib) |
+     |_________|   +------------+       |       |_______________|
+           |             |              |               ^
+   (devlink ops)         |              |          (probe/remove)
+  _________|________     |              |           ____|________
+ | subfunction      |    |     +---------------+   | subfunction |
+ | management driver|-----     | subfunction   |---|  driver     |
+ | (mlx5_core)      |          | auxiliary dev |   | (mlx5_core) |
+ |__________________|          +---------------+   |_____________|
+           |                                            ^
+  (sf add/del, vhca events)                             |
+           |                                      (device add/del)
+      _____|____                                    ____|________
+     |          |                                  | subfunction |
+     |  PCI NIC |---- activate/deactive events---->| host driver |
+     |__________|                                  | (mlx5_core) |
+                                                   |_____________|
+
+Subfunction is created using devlink port interface.
+
+- Change device to switchdev mode::
+
+    $ devlink dev eswitch set pci/0000:06:00.0 mode switchdev
+
+- Add a devlink port of subfunction flavour::
+
+    $ devlink port add pci/0000:06:00.0 flavour pcisf pfnum 0 sfnum 88
+
+- Show a devlink port of the subfunction::
+
+    $ devlink port show pci/0000:06:00.0/32768
+    pci/0000:06:00.0/32768: type eth netdev enp6s0pf0sf88 flavour pcisf pfnum 0 sfnum 88
+      function:
+        hw_addr 00:00:00:00:00:00
+
+- Delete a devlink port of subfunction after use::
+
+    $ devlink port del pci/0000:06:00.0 flavour pcisf pfnum 0 sfnum 88
+
+mlx5 port function
+==================
+mlx5 driver provides mechanism to setup PCI VF/SF port function
+attributes in unified way for smartnic and non-smartnic NICs.
+
+This is supported only when eswitch mode is set to switchdev. Port function
+configuration of the PCI VF/SF is supported through devlink eswitch port.
+
+Port function attributes should be set before PCI VF/SF is enumerated by the
+driver.
+
+MAC address setup
+-----------------
+mlx5 driver provides mechanism to setup the MAC address of the PCI VF/SF.
+
+Configured MAC address of the PCI VF/SF will be used by netdevice and rdma
+device created for the PCI VF/SF.
+
+- Get MAC address of the VF identified by its unique devlink port index::
+
+    $ devlink port show pci/0000:06:00.0/2
+    pci/0000:06:00.0/2: type eth netdev enp6s0pf0vf1 flavour pcivf pfnum 0 vfnum 1
+      function:
+        hw_addr 00:00:00:00:00:00
+
+- Set MAC address of the VF identified by its unique devlink port index::
+
+    $ devlink port function set pci/0000:06:00.0/2 hw_addr 00:11:22:33:44:55
+
+    $ devlink port show pci/0000:06:00.0/2
+    pci/0000:06:00.0/2: type eth netdev enp6s0pf0vf1 flavour pcivf pfnum 0 vfnum 1
+      function:
+        hw_addr 00:11:22:33:44:55
+
+- Get MAC address of the SF identified by its unique devlink port index::
+
+    $ devlink port show pci/0000:06:00.0/32768
+    pci/0000:06:00.0/32768: type eth netdev enp6s0pf0sf88 flavour pcisf pfnum 0 sfnum 88
+      function:
+        hw_addr 00:00:00:00:00:00
+
+- Set MAC address of the VF identified by its unique devlink port index::
+
+    $ devlink port function set pci/0000:06:00.0/32768 hw_addr 00:00:00:00:88:88
+
+    $ devlink port show pci/0000:06:00.0/32768
+    pci/0000:06:00.0/32768: type eth netdev enp6s0pf0sf88 flavour pcivf pfnum 0 sfnum 88
+      function:
+        hw_addr 00:00:00:00:88:88
+
+SF state setup
+--------------
+To use the SF, user must active the SF using SF port function state attribute.
+
+- Get state of the SF identified by its unique devlink port index::
+
+   $ devlink port show ens2f0npf0sf88
+   pci/0000:06:00.0/32768: type eth netdev ens2f0npf0sf88 flavour pcisf controller 0 pfnum 0 sfnum 88 external false splittable false
+     function:
+       hw_addr 00:00:00:00:88:88 state inactive opstate detached
+
+- Activate the function and verify its state is active::
+
+   $ devlink port function set ens2f0npf0sf88 state active
+
+   $ devlink port show ens2f0npf0sf88
+   pci/0000:06:00.0/32768: type eth netdev ens2f0npf0sf88 flavour pcisf controller 0 pfnum 0 sfnum 88 external false splittable false
+     function:
+       hw_addr 00:00:00:00:88:88 state active opstate detached
+
+Upon function activation, PF driver instance gets the event from the device that
+particular SF was activated. It's the cue to put the device on bus, probe it and
+instantiate devlink instance and class specific auxiliary devices for it.
+
+- Show the auxiliary device and port of the subfunction::
+
+    $ devlink dev show
+    devlink dev show auxiliary/mlx5_core.sf.4
+
+    $ devlink port show auxiliary/mlx5_core.sf.4/1
+    auxiliary/mlx5_core.sf.4/1: type eth netdev p0sf88 flavour virtual port 0 splittable false
+
+    $ rdma link show mlx5_0/1
+    link mlx5_0/1 state ACTIVE physical_state LINK_UP netdev p0sf88
+
+    $ rdma dev show
+    8: rocep6s0f1: node_type ca fw 16.29.0550 node_guid 248a:0703:00b3:d113 sys_image_guid 248a:0703:00b3:d112
+    13: mlx5_0: node_type ca fw 16.29.0550 node_guid 0000:00ff:fe00:8888 sys_image_guid 248a:0703:00b3:d112
+
+- Subfunction auxiliary device and class device hierarchy::
+
+                 mlx5_core.sf.4
+          (subfunction auxiliary device)
+                       /\
+                      /  \
+                     /    \
+                    /      \
+                   /        \
+      mlx5_core.eth.4     mlx5_core.rdma.4
+     (sf eth aux dev)     (sf rdma aux dev)
+         |                      |
+         |                      |
+      p0sf88                  mlx5_0
+     (sf netdev)          (sf rdma device)
+
+Additionally SF port also gets the event when the driver attaches to the
+auxiliary device of the subfunction. This results in changing the operational
+state of the function. This provides visibility to user to decide when it is
+safe to delete the SF port for graceful termination of the subfunction.
+
+- Show the SF port operational state::
+
+    $ devlink port show ens2f0npf0sf88
+    pci/0000:06:00.0/32768: type eth netdev ens2f0npf0sf88 flavour pcisf controller 0 pfnum 0 sfnum 88 external false splittable false
+      function:
+        hw_addr 00:00:00:00:88:88 state active opstate attached
+
 Devlink health reporters
 ========================
 
-- 
2.26.2


      parent reply	other threads:[~2020-12-15  9:06 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-12-15  9:03 [net-next v5 00/15] Add mlx5 subfunction support Saeed Mahameed
2020-12-15  9:03 ` [net-next v5 01/15] net/mlx5: Fix compilation warning for 32-bit platform Saeed Mahameed
2020-12-15  9:03 ` [net-next v5 02/15] devlink: Prepare code to fill multiple port function attributes Saeed Mahameed
2020-12-15  9:03 ` [net-next v5 03/15] devlink: Introduce PCI SF port flavour and port attribute Saeed Mahameed
2020-12-15 23:27   ` Jakub Kicinski
2020-12-16  3:42     ` Parav Pandit
2020-12-16 23:59       ` Jakub Kicinski
2020-12-17  4:44         ` Saeed Mahameed
2020-12-18 19:48           ` Jakub Kicinski
2020-12-19  4:43             ` Parav Pandit
2020-12-15  9:03 ` [net-next v5 04/15] devlink: Support add and delete devlink port Saeed Mahameed
2020-12-16  0:29   ` Jakub Kicinski
2020-12-16  5:06     ` Parav Pandit
2020-12-15  9:03 ` [net-next v5 05/15] devlink: Support get and set state of port function Saeed Mahameed
2020-12-16  0:37   ` Jakub Kicinski
2020-12-16  5:15     ` Parav Pandit
2020-12-16 16:15       ` David Ahern
2020-12-17  0:08       ` Jakub Kicinski
2020-12-17  5:46         ` Parav Pandit
2020-12-18 19:51           ` Jakub Kicinski
2020-12-19  5:06             ` Parav Pandit
2020-12-15  9:03 ` [net-next v5 06/15] net/mlx5: Introduce vhca state event notifier Saeed Mahameed
2020-12-15  9:03 ` [net-next v5 07/15] net/mlx5: SF, Add auxiliary device support Saeed Mahameed
2020-12-16  0:43   ` Jakub Kicinski
2020-12-16  5:19     ` Parav Pandit
2020-12-17  0:11       ` Jakub Kicinski
2020-12-17  5:23         ` Parav Pandit
2020-12-18 19:58           ` Jakub Kicinski
2020-12-19  4:53             ` Parav Pandit
2020-12-19 17:43               ` Jakub Kicinski
2020-12-15  9:03 ` [net-next v5 08/15] net/mlx5: SF, Add auxiliary device driver Saeed Mahameed
2020-12-15  9:03 ` [net-next v5 09/15] net/mlx5: E-switch, Prepare eswitch to handle SF vport Saeed Mahameed
2020-12-16  0:47   ` Jakub Kicinski
2020-12-16  5:28     ` Parav Pandit
2020-12-15  9:03 ` [net-next v5 10/15] net/mlx5: E-switch, Add eswitch helpers for " Saeed Mahameed
2020-12-15  9:03 ` [net-next v5 11/15] net/mlx5: SF, Add port add delete functionality Saeed Mahameed
2020-12-16  0:51   ` Jakub Kicinski
2020-12-16  5:31     ` Parav Pandit
2020-12-15  9:03 ` [net-next v5 12/15] net/mlx5: SF, Port function state change support Saeed Mahameed
2020-12-15  9:03 ` [net-next v5 13/15] devlink: Add devlink port documentation Saeed Mahameed
2020-12-16  0:57   ` Jakub Kicinski
2020-12-16  5:40     ` Parav Pandit
2020-12-15  9:03 ` [net-next v5 14/15] devlink: Extend devlink port documentation for subfunctions Saeed Mahameed
2020-12-16  1:00   ` Jakub Kicinski
2020-12-16  3:55     ` Parav Pandit
2020-12-15  9:03 ` Saeed Mahameed [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201215090358.240365-16-saeed@kernel.org \
    --to=saeed@kernel.org \
    --cc=dan.j.williams@intel.com \
    --cc=davem@davemloft.net \
    --cc=david.m.ertman@intel.com \
    --cc=dsahern@kernel.org \
    --cc=gregkh@linuxfoundation.org \
    --cc=jacob.e.keller@intel.com \
    --cc=jgg@nvidia.com \
    --cc=kiran.patil@intel.com \
    --cc=kuba@kernel.org \
    --cc=leonro@nvidia.com \
    --cc=linux-rdma@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=parav@nvidia.com \
    --cc=saeedm@nvidia.com \
    --cc=sridhar.samudrala@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).