Netdev Archive on lore.kernel.org
 help / color / Atom feed
From: Parav Pandit <parav@nvidia.com>
To: <netdev@vger.kernel.org>, <linux-rdma@vger.kernel.org>,
	<gregkh@linuxfoundation.org>
Cc: <jiri@nvidia.com>, <jgg@nvidia.com>, <dledford@redhat.com>,
	<leonro@nvidia.com>, <saeedm@nvidia.com>, <kuba@kernel.org>,
	<davem@davemloft.net>, Parav Pandit <parav@nvidia.com>
Subject: [PATCH net-next 00/13] Add mlx5 subfunction support
Date: Thu, 12 Nov 2020 21:24:10 +0200
Message-ID: <20201112192424.2742-1-parav@nvidia.com> (raw)

Hi Dave, Jakub, Greg,

This series introduces support for mlx5 subfunction (SF).
A subfunction is a portion of a PCI device that supports multiple
classes of devices such as netdev, RDMA and more.

This patchset is based on Leon's series [3].
It is a third user of proposed auxiliary bus [4].

Subfunction support is discussed in detail in RFC [1] and [2].
RFC [1] and extension [2] describes requirements, design, and proposed
plumbing using devlink, auxiliary bus and sysfs for systemd/udev
support.

Patch summary:
--------------
Patch 1 to 6 prepares devlink:
Patch-1 prepares code to handle multiple port function attributes
Patch-2 introduces devlink pcisf port flavour similar to pcipf and pcivf
Patch-3 adds port add and delete driver callbacks
Patch-4 adds port function state get and set callbacks
Patch-5 refactors devlink to avoid using global mutext
Patch-6 uses refcount to allow creating devlink instance from existing
one
Patch 7 to 13 implements mlx5 pieces for SF support.
Patch-7 adds SF auxiliary device
Patch-8 adds SF auxiliary driver
Patch-9 prepares eswitch to handler SF vport
PAtch-10 adds eswitch helpers to add/remove SF vport
Patch-11 adds SF device configuration commands
Patch-12 implements devlink port add/del callbacks
Patch-13 implements devlink port function get/set callbacks

More on SF plumbing below.

overview:
--------
A subfunction can be created and deleted by a user using devlink port
add/delete interface.

A subfunction can be configured using devlink port function attributes
before its activated.

When a subfunction is activated, it results in an auxiliary device.
A driver binds to the auxiliary device that further creates supported
class devices.

short example sequence:
-----------------------
Change device to switchdev mode:
$ devlink dev eswitch set pci/0000:06:00.0 mode switchdev

Add a devlink port of subfunction flaovur:
$ devlink port add pci/0000:06:00.0 flavour pcisf pfnum 0 sfnum 88

Configure mac address of the port function:
$ devlink port function set ens2f0npf0sf88 hw_addr 00:00:00:00:88:88

Now activate the function:
$ devlink port function set ens2f0npf0sf88 state active

Now use the auxiliary device and class devices:
$ devlink dev show
pci/0000:06:00.0
auxiliary/mlx5_core.sf.4

$ ip link show
127: ens2f0np0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 24:8a:07:b3:d1:12 brd ff:ff:ff:ff:ff:ff
    altname enp6s0f0np0
129: p0sf88: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 00:00:00:00:88:88 brd ff:ff:ff:ff:ff:ff

$ rdma dev show
43: rdmap6s0f0: node_type ca fw 16.28.1002 node_guid 248a:0703:00b3:d112 sys_image_guid 248a:0703:00b3:d112
44: mlx5_0: node_type ca fw 16.28.1002 node_guid 0000:00ff:fe00:8888 sys_image_guid 248a:0703:00b3:d112

subfunction (SF) in detail:
---------------------------
- A sub-function is a portion of the PCI device which supports multiple
  classes of devices such as netdev, RDMA and more.
- A SF netdev has its own dedicated queues(txq, rxq).
- A SF RDMA device has its own QP1, GID table and other RDMA resources.
- A SF supports eswitch representation and tc offload support similar
  to existing PF and VF representors.
- User must configure eswitch to send/receive SF's packets.
- A SF shares PCI level resources with other SFs and/or with its
  parent PCI function.
  For example, an SF shares IRQ vectors with other SFs and its
  PCI function.
  In future it may have dedicated IRQ vector per SF.
  A SF has dedicated window in PCI BAR space that is not shared
  with other SFs or PF. This ensures that when a SF is assigned to
  an application, only that application can access device resources.
- SF's auxiliary device exposes sfnum sysfs attribute. This will be
  used by systemd/udev to deterministic names for its netdev and
  RDMA device.

[1] https://lore.kernel.org/netdev/20200519092258.GF4655@nanopsycho/
[2] https://marc.info/?l=linux-netdev&m=158555928517777&w=2
[3] https://lists.linuxfoundation.org/pipermail/virtualization/2020-November/050473.html
[4] https://lore.kernel.org/linux-rdma/20201023003338.1285642-2-david.m.ertman@intel.com/

Parav Pandit (11):
  devlink: Prepare code to fill multiple port function attributes
  devlink: Introduce PCI SF port flavour and port attribute
  devlink: Support add and delete devlink port
  devlink: Support get and set state of port function
  devlink: Avoid global devlink mutex, use per instance reload lock
  devlink: Introduce devlink refcount to reduce scope of global
    devlink_mutex
  net/mlx5: SF, Add auxiliary device support
  net/mlx5: SF, Add auxiliary device driver
  net/mlx5: E-switch, Add eswitch helpers for SF vport
  net/mlx5: SF, Add port add delete functionality
  net/mlx5: SF, Port function state change support

Vu Pham (2):
  net/mlx5: E-switch, Prepare eswitch to handle SF vport
  net/mlx5: SF, Add SF configuration hardware commands

 .../net/ethernet/mellanox/mlx5/core/Kconfig   |  19 +
 .../net/ethernet/mellanox/mlx5/core/Makefile  |   9 +
 drivers/net/ethernet/mellanox/mlx5/core/cmd.c |   4 +
 .../net/ethernet/mellanox/mlx5/core/devlink.c |   7 +
 drivers/net/ethernet/mellanox/mlx5/core/eq.c  |   2 +-
 .../mellanox/mlx5/core/esw/acl/egress_ofld.c  |   2 +-
 .../mellanox/mlx5/core/esw/devlink_port.c     |  41 ++
 .../net/ethernet/mellanox/mlx5/core/eswitch.c |  46 +-
 .../net/ethernet/mellanox/mlx5/core/eswitch.h |  82 +++
 .../mellanox/mlx5/core/eswitch_offloads.c     |  47 +-
 .../net/ethernet/mellanox/mlx5/core/main.c    |  48 +-
 .../ethernet/mellanox/mlx5/core/mlx5_core.h   |  10 +
 .../net/ethernet/mellanox/mlx5/core/pci_irq.c |  20 +
 .../net/ethernet/mellanox/mlx5/core/sf/cmd.c  |  48 ++
 .../ethernet/mellanox/mlx5/core/sf/dev/dev.c  | 213 ++++++++
 .../ethernet/mellanox/mlx5/core/sf/dev/dev.h  |  68 +++
 .../mellanox/mlx5/core/sf/dev/driver.c        | 105 ++++
 .../net/ethernet/mellanox/mlx5/core/sf/priv.h |  14 +
 .../net/ethernet/mellanox/mlx5/core/sf/sf.c   | 498 ++++++++++++++++++
 .../net/ethernet/mellanox/mlx5/core/sf/sf.h   |  59 +++
 .../net/ethernet/mellanox/mlx5/core/vport.c   |   3 +-
 include/linux/mlx5/driver.h                   |  12 +-
 include/net/devlink.h                         |  82 +++
 include/uapi/linux/devlink.h                  |  26 +
 net/core/devlink.c                            | 362 +++++++++++--
 25 files changed, 1754 insertions(+), 73 deletions(-)
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/sf/cmd.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/sf/dev/dev.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/sf/dev/dev.h
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/sf/dev/driver.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/sf/priv.h
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/sf/sf.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/sf/sf.h

-- 
2.26.2


             reply index

Thread overview: 57+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-12 19:24 Parav Pandit [this message]
2020-11-12 19:24 ` [PATCH net-next 01/13] devlink: Prepare code to fill multiple port function attributes Parav Pandit
2020-11-12 19:24 ` [PATCH net-next 02/13] devlink: Introduce PCI SF port flavour and port attribute Parav Pandit
2020-11-12 19:24 ` [PATCH net-next 03/13] devlink: Support add and delete devlink port Parav Pandit
2020-11-18 16:21   ` David Ahern
2020-11-18 17:02     ` Parav Pandit
2020-11-18 18:03       ` David Ahern
2020-11-18 18:38         ` Jason Gunthorpe
2020-11-18 19:36           ` David Ahern
2020-11-18 20:42             ` Jason Gunthorpe
2020-11-18 19:22         ` Parav Pandit
2020-11-19  0:41           ` Jacob Keller
2020-11-19  1:17             ` David Ahern
2020-11-19  1:56               ` Samudrala, Sridhar
2020-11-19  0:52       ` Jacob Keller
2020-11-12 19:24 ` [PATCH net-next 04/13] devlink: Support get and set state of port function Parav Pandit
2020-11-12 19:24 ` [PATCH net-next 05/13] devlink: Avoid global devlink mutex, use per instance reload lock Parav Pandit
2020-11-12 19:24 ` [PATCH net-next 06/13] devlink: Introduce devlink refcount to reduce scope of global devlink_mutex Parav Pandit
2020-11-12 19:24 ` [PATCH net-next 07/13] net/mlx5: SF, Add auxiliary device support Parav Pandit
2020-12-07  2:48   ` David Ahern
2020-12-07  4:53     ` Parav Pandit
2020-11-12 19:24 ` [PATCH net-next 08/13] net/mlx5: SF, Add auxiliary device driver Parav Pandit
2020-11-12 19:24 ` [PATCH net-next 09/13] net/mlx5: E-switch, Prepare eswitch to handle SF vport Parav Pandit
2020-11-12 19:24 ` [PATCH net-next 10/13] net/mlx5: E-switch, Add eswitch helpers for " Parav Pandit
2020-11-12 19:24 ` [PATCH net-next 11/13] net/mlx5: SF, Add SF configuration hardware commands Parav Pandit
2020-11-12 19:24 ` [PATCH net-next 12/13] net/mlx5: SF, Add port add delete functionality Parav Pandit
2020-11-12 19:24 ` [PATCH net-next 13/13] net/mlx5: SF, Port function state change support Parav Pandit
2020-11-16 22:52 ` [PATCH net-next 00/13] Add mlx5 subfunction support Jakub Kicinski
2020-11-17  0:06   ` Saeed Mahameed
2020-11-17  1:58     ` Jakub Kicinski
2020-11-17  4:08       ` Parav Pandit
2020-11-17 17:11         ` Jakub Kicinski
2020-11-17 18:49           ` Jason Gunthorpe
2020-11-19  2:14             ` Jakub Kicinski
2020-11-19  4:35               ` David Ahern
2020-11-19  5:57                 ` Saeed Mahameed
2020-11-20  1:31                   ` Jakub Kicinski
2020-11-25  5:33                   ` David Ahern
2020-11-25  6:00                     ` Parav Pandit
2020-11-25 14:37                       ` David Ahern
2020-11-20  1:29                 ` Jakub Kicinski
2020-11-20 17:58                   ` Alexander Duyck
2020-11-20 19:04                     ` Samudrala, Sridhar
2020-11-23 21:51                       ` Saeed Mahameed
2020-11-24  7:01                       ` Jason Wang
2020-11-24  7:05                         ` Jason Wang
2020-11-19  6:12               ` Saeed Mahameed
2020-11-19  8:25                 ` Parav Pandit
2020-11-20  1:35                 ` Jakub Kicinski
2020-11-20  3:34                   ` Parav Pandit
2020-11-17 18:50           ` Parav Pandit
2020-11-19  2:23             ` Jakub Kicinski
2020-11-19  6:22               ` Saeed Mahameed
2020-11-19 14:00                 ` Jason Gunthorpe
2020-11-20  3:35                   ` Jakub Kicinski
2020-11-20  3:50                     ` Parav Pandit
2020-11-20 16:16                     ` Jason Gunthorpe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201112192424.2742-1-parav@nvidia.com \
    --to=parav@nvidia.com \
    --cc=davem@davemloft.net \
    --cc=dledford@redhat.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=jgg@nvidia.com \
    --cc=jiri@nvidia.com \
    --cc=kuba@kernel.org \
    --cc=leonro@nvidia.com \
    --cc=linux-rdma@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=saeedm@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Netdev Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/netdev/0 netdev/git/0.git
	git clone --mirror https://lore.kernel.org/netdev/1 netdev/git/1.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 netdev netdev/ https://lore.kernel.org/netdev \
		netdev@vger.kernel.org
	public-inbox-index netdev

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.netdev


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git