From: Yishai Hadas <yishaih@nvidia.com>
To: <netdev@vger.kernel.org>, <davem@davemloft.net>, <kuba@kernel.org>
Cc: <jgg@nvidia.com>, <dledford@redhat.com>,
	<linux-rdma@vger.kernel.org>, <parav@nvidia.com>,
	<saeedm@nvidia.com>, Yishai Hadas <yishaih@nvidia.com>,
	Jiri Pirko <jiri@nvidia.com>
Subject: [PATCH net-next RESEND 1/2] devlink: Expose port function commands to control roce
Date: Tue, 2 Feb 2021 10:06:13 +0200
Message-ID: <20210202080614.37903-2-yishaih@nvidia.com>
In-Reply-To: <20210202080614.37903-1-yishaih@nvidia.com>

Expose port function commands to turn RoCE on or off. These commands
control the RoCE device capability of the port function.

When RoCE is disabled for a function of the port, the function cannot
create any RoCE-specific resources (e.g. the GID table).
Disabling RoCE also reduces system memory use. For example, disabling
RoCE on a VF/SF saves 1 Mbyte of system memory per function.

Example of a PCI VF port that supports function configuration,
setting the roce attribute of the VF's port function:

$ devlink port show pci/0000:06:00.0/2
pci/0000:06:00.0/2: type eth netdev enp6s0pf0vf1 flavour pcivf pfnum 0 vfnum 1
    function:
        hw_addr 00:00:00:00:00:00 roce on

$ devlink port function set pci/0000:06:00.0/2 roce off

$ devlink port show pci/0000:06:00.0/2
pci/0000:06:00.0/2: type eth netdev enp6s0pf0vf1 flavour pcivf pfnum 0 vfnum 1
    function:
        hw_addr 00:00:00:00:00:00 roce off

Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
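For reviewers who want to see the callback contract in isolation, here
is a minimal userspace sketch of how the get/set pair and the core's
handling of -EOPNOTSUPP are meant to interact. Everything prefixed with
mock_ is a hypothetical stand-in and not part of this patch; the kernel
types (struct devlink, struct devlink_port, netlink attributes, extack)
are deliberately left out. The 0..1 check imitates the
NLA_POLICY_RANGE(NLA_U8, 0, 1) policy, which in the real code is
enforced by netlink attribute validation rather than in the set helper,
and the -ERANGE return there is an assumption of this sketch.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a devlink port; not the kernel definition. */
struct mock_port {
	bool roce_capable;	/* whether the mock driver handles roce here */
	bool roce_on;		/* current roce state of the port function */
};

/* Reduced model of the two callbacks this patch adds to devlink_ops. */
struct mock_ops {
	int (*port_fn_roce_get)(struct mock_port *port, bool *on);
	int (*port_fn_roce_set)(struct mock_port *port, bool on);
};

/* Mock driver: report -EOPNOTSUPP for ports it cannot handle. */
static int mock_drv_roce_get(struct mock_port *port, bool *on)
{
	if (!port->roce_capable)
		return -EOPNOTSUPP;
	*on = port->roce_on;
	return 0;
}

static int mock_drv_roce_set(struct mock_port *port, bool on)
{
	if (!port->roce_capable)
		return -EOPNOTSUPP;
	port->roce_on = on;
	return 0;
}

static const struct mock_ops mock_drv_ops = {
	.port_fn_roce_get = mock_drv_roce_get,
	.port_fn_roce_set = mock_drv_roce_set,
};

/*
 * Modelled on devlink_port_function_roce_fill(): a missing callback or
 * -EOPNOTSUPP means the attribute is skipped, not that the dump fails.
 */
static int mock_roce_fill(const struct mock_ops *ops, struct mock_port *port,
			  bool *reported, bool *on)
{
	int err;

	*reported = false;
	if (!ops->port_fn_roce_get)
		return 0;

	err = ops->port_fn_roce_get(port, on);
	if (err)
		return err == -EOPNOTSUPP ? 0 : err;

	*reported = true;
	return 0;
}

/*
 * Modelled on devlink_port_fn_roce_set(). The range check stands in for
 * the netlink policy; its error code is an assumption of this sketch.
 */
static int mock_roce_set(const struct mock_ops *ops, struct mock_port *port,
			 unsigned char val)
{
	if (val > 1)
		return -ERANGE;
	if (!ops->port_fn_roce_set)
		return -EOPNOTSUPP;
	return ops->port_fn_roce_set(port, val);
}
```

The asymmetry is intentional in the patch: on the get path an
unsupported port silently omits the roce attribute from the dump, while
on the set path the user gets -EOPNOTSUPP back.
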
 .../networking/devlink/devlink-port.rst       |  5 +-
 include/net/devlink.h                         | 22 +++++++
 include/uapi/linux/devlink.h                  |  1 +
 net/core/devlink.c                            | 63 +++++++++++++++++++
 4 files changed, 90 insertions(+), 1 deletion(-)

diff --git a/Documentation/networking/devlink/devlink-port.rst b/Documentation/networking/devlink/devlink-port.rst
index e99b41599465..541e19f9d256 100644
--- a/Documentation/networking/devlink/devlink-port.rst
+++ b/Documentation/networking/devlink/devlink-port.rst
@@ -110,7 +110,7 @@ devlink ports for both the controllers.
 Function configuration
 ======================
 
-A user can configure the function attribute before enumerating the PCI
+A user can configure one or more function attributes before enumerating the PCI
 function. Usually it means, user should configure function attribute
 before a bus specific device for the function is created. However, when
 SRIOV is enabled, virtual function devices are created on the PCI bus.
@@ -122,6 +122,9 @@ A user may set the hardware address of the function using
 'devlink port function set hw_addr' command. For Ethernet port function
 this means a MAC address.
 
+A user may also set the RoCE capability of the function using the
+'devlink port function set roce' command.
+
 Subfunction
 ============
 
diff --git a/include/net/devlink.h b/include/net/devlink.h
index 47b4b063401b..055280212b58 100644
--- a/include/net/devlink.h
+++ b/include/net/devlink.h
@@ -1451,6 +1451,28 @@ struct devlink_ops {
 				 struct devlink_port *port,
 				 enum devlink_port_fn_state state,
 				 struct netlink_ext_ack *extack);
+	/**
+	 * @port_fn_roce_get: Port function's roce get function.
+	 *
+	 * Should be used by device drivers to report the roce state of
+	 * a function managed by the devlink port. Driver should return
+	 * -EOPNOTSUPP if it doesn't support port function handling for
+	 * a particular port.
+	 */
+	int (*port_fn_roce_get)(struct devlink *devlink,
+				struct devlink_port *port, bool *on,
+				struct netlink_ext_ack *extack);
+	/**
+	 * @port_fn_roce_set: Port function's roce set function.
+	 *
+	 * Should be used by device drivers to enable/disable the roce state of
+	 * a function managed by the devlink port. Driver should return
+	 * -EOPNOTSUPP if it doesn't support port function handling for
+	 * a particular port.
+	 */
+	int (*port_fn_roce_set)(struct devlink *devlink,
+				struct devlink_port *port, bool on,
+				struct netlink_ext_ack *extack);
 };
 
 static inline void *devlink_priv(struct devlink *devlink)
diff --git a/include/uapi/linux/devlink.h b/include/uapi/linux/devlink.h
index f6008b2fa60f..77990b563d80 100644
--- a/include/uapi/linux/devlink.h
+++ b/include/uapi/linux/devlink.h
@@ -585,6 +585,7 @@ enum devlink_port_function_attr {
 	DEVLINK_PORT_FUNCTION_ATTR_HW_ADDR,	/* binary */
 	DEVLINK_PORT_FN_ATTR_STATE,	/* u8 */
 	DEVLINK_PORT_FN_ATTR_OPSTATE,	/* u8 */
+	DEVLINK_PORT_FN_ATTR_ROCE,	/* u8 */
 
 	__DEVLINK_PORT_FUNCTION_ATTR_MAX,
 	DEVLINK_PORT_FUNCTION_ATTR_MAX = __DEVLINK_PORT_FUNCTION_ATTR_MAX - 1
diff --git a/net/core/devlink.c b/net/core/devlink.c
index 737b61c2976e..d04318e79dc2 100644
--- a/net/core/devlink.c
+++ b/net/core/devlink.c
@@ -90,6 +90,7 @@ static const struct nla_policy devlink_function_nl_policy[DEVLINK_PORT_FUNCTION_
 	[DEVLINK_PORT_FN_ATTR_STATE] =
 		NLA_POLICY_RANGE(NLA_U8, DEVLINK_PORT_FN_STATE_INACTIVE,
 				 DEVLINK_PORT_FN_STATE_ACTIVE),
+	[DEVLINK_PORT_FN_ATTR_ROCE] = NLA_POLICY_RANGE(NLA_U8, 0, 1),
 };
 
 static LIST_HEAD(devlink_list);
@@ -724,6 +725,34 @@ static int devlink_nl_port_attrs_put(struct sk_buff *msg,
 	return 0;
 }
 
+static int devlink_port_function_roce_fill(struct devlink *devlink,
+					   const struct devlink_ops *ops,
+					   struct devlink_port *port,
+					   struct sk_buff *msg,
+					   struct netlink_ext_ack *extack,
+					   bool *msg_updated)
+{
+	bool on;
+	int err;
+
+	if (!ops->port_fn_roce_get)
+		return 0;
+
+	err = ops->port_fn_roce_get(devlink, port, &on, extack);
+	if (err) {
+		if (err == -EOPNOTSUPP)
+			return 0;
+		return err;
+	}
+
+	err = nla_put_u8(msg, DEVLINK_PORT_FN_ATTR_ROCE, on);
+	if (err)
+		return err;
+
+	*msg_updated = true;
+	return 0;
+}
+
 static int
 devlink_port_fn_hw_addr_fill(struct devlink *devlink, const struct devlink_ops *ops,
 			     struct devlink_port *port, struct sk_buff *msg,
@@ -820,6 +849,12 @@ devlink_nl_port_function_attrs_put(struct sk_buff *msg, struct devlink_port *por
 					   extack, &msg_updated);
 	if (err)
 		goto out;
+
+	err = devlink_port_function_roce_fill(devlink, ops, port, msg, extack,
+					      &msg_updated);
+	if (err)
+		goto out;
+
 	err = devlink_port_fn_state_fill(devlink, ops, port, msg, extack,
 					 &msg_updated);
 out:
@@ -1054,6 +1089,26 @@ static int devlink_port_type_set(struct devlink *devlink,
 	return -EOPNOTSUPP;
 }
 
+static int
+devlink_port_fn_roce_set(struct devlink *devlink, struct devlink_port *port,
+			 const struct nlattr *attr,
+			 struct netlink_ext_ack *extack)
+{
+	const struct devlink_ops *ops;
+	bool on;
+
+	on = nla_get_u8(attr);
+
+	ops = devlink->ops;
+	if (!ops->port_fn_roce_set) {
+		NL_SET_ERR_MSG_MOD(extack,
+				   "Port doesn't support roce function attribute");
+		return -EOPNOTSUPP;
+	}
+
+	return ops->port_fn_roce_set(devlink, port, on, extack);
+}
+
 static int
 devlink_port_function_hw_addr_set(struct devlink *devlink, struct devlink_port *port,
 				  const struct nlattr *attr, struct netlink_ext_ack *extack)
@@ -1126,6 +1181,14 @@ devlink_port_function_set(struct devlink *devlink, struct devlink_port *port,
 		if (err)
 			return err;
 	}
+
+	attr = tb[DEVLINK_PORT_FN_ATTR_ROCE];
+	if (attr) {
+		err = devlink_port_fn_roce_set(devlink, port, attr, extack);
+		if (err)
+			return err;
+	}
+
 	/* Keep this as the last function attribute set, so that when
 	 * multiple port function attributes are set along with state,
 	 * Those can be applied first before activating the state.
-- 
2.18.1


