linux-rdma.vger.kernel.org archive mirror
* [PATCH for-next 00/10] IB SR-IOV support
@ 2016-03-01 16:52 Eli Cohen
       [not found] ` <1456851143-138332-1-git-send-email-eli-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 34+ messages in thread
From: Eli Cohen @ 2016-03-01 16:52 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, liranl-VPRAkNaXOzVWk0Htik3J/w,
	Eli Cohen, netdev-u79uwXL29TY76Z2rM5mHXA

Hi Doug, Dave 

The following series adds support for managing SR-IOV IB VFs in a standard
way (rtnetlink, iproute2) through IPoIB ndo entries, which translate to
the corresponding verbs calls.

In IB networks, 64-bit GUIDs are used as the primary means of identification.
To support that for VFs, we added a set_vf_guid ndo, which is used to program
the VF node and port GUIDs from the PF.

Those verbs are implemented by the mlx5 driver along with some more changes 
needed in the driver, IPoIB and the IB core to support IB virtualization.

We've copied netdev only on the first patch of the series, as the rest of it
just uses that patch along with the existing ndos in IPoIB and adds things
that are internal to the IB stack.

The series is rebased against 4.5-rc6 and Meny's patches:
http://www.spinics.net/lists/linux-rdma/msg33536.html and also assumes
Leon's patch that extends the kernel ib device attr caps field to u64.

Eli 

Eli Cohen (9):
  net/core: Add support for configuring VF GUIDs
  IB/mlx5: Fix decision on using MAD_IFC
  IB/core: Support accessing SA in virtualized environment
  IB/core: Add interfaces to control VF attributes
  IB/ipoib: Add ndo operations for configuring VFs
  net/mlx5_core: Add VF param when querying vport counter
  net/mlx5_core: Implement modify HCA vport command
  IB/mlx5: Implement callbacks for manipulating VFs
  IB/ipoib: Allow mcast packets from other VFs

Or Gerlitz (1):
  IB/core: Use GRH when the path hop-limit > 0

 drivers/infiniband/core/device.c                |   1 +
 drivers/infiniband/core/sa_query.c              |   7 +-
 drivers/infiniband/core/verbs.c                 |  40 +++++
 drivers/infiniband/hw/mlx5/Makefile             |   2 +-
 drivers/infiniband/hw/mlx5/ib_virt.c            | 194 ++++++++++++++++++++++++
 drivers/infiniband/hw/mlx5/mad.c                |   2 +-
 drivers/infiniband/hw/mlx5/main.c               |  12 +-
 drivers/infiniband/hw/mlx5/mlx5_ib.h            |   8 +
 drivers/infiniband/ulp/ipoib/ipoib_ib.c         |  29 +++-
 drivers/infiniband/ulp/ipoib/ipoib_main.c       |  65 +++++++-
 drivers/net/ethernet/mellanox/mlx5/core/cmd.c   |   6 +
 drivers/net/ethernet/mellanox/mlx5/core/vport.c |  72 ++++++++-
 include/linux/mlx5/driver.h                     |   5 +-
 include/linux/mlx5/mlx5_ifc.h                   |   6 +
 include/linux/mlx5/vport.h                      |   7 +-
 include/linux/netdevice.h                       |   3 +
 include/rdma/ib_verbs.h                         |  25 +++
 include/uapi/linux/if_link.h                    |   7 +
 net/core/rtnetlink.c                            |  79 +++++++++-
 19 files changed, 547 insertions(+), 23 deletions(-)
 create mode 100644 drivers/infiniband/hw/mlx5/ib_virt.c

-- 
Cc: netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org

1.8.3.1

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH for-next 01/10] net/core: Add support for configuring VF GUIDs
       [not found] ` <1456851143-138332-1-git-send-email-eli-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2016-03-01 16:52   ` Eli Cohen
       [not found]     ` <1456851143-138332-2-git-send-email-eli-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2021-10-26 15:16     ` Eugene Syromiatnikov
  2016-03-01 16:52   ` [PATCH for-next 02/10] IB/mlx5: Fix decision on using MAD_IFC Eli Cohen
                     ` (8 subsequent siblings)
  9 siblings, 2 replies; 34+ messages in thread
From: Eli Cohen @ 2016-03-01 16:52 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, liranl-VPRAkNaXOzVWk0Htik3J/w,
	Eli Cohen, netdev-u79uwXL29TY76Z2rM5mHXA

Add two new NLAs to support configuration of Infiniband node or port
GUIDs. New applications can choose to use this interface to configure
GUIDs with iproute2 with commands such as:

ip link set dev ib0 vf 0 node_guid 00:02:c9:03:00:21:6e:70
ip link set dev ib0 vf 0 port_guid 00:02:c9:03:00:21:6e:78

For backwards compatibility, old applications which set the MAC of a VF
can also set the VF's port GUID on an InfiniBand port through set MAC. The
GUID is generated from the 6-byte MAC per IETF RFC 7042. Note that
when set MAC is used to set a port GUID, the node GUID is set as well (to
the port GUID value).
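
The MAC-to-GUID expansion described above can be sketched in userspace C as
follows. This is only an illustration that mirrors the byte layout used by
handle_vf_mac() in this patch (0xff, 0xfe inserted after the first three MAC
bytes). The helper name mac_to_guid is hypothetical and not part of the
series. Like the patch, the sketch leaves the universal/local bit untouched.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not in the patch): expand a 6-byte MAC into a
 * 64-bit GUID by inserting 0xff, 0xfe between the first three and the
 * last three MAC bytes. The result is returned in host byte order, with
 * the first MAC byte in the most significant position.
 */
static uint64_t mac_to_guid(const uint8_t mac[6])
{
	uint8_t d[8];
	uint64_t guid = 0;
	int i;

	assert(mac != NULL);

	d[0] = mac[0];
	d[1] = mac[1];
	d[2] = mac[2];
	d[3] = 0xff;
	d[4] = 0xfe;
	d[5] = mac[3];
	d[6] = mac[4];
	d[7] = mac[5];

	/* Assemble the bytes big-endian first, independent of host order. */
	for (i = 0; i < 8; i++)
		guid = (guid << 8) | d[i];

	return guid;
}
```

With this layout, a MAC of 00:02:c9:21:6e:70 maps to the GUID
00:02:c9:ff:fe:21:6e:70.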

A new ndo, ndo_set_vf_guid, is introduced to notify the net device of the
request to change the GUID.

Signed-off-by: Eli Cohen <eli-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 include/linux/netdevice.h    |  3 ++
 include/uapi/linux/if_link.h |  7 ++++
 net/core/rtnetlink.c         | 79 ++++++++++++++++++++++++++++++++++++++++----
 3 files changed, 83 insertions(+), 6 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 5440b7b705eb..7b4ae218b90b 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1147,6 +1147,9 @@ struct net_device_ops {
 						   struct nlattr *port[]);
 	int			(*ndo_get_vf_port)(struct net_device *dev,
 						   int vf, struct sk_buff *skb);
+	int			(*ndo_set_vf_guid)(struct net_device *dev,
+						   int vf, u64 guid,
+						   int guid_type);
 	int			(*ndo_set_vf_rss_query_en)(
 						   struct net_device *dev,
 						   int vf, bool setting);
diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
index a30b78090594..1d01e8a4e5dd 100644
--- a/include/uapi/linux/if_link.h
+++ b/include/uapi/linux/if_link.h
@@ -556,6 +556,8 @@ enum {
 				 */
 	IFLA_VF_STATS,		/* network device statistics */
 	IFLA_VF_TRUST,		/* Trust VF */
+	IFLA_VF_IB_NODE_GUID,	/* VF Infiniband node GUID */
+	IFLA_VF_IB_PORT_GUID,	/* VF Infiniband port GUID */
 	__IFLA_VF_MAX,
 };
 
@@ -588,6 +590,11 @@ struct ifla_vf_spoofchk {
 	__u32 setting;
 };
 
+struct ifla_vf_guid {
+	__u32 vf;
+	__u64 guid;
+};
+
 enum {
 	IFLA_VF_LINK_STATE_AUTO,	/* link state of the uplink */
 	IFLA_VF_LINK_STATE_ENABLE,	/* link always up */
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index d735e854f916..9db6e5bde786 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -1387,6 +1387,8 @@ static const struct nla_policy ifla_vf_policy[IFLA_VF_MAX+1] = {
 	[IFLA_VF_RSS_QUERY_EN]	= { .len = sizeof(struct ifla_vf_rss_query_en) },
 	[IFLA_VF_STATS]		= { .type = NLA_NESTED },
 	[IFLA_VF_TRUST]		= { .len = sizeof(struct ifla_vf_trust) },
+	[IFLA_VF_IB_NODE_GUID]	= { .len = sizeof(struct ifla_vf_guid) },
+	[IFLA_VF_IB_PORT_GUID]	= { .len = sizeof(struct ifla_vf_guid) },
 };
 
 static const struct nla_policy ifla_vf_stats_policy[IFLA_VF_STATS_MAX + 1] = {
@@ -1534,6 +1536,58 @@ static int validate_linkmsg(struct net_device *dev, struct nlattr *tb[])
 	return 0;
 }
 
+static int handle_infiniband_guid(struct net_device *dev, struct ifla_vf_guid *ivt,
+				  int guid_type)
+{
+	const struct net_device_ops *ops = dev->netdev_ops;
+
+	return ops->ndo_set_vf_guid(dev, ivt->vf, ivt->guid, guid_type);
+}
+
+static int handle_vf_guid(struct net_device *dev, struct ifla_vf_guid *ivt, int guid_type)
+{
+	if (dev->type != ARPHRD_INFINIBAND)
+		return -EOPNOTSUPP;
+
+	return handle_infiniband_guid(dev, ivt, guid_type);
+}
+
+static int handle_vf_mac(struct net_device *dev, struct ifla_vf_mac *ivm)
+{
+	const struct net_device_ops *ops = dev->netdev_ops;
+	struct ifla_vf_guid ivt;
+	u8 *s = ivm->mac;
+	u8 d[8];
+	int err;
+
+	if (dev->type != ARPHRD_INFINIBAND) {
+		if (!ops->ndo_set_vf_mac)
+			return -EOPNOTSUPP;
+
+		return ops->ndo_set_vf_mac(dev, ivm->vf, ivm->mac);
+	}
+
+	if (!ops->ndo_set_vf_guid)
+		return -EOPNOTSUPP;
+
+	d[0] = s[0];
+	d[1] = s[1];
+	d[2] = s[2];
+	d[3] = 0xff;
+	d[4] = 0xfe;
+	d[5] = s[3];
+	d[6] = s[4];
+	d[7] = s[5];
+
+	ivt.vf = ivm->vf;
+	ivt.guid = be64_to_cpu(*(__be64 *)d);
+	err = handle_infiniband_guid(dev, &ivt, IFLA_VF_IB_NODE_GUID);
+	if (err)
+		return err;
+
+	return handle_infiniband_guid(dev, &ivt, IFLA_VF_IB_PORT_GUID);
+}
+
 static int do_setvfinfo(struct net_device *dev, struct nlattr **tb)
 {
 	const struct net_device_ops *ops = dev->netdev_ops;
@@ -1542,12 +1596,7 @@ static int do_setvfinfo(struct net_device *dev, struct nlattr **tb)
 	if (tb[IFLA_VF_MAC]) {
 		struct ifla_vf_mac *ivm = nla_data(tb[IFLA_VF_MAC]);
 
-		err = -EOPNOTSUPP;
-		if (ops->ndo_set_vf_mac)
-			err = ops->ndo_set_vf_mac(dev, ivm->vf,
-						  ivm->mac);
-		if (err < 0)
-			return err;
+		return handle_vf_mac(dev, ivm);
 	}
 
 	if (tb[IFLA_VF_VLAN]) {
@@ -1636,6 +1685,24 @@ static int do_setvfinfo(struct net_device *dev, struct nlattr **tb)
 			return err;
 	}
 
+	if (tb[IFLA_VF_IB_NODE_GUID]) {
+		struct ifla_vf_guid *ivt = nla_data(tb[IFLA_VF_IB_NODE_GUID]);
+
+		if (!ops->ndo_set_vf_guid)
+			return -EOPNOTSUPP;
+
+		return handle_vf_guid(dev, ivt, IFLA_VF_IB_NODE_GUID);
+	}
+
+	if (tb[IFLA_VF_IB_PORT_GUID]) {
+		struct ifla_vf_guid *ivt = nla_data(tb[IFLA_VF_IB_PORT_GUID]);
+
+		if (!ops->ndo_set_vf_guid)
+			return -EOPNOTSUPP;
+
+		return handle_vf_guid(dev, ivt, IFLA_VF_IB_PORT_GUID);
+	}
+
 	return err;
 }
 
-- 
Cc: netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
1.8.3.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH for-next 02/10] IB/mlx5: Fix decision on using MAD_IFC
       [not found] ` <1456851143-138332-1-git-send-email-eli-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2016-03-01 16:52   ` [PATCH for-next 01/10] net/core: Add support for configuring VF GUIDs Eli Cohen
@ 2016-03-01 16:52   ` Eli Cohen
  2016-03-01 16:52   ` [PATCH for-next 03/10] IB/core: Support accessing SA in virtualized environment Eli Cohen
                     ` (7 subsequent siblings)
  9 siblings, 0 replies; 34+ messages in thread
From: Eli Cohen @ 2016-03-01 16:52 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, liranl-VPRAkNaXOzVWk0Htik3J/w,
	Eli Cohen

Fix the condition that dictates when MAD_IFC should be used. According
to firmware specifications, MAD_IFC commands must be used only if the
ib_virt capability is off.

Signed-off-by: Eli Cohen <eli-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/hw/mlx5/main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index 03c418ccbc98..399049b70a72 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -283,7 +283,7 @@ __be16 mlx5_get_roce_udp_sport(struct mlx5_ib_dev *dev, u8 port_num,
 
 static int mlx5_use_mad_ifc(struct mlx5_ib_dev *dev)
 {
-	return !dev->mdev->issi;
+	return !MLX5_CAP_GEN(dev->mdev, ib_virt);
 }
 
 enum {
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH for-next 03/10] IB/core: Support accessing SA in virtualized environment
       [not found] ` <1456851143-138332-1-git-send-email-eli-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2016-03-01 16:52   ` [PATCH for-next 01/10] net/core: Add support for configuring VF GUIDs Eli Cohen
  2016-03-01 16:52   ` [PATCH for-next 02/10] IB/mlx5: Fix decision on using MAD_IFC Eli Cohen
@ 2016-03-01 16:52   ` Eli Cohen
       [not found]     ` <1456851143-138332-4-git-send-email-eli-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2016-03-01 16:52   ` [PATCH for-next 04/10] IB/core: Add interfaces to control VF attributes Eli Cohen
                     ` (6 subsequent siblings)
  9 siblings, 1 reply; 34+ messages in thread
From: Eli Cohen @ 2016-03-01 16:52 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, liranl-VPRAkNaXOzVWk0Htik3J/w,
	Eli Cohen

Per the ongoing standardisation process, when virtual HCAs are present
in a network, traffic is routed based on a destination GID. In order to
access the SA we use the well known SA GID.

We also add a GRH required boolean field to the port attributes which is
used to report to the verbs consumer whether this port is connected to a
virtual network. We use this field to determine whether we need to create
an address vector with GRH to access the subnet administrator. We clear
the port attributes struct before calling the hardware driver to make
sure the default remains that GRH is not required.
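
As an illustration of the well-known SA GID mentioned above, the following
userspace sketch builds the 16-byte GID from the prefix and GUID values that
this patch adds to ib_verbs.h. The helper names (put_be64,
sa_well_known_gid) and the macro names are hypothetical; only the two
constant values come from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values taken from the enum this patch adds to include/rdma/ib_verbs.h. */
#define SA_WELL_KNOWN_GID_PREFIX 0xfe80000000000000ull
#define SA_WELL_KNOWN_GUID       2ull

/* Store a 64-bit value in big-endian (network) byte order. */
static void put_be64(uint8_t *dst, uint64_t v)
{
	int i;

	for (i = 7; i >= 0; i--) {
		dst[i] = (uint8_t)(v & 0xff);
		v >>= 8;
	}
}

/* Hypothetical helper: fill a raw 16-byte GID with the SA's address. */
static void sa_well_known_gid(uint8_t raw[16])
{
	assert(raw != NULL);

	put_be64(raw, SA_WELL_KNOWN_GID_PREFIX);	/* subnet prefix */
	put_be64(raw + 8, SA_WELL_KNOWN_GUID);		/* interface id */
}
```

The resulting bytes read fe80::2 when written in IPv6-style notation.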

Signed-off-by: Eli Cohen <eli-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/core/device.c   | 1 +
 drivers/infiniband/core/sa_query.c | 5 +++++
 include/rdma/ib_verbs.h            | 6 ++++++
 3 files changed, 12 insertions(+)

diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
index 00da80e02154..24926acd4bcd 100644
--- a/drivers/infiniband/core/device.c
+++ b/drivers/infiniband/core/device.c
@@ -652,6 +652,7 @@ int ib_query_port(struct ib_device *device,
 	if (port_num < rdma_start_port(device) || port_num > rdma_end_port(device))
 		return -EINVAL;
 
+	memset(port_attr, 0, sizeof(*port_attr));
 	return device->query_port(device, port_num, port_attr);
 }
 EXPORT_SYMBOL(ib_query_port);
diff --git a/drivers/infiniband/core/sa_query.c b/drivers/infiniband/core/sa_query.c
index f334090bb612..833d2a99a311 100644
--- a/drivers/infiniband/core/sa_query.c
+++ b/drivers/infiniband/core/sa_query.c
@@ -886,6 +886,11 @@ static void update_sm_ah(struct work_struct *work)
 	ah_attr.dlid     = port_attr.sm_lid;
 	ah_attr.sl       = port_attr.sm_sl;
 	ah_attr.port_num = port->port_num;
+	if (port_attr.grh_required) {
+		ah_attr.ah_flags = IB_AH_GRH;
+		ah_attr.grh.dgid.global.subnet_prefix = cpu_to_be64(IB_SA_WELL_KNOWN_GID_PREFIX);
+		ah_attr.grh.dgid.global.interface_id = cpu_to_be64(IB_SA_WELL_KNOWN_GUID);
+	}
 
 	new_ah->ah = ib_create_ah(port->agent->qp->pd, &ah_attr);
 	if (IS_ERR(new_ah->ah)) {
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 284b00c8fea4..5c1f11742c65 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -97,6 +97,11 @@ enum rdma_node_type {
 	RDMA_NODE_USNIC_UDP,
 };
 
+enum {
+	IB_SA_WELL_KNOWN_GID_PREFIX	= 0xfe80000000000000ull,
+	IB_SA_WELL_KNOWN_GUID		= 2,
+};
+
 enum rdma_transport_type {
 	RDMA_TRANSPORT_IB,
 	RDMA_TRANSPORT_IWARP,
@@ -508,6 +513,7 @@ struct ib_port_attr {
 	u8			active_width;
 	u8			active_speed;
 	u8                      phys_state;
+	bool			grh_required;
 };
 
 enum ib_device_modify_flags {
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH for-next 04/10] IB/core: Add interfaces to control VF attributes
       [not found] ` <1456851143-138332-1-git-send-email-eli-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
                     ` (2 preceding siblings ...)
  2016-03-01 16:52   ` [PATCH for-next 03/10] IB/core: Support accessing SA in virtualized environment Eli Cohen
@ 2016-03-01 16:52   ` Eli Cohen
  2016-03-01 16:52   ` [PATCH for-next 05/10] IB/ipoib: Add ndo operations for configuring VFs Eli Cohen
                     ` (5 subsequent siblings)
  9 siblings, 0 replies; 34+ messages in thread
From: Eli Cohen @ 2016-03-01 16:52 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, liranl-VPRAkNaXOzVWk0Htik3J/w,
	Eli Cohen

Following the practice exercised for network devices which allow the PF
net device to configure attributes of its virtual functions, we
introduce the following functions to be used by IPoIB which is the
network driver implementation for IB devices.

ib_set_vf_link_state - set the policy for a VF link. More below.
ib_get_vf_config - read configuration information of a VF
ib_get_vf_stats - read VF statistics
ib_set_vf_guid - set the node or port GUID of a VF

Also add a device cap flag indicating that this IB device is based on a
virtual function.

A VF shares the physical port with the PF and other VFs. When setting
the link state we have three options:

1. Auto - in this mode, the virtual port follows the state of the
   physical port and becomes active only if the physical port's state is
   active. In all other cases it remains in a Down state.
2. Down - sets the state of the virtual port to Down
3. Up - causes the virtual port to transition into Initialize state if
   it was not already in this state. A virtualization aware subnet manager
   can then bring the state of the port into the Active state.
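
The three policies above can be modelled roughly as below. This is a
simplified sketch of one literal reading of the description, not driver or
firmware code: the enum and function names are hypothetical, and the later
transition to Active by a virtualization aware subnet manager is outside its
scope.

```c
#include <assert.h>

/* Hypothetical names, used only for this illustration. */
enum vf_policy { VF_POLICY_AUTO, VF_POLICY_DOWN, VF_POLICY_UP };
enum port_state { PORT_DOWN, PORT_INIT, PORT_ACTIVE };

/*
 * Return the state the virtual port takes right after a policy is
 * applied, given the current state of the shared physical port.
 */
static enum port_state vport_state_after_policy(enum vf_policy policy,
						enum port_state phys)
{
	assert(policy == VF_POLICY_AUTO || policy == VF_POLICY_DOWN ||
	       policy == VF_POLICY_UP);

	switch (policy) {
	case VF_POLICY_AUTO:
		/* Follow the physical port: Active only if it is Active. */
		return phys == PORT_ACTIVE ? PORT_ACTIVE : PORT_DOWN;
	case VF_POLICY_UP:
		/* Move to Initialize; the SM may later bring it to Active. */
		return PORT_INIT;
	case VF_POLICY_DOWN:
	default:
		return PORT_DOWN;
	}
}
```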

Signed-off-by: Eli Cohen <eli-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/core/verbs.c | 40 ++++++++++++++++++++++++++++++++++++++++
 include/rdma/ib_verbs.h         | 19 +++++++++++++++++++
 2 files changed, 59 insertions(+)

diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index 5af6d024e053..d8e70ff68456 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -1551,6 +1551,46 @@ int ib_check_mr_status(struct ib_mr *mr, u32 check_mask,
 }
 EXPORT_SYMBOL(ib_check_mr_status);
 
+int ib_set_vf_link_state(struct ib_device *device, int vf, u8 port,
+			 int state)
+{
+	if (!device->set_vf_link_state)
+		return -ENOSYS;
+
+	return device->set_vf_link_state(device, vf, port, state);
+}
+EXPORT_SYMBOL(ib_set_vf_link_state);
+
+int ib_get_vf_config(struct ib_device *device, int vf, u8 port,
+		     struct ifla_vf_info *info)
+{
+	if (!device->get_vf_config)
+		return -ENOSYS;
+
+	return device->get_vf_config(device, vf, port, info);
+}
+EXPORT_SYMBOL(ib_get_vf_config);
+
+int ib_get_vf_stats(struct ib_device *device, int vf, u8 port,
+		    struct ifla_vf_stats *stats)
+{
+	if (!device->get_vf_stats)
+		return -ENOSYS;
+
+	return device->get_vf_stats(device, vf, port, stats);
+}
+EXPORT_SYMBOL(ib_get_vf_stats);
+
+int ib_set_vf_guid(struct ib_device *device, int vf, u8 port, u64 guid,
+		   int type)
+{
+	if (!device->set_vf_guid)
+		return -ENOSYS;
+
+	return device->set_vf_guid(device, vf, port, guid, type);
+}
+EXPORT_SYMBOL(ib_set_vf_guid);
+
 /**
  * ib_map_mr_sg() - Map the largest prefix of a dma mapped SG list
  *     and set it the memory region.
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 5c1f11742c65..d6a4c3d879fb 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -56,6 +56,7 @@
 #include <linux/string.h>
 #include <linux/slab.h>
 
+#include <linux/if_link.h>
 #include <linux/atomic.h>
 #include <linux/mmu_notifier.h>
 #include <asm/uaccess.h>
@@ -217,6 +218,7 @@ enum ib_device_cap_flags {
 	IB_DEVICE_MANAGED_FLOW_STEERING		= (1 << 29),
 	IB_DEVICE_SIGNATURE_HANDOVER		= (1 << 30),
 	IB_DEVICE_ON_DEMAND_PAGING		= (1 << 31),
+	IB_DEVICE_VIRTUAL_FUNCTION		= ((u64)1 << 32),
 };
 
 enum ib_signature_prot_cap {
@@ -1852,6 +1854,14 @@ struct ib_device {
 	int			   (*check_mr_status)(struct ib_mr *mr, u32 check_mask,
 						      struct ib_mr_status *mr_status);
 	void			   (*disassociate_ucontext)(struct ib_ucontext *ibcontext);
+	int			   (*set_vf_link_state)(struct ib_device *device, int vf, u8 port,
+							int state);
+	int			   (*get_vf_config)(struct ib_device *device, int vf, u8 port,
+						    struct ifla_vf_info *ivf);
+	int			   (*get_vf_stats)(struct ib_device *device, int vf, u8 port,
+						   struct ifla_vf_stats *stats);
+	int			   (*set_vf_guid)(struct ib_device *device, int vf, u8 port, u64 guid,
+						  int type);
 
 	struct ib_dma_mapping_ops   *dma_ops;
 
@@ -2295,6 +2305,15 @@ int ib_query_gid(struct ib_device *device,
 		 u8 port_num, int index, union ib_gid *gid,
 		 struct ib_gid_attr *attr);
 
+int ib_set_vf_link_state(struct ib_device *device, int vf, u8 port,
+			 int state);
+int ib_get_vf_config(struct ib_device *device, int vf, u8 port,
+		     struct ifla_vf_info *info);
+int ib_get_vf_stats(struct ib_device *device, int vf, u8 port,
+		    struct ifla_vf_stats *stats);
+int ib_set_vf_guid(struct ib_device *device, int vf, u8 port, u64 guid,
+		   int type);
+
 int ib_query_pkey(struct ib_device *device,
 		  u8 port_num, u16 index, u16 *pkey);
 
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH for-next 05/10] IB/ipoib: Add ndo operations for configuring VFs
       [not found] ` <1456851143-138332-1-git-send-email-eli-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
                     ` (3 preceding siblings ...)
  2016-03-01 16:52   ` [PATCH for-next 04/10] IB/core: Add interfaces to control VF attributes Eli Cohen
@ 2016-03-01 16:52   ` Eli Cohen
  2016-03-01 16:52   ` [PATCH for-next 06/10] net/mlx5_core: Add VF param when querying vport counter Eli Cohen
                     ` (4 subsequent siblings)
  9 siblings, 0 replies; 34+ messages in thread
From: Eli Cohen @ 2016-03-01 16:52 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, liranl-VPRAkNaXOzVWk0Htik3J/w,
	Eli Cohen

Add the following ndo operations to the IPoIB network driver for
configuring VFs:

ipoib_set_vf_link_state - configure the VF link policy
ipoib_get_vf_config - get link state configuration
ipoib_set_vf_guid - set a VF port or node GUID
ipoib_get_vf_stats - get statistics of a VF

Signed-off-by: Eli Cohen <eli-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/ulp/ipoib/ipoib_main.c | 65 ++++++++++++++++++++++++++++++-
 1 file changed, 63 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c
index 25509bbd4a05..80807d6e5c4c 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
@@ -51,6 +51,7 @@
 #include <net/addrconf.h>
 #include <linux/inetdevice.h>
 #include <rdma/ib_cache.h>
+#include <linux/pci.h>
 
 #define DRV_VERSION "1.0.0"
 
@@ -1590,11 +1591,67 @@ void ipoib_dev_cleanup(struct net_device *dev)
 	priv->tx_ring = NULL;
 }
 
+static int ipoib_set_vf_link_state(struct net_device *dev, int vf, int link_state)
+{
+	struct ipoib_dev_priv *priv = netdev_priv(dev);
+
+	return ib_set_vf_link_state(priv->ca, vf, priv->port, link_state);
+}
+
+static int ipoib_get_vf_config(struct net_device *dev, int vf,
+			       struct ifla_vf_info *ivf)
+{
+	struct ipoib_dev_priv *priv = netdev_priv(dev);
+	int err;
+
+	err = ib_get_vf_config(priv->ca, vf, priv->port, ivf);
+	if (err)
+		return err;
+
+	ivf->vf = vf;
+
+	return 0;
+}
+
+static int ipoib_set_vf_guid(struct net_device *dev, int vf, u64 guid, int type)
+{
+	struct ipoib_dev_priv *priv = netdev_priv(dev);
+
+	if (type != IFLA_VF_IB_NODE_GUID && type != IFLA_VF_IB_PORT_GUID)
+		return -EINVAL;
+
+	return ib_set_vf_guid(priv->ca, vf, priv->port, guid, type);
+}
+
+static int ipoib_get_vf_stats(struct net_device *dev, int vf,
+			      struct ifla_vf_stats *vf_stats)
+{
+	struct ipoib_dev_priv *priv = netdev_priv(dev);
+
+	return ib_get_vf_stats(priv->ca, vf, priv->port, vf_stats);
+}
+
 static const struct header_ops ipoib_header_ops = {
 	.create	= ipoib_hard_header,
 };
 
-static const struct net_device_ops ipoib_netdev_ops = {
+static const struct net_device_ops ipoib_netdev_ops_pf = {
+	.ndo_uninit		 = ipoib_uninit,
+	.ndo_open		 = ipoib_open,
+	.ndo_stop		 = ipoib_stop,
+	.ndo_change_mtu		 = ipoib_change_mtu,
+	.ndo_fix_features	 = ipoib_fix_features,
+	.ndo_start_xmit		 = ipoib_start_xmit,
+	.ndo_tx_timeout		 = ipoib_timeout,
+	.ndo_set_rx_mode	 = ipoib_set_mcast_list,
+	.ndo_get_iflink		 = ipoib_get_iflink,
+	.ndo_set_vf_link_state	 = ipoib_set_vf_link_state,
+	.ndo_get_vf_config	 = ipoib_get_vf_config,
+	.ndo_get_vf_stats	 = ipoib_get_vf_stats,
+	.ndo_set_vf_guid	 = ipoib_set_vf_guid,
+};
+
+static const struct net_device_ops ipoib_netdev_ops_vf = {
 	.ndo_uninit		 = ipoib_uninit,
 	.ndo_open		 = ipoib_open,
 	.ndo_stop		 = ipoib_stop,
@@ -1610,7 +1667,11 @@ void ipoib_setup(struct net_device *dev)
 {
 	struct ipoib_dev_priv *priv = netdev_priv(dev);
 
-	dev->netdev_ops		 = &ipoib_netdev_ops;
+	if (priv->hca_caps & IB_DEVICE_VIRTUAL_FUNCTION)
+		dev->netdev_ops	= &ipoib_netdev_ops_vf;
+	else
+		dev->netdev_ops	= &ipoib_netdev_ops_pf;
+
 	dev->header_ops		 = &ipoib_header_ops;
 
 	ipoib_set_ethtool_ops(dev);
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH for-next 06/10] net/mlx5_core: Add VF param when querying vport counter
       [not found] ` <1456851143-138332-1-git-send-email-eli-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
                     ` (4 preceding siblings ...)
  2016-03-01 16:52   ` [PATCH for-next 05/10] IB/ipoib: Add ndo operations for configuring VFs Eli Cohen
@ 2016-03-01 16:52   ` Eli Cohen
  2016-03-01 16:52   ` [PATCH for-next 07/10] net/mlx5_core: Implement modify HCA vport command Eli Cohen
                     ` (3 subsequent siblings)
  9 siblings, 0 replies; 34+ messages in thread
From: Eli Cohen @ 2016-03-01 16:52 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, liranl-VPRAkNaXOzVWk0Htik3J/w,
	Eli Cohen

Add a vf parameter to mlx5_core_query_vport_counter so we can call it to
query counters of virtual functions. Also update current users of the
API.

PFs may call mlx5_core_query_vport_counter with other_vport set to
indicate that they are querying a virtual function. The virtual
function to be queried is given by the vf parameter. Virtual function
numbering is zero based so the first VF is 0 and so on. When a PF
queries its own function, the other_vport parameter is cleared.
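
The numbering convention can be sketched as a small mapping. The helper
name vf_to_vport_number is hypothetical; the vf + 1 offset is the one used
by this patch when other_vport is set, while returning 0 for the caller's
own function is an assumption of the sketch.

```c
#include <assert.h>

/*
 * Hypothetical helper: map a zero-based VF index to the vport number
 * placed in the query command. VF 0 becomes vport 1, VF 1 becomes
 * vport 2, and so on. When other_vport is cleared the function is
 * querying itself, and the sketch assumes vport number 0.
 */
static int vf_to_vport_number(int other_vport, int vf)
{
	if (!other_vport)
		return 0;

	assert(vf >= 0);
	return vf + 1;
}
```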

Signed-off-by: Eli Cohen <eli-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/hw/mlx5/mad.c                | 2 +-
 drivers/net/ethernet/mellanox/mlx5/core/vport.c | 5 +++--
 include/linux/mlx5/vport.h                      | 3 ++-
 3 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/mad.c b/drivers/infiniband/hw/mlx5/mad.c
index 41d8a0036465..1534af113058 100644
--- a/drivers/infiniband/hw/mlx5/mad.c
+++ b/drivers/infiniband/hw/mlx5/mad.c
@@ -208,7 +208,7 @@ static int process_pma_cmd(struct ib_device *ibdev, u8 port_num,
 		if (!out_cnt)
 			return IB_MAD_RESULT_FAILURE;
 
-		err = mlx5_core_query_vport_counter(dev->mdev, 0,
+		err = mlx5_core_query_vport_counter(dev->mdev, 0, 0,
 						    port_num, out_cnt, sz);
 		if (!err)
 			pma_cnt_ext_assign(pma_cnt_ext, out_cnt);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/vport.c b/drivers/net/ethernet/mellanox/mlx5/core/vport.c
index 90ab09e375b8..2b836d0b4738 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/vport.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/vport.c
@@ -852,7 +852,8 @@ int mlx5_nic_vport_disable_roce(struct mlx5_core_dev *mdev)
 EXPORT_SYMBOL_GPL(mlx5_nic_vport_disable_roce);
 
 int mlx5_core_query_vport_counter(struct mlx5_core_dev *dev, u8 other_vport,
-				  u8 port_num, void *out, size_t out_sz)
+				  int vf, u8 port_num, void *out,
+				  size_t out_sz)
 {
 	int	in_sz = MLX5_ST_SZ_BYTES(query_vport_counter_in);
 	int	is_group_manager;
@@ -871,7 +872,7 @@ int mlx5_core_query_vport_counter(struct mlx5_core_dev *dev, u8 other_vport,
 	if (other_vport) {
 		if (is_group_manager) {
 			MLX5_SET(query_vport_counter_in, in, other_vport, 1);
-			MLX5_SET(query_vport_counter_in, in, vport_number, 0);
+			MLX5_SET(query_vport_counter_in, in, vport_number, vf + 1);
 		} else {
 			err = -EPERM;
 			goto free;
diff --git a/include/linux/mlx5/vport.h b/include/linux/mlx5/vport.h
index a9f2bcc98cab..aafb3e48b5f8 100644
--- a/include/linux/mlx5/vport.h
+++ b/include/linux/mlx5/vport.h
@@ -93,6 +93,7 @@ int mlx5_modify_nic_vport_vlans(struct mlx5_core_dev *dev,
 int mlx5_nic_vport_enable_roce(struct mlx5_core_dev *mdev);
 int mlx5_nic_vport_disable_roce(struct mlx5_core_dev *mdev);
 int mlx5_core_query_vport_counter(struct mlx5_core_dev *dev, u8 other_vport,
-				  u8 port_num, void *out, size_t out_sz);
+				  int vf, u8 port_num, void *out,
+				  size_t out_sz);
 
 #endif /* __MLX5_VPORT_H__ */
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH for-next 07/10] net/mlx5_core: Implement modify HCA vport command
       [not found] ` <1456851143-138332-1-git-send-email-eli-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
                     ` (5 preceding siblings ...)
  2016-03-01 16:52   ` [PATCH for-next 06/10] net/mlx5_core: Add VF param when querying vport counter Eli Cohen
@ 2016-03-01 16:52   ` Eli Cohen
  2016-03-01 16:52   ` [PATCH for-next 08/10] IB/mlx5: Implement callbacks for manipulating VFs Eli Cohen
                     ` (2 subsequent siblings)
  9 siblings, 0 replies; 34+ messages in thread
From: Eli Cohen @ 2016-03-01 16:52 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, liranl-VPRAkNaXOzVWk0Htik3J/w,
	Eli Cohen

Implement the modify HCA vport context command, used to modify the
parameters of a virtual HCA's ports.

Signed-off-by: Eli Cohen <eli-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/net/ethernet/mellanox/mlx5/core/cmd.c   |  6 +++
 drivers/net/ethernet/mellanox/mlx5/core/vport.c | 67 +++++++++++++++++++++++++
 include/linux/mlx5/vport.h                      |  4 ++
 3 files changed, 77 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
index 037fc4cdf5af..ebb4036b98e5 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
@@ -407,6 +407,12 @@ static int mlx5_internal_err_ret_value(struct mlx5_core_dev *dev, u16 op,
 const char *mlx5_command_str(int command)
 {
 	switch (command) {
+	case MLX5_CMD_OP_QUERY_HCA_VPORT_CONTEXT:
+		return "QUERY_HCA_VPORT_CONTEXT";
+
+	case MLX5_CMD_OP_MODIFY_HCA_VPORT_CONTEXT:
+		return "MODIFY_HCA_VPORT_CONTEXT";
+
 	case MLX5_CMD_OP_QUERY_HCA_CAP:
 		return "QUERY_HCA_CAP";
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/vport.c b/drivers/net/ethernet/mellanox/mlx5/core/vport.c
index 2b836d0b4738..bd518405859e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/vport.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/vport.c
@@ -891,3 +891,70 @@ free:
 	return err;
 }
 EXPORT_SYMBOL_GPL(mlx5_core_query_vport_counter);
+
+int mlx5_core_modify_hca_vport_context(struct mlx5_core_dev *dev,
+				       u8 other_vport, u8 port_num,
+				       int vf,
+				       struct mlx5_hca_vport_context *req)
+{
+	int in_sz = MLX5_ST_SZ_BYTES(modify_hca_vport_context_in);
+	u8 out[MLX5_ST_SZ_BYTES(modify_hca_vport_context_out)];
+	int is_group_manager;
+	void *in;
+	int err;
+	void *ctx;
+
+	mlx5_core_dbg(dev, "vf %d\n", vf);
+	is_group_manager = MLX5_CAP_GEN(dev, vport_group_manager);
+	in = kzalloc(in_sz, GFP_KERNEL);
+	if (!in)
+		return -ENOMEM;
+
+	memset(out, 0, sizeof(out));
+	MLX5_SET(modify_hca_vport_context_in, in, opcode, MLX5_CMD_OP_MODIFY_HCA_VPORT_CONTEXT);
+	if (other_vport) {
+		if (is_group_manager) {
+			MLX5_SET(modify_hca_vport_context_in, in, other_vport, 1);
+			MLX5_SET(modify_hca_vport_context_in, in, vport_number, vf);
+		} else {
+			err = -EPERM;
+			goto ex;
+		}
+	}
+
+	if (MLX5_CAP_GEN(dev, num_ports) > 1)
+		MLX5_SET(modify_hca_vport_context_in, in, port_num, port_num);
+
+	ctx = MLX5_ADDR_OF(modify_hca_vport_context_in, in, hca_vport_context);
+	MLX5_SET(hca_vport_context, ctx, field_select, req->field_select);
+	MLX5_SET(hca_vport_context, ctx, sm_virt_aware, req->sm_virt_aware);
+	MLX5_SET(hca_vport_context, ctx, has_smi, req->has_smi);
+	MLX5_SET(hca_vport_context, ctx, has_raw, req->has_raw);
+	MLX5_SET(hca_vport_context, ctx, vport_state_policy, req->policy);
+	MLX5_SET(hca_vport_context, ctx, port_physical_state, req->phys_state);
+	MLX5_SET(hca_vport_context, ctx, vport_state, req->vport_state);
+	MLX5_SET64(hca_vport_context, ctx, port_guid, req->port_guid);
+	MLX5_SET64(hca_vport_context, ctx, node_guid, req->node_guid);
+	MLX5_SET(hca_vport_context, ctx, cap_mask1, req->cap_mask1);
+	MLX5_SET(hca_vport_context, ctx, cap_mask1_field_select, req->cap_mask1_perm);
+	MLX5_SET(hca_vport_context, ctx, cap_mask2, req->cap_mask2);
+	MLX5_SET(hca_vport_context, ctx, cap_mask2_field_select, req->cap_mask2_perm);
+	MLX5_SET(hca_vport_context, ctx, lid, req->lid);
+	MLX5_SET(hca_vport_context, ctx, init_type_reply, req->init_type_reply);
+	MLX5_SET(hca_vport_context, ctx, lmc, req->lmc);
+	MLX5_SET(hca_vport_context, ctx, subnet_timeout, req->subnet_timeout);
+	MLX5_SET(hca_vport_context, ctx, sm_lid, req->sm_lid);
+	MLX5_SET(hca_vport_context, ctx, sm_sl, req->sm_sl);
+	MLX5_SET(hca_vport_context, ctx, qkey_violation_counter, req->qkey_violation_counter);
+	MLX5_SET(hca_vport_context, ctx, pkey_violation_counter, req->pkey_violation_counter);
+	err = mlx5_cmd_exec(dev, in, in_sz, out, sizeof(out));
+	if (err)
+		goto ex;
+
+	err = mlx5_cmd_status_to_err_v2(out);
+
+ex:
+	kfree(in);
+	return err;
+}
+EXPORT_SYMBOL_GPL(mlx5_core_modify_hca_vport_context);
diff --git a/include/linux/mlx5/vport.h b/include/linux/mlx5/vport.h
index aafb3e48b5f8..bd93e6323603 100644
--- a/include/linux/mlx5/vport.h
+++ b/include/linux/mlx5/vport.h
@@ -95,5 +95,9 @@ int mlx5_nic_vport_disable_roce(struct mlx5_core_dev *mdev);
 int mlx5_core_query_vport_counter(struct mlx5_core_dev *dev, u8 other_vport,
 				  int vf, u8 port_num, void *out,
 				  size_t out_sz);
+int mlx5_core_modify_hca_vport_context(struct mlx5_core_dev *dev,
+				       u8 other_vport, u8 port_num,
+				       int vf,
+				       struct mlx5_hca_vport_context *req);
 
 #endif /* __MLX5_VPORT_H__ */
-- 
1.8.3.1

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH for-next 08/10] IB/mlx5: Implement callbacks for manipulating VFs
       [not found] ` <1456851143-138332-1-git-send-email-eli-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
                     ` (6 preceding siblings ...)
  2016-03-01 16:52   ` [PATCH for-next 07/10] net/mlx5_core: Implement modify HCA vport command Eli Cohen
@ 2016-03-01 16:52   ` Eli Cohen
       [not found]     ` <1456851143-138332-9-git-send-email-eli-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2016-03-01 16:52   ` [PATCH for-next 09/10] IB/ipoib: Allow mcast packets from other VFs Eli Cohen
  2016-03-01 16:52   ` [PATCH for-next 10/10] IB/core: Use GRH when the path hop-limit > 0 Eli Cohen
  9 siblings, 1 reply; 34+ messages in thread
From: Eli Cohen @ 2016-03-01 16:52 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, liranl-VPRAkNaXOzVWk0Htik3J/w,
	Eli Cohen

Implement the IB defined callbacks used to manipulate the policy for the
link state, set GUIDs or get statistics information. This functionality
is added into a new file that will be used to add any SRIOV related
functionality to the mlx5 IB layer.

The following callbacks have been added:

mlx5_ib_get_vf_config
mlx5_ib_set_vf_link_state
mlx5_ib_get_vf_stats
mlx5_ib_set_vf_guid

In addition, publish whether this device is based on a virtual function.

In mlx5 supported devices, virtual functions are implemented as vHCAs.
vHCAs have their own QP number space so it is possible that two vHCAs
will use a QP with the same number at the same time.

Signed-off-by: Eli Cohen <eli-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
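To illustrate the link state translation done in ib_virt.c, here is a
minimal userspace sketch of the two mappings. The VF_LINK_* names are
hypothetical local stand-ins for the IFLA_VF_LINK_STATE_* constants in
linux/if_link.h, and the POLICY_* macros mirror the values this patch adds
to enum port_state_policy. Treat it as a model of the logic, not as the
driver code.

```c
#include <stdint.h>

/* Local stand-ins for IFLA_VF_LINK_STATE_* (same order as the uapi enum). */
enum vf_link_state {
	VF_LINK_AUTO,
	VF_LINK_ENABLE,
	VF_LINK_DISABLE,
	VF_LINK_MAX,		/* doubles as the "unknown policy" marker */
};

/* Values mirror the enum port_state_policy entries added by this patch. */
#define POLICY_DOWN	0u
#define POLICY_UP	1u
#define POLICY_FOLLOW	2u
#define POLICY_INVALID	0xffffffffu

/* rtnetlink link state -> device policy; unknown states are rejected. */
static uint32_t net_to_policy(int state)
{
	switch (state) {
	case VF_LINK_DISABLE:
		return POLICY_DOWN;
	case VF_LINK_ENABLE:
		return POLICY_UP;
	case VF_LINK_AUTO:
		return POLICY_FOLLOW;
	default:
		return POLICY_INVALID;
	}
}

/* Device policy -> rtnetlink link state; unknown policies map to MAX. */
static int policy_to_net(uint32_t policy)
{
	switch (policy) {
	case POLICY_DOWN:
		return VF_LINK_DISABLE;
	case POLICY_UP:
		return VF_LINK_ENABLE;
	case POLICY_FOLLOW:
		return VF_LINK_AUTO;
	default:
		return VF_LINK_MAX;
	}
}
```

The two functions are inverses on the three valid states, and each one
returns its own sentinel for anything else, which the callers turn into
-EINVAL.
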
 drivers/infiniband/hw/mlx5/Makefile  |   2 +-
 drivers/infiniband/hw/mlx5/ib_virt.c | 194 +++++++++++++++++++++++++++++++++++
 drivers/infiniband/hw/mlx5/main.c    |  10 ++
 drivers/infiniband/hw/mlx5/mlx5_ib.h |   8 ++
 include/linux/mlx5/driver.h          |   5 +-
 include/linux/mlx5/mlx5_ifc.h        |   6 ++
 6 files changed, 223 insertions(+), 2 deletions(-)
 create mode 100644 drivers/infiniband/hw/mlx5/ib_virt.c

diff --git a/drivers/infiniband/hw/mlx5/Makefile b/drivers/infiniband/hw/mlx5/Makefile
index 27a70159e2ea..82e074f47cf2 100644
--- a/drivers/infiniband/hw/mlx5/Makefile
+++ b/drivers/infiniband/hw/mlx5/Makefile
@@ -1,4 +1,4 @@
 obj-$(CONFIG_MLX5_INFINIBAND)	+= mlx5_ib.o
 
-mlx5_ib-y :=	main.o cq.o doorbell.o qp.o mem.o srq.o mr.o ah.o mad.o
+mlx5_ib-y :=	main.o cq.o doorbell.o qp.o mem.o srq.o mr.o ah.o mad.o ib_virt.o
 mlx5_ib-$(CONFIG_INFINIBAND_ON_DEMAND_PAGING) += odp.o
diff --git a/drivers/infiniband/hw/mlx5/ib_virt.c b/drivers/infiniband/hw/mlx5/ib_virt.c
new file mode 100644
index 000000000000..c1b9de800fe5
--- /dev/null
+++ b/drivers/infiniband/hw/mlx5/ib_virt.c
@@ -0,0 +1,194 @@
+/*
+ * Copyright (c) 2016, Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ *     Redistribution and use in source and binary forms, with or
+ *     without modification, are permitted provided that the following
+ *     conditions are met:
+ *
+ *      - Redistributions of source code must retain the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer.
+ *
+ *      - Redistributions in binary form must reproduce the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer in the documentation and/or other materials
+ *        provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include <linux/module.h>
+#include <linux/mlx5/vport.h>
+#include "mlx5_ib.h"
+
+static inline u32 mlx_to_net_policy(enum port_state_policy mlx_policy)
+{
+	switch (mlx_policy) {
+	case MLX5_POLICY_DOWN:
+		return IFLA_VF_LINK_STATE_DISABLE;
+	case MLX5_POLICY_UP:
+		return IFLA_VF_LINK_STATE_ENABLE;
+	case MLX5_POLICY_FOLLOW:
+		return IFLA_VF_LINK_STATE_AUTO;
+	default:
+		return __IFLA_VF_LINK_STATE_MAX;
+	}
+}
+
+int mlx5_ib_get_vf_config(struct ib_device *device, int vf, u8 port,
+			  struct ifla_vf_info *info)
+{
+	struct mlx5_ib_dev *dev = to_mdev(device);
+	struct mlx5_core_dev *mdev = dev->mdev;
+	struct mlx5_hca_vport_context *rep;
+	int err;
+
+	rep = kzalloc(sizeof(*rep), GFP_KERNEL);
+	if (!rep)
+		return -ENOMEM;
+
+	err = mlx5_query_hca_vport_context(mdev, 1, 1, vf + 1, rep);
+	if (err) {
+		mlx5_ib_warn(dev, "failed to query port policy for vf %d (%d)\n",
+			     vf, err);
+		goto free;
+	}
+	memset(info, 0, sizeof(*info));
+	info->linkstate = mlx_to_net_policy(rep->policy);
+	if (info->linkstate == __IFLA_VF_LINK_STATE_MAX)
+		err = -EINVAL;
+
+free:
+	kfree(rep);
+	return err;
+}
+
+static inline enum port_state_policy net_to_mlx_policy(int policy)
+{
+	switch (policy) {
+	case IFLA_VF_LINK_STATE_DISABLE:
+		return MLX5_POLICY_DOWN;
+	case IFLA_VF_LINK_STATE_ENABLE:
+		return MLX5_POLICY_UP;
+	case IFLA_VF_LINK_STATE_AUTO:
+		return MLX5_POLICY_FOLLOW;
+	default:
+		return MLX5_POLICY_INVALID;
+	}
+}
+
+int mlx5_ib_set_vf_link_state(struct ib_device *device, int vf,
+			      u8 port, int state)
+{
+	struct mlx5_ib_dev *dev = to_mdev(device);
+	struct mlx5_core_dev *mdev = dev->mdev;
+	struct mlx5_hca_vport_context *in;
+	int err;
+
+	in = kzalloc(sizeof(*in), GFP_KERNEL);
+	if (!in)
+		return -ENOMEM;
+
+	in->policy = net_to_mlx_policy(state);
+	if (in->policy == MLX5_POLICY_INVALID) {
+		err = -EINVAL;
+		goto out;
+	}
+	in->field_select = MLX5_HCA_VPORT_SEL_STATE_POLICY;
+	err = mlx5_core_modify_hca_vport_context(mdev, 1, 1, vf + 1, in);
+
+out:
+	kfree(in);
+	return err;
+}
+
+int mlx5_ib_get_vf_stats(struct ib_device *device, int vf,
+			 u8 port, struct ifla_vf_stats *stats)
+{
+	int out_sz = MLX5_ST_SZ_BYTES(query_vport_counter_out);
+	struct mlx5_core_dev *mdev;
+	struct mlx5_ib_dev *dev;
+	void *out;
+	int err;
+
+	dev = to_mdev(device);
+	mdev = dev->mdev;
+
+	out = kzalloc(out_sz, GFP_KERNEL);
+	if (!out)
+		return -ENOMEM;
+
+	err = mlx5_core_query_vport_counter(mdev, true, vf, port, out, out_sz);
+	if (err)
+		goto ex;
+
+	stats->rx_packets = MLX5_GET64_PR(query_vport_counter_out, out, received_ib_unicast.packets);
+	stats->tx_packets = MLX5_GET64_PR(query_vport_counter_out, out, transmitted_ib_unicast.packets);
+	stats->rx_bytes = MLX5_GET64_PR(query_vport_counter_out, out, received_ib_unicast.octets);
+	stats->tx_bytes = MLX5_GET64_PR(query_vport_counter_out, out, transmitted_ib_unicast.octets);
+	stats->multicast = MLX5_GET64_PR(query_vport_counter_out, out, received_ib_multicast.packets);
+
+ex:
+	kfree(out);
+	return err;
+}
+
+static int set_vf_node_guid(struct ib_device *device, int vf, u8 port, u64 guid)
+{
+	struct mlx5_ib_dev *dev = to_mdev(device);
+	struct mlx5_core_dev *mdev = dev->mdev;
+	struct mlx5_hca_vport_context *in;
+	int err;
+
+	in = kzalloc(sizeof(*in), GFP_KERNEL);
+	if (!in)
+		return -ENOMEM;
+
+	in->field_select = MLX5_HCA_VPORT_SEL_NODE_GUID;
+	in->node_guid = guid;
+	err = mlx5_core_modify_hca_vport_context(mdev, 1, 1, vf + 1, in);
+	kfree(in);
+	return err;
+}
+
+static int set_vf_port_guid(struct ib_device *device, int vf, u8 port, u64 guid)
+{
+	struct mlx5_ib_dev *dev = to_mdev(device);
+	struct mlx5_core_dev *mdev = dev->mdev;
+	struct mlx5_hca_vport_context *in;
+	int err;
+
+	in = kzalloc(sizeof(*in), GFP_KERNEL);
+	if (!in)
+		return -ENOMEM;
+
+	in->field_select = MLX5_HCA_VPORT_SEL_PORT_GUID;
+	in->port_guid = guid;
+	err = mlx5_core_modify_hca_vport_context(mdev, 1, 1, vf + 1, in);
+	kfree(in);
+	return err;
+}
+
+int mlx5_ib_set_vf_guid(struct ib_device *device, int vf, u8 port,
+			u64 guid, int type)
+{
+	if (type == IFLA_VF_IB_NODE_GUID)
+		return set_vf_node_guid(device, vf, port, guid);
+	else if (type == IFLA_VF_IB_PORT_GUID)
+		return set_vf_port_guid(device, vf, port, guid);
+
+	return -EINVAL;
+}
diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index 399049b70a72..00ca1e998254 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -549,6 +549,9 @@ static int mlx5_ib_query_device(struct ib_device *ibdev,
 	if (MLX5_CAP_GEN(mdev, cd))
 		props->device_cap_flags |= IB_DEVICE_CROSS_CHANNEL;
 
+	if (!mlx5_core_is_pf(mdev))
+		props->device_cap_flags |= IB_DEVICE_VIRTUAL_FUNCTION;
+
 	return 0;
 }
 
@@ -686,6 +689,7 @@ static int mlx5_query_hca_port(struct ib_device *ibdev, u8 port,
 	props->qkey_viol_cntr	= rep->qkey_violation_counter;
 	props->subnet_timeout	= rep->subnet_timeout;
 	props->init_type_reply	= rep->init_type_reply;
+	props->grh_required	= rep->grh_required;
 
 	err = mlx5_query_port_link_width_oper(mdev, &ib_link_width_oper, port);
 	if (err)
@@ -2266,6 +2270,12 @@ static void *mlx5_ib_add(struct mlx5_core_dev *mdev)
 	dev->ib_dev.map_mr_sg		= mlx5_ib_map_mr_sg;
 	dev->ib_dev.check_mr_status	= mlx5_ib_check_mr_status;
 	dev->ib_dev.get_port_immutable  = mlx5_port_immutable;
+	if (mlx5_core_is_pf(mdev)) {
+		dev->ib_dev.get_vf_config	= mlx5_ib_get_vf_config;
+		dev->ib_dev.set_vf_link_state	= mlx5_ib_set_vf_link_state;
+		dev->ib_dev.get_vf_stats	= mlx5_ib_get_vf_stats;
+		dev->ib_dev.set_vf_guid		= mlx5_ib_set_vf_guid;
+	}
 
 	mlx5_ib_internal_fill_odp_caps(dev);
 
diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
index d2b9737baa36..752e3e33bcbc 100644
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
+++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
@@ -719,6 +719,14 @@ void mlx5_ib_qp_disable_pagefaults(struct mlx5_ib_qp *qp);
 void mlx5_ib_qp_enable_pagefaults(struct mlx5_ib_qp *qp);
 void mlx5_ib_invalidate_range(struct ib_umem *umem, unsigned long start,
 			      unsigned long end);
+int mlx5_ib_get_vf_config(struct ib_device *device, int vf,
+			  u8 port, struct ifla_vf_info *info);
+int mlx5_ib_set_vf_link_state(struct ib_device *device, int vf,
+			      u8 port, int state);
+int mlx5_ib_get_vf_stats(struct ib_device *device, int vf,
+			 u8 port, struct ifla_vf_stats *stats);
+int mlx5_ib_set_vf_guid(struct ib_device *device, int vf, u8 port,
+			u64 guid, int type);
 
 #else /* CONFIG_INFINIBAND_ON_DEMAND_PAGING */
 static inline void mlx5_ib_internal_fill_odp_caps(struct mlx5_ib_dev *dev)
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index 8edcd08853dd..ea00549e45aa 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -613,7 +613,10 @@ struct mlx5_pas {
 };
 
 enum port_state_policy {
-	MLX5_AAA_000
+	MLX5_POLICY_DOWN	= 0,
+	MLX5_POLICY_UP		= 1,
+	MLX5_POLICY_FOLLOW	= 2,
+	MLX5_POLICY_INVALID	= 0xffffffff
 };
 
 enum phy_port_state {
diff --git a/include/linux/mlx5/mlx5_ifc.h b/include/linux/mlx5/mlx5_ifc.h
index 9f404de5f99b..7a146074d428 100644
--- a/include/linux/mlx5/mlx5_ifc.h
+++ b/include/linux/mlx5/mlx5_ifc.h
@@ -3661,6 +3661,12 @@ struct mlx5_ifc_query_hca_vport_pkey_in_bits {
 	u8         pkey_index[0x10];
 };
 
+enum {
+	MLX5_HCA_VPORT_SEL_PORT_GUID	= 1 << 0,
+	MLX5_HCA_VPORT_SEL_NODE_GUID	= 1 << 1,
+	MLX5_HCA_VPORT_SEL_STATE_POLICY	= 1 << 2,
+};
+
 struct mlx5_ifc_query_hca_vport_gid_out_bits {
 	u8         status[0x8];
 	u8         reserved_at_8[0x18];
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH for-next 09/10] IB/ipoib: Allow mcast packets from other VFs
       [not found] ` <1456851143-138332-1-git-send-email-eli-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
                     ` (7 preceding siblings ...)
  2016-03-01 16:52   ` [PATCH for-next 08/10] IB/mlx5: Implement callbacks for manipulating VFs Eli Cohen
@ 2016-03-01 16:52   ` Eli Cohen
       [not found]     ` <1456851143-138332-10-git-send-email-eli-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2016-03-01 16:52   ` [PATCH for-next 10/10] IB/core: Use GRH when the path hop-limit > 0 Eli Cohen
  9 siblings, 1 reply; 34+ messages in thread
From: Eli Cohen @ 2016-03-01 16:52 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, liranl-VPRAkNaXOzVWk0Htik3J/w,
	Eli Cohen

With SRIOV enabled, two VFs on the same HCA have the same port LID and
may have the same QP number. To enable receiving multicasts from such
VFs, further qualify the check: drop the received packet only if, in
addition, the interface ID of the packet's source GID equals that of the
receiving VF's local GID.

Signed-off-by: Eli Cohen <eli-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
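The qualified check can be summarised as a pure function. This is a
minimal userspace sketch of the decision only: struct rx_info,
struct local_info and drop_as_own_packet() are hypothetical names, while
the real code works on struct ib_wc and on the GRH at the start of the skb.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical summary of the fields the check reads from a completion. */
struct rx_info {
	uint16_t slid;		/* source LID */
	uint32_t src_qp;	/* source QP number */
	bool     has_grh;	/* models wc->wc_flags & IB_WC_GRH */
	uint64_t sgid_iid;	/* interface_id of the GRH source GID */
};

/* Hypothetical summary of the receiving interface. */
struct local_info {
	uint16_t lid;
	uint32_t qp_num;
	uint64_t gid_iid;	/* interface_id of the local GID */
};

/*
 * Return true if the packet should be dropped as one this interface sent
 * itself. LID and QP number must match; if a GRH is present, a differing
 * source interface ID means another VF sent it, so the packet is kept.
 */
static bool drop_as_own_packet(const struct rx_info *rx,
			       const struct local_info *me)
{
	if (rx->slid != me->lid || rx->src_qp != me->qp_num)
		return false;
	if (rx->has_grh && rx->sgid_iid != me->gid_iid)
		return false;
	return true;
}
```

Without a GRH the function falls back to the old LID plus QP number
comparison, which matches the need_repost default of 1 in the patch.
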
 drivers/infiniband/ulp/ipoib/ipoib_ib.c | 29 ++++++++++++++++++++++-------
 1 file changed, 22 insertions(+), 7 deletions(-)

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ib.c b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
index fa9c42ff1fb0..e0b953cdab50 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
@@ -180,6 +180,7 @@ static void ipoib_ib_handle_rx_wc(struct net_device *dev, struct ib_wc *wc)
 	struct sk_buff *skb;
 	u64 mapping[IPOIB_UD_RX_SG];
 	union ib_gid *dgid;
+	union ib_gid *sgid;
 
 	ipoib_dbg_data(priv, "recv completion: id %d, status: %d\n",
 		       wr_id, wc->status);
@@ -203,13 +204,6 @@ static void ipoib_ib_handle_rx_wc(struct net_device *dev, struct ib_wc *wc)
 		return;
 	}
 
-	/*
-	 * Drop packets that this interface sent, ie multicast packets
-	 * that the HCA has replicated.
-	 */
-	if (wc->slid == priv->local_lid && wc->src_qp == priv->qp->qp_num)
-		goto repost;
-
 	memcpy(mapping, priv->rx_ring[wr_id].mapping,
 	       IPOIB_UD_RX_SG * sizeof *mapping);
 
@@ -239,6 +233,27 @@ static void ipoib_ib_handle_rx_wc(struct net_device *dev, struct ib_wc *wc)
 	else
 		skb->pkt_type = PACKET_MULTICAST;
 
+	sgid = &((struct ib_grh *)skb->data)->sgid;
+
+	/*
+	 * Drop packets that this interface sent, ie multicast packets
+	 * that the HCA has replicated.
+	 */
+	if (wc->slid == priv->local_lid && wc->src_qp == priv->qp->qp_num) {
+		int need_repost = 1;
+
+		if ((wc->wc_flags & IB_WC_GRH) &&
+		    memcmp(&sgid->global.interface_id,
+			   &priv->local_gid.global.interface_id,
+			   sizeof(sgid->global.interface_id)))
+			need_repost = 0;
+
+		if (need_repost) {
+			dev_kfree_skb_any(skb);
+			goto repost;
+		}
+	}
+
 	skb_pull(skb, IB_GRH_BYTES);
 
 	skb->protocol = ((struct ipoib_header *) skb->data)->proto;
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH for-next 10/10] IB/core: Use GRH when the path hop-limit > 0
       [not found] ` <1456851143-138332-1-git-send-email-eli-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
                     ` (8 preceding siblings ...)
  2016-03-01 16:52   ` [PATCH for-next 09/10] IB/ipoib: Allow mcast packets from other VFs Eli Cohen
@ 2016-03-01 16:52   ` Eli Cohen
       [not found]     ` <1456851143-138332-11-git-send-email-eli-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  9 siblings, 1 reply; 34+ messages in thread
From: Eli Cohen @ 2016-03-01 16:52 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, liranl-VPRAkNaXOzVWk0Htik3J/w,
	Or Gerlitz

From: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

According to IBTA spec v1.3 section 12.7.19, QPs should use GRH when
the path returned by the SA has hop-limit > 0. Currently, we do that
only for the > 1 case, fix that.

Fixes: 6d969a471ba1 ('IB/sa: Add ib_init_ah_from_path()')
Signed-off-by: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
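For illustration, the old and new conditions differ only for
hop_limit == 1. A minimal sketch with hypothetical helper names (the real
check is written inline in ib_init_ah_from_path()):

```c
#include <stdbool.h>
#include <stdint.h>

/* Condition before this patch: hop_limit == 1 did not select a GRH. */
static bool path_needs_grh_old(uint8_t hop_limit, bool use_roce)
{
	return hop_limit > 1 || use_roce;
}

/* Condition after this patch: any non-zero hop limit selects a GRH. */
static bool path_needs_grh(uint8_t hop_limit, bool use_roce)
{
	return hop_limit > 0 || use_roce;
}
```
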
 drivers/infiniband/core/sa_query.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/infiniband/core/sa_query.c b/drivers/infiniband/core/sa_query.c
index 833d2a99a311..6145ded45293 100644
--- a/drivers/infiniband/core/sa_query.c
+++ b/drivers/infiniband/core/sa_query.c
@@ -1076,7 +1076,7 @@ int ib_init_ah_from_path(struct ib_device *device, u8 port_num,
 		}
 	}
 
-	if (rec->hop_limit > 1 || use_roce) {
+	if (rec->hop_limit > 0 || use_roce) {
 		ah_attr->ah_flags = IB_AH_GRH;
 		ah_attr->grh.dgid = rec->dgid;
 
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [PATCH for-next 01/10] net/core: Add support for configuring VF GUIDs
       [not found]     ` <1456851143-138332-2-git-send-email-eli-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2016-03-01 17:37       ` Jason Gunthorpe
       [not found]         ` <20160301173751.GA25176-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 34+ messages in thread
From: Jason Gunthorpe @ 2016-03-01 17:37 UTC (permalink / raw)
  To: Eli Cohen
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, liranl-VPRAkNaXOzVWk0Htik3J/w,
	netdev-u79uwXL29TY76Z2rM5mHXA

On Tue, Mar 01, 2016 at 06:52:14PM +0200, Eli Cohen wrote:
> Add two new NLAs to support configuration of Infiniband node or port
> GUIDs. New applications can choose to use this interface to configure
> GUIDs with iproute2 with commands such as:
> 
> ip link set dev ib0 vf 0 node_guid 00:02:c9:03:00:21:6e:70
> ip link set dev ib0 vf 0 port_guid 00:02:c9:03:00:21:6e:78

I like this idea better than the last version..
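
For reference, each GUID in the commands quoted above is eight
colon-separated hex bytes. A minimal sketch of turning such a string into
a 64-bit value, assuming exactly two hex digits per byte; parse_guid() and
hex_val() are hypothetical helpers, not the iproute2 or kernel
implementation:

```c
#include <stdint.h>

/* Return the value of one hex digit, or -1 if c is not a hex digit. */
static int hex_val(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Parse "xx:xx:xx:xx:xx:xx:xx:xx" into *guid, most significant byte
 * first. Returns 0 on success and -1 on any malformed input.
 */
static int parse_guid(const char *s, uint64_t *guid)
{
	uint64_t v = 0;
	int i;

	for (i = 0; i < 8; i++) {
		int hi = hex_val(s[0]);
		/* Only look at s[1] once s[0] is known not to be NUL. */
		int lo = hi < 0 ? -1 : hex_val(s[1]);

		if (hi < 0 || lo < 0)
			return -1;
		v = (v << 8) | (uint64_t)(hi << 4 | lo);
		s += 2;
		if (i < 7) {
			if (*s != ':')
				return -1;
			s++;
		}
	}
	if (*s != '\0')
		return -1;

	*guid = v;
	return 0;
}
```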

> +static int handle_vf_mac(struct net_device *dev, struct ifla_vf_mac *ivm)
> +{
[..]
> +	return handle_infiniband_guid(dev, &ivt, IFLA_VF_IB_PORT_GUID);

But is this emulation really necessary? It seems dangerous and
continues the bad practice of assuming IFLA_VF_MAC is fixed to 6 bytes
in size, and is not just LLADDR bytes. I'd rather see mac sets fail on
IB.

Jason

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH for-next 10/10] IB/core: Use GRH when the path hop-limit > 0
       [not found]     ` <1456851143-138332-11-git-send-email-eli-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2016-03-01 17:38       ` Jason Gunthorpe
       [not found]         ` <20160301173846.GB25176-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 34+ messages in thread
From: Jason Gunthorpe @ 2016-03-01 17:38 UTC (permalink / raw)
  To: Eli Cohen
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, liranl-VPRAkNaXOzVWk0Htik3J/w,
	Or Gerlitz

On Tue, Mar 01, 2016 at 06:52:23PM +0200, Eli Cohen wrote:
> From: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> 
> According to IBTA spec v1.3 section 12.7.19, QPs should use GRH when
> the path returned by the SA has hop-limit > 0. Currently, we do that
> only for the > 1 case, fix that.
> 
> Fixes: 6d969a471ba1 ('IB/sa: Add ib_init_ah_from_path()')
> Signed-off-by: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

Reviewed-by: Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>

Jason

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH for-next 03/10] IB/core: Support accessing SA in virtualized environment
       [not found]     ` <1456851143-138332-4-git-send-email-eli-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2016-03-01 17:44       ` Jason Gunthorpe
       [not found]         ` <20160301174401.GC25176-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 34+ messages in thread
From: Jason Gunthorpe @ 2016-03-01 17:44 UTC (permalink / raw)
  To: Eli Cohen
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, liranl-VPRAkNaXOzVWk0Htik3J/w

On Tue, Mar 01, 2016 at 06:52:16PM +0200, Eli Cohen wrote:
> Per the ongoing standardisation process, when virtual HCAs are present
> in a network, traffic is routed based on a destination GID. In order to
> access the SA we use the well known SA GID.

Should we be merging patches based on on-going standards work?

> +		ah_attr.ah_flags = IB_AH_GRH;
> +		ah_attr.grh.dgid.global.subnet_prefix = cpu_to_be64(IB_SA_WELL_KNOWN_GID_PREFIX);
> +		ah_attr.grh.dgid.global.interface_id = cpu_to_be64(IB_SA_WELL_KNOWN_GUID);

I'm surprised this is hard wired to fe80::2 - surely this should use
the current subnet prefix? There should be no traffic prefixed with
fe80:: if a subnet is configured with another prefix.

Jason

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH for-next 01/10] net/core: Add support for configuring VF GUIDs
       [not found]         ` <20160301173751.GA25176-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2016-03-01 17:49           ` Eli Cohen
       [not found]             ` <20160301174951.GA19366-lgQlq6cFzJSjLWYaRI30zHI+JuX82XLG@public.gmane.org>
  0 siblings, 1 reply; 34+ messages in thread
From: Eli Cohen @ 2016-03-01 17:49 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, liranl-VPRAkNaXOzVWk0Htik3J/w,
	netdev-u79uwXL29TY76Z2rM5mHXA

On Tue, Mar 01, 2016 at 10:37:51AM -0700, Jason Gunthorpe wrote:
> > +	return handle_infiniband_guid(dev, &ivt, IFLA_VF_IB_PORT_GUID);
> 
> But is this emulation really necessary? It seems dangerous and
> continues the bad practice of assuming IFLA_VF_MAC is fixed to 6 bytes
> in size, and is not just LLADDR bytes. I'd rather see mac sets fail on
> IB.
> 
struct ifla_vf_mac already defines mac as 32 bytes, but the idea here
is that applications that configure six-byte Ethernet MACs for VFs will
continue to work without any change.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH for-next 03/10] IB/core: Support accessing SA in virtualized environment
       [not found]         ` <20160301174401.GC25176-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2016-03-01 18:17           ` Eli Cohen
       [not found]             ` <20160301181742.GB19366-lgQlq6cFzJSjLWYaRI30zHI+JuX82XLG@public.gmane.org>
  0 siblings, 1 reply; 34+ messages in thread
From: Eli Cohen @ 2016-03-01 18:17 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, liranl-VPRAkNaXOzVWk0Htik3J/w

On Tue, Mar 01, 2016 at 10:44:01AM -0700, Jason Gunthorpe wrote:
> On Tue, Mar 01, 2016 at 06:52:16PM +0200, Eli Cohen wrote:
> > Per the ongoing standardisation process, when virtual HCAs are present
> > in a network, traffic is routed based on a destination GID. In order to
> > access the SA we use the well known SA GID.
> 
> > Should we be merging patches based on on-going standards work?

The spec is in a very advanced state so I think it makes sense to
merge.

> 
> > +		ah_attr.ah_flags = IB_AH_GRH;
> > +		ah_attr.grh.dgid.global.subnet_prefix = cpu_to_be64(IB_SA_WELL_KNOWN_GID_PREFIX);
> > +		ah_attr.grh.dgid.global.interface_id = cpu_to_be64(IB_SA_WELL_KNOWN_GUID);
> 
> I'm surprised this is hard wired to fe80::2 - surely this should use
> the current subnet prefix? There should be no traffic prefixed with
> fe80:: if a subnet is configured with another prefix.
> 
Hmm... it seems there is no such thing as IB_SA_WELL_KNOWN_GID_PREFIX.
The subnet prefix is part of the port info, but struct ib_port_attr does
not define a field for the subnet prefix. How do you think it should
be obtained?

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH for-next 01/10] net/core: Add support for configuring VF GUIDs
       [not found]             ` <20160301174951.GA19366-lgQlq6cFzJSjLWYaRI30zHI+JuX82XLG@public.gmane.org>
@ 2016-03-01 18:25               ` Jason Gunthorpe
       [not found]                 ` <20160301182516.GA12495-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 34+ messages in thread
From: Jason Gunthorpe @ 2016-03-01 18:25 UTC (permalink / raw)
  To: Eli Cohen
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, liranl-VPRAkNaXOzVWk0Htik3J/w,
	netdev-u79uwXL29TY76Z2rM5mHXA

On Tue, Mar 01, 2016 at 07:49:51PM +0200, Eli Cohen wrote:
> On Tue, Mar 01, 2016 at 10:37:51AM -0700, Jason Gunthorpe wrote:
> > > +	return handle_infiniband_guid(dev, &ivt, IFLA_VF_IB_PORT_GUID);
> > 
> > But is this emulation really necessary? It seems dangerous and
> > continues the bad practice of assuming IFLA_VF_MAC is fixed to 6 bytes
> > in size, and is not just LLADDR bytes. I'd rather see mac sets fail on
> > IB.
> > 
> struct ifla_vf_mac already defines mac as 32 bytes, but the idea here
> is that applications that configure six-byte Ethernet MACs for VFs will
> continue to work without any change.

In my view it is incorrect for an application to try and set a 6 byte
mac on an *infiniband* interface, the kernel should refuse to do it.

Jason

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH for-next 03/10] IB/core: Support accessing SA in virtualized environment
       [not found]             ` <20160301181742.GB19366-lgQlq6cFzJSjLWYaRI30zHI+JuX82XLG@public.gmane.org>
@ 2016-03-01 18:32               ` Jason Gunthorpe
       [not found]                 ` <20160301183256.GB12495-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 34+ messages in thread
From: Jason Gunthorpe @ 2016-03-01 18:32 UTC (permalink / raw)
  To: Eli Cohen
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, liranl-VPRAkNaXOzVWk0Htik3J/w

On Tue, Mar 01, 2016 at 08:17:42PM +0200, Eli Cohen wrote:
> On Tue, Mar 01, 2016 at 10:44:01AM -0700, Jason Gunthorpe wrote:
> > On Tue, Mar 01, 2016 at 06:52:16PM +0200, Eli Cohen wrote:
> > > Per the ongoing standardisation process, when virtual HCAs are present
> > > in a network, traffic is routed based on a destination GID. In order to
> > > access the SA we use the well known SA GID.
> > 
> > Should we be merging patches based on on-going standards work?
> The spec is in a very advanced state so I think it makes sense to
> merge.

How does this interact with the existing SRIOV stuff? It is already
very annoying that mlx5 is incompatible with the old scheme. Is the
proposal to shift all IB to this new scheme, or still keep mlx4/mlx5 with
different approaches?

We can't just have the kernel become incompatible with existing
SMs. Eg opensm only supports the old scheme last I looked.

> > > +		ah_attr.ah_flags = IB_AH_GRH;
> > > +		ah_attr.grh.dgid.global.subnet_prefix = cpu_to_be64(IB_SA_WELL_KNOWN_GID_PREFIX);
> > > +		ah_attr.grh.dgid.global.interface_id = cpu_to_be64(IB_SA_WELL_KNOWN_GUID);
> > 
> > I'm surprised this is hard wired to fe80::2 - surely this should use
> > the current subnet prefix? There should be no traffic prefixed with
> > fe80:: if a subnet is configured with another prefix.
> > 
> Hmm... seems like no such thing IB_SA_WELL_KNOWN_GID_PREFIX. The
> subnet prefix is part of the port info but struct ib_port_attr does
> not define a field for the subnet prefix. How do you think it should
> be obtained?

It makes sense to add it to ib_port_attr, the code already has to call
that to get the sm_lid.
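
As a rough illustration of the point above, here is a minimal user-space sketch of building the SA GID from the port's own subnet prefix instead of a fixed fe80::. The names gid128, put_be64 and sa_gid_for_prefix are hypothetical, not kernel API. The only values taken from the thread are the fe80:: default prefix and the "::2" interface ID.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 128-bit GID, stored in network byte order. */
struct gid128 {
	uint8_t raw[16];
};

#define SA_WELL_KNOWN_GUID    0x0000000000000002ULL /* the "::2" in fe80::2 */
#define DEFAULT_SUBNET_PREFIX 0xfe80000000000000ULL /* the fe80:: default prefix */

/* Store a 64-bit host-order value big-endian into dst[0..7]. */
static void put_be64(uint8_t *dst, uint64_t v)
{
	for (int i = 0; i < 8; i++)
		dst[i] = (uint8_t)(v >> (56 - 8 * i));
}

/*
 * Build the SA GID from the subnet prefix the port is actually
 * configured with, rather than always using the default prefix.
 */
static struct gid128 sa_gid_for_prefix(uint64_t subnet_prefix)
{
	struct gid128 gid;

	put_be64(&gid.raw[0], subnet_prefix);
	put_be64(&gid.raw[8], SA_WELL_KNOWN_GUID);
	return gid;
}
```

With the default prefix this yields fe80::2. With any other prefix only the upper 64 bits change.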

Jason

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH for-next 03/10] IB/core: Support accessing SA in virtualized environment
       [not found]                 ` <20160301183256.GB12495-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2016-03-01 19:07                   ` Eli Cohen
       [not found]                     ` <20160301190742.GC19366-lgQlq6cFzJSjLWYaRI30zHI+JuX82XLG@public.gmane.org>
  0 siblings, 1 reply; 34+ messages in thread
From: Eli Cohen @ 2016-03-01 19:07 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, liranl-VPRAkNaXOzVWk0Htik3J/w

On Tue, Mar 01, 2016 at 11:32:56AM -0700, Jason Gunthorpe wrote:
> 
> How does this interact with the existing SRIOV stuff? It is already
> very annoying that mlx5 is incompatible with the old scheme. Is the
> proposal to shift all IB to this new scheme, or still keep mlx4/mlx5 with
> different approaches?

Now we have proper interfaces so we can use these interfaces to
implement the same functionality in mlx4 which is there but accessible
in a different manner.
> 
> We can't just have the kernel become incompatible with existing
> SMs. Eg opensm only supports the old scheme last I looked.
> 
You need an SM which supports virtualization to have virtualization
supported, but if you don't have one you can still work with physical
functions in the same way you did before. So we don't break anything; we
are just adding new functionality.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH for-next 03/10] IB/core: Support accessing SA in virtualized environment
       [not found]                     ` <20160301190742.GC19366-lgQlq6cFzJSjLWYaRI30zHI+JuX82XLG@public.gmane.org>
@ 2016-03-01 19:31                       ` Jason Gunthorpe
       [not found]                         ` <20160301193153.GA25755-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 34+ messages in thread
From: Jason Gunthorpe @ 2016-03-01 19:31 UTC (permalink / raw)
  To: Eli Cohen
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, liranl-VPRAkNaXOzVWk0Htik3J/w

On Tue, Mar 01, 2016 at 09:07:42PM +0200, Eli Cohen wrote:
> On Tue, Mar 01, 2016 at 11:32:56AM -0700, Jason Gunthorpe wrote:
> > 
> > How does this interact with the existing SRIOV stuff? It is already
> > very annoying that mlx5 is incompatible with the old scheme. Is the
> > proposal to shift all IB to this new scheme, or still keep mlx4/mlx5 with
> > different approaches?
> 
> Now we have proper interfaces so we can use these interfaces to
> implement the same functionality in mlx4 which is there but accessible
> in a different manner.

> > We can't just have the kernel become incompatible with existing
> > SMs. Eg opensm only supports the old scheme last I looked.
>
> You need an SM which supports virtualization to have virtualization
> supported, but if you don't have one you can still work with physical
> functions in the same way you did before. So we don't break anything; we
> are just adding new functionality.

I mean opensm supports the GUID Alias scheme for virtualization, this
new virtualization scheme is not compatible, and we shouldn't have the
kernel drop support for existing working SMs, by, eg, replacing the
mlx4 guid alias scheme with this new scheme.

I'm guessing a user controlled switch is going to be necessary here to
pick GUID alias or port port virtualization.

Jason

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH for-next 03/10] IB/core: Support accessing SA in virtualized environment
       [not found]                         ` <20160301193153.GA25755-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2016-03-01 19:46                           ` Eli Cohen
       [not found]                             ` <20160301194608.GF19366-lgQlq6cFzJSjLWYaRI30zHI+JuX82XLG@public.gmane.org>
  0 siblings, 1 reply; 34+ messages in thread
From: Eli Cohen @ 2016-03-01 19:46 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, liranl-VPRAkNaXOzVWk0Htik3J/w

On Tue, Mar 01, 2016 at 12:31:53PM -0700, Jason Gunthorpe wrote:
> 
> I mean opensm supports the GUID Alias scheme for virtualization, this
> new virtualization scheme is not compatible, and we shouldn't have the
> kernel drop support for existing working SMs, by, eg, replacing the
> mlx4 guid alias scheme with this new scheme.
> 
> I'm guessing a user controlled switch is going to be necessary here to
> pick GUID alias or port port virtualization.
> 

The alias GUID mechanism remains and can be used with mlx4 devices.
With this scheme the admin configures the port and node GUIDs using
iproute2, which ends up in the hardware driver configuring the device. A
virtualization aware SM can read this configuration through MADs.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH for-next 01/10] net/core: Add support for configuring VF GUIDs
       [not found]                 ` <20160301182516.GA12495-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2016-03-01 21:08                   ` Or Gerlitz
       [not found]                     ` <CAJ3xEMgrAUCj7PS6fegmuSUsjMruH3gzSHZmuzAX+ZbHZOpL9w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 34+ messages in thread
From: Or Gerlitz @ 2016-03-01 21:08 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Eli Cohen, Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	Liran Liss, Linux Netdev List

On Tue, Mar 1, 2016 at 8:25 PM, Jason Gunthorpe
<jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote:
> On Tue, Mar 01, 2016 at 07:49:51PM +0200, Eli Cohen wrote:
>> On Tue, Mar 01, 2016 at 10:37:51AM -0700, Jason Gunthorpe wrote:
>> > > + return handle_infiniband_guid(dev, &ivt, IFLA_VF_IB_PORT_GUID);
>> >
>> > But is this emulation really necessary? It seems dangerous and
>> > continues the bad practice of assuming IFLA_VF_MAC is fixed to 6 bytes
>> > in size, and is not just LLADDR bytes. I'd rather see mac sets fail on
>> > IB.
>> >
>> struct ifla_vf_mac  already defines mac as 32 bytes but the idea here
>> is that applications that configure six byte Ethernet MACs to VFs will
>> continue to work without any change.
>
> In my view it is incorrect for an application to try and set a 6 byte
> mac on an *infiniband* interface, the kernel should refuse to do it.

As Eli wrote, there's a well defined way to extend MAC to GUID. With
that in hand, the idea here is to allow staged/evolved support for IB
Virtualization using un-touched provisioning systems which assign VMs
with 6-byte MACs along with the fully IB aware solution where the
upper level does provision IB GUIDs.
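
For illustration only, a minimal sketch of one way to expand a 6-byte MAC into a 64-bit GUID, assuming the EUI-64 style mapping that inserts FF:FE between the third and fourth MAC bytes. The helper name mac_to_guid is hypothetical, and this is not claimed to be the exact transformation used by the patch.

```c
#include <stdint.h>

/*
 * Hypothetical helper: expand a 6-byte MAC into a 64-bit GUID
 * (returned in host byte order) by inserting 0xFF 0xFE between the
 * OUI (first three bytes) and the device-specific last three bytes.
 * Assumption: no universal/local bit manipulation is applied.
 */
static uint64_t mac_to_guid(const uint8_t mac[6])
{
	uint64_t guid = 0;

	guid |= (uint64_t)mac[0] << 56;
	guid |= (uint64_t)mac[1] << 48;
	guid |= (uint64_t)mac[2] << 40;
	guid |= (uint64_t)0xff   << 32;
	guid |= (uint64_t)0xfe   << 24;
	guid |= (uint64_t)mac[3] << 16;
	guid |= (uint64_t)mac[4] << 8;
	guid |= (uint64_t)mac[5];
	return guid;
}
```

For example, the MAC 00:02:c9:01:02:03 would map to the GUID 0002:c9ff:fe01:0203 under this assumption.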

Or.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH for-next 03/10] IB/core: Support accessing SA in virtualized environment
       [not found]                             ` <20160301194608.GF19366-lgQlq6cFzJSjLWYaRI30zHI+JuX82XLG@public.gmane.org>
@ 2016-03-01 21:15                               ` Or Gerlitz
  2016-03-04 14:37                               ` Doug Ledford
  1 sibling, 0 replies; 34+ messages in thread
From: Or Gerlitz @ 2016-03-01 21:15 UTC (permalink / raw)
  To: Eli Cohen
  Cc: Jason Gunthorpe, Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	Liran Liss

On Tue, Mar 1, 2016 at 9:46 PM, Eli Cohen <eli-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote:
> On Tue, Mar 01, 2016 at 12:31:53PM -0700, Jason Gunthorpe wrote:

>> I mean opensm supports the GUID Alias scheme for virtualization, this
>> new virtualization scheme is not compatible, and we shouldn't have the
>> kernel drop support for existing working SMs, by, eg, replacing the
>> mlx4 guid alias scheme with this new scheme.

[...]

> The alias GUID mechanism remains and can be used with mlx4 devices.
> With this scheme the admin configures the port and node GUIDs using
> iproute2 which ends up in the hardware driver configuring the device.

Indeed. The provisioning through standard means (rtnetlink, iproute2)
would work for both mlx4 (uses Alias GUIDs) and mlx5 (uses
vHCA/vGUIDs).

Or.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH for-next 01/10] net/core: Add support for configuring VF GUIDs
       [not found]                     ` <CAJ3xEMgrAUCj7PS6fegmuSUsjMruH3gzSHZmuzAX+ZbHZOpL9w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2016-03-02 16:50                       ` Doug Ledford
       [not found]                         ` <56D719E3.2000206-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 34+ messages in thread
From: Doug Ledford @ 2016-03-02 16:50 UTC (permalink / raw)
  To: Or Gerlitz, Jason Gunthorpe
  Cc: Eli Cohen, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Liran Liss,
	Linux Netdev List


[-- Attachment #1.1: Type: text/plain, Size: 1605 bytes --]

On 3/1/2016 4:08 PM, Or Gerlitz wrote:
> On Tue, Mar 1, 2016 at 8:25 PM, Jason Gunthorpe
> <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote:
>> On Tue, Mar 01, 2016 at 07:49:51PM +0200, Eli Cohen wrote:
>>> On Tue, Mar 01, 2016 at 10:37:51AM -0700, Jason Gunthorpe wrote:
>>>>> + return handle_infiniband_guid(dev, &ivt, IFLA_VF_IB_PORT_GUID);
>>>>
>>>> But is this emulation really necessary? It seems dangerous and
>>>> continues the bad practice of assuming IFLA_VF_MAC is fixed to 6 bytes
>>>> in size, and is not just LLADDR bytes. I'd rather see mac sets fail on
>>>> IB.
>>>>
>>> struct ifla_vf_mac  already defines mac as 32 bytes but the idea here
>>> is that applications that configure six byte Ethernet MACs to VFs will
>>> continue to work without any change.
>>
>> In my view it is incorrect for an application to try and set a 6 byte
>> mac on an *infiniband* interface, the kernel should refuse to do it.
> 
> As Eli wrote, there's a well defined way to extend MAC to GUID. With
> that in hand, the idea here is to allow staged/evolved support for IB
> Virtualization using un-touched provisioning systems which assign VMs
> with 6-byte MACs

Exactly *what* provisioning system tries to set the VF_MAC on an IPoIB
interface and expects it to set the GUID of an underlying IB device?

> along with the fully IB aware solution where the
> upper level does provision IB GUIDs.

There has never been upstream support for this MAC->GUID stuff you refer
to.  I'm not convinced we should add it now versus just doing things
right, period.




^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH for-next 01/10] net/core: Add support for configuring VF GUIDs
       [not found]                         ` <56D719E3.2000206-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2016-03-02 18:40                           ` Or Gerlitz
       [not found]                             ` <CAJ3xEMh5vJAZVO03=rRVCvqqXzXvah3idrMtMQfFP-wBxR7R_Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 34+ messages in thread
From: Or Gerlitz @ 2016-03-02 18:40 UTC (permalink / raw)
  To: Doug Ledford
  Cc: Jason Gunthorpe, Eli Cohen, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	Liran Liss, Linux Netdev List

On Wed, Mar 2, 2016 at 6:50 PM, Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:

> Exactly *what* provisioning system tries to set the VF_MAC on an IPoIB
> interface and expects it to set the GUID of an underlying IB device?

The provisioning system need not be aware, in all of its components,
that this is IB. There's PCI linkage that tells it these are VFs of
this PF and that they have to be used for these VMs.

>> along with the fully IB aware solution where the
>> upper level does provision IB GUIDs.

> There has never been upstream support for this MAC->GUID stuff you refer
> to.  I'm not convinced we should add it now versus just doing things
> right, period.

We **are** doing things right with the new ndo.

Using the small MAC->GUID addition, people could be using non-modified
(or almost non-modified) provisioning systems that assign SRIOV VMs
with MACs --- just use these patches on their hosts and get a DHCP
server supplying IP addresses based on the derived GUIDs (this is
supported today).

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH for-next 10/10] IB/core: Use GRH when the path hop-limit > 0
       [not found]         ` <20160301173846.GB25176-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2016-03-03 15:55           ` Doug Ledford
  0 siblings, 0 replies; 34+ messages in thread
From: Doug Ledford @ 2016-03-03 15:55 UTC (permalink / raw)
  To: Jason Gunthorpe, Eli Cohen
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, liranl-VPRAkNaXOzVWk0Htik3J/w,
	Or Gerlitz

[-- Attachment #1: Type: text/plain, Size: 926 bytes --]

On 03/01/2016 12:38 PM, Jason Gunthorpe wrote:
> On Tue, Mar 01, 2016 at 06:52:23PM +0200, Eli Cohen wrote:
>> From: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
>>
>> According to IBTA spec v1.3 section 12.7.19, QPs should use GRH when
>> the path returned by the SA has hop-limit > 0. Currently, we do that
>> only for the > 1 case, fix that.
>>
>> Fixes: 6d969a471ba1 ('IB/sa: Add ib_init_ah_from_path()')
>> Signed-off-by: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> 
> Reviewed-by: Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
> 
> Jason
> 

I've pulled patch 10/10 out of this series and applied it to my 4.5-rc
branch as it's a valid fix for code already in the kernel and need not
be part of this other patch series.
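
For readers skimming the thread, the behavioural change in patch 10/10 can be sketched as a one-line predicate. This is a simplified illustration based only on the commit message quoted above; use_grh_old and use_grh_fixed are hypothetical names, not functions in the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/* Behaviour before the fix, per the commit message: GRH only when hop limit > 1. */
static bool use_grh_old(uint8_t hop_limit)
{
	return hop_limit > 1;
}

/* Behaviour after the fix: GRH whenever the path hop limit is non-zero. */
static bool use_grh_fixed(uint8_t hop_limit)
{
	return hop_limit > 0;
}
```

The two predicates differ only for a hop limit of exactly 1, which is the case the fix addresses.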

-- 
Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
              GPG KeyID: 0E572FDD




^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH for-next 01/10] net/core: Add support for configuring VF GUIDs
       [not found]                             ` <CAJ3xEMh5vJAZVO03=rRVCvqqXzXvah3idrMtMQfFP-wBxR7R_Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2016-03-04 14:35                               ` Doug Ledford
       [not found]                                 ` <56D99D3D.4000606-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 34+ messages in thread
From: Doug Ledford @ 2016-03-04 14:35 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: Jason Gunthorpe, Eli Cohen, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	Liran Liss, Linux Netdev List

[-- Attachment #1: Type: text/plain, Size: 3461 bytes --]

On 03/02/2016 01:40 PM, Or Gerlitz wrote:
> On Wed, Mar 2, 2016 at 6:50 PM, Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> 
>> Exactly *what* provisioning system tries to set the VF_MAC on an IPoIB
>> interface and expects it to set the GUID of an underlying IB device?
> 
> The provisioning system need not be fully aware in all their
> components this is IB here, there's PCI linkage that tells these are
> VFs of this PF and they have to be used for these VMs.

If I understand you correctly, then I don't think I agree.

From what I read, I gather you mean:

libvirt can be used to control guests today, and you can list a PCI
device as "managed" and specify a MAC address (which has libvirt
assuming the device is an ethernet device).  In that case, libvirt
automatically detaches the device from the host (if attached), figures
out if it's a PF or VF, sets the MAC address using either PF or VF MAC
setting methods in ethtool, attaches the device to the guest, then
starts the guest.  And you're saying we should put the MAC->GUID
transformation into this code for IB so that libvirt can be blissfully
ignorant and people can tell libvirt it's an ethernet device with a MAC
and libvirt will treat it as such and life will be grand.

Except it won't.  Along with setting the GUID, we also need to set the
P_Keys allowed list (at least using the alias GUIDs method of mlx4 you
do, so unless you add a patch to this series to switch mlx4 to this new
method, that's a valid concern).  And nothing in libvirt can do that as
long as libvirt thinks this is ethernet, because libvirt doesn't control the
vlans on a guest's ethernet device, the guest does.  In that sense, IB
and ethernet vary greatly.

So, at *best*, the solution you are suggesting for existing setups is a
partial solution that leaves things only half done.

I don't see the justification to clutter up upstream code with a
solution that isn't at least completely functional in its
implementation.  If the solution is only partial, then I would rather
leave it out and tell people to upgrade their libvirt to know about IB
devices.

This is actually further backed up, in my mind, by the fact that you can
have RoCE/iWARP Ethernet devices and regular Ethernet devices, and
libvirt needs to be taught the concept of an RDMA capable device,
whether Ethernet or IB, so that when trying to select a host for
migration it can make sure that the migration target has the same
capabilities as the hardware you are migrating from.  So, to me, doing
this right *requires* a libvirt upgrade, and there is no sense in this
middle ground GUID from MAC hack that you are suggesting.

>>> along with the fully IB aware solution where the
>>> upper level does provision IB GUIDs.
> 
>> There has never been upstream support for this MAC->GUID stuff you refer
>> to.  I'm not convinced we should add it now versus just doing things
>> right, period.
> 
> We **are** doing things right with the new ndo.
> 
> Using the small MAC->GUID addition, people could be using non-modified
> (or almost non-modified) provisioning systems that assign SRIOV VMs
> with MACs --- just use these patches on their hosts and get DHCP
> server supplying IP addresses based on the derived GUIDs (this is
> supported today).
> 


-- 
Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
              GPG KeyID: 0E572FDD




^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH for-next 03/10] IB/core: Support accessing SA in virtualized environment
       [not found]                             ` <20160301194608.GF19366-lgQlq6cFzJSjLWYaRI30zHI+JuX82XLG@public.gmane.org>
  2016-03-01 21:15                               ` Or Gerlitz
@ 2016-03-04 14:37                               ` Doug Ledford
       [not found]                                 ` <56D99D8E.5020900-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  1 sibling, 1 reply; 34+ messages in thread
From: Doug Ledford @ 2016-03-04 14:37 UTC (permalink / raw)
  To: Eli Cohen, Jason Gunthorpe
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, liranl-VPRAkNaXOzVWk0Htik3J/w

[-- Attachment #1: Type: text/plain, Size: 1138 bytes --]

On 03/01/2016 02:46 PM, Eli Cohen wrote:
> On Tue, Mar 01, 2016 at 12:31:53PM -0700, Jason Gunthorpe wrote:
>>
>> I mean opensm supports the GUID Alias scheme for virtualization, this
>> new virtualization scheme is not compatible, and we shouldn't have the
>> kernel drop support for existing working SMs, by, eg, replacing the
>> mlx4 guid alias scheme with this new scheme.
>>
>> I'm guessing a user controlled switch is going to be necessary here to
>> pick GUID alias or port port virtualization.
>>
> 
> The alias GUID mechanism remains and can be used with mlx4 devices.
> With this scheme the admin configures the port and node GUIDs using
> iproute2, which ends up in the hardware driver configuring the device. A
> virtualization aware SM can read this configuration through MADs.
> 

If the alias GUID mechanism is to be retained, then we need another NDO
entry point for setting P_Keys on alias GUID VFs.  If we are going to
switch over to using iproute2, then the solution needs to be complete.

-- 
Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
              GPG KeyID: 0E572FDD




^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH for-next 09/10] IB/ipoib: Allow mcast packets from other VFs
       [not found]     ` <1456851143-138332-10-git-send-email-eli-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2016-03-06 11:50       ` Yuval Shaia
       [not found]         ` <20160306115006.GA23975-Hxa29pjIrETlQW142y8m19+IiqhCXseY@public.gmane.org>
  0 siblings, 1 reply; 34+ messages in thread
From: Yuval Shaia @ 2016-03-06 11:50 UTC (permalink / raw)
  To: Eli Cohen
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, liranl-VPRAkNaXOzVWk0Htik3J/w

On Tue, Mar 01, 2016 at 06:52:22PM +0200, Eli Cohen wrote:
> With SRIOV enabled, two VFs on the same HCA have the same port LID
> and may have the same QP number. To enable receiving multicasts from
> such VFs, further qualify the check: ignore the receive only if, in
> addition, the packet source gid equals the receiving VF's source gid.
> 
> Signed-off-by: Eli Cohen <eli-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> ---
>  drivers/infiniband/ulp/ipoib/ipoib_ib.c | 29 ++++++++++++++++++++++-------
>  1 file changed, 22 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ib.c b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
> index fa9c42ff1fb0..e0b953cdab50 100644
> --- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c
> +++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
> @@ -180,6 +180,7 @@ static void ipoib_ib_handle_rx_wc(struct net_device *dev, struct ib_wc *wc)
>  	struct sk_buff *skb;
>  	u64 mapping[IPOIB_UD_RX_SG];
>  	union ib_gid *dgid;
> +	union ib_gid *sgid;
>  
>  	ipoib_dbg_data(priv, "recv completion: id %d, status: %d\n",
>  		       wr_id, wc->status);
> @@ -203,13 +204,6 @@ static void ipoib_ib_handle_rx_wc(struct net_device *dev, struct ib_wc *wc)
>  		return;
>  	}
>  
> -	/*
> -	 * Drop packets that this interface sent, ie multicast packets
> -	 * that the HCA has replicated.
> -	 */
> -	if (wc->slid == priv->local_lid && wc->src_qp == priv->qp->qp_num)
> -		goto repost;
> -
>  	memcpy(mapping, priv->rx_ring[wr_id].mapping,
>  	       IPOIB_UD_RX_SG * sizeof *mapping);
>  
> @@ -239,6 +233,27 @@ static void ipoib_ib_handle_rx_wc(struct net_device *dev, struct ib_wc *wc)
>  	else
>  		skb->pkt_type = PACKET_MULTICAST;
>  
> +	sgid = &((struct ib_grh *)skb->data)->sgid;
> +
> +	/*
> +	 * Drop packets that this interface sent, ie multicast packets
> +	 * that the HCA has replicated.
> +	 */
> +	if (wc->slid == priv->local_lid && wc->src_qp == priv->qp->qp_num) {
> +		int need_repost = 1;
> +
> +		if ((wc->wc_flags & IB_WC_GRH) &&
> +		    memcmp(&sgid->global.interface_id,
> +			   &priv->local_gid.global.interface_id,
> +			   sizeof(sgid->global.interface_id)))
1. Why can't we do sgid->global.interface_id !=
priv->local_gid.global.interface_id?
2. Don't we also need to check subnet_prefix? I.e. is it possible to have
the same interface_id on different networks?
> +			need_repost = 0;
> +
> +		if (need_repost) {
> +			dev_kfree_skb_any(skb);
> +			goto repost;
> +		}
> +	}
> +
>  	skb_pull(skb, IB_GRH_BYTES);
>  
>  	skb->protocol = ((struct ipoib_header *) skb->data)->proto;
> -- 
> 1.8.3.1
> 

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH for-next 09/10] IB/ipoib: Allow mcast packets from other VFs
       [not found]         ` <20160306115006.GA23975-Hxa29pjIrETlQW142y8m19+IiqhCXseY@public.gmane.org>
@ 2016-03-06 12:13           ` Or Gerlitz
  0 siblings, 0 replies; 34+ messages in thread
From: Or Gerlitz @ 2016-03-06 12:13 UTC (permalink / raw)
  To: Yuval Shaia
  Cc: Eli Cohen, Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Liran Liss

On Sun, Mar 6, 2016 at 1:50 PM, Yuval Shaia <yuval.shaia-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org> wrote:
> On Tue, Mar 01, 2016 at 06:52:22PM +0200, Eli Cohen wrote:

>> @@ -239,6 +233,27 @@ static void ipoib_ib_handle_rx_wc(struct net_device *dev, struct ib_wc *wc)
>>       else
>>               skb->pkt_type = PACKET_MULTICAST;
>>
>> +     sgid = &((struct ib_grh *)skb->data)->sgid;
>> +
>> +     /*
>> +      * Drop packets that this interface sent, ie multicast packets
>> +      * that the HCA has replicated.
>> +      */
>> +     if (wc->slid == priv->local_lid && wc->src_qp == priv->qp->qp_num) {
>> +             int need_repost = 1;
>> +
>> +             if ((wc->wc_flags & IB_WC_GRH) &&
>> +                 memcmp(&sgid->global.interface_id,
>> +                        &priv->local_gid.global.interface_id,
>> +                        sizeof(sgid->global.interface_id)))

> 1. Why can't we do sgid->global.interface_id !=
> priv->local_gid.global.interface_id

You mean get better perf for A != B on 64-bit archs? Yes, we can do
that, I guess.


> 2. Don't we need also to check subnet_prefix? i.e. is it possible to have
> same interface_id on different networks?

We're trying to figure out if this is US (node X, port Y) who sent
this packet, and are looking for the minimal thing to compare on.
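
To make the check under discussion easier to follow, here is a simplified user-space sketch of the drop decision from the quoted hunk: drop when the source LID and QP number match ours, unless a GRH is present and its source interface_id differs from ours. The structs rx_info and local_info and the function is_own_multicast are hypothetical stand-ins, not the IPoIB data structures. The 64-bit values are copied out with memcpy on the assumption that the packet buffer may not be 8-byte aligned.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified view of a receive completion. */
struct rx_info {
	uint16_t slid;         /* source LID from the completion */
	uint32_t src_qp;       /* source QP number from the completion */
	bool     has_grh;      /* completion carried a GRH */
	uint8_t  sgid_ifid[8]; /* interface_id half of the source GID */
};

/* Hypothetical, simplified view of the receiving interface. */
struct local_info {
	uint16_t lid;
	uint32_t qp_num;
	uint8_t  gid_ifid[8];  /* interface_id half of our own GID */
};

/* Return true when the packet looks like our own replicated multicast. */
static bool is_own_multicast(const struct rx_info *rx,
			     const struct local_info *me)
{
	uint64_t src, mine;

	if (rx->slid != me->lid || rx->src_qp != me->qp_num)
		return false;

	/* Without a GRH there is nothing more to compare: treat it as ours. */
	if (!rx->has_grh)
		return true;

	/* Compare the two interface IDs as single 64-bit values. */
	memcpy(&src, rx->sgid_ifid, sizeof(src));
	memcpy(&mine, me->gid_ifid, sizeof(mine));
	return src == mine;
}
```

Only the interface_id is compared here, mirroring the quoted hunk. Whether subnet_prefix should be compared as well is the open question raised in point 2.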
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH for-next 08/10] IB/mlx5: Implement callbacks for manipulating VFs
       [not found]     ` <1456851143-138332-9-git-send-email-eli-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2016-03-06 12:33       ` Yuval Shaia
  0 siblings, 0 replies; 34+ messages in thread
From: Yuval Shaia @ 2016-03-06 12:33 UTC (permalink / raw)
  To: Eli Cohen
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, liranl-VPRAkNaXOzVWk0Htik3J/w

On Tue, Mar 01, 2016 at 06:52:21PM +0200, Eli Cohen wrote:
> Implement the IB defined callbacks used to manipulate the policy for the
> link state, set GUIDs or get statistics information. This functionality
> is added into a new file that will be used to add any SRIOV related
> functionality to the mlx5 IB layer.
> 
> The following callbacks have been added:
> 
> mlx5_ib_get_vf_config
> mlx5_ib_set_vf_link_state
> mlx5_ib_get_vf_stats
> mlx5_ib_set_vf_guid
> 
> In addition, publish whether this device is based on a virtual function.
> 
> In mlx5 supported devices, virtual functions are implemented as vHCAs.
> vHCAs have their own QP number space so it is possible that two vHCAs
> will use a QP with the same number at the same time.
> 
> Signed-off-by: Eli Cohen <eli-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> ---
>  drivers/infiniband/hw/mlx5/Makefile  |   2 +-
>  drivers/infiniband/hw/mlx5/ib_virt.c | 194 +++++++++++++++++++++++++++++++++++
>  drivers/infiniband/hw/mlx5/main.c    |  10 ++
>  drivers/infiniband/hw/mlx5/mlx5_ib.h |   8 ++
>  include/linux/mlx5/driver.h          |   5 +-
>  include/linux/mlx5/mlx5_ifc.h        |   6 ++
>  6 files changed, 223 insertions(+), 2 deletions(-)
>  create mode 100644 drivers/infiniband/hw/mlx5/ib_virt.c
> 
> diff --git a/drivers/infiniband/hw/mlx5/Makefile b/drivers/infiniband/hw/mlx5/Makefile
> index 27a70159e2ea..82e074f47cf2 100644
> --- a/drivers/infiniband/hw/mlx5/Makefile
> +++ b/drivers/infiniband/hw/mlx5/Makefile
> @@ -1,4 +1,4 @@
>  obj-$(CONFIG_MLX5_INFINIBAND)	+= mlx5_ib.o
>  
> -mlx5_ib-y :=	main.o cq.o doorbell.o qp.o mem.o srq.o mr.o ah.o mad.o
> +mlx5_ib-y :=	main.o cq.o doorbell.o qp.o mem.o srq.o mr.o ah.o mad.o ib_virt.o
>  mlx5_ib-$(CONFIG_INFINIBAND_ON_DEMAND_PAGING) += odp.o
> diff --git a/drivers/infiniband/hw/mlx5/ib_virt.c b/drivers/infiniband/hw/mlx5/ib_virt.c
> new file mode 100644
> index 000000000000..c1b9de800fe5
> --- /dev/null
> +++ b/drivers/infiniband/hw/mlx5/ib_virt.c
> @@ -0,0 +1,194 @@
> +/*
> + * Copyright (c) 2016, Mellanox Technologies. All rights reserved.
> + *
> + * This software is available to you under a choice of one of two
> + * licenses.  You may choose to be licensed under the terms of the GNU
> + * General Public License (GPL) Version 2, available from the file
> + * COPYING in the main directory of this source tree, or the
> + * OpenIB.org BSD license below:
> + *
> + *     Redistribution and use in source and binary forms, with or
> + *     without modification, are permitted provided that the following
> + *     conditions are met:
> + *
> + *      - Redistributions of source code must retain the above
> + *        copyright notice, this list of conditions and the following
> + *        disclaimer.
> + *
> + *      - Redistributions in binary form must reproduce the above
> + *        copyright notice, this list of conditions and the following
> + *        disclaimer in the documentation and/or other materials
> + *        provided with the distribution.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
> + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
> + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
> + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
> + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
> + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
> + * SOFTWARE.
> + */
> +
> +#include <linux/module.h>
> +#include <linux/mlx5/vport.h>
> +#include "mlx5_ib.h"
> +
> +static inline u32 mlx_to_net_policy(enum port_state_policy mlx_policy)
> +{
> +	switch (mlx_policy) {
> +	case MLX5_POLICY_DOWN:
> +		return IFLA_VF_LINK_STATE_DISABLE;
> +	case MLX5_POLICY_UP:
> +		return IFLA_VF_LINK_STATE_ENABLE;
> +	case MLX5_POLICY_FOLLOW:
> +		return IFLA_VF_LINK_STATE_AUTO;
> +	default:
> +		return __IFLA_VF_LINK_STATE_MAX;
> +	}
> +}
> +
> +int mlx5_ib_get_vf_config(struct ib_device *device, int vf, u8 port,
> +			  struct ifla_vf_info *info)
> +{
> +	struct mlx5_ib_dev *dev = to_mdev(device);
> +	struct mlx5_core_dev *mdev = dev->mdev;
> +	struct mlx5_hca_vport_context *rep;
> +	int err;
> +
> +	rep = kzalloc(sizeof(*rep), GFP_KERNEL);
Any reason why we can't have this guy on the stack?
> +	if (!rep)
> +		return -ENOMEM;
> +
> +	err = mlx5_query_hca_vport_context(mdev, 1, 1,  vf + 1, rep);
> +	if (err) {
> +		mlx5_ib_warn(dev, "failed to query port policy for vf %d (%d)\n",
> +			     vf, err);
> +		goto free;
> +	}
> +	memset(info, 0, sizeof(*info));
> +	info->linkstate = mlx_to_net_policy(rep->policy);
> +	if (info->linkstate == __IFLA_VF_LINK_STATE_MAX)
> +		err = -EINVAL;
> +
> +free:
> +	kfree(rep);
> +	return err;
> +}
> +
> +static inline enum port_state_policy net_to_mlx_policy(int policy)
> +{
> +	switch (policy) {
> +	case IFLA_VF_LINK_STATE_DISABLE:
> +		return MLX5_POLICY_DOWN;
> +	case IFLA_VF_LINK_STATE_ENABLE:
> +		return MLX5_POLICY_UP;
> +	case IFLA_VF_LINK_STATE_AUTO:
> +		return MLX5_POLICY_FOLLOW;
> +	default:
> +		return MLX5_POLICY_INVALID;
> +	}
> +}
> +
> +int mlx5_ib_set_vf_link_state(struct ib_device *device, int vf,
> +			      u8 port, int state)
> +{
> +	struct mlx5_ib_dev *dev = to_mdev(device);
> +	struct mlx5_core_dev *mdev = dev->mdev;
> +	struct mlx5_hca_vport_context *in;
> +	int err;
> +
> +	in = kzalloc(sizeof(*in), GFP_KERNEL);
Same question here, heap vs. stack
> +	if (!in)
> +		return -ENOMEM;
> +
> +	in->policy = net_to_mlx_policy(state);
> +	if (in->policy == MLX5_POLICY_INVALID) {
> +		err = -EINVAL;
> +		goto out;
> +	}
> +	in->field_select = MLX5_HCA_VPORT_SEL_STATE_POLICY;
> +	err = mlx5_core_modify_hca_vport_context(mdev, 1, 1, vf + 1, in);
> +
> +out:
> +	kfree(in);
> +	return err;
> +}
> +
> +int mlx5_ib_get_vf_stats(struct ib_device *device, int vf,
> +			 u8 port, struct ifla_vf_stats *stats)
> +{
> +	int out_sz = MLX5_ST_SZ_BYTES(query_vport_counter_out);
> +	struct mlx5_core_dev *mdev;
> +	struct mlx5_ib_dev *dev;
> +	void *out;
> +	int err;
> +
> +	dev = to_mdev(device);
> +	mdev = dev->mdev;
> +
> +	out = kzalloc(out_sz, GFP_KERNEL);
heap vs. stack
> +	if (!out)
> +		return -ENOMEM;
> +
> +	err = mlx5_core_query_vport_counter(mdev, true, vf, port, out, out_sz);
> +	if (err)
> +		goto ex;
> +
> +	stats->rx_packets = MLX5_GET64_PR(query_vport_counter_out, out, received_ib_unicast.packets);
> +	stats->tx_packets = MLX5_GET64_PR(query_vport_counter_out, out, transmitted_ib_unicast.packets);
> +	stats->rx_bytes = MLX5_GET64_PR(query_vport_counter_out, out, received_ib_unicast.octets);
> +	stats->tx_bytes = MLX5_GET64_PR(query_vport_counter_out, out, transmitted_ib_unicast.octets);
> +	stats->multicast = MLX5_GET64_PR(query_vport_counter_out, out, received_ib_multicast.packets);
> +
> +ex:
> +	kfree(out);
> +	return err;
> +}
> +
> +static int set_vf_node_guid(struct ib_device *device, int vf, u8 port, u64 guid)
> +{
> +	struct mlx5_ib_dev *dev = to_mdev(device);
> +	struct mlx5_core_dev *mdev = dev->mdev;
> +	struct mlx5_hca_vport_context *in;
> +	int err;
> +
> +	in = kzalloc(sizeof(*in), GFP_KERNEL);
heap vs. stack
> +	if (!in)
> +		return -ENOMEM;
> +
> +	in->field_select = MLX5_HCA_VPORT_SEL_NODE_GUID;
> +	in->node_guid = guid;
> +	err = mlx5_core_modify_hca_vport_context(mdev, 1, 1, vf + 1, in);
> +	kfree(in);
> +	return err;
> +}
> +
> +static int set_vf_port_guid(struct ib_device *device, int vf, u8 port, u64 guid)
> +{
> +	struct mlx5_ib_dev *dev = to_mdev(device);
> +	struct mlx5_core_dev *mdev = dev->mdev;
> +	struct mlx5_hca_vport_context *in;
> +	int err;
> +
> +	in = kzalloc(sizeof(*in), GFP_KERNEL);
heap vs. stack
> +	if (!in)
> +		return -ENOMEM;
> +
> +	in->field_select = MLX5_HCA_VPORT_SEL_PORT_GUID;
> +	in->port_guid = guid;
> +	err = mlx5_core_modify_hca_vport_context(mdev, 1, 1, vf + 1, in);
> +	kfree(in);
> +	return err;
> +}
> +
> +int mlx5_ib_set_vf_guid(struct ib_device *device, int vf, u8 port,
> +			u64 guid, int type)
> +{
> +	if (type == IFLA_VF_IB_NODE_GUID)
> +		return set_vf_node_guid(device, vf, port, guid);
> +	else if (type == IFLA_VF_IB_PORT_GUID)
> +		return set_vf_port_guid(device, vf, port, guid);
> +
> +	return -EINVAL;
> +}
> diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
> index 399049b70a72..00ca1e998254 100644
> --- a/drivers/infiniband/hw/mlx5/main.c
> +++ b/drivers/infiniband/hw/mlx5/main.c
> @@ -549,6 +549,9 @@ static int mlx5_ib_query_device(struct ib_device *ibdev,
>  	if (MLX5_CAP_GEN(mdev, cd))
>  		props->device_cap_flags |= IB_DEVICE_CROSS_CHANNEL;
>  
> +	if (!mlx5_core_is_pf(mdev))
> +		props->device_cap_flags |= IB_DEVICE_VIRTUAL_FUNCTION;
> +
>  	return 0;
>  }
>  
> @@ -686,6 +689,7 @@ static int mlx5_query_hca_port(struct ib_device *ibdev, u8 port,
>  	props->qkey_viol_cntr	= rep->qkey_violation_counter;
>  	props->subnet_timeout	= rep->subnet_timeout;
>  	props->init_type_reply	= rep->init_type_reply;
> +	props->grh_required	= rep->grh_required;
>  
>  	err = mlx5_query_port_link_width_oper(mdev, &ib_link_width_oper, port);
>  	if (err)
> @@ -2266,6 +2270,12 @@ static void *mlx5_ib_add(struct mlx5_core_dev *mdev)
>  	dev->ib_dev.map_mr_sg		= mlx5_ib_map_mr_sg;
>  	dev->ib_dev.check_mr_status	= mlx5_ib_check_mr_status;
>  	dev->ib_dev.get_port_immutable  = mlx5_port_immutable;
> +	if (mlx5_core_is_pf(mdev)) {
> +		dev->ib_dev.get_vf_config	= mlx5_ib_get_vf_config;
> +		dev->ib_dev.set_vf_link_state	= mlx5_ib_set_vf_link_state;
> +		dev->ib_dev.get_vf_stats	= mlx5_ib_get_vf_stats;
> +		dev->ib_dev.set_vf_guid		= mlx5_ib_set_vf_guid;
> +	}
>  
>  	mlx5_ib_internal_fill_odp_caps(dev);
>  
> diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
> index d2b9737baa36..752e3e33bcbc 100644
> --- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
> +++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
> @@ -719,6 +719,14 @@ void mlx5_ib_qp_disable_pagefaults(struct mlx5_ib_qp *qp);
>  void mlx5_ib_qp_enable_pagefaults(struct mlx5_ib_qp *qp);
>  void mlx5_ib_invalidate_range(struct ib_umem *umem, unsigned long start,
>  			      unsigned long end);
> +int mlx5_ib_get_vf_config(struct ib_device *device, int vf,
> +			  u8 port, struct ifla_vf_info *info);
> +int mlx5_ib_set_vf_link_state(struct ib_device *device, int vf,
> +			      u8 port, int state);
> +int mlx5_ib_get_vf_stats(struct ib_device *device, int vf,
> +			 u8 port, struct ifla_vf_stats *stats);
> +int mlx5_ib_set_vf_guid(struct ib_device *device, int vf, u8 port,
> +			u64 guid, int type);
>  
>  #else /* CONFIG_INFINIBAND_ON_DEMAND_PAGING */
>  static inline void mlx5_ib_internal_fill_odp_caps(struct mlx5_ib_dev *dev)
> diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
> index 8edcd08853dd..ea00549e45aa 100644
> --- a/include/linux/mlx5/driver.h
> +++ b/include/linux/mlx5/driver.h
> @@ -613,7 +613,10 @@ struct mlx5_pas {
>  };
>  
>  enum port_state_policy {
> -	MLX5_AAA_000
> +	MLX5_POLICY_DOWN	= 0,
> +	MLX5_POLICY_UP		= 1,
> +	MLX5_POLICY_FOLLOW	= 2,
> +	MLX5_POLICY_INVALID	= 0xffffffff
>  };
>  
>  enum phy_port_state {
> diff --git a/include/linux/mlx5/mlx5_ifc.h b/include/linux/mlx5/mlx5_ifc.h
> index 9f404de5f99b..7a146074d428 100644
> --- a/include/linux/mlx5/mlx5_ifc.h
> +++ b/include/linux/mlx5/mlx5_ifc.h
> @@ -3661,6 +3661,12 @@ struct mlx5_ifc_query_hca_vport_pkey_in_bits {
>  	u8         pkey_index[0x10];
>  };
>  
> +enum {
> +	MLX5_HCA_VPORT_SEL_PORT_GUID	= 1 << 0,
> +	MLX5_HCA_VPORT_SEL_NODE_GUID	= 1 << 1,
> +	MLX5_HCA_VPORT_SEL_STATE_POLICY	= 1 << 2,
> +};
> +
>  struct mlx5_ifc_query_hca_vport_gid_out_bits {
>  	u8         status[0x8];
>  	u8         reserved_at_8[0x18];
> -- 
> 1.8.3.1
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

* Re: [PATCH for-next 01/10] net/core: Add support for configuring VF GUIDs
       [not found]                                 ` <56D99D3D.4000606-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2016-03-07  7:23                                   ` Or Gerlitz
  0 siblings, 0 replies; 34+ messages in thread
From: Or Gerlitz @ 2016-03-07  7:23 UTC (permalink / raw)
  To: Doug Ledford
  Cc: Jason Gunthorpe, Eli Cohen, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	Liran Liss, Linux Netdev List

On Fri, Mar 4, 2016 at 4:35 PM, Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:

[...]
> So, at *best*, the solution you are suggesting for existing setups is a
> partial solution that leaves things only half done.
[...]

Doug,

Points taken. Justifying the re-route of libvirt's set_vf_mac calls on the
PF through set_vf_guid requires handling more aspects, so we will remove
that re-route from this patch.

Or.

* Re: [PATCH for-next 03/10] IB/core: Support accessing SA in virtualized environment
       [not found]                                 ` <56D99D8E.5020900-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2016-03-07  7:52                                   ` Or Gerlitz
  0 siblings, 0 replies; 34+ messages in thread
From: Or Gerlitz @ 2016-03-07  7:52 UTC (permalink / raw)
  To: Doug Ledford
  Cc: Eli Cohen, Jason Gunthorpe, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	Liran Liss

On Fri, Mar 4, 2016 at 4:37 PM, Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> On 03/01/2016 02:46 PM, Eli Cohen wrote:
>> On Tue, Mar 01, 2016 at 12:31:53PM -0700, Jason Gunthorpe wrote:
>>>
>>> I mean opensm supports the GUID Alias scheme for virtualization, this
>>> new virtualization scheme is not compatible, and we shouldn't have the
>>> kernel drop support for existing working SMs, by, eg, replacing the
>>> mlx4 guid alias scheme with this new scheme.
>>>
>>> I'm guessing a user controlled switch is going to be necessary here to
>>> pick GUID alias or port virtualization.

>> The alias GUID mechanism remains and can be used with mlx4 devices.
>> With this scheme the admin configures the port and node GUIDs using
>> iproute2, which ends up in the hardware driver configuring the device. A
>> virtualization-aware SM can read this configuration through MADs.

> If the alias GUID mechanism is to be retained, then we need another NDO
> entry point for setting P_Keys on alias GUID VFs.  If we are going to
> switch over to using iproute2, then the solution needs to be complete.

Doug, as you commented on the net/core patch, iproute2 et al. can't
really be used today to build mlx4-based SRIOV cloud systems. This series
doesn't introduce regressions in the way mlx4 setups are made.

Or.

* Re: [PATCH for-next 01/10] net/core: Add support for configuring VF GUIDs
  2016-03-01 16:52   ` [PATCH for-next 01/10] net/core: Add support for configuring VF GUIDs Eli Cohen
       [not found]     ` <1456851143-138332-2-git-send-email-eli-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2021-10-26 15:16     ` Eugene Syromiatnikov
  1 sibling, 0 replies; 34+ messages in thread
From: Eugene Syromiatnikov @ 2021-10-26 15:16 UTC (permalink / raw)
  To: Eli Cohen; +Cc: linux-rdma, Liran Liss, Or Gerlitz, Doug Ledford, netdev

On Tue, Mar 01, 2016 at 06:52:14PM +0200, Eli Cohen wrote:
> +struct ifla_vf_guid {
> +	__u32 vf;
> +	__u64 guid;
> +};

This type definition differs in size on 64-bit and (most) 32-bit
architectures and, as a result, breaks 32-on-64-bit compat applications.
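
To illustrate the size difference, here is a minimal userspace sketch. It
assumes the usual i386 and x86-64 alignment rules, uses uint32_t/uint64_t
in place of __u32/__u64, and the struct names and the rsvd field are
hypothetical (they are not part of the patch). The padded variant only
shows one way a layout can be made ABI-independent; it is not a proposed
change to the posted uapi definition.

```c
/* Userspace sketch; uint32_t/uint64_t stand in for the kernel's __u32/__u64. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same member order as the quoted struct ifla_vf_guid. */
struct vf_guid_as_posted {
	uint32_t vf;
	uint64_t guid;	/* 4-byte aligned on i386, 8-byte aligned on x86-64 */
};

/* Hypothetical alternative: explicit padding leaves nothing to the ABI. */
struct vf_guid_padded {
	uint32_t vf;
	uint32_t rsvd;	/* hypothetical reserved field, not in the patch */
	uint64_t guid;
};

/* These hold under both sets of alignment rules. */
static_assert(sizeof(struct vf_guid_padded) == 16, "fixed size");
static_assert(offsetof(struct vf_guid_padded, guid) == 8, "fixed offset");

/* Size of the posted layout: 12 with i386 rules, 16 with x86-64 rules. */
static inline size_t vf_guid_as_posted_size(void)
{
	return sizeof(struct vf_guid_as_posted);
}
```

Building the same file with and without -m32 (where a 32-bit toolchain is
available) and comparing vf_guid_as_posted_size() shows the mismatch that
a 32-bit process on a 64-bit kernel runs into.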



end of thread, other threads:[~2021-10-26 15:16 UTC | newest]

Thread overview: 34+ messages
2016-03-01 16:52 [PATCH for-next 00/10] IB SR-IOV support Eli Cohen
     [not found] ` <1456851143-138332-1-git-send-email-eli-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2016-03-01 16:52   ` [PATCH for-next 01/10] net/core: Add support for configuring VF GUIDs Eli Cohen
     [not found]     ` <1456851143-138332-2-git-send-email-eli-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2016-03-01 17:37       ` Jason Gunthorpe
     [not found]         ` <20160301173751.GA25176-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2016-03-01 17:49           ` Eli Cohen
     [not found]             ` <20160301174951.GA19366-lgQlq6cFzJSjLWYaRI30zHI+JuX82XLG@public.gmane.org>
2016-03-01 18:25               ` Jason Gunthorpe
     [not found]                 ` <20160301182516.GA12495-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2016-03-01 21:08                   ` Or Gerlitz
     [not found]                     ` <CAJ3xEMgrAUCj7PS6fegmuSUsjMruH3gzSHZmuzAX+ZbHZOpL9w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2016-03-02 16:50                       ` Doug Ledford
     [not found]                         ` <56D719E3.2000206-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2016-03-02 18:40                           ` Or Gerlitz
     [not found]                             ` <CAJ3xEMh5vJAZVO03=rRVCvqqXzXvah3idrMtMQfFP-wBxR7R_Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2016-03-04 14:35                               ` Doug Ledford
     [not found]                                 ` <56D99D3D.4000606-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2016-03-07  7:23                                   ` Or Gerlitz
2021-10-26 15:16     ` Eugene Syromiatnikov
2016-03-01 16:52   ` [PATCH for-next 02/10] IB/mlx5: Fix decision on using MAD_IFC Eli Cohen
2016-03-01 16:52   ` [PATCH for-next 03/10] IB/core: Support accessing SA in virtualized environment Eli Cohen
     [not found]     ` <1456851143-138332-4-git-send-email-eli-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2016-03-01 17:44       ` Jason Gunthorpe
     [not found]         ` <20160301174401.GC25176-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2016-03-01 18:17           ` Eli Cohen
     [not found]             ` <20160301181742.GB19366-lgQlq6cFzJSjLWYaRI30zHI+JuX82XLG@public.gmane.org>
2016-03-01 18:32               ` Jason Gunthorpe
     [not found]                 ` <20160301183256.GB12495-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2016-03-01 19:07                   ` Eli Cohen
     [not found]                     ` <20160301190742.GC19366-lgQlq6cFzJSjLWYaRI30zHI+JuX82XLG@public.gmane.org>
2016-03-01 19:31                       ` Jason Gunthorpe
     [not found]                         ` <20160301193153.GA25755-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2016-03-01 19:46                           ` Eli Cohen
     [not found]                             ` <20160301194608.GF19366-lgQlq6cFzJSjLWYaRI30zHI+JuX82XLG@public.gmane.org>
2016-03-01 21:15                               ` Or Gerlitz
2016-03-04 14:37                               ` Doug Ledford
     [not found]                                 ` <56D99D8E.5020900-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2016-03-07  7:52                                   ` Or Gerlitz
2016-03-01 16:52   ` [PATCH for-next 04/10] IB/core: Add interfaces to control VF attributes Eli Cohen
2016-03-01 16:52   ` [PATCH for-next 05/10] IB/ipoib: Add ndo operations for configuring VFs Eli Cohen
2016-03-01 16:52   ` [PATCH for-next 06/10] net/mlx5_core: Add VF param when querying vport counter Eli Cohen
2016-03-01 16:52   ` [PATCH for-next 07/10] net/mlx5_core: Implement modify HCA vport command Eli Cohen
2016-03-01 16:52   ` [PATCH for-next 08/10] IB/mlx5: Implement callbacks for manipulating VFs Eli Cohen
     [not found]     ` <1456851143-138332-9-git-send-email-eli-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2016-03-06 12:33       ` Yuval Shaia
2016-03-01 16:52   ` [PATCH for-next 09/10] IB/ipoib: Allow mcast packets from other VFs Eli Cohen
     [not found]     ` <1456851143-138332-10-git-send-email-eli-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2016-03-06 11:50       ` Yuval Shaia
     [not found]         ` <20160306115006.GA23975-Hxa29pjIrETlQW142y8m19+IiqhCXseY@public.gmane.org>
2016-03-06 12:13           ` Or Gerlitz
2016-03-01 16:52   ` [PATCH for-next 10/10] IB/core: Use GRH when the path hop-limit > 0 Eli Cohen
     [not found]     ` <1456851143-138332-11-git-send-email-eli-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2016-03-01 17:38       ` Jason Gunthorpe
     [not found]         ` <20160301173846.GB25176-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2016-03-03 15:55           ` Doug Ledford
