* [PATCH V2 for-next 0/4] Add receive Flow Steering support
@ 2013-06-26 12:57 Or Gerlitz
       [not found] ` <1372251464-13394-1-git-send-email-ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: Or Gerlitz @ 2013-06-26 12:57 UTC (permalink / raw)
  To: roland-DgEjT+Ai2ygdnm+yROfE0A
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, hadarh-VPRAkNaXOzVWk0Htik3J/w,
	ogerlitz-VPRAkNaXOzVWk0Htik3J/w, matanb-VPRAkNaXOzVWk0Htik3J/w

Hi Roland, all

These patches add Flow Steering support to the kernel IB core, to uverbs and
to the mlx4 IB (verbs) driver, along with one uverbs patch which adds
infrastructure to support extensions.

  IB/core: Add receive Flow Steering support
  IB/core: Infra-structure to support verbs extensions through uverbs
  IB/core: Export ib_create/destroy_flow through uverbs
  IB/mlx4: Add receive Flow Steering support

The main patch which introduces the Flow Steering API is "IB/core: Add receive Flow
Steering support"; see its change log. Looking at the "Network Adapter Flow Steering"
slides from Tzahi Oved, which he presented at the annual OFA 2012 meeting, could be helpful:
https://www.openfabrics.org/resources/document-downloads/presentations/doc_download/518-network-adapter-flow-steering.html

At a high level, V2 fixes an issue found by Roland during review. There are some
open questions posed by Roland on verbs extensions; we will take those up on the list
as a response to another thread where they were raised.

V2 changes:
  - dropped struct ib_kern_flow from patch #3, this structure wasn't 
    used and was left there by mistake (bug, thanks Roland)

  - removed the void *flow_context field from struct ib_flow; this was
    pointing to driver private data for that flow, which doesn't belong here,
    i.e. it need not be seen by the verbs consumer and should rather be hidden.

  - renamed struct mlx4_flow_handle to mlx4_ib_flow, a structure that contains
    the verbs level struct ib_flow and the mlx4 registration ID for that flow

V1 changes:

 - dropped the five pre-patches which were accepted into 3.10
 - rebased the patches against Roland's for-next / 3.10-rc4
 - in patch #3, ib_uverbs_destroy_flow was returning too quickly when the driver
   returned failure from ib_destroy_flow; some uverbs resources need to be freed first.
 - in patch #4, check the index before accessing the array in mlx4_ib_create/destroy_flow

V0 was acknowledged by Steve and Christoph, and also got positive feedback from
Sean and Jason in face-to-face talks we had during the Linux Foundation EU summit last month.

Or.


Hadar Hen Zion (3):
  IB/core: Add receive Flow Steering support
  IB/core: Export ib_create/destroy_flow through uverbs
  IB/mlx4: Add receive Flow Steering support

Igor Ivanov (1):
  IB/core: Infra-structure to support verbs extensions through uverbs

 drivers/infiniband/core/uverbs.h      |    3 +
 drivers/infiniband/core/uverbs_cmd.c  |  206 +++++++++++++++++++++++++++
 drivers/infiniband/core/uverbs_main.c |   42 +++++-
 drivers/infiniband/core/verbs.c       |   30 ++++
 drivers/infiniband/hw/mlx4/main.c     |  244 +++++++++++++++++++++++++++++++++
 drivers/infiniband/hw/mlx4/mlx4_ib.h  |   12 ++
 include/linux/mlx4/device.h           |    5 -
 include/rdma/ib_verbs.h               |  136 ++++++++++++++++++-
 include/uapi/rdma/ib_user_verbs.h     |  112 +++++++++++++++-
 9 files changed, 776 insertions(+), 14 deletions(-)

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH V2 for-next 1/4] IB/core: Add receive Flow Steering support
       [not found] ` <1372251464-13394-1-git-send-email-ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2013-06-26 12:57   ` Or Gerlitz
       [not found]     ` <1372251464-13394-2-git-send-email-ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2013-06-26 12:57   ` [PATCH V2 for-next 2/4] IB/core: Infra-structure to support verbs extensions through uverbs Or Gerlitz
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 17+ messages in thread
From: Or Gerlitz @ 2013-06-26 12:57 UTC (permalink / raw)
  To: roland-DgEjT+Ai2ygdnm+yROfE0A
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, hadarh-VPRAkNaXOzVWk0Htik3J/w,
	ogerlitz-VPRAkNaXOzVWk0Htik3J/w, matanb-VPRAkNaXOzVWk0Htik3J/w

From: Hadar Hen Zion <hadarh-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

The RDMA stack allows applications to create IB_QPT_RAW_PACKET QPs, which
carry plain Ethernet packets, specifically packets which don't carry any
QPN to be matched by the receiving side.

Applications using these QPs must be provided with a method to
program some steering rule with the HW so packets arriving at
the local port can be routed to them.

This patch adds ib_create_flow, which allows providing a flow specification
for a QP: when there's a match between the specification and a received
packet, the packet is forwarded to that QP. This is similar to the way one
uses ib_attach_multicast for IB UD multicast handling.

Flow specifications are provided as instances of struct ib_flow_spec_yyy,
which describe L2, L3 and L4 headers; currently, specs for Ethernet, IPv4,
TCP, UDP and IB are defined. Flow specs are made of values and masks.

The input to ib_create_flow is an instance of struct ib_flow_attr, which
contains a few mandatory control elements and optional flow specs.

struct ib_flow_attr {
	enum ib_flow_attr_type type;
	u16      size;
	u16      priority;
	u8       num_of_specs;
	u8       port;
	u32      flags;
	/* Following are the optional layers according to user request
	 * struct ib_flow_spec_yyy
	 * struct ib_flow_spec_zzz
	 */
};

As these specs eventually come from user space, they are defined and used
in a way which allows adding new spec types without a kernel/user ABI
change, with only a small API enhancement that defines the newly added spec.

The flow spec structures are defined in a TLV (Type-Length-Value) manner,
which allows calling ib_create_flow with a variable-length list of
optional specs.

For the actual processing of ib_flow_attr, the driver uses the mandatory
num_of_specs and size fields along with the TLV nature of the specs.
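As a rough illustration of the TLV walk described above, here is a hypothetical
userspace model (not the kernel code itself) of how a parser can advance through
a variable-length spec list using only each spec's size field; the struct name
and layout are illustrative stand-ins:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical userspace model of the TLV spec list: each spec starts
 * with a (type, size) header, and size covers the whole spec, so the
 * parser can skip over specs without knowing their full layout. */
struct spec_hdr {
	uint32_t type;
	uint16_t size;	/* total size of this spec in bytes */
};

/* Walk num_of_specs TLV entries and return the total byte length,
 * as a driver would when computing the overall attr size. */
static size_t sum_spec_sizes(const uint8_t *buf, int num_of_specs)
{
	size_t total = 0;
	struct spec_hdr h;
	int i;

	for (i = 0; i < num_of_specs; i++) {
		memcpy(&h, buf + total, sizeof(h)); /* header may be unaligned */
		total += h.size;	/* advance by the spec's own size */
	}
	return total;
}
```

This is only a sketch of the parsing idea; the kernel walks the real
struct _ib_flow_spec entries shown in the diff below.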

Steering rule processing order is according to rule priority. The user
sets the 12 low-order bits of the priority field, and the remaining
4 high-order bits are set by the kernel according to the domain to which
the application or layer that created the rule belongs. A lower numerical
priority value means higher priority.
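To illustrate the 12/4 bit split described above, here is a hedged userspace
sketch. The enum names mirror the patch, but the exact encoding below is an
assumption for illustration; the real driver (e.g. mlx4) maps domains through
its own table when programming the hardware:

```c
#include <assert.h>
#include <stdint.h>

/* Domains in ascending numeric value; a lower domain value means
 * higher priority, so user rules here outrank NIC default rules. */
enum flow_domain {
	FLOW_DOMAIN_USER,	/* 0: lowest value, highest priority */
	FLOW_DOMAIN_ETHTOOL,
	FLOW_DOMAIN_RFS,
	FLOW_DOMAIN_NIC,	/* 3: lowest priority domain */
};

/* Illustrative encoding: the user supplies the 12 low-order bits and
 * the kernel fills the 4 high-order bits from the domain, so a lower
 * combined value always means higher priority. */
static uint16_t encode_priority(enum flow_domain domain, uint16_t user_prio)
{
	return (uint16_t)((domain << 12) | (user_prio & 0x0FFF));
}
```

Note how masking with 0x0FFF keeps a too-large user priority from spilling
into the domain bits.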

The value returned from ib_create_flow is an instance of struct ib_flow,
which contains a database pointer (handle) provided by the HW driver
to be used when calling ib_destroy_flow.

Applications that offload TCP/IP traffic could also be written over IB UD QPs.
As such, the ib_create_flow / ib_destroy_flow API is designed to support UD QPs
too. The HW driver sets IB_DEVICE_MANAGED_FLOW_STEERING to denote support
for flow steering.

The enum ib_flow_attr_type relates to usage of flow steering for promiscuous
and sniffer purposes:

IB_FLOW_ATTR_NORMAL - "regular" rule, steering according to rule specification

IB_FLOW_ATTR_ALL_DEFAULT - default unicast and multicast rule, receive
all Ethernet traffic which isn't steered to any QP

IB_FLOW_ATTR_MC_DEFAULT - same as IB_FLOW_ATTR_ALL_DEFAULT but only for multicast

IB_FLOW_ATTR_SNIFFER - sniffer rule, receive all port traffic

The ALL_DEFAULT and MC_DEFAULT rule options are valid only for the Ethernet link type.
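As a usage illustration, the following standalone sketch lays out an attr
followed by one Ethernet spec in a single contiguous allocation, the shape a
consumer would hand to ib_create_flow. The types here are simplified
userspace stand-ins for the structures in the patch (not the kernel
definitions themselves); 0x20 follows the IB_FLOW_SPEC_ETH value from the diff:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for struct ib_flow_attr: size covers the attr
 * plus all trailing specs, num_of_specs says how many follow. */
struct flow_attr {
	uint32_t type;
	uint16_t size;
	uint16_t priority;
	uint8_t  num_of_specs;
	uint8_t  port;
	uint32_t flags;
};

struct eth_filter {
	uint8_t  dst_mac[6];
	uint8_t  src_mac[6];
	uint16_t ether_type;
	uint16_t vlan_tag;
};

/* Simplified stand-in for struct ib_flow_spec_eth. */
struct spec_eth {
	uint32_t type;		/* 0x20 == IB_FLOW_SPEC_ETH in the patch */
	uint16_t size;
	struct eth_filter val;
	struct eth_filter mask;
};

/* Build a "normal" rule matching a full destination MAC on port 1. */
static struct flow_attr *build_eth_rule(const uint8_t dst_mac[6])
{
	struct flow_attr *attr;
	struct spec_eth *eth;

	attr = calloc(1, sizeof(*attr) + sizeof(*eth));
	if (!attr)
		return NULL;
	attr->type = 0;			/* IB_FLOW_ATTR_NORMAL */
	attr->num_of_specs = 1;
	attr->port = 1;
	attr->size = sizeof(*attr) + sizeof(*eth);

	eth = (struct spec_eth *)(attr + 1);	/* spec trails the attr */
	eth->type = 0x20;
	eth->size = sizeof(*eth);
	memcpy(eth->val.dst_mac, dst_mac, 6);
	memset(eth->mask.dst_mac, 0xff, 6);	/* match the full dst MAC */
	return attr;
}
```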

Signed-off-by: Hadar Hen Zion <hadarh-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/core/verbs.c |   30 +++++++++
 include/rdma/ib_verbs.h         |  135 ++++++++++++++++++++++++++++++++++++++-
 2 files changed, 163 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index 22192de..932f4a7 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -1254,3 +1254,33 @@ int ib_dealloc_xrcd(struct ib_xrcd *xrcd)
 	return xrcd->device->dealloc_xrcd(xrcd);
 }
 EXPORT_SYMBOL(ib_dealloc_xrcd);
+
+struct ib_flow *ib_create_flow(struct ib_qp *qp,
+			       struct ib_flow_attr *flow_attr,
+			       int domain)
+{
+	struct ib_flow *flow_id;
+	if (!qp->device->create_flow)
+		return ERR_PTR(-ENOSYS);
+
+	flow_id = qp->device->create_flow(qp, flow_attr, domain);
+	if (!IS_ERR(flow_id))
+		atomic_inc(&qp->usecnt);
+	return flow_id;
+}
+EXPORT_SYMBOL(ib_create_flow);
+
+int ib_destroy_flow(struct ib_flow *flow_id)
+{
+	int err;
+	struct ib_qp *qp = flow_id->qp;
+
+	if (!flow_id->qp->device->destroy_flow)
+		return -ENOSYS;
+
+	err = qp->device->destroy_flow(flow_id);
+	if (!err)
+		atomic_dec(&qp->usecnt);
+	return err;
+}
+EXPORT_SYMBOL(ib_destroy_flow);
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 98cc4b2..8e18d17 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -116,7 +116,8 @@ enum ib_device_cap_flags {
 	IB_DEVICE_MEM_MGT_EXTENSIONS	= (1<<21),
 	IB_DEVICE_BLOCK_MULTICAST_LOOPBACK = (1<<22),
 	IB_DEVICE_MEM_WINDOW_TYPE_2A	= (1<<23),
-	IB_DEVICE_MEM_WINDOW_TYPE_2B	= (1<<24)
+	IB_DEVICE_MEM_WINDOW_TYPE_2B	= (1<<24),
+	IB_DEVICE_MANAGED_FLOW_STEERING = (1<<29)
 };
 
 enum ib_atomic_cap {
@@ -1002,7 +1003,8 @@ struct ib_qp {
 	struct ib_srq	       *srq;
 	struct ib_xrcd	       *xrcd; /* XRC TGT QPs only */
 	struct list_head	xrcd_list;
-	atomic_t		usecnt; /* count times opened, mcast attaches */
+	/* count times opened, mcast attaches, flow attaches */
+	atomic_t		usecnt;
 	struct list_head	open_list;
 	struct ib_qp           *real_qp;
 	struct ib_uobject      *uobject;
@@ -1037,6 +1039,126 @@ struct ib_fmr {
 	u32			rkey;
 };
 
+/* Supported steering options */
+enum ib_flow_attr_type {
+	/* steering according to rule specifications */
+	IB_FLOW_ATTR_NORMAL		= 0x0,
+	/* default unicast and multicast rule -
+	 * receive all Eth traffic which isn't steered to any QP
+	 */
+	IB_FLOW_ATTR_ALL_DEFAULT	= 0x1,
+	/* default multicast rule -
+	 * receive all Eth multicast traffic which isn't steered to any QP
+	 */
+	IB_FLOW_ATTR_MC_DEFAULT		= 0x2,
+	/* sniffer rule - receive all port traffic */
+	IB_FLOW_ATTR_SNIFFER		= 0x3
+};
+
+/* Supported steering header types */
+enum ib_flow_spec_type {
+	/* L2 headers*/
+	IB_FLOW_SPEC_ETH	= 0x20,
+	IB_FLOW_SPEC_IB		= 0x21,
+	/* L3 header*/
+	IB_FLOW_SPEC_IPV4	= 0x30,
+	/* L4 headers*/
+	IB_FLOW_SPEC_TCP	= 0x40,
+	IB_FLOW_SPEC_UDP	= 0x41
+};
+
+/* Flow steering rule priority is set according to its domain.
+ * Lower domain value means higher priority.
+ */
+enum ib_flow_domain {
+	IB_FLOW_DOMAIN_USER,
+	IB_FLOW_DOMAIN_ETHTOOL,
+	IB_FLOW_DOMAIN_RFS,
+	IB_FLOW_DOMAIN_NIC,
+	IB_FLOW_DOMAIN_NUM /* Must be last */
+};
+
+struct ib_flow_eth_filter {
+	u8	dst_mac[6];
+	u8	src_mac[6];
+	__be16	ether_type;
+	__be16	vlan_tag;
+};
+
+struct ib_flow_spec_eth {
+	enum ib_flow_spec_type	  type;
+	u16			  size;
+	struct ib_flow_eth_filter val;
+	struct ib_flow_eth_filter mask;
+};
+
+struct ib_flow_ib_filter {
+	__be32	l3_type_qpn;
+	u8	dst_gid[16];
+};
+
+struct ib_flow_spec_ib {
+	enum ib_flow_spec_type	 type;
+	u16			 size;
+	struct ib_flow_ib_filter val;
+	struct ib_flow_ib_filter mask;
+};
+
+struct ib_flow_ipv4_filter {
+	__be32	src_ip;
+	__be32	dst_ip;
+};
+
+struct ib_flow_spec_ipv4 {
+	enum ib_flow_spec_type	   type;
+	u16			   size;
+	struct ib_flow_ipv4_filter val;
+	struct ib_flow_ipv4_filter mask;
+};
+
+struct ib_flow_tcp_udp_filter {
+	__be16	dst_port;
+	__be16	src_port;
+};
+
+struct ib_flow_spec_tcp_udp {
+	enum ib_flow_spec_type	      type;
+	u16			      size;
+	struct ib_flow_tcp_udp_filter val;
+	struct ib_flow_tcp_udp_filter mask;
+};
+
+struct _ib_flow_spec {
+	union {
+		struct {
+			enum ib_flow_spec_type	type;
+			u16			size;
+		};
+		struct ib_flow_spec_ib ib;
+		struct ib_flow_spec_eth eth;
+		struct ib_flow_spec_ipv4 ipv4;
+		struct ib_flow_spec_tcp_udp tcp_udp;
+	};
+};
+
+struct ib_flow_attr {
+	enum ib_flow_attr_type type;
+	u16	     size;
+	u16	     priority;
+	u8	     num_of_specs;
+	u8	     port;
+	u32	     flags;
+	/* Following are the optional layers according to user request
+	 * struct ib_flow_spec_xxx
+	 * struct ib_flow_spec_yyy
+	 */
+};
+
+struct ib_flow {
+	struct ib_qp		*qp;
+	struct ib_uobject	*uobject;
+};
+
 struct ib_mad;
 struct ib_grh;
 
@@ -1269,6 +1391,11 @@ struct ib_device {
 						 struct ib_ucontext *ucontext,
 						 struct ib_udata *udata);
 	int			   (*dealloc_xrcd)(struct ib_xrcd *xrcd);
+	struct ib_flow *	   (*create_flow)(struct ib_qp *qp,
+						  struct ib_flow_attr
+						  *flow_attr,
+						  int domain);
+	int			   (*destroy_flow)(struct ib_flow *flow_id);
 
 	struct ib_dma_mapping_ops   *dma_ops;
 
@@ -2229,4 +2356,8 @@ struct ib_xrcd *ib_alloc_xrcd(struct ib_device *device);
  */
 int ib_dealloc_xrcd(struct ib_xrcd *xrcd);
 
+struct ib_flow *ib_create_flow(struct ib_qp *qp,
+			       struct ib_flow_attr *flow_attr, int domain);
+int ib_destroy_flow(struct ib_flow *flow_id);
+
 #endif /* IB_VERBS_H */
-- 
1.7.1


* [PATCH V2 for-next 2/4] IB/core: Infra-structure to support verbs extensions through uverbs
       [not found] ` <1372251464-13394-1-git-send-email-ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2013-06-26 12:57   ` [PATCH V2 for-next 1/4] IB/core: " Or Gerlitz
@ 2013-06-26 12:57   ` Or Gerlitz
       [not found]     ` <1372251464-13394-3-git-send-email-ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2013-06-26 12:57   ` [PATCH V2 for-next 3/4] IB/core: Export ib_create/destroy_flow " Or Gerlitz
  2013-06-26 12:57   ` [PATCH V2 for-next 4/4] IB/mlx4: Add receive Flow Steering support Or Gerlitz
  3 siblings, 1 reply; 17+ messages in thread
From: Or Gerlitz @ 2013-06-26 12:57 UTC (permalink / raw)
  To: roland-DgEjT+Ai2ygdnm+yROfE0A
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, hadarh-VPRAkNaXOzVWk0Htik3J/w,
	ogerlitz-VPRAkNaXOzVWk0Htik3J/w, matanb-VPRAkNaXOzVWk0Htik3J/w,
	Igor Ivanov

From: Igor Ivanov <Igor.Ivanov-wN0M4riKYwLQT0dZR+AlfA@public.gmane.org>

Add infrastructure to support extended uverbs capabilities in a forward/backward
compatible manner. Uverbs command opcodes which are based on the verbs extensions
approach should be greater than or equal to IB_USER_VERBS_CMD_THRESHOLD. They have
a new header format and are processed a bit differently.
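The length check for extended commands can be sketched in userspace as follows.
This is a simplified model of the kernel-side validation, not the kernel code;
the struct layout mirrors ib_uverbs_cmd_hdr_ex from the diff below:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace mirror of the extended command header: core word counts
 * plus per-provider word counts, all in 4-byte words. */
struct cmd_hdr_ex {
	uint32_t command;
	uint16_t in_words;
	uint16_t out_words;
	uint16_t provider_in_words;
	uint16_t provider_out_words;
	uint32_t cmd_hdr_reserved;
};

/* Sketch of the kernel-side check: the total input words (core plus
 * provider) times 4 must equal the number of bytes written. */
static int check_ex_len(const struct cmd_hdr_ex *hdr, size_t count)
{
	return (size_t)(hdr->in_words + hdr->provider_in_words) * 4 == count;
}
```

For plain (pre-threshold) commands the check stays `in_words * 4 == count`,
as in the else branch of the diff.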

Signed-off-by: Igor Ivanov <Igor.Ivanov-wN0M4riKYwLQT0dZR+AlfA@public.gmane.org>
Signed-off-by: Hadar Hen Zion <hadarh-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/core/uverbs_main.c |   29 ++++++++++++++++++++++++-----
 include/uapi/rdma/ib_user_verbs.h     |   10 ++++++++++
 2 files changed, 34 insertions(+), 5 deletions(-)

diff --git a/drivers/infiniband/core/uverbs_main.c b/drivers/infiniband/core/uverbs_main.c
index 2c6f0f2..e4e7b24 100644
--- a/drivers/infiniband/core/uverbs_main.c
+++ b/drivers/infiniband/core/uverbs_main.c
@@ -583,9 +583,6 @@ static ssize_t ib_uverbs_write(struct file *filp, const char __user *buf,
 	if (copy_from_user(&hdr, buf, sizeof hdr))
 		return -EFAULT;
 
-	if (hdr.in_words * 4 != count)
-		return -EINVAL;
-
 	if (hdr.command >= ARRAY_SIZE(uverbs_cmd_table) ||
 	    !uverbs_cmd_table[hdr.command])
 		return -EINVAL;
@@ -597,8 +594,30 @@ static ssize_t ib_uverbs_write(struct file *filp, const char __user *buf,
 	if (!(file->device->ib_dev->uverbs_cmd_mask & (1ull << hdr.command)))
 		return -ENOSYS;
 
-	return uverbs_cmd_table[hdr.command](file, buf + sizeof hdr,
-					     hdr.in_words * 4, hdr.out_words * 4);
+	if (hdr.command >= IB_USER_VERBS_CMD_THRESHOLD) {
+		struct ib_uverbs_cmd_hdr_ex hdr_ex;
+
+		if (copy_from_user(&hdr_ex, buf, sizeof(hdr_ex)))
+			return -EFAULT;
+
+		if (((hdr_ex.in_words + hdr_ex.provider_in_words) * 4) != count)
+			return -EINVAL;
+
+		return uverbs_cmd_table[hdr.command](file,
+						     buf + sizeof(hdr_ex),
+						     (hdr_ex.in_words +
+						      hdr_ex.provider_in_words) * 4,
+						     (hdr_ex.out_words +
+						      hdr_ex.provider_out_words) * 4);
+	} else {
+		if (hdr.in_words * 4 != count)
+			return -EINVAL;
+
+		return uverbs_cmd_table[hdr.command](file,
+						     buf + sizeof(hdr),
+						     hdr.in_words * 4,
+						     hdr.out_words * 4);
+	}
 }
 
 static int ib_uverbs_mmap(struct file *filp, struct vm_area_struct *vma)
diff --git a/include/uapi/rdma/ib_user_verbs.h b/include/uapi/rdma/ib_user_verbs.h
index 805711e..61535aa 100644
--- a/include/uapi/rdma/ib_user_verbs.h
+++ b/include/uapi/rdma/ib_user_verbs.h
@@ -43,6 +43,7 @@
  * compatibility are made.
  */
 #define IB_USER_VERBS_ABI_VERSION	6
+#define IB_USER_VERBS_CMD_THRESHOLD    50
 
 enum {
 	IB_USER_VERBS_CMD_GET_CONTEXT,
@@ -123,6 +124,15 @@ struct ib_uverbs_cmd_hdr {
 	__u16 out_words;
 };
 
+struct ib_uverbs_cmd_hdr_ex {
+	__u32 command;
+	__u16 in_words;
+	__u16 out_words;
+	__u16 provider_in_words;
+	__u16 provider_out_words;
+	__u32 cmd_hdr_reserved;
+};
+
 struct ib_uverbs_get_context {
 	__u64 response;
 	__u64 driver_data[0];
-- 
1.7.1


* [PATCH V2 for-next 3/4] IB/core: Export ib_create/destroy_flow through uverbs
       [not found] ` <1372251464-13394-1-git-send-email-ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2013-06-26 12:57   ` [PATCH V2 for-next 1/4] IB/core: " Or Gerlitz
  2013-06-26 12:57   ` [PATCH V2 for-next 2/4] IB/core: Infra-structure to support verbs extensions through uverbs Or Gerlitz
@ 2013-06-26 12:57   ` Or Gerlitz
  2013-06-26 12:57   ` [PATCH V2 for-next 4/4] IB/mlx4: Add receive Flow Steering support Or Gerlitz
  3 siblings, 0 replies; 17+ messages in thread
From: Or Gerlitz @ 2013-06-26 12:57 UTC (permalink / raw)
  To: roland-DgEjT+Ai2ygdnm+yROfE0A
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, hadarh-VPRAkNaXOzVWk0Htik3J/w,
	ogerlitz-VPRAkNaXOzVWk0Htik3J/w, matanb-VPRAkNaXOzVWk0Htik3J/w

From: Hadar Hen Zion <hadarh-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

Implement ib_uverbs_create_flow and ib_uverbs_destroy_flow to
support flow steering for user space applications.

Signed-off-by: Hadar Hen Zion <hadarh-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/core/uverbs.h      |    3 +
 drivers/infiniband/core/uverbs_cmd.c  |  206 +++++++++++++++++++++++++++++++++
 drivers/infiniband/core/uverbs_main.c |   13 ++-
 include/rdma/ib_verbs.h               |    1 +
 include/uapi/rdma/ib_user_verbs.h     |  102 ++++++++++++++++-
 5 files changed, 323 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/core/uverbs.h b/drivers/infiniband/core/uverbs.h
index 0fcd7aa..ad9d102 100644
--- a/drivers/infiniband/core/uverbs.h
+++ b/drivers/infiniband/core/uverbs.h
@@ -155,6 +155,7 @@ extern struct idr ib_uverbs_cq_idr;
 extern struct idr ib_uverbs_qp_idr;
 extern struct idr ib_uverbs_srq_idr;
 extern struct idr ib_uverbs_xrcd_idr;
+extern struct idr ib_uverbs_rule_idr;
 
 void idr_remove_uobj(struct idr *idp, struct ib_uobject *uobj);
 
@@ -215,5 +216,7 @@ IB_UVERBS_DECLARE_CMD(destroy_srq);
 IB_UVERBS_DECLARE_CMD(create_xsrq);
 IB_UVERBS_DECLARE_CMD(open_xrcd);
 IB_UVERBS_DECLARE_CMD(close_xrcd);
+IB_UVERBS_DECLARE_CMD(create_flow);
+IB_UVERBS_DECLARE_CMD(destroy_flow);
 
 #endif /* UVERBS_H */
diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index a7d00f6..956782b 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -54,6 +54,7 @@ static struct uverbs_lock_class qp_lock_class	= { .name = "QP-uobj" };
 static struct uverbs_lock_class ah_lock_class	= { .name = "AH-uobj" };
 static struct uverbs_lock_class srq_lock_class	= { .name = "SRQ-uobj" };
 static struct uverbs_lock_class xrcd_lock_class = { .name = "XRCD-uobj" };
+static struct uverbs_lock_class rule_lock_class = { .name = "RULE-uobj" };
 
 #define INIT_UDATA(udata, ibuf, obuf, ilen, olen)			\
 	do {								\
@@ -330,6 +331,7 @@ ssize_t ib_uverbs_get_context(struct ib_uverbs_file *file,
 	INIT_LIST_HEAD(&ucontext->srq_list);
 	INIT_LIST_HEAD(&ucontext->ah_list);
 	INIT_LIST_HEAD(&ucontext->xrcd_list);
+	INIT_LIST_HEAD(&ucontext->rule_list);
 	ucontext->closing = 0;
 
 	resp.num_comp_vectors = file->device->num_comp_vectors;
@@ -2587,6 +2589,210 @@ out_put:
 	return ret ? ret : in_len;
 }
 
+static int kern_spec_to_ib_spec(struct ib_kern_spec *kern_spec,
+				struct _ib_flow_spec *ib_spec)
+{
+	ib_spec->type = kern_spec->type;
+
+	switch (ib_spec->type) {
+	case IB_FLOW_SPEC_ETH:
+		ib_spec->eth.size = sizeof(struct ib_flow_spec_eth);
+		memcpy(&ib_spec->eth.val, &kern_spec->eth.val,
+		       sizeof(struct ib_flow_eth_filter));
+		memcpy(&ib_spec->eth.mask, &kern_spec->eth.mask,
+		       sizeof(struct ib_flow_eth_filter));
+		break;
+	case IB_FLOW_SPEC_IB:
+		ib_spec->ib.size = sizeof(struct ib_flow_spec_ib);
+		memcpy(&ib_spec->ib.val, &kern_spec->ib.val,
+		       sizeof(struct ib_flow_ib_filter));
+		memcpy(&ib_spec->ib.mask, &kern_spec->ib.mask,
+		       sizeof(struct ib_flow_ib_filter));
+		break;
+	case IB_FLOW_SPEC_IPV4:
+		ib_spec->ipv4.size = sizeof(struct ib_flow_spec_ipv4);
+		memcpy(&ib_spec->ipv4.val, &kern_spec->ipv4.val,
+		       sizeof(struct ib_flow_ipv4_filter));
+		memcpy(&ib_spec->ipv4.mask, &kern_spec->ipv4.mask,
+		       sizeof(struct ib_flow_ipv4_filter));
+		break;
+	case IB_FLOW_SPEC_TCP:
+	case IB_FLOW_SPEC_UDP:
+		ib_spec->tcp_udp.size = sizeof(struct ib_flow_spec_tcp_udp);
+		memcpy(&ib_spec->tcp_udp.val, &kern_spec->tcp_udp.val,
+		       sizeof(struct ib_flow_tcp_udp_filter));
+		memcpy(&ib_spec->tcp_udp.mask, &kern_spec->tcp_udp.mask,
+		       sizeof(struct ib_flow_tcp_udp_filter));
+		break;
+	default:
+		return -EINVAL;
+	}
+	return 0;
+}
+
+ssize_t ib_uverbs_create_flow(struct ib_uverbs_file *file,
+			      const char __user *buf, int in_len,
+			      int out_len)
+{
+	struct ib_uverbs_create_flow	  cmd;
+	struct ib_uverbs_create_flow_resp resp;
+	struct ib_uobject		  *uobj;
+	struct ib_flow			  *flow_id;
+	struct ib_kern_flow_attr	  *kern_flow_attr;
+	struct ib_flow_attr		  *flow_attr;
+	struct ib_qp			  *qp;
+	int err = 0;
+	void *kern_spec;
+	void *ib_spec;
+	int i;
+
+	if (out_len < sizeof(resp))
+		return -ENOSPC;
+
+	if (copy_from_user(&cmd, buf, sizeof(cmd)))
+		return -EFAULT;
+
+	if ((cmd.flow_attr.type == IB_FLOW_ATTR_SNIFFER &&
+	     !capable(CAP_NET_ADMIN)) || !capable(CAP_NET_RAW))
+		return -EPERM;
+
+	if (cmd.flow_attr.num_of_specs) {
+		kern_flow_attr = kmalloc(cmd.flow_attr.size, GFP_KERNEL);
+		if (!kern_flow_attr)
+			return -ENOMEM;
+
+		memcpy(kern_flow_attr, &cmd.flow_attr, sizeof(*kern_flow_attr));
+		if (copy_from_user(kern_flow_attr + 1, buf + sizeof(cmd),
+				   cmd.flow_attr.size - sizeof(cmd))) {
+			err = -EFAULT;
+			goto err_free_attr;
+		}
+	} else {
+		kern_flow_attr = &cmd.flow_attr;
+	}
+
+	uobj = kmalloc(sizeof(*uobj), GFP_KERNEL);
+	if (!uobj) {
+		err = -ENOMEM;
+		goto err_free_attr;
+	}
+	init_uobj(uobj, 0, file->ucontext, &rule_lock_class);
+	down_write(&uobj->mutex);
+
+	qp = idr_read_qp(cmd.qp_handle, file->ucontext);
+	if (!qp) {
+		err = -EINVAL;
+		goto err_uobj;
+	}
+
+	flow_attr = kmalloc(cmd.flow_attr.size, GFP_KERNEL);
+	if (!flow_attr) {
+		err = -ENOMEM;
+		goto err_put;
+	}
+
+	flow_attr->type = kern_flow_attr->type;
+	flow_attr->priority = kern_flow_attr->priority;
+	flow_attr->num_of_specs = kern_flow_attr->num_of_specs;
+	flow_attr->port = kern_flow_attr->port;
+	flow_attr->flags = kern_flow_attr->flags;
+	flow_attr->size = sizeof(*flow_attr);
+
+	kern_spec = kern_flow_attr + 1;
+	ib_spec = flow_attr + 1;
+	for (i = 0; i < flow_attr->num_of_specs; i++) {
+		err = kern_spec_to_ib_spec(kern_spec, ib_spec);
+		if (err)
+			goto err_free;
+		flow_attr->size +=
+			((struct _ib_flow_spec *)ib_spec)->size;
+		kern_spec += ((struct ib_kern_spec *)kern_spec)->size;
+		ib_spec += ((struct _ib_flow_spec *)ib_spec)->size;
+	}
+	flow_id = ib_create_flow(qp, flow_attr, IB_FLOW_DOMAIN_USER);
+	if (IS_ERR(flow_id)) {
+		err = PTR_ERR(flow_id);
+		goto err_free;
+	}
+	flow_id->qp = qp;
+	flow_id->uobject = uobj;
+	uobj->object = flow_id;
+
+	err = idr_add_uobj(&ib_uverbs_rule_idr, uobj);
+	if (err)
+		goto destroy_flow;
+
+	memset(&resp, 0, sizeof(resp));
+	resp.flow_handle = uobj->id;
+
+	if (copy_to_user((void __user *)(unsigned long) cmd.response,
+			 &resp, sizeof(resp))) {
+		err = -EFAULT;
+		goto err_copy;
+	}
+
+	put_qp_read(qp);
+	mutex_lock(&file->mutex);
+	list_add_tail(&uobj->list, &file->ucontext->rule_list);
+	mutex_unlock(&file->mutex);
+
+	uobj->live = 1;
+
+	up_write(&uobj->mutex);
+	kfree(flow_attr);
+	if (cmd.flow_attr.num_of_specs)
+		kfree(kern_flow_attr);
+	return in_len;
+err_copy:
+	idr_remove_uobj(&ib_uverbs_rule_idr, uobj);
+destroy_flow:
+	ib_destroy_flow(flow_id);
+err_free:
+	kfree(flow_attr);
+err_put:
+	put_qp_read(qp);
+err_uobj:
+	put_uobj_write(uobj);
+err_free_attr:
+	if (cmd.flow_attr.num_of_specs)
+		kfree(kern_flow_attr);
+	return err;
+}
+
+ssize_t ib_uverbs_destroy_flow(struct ib_uverbs_file *file,
+			       const char __user *buf, int in_len,
+			       int out_len) {
+	struct ib_uverbs_destroy_flow	cmd;
+	struct ib_flow			*flow_id;
+	struct ib_uobject		*uobj;
+	int				ret;
+
+	if (copy_from_user(&cmd, buf, sizeof(cmd)))
+		return -EFAULT;
+
+	uobj = idr_write_uobj(&ib_uverbs_rule_idr, cmd.flow_handle,
+			      file->ucontext);
+	if (!uobj)
+		return -EINVAL;
+	flow_id = uobj->object;
+
+	ret = ib_destroy_flow(flow_id);
+	if (!ret)
+		uobj->live = 0;
+
+	put_uobj_write(uobj);
+
+	idr_remove_uobj(&ib_uverbs_rule_idr, uobj);
+
+	mutex_lock(&file->mutex);
+	list_del(&uobj->list);
+	mutex_unlock(&file->mutex);
+
+	put_uobj(uobj);
+
+	return ret ? ret : in_len;
+}
+
 static int __uverbs_create_xsrq(struct ib_uverbs_file *file,
 				struct ib_uverbs_create_xsrq *cmd,
 				struct ib_udata *udata)
diff --git a/drivers/infiniband/core/uverbs_main.c b/drivers/infiniband/core/uverbs_main.c
index e4e7b24..75ad86c 100644
--- a/drivers/infiniband/core/uverbs_main.c
+++ b/drivers/infiniband/core/uverbs_main.c
@@ -73,6 +73,7 @@ DEFINE_IDR(ib_uverbs_cq_idr);
 DEFINE_IDR(ib_uverbs_qp_idr);
 DEFINE_IDR(ib_uverbs_srq_idr);
 DEFINE_IDR(ib_uverbs_xrcd_idr);
+DEFINE_IDR(ib_uverbs_rule_idr);
 
 static DEFINE_SPINLOCK(map_lock);
 static DECLARE_BITMAP(dev_map, IB_UVERBS_MAX_DEVICES);
@@ -113,7 +114,9 @@ static ssize_t (*uverbs_cmd_table[])(struct ib_uverbs_file *file,
 	[IB_USER_VERBS_CMD_OPEN_XRCD]		= ib_uverbs_open_xrcd,
 	[IB_USER_VERBS_CMD_CLOSE_XRCD]		= ib_uverbs_close_xrcd,
 	[IB_USER_VERBS_CMD_CREATE_XSRQ]		= ib_uverbs_create_xsrq,
-	[IB_USER_VERBS_CMD_OPEN_QP]		= ib_uverbs_open_qp
+	[IB_USER_VERBS_CMD_OPEN_QP]		= ib_uverbs_open_qp,
+	[IB_USER_VERBS_CMD_CREATE_FLOW]		= ib_uverbs_create_flow,
+	[IB_USER_VERBS_CMD_DESTROY_FLOW]	= ib_uverbs_destroy_flow
 };
 
 static void ib_uverbs_add_one(struct ib_device *device);
@@ -212,6 +215,14 @@ static int ib_uverbs_cleanup_ucontext(struct ib_uverbs_file *file,
 		kfree(uobj);
 	}
 
+	list_for_each_entry_safe(uobj, tmp, &context->rule_list, list) {
+		struct ib_flow *flow_id = uobj->object;
+
+		idr_remove_uobj(&ib_uverbs_rule_idr, uobj);
+		ib_destroy_flow(flow_id);
+		kfree(uobj);
+	}
+
 	list_for_each_entry_safe(uobj, tmp, &context->qp_list, list) {
 		struct ib_qp *qp = uobj->object;
 		struct ib_uqp_object *uqp =
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 8e18d17..0903ce4 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -923,6 +923,7 @@ struct ib_ucontext {
 	struct list_head	srq_list;
 	struct list_head	ah_list;
 	struct list_head	xrcd_list;
+	struct list_head	rule_list;
 	int			closing;
 };
 
diff --git a/include/uapi/rdma/ib_user_verbs.h b/include/uapi/rdma/ib_user_verbs.h
index 61535aa..81bcbf6 100644
--- a/include/uapi/rdma/ib_user_verbs.h
+++ b/include/uapi/rdma/ib_user_verbs.h
@@ -86,7 +86,9 @@ enum {
 	IB_USER_VERBS_CMD_OPEN_XRCD,
 	IB_USER_VERBS_CMD_CLOSE_XRCD,
 	IB_USER_VERBS_CMD_CREATE_XSRQ,
-	IB_USER_VERBS_CMD_OPEN_QP
+	IB_USER_VERBS_CMD_OPEN_QP,
+	IB_USER_VERBS_CMD_CREATE_FLOW = IB_USER_VERBS_CMD_THRESHOLD,
+	IB_USER_VERBS_CMD_DESTROY_FLOW
 };
 
 /*
@@ -694,6 +696,104 @@ struct ib_uverbs_detach_mcast {
 	__u64 driver_data[0];
 };
 
+struct ib_kern_eth_filter {
+	__u8  dst_mac[6];
+	__u8  src_mac[6];
+	__be16 ether_type;
+	__be16 vlan_tag;
+};
+
+struct ib_kern_spec_eth {
+	__u32  type;
+	__u16  size;
+	__u16  reserved;
+	struct ib_kern_eth_filter val;
+	struct ib_kern_eth_filter mask;
+};
+
+struct ib_kern_ib_filter {
+	__be32 l3_type_qpn;
+	__u8  dst_gid[16];
+};
+
+struct ib_kern_spec_ib {
+	__u32  type;
+	__u16  size;
+	__u16  reserved;
+	struct ib_kern_ib_filter val;
+	struct ib_kern_ib_filter mask;
+};
+
+struct ib_kern_ipv4_filter {
+	__be32 src_ip;
+	__be32 dst_ip;
+};
+
+struct ib_kern_spec_ipv4 {
+	__u32  type;
+	__u16  size;
+	__u16  reserved;
+	struct ib_kern_ipv4_filter val;
+	struct ib_kern_ipv4_filter mask;
+};
+
+struct ib_kern_tcp_udp_filter {
+	__be16 dst_port;
+	__be16 src_port;
+};
+
+struct ib_kern_spec_tcp_udp {
+	__u32  type;
+	__u16  size;
+	__u16  reserved;
+	struct ib_kern_tcp_udp_filter val;
+	struct ib_kern_tcp_udp_filter mask;
+};
+
+struct ib_kern_spec {
+	union {
+		struct {
+			__u32 type;
+			__u16 size;
+		};
+		struct ib_kern_spec_ib	    ib;
+		struct ib_kern_spec_eth	    eth;
+		struct ib_kern_spec_ipv4    ipv4;
+		struct ib_kern_spec_tcp_udp tcp_udp;
+	};
+};
+
+struct ib_kern_flow_attr {
+	__u32 type;
+	__u16 size;
+	__u16 priority;
+	__u8  num_of_specs;
+	__u8  reserved[2];
+	__u8  port;
+	__u32 flags;
+	/* Following are the optional layers according to user request
+	 * struct ib_flow_spec_xxx
+	 * struct ib_flow_spec_yyy
+	 */
+};
+
+struct ib_uverbs_create_flow  {
+	__u32 comp_mask;
+	__u64 response;
+	__u32 qp_handle;
+	struct ib_kern_flow_attr flow_attr;
+};
+
+struct ib_uverbs_create_flow_resp {
+	__u32 comp_mask;
+	__u32 flow_handle;
+};
+
+struct ib_uverbs_destroy_flow  {
+	__u32 comp_mask;
+	__u32 flow_handle;
+};
+
 struct ib_uverbs_create_srq {
 	__u64 response;
 	__u64 user_handle;
-- 
1.7.1


* [PATCH V2 for-next 4/4] IB/mlx4: Add receive Flow Steering support
       [not found] ` <1372251464-13394-1-git-send-email-ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
                     ` (2 preceding siblings ...)
  2013-06-26 12:57   ` [PATCH V2 for-next 3/4] IB/core: Export ib_create/destroy_flow " Or Gerlitz
@ 2013-06-26 12:57   ` Or Gerlitz
  3 siblings, 0 replies; 17+ messages in thread
From: Or Gerlitz @ 2013-06-26 12:57 UTC (permalink / raw)
  To: roland-DgEjT+Ai2ygdnm+yROfE0A
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, hadarh-VPRAkNaXOzVWk0Htik3J/w,
	ogerlitz-VPRAkNaXOzVWk0Htik3J/w, matanb-VPRAkNaXOzVWk0Htik3J/w

From: Hadar Hen Zion <hadarh-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

Implement ib_create_flow and ib_destroy_flow.

Translate the verbs structures provided by the user to HW structures
and call the MLX4_QP_FLOW_STEERING_ATTACH/DETACH firmware commands.

On ATTACH command completion, the firmware returns a 64-bit registration
ID, which is placed into struct mlx4_ib_flow, the structure that wraps the
instance of struct ib_flow returned to the caller. Later, this registration
ID is used to detach that flow from the firmware.

Signed-off-by: Hadar Hen Zion <hadarh-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/hw/mlx4/main.c    |  244 ++++++++++++++++++++++++++++++++++
 drivers/infiniband/hw/mlx4/mlx4_ib.h |   12 ++
 include/linux/mlx4/device.h          |    5 -
 3 files changed, 256 insertions(+), 5 deletions(-)

diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c
index a188d31..752c958 100644
--- a/drivers/infiniband/hw/mlx4/main.c
+++ b/drivers/infiniband/hw/mlx4/main.c
@@ -54,6 +54,8 @@
 #define DRV_VERSION	"1.0"
 #define DRV_RELDATE	"April 4, 2008"
 
+#define MLX4_IB_FLOW_MAX_PRIO 0xFFF
+
 MODULE_AUTHOR("Roland Dreier");
 MODULE_DESCRIPTION("Mellanox ConnectX HCA InfiniBand driver");
 MODULE_LICENSE("Dual BSD/GPL");
@@ -88,6 +90,25 @@ static void init_query_mad(struct ib_smp *mad)
 
 static union ib_gid zgid;
 
+static int check_flow_steering_support(struct mlx4_dev *dev)
+{
+	int ib_num_ports = 0;
+	int i;
+
+	mlx4_foreach_port(i, dev, MLX4_PORT_TYPE_IB)
+		ib_num_ports++;
+
+	if (dev->caps.steering_mode == MLX4_STEERING_MODE_DEVICE_MANAGED) {
+		if (ib_num_ports || mlx4_is_mfunc(dev)) {
+			pr_warn("Device managed flow steering is unavailable "
+				"for IB ports or in multifunction env.\n");
+			return 0;
+		}
+		return 1;
+	}
+	return 0;
+}
+
 static int mlx4_ib_query_device(struct ib_device *ibdev,
 				struct ib_device_attr *props)
 {
@@ -144,6 +165,8 @@ static int mlx4_ib_query_device(struct ib_device *ibdev,
 			props->device_cap_flags |= IB_DEVICE_MEM_WINDOW_TYPE_2B;
 		else
 			props->device_cap_flags |= IB_DEVICE_MEM_WINDOW_TYPE_2A;
+	if (check_flow_steering_support(dev->dev))
+		props->device_cap_flags |= IB_DEVICE_MANAGED_FLOW_STEERING;
 	}
 
 	props->vendor_id	   = be32_to_cpup((__be32 *) (out_mad->data + 36)) &
@@ -798,6 +821,218 @@ struct mlx4_ib_steering {
 	union ib_gid gid;
 };
 
+static int parse_flow_attr(struct mlx4_dev *dev,
+			   struct _ib_flow_spec *ib_spec,
+			   struct _rule_hw *mlx4_spec)
+{
+	enum mlx4_net_trans_rule_id type;
+
+	switch (ib_spec->type) {
+	case IB_FLOW_SPEC_ETH:
+		type = MLX4_NET_TRANS_RULE_ID_ETH;
+		memcpy(mlx4_spec->eth.dst_mac, ib_spec->eth.val.dst_mac,
+		       ETH_ALEN);
+		memcpy(mlx4_spec->eth.dst_mac_msk, ib_spec->eth.mask.dst_mac,
+		       ETH_ALEN);
+		mlx4_spec->eth.vlan_tag = ib_spec->eth.val.vlan_tag;
+		mlx4_spec->eth.vlan_tag_msk = ib_spec->eth.mask.vlan_tag;
+		break;
+
+	case IB_FLOW_SPEC_IB:
+		type = MLX4_NET_TRANS_RULE_ID_IB;
+		mlx4_spec->ib.l3_qpn = ib_spec->ib.val.l3_type_qpn;
+		mlx4_spec->ib.qpn_mask = ib_spec->ib.mask.l3_type_qpn;
+		memcpy(&mlx4_spec->ib.dst_gid, ib_spec->ib.val.dst_gid, 16);
+		memcpy(&mlx4_spec->ib.dst_gid_msk,
+		       ib_spec->ib.mask.dst_gid, 16);
+		break;
+
+	case IB_FLOW_SPEC_IPV4:
+		type = MLX4_NET_TRANS_RULE_ID_IPV4;
+		mlx4_spec->ipv4.src_ip = ib_spec->ipv4.val.src_ip;
+		mlx4_spec->ipv4.src_ip_msk = ib_spec->ipv4.mask.src_ip;
+		mlx4_spec->ipv4.dst_ip = ib_spec->ipv4.val.dst_ip;
+		mlx4_spec->ipv4.dst_ip_msk = ib_spec->ipv4.mask.dst_ip;
+		break;
+
+	case IB_FLOW_SPEC_TCP:
+	case IB_FLOW_SPEC_UDP:
+		type = ib_spec->type == IB_FLOW_SPEC_TCP ?
+					MLX4_NET_TRANS_RULE_ID_TCP :
+					MLX4_NET_TRANS_RULE_ID_UDP;
+		mlx4_spec->tcp_udp.dst_port = ib_spec->tcp_udp.val.dst_port;
+		mlx4_spec->tcp_udp.dst_port_msk = ib_spec->tcp_udp.mask.dst_port;
+		mlx4_spec->tcp_udp.src_port = ib_spec->tcp_udp.val.src_port;
+		mlx4_spec->tcp_udp.src_port_msk = ib_spec->tcp_udp.mask.src_port;
+		break;
+
+	default:
+		return -EINVAL;
+	}
+	if (mlx4_map_sw_to_hw_steering_id(dev, type) < 0 ||
+	    mlx4_hw_rule_sz(dev, type) < 0)
+		return -EINVAL;
+	mlx4_spec->id = cpu_to_be16(mlx4_map_sw_to_hw_steering_id(dev, type));
+	mlx4_spec->size = mlx4_hw_rule_sz(dev, type) >> 2;
+	return mlx4_hw_rule_sz(dev, type);
+}
+
+static int __mlx4_ib_create_flow(struct ib_qp *qp, struct ib_flow_attr *flow_attr,
+			  int domain,
+			  enum mlx4_net_trans_promisc_mode flow_type,
+			  u64 *reg_id)
+{
+	int ret, i;
+	int size = 0;
+	void *ib_flow;
+	struct mlx4_ib_dev *mdev = to_mdev(qp->device);
+	struct mlx4_cmd_mailbox *mailbox;
+	struct mlx4_net_trans_rule_hw_ctrl *ctrl;
+	size_t rule_size = sizeof(struct mlx4_net_trans_rule_hw_ctrl) +
+			   (sizeof(struct _rule_hw) * flow_attr->num_of_specs);
+
+	static const u16 __mlx4_domain[] = {
+		[IB_FLOW_DOMAIN_USER] = MLX4_DOMAIN_UVERBS,
+		[IB_FLOW_DOMAIN_ETHTOOL] = MLX4_DOMAIN_ETHTOOL,
+		[IB_FLOW_DOMAIN_RFS] = MLX4_DOMAIN_RFS,
+		[IB_FLOW_DOMAIN_NIC] = MLX4_DOMAIN_NIC,
+	};
+
+	if (flow_attr->priority > MLX4_IB_FLOW_MAX_PRIO) {
+		pr_err("Invalid priority value %d\n", flow_attr->priority);
+		return -EINVAL;
+	}
+
+	if (domain >= IB_FLOW_DOMAIN_NUM) {
+		pr_err("Invalid domain value %d\n", domain);
+		return -EINVAL;
+	}
+
+	if (mlx4_map_sw_to_hw_steering_mode(mdev->dev, flow_type) < 0)
+		return -EINVAL;
+
+	mailbox = mlx4_alloc_cmd_mailbox(mdev->dev);
+	if (IS_ERR(mailbox))
+		return PTR_ERR(mailbox);
+	memset(mailbox->buf, 0, rule_size);
+	ctrl = mailbox->buf;
+
+	ctrl->prio = cpu_to_be16(__mlx4_domain[domain] |
+				 flow_attr->priority);
+	ctrl->type = mlx4_map_sw_to_hw_steering_mode(mdev->dev, flow_type);
+	ctrl->port = flow_attr->port;
+	ctrl->qpn = cpu_to_be32(qp->qp_num);
+
+	ib_flow = flow_attr + 1;
+	size += sizeof(struct mlx4_net_trans_rule_hw_ctrl);
+	for (i = 0; i < flow_attr->num_of_specs; i++) {
+		ret = parse_flow_attr(mdev->dev, ib_flow, mailbox->buf + size);
+		if (ret < 0) {
+			mlx4_free_cmd_mailbox(mdev->dev, mailbox);
+			return -EINVAL;
+		}
+		ib_flow += ((struct _ib_flow_spec *)ib_flow)->size;
+		size += ret;
+	}
+
+	ret = mlx4_cmd_imm(mdev->dev, mailbox->dma, reg_id, size >> 2, 0,
+			   MLX4_QP_FLOW_STEERING_ATTACH, MLX4_CMD_TIME_CLASS_A,
+			   MLX4_CMD_NATIVE);
+	if (ret == -ENOMEM)
+		pr_err("mcg table is full. Failed to register network rule.\n");
+	else if (ret == -ENXIO)
+		pr_err("Device managed flow steering is disabled. Failed to register network rule.\n");
+	else if (ret)
+		pr_err("Invalid argument. Failed to register network rule.\n");
+
+	mlx4_free_cmd_mailbox(mdev->dev, mailbox);
+	return ret;
+}
+
+static int __mlx4_ib_destroy_flow(struct mlx4_dev *dev, u64 reg_id)
+{
+	int err;
+	err = mlx4_cmd(dev, reg_id, 0, 0,
+		       MLX4_QP_FLOW_STEERING_DETACH, MLX4_CMD_TIME_CLASS_A,
+		       MLX4_CMD_NATIVE);
+	if (err)
+		pr_err("Failed to detach network rule, registration ID = 0x%llx\n",
+		       reg_id);
+	return err;
+}
+
+static struct ib_flow *mlx4_ib_create_flow(struct ib_qp *qp,
+				    struct ib_flow_attr *flow_attr,
+				    int domain)
+{
+	int err = 0, i = 0;
+	struct mlx4_ib_flow *mflow;
+	enum mlx4_net_trans_promisc_mode type[2];
+
+	memset(type, 0, sizeof(type));
+
+	mflow = kzalloc(sizeof(struct mlx4_ib_flow), GFP_KERNEL);
+	if (!mflow) {
+		err = -ENOMEM;
+		goto err_free;
+	}
+
+	switch (flow_attr->type) {
+	case IB_FLOW_ATTR_NORMAL:
+		type[0] = MLX4_FS_REGULAR;
+		break;
+
+	case IB_FLOW_ATTR_ALL_DEFAULT:
+		type[0] = MLX4_FS_ALL_DEFAULT;
+		break;
+
+	case IB_FLOW_ATTR_MC_DEFAULT:
+		type[0] = MLX4_FS_MC_DEFAULT;
+		break;
+
+	case IB_FLOW_ATTR_SNIFFER:
+		type[0] = MLX4_FS_UC_SNIFFER;
+		type[1] = MLX4_FS_MC_SNIFFER;
+		break;
+
+	default:
+		err = -EINVAL;
+		goto err_free;
+	}
+
+	while (i < ARRAY_SIZE(type) && type[i]) {
+		err = __mlx4_ib_create_flow(qp, flow_attr, domain, type[i],
+					    &mflow->reg_id[i]);
+		if (err)
+			goto err_free;
+		i++;
+	}
+
+	return &mflow->ibflow;
+
+err_free:
+	kfree(mflow);
+	return ERR_PTR(err);
+}
+
+static int mlx4_ib_destroy_flow(struct ib_flow *flow_id)
+{
+	int err, ret = 0;
+	int i = 0;
+	struct mlx4_ib_dev *mdev = to_mdev(flow_id->qp->device);
+	struct mlx4_ib_flow *mflow = to_mflow(flow_id);
+
+	while (i < ARRAY_SIZE(mflow->reg_id) && mflow->reg_id[i]) {
+		err = __mlx4_ib_destroy_flow(mdev->dev, mflow->reg_id[i]);
+		if (err)
+			ret = err;
+		i++;
+	}
+
+	kfree(mflow);
+	return ret;
+}
+
 static int mlx4_ib_mcg_attach(struct ib_qp *ibqp, union ib_gid *gid, u16 lid)
 {
 	int err;
@@ -1461,6 +1696,15 @@ static void *mlx4_ib_add(struct mlx4_dev *dev)
 			(1ull << IB_USER_VERBS_CMD_CLOSE_XRCD);
 	}
 
+	if (check_flow_steering_support(dev)) {
+		ibdev->ib_dev.create_flow	= mlx4_ib_create_flow;
+		ibdev->ib_dev.destroy_flow	= mlx4_ib_destroy_flow;
+
+		ibdev->ib_dev.uverbs_cmd_mask	|=
+			(1ull << IB_USER_VERBS_CMD_CREATE_FLOW) |
+			(1ull << IB_USER_VERBS_CMD_DESTROY_FLOW);
+	}
+
 	mlx4_ib_alloc_eqs(dev, ibdev);
 
 	spin_lock_init(&iboe->lock);
diff --git a/drivers/infiniband/hw/mlx4/mlx4_ib.h b/drivers/infiniband/hw/mlx4/mlx4_ib.h
index f61ec26..036b663 100644
--- a/drivers/infiniband/hw/mlx4/mlx4_ib.h
+++ b/drivers/infiniband/hw/mlx4/mlx4_ib.h
@@ -132,6 +132,12 @@ struct mlx4_ib_fmr {
 	struct mlx4_fmr         mfmr;
 };
 
+struct mlx4_ib_flow {
+	struct ib_flow ibflow;
+	/* translating DMFS verbs sniffer rule to FW API requires two reg IDs */
+	u64 reg_id[2];
+};
+
 struct mlx4_ib_wq {
 	u64		       *wrid;
 	spinlock_t		lock;
@@ -552,6 +558,12 @@ static inline struct mlx4_ib_fmr *to_mfmr(struct ib_fmr *ibfmr)
 {
 	return container_of(ibfmr, struct mlx4_ib_fmr, ibfmr);
 }
+
+static inline struct mlx4_ib_flow *to_mflow(struct ib_flow *ibflow)
+{
+	return container_of(ibflow, struct mlx4_ib_flow, ibflow);
+}
+
 static inline struct mlx4_ib_qp *to_mqp(struct ib_qp *ibqp)
 {
 	return container_of(ibqp, struct mlx4_ib_qp, ibqp);
diff --git a/include/linux/mlx4/device.h b/include/linux/mlx4/device.h
index a51b013..aa9e1a8 100644
--- a/include/linux/mlx4/device.h
+++ b/include/linux/mlx4/device.h
@@ -1051,11 +1051,6 @@ struct _rule_hw {
 	};
 };
 
-/* translating DMFS verbs sniffer rule to the FW API would need two reg IDs */
-struct mlx4_flow_handle {
-	u64 reg_id[2];
-};
-
 int mlx4_flow_steer_promisc_add(struct mlx4_dev *dev, u8 port, u32 qpn,
 				enum mlx4_net_trans_promisc_mode mode);
 int mlx4_flow_steer_promisc_remove(struct mlx4_dev *dev, u8 port,
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH V2 for-next 2/4] IB/core: Infra-structure to support verbs extensions through uverbs
       [not found]     ` <1372251464-13394-3-git-send-email-ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2013-06-26 13:05       ` Roland Dreier
       [not found]         ` <CAL1RGDWxmM17W2o_era24A-TTDeKyoL6u3NRu_=t_dhV_ZA9MA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: Roland Dreier @ 2013-06-26 13:05 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Hadar Hen Zion, matanb, Igor Ivanov

On Wed, Jun 26, 2013 at 5:57 AM, Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote:
> Add Infra-structure to support extended uverbs capabilities in a forward/backward
> manner. Uverbs command opcodes which are based on the verbs extensions approach should
> be greater or equal to IB_USER_VERBS_CMD_THRESHOLD. They have new header format
> and processed a bit differently.

I think you missed the feedback I gave to the previous version of this patch:

 This patch at least doesn't have a sufficient changelog.  I don't
 understand what "extended capabilities" are or why we need to change
 the header format.

 What is the "verbs extensions approach"?  Why does the kernel need to
 know about it?  What is different about the processing?  The only
 difference I see is that userspace now has a more complicated way to
 pass the size in, which the kernel seems to nearly ignore -- it just
 adds the sizes together and proceeds as before.

I'm still wondering why the flow steering uverbs commands can't be
normal uverbs commands.

[And BTW, "infrastructure" is a normal word with no need for
capitalization or hyphenation]

 - R.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH V2 for-next 2/4] IB/core: Infra-structure to support verbs extensions through uverbs
       [not found]         ` <CAL1RGDWxmM17W2o_era24A-TTDeKyoL6u3NRu_=t_dhV_ZA9MA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2013-06-26 15:17           ` Or Gerlitz
       [not found]             ` <51CB05F3.3040409-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2013-07-09 15:00           ` Tzahi Oved
  1 sibling, 1 reply; 17+ messages in thread
From: Or Gerlitz @ 2013-06-26 15:17 UTC (permalink / raw)
  To: Roland Dreier
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Hadar Hen Zion, matanb,
	Igor Ivanov, Tzahi Oved

On 26/06/2013 16:05, Roland Dreier wrote:
> On Wed, Jun 26, 2013 at 5:57 AM, Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote:
>> Add Infra-structure to support extended uverbs capabilities in a forward/backward manner. Uverbs command opcodes which are based on the verbs extensions approach should be greater or equal to IB_USER_VERBS_CMD_THRESHOLD. They have new header format and processed a bit differently.
> I think you missed the feedback I gave to the previous version of this patch:
>
>   This patch at least doesn't have a sufficient changelog.  I don't
>   understand what "extended capabilities" are or why we need to change
>   the header format.
>
>   What is the "verbs extensions approach"?  Why does the kernel need to
>   know about it?  What is different about the processing?  The only
>   difference I see is that userspace now has a more complicated way to
>   pass the size in, which the kernel seems to nearly ignore -- it just
>   adds the sizes together and proceeds as before.

Roland, you did provide that comment, but it was on another series where 
the patch was also posted, the RoCE IP-based addressing one. I posted the 
patch twice since it's an infrastructure (...) patch used by both series. 
I wanted to post V2 of the flow steering patches to make sure I addressed 
your comment on the void pointer, and to take things from there.

To the point, the uverbs extensions construct is basically made of two 
building blocks:

1. an extended header which explicitly specifies the in/out verbs data 
size and the in/out provider data size

2. a bit mask ("comp mask") which specifies which fields in the uverbs 
command structure are actually used.

The combination of 1 + 2 allows extending commands built along these 
blocks without needing to bump the uverbs ABI.

Today, the kernel uverbs layer assumes a given size for each command; 
for example, the provider udata IN size is in_words - size_of_cmd.

For commands added under this framework, the kernel can support all the 
previous "versions" towards user space in parallel. Say we add a new 
command cmdX to both user space and the kernel, where v0 is the initial 
version; later we add a few fields and have cmdX_v1, and later on more 
fields and have cmdX_v2.


> +struct ib_uverbs_cmd_hdr_ex {
> +	__u32 command;
> +	__u16 in_words;
> +	__u16 out_words;
> +	__u16 provider_in_words;
> +	__u16 provider_out_words;
> +	__u32 cmd_hdr_reserved;
> +};
> +
>

Based on the bits set in the comp mask and the value of the in_words 
field, a kernel which has cmdX_v2 can work with older user space 
libraries/applications, e.g. ones using cmdX_v1 or cmdX_v0.

The comp mask is not part of the header, but rather the first field of 
every uverbs command and response; in this series it was added in patch 
3/4 for the uverbs flow-steering structures, which are cmdX_v0 in this 
context.

If we only used (in_words - size_of_cmd), we couldn't achieve that 
support.
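The versioning scheme described above can be sketched in user-space C. All struct names, the cmdX versions, and the word-count semantics below are illustrative assumptions, not the actual uverbs ABI; the header layout mirrors the struct ib_uverbs_cmd_hdr_ex quoted earlier in this message:

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical mirror of the extended header quoted earlier in this
 * message; all sizes are in 4-byte words. */
struct cmd_hdr_ex {
	uint32_t command;
	uint16_t in_words;            /* total verbs input */
	uint16_t out_words;
	uint16_t provider_in_words;   /* driver-private input */
	uint16_t provider_out_words;
	uint32_t cmd_hdr_reserved;
};

/* Two hypothetical versions of one command; comp_mask flags which
 * optional fields the caller actually filled in. */
struct cmdX_v0 {
	uint32_t comp_mask;
	uint32_t qp_handle;
};
struct cmdX_v1 {
	uint32_t comp_mask;           /* bit 0 set -> new_field is valid */
	uint32_t qp_handle;
	uint32_t new_field;
};
#define CMDX_NEW_FIELD (1u << 0)

/* A v1-aware handler can serve v0 and v1 callers in parallel: it checks
 * the size the caller passed *and* the comp_mask bit before touching
 * new_field; otherwise it falls back to the v0 default. */
static uint32_t handle_cmdX(const struct cmd_hdr_ex *hdr, const void *body)
{
	const struct cmdX_v1 *cmd = body;
	size_t body_bytes = (size_t)(hdr->in_words - hdr->provider_in_words) * 4;

	if (body_bytes >= sizeof(struct cmdX_v1) &&
	    (cmd->comp_mask & CMDX_NEW_FIELD))
		return cmd->new_field;    /* v1 caller */
	return 0;                         /* v0 caller: default value */
}
```

The point is that the size check and the comp-mask check together let one kernel serve every older userspace in parallel.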

Or.



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH V2 for-next 2/4] IB/core: Infra-structure to support verbs extensions through uverbs
       [not found]             ` <51CB05F3.3040409-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2013-06-26 15:34               ` Or Gerlitz
  0 siblings, 0 replies; 17+ messages in thread
From: Or Gerlitz @ 2013-06-26 15:34 UTC (permalink / raw)
  To: Roland Dreier
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Hadar Hen Zion, matanb,
	Igor Ivanov, Tzahi Oved

On 26/06/2013 18:17, Or Gerlitz wrote:
>
> Based on the bits set in the comp mask and the in_words field value, 
> the kernel which has cmdX_v2 can work towards older user space 
> libraries/applications e.g cmdX_v1 and cmdX_v0
>
> The comp mask is not part of the header, but rather the 1st field of 
> every uverbs command and response, here, in this series, it was added 
> in patch 3/4 for the uverbs flow-steering structures which are cmdX_v0 
> in this context.

The comp mask logic is also explained in Tzahi's OFA 2013 talk on verbs 
extensions; there he refers to extending the libibverbs API in user 
space towards applications, but the concept is the same. Slides here:
https://www.openfabrics.org/resources/document-downloads/presentations/doc_download/549-extending-verbs-api.html



^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [PATCH V2 for-next 1/4] IB/core: Add receive Flow Steering support
       [not found]     ` <1372251464-13394-2-git-send-email-ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2013-06-26 19:56       ` Hefty, Sean
       [not found]         ` <1828884A29C6694DAF28B7E6B8A823736FD36FF3-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: Hefty, Sean @ 2013-06-26 19:56 UTC (permalink / raw)
  To: Or Gerlitz, roland-DgEjT+Ai2ygdnm+yROfE0A
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, hadarh-VPRAkNaXOzVWk0Htik3J/w,
	matanb-VPRAkNaXOzVWk0Htik3J/w

> The input to ib_create_flow is instance of struct ib_flow_attr which
> contain few mandatory control elements and optional flow specs.
> 
> struct ib_flow_attr {
> 	enum ib_flow_attr_type type;
> 	u16      size;
> 	u16      priority;
> 	u8       num_of_specs;
> 	u8       port;
> 	u32      flags;

This structure could be aligned better.

> 	/* Following are the optional layers according to user request
> 	 * struct ib_flow_spec_yyy
> 	 * struct ib_flow_spec_zzz
> 	 */
> };
> 
> As these specs are eventually coming from user space, they are defined and
> used in a way which allows adding new spec types without kernel/user ABI
> change, and with a little API enhancement which defines the newly added spec.
> 
> The flow spec structures are defined in a TLV (Type-Length-Value) manner,
> which allows to call ib_create_flow with a list of variable length of
> optional specs.
> 
> For the actual processing of ib_flow_attr the driver uses the number of
> specs and the size mandatory fields along with the TLV nature of the specs.
> 
> Steering rules processing order is according to rules priority. The user
> sets the 12 low-order bits from the priority field and the remaining
> 4 high-order bits are set by the kernel according to a domain the
> application or the layer that created the rule belongs to. Lower
> priority numerical value means higher priority.

Why are bit fields being exposed to the user in this way?
 
> +struct ib_flow *ib_create_flow(struct ib_qp *qp,
> +			       struct ib_flow_attr *flow_attr,
> +			       int domain)
> +{
> +	struct ib_flow *flow_id;
> +	if (!qp->device->create_flow)
> +		return ERR_PTR(-ENOSYS);
> +
> +	flow_id = qp->device->create_flow(qp, flow_attr, domain);
> +	if (!IS_ERR(flow_id))
> +		atomic_inc(&qp->usecnt);
> +	return flow_id;
> +}
> +EXPORT_SYMBOL(ib_create_flow);
> +
> +int ib_destroy_flow(struct ib_flow *flow_id)
> +{
> +	int err;
> +	struct ib_qp *qp = flow_id->qp;
> +
> +	if (!flow_id->qp->device->destroy_flow)
> +		return -ENOSYS;

We can assume destroy_flow exists if create_flow does.

> +struct ib_flow_ib_filter {
> +	__be32	l3_type_qpn;
> +	u8	dst_gid[16];
> +};

Maybe this is just a naming issue, but why wouldn't an IB filter have SLID/DLID instead of just DGID?  What does l3_type_qpn mean?  Is this just the QPN?

The TCP/IP filters are broken into separate filters based in L4/L3.  It would seem to make sense if the IB filters were similarly divided into L2/L3/L4 filters.  IB and IPv6 could probably share the same filter definition.

> +struct ib_flow_spec_ib {
> +	enum ib_flow_spec_type	 type;
> +	u16			 size;
> +	struct ib_flow_ib_filter val;
> +	struct ib_flow_ib_filter mask;
> +};
> +
> +struct ib_flow_ipv4_filter {
> +	__be32	src_ip;
> +	__be32	dst_ip;
> +};
> +
> +struct ib_flow_spec_ipv4 {
> +	enum ib_flow_spec_type	   type;
> +	u16			   size;
> +	struct ib_flow_ipv4_filter val;
> +	struct ib_flow_ipv4_filter mask;
> +};
> +
> +struct ib_flow_tcp_udp_filter {
> +	__be16	dst_port;
> +	__be16	src_port;
> +};
> +
> +struct ib_flow_spec_tcp_udp {
> +	enum ib_flow_spec_type	      type;
> +	u16			      size;
> +	struct ib_flow_tcp_udp_filter val;
> +	struct ib_flow_tcp_udp_filter mask;
> +};
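For reference, a minimal user-space sketch of the TLV walk these specs enable: each spec begins with a type and a size, and the size is what steps the parser to the next spec, as the changelog quoted above describes. The header layout here is illustrative, not the kernel's:

```c
#include <stdint.h>
#include <stddef.h>

/* Illustrative TLV spec header: type plus total size of the spec in
 * bytes.  The real ib_flow_spec_* structs carry val/mask after this. */
struct flow_spec_hdr {
	uint32_t type;
	uint16_t size;
};

/* Walk num_of_specs TLV entries, as the driver's parse loop does with
 * flow_attr->num_of_specs; returns bytes consumed or -1 if malformed. */
static int walk_specs(const void *buf, int num_of_specs, size_t buf_len)
{
	const uint8_t *p = buf;
	size_t off = 0;
	int i;

	for (i = 0; i < num_of_specs; i++) {
		const struct flow_spec_hdr *hdr;

		if (off + sizeof(*hdr) > buf_len)
			return -1;              /* truncated spec */
		hdr = (const void *)(p + off);
		if (hdr->size < sizeof(*hdr) || off + hdr->size > buf_len)
			return -1;              /* bogus length */
		off += hdr->size;               /* TLV step: advance by size */
	}
	return (int)off;
}
```

This is why new spec types can be added without an ABI change: a parser that doesn't know a type can still skip it by its size.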

- Sean

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH V2 for-next 1/4] IB/core: Add receive Flow Steering support
       [not found]         ` <1828884A29C6694DAF28B7E6B8A823736FD36FF3-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2013-06-26 21:13           ` Or Gerlitz
       [not found]             ` <CAJZOPZK_FkCJZxjyxEdk4WOTvbo8DQpcpqmuPUsqV=bZmU5W_w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: Or Gerlitz @ 2013-06-26 21:13 UTC (permalink / raw)
  To: Hefty, Sean
  Cc: Or Gerlitz, roland-DgEjT+Ai2ygdnm+yROfE0A,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, hadarh-VPRAkNaXOzVWk0Htik3J/w,
	matanb-VPRAkNaXOzVWk0Htik3J/w

On Wed, Jun 26, 2013 at 10:56 PM, Hefty, Sean <sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> wrote:
>> The input to ib_create_flow is instance of struct ib_flow_attr which
>> contain few mandatory control elements and optional flow specs.
>>
>> struct ib_flow_attr {
>>       enum ib_flow_attr_type type;
>>       u16      size;
>>       u16      priority;
>>       u8       num_of_specs;
>>       u8       port;
>>       u32      flags;
>
> This structure could be aligned better.

OK, I assume you mean arranging the fields by decreasing size, correct? So
here we need to put the flags field before the size field.

>
>>       /* Following are the optional layers according to user request
>>        * struct ib_flow_spec_yyy
>>        * struct ib_flow_spec_zzz
>>        */
>> };
>>
>> As these specs are eventually coming from user space, they are defined and
>> used in a way which allows adding new spec types without kernel/user ABI
>> change, and with a little API enhancement which defines the newly added spec.
>>
>> The flow spec structures are defined in a TLV (Type-Length-Value) manner,
>> which allows to call ib_create_flow with a list of variable length of
>> optional specs.
>>
>> For the actual processing of ib_flow_attr the driver uses the number of
>> specs and the size mandatory fields along with the TLV nature of the specs.
>>
>> Steering rules processing order is according to rules priority. The user
>> sets the 12 low-order bits from the priority field and the remaining
>> 4 high-order bits are set by the kernel according to a domain the
>> application or the layer that created the rule belongs to. Lower
>> priority numerical value means higher priority.
>
> Why are bit fields being exposed to the user in this way?

Yes, this is probably not general enough. So what would you suggest:
a more integral division, e.g. 16 bits for priority and 16 bits for
location?
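For context, a sketch of the current encoding only: the domain constants are assumed pre-shifted, since the mlx4 patch ORs them directly into the 16-bit priority word, and the values below are made up:

```c
#include <stdint.h>

#define FLOW_MAX_PRIO   0xFFF          /* 12 low-order bits for the user */
#define DOMAIN_UVERBS   (0x1 << 12)    /* hypothetical pre-shifted domains */
#define DOMAIN_ETHTOOL  (0x2 << 12)

/* The kernel combines the caller's priority with the 4 high-order
 * domain bits; values above the 12-bit limit are rejected, mirroring
 * the MLX4_IB_FLOW_MAX_PRIO check in the patch. */
static int encode_prio(uint16_t domain, uint16_t user_prio, uint16_t *out)
{
	if (user_prio > FLOW_MAX_PRIO)
		return -1;
	*out = domain | user_prio;
	return 0;
}
```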


>> +struct ib_flow *ib_create_flow(struct ib_qp *qp,
>> +                            struct ib_flow_attr *flow_attr,
>> +                            int domain)
>> +{
>> +     struct ib_flow *flow_id;
>> +     if (!qp->device->create_flow)
>> +             return ERR_PTR(-ENOSYS);
>> +
>> +     flow_id = qp->device->create_flow(qp, flow_attr, domain);
>> +     if (!IS_ERR(flow_id))
>> +             atomic_inc(&qp->usecnt);
>> +     return flow_id;
>> +}
>> +EXPORT_SYMBOL(ib_create_flow);
>> +
>> +int ib_destroy_flow(struct ib_flow *flow_id)
>> +{
>> +     int err;
>> +     struct ib_qp *qp = flow_id->qp;
>> +
>> +     if (!flow_id->qp->device->destroy_flow)
>> +             return -ENOSYS;
>
> We can assume destroy_flow exists if create_flow does.

OK, will fix.

>
>> +struct ib_flow_ib_filter {
>> +     __be32  l3_type_qpn;
>> +     u8      dst_gid[16];
>> +};


> Maybe this is just a naming issue, but why wouldn't an IB filter have SLID/DLID instead > of just DGID?  What does l3_type_qpn mean?  Is this just the QPN?

Yes, it's just the QPN; will fix the name to match better.

> The TCP/IP filters are broken into separate filters based in L4/L3.  It would seem to
> make sense if the IB filters were similarly divided into L2/L3/L4 filters.  IB and IPv6
> could probably share the same filter definition.

An IPv6 filter wasn't defined in this submission, but as I wrote, the
scheme provided allows adding more filters and flow specs.



>
>> +struct ib_flow_spec_ib {
>> +     enum ib_flow_spec_type   type;
>> +     u16                      size;
>> +     struct ib_flow_ib_filter val;
>> +     struct ib_flow_ib_filter mask;
>> +};
>> +
>> +struct ib_flow_ipv4_filter {
>> +     __be32  src_ip;
>> +     __be32  dst_ip;
>> +};
>> +
>> +struct ib_flow_spec_ipv4 {
>> +     enum ib_flow_spec_type     type;
>> +     u16                        size;
>> +     struct ib_flow_ipv4_filter val;
>> +     struct ib_flow_ipv4_filter mask;
>> +};
>> +
>> +struct ib_flow_tcp_udp_filter {
>> +     __be16  dst_port;
>> +     __be16  src_port;
>> +};
>> +
>> +struct ib_flow_spec_tcp_udp {
>> +     enum ib_flow_spec_type        type;
>> +     u16                           size;
>> +     struct ib_flow_tcp_udp_filter val;
>> +     struct ib_flow_tcp_udp_filter mask;
>> +};
>
> - Sean

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH V2 for-next 1/4] IB/core: Add receive Flow Steering support
       [not found]             ` <CAJZOPZK_FkCJZxjyxEdk4WOTvbo8DQpcpqmuPUsqV=bZmU5W_w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2013-06-26 21:33               ` Steve Wise
       [not found]                 ` <51CB5E47.7090404-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
  2013-06-27 20:55               ` Hefty, Sean
  1 sibling, 1 reply; 17+ messages in thread
From: Steve Wise @ 2013-06-26 21:33 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: Hefty, Sean, Or Gerlitz, roland-DgEjT+Ai2ygdnm+yROfE0A,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, hadarh-VPRAkNaXOzVWk0Htik3J/w,
	matanb-VPRAkNaXOzVWk0Htik3J/w

On 6/26/2013 4:13 PM, Or Gerlitz wrote:
> On Wed, Jun 26, 2013 at 10:56 PM, Hefty, Sean <sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> wrote:
>>> The input to ib_create_flow is instance of struct ib_flow_attr which
>>> contain few mandatory control elements and optional flow specs.
>>>
>>> struct ib_flow_attr {
>>>        enum ib_flow_attr_type type;
>>>        u16      size;
>>>        u16      priority;
>>>        u8       num_of_specs;
>>>        u8       port;
>>>        u32      flags;
>> This structure could be aligned better.
> OK, I assume you mean arrange fields by decreasing size, correct? so
> here we need to  put the flags field before the size field.
>
>>>        /* Following are the optional layers according to user request
>>>         * struct ib_flow_spec_yyy
>>>         * struct ib_flow_spec_zzz
>>>         */
>>> };
>>>
>>> As these specs are eventually coming from user space, they are defined and
>>> used in a way which allows adding new spec types without kernel/user ABI
>>> change, and with a little API enhancement which defines the newly added spec.
>>>
>>> The flow spec structures are defined in a TLV (Type-Length-Value) manner,
>>> which allows to call ib_create_flow with a list of variable length of
>>> optional specs.
>>>
>>> For the actual processing of ib_flow_attr the driver uses the number of
>>> specs and the size mandatory fields along with the TLV nature of the specs.
>>>
>>> Steering rules processing order is according to rules priority. The user
>>> sets the 12 low-order bits from the priority field and the remaining
>>> 4 high-order bits are set by the kernel according to a domain the
>>> application or the layer that created the rule belongs to. Lower
>>> priority numerical value means higher priority.
>> Why are bit fields being exposed to the user in this way?
> Yes, this is probably not general enough. So what would you suggest,
> use a more integral division? e.g 16 bits for priority and 16 bits for
> location?

If the kernel driver is setting the "location", whatever that is, why 
would the application need access to it?  I.e., isn't a priority field 
enough to allow the application to provide an ordering/prioritization of 
the rules?

Steve.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [PATCH V2 for-next 1/4] IB/core: Add receive Flow Steering support
       [not found]             ` <CAJZOPZK_FkCJZxjyxEdk4WOTvbo8DQpcpqmuPUsqV=bZmU5W_w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2013-06-26 21:33               ` Steve Wise
@ 2013-06-27 20:55               ` Hefty, Sean
       [not found]                 ` <1828884A29C6694DAF28B7E6B8A823736FD37415-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
  1 sibling, 1 reply; 17+ messages in thread
From: Hefty, Sean @ 2013-06-27 20:55 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: Or Gerlitz, roland-DgEjT+Ai2ygdnm+yROfE0A,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, hadarh-VPRAkNaXOzVWk0Htik3J/w,
	matanb-VPRAkNaXOzVWk0Htik3J/w

> > The TCP/IP filters are broken into separate filters based in L4/L3.  It would
> seem to
> > make sense if the IB filters were similarly divided into L2/L3/L4 filters.
> IB and IPv6
> > could probably share the same filter definition.
> 
> IPv6 filters wasn't defined through this submission, but as I wrote,
> the scheme provided allows for adding more filters and flow specs.

My point was that the IPv6 filter should be defined and used here.  The following basic filters were defined:

ethernet -	src/dst mac ...
ip -		src/dst ip
tcp/udp -	src/dst port

These are at least somewhat intuitive to me.  The IB filter is

ib -		(src/dst?) qpn, dgid

This is equivalent to creating a filter that's:

tcpip - 	port, dst ip

IMO, it would be better to define IB filters using the same structure that you used for tcp/ip/ethernet.  For example

ibqp - 	src/dst qpn (pkey?)
ipv6 -	src/dst ipv6/gids (flowlabel?)
iblink - 	src/dst lids, (sl?)

If the hardware can only support matching on the qpn and dgid, then it can simply fail any request that specifies a non-zero mask on the unsupported components.
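Sean's proposed split could be sketched as separate per-layer spec structs, each field carried as a value/mask pair so a provider that matches only a subset can reject rules whose mask covers an unsupported component. The struct and function names below are hypothetical illustrations, not taken from the actual patches:

```c
#include <stdint.h>

/* Hypothetical per-layer IB filter specs, mirroring the eth/ip/tcp split.
 * A value/mask pair per field lets hardware that matches only a subset
 * of fields refuse rules whose mask covers an unsupported component. */
struct ib_flow_spec_ibqp {
	uint32_t src_qpn, src_qpn_mask;
	uint32_t dst_qpn, dst_qpn_mask;
	uint16_t pkey,    pkey_mask;
};

struct ib_flow_spec_iblink {
	uint16_t slid, slid_mask;
	uint16_t dlid, dlid_mask;
	uint8_t  sl,   sl_mask;
};

/* Example check for a provider that can match only on dst_qpn: any rule
 * with a non-zero mask on another component is refused. */
static int ibqp_spec_supported(const struct ib_flow_spec_ibqp *s)
{
	return s->src_qpn_mask == 0 && s->pkey_mask == 0;
}
```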

- Sean


* Re: [PATCH V2 for-next 1/4] IB/core: Add receive Flow Steering support
       [not found]                 ` <51CB5E47.7090404-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
@ 2013-06-27 22:05                   ` Or Gerlitz
  0 siblings, 0 replies; 17+ messages in thread
From: Or Gerlitz @ 2013-06-27 22:05 UTC (permalink / raw)
  To: Steve Wise
  Cc: Hefty, Sean, Or Gerlitz, roland-DgEjT+Ai2ygdnm+yROfE0A,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, hadarh-VPRAkNaXOzVWk0Htik3J/w,
	matanb-VPRAkNaXOzVWk0Htik3J/w

On Thu, Jun 27, 2013 at 12:33 AM, Steve Wise
<swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org> wrote:
> On 6/26/2013 4:13 PM, Or Gerlitz wrote:
>> On Wed, Jun 26, 2013 at 10:56 PM, Hefty, Sean <sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

>>>> Steering rules processing order is according to rules priority. The user
>>>> sets the 12 low-order bits from the priority field and the remaining
>>>> 4 high-order bits are set by the kernel according to a domain the
>>>> application or the layer that created the rule belongs to. Lower
>>>> priority numerical value means higher priority.

>>> Why are bit fields being exposed to the user in this way?

>> Yes, this is probably not general enough. So what would you suggest,
>> use a more integral division? e.g 16 bits for priority and 16 bits for location?

> If the kernel driver is setting the "location", whatever that is, why would
> the application need access to it?  I.e., isn't a priority field enough to
> allow the application to provide an ordering/prioritization of the rules?

I wasn't accurate: the idea is that per domain we allow the app to set
the rule priority, but the actual priority passed towards the HW is
composed of the provided priority and the domain, where different
domains have different priorities along the order set by the verbs
header file; see enum ib_flow_domain.
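As a sketch of the composition scheme described here, assuming the 12/4 bit split from the patch changelog (the domain enum values below are illustrative placeholders, not the actual enum ib_flow_domain):

```c
#include <stdint.h>

/* Illustrative domains; more privileged layers get a lower (stronger)
 * numerical value.  These values are made up for this sketch. */
enum flow_domain {
	FLOW_DOMAIN_USER    = 1,
	FLOW_DOMAIN_ETHTOOL = 2,
	FLOW_DOMAIN_RFS     = 3,
};

/* Effective HW priority: 4 high-order bits come from the domain, 12
 * low-order bits from the application-supplied priority.  A lower
 * numerical value means higher priority, so a stronger domain always
 * wins regardless of the app-supplied part. */
static uint16_t effective_priority(enum flow_domain domain, uint16_t user_prio)
{
	return (uint16_t)(((domain & 0xf) << 12) | (user_prio & 0xfff));
}
```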


* Re: [PATCH V2 for-next 1/4] IB/core: Add receive Flow Steering support
       [not found]                 ` <1828884A29C6694DAF28B7E6B8A823736FD37415-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2013-06-27 22:09                   ` Or Gerlitz
       [not found]                     ` <CAJZOPZLf85TaCM9O3yahspRsuD3KcFzAY5b4nXxe46RiZwnk6Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: Or Gerlitz @ 2013-06-27 22:09 UTC (permalink / raw)
  To: Hefty, Sean
  Cc: Or Gerlitz, roland-DgEjT+Ai2ygdnm+yROfE0A,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, hadarh-VPRAkNaXOzVWk0Htik3J/w,
	matanb-VPRAkNaXOzVWk0Htik3J/w

On Thu, Jun 27, 2013 at 11:55 PM, Hefty, Sean <sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> wrote:

> My point was that the IPv6 filter should be defined and used here.
> The following basic filters were defined:
> ethernet -      src/dst mac ...
> ip -            src/dst ip
> tcp/udp -       src/dst port
> These are at least somewhat intuitive to me.  The IB filter is
> ib -            (src/dst?) qpn, dgid
> This is equivalent to creating a filter that's:
> tcpip -         port, dst ip
> IMO, it would be better to define IB filters using the same structure that you used for
> tcp/ip/ethernet.  For example
> ibqp -  src/dst qpn (pkey?)
> ipv6 -  src/dst ipv6/gids (flowlabel?)
> iblink -        src/dst lids, (sl?)
>
> If the hardware can only support matching on the qpn and dgid, then it can simply fail
> any requests which specify a non-zero mask on the unsupported components.

Sean, I agree that the provided filter on dest qpn / dgid doesn't make
sense, and will fix that.

Still, for the initial set of patches that goes in I tend to just
remove the IB filter structure and define the different IB filters
along the lines of your proposal in a follow-up patch or patches, OK?


* RE: [PATCH V2 for-next 1/4] IB/core: Add receive Flow Steering support
       [not found]                     ` <CAJZOPZLf85TaCM9O3yahspRsuD3KcFzAY5b4nXxe46RiZwnk6Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2013-06-28  0:10                       ` Hefty, Sean
  0 siblings, 0 replies; 17+ messages in thread
From: Hefty, Sean @ 2013-06-28  0:10 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: Or Gerlitz, roland-DgEjT+Ai2ygdnm+yROfE0A,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, hadarh-VPRAkNaXOzVWk0Htik3J/w,
	matanb-VPRAkNaXOzVWk0Htik3J/w

> Still for the initial set of patches that goes in I tend to just
> remove the IB filter structure and define the different IB filters
> along your proposal in a follow-up patches/es, OK?

That sounds fine to me.


* Re: [PATCH V2 for-next 2/4] IB/core: Infra-structure to support verbs extensions through uverbs
       [not found]         ` <CAL1RGDWxmM17W2o_era24A-TTDeKyoL6u3NRu_=t_dhV_ZA9MA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2013-06-26 15:17           ` Or Gerlitz
@ 2013-07-09 15:00           ` Tzahi Oved
       [not found]             ` <CACZyyF8=dzjktGYAWfHkXdNQycdkP5x0t=rYckTypxj7GLznzw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  1 sibling, 1 reply; 17+ messages in thread
From: Tzahi Oved @ 2013-07-09 15:00 UTC (permalink / raw)
  To: Roland Dreier
  Cc: Or Gerlitz, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Hadar Hen Zion,
	matanb, Igor Ivanov

On 26/06/2013 16:05, Roland Dreier wrote:
 >> I don't understand what "extended capabilities" are or why we need
to change the header format.

First let me clarify the target of *this* patch - to allow extending
the input and output arguments of new uverbs commands. Whenever a
specific IB_USER_VERBS_CMD_XXX is extended == needs to have additional
arguments, we will be able to add them without creating a completely
new IB_USER_VERBS_CMD_YYY command or bumping the uverbs ABI version.
This patch alone doesn't provide the whole scheme, which also depends
on adding comp_mask to the extended uverbs structs - see my
explanation below, and do let me know if an actual use case would be
more self-explanatory.

On 26/06/2013 16:05, Roland Dreier wrote:
 >> What is the "verbs extensions approach"?

Similar to the scheme built with Sean and Jason for user verbs
extension support in libibverbs, that can be reviewed in the XRC
patches Sean submitted, we’d like to build a scheme for the kernel
uverbs as well.

On 26/06/2013 16:05, Roland Dreier wrote:
 >> Why does the kernel need to know about it?

The kernel uverbs are what we wish to add extension capability to in
this patch (vs. the user verbs, which exist in an orthogonal patch).

On 26/06/2013 16:05, Roland Dreier wrote:
 >> What is different about the processing?

Ok, so we want to allow future extension of the CMD arguments
(ib_uverbs_cmd_hdr.in_words, ib_uverbs_cmd_hdr.out_words) for an
existing new command (= a command that supports the new uverbs command
header format suggested in this patch) w/o bumping the ABI version,
while maintaining backward and forward compatibility with new and old
libibverbs versions.

In a uverbs command we pass both the uverbs arguments and the provider
arguments (mlx4_ib in our case). For example, take the create_cq call:
uverbs gets struct ibv_create_cq, mlx4_ib gets struct mlx4_create_cq
(which includes struct ibv_create_cq), and in_words =
sizeof(struct mlx4_create_cq)/4. Thus ib_uverbs_cmd_hdr.in_words
carries both the uverbs and the mlx4_ib input argument sizes, and
uverbs assumes it knows the size of its own input argument - struct
ibv_create_cq.

Now, if we wish to add a variable to struct ibv_create_cq, we can add
a comp_mask field to the struct, which is basically a bit field
indicating which fields exist in the struct (as done for the
libibverbs API extension). But we need a way to tell the total size of
the struct rather than assume it is predefined (since we may get
different struct sizes from different libibverbs versions), so that we
know at which point the provider input argument (struct
mlx4_create_cq) begins. The same goes for extending the provider
struct mlx4_create_cq. Thus we split ib_uverbs_cmd_hdr.in_words, which
will now carry only the uverbs input argument struct size, and add
ib_uverbs_cmd_hdr.provider_in_words, which will carry the provider
(mlx4_ib) input argument size. The same goes for the response (the
uverbs CMD output argument).
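The header split described above can be sketched as follows. The legacy field names match ib_uverbs_cmd_hdr, but the extended layout and the size arithmetic are a simplified illustration of the scheme, not a copy of the patch:

```c
#include <stddef.h>
#include <stdint.h>

/* Legacy header: in_words covers the core + provider payload together,
 * so uverbs must assume it knows the size of its own input struct. */
struct ib_uverbs_cmd_hdr {
	uint32_t command;
	uint16_t in_words;   /* total input size, in 4-byte words */
	uint16_t out_words;
};

/* Extended header (sketch): core and provider input sizes are carried
 * separately, so uverbs can locate where the provider-private part
 * begins even when the core struct grows via comp_mask'd fields. */
struct ib_uverbs_cmd_hdr_ex {
	uint32_t command;
	uint16_t in_words;           /* core (uverbs) input only */
	uint16_t out_words;
	uint16_t provider_in_words;  /* provider (e.g. mlx4_ib) input */
	uint16_t provider_out_words;
	uint32_t cmd_hdr_reserved;
};

/* Byte offset at which the provider-private input starts. */
static size_t provider_input_offset(const struct ib_uverbs_cmd_hdr_ex *h)
{
	return (size_t)h->in_words * 4;
}
```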

Tzahi


On Wed, Jun 26, 2013 at 4:05 PM, Roland Dreier <roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote:
> On Wed, Jun 26, 2013 at 5:57 AM, Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote:
>> Add Infra-structure to support extended uverbs capabilities in a forward/backward
>> manner. Uverbs command opcodes which are based on the verbs extensions approach should
>> be greater or equal to IB_USER_VERBS_CMD_THRESHOLD. They have new header format
>> and processed a bit differently.
>
> I think you missed the feedback I gave to the previous version of this patch:
>
>  This patch at least doesn't have a sufficient changelog.  I don't
>  understand what "extended capabilities" are or why we need to change
>  the header format.
>
>  What is the "verbs extensions approach"?  Why does the kernel need to
>  know about it?  What is different about the processing?  The only
>  difference I see is that userspace now has a more complicated way to
>  pass the size in, which the kernel seems to nearly ignore -- it just
>  adds the sizes together and proceeds as before.
>
> I'm still wondering why the flow steering uverbs commands can't be
> normal uverbs commands.
>
> [And BTW, "infrastructure" is a normal word with no need for
> capitalization or hyphenation]
>
>  - R.


* Re: [PATCH V2 for-next 2/4] IB/core: Infra-structure to support verbs extensions through uverbs
       [not found]             ` <CACZyyF8=dzjktGYAWfHkXdNQycdkP5x0t=rYckTypxj7GLznzw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2013-07-16  3:33               ` Or Gerlitz
  0 siblings, 0 replies; 17+ messages in thread
From: Or Gerlitz @ 2013-07-16  3:33 UTC (permalink / raw)
  To: Roland Dreier
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Hadar Hen Zion, matanb,
	Igor Ivanov, Tzahi Oved

On Tue, Jul 9, 2013 at 6:00 PM, Tzahi Oved <tzahio-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote:
> On 26/06/2013 16:05, Roland Dreier wrote:
>  >> I don't understand what "extended capabilities" are or why we need
> to change the header format.
>
> First let me clarify the target of *this* patch - allow extending
> the input and output arguments of new uverbs commands. Whenever a
> specific IB_USER_VERBS_CMD_XXX is extended == needs to have additional
> arguments, we will be able to add them without creating a completely
> new IB_USER_VERBS_CMD_YYY command or bumping the uverbs ABI version.
>
> [rest of explanation snipped - see Tzahi's mail above]
>
> Tzahi

Roland,

Does the above help to address your questions/concerns on the matter?

Or.


Thread overview: 17+ messages
2013-06-26 12:57 [PATCH V2 for-next 0/4] Add receive Flow Steering support Or Gerlitz
     [not found] ` <1372251464-13394-1-git-send-email-ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2013-06-26 12:57   ` [PATCH V2 for-next 1/4] IB/core: " Or Gerlitz
     [not found]     ` <1372251464-13394-2-git-send-email-ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2013-06-26 19:56       ` Hefty, Sean
     [not found]         ` <1828884A29C6694DAF28B7E6B8A823736FD36FF3-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2013-06-26 21:13           ` Or Gerlitz
     [not found]             ` <CAJZOPZK_FkCJZxjyxEdk4WOTvbo8DQpcpqmuPUsqV=bZmU5W_w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2013-06-26 21:33               ` Steve Wise
     [not found]                 ` <51CB5E47.7090404-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
2013-06-27 22:05                   ` Or Gerlitz
2013-06-27 20:55               ` Hefty, Sean
     [not found]                 ` <1828884A29C6694DAF28B7E6B8A823736FD37415-P5GAC/sN6hkd3b2yrw5b5LfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2013-06-27 22:09                   ` Or Gerlitz
     [not found]                     ` <CAJZOPZLf85TaCM9O3yahspRsuD3KcFzAY5b4nXxe46RiZwnk6Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2013-06-28  0:10                       ` Hefty, Sean
2013-06-26 12:57   ` [PATCH V2 for-next 2/4] IB/core: Infra-structure to support verbs extensions through uverbs Or Gerlitz
     [not found]     ` <1372251464-13394-3-git-send-email-ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2013-06-26 13:05       ` Roland Dreier
     [not found]         ` <CAL1RGDWxmM17W2o_era24A-TTDeKyoL6u3NRu_=t_dhV_ZA9MA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2013-06-26 15:17           ` Or Gerlitz
     [not found]             ` <51CB05F3.3040409-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2013-06-26 15:34               ` Or Gerlitz
2013-07-09 15:00           ` Tzahi Oved
     [not found]             ` <CACZyyF8=dzjktGYAWfHkXdNQycdkP5x0t=rYckTypxj7GLznzw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2013-07-16  3:33               ` Or Gerlitz
2013-06-26 12:57   ` [PATCH V2 for-next 3/4] IB/core: Export ib_create/destroy_flow " Or Gerlitz
2013-06-26 12:57   ` [PATCH V2 for-next 4/4] IB/mlx4: Add receive Flow Steering support Or Gerlitz
