* [PATCH V4 for-next 00/10] Verbs RSS
From: Yishai Hadas @ 2016-05-23 12:20 UTC
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	yishaih-VPRAkNaXOzVWk0Htik3J/w, matanb-VPRAkNaXOzVWk0Htik3J/w,
	alexv-VPRAkNaXOzVWk0Htik3J/w, tzahio-VPRAkNaXOzVWk0Htik3J/w,
	majd-VPRAkNaXOzVWk0Htik3J/w, talal-VPRAkNaXOzVWk0Htik3J/w

Hi Doug,

This V4 series addresses the review note to move the RSS hash
capabilities from the common udata into vendor-specific data.
As was agreed in the OFAWG meeting, the WQ and the indirection table
are common objects; the last three patches were changed accordingly.

As the RSS series has already missed a few merge windows, we would
appreciate it if it could be taken into 4.7.

Thanks,
Yishai
 

RSS (Receive Side Scaling) technology allows spreading incoming traffic
across multiple receive descriptor queues. Assigning each queue to a
different CPU core load-balances the incoming traffic and improves
performance.

This patch-set introduces new objects and verbs that allow verbs-based
solutions to utilize the RSS offload capability, which is widely
supported today by modern NICs. It extends the IB and uverbs layers to
support this functionality and supplies a concrete implementation for
the mlx5_ib driver.

The implementation is based on an RFC that was sent to the list some
months ago and describes the expected verbs and objects.
RFC: http://www.spinics.net/lists/linux-rdma/msg25012.html

In addition, the URL below can be used as a reference for the motivation
and justification for adding the new objects described below.
http://lxr.free-electrons.com/source/Documentation/networking/scaling.txt

Overview of the changes:
- Add new objects: Work Queue and Receive Work Queues Indirection Table.
- Add the new verbs that are required to handle the new objects:
  ib_create_wq(), ib_modify_wq(), ib_destroy_wq(),
  ib_create_rwq_ind_table(), ib_destroy_rwq_ind_table().
- Extend ib_create_qp() to accept a Receive Work Queues Indirection
  Table. (A usage sketch follows the object descriptions below.)

Work Queue: (ib_wq)
- A Work Queue is associated (many to one) with a Completion Queue.
- It owns the Work Queue properties (PD, WQ size, etc.).
- Currently the only Work Queue type is IB_WQT_RQ (receive queue);
  others (e.g. IB_WQT_SQ, send queue) may be added in the future.
- A Work Queue of type IB_WQT_RQ contains receive work requests.
- The Work Queue context is subject to well-defined state transitions
  performed by the modify_wq verb.
- The Work Queue is a necessary component for RSS, since the RSS
  mechanism is supposed to distribute the traffic among multiple
  Receive Work Queues.

Receive Work Queue Indirection Table: (ib_rwq_ind_tbl)
- Serves to spread traffic among Work Queues of type RQ.
- Can be modified dynamically to give different queues different
  relative weights.
- The receive queue for a packet is determined by a hash computed over
  the incoming packet.
- A Receive Work Queue Indirection Table is associated (one to many)
  with QPs.

RSS hashing configuration:
- Used to compute the target RQ entry for each incoming packet.
- It includes:
    * The hash function used to choose the WQ from this table.
    * The packet fields that the hash function should use.
- Was moved to vendor-specific data, per the OFAWG verbs discussion.
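
To make the flow concrete, below is a minimal kernel-side sketch of the
new verbs (illustrative only, error handling omitted; pd, cq and device
are assumed to exist, and the final create-QP step is only indicated
since it lands in patches #8-#10):

	struct ib_wq *wqs[4];
	struct ib_rwq_ind_table *ind_tbl;
	struct ib_wq_init_attr wq_attr = {};
	struct ib_rwq_ind_table_init_attr ind_attr = {};
	struct ib_wq_attr mod_attr = {};
	int i;

	wq_attr.wq_type = IB_WQT_RQ;
	wq_attr.max_wr  = 256;
	wq_attr.max_sge = 1;
	wq_attr.cq      = cq;

	for (i = 0; i < 4; i++) {
		/* A new WQ starts in IB_WQS_RESET; move it to ready. */
		wqs[i] = ib_create_wq(pd, &wq_attr);
		mod_attr.wq_state = IB_WQS_RDY;
		ib_modify_wq(wqs[i], &mod_attr, IB_WQ_STATE);
	}

	ind_attr.log_ind_tbl_size = 2;  /* 1 << 2 == 4 WQ entries */
	ind_attr.ind_tbl = wqs;         /* caller keeps this array alive */
	ind_tbl = ib_create_rwq_ind_table(device, &ind_attr);
	/* ind_tbl is then passed to the extended ib_create_qp() so the
	 * device can hash each incoming packet to one of the four RQs. */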

Future extensions to this patch-set:
- Add an ib_modify_rwq_ind_table() verb to enable dynamic RQ mapping
  changes.
- Reflect the RSS capabilities via the query device verb.
- User space support (i.e. libibverbs/vendor drivers) to expose the new
  verbs and objects.

Patches:
#1:  Exposes the required APIs from mlx5_core to be used in coming
     patches by the mlx5_ib driver.
#2:  Introduces the Work Queue object and its verbs in the IB layer.
#3:  Adds uverbs support for the Work Queue verbs.
#4:  Implements the Work Queue verbs in the mlx5_ib driver.
#5:  Introduces the Receive Work Queue indirection table and its verbs
     in the IB layer.
#6:  Adds uverbs support for the Receive Work Queue indirection table
     verbs.
#7:  Implements the Receive Work Queue indirection table verbs in the
     mlx5_ib driver.
#8:  Extends create QP to accept an indirection table in the IB layer.
#9:  Extends create QP to accept an indirection table in the uverbs
     layer.
#10: Adds RSS QP support to the mlx5_ib driver.

Changes from V3:
Moved the hash properties from the common udata into vendor-specific
data.

Changes from V2:
- Improve the new verbs to enable clean future extensions in both uverbs
  and mlx5_ib layers.
- Add the last three patches to enable RSS QP creation.

Changes from V1:
Patch #2: Change ib_modify_wq to use u32 instead of enum for bit-wise
          values.
Patch #3: Improve usage of attr_mask/comp_mask.
Patch #4: Fix a driver issue in mlx5_ib on PPC.
Patch #6: Limit unexpected memory allocation.

Changes from V0:
Patch #2: Move the new verbs documentation into the C file and improve
          the commit message.
Patch #5: Move the new verbs documentation into the C file.

Yishai Hadas (10):
  net/mlx5: Export required core functions to support RSS
  IB/core: Introduce Work Queue object and its verbs
  IB/uverbs: Add WQ support
  IB/mlx5: Add receive Work Queue verbs
  IB/core: Introduce Receive Work Queue indirection table
  IB/uverbs: Introduce RWQ Indirection table
  IB/mlx5: Add Receive Work Queue Indirection table operations
  IB/core: Extend create QP to get indirection table
  IB/uverbs: Extend create QP to get RWQ indirection table
  IB/mlx5: Add RSS QP support

 drivers/infiniband/core/uverbs.h                   |  12 +
 drivers/infiniband/core/uverbs_cmd.c               | 528 ++++++++++++++++++-
 drivers/infiniband/core/uverbs_main.c              |  38 ++
 drivers/infiniband/core/verbs.c                    | 163 +++++-
 drivers/infiniband/hw/mlx5/main.c                  |  12 +-
 drivers/infiniband/hw/mlx5/mlx5_ib.h               |  54 ++
 drivers/infiniband/hw/mlx5/qp.c                    | 584 +++++++++++++++++++++
 drivers/infiniband/hw/mlx5/user.h                  |  64 +++
 drivers/net/ethernet/mellanox/mlx5/core/transobj.c |   4 +
 include/rdma/ib_verbs.h                            |  85 ++-
 include/uapi/rdma/ib_user_verbs.h                  |  77 +++
 11 files changed, 1605 insertions(+), 16 deletions(-)

-- 
1.8.3.1


* [PATCH V4 for-next 01/10] net/mlx5: Export required core functions to support RSS
From: Yishai Hadas @ 2016-05-23 12:20 UTC
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	yishaih-VPRAkNaXOzVWk0Htik3J/w, matanb-VPRAkNaXOzVWk0Htik3J/w,
	alexv-VPRAkNaXOzVWk0Htik3J/w, tzahio-VPRAkNaXOzVWk0Htik3J/w,
	majd-VPRAkNaXOzVWk0Htik3J/w, talal-VPRAkNaXOzVWk0Htik3J/w

In order to support RSS QPs, we need to create Ethernet-based objects.
This is done via the mlx5_core functions create_rq, destroy_rq,
create_rqt and destroy_rqt; export them.
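
mlx5_ib is built as a separate module from mlx5_core, hence the
exports. For example, the WQ verbs added in patch #4 end up calling:

	err = mlx5_core_create_rq(dev->mdev, in, inlen, &rwq->rqn);
	...
	mlx5_core_destroy_rq(dev->mdev, rwq->rqn);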

Signed-off-by: Yishai Hadas <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/net/ethernet/mellanox/mlx5/core/transobj.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/transobj.c b/drivers/net/ethernet/mellanox/mlx5/core/transobj.c
index 03a5093..28274a6 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/transobj.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/transobj.c
@@ -85,6 +85,7 @@ int mlx5_core_create_rq(struct mlx5_core_dev *dev, u32 *in, int inlen, u32 *rqn)
 
 	return err;
 }
+EXPORT_SYMBOL(mlx5_core_create_rq);
 
 int mlx5_core_modify_rq(struct mlx5_core_dev *dev, u32 rqn, u32 *in, int inlen)
 {
@@ -110,6 +111,7 @@ void mlx5_core_destroy_rq(struct mlx5_core_dev *dev, u32 rqn)
 
 	mlx5_cmd_exec_check_status(dev, in, sizeof(in), out, sizeof(out));
 }
+EXPORT_SYMBOL(mlx5_core_destroy_rq);
 
 int mlx5_core_query_rq(struct mlx5_core_dev *dev, u32 rqn, u32 *out)
 {
@@ -430,6 +432,7 @@ int mlx5_core_create_rqt(struct mlx5_core_dev *dev, u32 *in, int inlen,
 
 	return err;
 }
+EXPORT_SYMBOL(mlx5_core_create_rqt);
 
 int mlx5_core_modify_rqt(struct mlx5_core_dev *dev, u32 rqtn, u32 *in,
 			 int inlen)
@@ -455,3 +458,4 @@ void mlx5_core_destroy_rqt(struct mlx5_core_dev *dev, u32 rqtn)
 
 	mlx5_cmd_exec_check_status(dev, in, sizeof(in), out, sizeof(out));
 }
+EXPORT_SYMBOL(mlx5_core_destroy_rqt);
-- 
1.8.3.1


* [PATCH V4 for-next 02/10] IB/core: Introduce Work Queue object and its verbs
From: Yishai Hadas @ 2016-05-23 12:20 UTC
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	yishaih-VPRAkNaXOzVWk0Htik3J/w, matanb-VPRAkNaXOzVWk0Htik3J/w,
	alexv-VPRAkNaXOzVWk0Htik3J/w, tzahio-VPRAkNaXOzVWk0Htik3J/w,
	majd-VPRAkNaXOzVWk0Htik3J/w, talal-VPRAkNaXOzVWk0Htik3J/w

Introduce the Work Queue object and its create/destroy/modify verbs.

A QP can be created without internal WQs "packaged" inside it; such a
QP can be configured to use an "external" WQ object as its
receive/send queue.
The WQ is a necessary component for RSS, since the RSS mechanism is
supposed to distribute the traffic among multiple Receive Work Queues.

A WQ is associated (many to one) with a Completion Queue, and it owns
the WQ properties (PD, WQ size, etc.).
A WQ has a type; this patch introduces IB_WQT_RQ (i.e. receive queue),
and it may later be extended with others such as IB_WQT_SQ (send
queue).
A WQ of type IB_WQT_RQ contains receive work requests.

The PD is an attribute of a work queue (i.e. a send/receive queue); it
is used by the hardware for security validation before scattering into
a memory region that is pointed to by the WQ. For that reason, an
external WQ object needs a PD, letting the hardware make that
validation.

When accessing a memory region that is pointed to by the WQ, its PD
is used rather than the QP's PD; this behavior is similar to that of
an SRQ and a QP.

The WQ context is subject to well-defined state transitions performed
by the modify_wq verb.
When a WQ is created, its initial state is IB_WQS_RESET.
From IB_WQS_RESET it can be modified to itself or to IB_WQS_RDY.
From IB_WQS_RDY it can be modified to itself, to IB_WQS_RESET or to
IB_WQS_ERR.
From IB_WQS_ERR it can be modified to IB_WQS_RESET.

Note: the transition to IB_WQS_ERR might also occur implicitly in case
of a HW error.
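
For example, the minimal sequence for a kernel consumer to bring a
newly created WQ into service would be (sketch, error handling
omitted):

	struct ib_wq_attr wq_attr = {};
	int err;

	wq_attr.curr_wq_state = IB_WQS_RESET;
	wq_attr.wq_state = IB_WQS_RDY;
	err = ib_modify_wq(wq, &wq_attr, IB_WQ_CUR_STATE | IB_WQ_STATE);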

Signed-off-by: Yishai Hadas <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/core/verbs.c | 82 +++++++++++++++++++++++++++++++++++++++++
 include/rdma/ib_verbs.h         | 56 +++++++++++++++++++++++++++-
 2 files changed, 137 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index 1d7d4cf..c096cad 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -1554,6 +1554,88 @@ int ib_dealloc_xrcd(struct ib_xrcd *xrcd)
 }
 EXPORT_SYMBOL(ib_dealloc_xrcd);
 
+/**
+ * ib_create_wq - Creates a WQ associated with the specified protection
+ * domain.
+ * @pd: The protection domain associated with the WQ.
+ * @wq_init_attr: A list of initial attributes required to create the
+ * WQ. If WQ creation succeeds, then the attributes are updated to
+ * the actual capabilities of the created WQ.
+ *
+ * wq_init_attr->max_wr and wq_init_attr->max_sge determine
+ * the requested size of the WQ, and set to the actual values allocated
+ * on return.
+ * If ib_create_wq() succeeds, then max_wr and max_sge will always be
+ * at least as large as the requested values.
+ */
+struct ib_wq *ib_create_wq(struct ib_pd *pd,
+			   struct ib_wq_init_attr *wq_attr)
+{
+	struct ib_wq *wq;
+
+	if (!pd->device->create_wq)
+		return ERR_PTR(-ENOSYS);
+
+	wq = pd->device->create_wq(pd, wq_attr, NULL);
+	if (!IS_ERR(wq)) {
+		wq->event_handler = wq_attr->event_handler;
+		wq->wq_context = wq_attr->wq_context;
+		wq->wq_type = wq_attr->wq_type;
+		wq->cq = wq_attr->cq;
+		wq->device = pd->device;
+		wq->pd = pd;
+		wq->uobject = NULL;
+		atomic_inc(&pd->usecnt);
+		atomic_inc(&wq_attr->cq->usecnt);
+		atomic_set(&wq->usecnt, 0);
+	}
+	return wq;
+}
+EXPORT_SYMBOL(ib_create_wq);
+
+/**
+ * ib_destroy_wq - Destroys the specified WQ.
+ * @wq: The WQ to destroy.
+ */
+int ib_destroy_wq(struct ib_wq *wq)
+{
+	int err;
+	struct ib_cq *cq = wq->cq;
+	struct ib_pd *pd = wq->pd;
+
+	if (atomic_read(&wq->usecnt))
+		return -EBUSY;
+
+	err = wq->device->destroy_wq(wq);
+	if (!err) {
+		atomic_dec(&pd->usecnt);
+		atomic_dec(&cq->usecnt);
+	}
+	return err;
+}
+EXPORT_SYMBOL(ib_destroy_wq);
+
+/**
+ * ib_modify_wq - Modifies the specified WQ.
+ * @wq: The WQ to modify.
+ * @wq_attr: On input, specifies the WQ attributes to modify.
+ * @wq_attr_mask: A bit-mask used to specify which attributes of the WQ
+ *   are being modified.
+ * On output, the current values of selected WQ attributes are returned.
+ */
+int ib_modify_wq(struct ib_wq *wq, struct ib_wq_attr *wq_attr,
+		 u32 wq_attr_mask)
+{
+	int err;
+
+	if (!wq->device->modify_wq)
+		return -ENOSYS;
+
+	err = wq->device->modify_wq(wq, wq_attr, wq_attr_mask, NULL);
+	return err;
+}
+EXPORT_SYMBOL(ib_modify_wq);
+
 struct ib_flow *ib_create_flow(struct ib_qp *qp,
 			       struct ib_flow_attr *flow_attr,
 			       int domain)
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index fc0320c..e58c1a4 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1429,6 +1429,48 @@ struct ib_srq {
 	} ext;
 };
 
+enum ib_wq_type {
+	IB_WQT_RQ
+};
+
+enum ib_wq_state {
+	IB_WQS_RESET,
+	IB_WQS_RDY,
+	IB_WQS_ERR
+};
+
+struct ib_wq {
+	struct ib_device       *device;
+	struct ib_uobject      *uobject;
+	void		    *wq_context;
+	void		    (*event_handler)(struct ib_event *, void *);
+	struct ib_pd	       *pd;
+	struct ib_cq	       *cq;
+	u32		wq_num;
+	enum ib_wq_state       state;
+	enum ib_wq_type	wq_type;
+	atomic_t		usecnt;
+};
+
+struct ib_wq_init_attr {
+	void		       *wq_context;
+	enum ib_wq_type	wq_type;
+	u32		max_wr;
+	u32		max_sge;
+	struct	ib_cq	       *cq;
+	void		    (*event_handler)(struct ib_event *, void *);
+};
+
+enum ib_wq_attr_mask {
+	IB_WQ_STATE	= 1 << 0,
+	IB_WQ_CUR_STATE	= 1 << 1,
+};
+
+struct ib_wq_attr {
+	enum	ib_wq_state	wq_state;
+	enum	ib_wq_state	curr_wq_state;
+};
+
 struct ib_qp {
 	struct ib_device       *device;
 	struct ib_pd	       *pd;
@@ -1901,7 +1943,14 @@ struct ib_device {
 						   struct ifla_vf_stats *stats);
 	int			   (*set_vf_guid)(struct ib_device *device, int vf, u8 port, u64 guid,
 						  int type);
-
+	struct ib_wq *		   (*create_wq)(struct ib_pd *pd,
+						struct ib_wq_init_attr *init_attr,
+						struct ib_udata *udata);
+	int			   (*destroy_wq)(struct ib_wq *wq);
+	int			   (*modify_wq)(struct ib_wq *wq,
+						struct ib_wq_attr *attr,
+						u32 wq_attr_mask,
+						struct ib_udata *udata);
 	struct ib_dma_mapping_ops   *dma_ops;
 
 	struct module               *owner;
@@ -3145,6 +3194,11 @@ int ib_check_mr_status(struct ib_mr *mr, u32 check_mask,
 struct net_device *ib_get_net_dev_by_params(struct ib_device *dev, u8 port,
 					    u16 pkey, const union ib_gid *gid,
 					    const struct sockaddr *addr);
+struct ib_wq *ib_create_wq(struct ib_pd *pd,
+			   struct ib_wq_init_attr *init_attr);
+int ib_destroy_wq(struct ib_wq *wq);
+int ib_modify_wq(struct ib_wq *wq, struct ib_wq_attr *attr,
+		 u32 wq_attr_mask);
 
 int ib_map_mr_sg(struct ib_mr *mr, struct scatterlist *sg, int sg_nents,
 		 unsigned int *sg_offset, unsigned int page_size);
-- 
1.8.3.1


* [PATCH V4 for-next 03/10] IB/uverbs: Add WQ support
From: Yishai Hadas @ 2016-05-23 12:20 UTC
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	yishaih-VPRAkNaXOzVWk0Htik3J/w, matanb-VPRAkNaXOzVWk0Htik3J/w,
	alexv-VPRAkNaXOzVWk0Htik3J/w, tzahio-VPRAkNaXOzVWk0Htik3J/w,
	majd-VPRAkNaXOzVWk0Htik3J/w, talal-VPRAkNaXOzVWk0Htik3J/w

User space applications that use the RSS functionality need to create
a work queue object (WQ). The lifetime of such an object is:
 * Create a WQ.
 * Modify the WQ from the RESET to the ready (RDY) state.
 * Use the WQ (enabled by downstream patches).
 * Destroy the WQ.

These commands are added to the uverbs API.
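
The new commands follow the extensible uverbs command convention: user
space may pass a larger command than the kernel knows about, provided
the unknown tail bytes are zero. In condensed form, the validation
pattern shared by the three handlers below is:

	if (ucore->inlen < required_cmd_sz)
		return -EINVAL;		/* too short to be valid */

	if (ucore->inlen > sizeof(cmd) &&
	    !ib_is_udata_cleared(ucore, sizeof(cmd),
				 ucore->inlen - sizeof(cmd)))
		return -EOPNOTSUPP;	/* unknown non-zero tail */

	err = ib_copy_from_udata(&cmd, ucore, min(sizeof(cmd), ucore->inlen));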

Signed-off-by: Yishai Hadas <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/core/uverbs.h      |   9 ++
 drivers/infiniband/core/uverbs_cmd.c  | 243 ++++++++++++++++++++++++++++++++++
 drivers/infiniband/core/uverbs_main.c |  25 ++++
 include/rdma/ib_verbs.h               |   3 +
 include/uapi/rdma/ib_user_verbs.h     |  41 ++++++
 5 files changed, 321 insertions(+)

diff --git a/drivers/infiniband/core/uverbs.h b/drivers/infiniband/core/uverbs.h
index 612ccfd..74776c6 100644
--- a/drivers/infiniband/core/uverbs.h
+++ b/drivers/infiniband/core/uverbs.h
@@ -162,6 +162,10 @@ struct ib_uqp_object {
 	struct ib_uxrcd_object *uxrcd;
 };
 
+struct ib_uwq_object {
+	struct ib_uevent_object	uevent;
+};
+
 struct ib_ucq_object {
 	struct ib_uobject	uobject;
 	struct ib_uverbs_file  *uverbs_file;
@@ -181,6 +185,7 @@ extern struct idr ib_uverbs_qp_idr;
 extern struct idr ib_uverbs_srq_idr;
 extern struct idr ib_uverbs_xrcd_idr;
 extern struct idr ib_uverbs_rule_idr;
+extern struct idr ib_uverbs_wq_idr;
 
 void idr_remove_uobj(struct idr *idp, struct ib_uobject *uobj);
 
@@ -199,6 +204,7 @@ void ib_uverbs_release_uevent(struct ib_uverbs_file *file,
 void ib_uverbs_comp_handler(struct ib_cq *cq, void *cq_context);
 void ib_uverbs_cq_event_handler(struct ib_event *event, void *context_ptr);
 void ib_uverbs_qp_event_handler(struct ib_event *event, void *context_ptr);
+void ib_uverbs_wq_event_handler(struct ib_event *event, void *context_ptr);
 void ib_uverbs_srq_event_handler(struct ib_event *event, void *context_ptr);
 void ib_uverbs_event_handler(struct ib_event_handler *handler,
 			     struct ib_event *event);
@@ -275,5 +281,8 @@ IB_UVERBS_DECLARE_EX_CMD(destroy_flow);
 IB_UVERBS_DECLARE_EX_CMD(query_device);
 IB_UVERBS_DECLARE_EX_CMD(create_cq);
 IB_UVERBS_DECLARE_EX_CMD(create_qp);
+IB_UVERBS_DECLARE_EX_CMD(create_wq);
+IB_UVERBS_DECLARE_EX_CMD(modify_wq);
+IB_UVERBS_DECLARE_EX_CMD(destroy_wq);
 
 #endif /* UVERBS_H */
diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index 1a8babb..22e6173 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -57,6 +57,7 @@ static struct uverbs_lock_class ah_lock_class	= { .name = "AH-uobj" };
 static struct uverbs_lock_class srq_lock_class	= { .name = "SRQ-uobj" };
 static struct uverbs_lock_class xrcd_lock_class = { .name = "XRCD-uobj" };
 static struct uverbs_lock_class rule_lock_class = { .name = "RULE-uobj" };
+static struct uverbs_lock_class wq_lock_class = { .name = "WQ-uobj" };
 
 /*
  * The ib_uobject locking scheme is as follows:
@@ -243,6 +244,16 @@ static struct ib_qp *idr_read_qp(int qp_handle, struct ib_ucontext *context)
 	return idr_read_obj(&ib_uverbs_qp_idr, qp_handle, context, 0);
 }
 
+static struct ib_wq *idr_read_wq(int wq_handle, struct ib_ucontext *context)
+{
+	return idr_read_obj(&ib_uverbs_wq_idr, wq_handle, context, 0);
+}
+
+static void put_wq_read(struct ib_wq *wq)
+{
+	put_uobj_read(wq->uobject);
+}
+
 static struct ib_qp *idr_write_qp(int qp_handle, struct ib_ucontext *context)
 {
 	struct ib_uobject *uobj;
@@ -326,6 +337,7 @@ ssize_t ib_uverbs_get_context(struct ib_uverbs_file *file,
 	INIT_LIST_HEAD(&ucontext->qp_list);
 	INIT_LIST_HEAD(&ucontext->srq_list);
 	INIT_LIST_HEAD(&ucontext->ah_list);
+	INIT_LIST_HEAD(&ucontext->wq_list);
 	INIT_LIST_HEAD(&ucontext->xrcd_list);
 	INIT_LIST_HEAD(&ucontext->rule_list);
 	rcu_read_lock();
@@ -3056,6 +3068,237 @@ static int kern_spec_to_ib_spec(struct ib_uverbs_flow_spec *kern_spec,
 	return 0;
 }
 
+int ib_uverbs_ex_create_wq(struct ib_uverbs_file *file,
+			   struct ib_device *ib_dev,
+			   struct ib_udata *ucore,
+			   struct ib_udata *uhw)
+{
+	struct ib_uverbs_ex_create_wq	  cmd = {};
+	struct ib_uverbs_ex_create_wq_resp resp = {};
+	struct ib_uwq_object           *obj;
+	int err = 0;
+	struct ib_cq *cq;
+	struct ib_pd *pd;
+	struct ib_wq *wq;
+	struct ib_wq_init_attr wq_init_attr = {};
+	size_t required_cmd_sz;
+	size_t required_resp_len;
+
+	required_cmd_sz = offsetof(typeof(cmd), max_sge) + sizeof(cmd.max_sge);
+	required_resp_len = offsetof(typeof(resp), wqn) + sizeof(resp.wqn);
+
+	if (ucore->inlen < required_cmd_sz)
+		return -EINVAL;
+
+	if (ucore->outlen < required_resp_len)
+		return -ENOSPC;
+
+	if (ucore->inlen > sizeof(cmd) &&
+	    !ib_is_udata_cleared(ucore, sizeof(cmd),
+				 ucore->inlen - sizeof(cmd)))
+		return -EOPNOTSUPP;
+
+	err = ib_copy_from_udata(&cmd, ucore, min(sizeof(cmd), ucore->inlen));
+	if (err)
+		return err;
+
+	if (cmd.comp_mask)
+		return -EOPNOTSUPP;
+
+	obj = kmalloc(sizeof(*obj), GFP_KERNEL);
+	if (!obj)
+		return -ENOMEM;
+
+	init_uobj(&obj->uevent.uobject, cmd.user_handle, file->ucontext,
+		  &wq_lock_class);
+	down_write(&obj->uevent.uobject.mutex);
+	pd  = idr_read_pd(cmd.pd_handle, file->ucontext);
+	if (!pd) {
+		err = -EINVAL;
+		goto err_uobj;
+	}
+
+	cq = idr_read_cq(cmd.cq_handle, file->ucontext, 0);
+	if (!cq) {
+		err = -EINVAL;
+		goto err_put_pd;
+	}
+
+	wq_init_attr.cq = cq;
+	wq_init_attr.max_sge = cmd.max_sge;
+	wq_init_attr.max_wr = cmd.max_wr;
+	wq_init_attr.wq_context = file;
+	wq_init_attr.wq_type = cmd.wq_type;
+	wq_init_attr.event_handler = ib_uverbs_wq_event_handler;
+	obj->uevent.events_reported = 0;
+	INIT_LIST_HEAD(&obj->uevent.event_list);
+	wq = pd->device->create_wq(pd, &wq_init_attr, uhw);
+	if (IS_ERR(wq)) {
+		err = PTR_ERR(wq);
+		goto err_put_cq;
+	}
+
+	wq->uobject = &obj->uevent.uobject;
+	obj->uevent.uobject.object = wq;
+	wq->wq_type = wq_init_attr.wq_type;
+	wq->cq = cq;
+	wq->pd = pd;
+	wq->device = pd->device;
+	wq->wq_context = wq_init_attr.wq_context;
+	atomic_set(&wq->usecnt, 0);
+	atomic_inc(&pd->usecnt);
+	atomic_inc(&cq->usecnt);
+	wq->uobject = &obj->uevent.uobject;
+	obj->uevent.uobject.object = wq;
+	err = idr_add_uobj(&ib_uverbs_wq_idr, &obj->uevent.uobject);
+	if (err)
+		goto destroy_wq;
+
+	memset(&resp, 0, sizeof(resp));
+	resp.wq_handle = obj->uevent.uobject.id;
+	resp.max_sge = wq_init_attr.max_sge;
+	resp.max_wr = wq_init_attr.max_wr;
+	resp.wqn = wq->wq_num;
+	resp.response_length = required_resp_len;
+	err = ib_copy_to_udata(ucore,
+			       &resp, resp.response_length);
+	if (err)
+		goto err_copy;
+
+	put_pd_read(pd);
+	put_cq_read(cq);
+
+	mutex_lock(&file->mutex);
+	list_add_tail(&obj->uevent.uobject.list, &file->ucontext->wq_list);
+	mutex_unlock(&file->mutex);
+
+	obj->uevent.uobject.live = 1;
+	up_write(&obj->uevent.uobject.mutex);
+	return 0;
+
+err_copy:
+	idr_remove_uobj(&ib_uverbs_wq_idr, &obj->uevent.uobject);
+destroy_wq:
+	ib_destroy_wq(wq);
+err_put_cq:
+	put_cq_read(cq);
+err_put_pd:
+	put_pd_read(pd);
+err_uobj:
+	put_uobj_write(&obj->uevent.uobject);
+
+	return err;
+}
+
+int ib_uverbs_ex_destroy_wq(struct ib_uverbs_file *file,
+			    struct ib_device *ib_dev,
+			    struct ib_udata *ucore,
+			    struct ib_udata *uhw)
+{
+	struct ib_uverbs_ex_destroy_wq	cmd = {};
+	struct ib_uverbs_ex_destroy_wq_resp	resp = {};
+	struct ib_wq			*wq;
+	struct ib_uobject		*uobj;
+	struct ib_uwq_object		*obj;
+	size_t required_cmd_sz;
+	size_t required_resp_len;
+	int				ret;
+
+	required_cmd_sz = offsetof(typeof(cmd), wq_handle) + sizeof(cmd.wq_handle);
+	required_resp_len = offsetof(typeof(resp), reserved) + sizeof(resp.reserved);
+
+	if (ucore->inlen < required_cmd_sz)
+		return -EINVAL;
+
+	if (ucore->outlen < required_resp_len)
+		return -ENOSPC;
+
+	if (ucore->inlen > sizeof(cmd) &&
+	    !ib_is_udata_cleared(ucore, sizeof(cmd),
+				 ucore->inlen - sizeof(cmd)))
+		return -EOPNOTSUPP;
+
+	ret = ib_copy_from_udata(&cmd, ucore, min(sizeof(cmd), ucore->inlen));
+	if (ret)
+		return ret;
+
+	if (cmd.comp_mask)
+		return -EOPNOTSUPP;
+
+	resp.response_length = required_resp_len;
+	uobj = idr_write_uobj(&ib_uverbs_wq_idr, cmd.wq_handle,
+			      file->ucontext);
+	if (!uobj)
+		return -EINVAL;
+
+	wq = uobj->object;
+	obj = container_of(uobj, struct ib_uwq_object, uevent.uobject);
+	ret = ib_destroy_wq(wq);
+	if (!ret)
+		uobj->live = 0;
+
+	put_uobj_write(uobj);
+	if (ret)
+		return ret;
+
+	idr_remove_uobj(&ib_uverbs_wq_idr, uobj);
+
+	mutex_lock(&file->mutex);
+	list_del(&uobj->list);
+	mutex_unlock(&file->mutex);
+
+	ib_uverbs_release_uevent(file, &obj->uevent);
+	resp.events_reported = obj->uevent.events_reported;
+	put_uobj(uobj);
+
+	ret = ib_copy_to_udata(ucore, &resp, resp.response_length);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
+int ib_uverbs_ex_modify_wq(struct ib_uverbs_file *file,
+			   struct ib_device *ib_dev,
+			   struct ib_udata *ucore,
+			   struct ib_udata *uhw)
+{
+	struct ib_uverbs_ex_modify_wq cmd = {};
+	struct ib_wq *wq;
+	struct ib_wq_attr wq_attr = {};
+	size_t required_cmd_sz;
+	int ret;
+
+	required_cmd_sz = offsetof(typeof(cmd), curr_wq_state) + sizeof(cmd.curr_wq_state);
+	if (ucore->inlen < required_cmd_sz)
+		return -EINVAL;
+
+	if (ucore->inlen > sizeof(cmd) &&
+	    !ib_is_udata_cleared(ucore, sizeof(cmd),
+				 ucore->inlen - sizeof(cmd)))
+		return -EOPNOTSUPP;
+
+	ret = ib_copy_from_udata(&cmd, ucore, min(sizeof(cmd), ucore->inlen));
+	if (ret)
+		return ret;
+
+	if (!cmd.attr_mask)
+		return -EINVAL;
+
+	if (cmd.attr_mask > (IB_WQ_STATE | IB_WQ_CUR_STATE))
+		return -EINVAL;
+
+	wq = idr_read_wq(cmd.wq_handle, file->ucontext);
+	if (!wq)
+		return -EINVAL;
+
+	wq_attr.curr_wq_state = cmd.curr_wq_state;
+	wq_attr.wq_state = cmd.wq_state;
+	ret = wq->device->modify_wq(wq, &wq_attr, cmd.attr_mask, uhw);
+	put_wq_read(wq);
+	return ret;
+}
+
 int ib_uverbs_ex_create_flow(struct ib_uverbs_file *file,
 			     struct ib_device *ib_dev,
 			     struct ib_udata *ucore,
diff --git a/drivers/infiniband/core/uverbs_main.c b/drivers/infiniband/core/uverbs_main.c
index 31f422a..91cb36f 100644
--- a/drivers/infiniband/core/uverbs_main.c
+++ b/drivers/infiniband/core/uverbs_main.c
@@ -76,6 +76,7 @@ DEFINE_IDR(ib_uverbs_qp_idr);
 DEFINE_IDR(ib_uverbs_srq_idr);
 DEFINE_IDR(ib_uverbs_xrcd_idr);
 DEFINE_IDR(ib_uverbs_rule_idr);
+DEFINE_IDR(ib_uverbs_wq_idr);
 
 static DEFINE_SPINLOCK(map_lock);
 static DECLARE_BITMAP(dev_map, IB_UVERBS_MAX_DEVICES);
@@ -130,6 +131,9 @@ static int (*uverbs_ex_cmd_table[])(struct ib_uverbs_file *file,
 	[IB_USER_VERBS_EX_CMD_QUERY_DEVICE]	= ib_uverbs_ex_query_device,
 	[IB_USER_VERBS_EX_CMD_CREATE_CQ]	= ib_uverbs_ex_create_cq,
 	[IB_USER_VERBS_EX_CMD_CREATE_QP]        = ib_uverbs_ex_create_qp,
+	[IB_USER_VERBS_EX_CMD_CREATE_WQ]        = ib_uverbs_ex_create_wq,
+	[IB_USER_VERBS_EX_CMD_MODIFY_WQ]        = ib_uverbs_ex_modify_wq,
+	[IB_USER_VERBS_EX_CMD_DESTROY_WQ]       = ib_uverbs_ex_destroy_wq,
 };
 
 static void ib_uverbs_add_one(struct ib_device *device);
@@ -265,6 +269,17 @@ static int ib_uverbs_cleanup_ucontext(struct ib_uverbs_file *file,
 		kfree(uqp);
 	}
 
+	list_for_each_entry_safe(uobj, tmp, &context->wq_list, list) {
+		struct ib_wq *wq = uobj->object;
+		struct ib_uwq_object *uwq =
+			container_of(uobj, struct ib_uwq_object, uevent.uobject);
+
+		idr_remove_uobj(&ib_uverbs_wq_idr, uobj);
+		ib_destroy_wq(wq);
+		ib_uverbs_release_uevent(file, &uwq->uevent);
+		kfree(uwq);
+	}
+
 	list_for_each_entry_safe(uobj, tmp, &context->srq_list, list) {
 		struct ib_srq *srq = uobj->object;
 		struct ib_uevent_object *uevent =
@@ -568,6 +583,16 @@ void ib_uverbs_qp_event_handler(struct ib_event *event, void *context_ptr)
 				&uobj->events_reported);
 }
 
+void ib_uverbs_wq_event_handler(struct ib_event *event, void *context_ptr)
+{
+	struct ib_uevent_object *uobj = container_of(event->element.wq->uobject,
+						  struct ib_uevent_object, uobject);
+
+	ib_uverbs_async_handler(context_ptr, uobj->uobject.user_handle,
+				event->event, &uobj->event_list,
+				&uobj->events_reported);
+}
+
 void ib_uverbs_srq_event_handler(struct ib_event *event, void *context_ptr)
 {
 	struct ib_uevent_object *uobj;
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index e58c1a4..fc0f817 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -563,6 +563,7 @@ enum ib_event_type {
 	IB_EVENT_QP_LAST_WQE_REACHED,
 	IB_EVENT_CLIENT_REREGISTER,
 	IB_EVENT_GID_CHANGE,
+	IB_EVENT_WQ_FATAL,
 };
 
 const char *__attribute_const__ ib_event_msg(enum ib_event_type event);
@@ -573,6 +574,7 @@ struct ib_event {
 		struct ib_cq	*cq;
 		struct ib_qp	*qp;
 		struct ib_srq	*srq;
+		struct ib_wq	*wq;
 		u8		port_num;
 	} element;
 	enum ib_event_type	event;
@@ -1324,6 +1326,7 @@ struct ib_ucontext {
 	struct list_head	ah_list;
 	struct list_head	xrcd_list;
 	struct list_head	rule_list;
+	struct list_head	wq_list;
 	int			closing;
 
 	struct pid             *tgid;
diff --git a/include/uapi/rdma/ib_user_verbs.h b/include/uapi/rdma/ib_user_verbs.h
index b6543d7..c9470e5 100644
--- a/include/uapi/rdma/ib_user_verbs.h
+++ b/include/uapi/rdma/ib_user_verbs.h
@@ -95,6 +95,9 @@ enum {
 	IB_USER_VERBS_EX_CMD_CREATE_QP = IB_USER_VERBS_CMD_CREATE_QP,
 	IB_USER_VERBS_EX_CMD_CREATE_FLOW = IB_USER_VERBS_CMD_THRESHOLD,
 	IB_USER_VERBS_EX_CMD_DESTROY_FLOW,
+	IB_USER_VERBS_EX_CMD_CREATE_WQ,
+	IB_USER_VERBS_EX_CMD_MODIFY_WQ,
+	IB_USER_VERBS_EX_CMD_DESTROY_WQ,
 };
 
 /*
@@ -946,4 +949,42 @@ struct ib_uverbs_destroy_srq_resp {
 	__u32 events_reported;
 };
 
+struct ib_uverbs_ex_create_wq  {
+	__u32 comp_mask;
+	__u32 wq_type;
+	__u64 user_handle;
+	__u32 pd_handle;
+	__u32 cq_handle;
+	__u32 max_wr;
+	__u32 max_sge;
+};
+
+struct ib_uverbs_ex_create_wq_resp {
+	__u32 comp_mask;
+	__u32 response_length;
+	__u32 wq_handle;
+	__u32 max_wr;
+	__u32 max_sge;
+	__u32 wqn;
+};
+
+struct ib_uverbs_ex_destroy_wq  {
+	__u32 comp_mask;
+	__u32 wq_handle;
+};
+
+struct ib_uverbs_ex_destroy_wq_resp {
+	__u32 comp_mask;
+	__u32 response_length;
+	__u32 events_reported;
+	__u32 reserved;
+};
+
+struct ib_uverbs_ex_modify_wq  {
+	__u32 attr_mask;
+	__u32 wq_handle;
+	__u32 wq_state;
+	__u32 curr_wq_state;
+};
+
 #endif /* IB_USER_VERBS_H */
-- 
1.8.3.1


* [PATCH V4 for-next 04/10] IB/mlx5: Add receive Work Queue verbs
From: Yishai Hadas @ 2016-05-23 12:20 UTC
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	yishaih-VPRAkNaXOzVWk0Htik3J/w, matanb-VPRAkNaXOzVWk0Htik3J/w,
	alexv-VPRAkNaXOzVWk0Htik3J/w, tzahio-VPRAkNaXOzVWk0Htik3J/w,
	majd-VPRAkNaXOzVWk0Htik3J/w, talal-VPRAkNaXOzVWk0Htik3J/w

A QP can be created without internal WQs "packaged" inside it; such a
QP can be configured to use an "external" WQ object as its
receive/send queue.

The WQ is a necessary component for RSS, since the RSS mechanism is
supposed to distribute the traffic among multiple Receive Work Queues.

Receive WQs are implemented by RQs.

Implement the WQ creation, modification and destruction verbs.
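
The RQ buffer geometry comes from user space through the vendor
channel (struct mlx5_ib_create_wq below). In sketch form, the size
derivation done in set_user_rq_size() is:

	rwq->wqe_count	 = ucmd->rq_wqe_count;	/* must be non-zero */
	rwq->wqe_shift	 = ucmd->rq_wqe_shift;	/* log2 of the WQE stride */
	rwq->buf_size	 = rwq->wqe_count << rwq->wqe_shift;
	rwq->log_rq_size = ilog2(rwq->wqe_count);

The resulting buf_size is the number of bytes pinned via ib_umem_get()
in create_user_rq().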

Signed-off-by: Yishai Hadas <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/hw/mlx5/main.c    |   8 +-
 drivers/infiniband/hw/mlx5/mlx5_ib.h |  35 ++++
 drivers/infiniband/hw/mlx5/qp.c      | 306 +++++++++++++++++++++++++++++++++++
 drivers/infiniband/hw/mlx5/user.h    |  25 +++
 4 files changed, 373 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index 1b04df7..1c6036e 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -2441,9 +2441,15 @@ static void *mlx5_ib_add(struct mlx5_core_dev *mdev)
 	    IB_LINK_LAYER_ETHERNET) {
 		dev->ib_dev.create_flow	= mlx5_ib_create_flow;
 		dev->ib_dev.destroy_flow = mlx5_ib_destroy_flow;
+		dev->ib_dev.create_wq	 = mlx5_ib_create_wq;
+		dev->ib_dev.modify_wq	 = mlx5_ib_modify_wq;
+		dev->ib_dev.destroy_wq	 = mlx5_ib_destroy_wq;
 		dev->ib_dev.uverbs_ex_cmd_mask |=
 			(1ull << IB_USER_VERBS_EX_CMD_CREATE_FLOW) |
-			(1ull << IB_USER_VERBS_EX_CMD_DESTROY_FLOW);
+			(1ull << IB_USER_VERBS_EX_CMD_DESTROY_FLOW) |
+			(1ull << IB_USER_VERBS_EX_CMD_CREATE_WQ) |
+			(1ull << IB_USER_VERBS_EX_CMD_MODIFY_WQ) |
+			(1ull << IB_USER_VERBS_EX_CMD_DESTROY_WQ);
 	}
 	err = init_node_data(dev);
 	if (err)
diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
index c4a9825..62d4e13 100644
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
+++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
@@ -217,12 +217,36 @@ struct mlx5_ib_wq {
 	void		       *qend;
 };
 
+struct mlx5_ib_rwq {
+	struct ib_wq		ibwq;
+	u32			rqn;
+	u32			rq_num_pas;
+	u32			log_rq_stride;
+	u32			log_rq_size;
+	u32			rq_page_offset;
+	u32			log_page_size;
+	struct ib_umem		*umem;
+	size_t			buf_size;
+	unsigned int		page_shift;
+	int			create_type;
+	struct mlx5_db		db;
+	u32			user_index;
+	u32			wqe_count;
+	u32			wqe_shift;
+	int			wq_sig;
+};
+
 enum {
 	MLX5_QP_USER,
 	MLX5_QP_KERNEL,
 	MLX5_QP_EMPTY
 };
 
+enum {
+	MLX5_WQ_USER,
+	MLX5_WQ_KERNEL
+};
+
 /*
  * Connect-IB can trigger up to four concurrent pagefaults
  * per-QP.
@@ -628,6 +652,11 @@ static inline struct mlx5_ib_qp *to_mqp(struct ib_qp *ibqp)
 	return container_of(ibqp, struct mlx5_ib_qp, ibqp);
 }
 
+static inline struct mlx5_ib_rwq *to_mrwq(struct ib_wq *ibwq)
+{
+	return container_of(ibwq, struct mlx5_ib_rwq, ibwq);
+}
+
 static inline struct mlx5_ib_srq *to_mibsrq(struct mlx5_core_srq *msrq)
 {
 	return container_of(msrq, struct mlx5_ib_srq, msrq);
@@ -762,6 +791,12 @@ int mlx5_mr_cache_cleanup(struct mlx5_ib_dev *dev);
 int mlx5_mr_ib_cont_pages(struct ib_umem *umem, u64 addr, int *count, int *shift);
 int mlx5_ib_check_mr_status(struct ib_mr *ibmr, u32 check_mask,
 			    struct ib_mr_status *mr_status);
+struct ib_wq *mlx5_ib_create_wq(struct ib_pd *pd,
+				struct ib_wq_init_attr *init_attr,
+				struct ib_udata *udata);
+int mlx5_ib_destroy_wq(struct ib_wq *wq);
+int mlx5_ib_modify_wq(struct ib_wq *wq, struct ib_wq_attr *wq_attr,
+		      u32 wq_attr_mask, struct ib_udata *udata);
 
 #ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
 extern struct workqueue_struct *mlx5_ib_page_fault_wq;
diff --git a/drivers/infiniband/hw/mlx5/qp.c b/drivers/infiniband/hw/mlx5/qp.c
index 5041176..9ce4b3a 100644
--- a/drivers/infiniband/hw/mlx5/qp.c
+++ b/drivers/infiniband/hw/mlx5/qp.c
@@ -647,6 +647,71 @@ err_umem:
 	return err;
 }
 
+static void destroy_user_rq(struct ib_pd *pd, struct mlx5_ib_rwq *rwq)
+{
+	struct mlx5_ib_ucontext *context;
+
+	context = to_mucontext(pd->uobject->context);
+	mlx5_ib_db_unmap_user(context, &rwq->db);
+	if (rwq->umem)
+		ib_umem_release(rwq->umem);
+}
+
+static int create_user_rq(struct mlx5_ib_dev *dev, struct ib_pd *pd,
+			  struct mlx5_ib_rwq *rwq,
+			  struct mlx5_ib_create_wq *ucmd)
+{
+	struct mlx5_ib_ucontext *context;
+	int page_shift = 0;
+	int npages;
+	u32 offset = 0;
+	int ncont = 0;
+	int err;
+
+	if (!ucmd->buf_addr)
+		return -EINVAL;
+
+	context = to_mucontext(pd->uobject->context);
+	rwq->umem = ib_umem_get(pd->uobject->context, ucmd->buf_addr,
+			       rwq->buf_size, 0, 0);
+	if (IS_ERR(rwq->umem)) {
+		mlx5_ib_dbg(dev, "umem_get failed\n");
+		err = PTR_ERR(rwq->umem);
+		return err;
+	}
+
+	mlx5_ib_cont_pages(rwq->umem, ucmd->buf_addr, &npages, &page_shift,
+			   &ncont, NULL);
+	err = mlx5_ib_get_buf_offset(ucmd->buf_addr, page_shift,
+				     &rwq->rq_page_offset);
+	if (err) {
+		mlx5_ib_warn(dev, "bad offset\n");
+		goto err_umem;
+	}
+
+	rwq->rq_num_pas = ncont;
+	rwq->page_shift = page_shift;
+	rwq->log_page_size =  page_shift - MLX5_ADAPTER_PAGE_SHIFT;
+	rwq->wq_sig = !!(ucmd->flags & MLX5_WQ_FLAG_SIGNATURE);
+
+	mlx5_ib_dbg(dev, "addr 0x%llx, size %zd, npages %d, page_shift %d, ncont %d, offset %d\n",
+		    (unsigned long long)ucmd->buf_addr, rwq->buf_size,
+		    npages, page_shift, ncont, offset);
+
+	err = mlx5_ib_db_map_user(context, ucmd->db_addr, &rwq->db);
+	if (err) {
+		mlx5_ib_dbg(dev, "map failed\n");
+		goto err_umem;
+	}
+
+	rwq->create_type = MLX5_WQ_USER;
+	return 0;
+
+err_umem:
+	ib_umem_release(rwq->umem);
+	return err;
+}
+
 static int create_user_qp(struct mlx5_ib_dev *dev, struct ib_pd *pd,
 			  struct mlx5_ib_qp *qp, struct ib_udata *udata,
 			  struct ib_qp_init_attr *attr,
@@ -4154,3 +4219,244 @@ int mlx5_ib_dealloc_xrcd(struct ib_xrcd *xrcd)
 
 	return 0;
 }
+
+static int  create_rq(struct mlx5_ib_rwq *rwq, struct ib_pd *pd,
+		      struct ib_wq_init_attr *init_attr)
+{
+	struct mlx5_ib_dev *dev;
+	__be64 *rq_pas0;
+	void *in;
+	void *rqc;
+	void *wq;
+	int inlen;
+	int err;
+
+	dev = to_mdev(pd->device);
+
+	inlen = MLX5_ST_SZ_BYTES(create_rq_in) + sizeof(u64) * rwq->rq_num_pas;
+	in = mlx5_vzalloc(inlen);
+	if (!in)
+		return -ENOMEM;
+
+	rqc = MLX5_ADDR_OF(create_rq_in, in, ctx);
+	MLX5_SET(rqc,  rqc, mem_rq_type,
+		 MLX5_RQC_MEM_RQ_TYPE_MEMORY_RQ_INLINE);
+	MLX5_SET(rqc, rqc, user_index, rwq->user_index);
+	MLX5_SET(rqc,  rqc, cqn, to_mcq(init_attr->cq)->mcq.cqn);
+	MLX5_SET(rqc,  rqc, state, MLX5_RQC_STATE_RST);
+	MLX5_SET(rqc,  rqc, flush_in_error_en, 1);
+	wq = MLX5_ADDR_OF(rqc, rqc, wq);
+	MLX5_SET(wq, wq, wq_type, MLX5_WQ_TYPE_CYCLIC);
+	MLX5_SET(wq, wq, end_padding_mode, MLX5_WQ_END_PAD_MODE_ALIGN);
+	MLX5_SET(wq, wq, log_wq_stride, rwq->log_rq_stride);
+	MLX5_SET(wq, wq, log_wq_sz, rwq->log_rq_size);
+	MLX5_SET(wq, wq, pd, to_mpd(pd)->pdn);
+	MLX5_SET(wq, wq, page_offset, rwq->rq_page_offset);
+	MLX5_SET(wq, wq, log_wq_pg_sz, rwq->log_page_size);
+	MLX5_SET(wq, wq, wq_signature, rwq->wq_sig);
+	MLX5_SET64(wq, wq, dbr_addr, rwq->db.dma);
+	rq_pas0 = (__be64 *)MLX5_ADDR_OF(wq, wq, pas);
+	mlx5_ib_populate_pas(dev, rwq->umem, rwq->page_shift, rq_pas0, 0);
+	err = mlx5_core_create_rq(dev->mdev, in, inlen, &rwq->rqn);
+	kvfree(in);
+	return err;
+}
+
+static int set_user_rq_size(struct mlx5_ib_dev *dev,
+			    struct ib_wq_init_attr *wq_init_attr,
+			    struct mlx5_ib_create_wq *ucmd,
+			    struct mlx5_ib_rwq *rwq)
+{
+	/* Sanity check RQ size before proceeding */
+	if (wq_init_attr->max_wr > (1 << MLX5_CAP_GEN(dev->mdev, log_max_wq_sz)))
+		return -EINVAL;
+
+	if (!ucmd->rq_wqe_count)
+		return -EINVAL;
+
+	rwq->wqe_count = ucmd->rq_wqe_count;
+	rwq->wqe_shift = ucmd->rq_wqe_shift;
+	rwq->buf_size = (rwq->wqe_count << rwq->wqe_shift);
+	rwq->log_rq_stride = rwq->wqe_shift;
+	rwq->log_rq_size = ilog2(rwq->wqe_count);
+	return 0;
+}
+
+static int prepare_user_rq(struct ib_pd *pd,
+			   struct ib_wq_init_attr *init_attr,
+			   struct ib_udata *udata,
+			   struct mlx5_ib_rwq *rwq)
+{
+	struct mlx5_ib_dev *dev = to_mdev(pd->device);
+	struct mlx5_ib_create_wq ucmd = {};
+	int err;
+	size_t required_cmd_sz;
+
+	required_cmd_sz = offsetof(typeof(ucmd), reserved) + sizeof(ucmd.reserved);
+	if (udata->inlen < required_cmd_sz) {
+		mlx5_ib_dbg(dev, "invalid inlen\n");
+		return -EINVAL;
+	}
+
+	if (udata->inlen > sizeof(ucmd) &&
+	    !ib_is_udata_cleared(udata, sizeof(ucmd),
+				 udata->inlen - sizeof(ucmd))) {
+		mlx5_ib_dbg(dev, "inlen is not supported\n");
+		return -EOPNOTSUPP;
+	}
+
+	if (ib_copy_from_udata(&ucmd, udata, min(sizeof(ucmd), udata->inlen))) {
+		mlx5_ib_dbg(dev, "copy failed\n");
+		return -EFAULT;
+	}
+
+	if (ucmd.comp_mask) {
+		mlx5_ib_dbg(dev, "invalid comp mask\n");
+		return -EOPNOTSUPP;
+	}
+
+	if (ucmd.reserved) {
+		mlx5_ib_dbg(dev, "invalid reserved\n");
+		return -EOPNOTSUPP;
+	}
+
+	err = set_user_rq_size(dev, init_attr, &ucmd, rwq);
+	if (err) {
+		mlx5_ib_dbg(dev, "err %d\n", err);
+		return err;
+	}
+
+	err = create_user_rq(dev, pd, rwq, &ucmd);
+	if (err) {
+		mlx5_ib_dbg(dev, "err %d\n", err);
+		if (err)
+			return err;
+	}
+
+	rwq->user_index = ucmd.user_index;
+	return 0;
+}
+
+struct ib_wq *mlx5_ib_create_wq(struct ib_pd *pd,
+				struct ib_wq_init_attr *init_attr,
+				struct ib_udata *udata)
+{
+	struct mlx5_ib_dev *dev;
+	struct mlx5_ib_rwq *rwq;
+	struct mlx5_ib_create_wq_resp resp = {};
+	size_t min_resp_len;
+	int err;
+
+	if (!udata)
+		return ERR_PTR(-ENOSYS);
+
+	min_resp_len = offsetof(typeof(resp), reserved) + sizeof(resp.reserved);
+	if (udata->outlen && udata->outlen < min_resp_len)
+		return ERR_PTR(-EINVAL);
+
+	dev = to_mdev(pd->device);
+	switch (init_attr->wq_type) {
+	case IB_WQT_RQ:
+		rwq = kzalloc(sizeof(*rwq), GFP_KERNEL);
+		if (!rwq)
+			return ERR_PTR(-ENOMEM);
+		err = prepare_user_rq(pd, init_attr, udata, rwq);
+		if (err)
+			goto err;
+		err = create_rq(rwq, pd, init_attr);
+		if (err)
+			goto err_user_rq;
+		break;
+	default:
+		mlx5_ib_dbg(dev, "unsupported wq type %d\n",
+			    init_attr->wq_type);
+		return ERR_PTR(-EINVAL);
+	}
+
+	rwq->ibwq.wq_num = rwq->rqn;
+	rwq->ibwq.state = IB_WQS_RESET;
+	if (udata->outlen) {
+		resp.response_length = offsetof(typeof(resp), response_length) +
+				sizeof(resp.response_length);
+		err = ib_copy_to_udata(udata, &resp, resp.response_length);
+		if (err)
+			goto err_copy;
+	}
+
+	return &rwq->ibwq;
+
+err_copy:
+	mlx5_core_destroy_rq(dev->mdev, rwq->rqn);
+err_user_rq:
+	destroy_user_rq(pd, rwq);
+err:
+	kfree(rwq);
+	return ERR_PTR(err);
+}
+
+int mlx5_ib_destroy_wq(struct ib_wq *wq)
+{
+	struct mlx5_ib_dev *dev = to_mdev(wq->device);
+	struct mlx5_ib_rwq *rwq = to_mrwq(wq);
+
+	mlx5_core_destroy_rq(dev->mdev, rwq->rqn);
+	destroy_user_rq(wq->pd, rwq);
+	kfree(rwq);
+
+	return 0;
+}
+
+int mlx5_ib_modify_wq(struct ib_wq *wq, struct ib_wq_attr *wq_attr,
+		      u32 wq_attr_mask, struct ib_udata *udata)
+{
+	struct mlx5_ib_dev *dev = to_mdev(wq->device);
+	struct mlx5_ib_rwq *rwq = to_mrwq(wq);
+	struct mlx5_ib_modify_wq ucmd = {};
+	size_t required_cmd_sz;
+	int curr_wq_state;
+	int wq_state;
+	int inlen;
+	int err;
+	void *rqc;
+	void *in;
+
+	required_cmd_sz = offsetof(typeof(ucmd), reserved) + sizeof(ucmd.reserved);
+	if (udata->inlen < required_cmd_sz)
+		return -EINVAL;
+
+	if (udata->inlen > sizeof(ucmd) &&
+	    !ib_is_udata_cleared(udata, sizeof(ucmd),
+				 udata->inlen - sizeof(ucmd)))
+		return -EOPNOTSUPP;
+
+	if (ib_copy_from_udata(&ucmd, udata, min(sizeof(ucmd), udata->inlen)))
+		return -EFAULT;
+
+	if (ucmd.comp_mask || ucmd.reserved)
+		return -EOPNOTSUPP;
+
+	inlen = MLX5_ST_SZ_BYTES(modify_rq_in);
+	in = mlx5_vzalloc(inlen);
+	if (!in)
+		return -ENOMEM;
+
+	rqc = MLX5_ADDR_OF(modify_rq_in, in, ctx);
+
+	curr_wq_state = (wq_attr_mask & IB_WQ_CUR_STATE) ?
+		wq_attr->curr_wq_state : wq->state;
+	wq_state = (wq_attr_mask & IB_WQ_STATE) ?
+		wq_attr->wq_state : curr_wq_state;
+	if (curr_wq_state == IB_WQS_ERR)
+		curr_wq_state = MLX5_RQC_STATE_ERR;
+	if (wq_state == IB_WQS_ERR)
+		wq_state = MLX5_RQC_STATE_ERR;
+	MLX5_SET(modify_rq_in, in, rq_state, curr_wq_state);
+	MLX5_SET(rqc, rqc, state, wq_state);
+
+	err = mlx5_core_modify_rq(dev->mdev, rwq->rqn, in, inlen);
+	kvfree(in);
+	if (!err)
+		rwq->ibwq.state = (wq_state == MLX5_RQC_STATE_ERR) ? IB_WQS_ERR : wq_state;
+
+	return err;
+}
diff --git a/drivers/infiniband/hw/mlx5/user.h b/drivers/infiniband/hw/mlx5/user.h
index 61bc308..3e66f93 100644
--- a/drivers/infiniband/hw/mlx5/user.h
+++ b/drivers/infiniband/hw/mlx5/user.h
@@ -46,6 +46,10 @@ enum {
 	MLX5_SRQ_FLAG_SIGNATURE		= 1 << 0,
 };
 
+enum {
+	MLX5_WQ_FLAG_SIGNATURE		= 1 << 0,
+};
+
 
 /* Increment this value if any changes that break userspace ABI
  * compatibility are made.
@@ -159,6 +163,27 @@ struct mlx5_ib_alloc_mw {
 	__u16	reserved2;
 };
 
+struct mlx5_ib_create_wq {
+	__u64   buf_addr;
+	__u64   db_addr;
+	__u32   rq_wqe_count;
+	__u32   rq_wqe_shift;
+	__u32   user_index;
+	__u32   flags;
+	__u32   comp_mask;
+	__u32   reserved;
+};
+
+struct mlx5_ib_create_wq_resp {
+	__u32	response_length;
+	__u32	reserved;
+};
+
+struct mlx5_ib_modify_wq {
+	__u32	comp_mask;
+	__u32	reserved;
+};
+
 static inline int get_qp_user_index(struct mlx5_ib_ucontext *ucontext,
 				    struct mlx5_ib_create_qp *ucmd,
 				    int inlen,
-- 
1.8.3.1


* [PATCH V4 for-next 05/10] IB/core: Introduce Receive Work Queue indirection table
From: Yishai Hadas @ 2016-05-23 12:20 UTC
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	yishaih-VPRAkNaXOzVWk0Htik3J/w, matanb-VPRAkNaXOzVWk0Htik3J/w,
	alexv-VPRAkNaXOzVWk0Htik3J/w, tzahio-VPRAkNaXOzVWk0Htik3J/w,
	majd-VPRAkNaXOzVWk0Htik3J/w, talal-VPRAkNaXOzVWk0Htik3J/w

Introduce the Receive Work Queue (WQ) indirection table.
This object can be used to spread incoming traffic across different
receive Work Queues.

A Receive WQ indirection table points to a variable number of WQs.
This table is given to a QP in downstream patches.
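
For example (a hypothetical sketch; wq0..wq3 are previously created WQs
of type IB_WQT_RQ), a table created with log_ind_tbl_size = 3 points at
1 << 3 = 8 WQ entries, and entries may repeat so that some queues
receive a larger share of the traffic:

	struct ib_wq *wqs[8] = { wq0, wq0, wq0, wq0, wq1, wq1, wq2, wq3 };
	struct ib_rwq_ind_table_init_attr init_attr = {
		.log_ind_tbl_size = 3,	/* table size is 1 << 3 == 8 */
		.ind_tbl = wqs,		/* caller owns the array's lifetime */
	};

	ind_tbl = ib_create_rwq_ind_table(device, &init_attr);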

Signed-off-by: Yishai Hadas <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/core/verbs.c | 62 +++++++++++++++++++++++++++++++++++++++++
 include/rdma/ib_verbs.h         | 23 +++++++++++++++
 2 files changed, 85 insertions(+)

diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index c096cad..6b548d7 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -1636,6 +1636,68 @@ int ib_modify_wq(struct ib_wq *wq, struct ib_wq_attr *wq_attr,
 }
 EXPORT_SYMBOL(ib_modify_wq);
 
+/*
+ * ib_create_rwq_ind_table - Creates a RQ Indirection Table.
+ * @device: The device on which to create the rwq indirection table.
+ * @ib_rwq_ind_table_init_attr: A list of initial attributes required to
+ * create the Indirection Table.
+ *
+ * Note: The life time of ib_rwq_ind_table_init_attr->ind_tbl is not less
+ *	than the created ib_rwq_ind_table object and the caller is responsible
+ *	for its memory allocation/free.
+ */
+struct ib_rwq_ind_table *ib_create_rwq_ind_table(struct ib_device *device,
+						 struct ib_rwq_ind_table_init_attr *init_attr)
+{
+	struct ib_rwq_ind_table *rwq_ind_table;
+	int i;
+	u32 table_size;
+
+	if (!device->create_rwq_ind_table)
+		return ERR_PTR(-ENOSYS);
+
+	table_size = (1 << init_attr->log_ind_tbl_size);
+	rwq_ind_table = device->create_rwq_ind_table(device,
+				init_attr, NULL);
+	if (IS_ERR(rwq_ind_table))
+		return rwq_ind_table;
+
+	rwq_ind_table->ind_tbl = init_attr->ind_tbl;
+	rwq_ind_table->log_ind_tbl_size = init_attr->log_ind_tbl_size;
+	rwq_ind_table->device = device;
+	rwq_ind_table->uobject = NULL;
+	atomic_set(&rwq_ind_table->usecnt, 0);
+
+	for (i = 0; i < table_size; i++)
+		atomic_inc(&rwq_ind_table->ind_tbl[i]->usecnt);
+
+	return rwq_ind_table;
+}
+EXPORT_SYMBOL(ib_create_rwq_ind_table);
+
+/*
+ * ib_destroy_rwq_ind_table - Destroys the specified Indirection Table.
+ * @wq_ind_table: The Indirection Table to destroy.
+*/
+int ib_destroy_rwq_ind_table(struct ib_rwq_ind_table *rwq_ind_table)
+{
+	int err, i;
+	u32 table_size = (1 << rwq_ind_table->log_ind_tbl_size);
+	struct ib_wq **ind_tbl = rwq_ind_table->ind_tbl;
+
+	if (atomic_read(&rwq_ind_table->usecnt))
+		return -EBUSY;
+
+	err = rwq_ind_table->device->destroy_rwq_ind_table(rwq_ind_table);
+	if (!err) {
+		for (i = 0; i < table_size; i++)
+			atomic_dec(&ind_tbl[i]->usecnt);
+	}
+
+	return err;
+}
+EXPORT_SYMBOL(ib_destroy_rwq_ind_table);
+
 struct ib_flow *ib_create_flow(struct ib_qp *qp,
 			       struct ib_flow_attr *flow_attr,
 			       int domain)
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index fc0f817..5ca3b1a 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1474,6 +1474,21 @@ struct ib_wq_attr {
 	enum	ib_wq_state	curr_wq_state;
 };
 
+struct ib_rwq_ind_table {
+	struct ib_device	*device;
+	struct ib_uobject      *uobject;
+	atomic_t		usecnt;
+	u32		ind_tbl_num;
+	u32		log_ind_tbl_size;
+	struct ib_wq	**ind_tbl;
+};
+
+struct ib_rwq_ind_table_init_attr {
+	u32		log_ind_tbl_size;
+	/* Each entry is a pointer to Receive Work Queue */
+	struct ib_wq	**ind_tbl;
+};
+
 struct ib_qp {
 	struct ib_device       *device;
 	struct ib_pd	       *pd;
@@ -1954,6 +1969,10 @@ struct ib_device {
 						struct ib_wq_attr *attr,
 						u32 wq_attr_mask,
 						struct ib_udata *udata);
+	struct ib_rwq_ind_table *  (*create_rwq_ind_table)(struct ib_device *device,
+							   struct ib_rwq_ind_table_init_attr *init_attr,
+							   struct ib_udata *udata);
+	int                        (*destroy_rwq_ind_table)(struct ib_rwq_ind_table *wq_ind_table);
 	struct ib_dma_mapping_ops   *dma_ops;
 
 	struct module               *owner;
@@ -3202,6 +3221,10 @@ struct ib_wq *ib_create_wq(struct ib_pd *pd,
 int ib_destroy_wq(struct ib_wq *wq);
 int ib_modify_wq(struct ib_wq *wq, struct ib_wq_attr *attr,
 		 u32 wq_attr_mask);
+struct ib_rwq_ind_table *ib_create_rwq_ind_table(struct ib_device *device,
+						 struct ib_rwq_ind_table_init_attr*
+						 wq_ind_table_init_attr);
+int ib_destroy_rwq_ind_table(struct ib_rwq_ind_table *wq_ind_table);
 
 int ib_map_mr_sg(struct ib_mr *mr, struct scatterlist *sg, int sg_nents,
 		 unsigned int *sg_offset, unsigned int page_size);
-- 
1.8.3.1


* [PATCH V4 for-next 06/10] IB/uverbs: Introduce RWQ Indirection table
From: Yishai Hadas @ 2016-05-23 12:20 UTC
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	yishaih-VPRAkNaXOzVWk0Htik3J/w, matanb-VPRAkNaXOzVWk0Htik3J/w,
	alexv-VPRAkNaXOzVWk0Htik3J/w, tzahio-VPRAkNaXOzVWk0Htik3J/w,
	majd-VPRAkNaXOzVWk0Htik3J/w, talal-VPRAkNaXOzVWk0Htik3J/w

User applications that want to spread traffic over several WQs need to
create an indirection table from already created WQs.

Add a uverbs API to create and destroy this table.
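
On the wire, the create command is a fixed header followed by a
u64-aligned array of __u32 WQ handles; when only one handle is passed,
one extra __u32 of padding is expected (see expected_in_size in the
handler below). A hypothetical layout for log_ind_tbl_size = 1 would
be:

	__u32 comp_mask;	/* must be 0 */
	__u32 log_ind_tbl_size;	/* 1 -> two handles follow */
	__u32 wq_handles[2];	/* even count: already u64-aligned */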

Signed-off-by: Yishai Hadas <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/core/uverbs.h      |   3 +
 drivers/infiniband/core/uverbs_cmd.c  | 210 ++++++++++++++++++++++++++++++++++
 drivers/infiniband/core/uverbs_main.c |  13 +++
 include/rdma/ib_verbs.h               |   1 +
 include/uapi/rdma/ib_user_verbs.h     |  26 +++++
 5 files changed, 253 insertions(+)

diff --git a/drivers/infiniband/core/uverbs.h b/drivers/infiniband/core/uverbs.h
index 74776c6..6c22923 100644
--- a/drivers/infiniband/core/uverbs.h
+++ b/drivers/infiniband/core/uverbs.h
@@ -186,6 +186,7 @@ extern struct idr ib_uverbs_srq_idr;
 extern struct idr ib_uverbs_xrcd_idr;
 extern struct idr ib_uverbs_rule_idr;
 extern struct idr ib_uverbs_wq_idr;
+extern struct idr ib_uverbs_rwq_ind_tbl_idr;
 
 void idr_remove_uobj(struct idr *idp, struct ib_uobject *uobj);
 
@@ -284,5 +285,7 @@ IB_UVERBS_DECLARE_EX_CMD(create_qp);
 IB_UVERBS_DECLARE_EX_CMD(create_wq);
 IB_UVERBS_DECLARE_EX_CMD(modify_wq);
 IB_UVERBS_DECLARE_EX_CMD(destroy_wq);
+IB_UVERBS_DECLARE_EX_CMD(create_rwq_ind_table);
+IB_UVERBS_DECLARE_EX_CMD(destroy_rwq_ind_table);
 
 #endif /* UVERBS_H */
diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index 22e6173..327a56c 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -58,6 +58,7 @@ static struct uverbs_lock_class srq_lock_class	= { .name = "SRQ-uobj" };
 static struct uverbs_lock_class xrcd_lock_class = { .name = "XRCD-uobj" };
 static struct uverbs_lock_class rule_lock_class = { .name = "RULE-uobj" };
 static struct uverbs_lock_class wq_lock_class = { .name = "WQ-uobj" };
+static struct uverbs_lock_class rwq_ind_table_lock_class = { .name = "IND_TBL-uobj" };
 
 /*
  * The ib_uobject locking scheme is as follows:
@@ -338,6 +339,7 @@ ssize_t ib_uverbs_get_context(struct ib_uverbs_file *file,
 	INIT_LIST_HEAD(&ucontext->srq_list);
 	INIT_LIST_HEAD(&ucontext->ah_list);
 	INIT_LIST_HEAD(&ucontext->wq_list);
+	INIT_LIST_HEAD(&ucontext->rwq_ind_tbl_list);
 	INIT_LIST_HEAD(&ucontext->xrcd_list);
 	INIT_LIST_HEAD(&ucontext->rule_list);
 	rcu_read_lock();
@@ -3299,6 +3301,214 @@ int ib_uverbs_ex_modify_wq(struct ib_uverbs_file *file,
 	return ret;
 }
 
+int ib_uverbs_ex_create_rwq_ind_table(struct ib_uverbs_file *file,
+				      struct ib_device *ib_dev,
+				      struct ib_udata *ucore,
+				      struct ib_udata *uhw)
+{
+	struct ib_uverbs_ex_create_rwq_ind_table	  cmd = {};
+	struct ib_uverbs_ex_create_rwq_ind_table_resp  resp = {};
+	struct ib_uobject		  *uobj;
+	int err = 0;
+	struct ib_rwq_ind_table_init_attr init_attr = {};
+	struct ib_rwq_ind_table *rwq_ind_tbl;
+	struct ib_wq	**wqs = NULL;
+	u32 *wqs_handles = NULL;
+	struct ib_wq	*wq = NULL;
+	int i, j, num_read_wqs;
+	u32 num_wq_handles;
+	u32 expected_in_size;
+	size_t required_cmd_sz_header;
+	size_t required_resp_len;
+
+	required_cmd_sz_header = offsetof(typeof(cmd), log_ind_tbl_size) + sizeof(cmd.log_ind_tbl_size);
+	required_resp_len = offsetof(typeof(resp), ind_tbl_num) + sizeof(resp.ind_tbl_num);
+
+	if (ucore->inlen < required_cmd_sz_header)
+		return -EINVAL;
+
+	if (ucore->outlen < required_resp_len)
+		return -ENOSPC;
+
+	err = ib_copy_from_udata(&cmd, ucore, required_cmd_sz_header);
+	if (err)
+		return err;
+
+	ucore->inbuf += required_cmd_sz_header;
+	ucore->inlen -= required_cmd_sz_header;
+
+	if (cmd.comp_mask)
+		return -EOPNOTSUPP;
+
+	if (cmd.log_ind_tbl_size > IB_USER_VERBS_MAX_LOG_IND_TBL_SIZE)
+		return -EINVAL;
+
+	num_wq_handles = 1 << cmd.log_ind_tbl_size;
+	expected_in_size = num_wq_handles * sizeof(__u32);
+	if (num_wq_handles == 1)
+		/* input size for wq handles is u64 aligned */
+		expected_in_size += sizeof(__u32);
+
+	if (ucore->inlen < expected_in_size)
+		return -EINVAL;
+
+	if (ucore->inlen > expected_in_size &&
+	    !ib_is_udata_cleared(ucore, expected_in_size,
+				 ucore->inlen - expected_in_size))
+		return -EOPNOTSUPP;
+
+	wqs_handles = kcalloc(num_wq_handles, sizeof(*wqs_handles),
+			      GFP_KERNEL);
+	if (!wqs_handles)
+		return -ENOMEM;
+
+	err = ib_copy_from_udata(wqs_handles, ucore,
+				 num_wq_handles * sizeof(__u32));
+	if (err)
+		goto err_free;
+
+	wqs = kcalloc(num_wq_handles, sizeof(*wqs), GFP_KERNEL);
+	if (!wqs) {
+		err = -ENOMEM;
+		goto  err_free;
+	}
+
+	for (num_read_wqs = 0; num_read_wqs < num_wq_handles;
+			num_read_wqs++) {
+		wq = idr_read_wq(wqs_handles[num_read_wqs], file->ucontext);
+		if (!wq) {
+			err = -EINVAL;
+			goto put_wqs;
+		}
+
+		wqs[num_read_wqs] = wq;
+	}
+
+	uobj = kmalloc(sizeof(*uobj), GFP_KERNEL);
+	if (!uobj) {
+		err = -ENOMEM;
+		goto put_wqs;
+	}
+
+	init_uobj(uobj, 0, file->ucontext, &rwq_ind_table_lock_class);
+	down_write(&uobj->mutex);
+	init_attr.log_ind_tbl_size = cmd.log_ind_tbl_size;
+	init_attr.ind_tbl = wqs;
+	rwq_ind_tbl = ib_dev->create_rwq_ind_table(ib_dev, &init_attr, uhw);
+
+	if (IS_ERR(rwq_ind_tbl)) {
+		err = PTR_ERR(rwq_ind_tbl);
+		goto err_uobj;
+	}
+
+	rwq_ind_tbl->ind_tbl = wqs;
+	rwq_ind_tbl->log_ind_tbl_size = init_attr.log_ind_tbl_size;
+	rwq_ind_tbl->uobject = uobj;
+	uobj->object = rwq_ind_tbl;
+	rwq_ind_tbl->device = ib_dev;
+	atomic_set(&rwq_ind_tbl->usecnt, 0);
+
+	for (i = 0; i < num_wq_handles; i++)
+		atomic_inc(&wqs[i]->usecnt);
+
+	err = idr_add_uobj(&ib_uverbs_rwq_ind_tbl_idr, uobj);
+	if (err)
+		goto destroy_ind_tbl;
+
+	resp.ind_tbl_handle = uobj->id;
+	resp.ind_tbl_num = rwq_ind_tbl->ind_tbl_num;
+	resp.response_length = required_resp_len;
+
+	err = ib_copy_to_udata(ucore,
+			       &resp, resp.response_length);
+	if (err)
+		goto err_copy;
+
+	kfree(wqs_handles);
+
+	for (j = 0; j < num_read_wqs; j++)
+		put_wq_read(wqs[j]);
+
+	mutex_lock(&file->mutex);
+	list_add_tail(&uobj->list, &file->ucontext->rwq_ind_tbl_list);
+	mutex_unlock(&file->mutex);
+
+	uobj->live = 1;
+
+	up_write(&uobj->mutex);
+	return 0;
+
+err_copy:
+	idr_remove_uobj(&ib_uverbs_rwq_ind_tbl_idr, uobj);
+destroy_ind_tbl:
+	ib_destroy_rwq_ind_table(rwq_ind_tbl);
+err_uobj:
+	put_uobj_write(uobj);
+put_wqs:
+	for (j = 0; j < num_read_wqs; j++)
+		put_wq_read(wqs[j]);
+err_free:
+	kfree(wqs_handles);
+	kfree(wqs);
+	return err;
+}
+
+int ib_uverbs_ex_destroy_rwq_ind_table(struct ib_uverbs_file *file,
+				       struct ib_device *ib_dev,
+				       struct ib_udata *ucore,
+				       struct ib_udata *uhw)
+{
+	struct ib_uverbs_ex_destroy_rwq_ind_table	cmd = {};
+	struct ib_rwq_ind_table *rwq_ind_tbl;
+	struct ib_uobject		*uobj;
+	int			ret;
+	struct ib_wq	**ind_tbl;
+	size_t required_cmd_sz;
+
+	required_cmd_sz = offsetof(typeof(cmd), ind_tbl_handle) + sizeof(cmd.ind_tbl_handle);
+
+	if (ucore->inlen < required_cmd_sz)
+		return -EINVAL;
+
+	if (ucore->inlen > sizeof(cmd) &&
+	    !ib_is_udata_cleared(ucore, sizeof(cmd),
+				 ucore->inlen - sizeof(cmd)))
+		return -EOPNOTSUPP;
+
+	ret = ib_copy_from_udata(&cmd, ucore, min(sizeof(cmd), ucore->inlen));
+	if (ret)
+		return ret;
+
+	if (cmd.comp_mask)
+		return -EOPNOTSUPP;
+
+	uobj = idr_write_uobj(&ib_uverbs_rwq_ind_tbl_idr, cmd.ind_tbl_handle,
+			      file->ucontext);
+	if (!uobj)
+		return -EINVAL;
+	rwq_ind_tbl = uobj->object;
+	ind_tbl = rwq_ind_tbl->ind_tbl;
+
+	ret = ib_destroy_rwq_ind_table(rwq_ind_tbl);
+	if (!ret)
+		uobj->live = 0;
+
+	put_uobj_write(uobj);
+
+	if (ret)
+		return ret;
+
+	idr_remove_uobj(&ib_uverbs_rwq_ind_tbl_idr, uobj);
+
+	mutex_lock(&file->mutex);
+	list_del(&uobj->list);
+	mutex_unlock(&file->mutex);
+
+	put_uobj(uobj);
+	kfree(ind_tbl);
+	return ret;
+}
+
 int ib_uverbs_ex_create_flow(struct ib_uverbs_file *file,
 			     struct ib_device *ib_dev,
 			     struct ib_udata *ucore,
diff --git a/drivers/infiniband/core/uverbs_main.c b/drivers/infiniband/core/uverbs_main.c
index 91cb36f..426e0ac 100644
--- a/drivers/infiniband/core/uverbs_main.c
+++ b/drivers/infiniband/core/uverbs_main.c
@@ -77,6 +77,7 @@ DEFINE_IDR(ib_uverbs_srq_idr);
 DEFINE_IDR(ib_uverbs_xrcd_idr);
 DEFINE_IDR(ib_uverbs_rule_idr);
 DEFINE_IDR(ib_uverbs_wq_idr);
+DEFINE_IDR(ib_uverbs_rwq_ind_tbl_idr);
 
 static DEFINE_SPINLOCK(map_lock);
 static DECLARE_BITMAP(dev_map, IB_UVERBS_MAX_DEVICES);
@@ -134,6 +135,8 @@ static int (*uverbs_ex_cmd_table[])(struct ib_uverbs_file *file,
 	[IB_USER_VERBS_EX_CMD_CREATE_WQ]        = ib_uverbs_ex_create_wq,
 	[IB_USER_VERBS_EX_CMD_MODIFY_WQ]        = ib_uverbs_ex_modify_wq,
 	[IB_USER_VERBS_EX_CMD_DESTROY_WQ]       = ib_uverbs_ex_destroy_wq,
+	[IB_USER_VERBS_EX_CMD_CREATE_RWQ_IND_TBL] = ib_uverbs_ex_create_rwq_ind_table,
+	[IB_USER_VERBS_EX_CMD_DESTROY_RWQ_IND_TBL] = ib_uverbs_ex_destroy_rwq_ind_table,
 };
 
 static void ib_uverbs_add_one(struct ib_device *device);
@@ -269,6 +272,16 @@ static int ib_uverbs_cleanup_ucontext(struct ib_uverbs_file *file,
 		kfree(uqp);
 	}
 
+	list_for_each_entry_safe(uobj, tmp, &context->rwq_ind_tbl_list, list) {
+		struct ib_rwq_ind_table *rwq_ind_tbl = uobj->object;
+		struct ib_wq **ind_tbl = rwq_ind_tbl->ind_tbl;
+
+		idr_remove_uobj(&ib_uverbs_rwq_ind_tbl_idr, uobj);
+		ib_destroy_rwq_ind_table(rwq_ind_tbl);
+		kfree(ind_tbl);
+		kfree(uobj);
+	}
+
 	list_for_each_entry_safe(uobj, tmp, &context->wq_list, list) {
 		struct ib_wq *wq = uobj->object;
 		struct ib_uwq_object *uwq =
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 5ca3b1a..a12682e 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1327,6 +1327,7 @@ struct ib_ucontext {
 	struct list_head	xrcd_list;
 	struct list_head	rule_list;
 	struct list_head	wq_list;
+	struct list_head	rwq_ind_tbl_list;
 	int			closing;
 
 	struct pid             *tgid;
diff --git a/include/uapi/rdma/ib_user_verbs.h b/include/uapi/rdma/ib_user_verbs.h
index c9470e5..2cf7c95 100644
--- a/include/uapi/rdma/ib_user_verbs.h
+++ b/include/uapi/rdma/ib_user_verbs.h
@@ -98,6 +98,8 @@ enum {
 	IB_USER_VERBS_EX_CMD_CREATE_WQ,
 	IB_USER_VERBS_EX_CMD_MODIFY_WQ,
 	IB_USER_VERBS_EX_CMD_DESTROY_WQ,
+	IB_USER_VERBS_EX_CMD_CREATE_RWQ_IND_TBL,
+	IB_USER_VERBS_EX_CMD_DESTROY_RWQ_IND_TBL
 };
 
 /*
@@ -987,4 +989,28 @@ struct ib_uverbs_ex_modify_wq  {
 	__u32 curr_wq_state;
 };
 
+/* Bounds kernel memory allocation; not the maximum expected table size */
+#define IB_USER_VERBS_MAX_LOG_IND_TBL_SIZE 0x0d
+struct ib_uverbs_ex_create_rwq_ind_table  {
+	__u32 comp_mask;
+	__u32 log_ind_tbl_size;
+	/* Following are the wq handles according to log_ind_tbl_size
+	 * wq_handle1
+	 * wq_handle2
+	 */
+	__u32 wq_handles[0];
+};
+
+struct ib_uverbs_ex_create_rwq_ind_table_resp {
+	__u32 comp_mask;
+	__u32 response_length;
+	__u32 ind_tbl_handle;
+	__u32 ind_tbl_num;
+};
+
+struct ib_uverbs_ex_destroy_rwq_ind_table  {
+	__u32 comp_mask;
+	__u32 ind_tbl_handle;
+};
+
 #endif /* IB_USER_VERBS_H */
-- 
1.8.3.1


* [PATCH V4 for-next 07/10] IB/mlx5: Add Receive Work Queue Indirection table operations
       [not found] ` <1464006056-19653-1-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
                     ` (5 preceding siblings ...)
  2016-05-23 12:20   ` [PATCH V4 for-next 06/10] IB/uverbs: Introduce RWQ Indirection table Yishai Hadas
@ 2016-05-23 12:20   ` Yishai Hadas
       [not found]     ` <1464006056-19653-8-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2016-05-23 12:20   ` [PATCH V4 for-next 08/10] IB/core: Extend create QP to get indirection table Yishai Hadas
                     ` (4 subsequent siblings)
  11 siblings, 1 reply; 36+ messages in thread
From: Yishai Hadas @ 2016-05-23 12:20 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	yishaih-VPRAkNaXOzVWk0Htik3J/w, matanb-VPRAkNaXOzVWk0Htik3J/w,
	alexv-VPRAkNaXOzVWk0Htik3J/w, tzahio-VPRAkNaXOzVWk0Htik3J/w,
	majd-VPRAkNaXOzVWk0Htik3J/w, talal-VPRAkNaXOzVWk0Htik3J/w

Some mlx5-based hardware supports an RQ table object. This RQ table
points to a set of RQ objects. We implement the receive work queue
indirection table API (create and destroy) by using this hardware
object.

Signed-off-by: Yishai Hadas <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/hw/mlx5/main.c    |  6 ++-
 drivers/infiniband/hw/mlx5/mlx5_ib.h | 14 +++++++
 drivers/infiniband/hw/mlx5/qp.c      | 78 ++++++++++++++++++++++++++++++++++++
 drivers/infiniband/hw/mlx5/user.h    |  5 +++
 4 files changed, 102 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index 1c6036e..1e08094 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -2444,12 +2444,16 @@ static void *mlx5_ib_add(struct mlx5_core_dev *mdev)
 		dev->ib_dev.create_wq	 = mlx5_ib_create_wq;
 		dev->ib_dev.modify_wq	 = mlx5_ib_modify_wq;
 		dev->ib_dev.destroy_wq	 = mlx5_ib_destroy_wq;
+		dev->ib_dev.create_rwq_ind_table = mlx5_ib_create_rwq_ind_table;
+		dev->ib_dev.destroy_rwq_ind_table = mlx5_ib_destroy_rwq_ind_table;
 		dev->ib_dev.uverbs_ex_cmd_mask |=
 			(1ull << IB_USER_VERBS_EX_CMD_CREATE_FLOW) |
 			(1ull << IB_USER_VERBS_EX_CMD_DESTROY_FLOW) |
 			(1ull << IB_USER_VERBS_EX_CMD_CREATE_WQ) |
 			(1ull << IB_USER_VERBS_EX_CMD_MODIFY_WQ) |
-			(1ull << IB_USER_VERBS_EX_CMD_DESTROY_WQ);
+			(1ull << IB_USER_VERBS_EX_CMD_DESTROY_WQ) |
+			(1ull << IB_USER_VERBS_EX_CMD_CREATE_RWQ_IND_TBL) |
+			(1ull << IB_USER_VERBS_EX_CMD_DESTROY_RWQ_IND_TBL);
 	}
 	err = init_node_data(dev);
 	if (err)
diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
index 62d4e13..cd3d620 100644
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
+++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
@@ -247,6 +247,11 @@ enum {
 	MLX5_WQ_KERNEL
 };
 
+struct mlx5_ib_rwq_ind_table {
+	struct ib_rwq_ind_table ib_rwq_ind_tbl;
+	u32			rqtn;
+};
+
 /*
  * Connect-IB can trigger up to four concurrent pagefaults
  * per-QP.
@@ -657,6 +662,11 @@ static inline struct mlx5_ib_rwq *to_mrwq(struct ib_wq *ibwq)
 	return container_of(ibwq, struct mlx5_ib_rwq, ibwq);
 }
 
+static inline struct mlx5_ib_rwq_ind_table *to_mrwq_ind_table(struct ib_rwq_ind_table *ib_rwq_ind_tbl)
+{
+	return container_of(ib_rwq_ind_tbl, struct mlx5_ib_rwq_ind_table, ib_rwq_ind_tbl);
+}
+
 static inline struct mlx5_ib_srq *to_mibsrq(struct mlx5_core_srq *msrq)
 {
 	return container_of(msrq, struct mlx5_ib_srq, msrq);
@@ -797,6 +807,10 @@ struct ib_wq *mlx5_ib_create_wq(struct ib_pd *pd,
 int mlx5_ib_destroy_wq(struct ib_wq *wq);
 int mlx5_ib_modify_wq(struct ib_wq *wq, struct ib_wq_attr *wq_attr,
 		      u32 wq_attr_mask, struct ib_udata *udata);
+struct ib_rwq_ind_table *mlx5_ib_create_rwq_ind_table(struct ib_device *device,
+						      struct ib_rwq_ind_table_init_attr *init_attr,
+						      struct ib_udata *udata);
+int mlx5_ib_destroy_rwq_ind_table(struct ib_rwq_ind_table *wq_ind_table);
 
 #ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
 extern struct workqueue_struct *mlx5_ib_page_fault_wq;
diff --git a/drivers/infiniband/hw/mlx5/qp.c b/drivers/infiniband/hw/mlx5/qp.c
index 9ce4b3a..94faa50 100644
--- a/drivers/infiniband/hw/mlx5/qp.c
+++ b/drivers/infiniband/hw/mlx5/qp.c
@@ -4406,6 +4406,84 @@ int mlx5_ib_destroy_wq(struct ib_wq *wq)
 	return 0;
 }
 
+struct ib_rwq_ind_table *mlx5_ib_create_rwq_ind_table(struct ib_device *device,
+						      struct ib_rwq_ind_table_init_attr *init_attr,
+						      struct ib_udata *udata)
+{
+	struct mlx5_ib_dev *dev = to_mdev(device);
+	struct mlx5_ib_rwq_ind_table *rwq_ind_tbl;
+	int sz = 1 << init_attr->log_ind_tbl_size;
+	struct mlx5_ib_create_rwq_ind_tbl_resp resp = {};
+	size_t min_resp_len;
+	int inlen;
+	int err;
+	int i;
+	u32 *in;
+	void *rqtc;
+
+	if (udata->inlen > 0 &&
+	    !ib_is_udata_cleared(udata, 0,
+				 udata->inlen))
+		return ERR_PTR(-EOPNOTSUPP);
+
+	min_resp_len = offsetof(typeof(resp), reserved) + sizeof(resp.reserved);
+	if (udata->outlen && udata->outlen < min_resp_len)
+		return ERR_PTR(-EINVAL);
+
+	rwq_ind_tbl = kzalloc(sizeof(*rwq_ind_tbl), GFP_KERNEL);
+	if (!rwq_ind_tbl)
+		return ERR_PTR(-ENOMEM);
+
+	inlen = MLX5_ST_SZ_BYTES(create_rqt_in) + sizeof(u32) * sz;
+	in = mlx5_vzalloc(inlen);
+	if (!in) {
+		err = -ENOMEM;
+		goto err;
+	}
+
+	rqtc = MLX5_ADDR_OF(create_rqt_in, in, rqt_context);
+
+	MLX5_SET(rqtc, rqtc, rqt_actual_size, sz);
+	MLX5_SET(rqtc, rqtc, rqt_max_size, sz);
+
+	for (i = 0; i < sz; i++)
+		MLX5_SET(rqtc, rqtc, rq_num[i], init_attr->ind_tbl[i]->wq_num);
+
+	err = mlx5_core_create_rqt(dev->mdev, in, inlen, &rwq_ind_tbl->rqtn);
+	kvfree(in);
+
+	if (err)
+		goto err;
+
+	rwq_ind_tbl->ib_rwq_ind_tbl.ind_tbl_num = rwq_ind_tbl->rqtn;
+	if (udata->outlen) {
+		resp.response_length = offsetof(typeof(resp), response_length) +
+					sizeof(resp.response_length);
+		err = ib_copy_to_udata(udata, &resp, resp.response_length);
+		if (err)
+			goto err_copy;
+	}
+
+	return &rwq_ind_tbl->ib_rwq_ind_tbl;
+
+err_copy:
+	mlx5_core_destroy_rqt(dev->mdev, rwq_ind_tbl->rqtn);
+err:
+	kfree(rwq_ind_tbl);
+	return ERR_PTR(err);
+}
+
+int mlx5_ib_destroy_rwq_ind_table(struct ib_rwq_ind_table *ib_rwq_ind_tbl)
+{
+	struct mlx5_ib_rwq_ind_table *rwq_ind_tbl = to_mrwq_ind_table(ib_rwq_ind_tbl);
+	struct mlx5_ib_dev *dev = to_mdev(ib_rwq_ind_tbl->device);
+
+	mlx5_core_destroy_rqt(dev->mdev, rwq_ind_tbl->rqtn);
+
+	kfree(rwq_ind_tbl);
+	return 0;
+}
+
 int mlx5_ib_modify_wq(struct ib_wq *wq, struct ib_wq_attr *wq_attr,
 		      u32 wq_attr_mask, struct ib_udata *udata)
 {
diff --git a/drivers/infiniband/hw/mlx5/user.h b/drivers/infiniband/hw/mlx5/user.h
index 3e66f93..0f87955 100644
--- a/drivers/infiniband/hw/mlx5/user.h
+++ b/drivers/infiniband/hw/mlx5/user.h
@@ -179,6 +179,11 @@ struct mlx5_ib_create_wq_resp {
 	__u32	reserved;
 };
 
+struct mlx5_ib_create_rwq_ind_tbl_resp {
+	__u32	response_length;
+	__u32	reserved;
+};
+
 struct mlx5_ib_modify_wq {
 	__u32	comp_mask;
 	__u32	reserved;
-- 
1.8.3.1


* [PATCH V4 for-next 08/10] IB/core: Extend create QP to get indirection table
       [not found] ` <1464006056-19653-1-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
                     ` (6 preceding siblings ...)
  2016-05-23 12:20   ` [PATCH V4 for-next 07/10] IB/mlx5: Add Receive Work Queue Indirection table operations Yishai Hadas
@ 2016-05-23 12:20   ` Yishai Hadas
       [not found]     ` <1464006056-19653-9-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2016-05-23 12:20   ` [PATCH V4 for-next 09/10] IB/uverbs: Extend create QP to get RWQ " Yishai Hadas
                     ` (3 subsequent siblings)
  11 siblings, 1 reply; 36+ messages in thread
From: Yishai Hadas @ 2016-05-23 12:20 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	yishaih-VPRAkNaXOzVWk0Htik3J/w, matanb-VPRAkNaXOzVWk0Htik3J/w,
	alexv-VPRAkNaXOzVWk0Htik3J/w, tzahio-VPRAkNaXOzVWk0Htik3J/w,
	majd-VPRAkNaXOzVWk0Htik3J/w, talal-VPRAkNaXOzVWk0Htik3J/w

Extend create QP to accept a Receive Work Queue (WQ) indirection table.

A QP can be created with an external Receive Work Queue indirection
table; in that case it is ready to receive immediately.


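For illustration, a minimal kernel-side sketch of creating such a QP
(ind_tbl is an already-created indirection table; the receive side must
be left clear, as enforced by the new check in ib_create_qp()):

    struct ib_qp_init_attr attr = {};
    struct ib_qp *qp;

    attr.qp_type     = IB_QPT_RAW_PACKET;
    attr.rwq_ind_tbl = ind_tbl;  /* external receive indirection table */
    /* recv_cq, srq, cap.max_recv_wr and cap.max_recv_sge must stay zero,
     * otherwise ib_create_qp() returns ERR_PTR(-EINVAL). */
    qp = ib_create_qp(pd, &attr);
    if (IS_ERR(qp))
            return PTR_ERR(qp);
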
Signed-off-by: Yishai Hadas <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/core/verbs.c | 19 +++++++++++++++++--
 include/rdma/ib_verbs.h         |  2 ++
 2 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index 6b548d7..6916d5c 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -754,6 +754,12 @@ struct ib_qp *ib_create_qp(struct ib_pd *pd,
 	struct ib_qp *qp;
 	int ret;
 
+	if (qp_init_attr->rwq_ind_tbl &&
+	    (qp_init_attr->recv_cq ||
+	    qp_init_attr->srq || qp_init_attr->cap.max_recv_wr ||
+	    qp_init_attr->cap.max_recv_sge))
+		return ERR_PTR(-EINVAL);
+
 	/*
 	 * If the callers is using the RDMA API calculate the resources
 	 * needed for the RDMA READ/WRITE operations.
@@ -771,6 +777,7 @@ struct ib_qp *ib_create_qp(struct ib_pd *pd,
 	qp->real_qp    = qp;
 	qp->uobject    = NULL;
 	qp->qp_type    = qp_init_attr->qp_type;
+	qp->rwq_ind_tbl = qp_init_attr->rwq_ind_tbl;
 
 	atomic_set(&qp->usecnt, 0);
 	qp->mrs_used = 0;
@@ -788,7 +795,8 @@ struct ib_qp *ib_create_qp(struct ib_pd *pd,
 		qp->srq = NULL;
 	} else {
 		qp->recv_cq = qp_init_attr->recv_cq;
-		atomic_inc(&qp_init_attr->recv_cq->usecnt);
+		if (qp_init_attr->recv_cq)
+			atomic_inc(&qp_init_attr->recv_cq->usecnt);
 		qp->srq = qp_init_attr->srq;
 		if (qp->srq)
 			atomic_inc(&qp_init_attr->srq->usecnt);
@@ -799,7 +807,10 @@ struct ib_qp *ib_create_qp(struct ib_pd *pd,
 	qp->xrcd    = NULL;
 
 	atomic_inc(&pd->usecnt);
-	atomic_inc(&qp_init_attr->send_cq->usecnt);
+	if (qp_init_attr->send_cq)
+		atomic_inc(&qp_init_attr->send_cq->usecnt);
+	if (qp_init_attr->rwq_ind_tbl)
+		atomic_inc(&qp->rwq_ind_tbl->usecnt);
 
 	if (qp_init_attr->cap.max_rdma_ctxs) {
 		ret = rdma_rw_init_mrs(qp, qp_init_attr);
@@ -1279,6 +1290,7 @@ int ib_destroy_qp(struct ib_qp *qp)
 	struct ib_pd *pd;
 	struct ib_cq *scq, *rcq;
 	struct ib_srq *srq;
+	struct ib_rwq_ind_table *ind_tbl;
 	int ret;
 
 	WARN_ON_ONCE(qp->mrs_used > 0);
@@ -1293,6 +1305,7 @@ int ib_destroy_qp(struct ib_qp *qp)
 	scq  = qp->send_cq;
 	rcq  = qp->recv_cq;
 	srq  = qp->srq;
+	ind_tbl = qp->rwq_ind_tbl;
 
 	if (!qp->uobject)
 		rdma_rw_cleanup_mrs(qp);
@@ -1307,6 +1320,8 @@ int ib_destroy_qp(struct ib_qp *qp)
 			atomic_dec(&rcq->usecnt);
 		if (srq)
 			atomic_dec(&srq->usecnt);
+		if (ind_tbl)
+			atomic_dec(&ind_tbl->usecnt);
 	}
 
 	return ret;
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index a12682e..3e9653e 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1018,6 +1018,7 @@ struct ib_qp_init_attr {
 	 * Only needed for special QP types, or when using the RW API.
 	 */
 	u8			port_num;
+	struct ib_rwq_ind_table *rwq_ind_tbl;
 };
 
 struct ib_qp_open_attr {
@@ -1512,6 +1513,7 @@ struct ib_qp {
 	void		       *qp_context;
 	u32			qp_num;
 	enum ib_qp_type		qp_type;
+	struct ib_rwq_ind_table *rwq_ind_tbl;
 };
 
 struct ib_mr {
-- 
1.8.3.1


* [PATCH V4 for-next 09/10] IB/uverbs: Extend create QP to get RWQ indirection table
       [not found] ` <1464006056-19653-1-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
                     ` (7 preceding siblings ...)
  2016-05-23 12:20   ` [PATCH V4 for-next 08/10] IB/core: Extend create QP to get indirection table Yishai Hadas
@ 2016-05-23 12:20   ` Yishai Hadas
       [not found]     ` <1464006056-19653-10-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2016-05-23 12:20   ` [PATCH V4 for-next 10/10] IB/mlx5: Add RSS QP support Yishai Hadas
                     ` (2 subsequent siblings)
  11 siblings, 1 reply; 36+ messages in thread
From: Yishai Hadas @ 2016-05-23 12:20 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	yishaih-VPRAkNaXOzVWk0Htik3J/w, matanb-VPRAkNaXOzVWk0Htik3J/w,
	alexv-VPRAkNaXOzVWk0Htik3J/w, tzahio-VPRAkNaXOzVWk0Htik3J/w,
	majd-VPRAkNaXOzVWk0Htik3J/w, talal-VPRAkNaXOzVWk0Htik3J/w

User applications that want to spread incoming traffic across several WQs
should create a QP which contains an indirection table.

When such a QP is created, the other receive-side parameters are not
valid and must not be given. Its send side is optional and is assumed
active when the max_send_wr capability value is non-zero.

Extend create QP to work accordingly.


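A hedged sketch of what a user-space library would marshal for this
(handle values are hypothetical and the exact plumbing is
library-specific):

    struct ib_uverbs_ex_create_qp cmd = {};

    cmd.pd_handle          = pd_handle;
    cmd.qp_type            = IB_QPT_RAW_PACKET;
    cmd.comp_mask          = IB_UVERBS_CREATE_QP_MASK_IND_TABLE;
    cmd.rwq_ind_tbl_handle = ind_tbl_handle;
    /* max_recv_wr, max_recv_sge, is_srq and recv_cq_handle stay zero.
     * A non-zero max_send_wr (plus a send_cq_handle) would keep the
     * send side active. */
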
Signed-off-by: Yishai Hadas <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/core/uverbs_cmd.c | 75 ++++++++++++++++++++++++++++++------
 include/uapi/rdma/ib_user_verbs.h    | 10 +++++
 2 files changed, 73 insertions(+), 12 deletions(-)

diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index 327a56c..65ab209 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -255,6 +255,17 @@ static void put_wq_read(struct ib_wq *wq)
 	put_uobj_read(wq->uobject);
 }
 
+static struct ib_rwq_ind_table *idr_read_rwq_indirection_table(int ind_table_handle,
+							       struct ib_ucontext *context)
+{
+	return idr_read_obj(&ib_uverbs_rwq_ind_tbl_idr, ind_table_handle, context, 0);
+}
+
+static void put_rwq_indirection_table_read(struct ib_rwq_ind_table *ind_table)
+{
+	put_uobj_read(ind_table->uobject);
+}
+
 static struct ib_qp *idr_write_qp(int qp_handle, struct ib_ucontext *context)
 {
 	struct ib_uobject *uobj;
@@ -1761,9 +1772,11 @@ static int create_qp(struct ib_uverbs_file *file,
 	struct ib_srq			*srq = NULL;
 	struct ib_qp			*qp;
 	char				*buf;
-	struct ib_qp_init_attr		attr;
+	struct ib_qp_init_attr		attr = {};
 	struct ib_uverbs_ex_create_qp_resp resp;
 	int				ret;
+	struct ib_rwq_ind_table *ind_tbl = NULL;
+	bool has_sq = true;
 
 	if (cmd->qp_type == IB_QPT_RAW_PACKET && !capable(CAP_NET_RAW))
 		return -EPERM;
@@ -1775,6 +1788,32 @@ static int create_qp(struct ib_uverbs_file *file,
 	init_uobj(&obj->uevent.uobject, cmd->user_handle, file->ucontext,
 		  &qp_lock_class);
 	down_write(&obj->uevent.uobject.mutex);
+	if (cmd_sz >= offsetof(typeof(*cmd), rwq_ind_tbl_handle) +
+		      sizeof(cmd->rwq_ind_tbl_handle) &&
+		      (cmd->comp_mask & IB_UVERBS_CREATE_QP_MASK_IND_TABLE)) {
+		ind_tbl = idr_read_rwq_indirection_table(cmd->rwq_ind_tbl_handle,
+							 file->ucontext);
+		if (!ind_tbl) {
+			ret = -EINVAL;
+			goto err_put;
+		}
+
+		attr.rwq_ind_tbl = ind_tbl;
+	}
+
+	if ((cmd_sz >= offsetof(typeof(*cmd), reserved1) +
+		       sizeof(cmd->reserved1)) && cmd->reserved1) {
+		ret = -EOPNOTSUPP;
+		goto err_put;
+	}
+
+	if (ind_tbl && (cmd->max_recv_wr || cmd->max_recv_sge || cmd->is_srq)) {
+		ret = -EINVAL;
+		goto err_put;
+	}
+
+	if (ind_tbl && !cmd->max_send_wr)
+		has_sq = false;
 
 	if (cmd->qp_type == IB_QPT_XRC_TGT) {
 		xrcd = idr_read_xrcd(cmd->pd_handle, file->ucontext,
@@ -1798,20 +1837,24 @@ static int create_qp(struct ib_uverbs_file *file,
 				}
 			}
 
-			if (cmd->recv_cq_handle != cmd->send_cq_handle) {
-				rcq = idr_read_cq(cmd->recv_cq_handle,
-						  file->ucontext, 0);
-				if (!rcq) {
-					ret = -EINVAL;
-					goto err_put;
+			if (!ind_tbl) {
+				if (cmd->recv_cq_handle != cmd->send_cq_handle) {
+					rcq = idr_read_cq(cmd->recv_cq_handle,
+							  file->ucontext, 0);
+					if (!rcq) {
+						ret = -EINVAL;
+						goto err_put;
+					}
 				}
 			}
 		}
 
-		scq = idr_read_cq(cmd->send_cq_handle, file->ucontext, !!rcq);
-		rcq = rcq ?: scq;
+		if (has_sq)
+			scq = idr_read_cq(cmd->send_cq_handle, file->ucontext, !!rcq);
+		if (!ind_tbl)
+			rcq = rcq ?: scq;
 		pd  = idr_read_pd(cmd->pd_handle, file->ucontext);
-		if (!pd || !scq) {
+		if (!pd || (!scq && has_sq)) {
 			ret = -EINVAL;
 			goto err_put;
 		}
@@ -1878,16 +1921,20 @@ static int create_qp(struct ib_uverbs_file *file,
 		qp->send_cq	  = attr.send_cq;
 		qp->recv_cq	  = attr.recv_cq;
 		qp->srq		  = attr.srq;
+		qp->rwq_ind_tbl	  = ind_tbl;
 		qp->event_handler = attr.event_handler;
 		qp->qp_context	  = attr.qp_context;
 		qp->qp_type	  = attr.qp_type;
 		atomic_set(&qp->usecnt, 0);
 		atomic_inc(&pd->usecnt);
-		atomic_inc(&attr.send_cq->usecnt);
+		if (attr.send_cq)
+			atomic_inc(&attr.send_cq->usecnt);
 		if (attr.recv_cq)
 			atomic_inc(&attr.recv_cq->usecnt);
 		if (attr.srq)
 			atomic_inc(&attr.srq->usecnt);
+		if (ind_tbl)
+			atomic_inc(&ind_tbl->usecnt);
 	}
 	qp->uobject = &obj->uevent.uobject;
 
@@ -1927,6 +1974,8 @@ static int create_qp(struct ib_uverbs_file *file,
 		put_cq_read(rcq);
 	if (srq)
 		put_srq_read(srq);
+	if (ind_tbl)
+		put_rwq_indirection_table_read(ind_tbl);
 
 	mutex_lock(&file->mutex);
 	list_add_tail(&obj->uevent.uobject.list, &file->ucontext->qp_list);
@@ -1954,6 +2003,8 @@ err_put:
 		put_cq_read(rcq);
 	if (srq)
 		put_srq_read(srq);
+	if (ind_tbl)
+		put_rwq_indirection_table_read(ind_tbl);
 
 	put_uobj_write(&obj->uevent.uobject);
 	return ret;
@@ -2047,7 +2098,7 @@ int ib_uverbs_ex_create_qp(struct ib_uverbs_file *file,
 	if (err)
 		return err;
 
-	if (cmd.comp_mask)
+	if (cmd.comp_mask & ~IB_UVERBS_CREATE_QP_SUP_COMP_MASK)
 		return -EINVAL;
 
 	if (cmd.reserved)
diff --git a/include/uapi/rdma/ib_user_verbs.h b/include/uapi/rdma/ib_user_verbs.h
index 2cf7c95..2c8bca8 100644
--- a/include/uapi/rdma/ib_user_verbs.h
+++ b/include/uapi/rdma/ib_user_verbs.h
@@ -523,6 +523,14 @@ struct ib_uverbs_create_qp {
 	__u64 driver_data[0];
 };
 
+enum ib_uverbs_create_qp_mask {
+	IB_UVERBS_CREATE_QP_MASK_IND_TABLE = 1UL << 0,
+};
+
+enum {
+	IB_UVERBS_CREATE_QP_SUP_COMP_MASK = IB_UVERBS_CREATE_QP_MASK_IND_TABLE,
+};
+
 struct ib_uverbs_ex_create_qp {
 	__u64 user_handle;
 	__u32 pd_handle;
@@ -540,6 +548,8 @@ struct ib_uverbs_ex_create_qp {
 	__u8 reserved;
 	__u32 comp_mask;
 	__u32 create_flags;
+	__u32 rwq_ind_tbl_handle;
+	__u32  reserved1;
 };
 
 struct ib_uverbs_open_qp {
-- 
1.8.3.1


* [PATCH V4 for-next 10/10] IB/mlx5: Add RSS QP support
       [not found] ` <1464006056-19653-1-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
                     ` (8 preceding siblings ...)
  2016-05-23 12:20   ` [PATCH V4 for-next 09/10] IB/uverbs: Extend create QP to get RWQ " Yishai Hadas
@ 2016-05-23 12:20   ` Yishai Hadas
       [not found]     ` <1464006056-19653-11-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2016-05-29 11:27   ` [PATCH V4 for-next 00/10] Verbs RSS Yuval Shaia
  2016-06-23 15:11   ` Doug Ledford
  11 siblings, 1 reply; 36+ messages in thread
From: Yishai Hadas @ 2016-05-23 12:20 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	yishaih-VPRAkNaXOzVWk0Htik3J/w, matanb-VPRAkNaXOzVWk0Htik3J/w,
	alexv-VPRAkNaXOzVWk0Htik3J/w, tzahio-VPRAkNaXOzVWk0Htik3J/w,
	majd-VPRAkNaXOzVWk0Htik3J/w, talal-VPRAkNaXOzVWk0Htik3J/w

Add support for a Raw Ethernet RX HASH QP. Currently, creation and
destruction of such a QP are supported. This QP is implemented as
a simple TIR object which points to the receive RQ indirection table.
The given hashing configuration is used to configure the TIR, which
then selects the right RQ from the RQ indirection table.


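For example, the vendor-specific part of such a create call might be
filled as below. This is a sketch based on the structures added in this
patch; the 40-byte key length is an assumption that must match the
device's Toeplitz key size as checked against the TIR field:

    struct mlx5_ib_create_qp_rss ucmd = {};

    ucmd.rx_hash_function    = MLX5_RX_HASH_FUNC_TOEPLITZ;
    ucmd.rx_hash_fields_mask = MLX5_RX_HASH_SRC_IPV4 |
                               MLX5_RX_HASH_DST_IPV4 |
                               MLX5_RX_HASH_SRC_PORT_TCP |
                               MLX5_RX_HASH_DST_PORT_TCP;
    ucmd.rx_key_len = 40;    /* assumed Toeplitz key size */
    memcpy(ucmd.rx_hash_key, toeplitz_key, ucmd.rx_key_len);
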
Signed-off-by: Yishai Hadas <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/hw/mlx5/mlx5_ib.h |   5 +
 drivers/infiniband/hw/mlx5/qp.c      | 200 +++++++++++++++++++++++++++++++++++
 drivers/infiniband/hw/mlx5/user.h    |  34 ++++++
 3 files changed, 239 insertions(+)

diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
index cd3d620..7ac4647 100644
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
+++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
@@ -295,6 +295,10 @@ struct mlx5_ib_qp_trans {
 	u8			resp_depth;
 };
 
+struct mlx5_ib_rss_qp {
+	u32	tirn;
+};
+
 struct mlx5_ib_rq {
 	struct mlx5_ib_qp_base base;
 	struct mlx5_ib_wq	*rq;
@@ -323,6 +327,7 @@ struct mlx5_ib_qp {
 	union {
 		struct mlx5_ib_qp_trans trans_qp;
 		struct mlx5_ib_raw_packet_qp raw_packet_qp;
+		struct mlx5_ib_rss_qp rss_qp;
 	};
 	struct mlx5_buf		buf;
 
diff --git a/drivers/infiniband/hw/mlx5/qp.c b/drivers/infiniband/hw/mlx5/qp.c
index 94faa50..2d462c6 100644
--- a/drivers/infiniband/hw/mlx5/qp.c
+++ b/drivers/infiniband/hw/mlx5/qp.c
@@ -1264,6 +1264,187 @@ static void raw_packet_qp_copy_info(struct mlx5_ib_qp *qp,
 	rq->doorbell = &qp->db;
 }
 
+static void destroy_rss_raw_qp_tir(struct mlx5_ib_dev *dev, struct mlx5_ib_qp *qp)
+{
+	mlx5_core_destroy_tir(dev->mdev, qp->rss_qp.tirn);
+}
+
+static int create_rss_raw_qp_tir(struct mlx5_ib_dev *dev, struct mlx5_ib_qp *qp,
+				 struct ib_pd *pd,
+				 struct ib_qp_init_attr *init_attr,
+				 struct ib_udata *udata)
+{
+	struct ib_uobject *uobj = pd->uobject;
+	struct ib_ucontext *ucontext = uobj->context;
+	struct mlx5_ib_ucontext *mucontext = to_mucontext(ucontext);
+	struct mlx5_ib_create_qp_resp resp = {};
+	int inlen;
+	int err;
+	u32 *in;
+	void *tirc;
+	void *hfso;
+	u32 selected_fields = 0;
+	size_t min_resp_len;
+	u32 tdn = mucontext->tdn;
+	struct mlx5_ib_create_qp_rss ucmd = {};
+	size_t required_cmd_sz;
+
+	if (init_attr->qp_type != IB_QPT_RAW_PACKET)
+		return -EOPNOTSUPP;
+
+	if (init_attr->create_flags || init_attr->send_cq)
+		return -EINVAL;
+
+	min_resp_len = offsetof(typeof(resp), uuar_index) + sizeof(resp.uuar_index);
+	if (udata->outlen < min_resp_len)
+		return -EINVAL;
+
+	required_cmd_sz = offsetof(typeof(ucmd), reserved1) + sizeof(ucmd.reserved1);
+	if (udata->inlen < required_cmd_sz) {
+		mlx5_ib_dbg(dev, "invalid inlen\n");
+		return -EINVAL;
+	}
+
+	if (udata->inlen > sizeof(ucmd) &&
+	    !ib_is_udata_cleared(udata, sizeof(ucmd),
+				 udata->inlen - sizeof(ucmd))) {
+		mlx5_ib_dbg(dev, "inlen is not supported\n");
+		return -EOPNOTSUPP;
+	}
+
+	if (ib_copy_from_udata(&ucmd, udata, min(sizeof(ucmd), udata->inlen))) {
+		mlx5_ib_dbg(dev, "copy failed\n");
+		return -EFAULT;
+	}
+
+	if (ucmd.comp_mask) {
+		mlx5_ib_dbg(dev, "invalid comp mask\n");
+		return -EOPNOTSUPP;
+	}
+
+	if (memchr_inv(ucmd.reserved, 0, sizeof(ucmd.reserved)) || ucmd.reserved1) {
+		mlx5_ib_dbg(dev, "invalid reserved\n");
+		return -EOPNOTSUPP;
+	}
+
+	err = ib_copy_to_udata(udata, &resp, min_resp_len);
+	if (err) {
+		mlx5_ib_dbg(dev, "copy failed\n");
+		return -EINVAL;
+	}
+
+	inlen = MLX5_ST_SZ_BYTES(create_tir_in);
+	in = mlx5_vzalloc(inlen);
+	if (!in)
+		return -ENOMEM;
+
+	tirc = MLX5_ADDR_OF(create_tir_in, in, ctx);
+	MLX5_SET(tirc, tirc, disp_type,
+		 MLX5_TIRC_DISP_TYPE_INDIRECT);
+	MLX5_SET(tirc, tirc, indirect_table,
+		 init_attr->rwq_ind_tbl->ind_tbl_num);
+	MLX5_SET(tirc, tirc, transport_domain, tdn);
+
+	hfso = MLX5_ADDR_OF(tirc, tirc, rx_hash_field_selector_outer);
+	switch (ucmd.rx_hash_function) {
+	case MLX5_RX_HASH_FUNC_TOEPLITZ:
+	{
+		void *rss_key = MLX5_ADDR_OF(tirc, tirc, rx_hash_toeplitz_key);
+		size_t len = MLX5_FLD_SZ_BYTES(tirc, rx_hash_toeplitz_key);
+
+		if (len != ucmd.rx_key_len) {
+			err = -EINVAL;
+			goto err;
+		}
+
+		MLX5_SET(tirc, tirc, rx_hash_fn, MLX5_RX_HASH_FN_TOEPLITZ);
+		MLX5_SET(tirc, tirc, rx_hash_symmetric, 1);
+		memcpy(rss_key, ucmd.rx_hash_key, len);
+		break;
+	}
+	default:
+		err = -EOPNOTSUPP;
+		goto err;
+	}
+
+	if (!ucmd.rx_hash_fields_mask) {
+		/* special case when this TIR serves as steering entry without hashing */
+		if (!init_attr->rwq_ind_tbl->log_ind_tbl_size)
+			goto create_tir;
+		err = -EINVAL;
+		goto err;
+	}
+
+	if (((ucmd.rx_hash_fields_mask & MLX5_RX_HASH_SRC_IPV4) ||
+	     (ucmd.rx_hash_fields_mask & MLX5_RX_HASH_DST_IPV4)) &&
+	     ((ucmd.rx_hash_fields_mask & MLX5_RX_HASH_SRC_IPV6) ||
+	     (ucmd.rx_hash_fields_mask & MLX5_RX_HASH_DST_IPV6))) {
+		err = -EINVAL;
+		goto err;
+	}
+
+	/* If none of IPV4 & IPV6 SRC/DST was set - this bit field is ignored */
+	if ((ucmd.rx_hash_fields_mask & MLX5_RX_HASH_SRC_IPV4) ||
+	    (ucmd.rx_hash_fields_mask & MLX5_RX_HASH_DST_IPV4))
+		MLX5_SET(rx_hash_field_select, hfso, l3_prot_type,
+			 MLX5_L3_PROT_TYPE_IPV4);
+	else if ((ucmd.rx_hash_fields_mask & MLX5_RX_HASH_SRC_IPV6) ||
+		 (ucmd.rx_hash_fields_mask & MLX5_RX_HASH_DST_IPV6))
+		MLX5_SET(rx_hash_field_select, hfso, l3_prot_type,
+			 MLX5_L3_PROT_TYPE_IPV6);
+
+	if (((ucmd.rx_hash_fields_mask & MLX5_RX_HASH_SRC_PORT_TCP) ||
+	     (ucmd.rx_hash_fields_mask & MLX5_RX_HASH_DST_PORT_TCP)) &&
+	     ((ucmd.rx_hash_fields_mask & MLX5_RX_HASH_SRC_PORT_UDP) ||
+	     (ucmd.rx_hash_fields_mask & MLX5_RX_HASH_DST_PORT_UDP))) {
+		err = -EINVAL;
+		goto err;
+	}
+
+	/* If none of TCP & UDP SRC/DST was set - this bit field is ignored */
+	if ((ucmd.rx_hash_fields_mask & MLX5_RX_HASH_SRC_PORT_TCP) ||
+	    (ucmd.rx_hash_fields_mask & MLX5_RX_HASH_DST_PORT_TCP))
+		MLX5_SET(rx_hash_field_select, hfso, l4_prot_type,
+			 MLX5_L4_PROT_TYPE_TCP);
+	else if ((ucmd.rx_hash_fields_mask & MLX5_RX_HASH_SRC_PORT_UDP) ||
+		 (ucmd.rx_hash_fields_mask & MLX5_RX_HASH_DST_PORT_UDP))
+		MLX5_SET(rx_hash_field_select, hfso, l4_prot_type,
+			 MLX5_L4_PROT_TYPE_UDP);
+
+	if ((ucmd.rx_hash_fields_mask & MLX5_RX_HASH_SRC_IPV4) ||
+	    (ucmd.rx_hash_fields_mask & MLX5_RX_HASH_SRC_IPV6))
+		selected_fields |= MLX5_HASH_FIELD_SEL_SRC_IP;
+
+	if ((ucmd.rx_hash_fields_mask & MLX5_RX_HASH_DST_IPV4) ||
+	    (ucmd.rx_hash_fields_mask & MLX5_RX_HASH_DST_IPV6))
+		selected_fields |= MLX5_HASH_FIELD_SEL_DST_IP;
+
+	if ((ucmd.rx_hash_fields_mask & MLX5_RX_HASH_SRC_PORT_TCP) ||
+	    (ucmd.rx_hash_fields_mask & MLX5_RX_HASH_SRC_PORT_UDP))
+		selected_fields |= MLX5_HASH_FIELD_SEL_L4_SPORT;
+
+	if ((ucmd.rx_hash_fields_mask & MLX5_RX_HASH_DST_PORT_TCP) ||
+	    (ucmd.rx_hash_fields_mask & MLX5_RX_HASH_DST_PORT_UDP))
+		selected_fields |= MLX5_HASH_FIELD_SEL_L4_DPORT;
+
+	MLX5_SET(rx_hash_field_select, hfso, selected_fields, selected_fields);
+
+create_tir:
+	err = mlx5_core_create_tir(dev->mdev, in, inlen, &qp->rss_qp.tirn);
+
+	if (err)
+		goto err;
+
+	kvfree(in);
+	/* qpn is reserved for that QP */
+	qp->trans_qp.base.mqp.qpn = 0;
+	return 0;
+
+err:
+	kvfree(in);
+	return err;
+}
+
 static int create_qp_common(struct mlx5_ib_dev *dev, struct ib_pd *pd,
 			    struct ib_qp_init_attr *init_attr,
 			    struct ib_udata *udata, struct mlx5_ib_qp *qp)
@@ -1290,6 +1471,14 @@ static int create_qp_common(struct mlx5_ib_dev *dev, struct ib_pd *pd,
 	spin_lock_init(&qp->sq.lock);
 	spin_lock_init(&qp->rq.lock);
 
+	if (init_attr->rwq_ind_tbl) {
+		if (!udata)
+			return -ENOSYS;
+
+		err = create_rss_raw_qp_tir(dev, qp, pd, init_attr, udata);
+		return err;
+	}
+
 	if (init_attr->create_flags & IB_QP_CREATE_BLOCK_MULTICAST_LOOPBACK) {
 		if (!MLX5_CAP_GEN(mdev, block_lb_mc)) {
 			mlx5_ib_dbg(dev, "block multicast loopback isn't supported\n");
@@ -1642,6 +1831,11 @@ static void destroy_qp_common(struct mlx5_ib_dev *dev, struct mlx5_ib_qp *qp)
 	struct mlx5_modify_qp_mbox_in *in;
 	int err;
 
+	if (qp->ibqp.rwq_ind_tbl) {
+		destroy_rss_raw_qp_tir(dev, qp);
+		return;
+	}
+
 	base = qp->ibqp.qp_type == IB_QPT_RAW_PACKET ?
 	       &qp->raw_packet_qp.rq.base :
 	       &qp->trans_qp.base;
@@ -2498,6 +2692,9 @@ int mlx5_ib_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr,
 	int port;
 	enum rdma_link_layer ll = IB_LINK_LAYER_UNSPECIFIED;
 
+	if (ibqp->rwq_ind_tbl)
+		return -ENOSYS;
+
 	if (unlikely(ibqp->qp_type == IB_QPT_GSI))
 		return mlx5_ib_gsi_modify_qp(ibqp, attr, attr_mask);
 
@@ -4112,6 +4309,9 @@ int mlx5_ib_query_qp(struct ib_qp *ibqp, struct ib_qp_attr *qp_attr,
 	int err = 0;
 	u8 raw_packet_qp_state;
 
+	if (ibqp->rwq_ind_tbl)
+		return -ENOSYS;
+
 	if (unlikely(ibqp->qp_type == IB_QPT_GSI))
 		return mlx5_ib_gsi_query_qp(ibqp, qp_attr, qp_attr_mask,
 					    qp_init_attr);
diff --git a/drivers/infiniband/hw/mlx5/user.h b/drivers/infiniband/hw/mlx5/user.h
index 0f87955..33c54fb 100644
--- a/drivers/infiniband/hw/mlx5/user.h
+++ b/drivers/infiniband/hw/mlx5/user.h
@@ -152,6 +152,40 @@ struct mlx5_ib_create_qp {
 	__u64	sq_buf_addr;
 };
 
+/* RX Hash function flags */
+enum mlx5_rx_hash_function_flags {
+	MLX5_RX_HASH_FUNC_TOEPLITZ	= 1 << 0,
+};
+
+/*
+ * RX Hash flags. These flags specify which fields of an incoming packet
+ * should participate in the RX hash. Each flag represents a certain packet
+ * field; when a flag is set, the field it represents is included in the
+ * RX hash calculation.
+ * Note: the *IPV4 and *IPV6 flags can't be enabled together on the same QP,
+ * and the *TCP and *UDP flags can't be enabled together on the same QP.
+ */
+enum mlx5_rx_hash_fields {
+	MLX5_RX_HASH_SRC_IPV4	= 1 << 0,
+	MLX5_RX_HASH_DST_IPV4	= 1 << 1,
+	MLX5_RX_HASH_SRC_IPV6	= 1 << 2,
+	MLX5_RX_HASH_DST_IPV6	= 1 << 3,
+	MLX5_RX_HASH_SRC_PORT_TCP	= 1 << 4,
+	MLX5_RX_HASH_DST_PORT_TCP	= 1 << 5,
+	MLX5_RX_HASH_SRC_PORT_UDP	= 1 << 6,
+	MLX5_RX_HASH_DST_PORT_UDP	= 1 << 7
+};
+
+struct mlx5_ib_create_qp_rss {
+	__u64 rx_hash_fields_mask; /* enum mlx5_rx_hash_fields */
+	__u8 rx_hash_function; /* enum mlx5_rx_hash_function_flags */
+	__u8 rx_key_len; /* valid only for Toeplitz */
+	__u8 reserved[6];
+	__u8 rx_hash_key[128]; /* valid only for Toeplitz */
+	__u32   comp_mask;
+	__u32   reserved1;
+};
+
 struct mlx5_ib_create_qp_resp {
 	__u32	uuar_index;
 };
-- 
1.8.3.1


* Re: [PATCH V4 for-next 02/10] IB/core: Introduce Work Queue object and its verbs
       [not found]     ` <1464006056-19653-3-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2016-05-23 16:57       ` Steve Wise
       [not found]         ` <5d0982d2-3b07-11a3-a74c-a52b8e9fb392-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
  2016-05-29 12:51       ` Sagi Grimberg
  1 sibling, 1 reply; 36+ messages in thread
From: Steve Wise @ 2016-05-23 16:57 UTC (permalink / raw)
  To: Yishai Hadas, dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, matanb-VPRAkNaXOzVWk0Htik3J/w,
	alexv-VPRAkNaXOzVWk0Htik3J/w, tzahio-VPRAkNaXOzVWk0Htik3J/w,
	majd-VPRAkNaXOzVWk0Htik3J/w, talal-VPRAkNaXOzVWk0Htik3J/w

On 5/23/2016 7:20 AM, Yishai Hadas wrote:
> Introduce Work Queue object and its create/destroy/modify verbs.
>
> QP can be created without internal WQs "packaged" inside it,
> this QP can be configured to use "external" WQ object as its
> receive/send queue.
> WQ is a necessary component for RSS technology since RSS mechanism
> is supposed to distribute the traffic between multiple
> Receive Work Queues.
>
> WQ associated (many to one) with Completion Queue and it owns WQ
> properties (PD, WQ size, etc.).
> WQ has a type; this patch introduces the IB_WQT_RQ (i.e. receive queue),
> and it may be extended to others such as IB_WQT_SQ (send queue).
> WQ from type IB_WQT_RQ contains receive work requests.
>
> PD is an attribute of a work queue (i.e. send/receive queue), it's used
> by the hardware for security validation before scattering to a memory
> region which is pointed to by the WQ. For that, an external WQ object
> needs a PD, letting the hardware make that validation.
>
> When accessing a memory region that is pointed to by the WQ, its PD
> is used and not the QP's PD, this behavior is similar
> to a SRQ and a QP.
>
> WQ context is subject to a well-defined state transitions done by
> the modify_wq verb.
> When WQ is created its initial state becomes IB_WQS_RESET.
> From IB_WQS_RESET it can be modified to itself or to IB_WQS_RDY.
> From IB_WQS_RDY it can be modified to itself, to IB_WQS_RESET
> or to IB_WQS_ERR.
> From IB_WQS_ERR it can be modified to IB_WQS_RESET.
>
> Note: transition to IB_WQS_ERR might occur implicitly in case there
> was some HW error.

Hey Yishai,

What happens to posted RQ WRs when transitioning from RDY->RESET, 
RDY->ERR, and ERR->RESET?
Shouldn't the state machine be defined more?

Steve.


* RE: [PATCH V4 for-next 05/10] IB/core: Introduce Receive Work Queue indirection table
       [not found]     ` <1464006056-19653-6-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2016-05-23 17:07       ` Steve Wise
  2016-05-24  7:39         ` Yishai Hadas
  2016-05-29 13:06       ` Sagi Grimberg
  1 sibling, 1 reply; 36+ messages in thread
From: Steve Wise @ 2016-05-23 17:07 UTC (permalink / raw)
  To: 'Yishai Hadas', dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, matanb-VPRAkNaXOzVWk0Htik3J/w,
	alexv-VPRAkNaXOzVWk0Htik3J/w, tzahio-VPRAkNaXOzVWk0Htik3J/w,
	majd-VPRAkNaXOzVWk0Htik3J/w, talal-VPRAkNaXOzVWk0Htik3J/w

> Introduce Receive Work Queue (WQ) indirection table.
> This object can be used to spread incoming traffic to different
> receive Work Queues.
> 
> A Receive WQ indirection table points to a variable-sized set of WQs.
> This table is given to a QP in downstream patches.
> 
> Signed-off-by: Yishai Hadas <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> ---
>  drivers/infiniband/core/verbs.c | 62 +++++++++++++++++++++++++++++++++++++++++
>  include/rdma/ib_verbs.h         | 23 +++++++++++++++
>  2 files changed, 85 insertions(+)
> 
> diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
> index c096cad..6b548d7 100644
> --- a/drivers/infiniband/core/verbs.c
> +++ b/drivers/infiniband/core/verbs.c
> @@ -1636,6 +1636,68 @@ int ib_modify_wq(struct ib_wq *wq, struct ib_wq_attr *wq_attr,
>  }
>  EXPORT_SYMBOL(ib_modify_wq);
> 
> +/*
> + * ib_create_rwq_ind_table - Creates a RQ Indirection Table.
> + * @device: The device on which to create the rwq indirection table.
> + * @ib_rwq_ind_table_init_attr: A list of initial attributes required to
> + * create the Indirection Table.
> + *
> + * Note: The life time of ib_rwq_ind_table_init_attr->ind_tbl is not less
> + *	than the created ib_rwq_ind_table object and the caller is responsible
> + *	for its memory allocation/free.
> + */
> +struct ib_rwq_ind_table *ib_create_rwq_ind_table(struct ib_device *device,
> +						 struct ib_rwq_ind_table_init_attr *init_attr)
> +{
> +	struct ib_rwq_ind_table *rwq_ind_table;
> +	int i;
> +	u32 table_size;
> +
> +	if (!device->create_rwq_ind_table)
> +		return ERR_PTR(-ENOSYS);
> +
> +	table_size = (1 << init_attr->log_ind_tbl_size);
> +	rwq_ind_table = device->create_rwq_ind_table(device,
> +				init_attr, NULL);
> +	if (IS_ERR(rwq_ind_table))
> +		return rwq_ind_table;
> +
> +	rwq_ind_table->ind_tbl = init_attr->ind_tbl;
> +	rwq_ind_table->log_ind_tbl_size = init_attr->log_ind_tbl_size;
> +	rwq_ind_table->device = device;
> +	rwq_ind_table->uobject = NULL;
> +	atomic_set(&rwq_ind_table->usecnt, 0);
> +
> +	for (i = 0; i < table_size; i++)
> +		atomic_inc(&rwq_ind_table->ind_tbl[i]->usecnt);

Hey Yishai,

Why do you need usecnt on the individual tables allocated by the device?  I see
them inc'd here and dec'd in destroy.  But I don't see them used anywhere else?

Steve.



* Re: [PATCH V4 for-next 05/10] IB/core: Introduce Receive Work Queue indirection table
  2016-05-23 17:07       ` Steve Wise
@ 2016-05-24  7:39         ` Yishai Hadas
       [not found]           ` <46a7de86-dffe-e44c-3768-10dfb70e8802-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
  0 siblings, 1 reply; 36+ messages in thread
From: Yishai Hadas @ 2016-05-24  7:39 UTC (permalink / raw)
  To: Steve Wise
  Cc: 'Yishai Hadas',
	dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, matanb-VPRAkNaXOzVWk0Htik3J/w,
	alexv-VPRAkNaXOzVWk0Htik3J/w, tzahio-VPRAkNaXOzVWk0Htik3J/w,
	majd-VPRAkNaXOzVWk0Htik3J/w, talal-VPRAkNaXOzVWk0Htik3J/w

On 5/23/2016 8:07 PM, Steve Wise wrote:
>> Introduce Receive Work Queue (WQ) indirection table.
>> This object can be used to spread incoming traffic to different
>> receive Work Queues.
>>
>> A Receive WQ indirection table points to a variable-sized set of WQs.
>> This table is given to a QP in downstream patches.
>>
>> Signed-off-by: Yishai Hadas <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
>> Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
>> ---
>>  drivers/infiniband/core/verbs.c | 62 +++++++++++++++++++++++++++++++++++++++++
>>  include/rdma/ib_verbs.h         | 23 +++++++++++++++
>>  2 files changed, 85 insertions(+)
>>
>> diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
>> index c096cad..6b548d7 100644
>> --- a/drivers/infiniband/core/verbs.c
>> +++ b/drivers/infiniband/core/verbs.c
>> @@ -1636,6 +1636,68 @@ int ib_modify_wq(struct ib_wq *wq, struct ib_wq_attr *wq_attr,
>>  }
>>  EXPORT_SYMBOL(ib_modify_wq);
>>
>> +/*
>> + * ib_create_rwq_ind_table - Creates a RQ Indirection Table.
>> + * @device: The device on which to create the rwq indirection table.
>> + * @ib_rwq_ind_table_init_attr: A list of initial attributes required to
>> + * create the Indirection Table.
>> + *
>> + * Note: The life time of ib_rwq_ind_table_init_attr->ind_tbl is not less
>> + *	than the created ib_rwq_ind_table object and the caller is responsible
>> + *	for its memory allocation/free.
>> + */
>> +struct ib_rwq_ind_table *ib_create_rwq_ind_table(struct ib_device *device,
>> +						 struct ib_rwq_ind_table_init_attr *init_attr)
>> +{
>> +	struct ib_rwq_ind_table *rwq_ind_table;
>> +	int i;
>> +	u32 table_size;
>> +
>> +	if (!device->create_rwq_ind_table)
>> +		return ERR_PTR(-ENOSYS);
>> +
>> +	table_size = (1 << init_attr->log_ind_tbl_size);
>> +	rwq_ind_table = device->create_rwq_ind_table(device,
>> +				init_attr, NULL);
>> +	if (IS_ERR(rwq_ind_table))
>> +		return rwq_ind_table;
>> +
>> +	rwq_ind_table->ind_tbl = init_attr->ind_tbl;
>> +	rwq_ind_table->log_ind_tbl_size = init_attr->log_ind_tbl_size;
>> +	rwq_ind_table->device = device;
>> +	rwq_ind_table->uobject = NULL;
>> +	atomic_set(&rwq_ind_table->usecnt, 0);
>> +
>> +	for (i = 0; i < table_size; i++)
>> +		atomic_inc(&rwq_ind_table->ind_tbl[i]->usecnt);
>
> Hey Yishai,
>
> Why do you need usecnt on the individual tables allocated by the device?  I see
> them inc'd here and dec'd in destroy.  But I don't see them used anywhere else?
>
> Steve.

Look at ib_destroy_wq for this reference count usage.

WQ and indirection table are IB objects that are managed by the IB layer 
for both update and cleanup. Specifically, a WQ can't be cleaned up while 
there is an indirection table that points to it. From the application's 
point of view, holding a handle to an indirection table implies that all 
its underlying WQ(s) are valid, so the IB layer must prevent destruction 
of any WQ before the indirection table is released; this is done by the 
ref count mechanism, as for other related IB objects.
You can also look at ib_uverbs_cleanup_ucontext where, upon process 
termination, the cleanup order is to first destroy the indirection 
table(s) and then go over all WQ(s) and destroy them.
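
In code terms, teardown must therefore release the table before its
WQs; a short sketch of the expected behavior (the error value is
illustrative):

    /* Wrong order: the WQ is still referenced by the indirection table,
     * so its usecnt is non-zero and ib_destroy_wq() refuses (e.g. -EBUSY). */
    err = ib_destroy_wq(wq);

    /* Correct order: destroying the table drops each WQ's usecnt first. */
    err = ib_destroy_rwq_ind_table(ind_tbl);
    if (!err)
            err = ib_destroy_wq(wq);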

* Re: [PATCH V4 for-next 02/10] IB/core: Introduce Work Queue object and its verbs
       [not found]         ` <5d0982d2-3b07-11a3-a74c-a52b8e9fb392-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
@ 2016-05-24  8:50           ` Yishai Hadas
       [not found]             ` <d2401170-0ff4-c39a-d7c6-2a5face00fa2-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
  0 siblings, 1 reply; 36+ messages in thread
From: Yishai Hadas @ 2016-05-24  8:50 UTC (permalink / raw)
  To: Steve Wise
  Cc: Yishai Hadas, dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, matanb-VPRAkNaXOzVWk0Htik3J/w,
	alexv-VPRAkNaXOzVWk0Htik3J/w, tzahio-VPRAkNaXOzVWk0Htik3J/w,
	majd-VPRAkNaXOzVWk0Htik3J/w, talal-VPRAkNaXOzVWk0Htik3J/w

On 5/23/2016 7:57 PM, Steve Wise wrote:
> On 5/23/2016 7:20 AM, Yishai Hadas wrote:
>> Introduce Work Queue object and its create/destroy/modify verbs.
>>
>> QP can be created without internal WQs "packaged" inside it,
>> this QP can be configured to use "external" WQ object as its
>> receive/send queue.
>> WQ is a necessary component for RSS technology since RSS mechanism
>> is supposed to distribute the traffic between multiple
>> Receive Work Queues.
>>
>> WQ associated (many to one) with Completion Queue and it owns WQ
>> properties (PD, WQ size, etc.).
>> WQ has a type; this patch introduces the IB_WQT_RQ (i.e. receive queue),
>> and it may be extended to others such as IB_WQT_SQ (send queue).
>> WQ from type IB_WQT_RQ contains receive work requests.
>>
>> PD is an attribute of a work queue (i.e. send/receive queue), it's used
>> by the hardware for security validation before scattering to a memory
>> region which is pointed to by the WQ. For that, an external WQ object
>> needs a PD, letting the hardware make that validation.
>>
>> When accessing a memory region that is pointed to by the WQ, its PD
>> is used and not the QP's PD, this behavior is similar
>> to a SRQ and a QP.
>>
>> WQ context is subject to a well-defined state transitions done by
>> the modify_wq verb.
>> When WQ is created its initial state becomes IB_WQS_RESET.
>> From IB_WQS_RESET it can be modified to itself or to IB_WQS_RDY.
>> From IB_WQS_RDY it can be modified to itself, to IB_WQS_RESET
>> or to IB_WQS_ERR.
>> From IB_WQS_ERR it can be modified to IB_WQS_RESET.
>>
>> Note: transition to IB_WQS_ERR might occur implicitly in case there
>> was some HW error.
>
> Hey Yishai,
>
> What happens to posted RQ WRs when transitioning from RDY->RESET,
> RDY->ERR, and ERR->RESET?
> Shouldn't the state machine be defined more?

When the external RQ (i.e. a WQ of type RQ) is moved to the error/reset 
state it behaves similarly to the receive queue part of a QP when that 
QP enters the error/reset state.

Specifically,
When moving to RESET:
RQ attributes are reset to the same values as after the RQ was created.
Outstanding Work Requests are removed from the queues without notifying 
the Consumer.

When moving to ERR:
RQ processing is stopped. Work Requests pending or in process are 
completed in error.


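A sketch of driving these transitions with the new verb (the
attribute-mask names follow this series and are assumptions here):

    struct ib_wq_attr wq_attr = {};
    int err;

    /* RESET -> RDY: make the RQ ready to receive */
    wq_attr.wq_state = IB_WQS_RDY;
    err = ib_modify_wq(wq, &wq_attr, IB_WQ_STATE, NULL);

    /* ERR -> RESET: flush the in-error WRs and start over */
    wq_attr.wq_state      = IB_WQS_RESET;
    wq_attr.curr_wq_state = IB_WQS_ERR;
    err = ib_modify_wq(wq, &wq_attr, IB_WQ_STATE | IB_WQ_CUR_STATE, NULL);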

* RE: [PATCH V4 for-next 05/10] IB/core: Introduce Receive Work Queue indirection table
       [not found]           ` <46a7de86-dffe-e44c-3768-10dfb70e8802-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
@ 2016-05-24 14:02             ` Steve Wise
  0 siblings, 0 replies; 36+ messages in thread
From: Steve Wise @ 2016-05-24 14:02 UTC (permalink / raw)
  To: 'Yishai Hadas'
  Cc: 'Yishai Hadas',
	dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, matanb-VPRAkNaXOzVWk0Htik3J/w,
	alexv-VPRAkNaXOzVWk0Htik3J/w, tzahio-VPRAkNaXOzVWk0Htik3J/w,
	majd-VPRAkNaXOzVWk0Htik3J/w, talal-VPRAkNaXOzVWk0Htik3J/w

> > Hey Yishai,
> >
> > Why do you need usecnt on the individual tables allocated by the device?  I see
> > them inc'd here and dec'd in destroy.  But I don't see them used anywhere else?
> >
> > Steve.
> 
> Look at ib_destroy_wq for this reference count usage.
> 
> WQ and indirection table are IB objects that are managed by the IB layer
> for both update/cleanup. Specifically, a WQ can't be cleaned up if there
> is an indirection table that points to. From application point of view
> when it has an handle to an indirection table it assumes that all its
> underlying WQ(s) are valid so the IB layer should prevent any
> destruction of any WQ before releasing the indirection table, this is
> done by the ref count mechanism as done for other related IB objects.
> You can look also at ib_uverbs_cleanup_ucontext where upon process
> termination the order of cleanup is first destroying the indirection
> table(s) then going over all WQ(s) and destroy them.

I see.  Thanks for the clarification.

Steve.


* RE: [PATCH V4 for-next 02/10] IB/core: Introduce Work Queue object and its verbs
       [not found]             ` <d2401170-0ff4-c39a-d7c6-2a5face00fa2-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
@ 2016-05-24 14:25               ` Steve Wise
  2016-05-24 14:51                 ` Yishai Hadas
  0 siblings, 1 reply; 36+ messages in thread
From: Steve Wise @ 2016-05-24 14:25 UTC (permalink / raw)
  To: 'Yishai Hadas'
  Cc: 'Yishai Hadas',
	dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, matanb-VPRAkNaXOzVWk0Htik3J/w,
	alexv-VPRAkNaXOzVWk0Htik3J/w, tzahio-VPRAkNaXOzVWk0Htik3J/w,
	majd-VPRAkNaXOzVWk0Htik3J/w, talal-VPRAkNaXOzVWk0Htik3J/w

> > Hey Yishai,
> >
> > What happens to posted RQ WRs when transitioning from RDY->RESET,
> > RDY->ERR, and ERR->RESET?
> > Shouldn't the state machine be defined more?
> 
> When the external RQ (i.e. WQ from type RQ) is moved to error/reset
> state it will behave similarly to the receive queue part of a QP when it
> enters error/reset state.
> 
> Specifically,
> When moving to RESET:
> RQ attributes are reset to the same values after the RQ was created.
> Outstanding Work Requests are removed from the queues without notifying
> the Consumer.
> 
> When Moving to ERR:
> RQ processing is stopped. Work Requests pending or in process are
> completed in error.
> 

Thanks for the explanation.  These semantics should be documented somewhere.

This series appears to me to be reasonable.  

Series Reviewed-by: Steve Wise <swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>

Thanks,

Steve.


* Re: [PATCH V4 for-next 02/10] IB/core: Introduce Work Queue object and its verbs
  2016-05-24 14:25               ` Steve Wise
@ 2016-05-24 14:51                 ` Yishai Hadas
  0 siblings, 0 replies; 36+ messages in thread
From: Yishai Hadas @ 2016-05-24 14:51 UTC (permalink / raw)
  To: Steve Wise, dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: 'Yishai Hadas',
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, matanb-VPRAkNaXOzVWk0Htik3J/w,
	alexv-VPRAkNaXOzVWk0Htik3J/w, tzahio-VPRAkNaXOzVWk0Htik3J/w,
	majd-VPRAkNaXOzVWk0Htik3J/w, talal-VPRAkNaXOzVWk0Htik3J/w

On 5/24/2016 5:25 PM, Steve Wise wrote:
>>> Hey Yishai,
>>>
>>> What happens to posted RQ WRs when transitioning from RDY->RESET,
>>> RDY->ERR, and ERR->RESET?
>>> Shouldn't the state machine be defined in more detail?
>>
>> When the external RQ (i.e. a WQ of type RQ) is moved to the error/reset
>> state, it behaves like the receive queue part of a QP entering the
>> error/reset state.
>>
>> Specifically,
>> When moving to RESET:
>> RQ attributes are reset to the values they had just after the RQ was
>> created. Outstanding Work Requests are removed from the queues without
>> notifying the Consumer.
>>
>> When moving to ERR:
>> RQ processing is stopped. Work Requests pending or in process are
>> completed in error.
>>
>
> Thanks for the explanation.  These semantics should be documented somewhere.
>
> This series looks reasonable to me.
>
> Series Reviewed-by: Steve Wise <swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>

Thanks, Steve, for your Reviewed-by on the series; it's really appreciated.

Doug, can you please take it into 4.7? It has already missed a few merge
windows; I hope it makes it in this time.

* Re: [PATCH V4 for-next 00/10] Verbs RSS
       [not found] ` <1464006056-19653-1-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
                     ` (9 preceding siblings ...)
  2016-05-23 12:20   ` [PATCH V4 for-next 10/10] IB/mlx5: Add RSS QP support Yishai Hadas
@ 2016-05-29 11:27   ` Yuval Shaia
  2016-06-23 15:11   ` Doug Ledford
  11 siblings, 0 replies; 36+ messages in thread
From: Yuval Shaia @ 2016-05-29 11:27 UTC (permalink / raw)
  To: Yishai Hadas
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, matanb-VPRAkNaXOzVWk0Htik3J/w,
	alexv-VPRAkNaXOzVWk0Htik3J/w, tzahio-VPRAkNaXOzVWk0Htik3J/w,
	majd-VPRAkNaXOzVWk0Htik3J/w, talal-VPRAkNaXOzVWk0Htik3J/w

On Mon, May 23, 2016 at 03:20:46PM +0300, Yishai Hadas wrote:
> 
> RSS (Receive Side Scaling) technology allows to spread incoming traffic
> between different receive descriptor queues.
> Assigning each queue to different CPU cores allows to better load
> balance the incoming traffic and improve performance.
> 
> This patch-set introduces some new objects and verbs in order to allow
> verbs based solutions to utilize the RSS offload capability which is
> widely supported today by many modern NICs. It extends the IB and uverbs
> layers to support the above functionality and supplies a specific
> implementation for the mlx5_ib driver.
Any plans for mlx4_ib driver?

* Re: [PATCH V4 for-next 01/10] net/mlx5: Export required core functions to support RSS
       [not found]     ` <1464006056-19653-2-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2016-05-29 12:48       ` Sagi Grimberg
  0 siblings, 0 replies; 36+ messages in thread
From: Sagi Grimberg @ 2016-05-29 12:48 UTC (permalink / raw)
  To: Yishai Hadas, dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, matanb-VPRAkNaXOzVWk0Htik3J/w,
	alexv-VPRAkNaXOzVWk0Htik3J/w, tzahio-VPRAkNaXOzVWk0Htik3J/w,
	majd-VPRAkNaXOzVWk0Htik3J/w, talal-VPRAkNaXOzVWk0Htik3J/w

Looks good Yishai,

Reviewed-by: Sagi Grimberg <sagi-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>

* Re: [PATCH V4 for-next 02/10] IB/core: Introduce Work Queue object and its verbs
       [not found]     ` <1464006056-19653-3-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2016-05-23 16:57       ` Steve Wise
@ 2016-05-29 12:51       ` Sagi Grimberg
       [not found]         ` <574AE5C4.5010707-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
  1 sibling, 1 reply; 36+ messages in thread
From: Sagi Grimberg @ 2016-05-29 12:51 UTC (permalink / raw)
  To: Yishai Hadas, dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, matanb-VPRAkNaXOzVWk0Htik3J/w,
	alexv-VPRAkNaXOzVWk0Htik3J/w, tzahio-VPRAkNaXOzVWk0Htik3J/w,
	majd-VPRAkNaXOzVWk0Htik3J/w, talal-VPRAkNaXOzVWk0Htik3J/w

Looks good,

Reviewed-by: Sagi Grimberg <sagi-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>

* Re: [PATCH V4 for-next 02/10] IB/core: Introduce Work Queue object and its verbs
       [not found]         ` <574AE5C4.5010707-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
@ 2016-05-29 12:56           ` Sagi Grimberg
       [not found]             ` <574AE710.3070901-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
  0 siblings, 1 reply; 36+ messages in thread
From: Sagi Grimberg @ 2016-05-29 12:56 UTC (permalink / raw)
  To: Yishai Hadas, dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, matanb-VPRAkNaXOzVWk0Htik3J/w,
	alexv-VPRAkNaXOzVWk0Htik3J/w, tzahio-VPRAkNaXOzVWk0Htik3J/w,
	majd-VPRAkNaXOzVWk0Htik3J/w, talal-VPRAkNaXOzVWk0Htik3J/w


> Looks good,
>
> Reviewed-by: Sagi Grimberg <sagi-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>

Wait, I have a question first,

Can you explain why a WQ takes a reference on a CQ?
I don't really see why we enforce this association. Does it mean I
cannot destroy the CQ before I destroy the WQ?

Look at SRQs, which I find quite similar (the normal kind, not the XRC
one); they do not reference the CQ.

* Re: [PATCH V4 for-next 03/10] IB/uverbs: Add WQ support
       [not found]     ` <1464006056-19653-4-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2016-05-29 12:57       ` Sagi Grimberg
  0 siblings, 0 replies; 36+ messages in thread
From: Sagi Grimberg @ 2016-05-29 12:57 UTC (permalink / raw)
  To: Yishai Hadas, dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, matanb-VPRAkNaXOzVWk0Htik3J/w,
	alexv-VPRAkNaXOzVWk0Htik3J/w, tzahio-VPRAkNaXOzVWk0Htik3J/w,
	majd-VPRAkNaXOzVWk0Htik3J/w, talal-VPRAkNaXOzVWk0Htik3J/w

Looks good,

Reviewed-by: Sagi Grimberg <sagi-TmH2Wj2nsNJBDLzU/O5InQ@public.gmane.org>

* Re: [PATCH V4 for-next 04/10] IB/mlx5: Add receive Work Queue verbs
       [not found]     ` <1464006056-19653-5-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2016-05-29 13:02       ` Sagi Grimberg
  2016-05-30 20:13       ` Or Gerlitz
  1 sibling, 0 replies; 36+ messages in thread
From: Sagi Grimberg @ 2016-05-29 13:02 UTC (permalink / raw)
  To: Yishai Hadas, dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, matanb-VPRAkNaXOzVWk0Htik3J/w,
	alexv-VPRAkNaXOzVWk0Htik3J/w, tzahio-VPRAkNaXOzVWk0Htik3J/w,
	majd-VPRAkNaXOzVWk0Htik3J/w, talal-VPRAkNaXOzVWk0Htik3J/w



> +	err = set_user_rq_size(dev, init_attr, &ucmd, rwq);
> +	if (err) {
> +		mlx5_ib_dbg(dev, "err %d\n", err);
> +		return err;
> +	}

Nit: it would be nice to have a meaningful error print. Otherwise looks good,
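
Something along these lines, for instance (the exact wording is just a
suggestion):

    err = set_user_rq_size(dev, init_attr, &ucmd, rwq);
    if (err) {
            mlx5_ib_dbg(dev, "set_user_rq_size() failed, err %d\n", err);
            return err;
    }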

Reviewed-by: Sagi Grimberg <sagi-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>

* Re: [PATCH V4 for-next 05/10] IB/core: Introduce Receive Work Queue indirection table
       [not found]     ` <1464006056-19653-6-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2016-05-23 17:07       ` Steve Wise
@ 2016-05-29 13:06       ` Sagi Grimberg
  1 sibling, 0 replies; 36+ messages in thread
From: Sagi Grimberg @ 2016-05-29 13:06 UTC (permalink / raw)
  To: Yishai Hadas, dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, matanb-VPRAkNaXOzVWk0Htik3J/w,
	alexv-VPRAkNaXOzVWk0Htik3J/w, tzahio-VPRAkNaXOzVWk0Htik3J/w,
	majd-VPRAkNaXOzVWk0Htik3J/w, talal-VPRAkNaXOzVWk0Htik3J/w

Looks fine,

Reviewed-by: Sagi Grimberg <sagi-jYi7zNYP+F9BDLzU/O5InQ@public.gmane.org>

* Re: [PATCH V4 for-next 06/10] IB/uverbs: Introduce RWQ Indirection table
       [not found]     ` <1464006056-19653-7-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2016-05-29 13:07       ` Sagi Grimberg
  0 siblings, 0 replies; 36+ messages in thread
From: Sagi Grimberg @ 2016-05-29 13:07 UTC (permalink / raw)
  To: Yishai Hadas, dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, matanb-VPRAkNaXOzVWk0Htik3J/w,
	alexv-VPRAkNaXOzVWk0Htik3J/w, tzahio-VPRAkNaXOzVWk0Htik3J/w,
	majd-VPRAkNaXOzVWk0Htik3J/w, talal-VPRAkNaXOzVWk0Htik3J/w

Looks fine,

Reviewed-by: Sagi Grimberg <sagi-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>

* Re: [PATCH V4 for-next 07/10] IB/mlx5: Add Receive Work Queue Indirection table operations
       [not found]     ` <1464006056-19653-8-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2016-05-29 13:08       ` Sagi Grimberg
  0 siblings, 0 replies; 36+ messages in thread
From: Sagi Grimberg @ 2016-05-29 13:08 UTC (permalink / raw)
  To: Yishai Hadas, dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, matanb-VPRAkNaXOzVWk0Htik3J/w,
	alexv-VPRAkNaXOzVWk0Htik3J/w, tzahio-VPRAkNaXOzVWk0Htik3J/w,
	majd-VPRAkNaXOzVWk0Htik3J/w, talal-VPRAkNaXOzVWk0Htik3J/w

Looks fine,

Reviewed-by: Sagi Grimberg <sagi-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>

* Re: [PATCH V4 for-next 08/10] IB/core: Extend create QP to get indirection table
       [not found]     ` <1464006056-19653-9-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2016-05-29 13:11       ` Sagi Grimberg
       [not found]         ` <574AEA8D.2010909-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
  0 siblings, 1 reply; 36+ messages in thread
From: Sagi Grimberg @ 2016-05-29 13:11 UTC (permalink / raw)
  To: Yishai Hadas, dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, matanb-VPRAkNaXOzVWk0Htik3J/w,
	alexv-VPRAkNaXOzVWk0Htik3J/w, tzahio-VPRAkNaXOzVWk0Htik3J/w,
	majd-VPRAkNaXOzVWk0Htik3J/w, talal-VPRAkNaXOzVWk0Htik3J/w



On 23/05/16 15:20, Yishai Hadas wrote:
> Extend create QP to get Receive Work Queue (WQ) indirection table.
>
> A QP can be created with an external Receive Work Queue indirection
> table; in that case it is ready to receive immediately.
>
>
> Signed-off-by: Yishai Hadas <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> ---
>   drivers/infiniband/core/verbs.c | 19 +++++++++++++++++--
>   include/rdma/ib_verbs.h         |  2 ++
>   2 files changed, 19 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
> index 6b548d7..6916d5c 100644
> --- a/drivers/infiniband/core/verbs.c
> +++ b/drivers/infiniband/core/verbs.c
> @@ -754,6 +754,12 @@ struct ib_qp *ib_create_qp(struct ib_pd *pd,
>   	struct ib_qp *qp;
>   	int ret;
>
> +	if (qp_init_attr->rwq_ind_tbl &&
> +	    (qp_init_attr->recv_cq ||
> +	    qp_init_attr->srq || qp_init_attr->cap.max_recv_wr ||
> +	    qp_init_attr->cap.max_recv_sge))
> +		return ERR_PTR(-EINVAL);
> +

Yishai,

I understand that a QP cannot have an indirection table together with an
SRQ or a recv_cq, but can you explain the max_recv_[wr|sge] condition?

Would it make more sense to condition on the QP type?

Other than that,

Reviewed-by: Sagi Grimberg <sagi-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>

* Re: [PATCH V4 for-next 09/10] IB/uverbs: Extend create QP to get RWQ indirection table
       [not found]     ` <1464006056-19653-10-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2016-05-29 13:12       ` Sagi Grimberg
  0 siblings, 0 replies; 36+ messages in thread
From: Sagi Grimberg @ 2016-05-29 13:12 UTC (permalink / raw)
  To: Yishai Hadas, dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, matanb-VPRAkNaXOzVWk0Htik3J/w,
	alexv-VPRAkNaXOzVWk0Htik3J/w, tzahio-VPRAkNaXOzVWk0Htik3J/w,
	majd-VPRAkNaXOzVWk0Htik3J/w, talal-VPRAkNaXOzVWk0Htik3J/w

Looks fine,

Reviewed-by: Sagi Grimberg <sagi-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>

* Re: [PATCH V4 for-next 10/10] IB/mlx5: Add RSS QP support
       [not found]     ` <1464006056-19653-11-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2016-05-29 13:16       ` Sagi Grimberg
  0 siblings, 0 replies; 36+ messages in thread
From: Sagi Grimberg @ 2016-05-29 13:16 UTC (permalink / raw)
  To: Yishai Hadas, dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, matanb-VPRAkNaXOzVWk0Htik3J/w,
	alexv-VPRAkNaXOzVWk0Htik3J/w, tzahio-VPRAkNaXOzVWk0Htik3J/w,
	majd-VPRAkNaXOzVWk0Htik3J/w, talal-VPRAkNaXOzVWk0Htik3J/w


> +/* RX Hash function flags */
> +enum mlx5_rx_hash_function_flags {
> +	MLX5_RX_HASH_FUNC_TOEPLITZ	= 1 << 0,
> +};
> +
> +/*
> + * RX Hash flags. These flags allow selecting which fields of the incoming
> + * packet participate in the RX hash. Each flag represents a certain packet
> + * field; when a flag is set, the corresponding field will participate in
> + * the RX hash calculation.
> + * Note: *IPV4 and *IPV6 flags can't be enabled together on the same QP
> + * and *TCP and *UDP flags can't be enabled together on the same QP.
> +*/
> +enum mlx5_rx_hash_fields {
> +	MLX5_RX_HASH_SRC_IPV4	= 1 << 0,
> +	MLX5_RX_HASH_DST_IPV4	= 1 << 1,
> +	MLX5_RX_HASH_SRC_IPV6	= 1 << 2,
> +	MLX5_RX_HASH_DST_IPV6	= 1 << 3,
> +	MLX5_RX_HASH_SRC_PORT_TCP	= 1 << 4,
> +	MLX5_RX_HASH_DST_PORT_TCP	= 1 << 5,
> +	MLX5_RX_HASH_SRC_PORT_UDP	= 1 << 6,
> +	MLX5_RX_HASH_DST_PORT_UDP	= 1 << 7
> +};
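
For example, a user asking for Toeplitz hashing over the TCP/IPv4
4-tuple would fill the vendor-specific create-QP data roughly like this
(a sketch only; struct name per patch 10 of this series):

    struct mlx5_ib_create_qp_rss ucmd = {};

    ucmd.rx_hash_function    = MLX5_RX_HASH_FUNC_TOEPLITZ;
    ucmd.rx_hash_fields_mask = MLX5_RX_HASH_SRC_IPV4 |
                               MLX5_RX_HASH_DST_IPV4 |
                               MLX5_RX_HASH_SRC_PORT_TCP |
                               MLX5_RX_HASH_DST_PORT_TCP;
    /* Mixing *IPV4 with *IPV6, or *TCP with *UDP, would be rejected. */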

I never really understood why these should be driver specific but I
know it was decided so...

Reviewed-by: Sagi Grimberg <sagi-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>

* Re: [PATCH V4 for-next 02/10] IB/core: Introduce Work Queue object and its verbs
       [not found]             ` <574AE710.3070901-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
@ 2016-05-29 13:30               ` Yishai Hadas
       [not found]                 ` <7747beea-5c96-6c17-a87e-a16a45252487-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
  0 siblings, 1 reply; 36+ messages in thread
From: Yishai Hadas @ 2016-05-29 13:30 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: Yishai Hadas, dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, matanb-VPRAkNaXOzVWk0Htik3J/w,
	alexv-VPRAkNaXOzVWk0Htik3J/w, tzahio-VPRAkNaXOzVWk0Htik3J/w,
	majd-VPRAkNaXOzVWk0Htik3J/w, talal-VPRAkNaXOzVWk0Htik3J/w

On 5/29/2016 3:56 PM, Sagi Grimberg wrote:
>
>> Looks good,
>>
>> Reviewed-by: Sagi Grimberg <sagi-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
>
> Wait, I have a question first,
>
> Can you explain why a WQ takes a reference on a CQ?
> I don't really see why we enforce this association. Does it mean I
> cannot destroy the CQ before I destroy the WQ?
>
> Look at SRQs, which I find quite similar (the normal kind, not the XRC
> one); they do not reference the CQ.

This WQ->CQ association is the same as QP->CQ, where a reference is
taken on the CQ. When creating a normal SRQ there is no input CQ; only
when working with an XRC one is there an input CQ, and in that case a
reference is indeed taken.
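
For reference, a minimal sketch of WQ creation with the verbs from patch
02 (pd and cq are assumed to exist already):

    struct ib_wq_init_attr wq_attr = {};
    struct ib_wq *wq;

    wq_attr.wq_type = IB_WQT_RQ;
    wq_attr.max_wr  = 256;          /* receive work requests */
    wq_attr.max_sge = 1;
    wq_attr.cq      = cq;           /* completions land here, hence the ref */

    wq = ib_create_wq(pd, &wq_attr);
    if (IS_ERR(wq))
            return PTR_ERR(wq);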

* Re: [PATCH V4 for-next 08/10] IB/core: Extend create QP to get indirection table
       [not found]         ` <574AEA8D.2010909-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
@ 2016-05-29 13:55           ` Yishai Hadas
       [not found]             ` <d34f7f40-5e05-2eb9-4b81-649b2cb11174-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
  0 siblings, 1 reply; 36+ messages in thread
From: Yishai Hadas @ 2016-05-29 13:55 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: Yishai Hadas, dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, matanb-VPRAkNaXOzVWk0Htik3J/w,
	alexv-VPRAkNaXOzVWk0Htik3J/w, tzahio-VPRAkNaXOzVWk0Htik3J/w,
	majd-VPRAkNaXOzVWk0Htik3J/w, talal-VPRAkNaXOzVWk0Htik3J/w

On 5/29/2016 4:11 PM, Sagi Grimberg wrote:
>
>
> On 23/05/16 15:20, Yishai Hadas wrote:
>> Extend create QP to get Receive Work Queue (WQ) indirection table.
>>
>> A QP can be created with an external Receive Work Queue indirection
>> table; in that case it is ready to receive immediately.
>>
>>
>> Signed-off-by: Yishai Hadas <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
>> Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
>> ---
>>   drivers/infiniband/core/verbs.c | 19 +++++++++++++++++--
>>   include/rdma/ib_verbs.h         |  2 ++
>>   2 files changed, 19 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
>> index 6b548d7..6916d5c 100644
>> --- a/drivers/infiniband/core/verbs.c
>> +++ b/drivers/infiniband/core/verbs.c
>> @@ -754,6 +754,12 @@ struct ib_qp *ib_create_qp(struct ib_pd *pd,
>>       struct ib_qp *qp;
>>       int ret;
>>
>> +    if (qp_init_attr->rwq_ind_tbl &&
>> +        (qp_init_attr->recv_cq ||
>> +        qp_init_attr->srq || qp_init_attr->cap.max_recv_wr ||
>> +        qp_init_attr->cap.max_recv_sge))
>> +        return ERR_PTR(-EINVAL);
>> +
>
> Yishai,
>
> I understand that a QP cannot have an indirection table together with an
> SRQ or a recv_cq, but can you explain the max_recv_[wr|sge] condition?
>
> Would it make more sense to condition on the QP type?
>
> Other than that,
>
> Reviewed-by: Sagi Grimberg <sagi-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>

Thanks, Sagi, for reviewing the series.

When a receive WQ indirection table is supplied as input, any other
receive-side parameters do not apply, as they were already set when the
underlying WQ(s) were created. The checks on max_recv_[wr|sge] ensure
that the input is valid, just as is done for srq and recv_cq.

Re QP type checking: the supported QP types are going to be reported as
part of the RSS capabilities and may differ between vendors, so each
vendor may condition on the QP type at creation time. Please see the
cover letter as well.
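
To make the contract concrete, a sketch of creating an RSS QP on top of
an existing indirection table (pd, send_cq and ind_tbl are assumed to
exist; which qp_type is accepted is vendor dependent):

    struct ib_qp_init_attr qp_attr = {};
    struct ib_qp *qp;

    qp_attr.qp_type     = IB_QPT_RAW_PACKET;
    qp_attr.send_cq     = send_cq;  /* the send side is still per-QP */
    qp_attr.rwq_ind_tbl = ind_tbl;  /* receive side comes from the table's WQs */
    /* recv_cq, srq, cap.max_recv_wr and cap.max_recv_sge stay zero. */

    qp = ib_create_qp(pd, &qp_attr);
    if (IS_ERR(qp))
            return PTR_ERR(qp);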

* Re: [PATCH V4 for-next 02/10] IB/core: Introduce Work Queue object and its verbs
       [not found]                 ` <7747beea-5c96-6c17-a87e-a16a45252487-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
@ 2016-05-30  6:38                   ` Sagi Grimberg
  0 siblings, 0 replies; 36+ messages in thread
From: Sagi Grimberg @ 2016-05-30  6:38 UTC (permalink / raw)
  To: Yishai Hadas
  Cc: Yishai Hadas, dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, matanb-VPRAkNaXOzVWk0Htik3J/w,
	alexv-VPRAkNaXOzVWk0Htik3J/w, tzahio-VPRAkNaXOzVWk0Htik3J/w,
	majd-VPRAkNaXOzVWk0Htik3J/w, talal-VPRAkNaXOzVWk0Htik3J/w


>>> Looks good,
>>>
>>> Reviewed-by: Sagi Grimberg <sagi-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
>>
>> Wait, I have a question first,
>>
>> Can you explain why a WQ takes a reference on a CQ?
>> I don't really see why we enforce this association. Does it mean I
>> cannot destroy the CQ before I destroy the WQ?
>>
>> Look at SRQs, which I find quite similar (the normal kind, not the XRC
>> one); they do not reference the CQ.
>
> This WQ->CQ association is the same as QP->CQ, where a reference is
> taken on the CQ. When creating a normal SRQ there is no input CQ; only
> when working with an XRC one is there an input CQ, and in that case a
> reference is indeed taken.

Makes sense. I overlooked the CQ parameter. So again,

Reviewed-by: Sagi Grimberg <sagi-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>

* Re: [PATCH V4 for-next 08/10] IB/core: Extend create QP to get indirection table
       [not found]             ` <d34f7f40-5e05-2eb9-4b81-649b2cb11174-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
@ 2016-05-30  6:43               ` Sagi Grimberg
  0 siblings, 0 replies; 36+ messages in thread
From: Sagi Grimberg @ 2016-05-30  6:43 UTC (permalink / raw)
  To: Yishai Hadas
  Cc: Yishai Hadas, dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, matanb-VPRAkNaXOzVWk0Htik3J/w,
	alexv-VPRAkNaXOzVWk0Htik3J/w, tzahio-VPRAkNaXOzVWk0Htik3J/w,
	majd-VPRAkNaXOzVWk0Htik3J/w, talal-VPRAkNaXOzVWk0Htik3J/w



> Thanks, Sagi, for reviewing the series.

Sure, I think it's an important series.

> When a receive WQ indirection table is supplied as input, any other
> receive-side parameters do not apply, as they were already set when the
> underlying WQ(s) were created. The checks on max_recv_[wr|sge] ensure
> that the input is valid, just as is done for srq and recv_cq.

OK, makes sense.

> Re QP type checking: the supported QP types are going to be reported as
> part of the RSS capabilities and may differ between vendors, so each
> vendor may condition on the QP type at creation time. Please see the
> cover letter as well.

Yeah, I guess raw Ethernet QPs are not supported by all vendors...

Thanks.

* Re: [PATCH V4 for-next 04/10] IB/mlx5: Add receive Work Queue verbs
       [not found]     ` <1464006056-19653-5-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2016-05-29 13:02       ` Sagi Grimberg
@ 2016-05-30 20:13       ` Or Gerlitz
  1 sibling, 0 replies; 36+ messages in thread
From: Or Gerlitz @ 2016-05-30 20:13 UTC (permalink / raw)
  To: Yishai Hadas
  Cc: Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Matan Barak,
	Alex Vainman, Tzahi Oved, Majd Dibbiny,
	talal-VPRAkNaXOzVWk0Htik3J/w

On Mon, May 23, 2016 at 3:20 PM, Yishai Hadas <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote:

> +enum {
> +       MLX5_WQ_USER,
> +       MLX5_WQ_KERNEL
> +};

any reason to add MLX5_WQ_KERNEL?

* Re: [PATCH V4 for-next 00/10] Verbs RSS
       [not found] ` <1464006056-19653-1-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
                     ` (10 preceding siblings ...)
  2016-05-29 11:27   ` [PATCH V4 for-next 00/10] Verbs RSS Yuval Shaia
@ 2016-06-23 15:11   ` Doug Ledford
  11 siblings, 0 replies; 36+ messages in thread
From: Doug Ledford @ 2016-06-23 15:11 UTC (permalink / raw)
  To: Yishai Hadas
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, matanb-VPRAkNaXOzVWk0Htik3J/w,
	alexv-VPRAkNaXOzVWk0Htik3J/w, tzahio-VPRAkNaXOzVWk0Htik3J/w,
	majd-VPRAkNaXOzVWk0Htik3J/w, talal-VPRAkNaXOzVWk0Htik3J/w

On 05/23/2016 08:20 AM, Yishai Hadas wrote:
> Hi Doug,
> 
> This V4 series addressed the note to move RSS hash capabilities
> from the common udata to be part of vendor specific data.
> As was agreed in the OFAWG meeting, WQ and indirection table
> are common objects. The last 3 patches were changed
> accordingly.
> 
> As the RSS series already missed few kernel merging, would appreciate
> if it can be taken into 4.7.

I've taken this in for 4.8.  It's in my mlx5-4.8 topic branch since it
is so intertwined with mlx5.

-- 
Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
              GPG KeyID: 0E572FDD




Thread overview: 36+ messages
2016-05-23 12:20 [PATCH V4 for-next 00/10] Verbs RSS Yishai Hadas
     [not found] ` <1464006056-19653-1-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2016-05-23 12:20   ` [PATCH V4 for-next 01/10] net/mlx5: Export required core functions to support RSS Yishai Hadas
     [not found]     ` <1464006056-19653-2-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2016-05-29 12:48       ` Sagi Grimberg
2016-05-23 12:20   ` [PATCH V4 for-next 02/10] IB/core: Introduce Work Queue object and its verbs Yishai Hadas
     [not found]     ` <1464006056-19653-3-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2016-05-23 16:57       ` Steve Wise
     [not found]         ` <5d0982d2-3b07-11a3-a74c-a52b8e9fb392-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
2016-05-24  8:50           ` Yishai Hadas
     [not found]             ` <d2401170-0ff4-c39a-d7c6-2a5face00fa2-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2016-05-24 14:25               ` Steve Wise
2016-05-24 14:51                 ` Yishai Hadas
2016-05-29 12:51       ` Sagi Grimberg
     [not found]         ` <574AE5C4.5010707-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
2016-05-29 12:56           ` Sagi Grimberg
     [not found]             ` <574AE710.3070901-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
2016-05-29 13:30               ` Yishai Hadas
     [not found]                 ` <7747beea-5c96-6c17-a87e-a16a45252487-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2016-05-30  6:38                   ` Sagi Grimberg
2016-05-23 12:20   ` [PATCH V4 for-next 03/10] IB/uverbs: Add WQ support Yishai Hadas
     [not found]     ` <1464006056-19653-4-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2016-05-29 12:57       ` Sagi Grimberg
2016-05-23 12:20   ` [PATCH V4 for-next 04/10] IB/mlx5: Add receive Work Queue verbs Yishai Hadas
     [not found]     ` <1464006056-19653-5-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2016-05-29 13:02       ` Sagi Grimberg
2016-05-30 20:13       ` Or Gerlitz
2016-05-23 12:20   ` [PATCH V4 for-next 05/10] IB/core: Introduce Receive Work Queue indirection table Yishai Hadas
     [not found]     ` <1464006056-19653-6-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2016-05-23 17:07       ` Steve Wise
2016-05-24  7:39         ` Yishai Hadas
     [not found]           ` <46a7de86-dffe-e44c-3768-10dfb70e8802-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2016-05-24 14:02             ` Steve Wise
2016-05-29 13:06       ` Sagi Grimberg
2016-05-23 12:20   ` [PATCH V4 for-next 06/10] IB/uverbs: Introduce RWQ Indirection table Yishai Hadas
     [not found]     ` <1464006056-19653-7-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2016-05-29 13:07       ` Sagi Grimberg
2016-05-23 12:20   ` [PATCH V4 for-next 07/10] IB/mlx5: Add Receive Work Queue Indirection table operations Yishai Hadas
     [not found]     ` <1464006056-19653-8-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2016-05-29 13:08       ` Sagi Grimberg
2016-05-23 12:20   ` [PATCH V4 for-next 08/10] IB/core: Extend create QP to get indirection table Yishai Hadas
     [not found]     ` <1464006056-19653-9-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2016-05-29 13:11       ` Sagi Grimberg
     [not found]         ` <574AEA8D.2010909-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
2016-05-29 13:55           ` Yishai Hadas
     [not found]             ` <d34f7f40-5e05-2eb9-4b81-649b2cb11174-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2016-05-30  6:43               ` Sagi Grimberg
2016-05-23 12:20   ` [PATCH V4 for-next 09/10] IB/uverbs: Extend create QP to get RWQ " Yishai Hadas
     [not found]     ` <1464006056-19653-10-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2016-05-29 13:12       ` Sagi Grimberg
2016-05-23 12:20   ` [PATCH V4 for-next 10/10] IB/mlx5: Add RSS QP support Yishai Hadas
     [not found]     ` <1464006056-19653-11-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2016-05-29 13:16       ` Sagi Grimberg
2016-05-29 11:27   ` [PATCH V4 for-next 00/10] Verbs RSS Yuval Shaia
2016-06-23 15:11   ` Doug Ledford
