All of lore.kernel.org
* [PATCH libmlx5 v2 0/7] Raw Packet QP for mlx5 v2
@ 2016-02-02 19:29 Majd Dibbiny
       [not found] ` <1454441391-30494-1-git-send-email-majd-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 9+ messages in thread
From: Majd Dibbiny @ 2016-02-02 19:29 UTC (permalink / raw)
  To: yishaih-VPRAkNaXOzVWk0Htik3J/w
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, matanb-VPRAkNaXOzVWk0Htik3J/w,
	talal-VPRAkNaXOzVWk0Htik3J/w, majd-VPRAkNaXOzVWk0Htik3J/w,
	cl-vYTEC60ixJUAvxtiuMwx3w

Hi Yishai,

This patch set adds support for Raw Packet QP for libmlx5.

Raw Packet QP enables the user to send and receive raw packets. The user is
responsible for building the packet, including the headers.
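
For context, a minimal sketch of how an application might create such a QP
through the standard verbs API (pd and cq are assumed to exist already, the
numeric caps are only illustrative, and the kernel typically requires
CAP_NET_RAW for raw QPs; error handling omitted):

    struct ibv_qp_init_attr init_attr = {
            .send_cq = cq,
            .recv_cq = cq,
            .cap = {
                    .max_send_wr  = 64,
                    .max_recv_wr  = 64,
                    .max_send_sge = 1,
                    .max_recv_sge = 1,
            },
            /* raw packets: the application builds the full frame itself */
            .qp_type = IBV_QPT_RAW_PACKET,
    };
    struct ibv_qp *qp = ibv_create_qp(pd, &init_attr);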

Raw Packet QP works with the non-default CQE version, and in order to tie a
CQE to its Raw Packet QP, we need to provide a user-index at QP creation
when working with the non-default CQE version.

The first 5 patches add support for CQE version 1 for QPs and XSRQs. The last
two patches add support for Raw Packet QP (control and data path).

Changes from v1:
1. Fix compilation errors when compiling in debug mode

Haggai Abramovsky (5):
  Add infrastructure for resource identification
  Add resource tracking database
  Add new poll_cq according to the new CQE format
  Add QP and XSRQ create/destroy flow with user index
  Work with CQE version 1

Majd Dibbiny (2):
  Allocate separate RQ and SQ buffers for Raw Packet QP
  Add Raw Packet QP data-path functionality

 src/cq.c       | 271 +++++++++++++++++++++++++++++++++++++++++++++++----------
 src/mlx5-abi.h |  20 ++++-
 src/mlx5.c     |  90 +++++++++++++++++++
 src/mlx5.h     |  63 +++++++++++++-
 src/qp.c       | 114 ++++++++++++++++++++----
 src/verbs.c    | 210 +++++++++++++++++++++++++++++++++++---------
 src/wqe.h      |  20 +++++
 7 files changed, 686 insertions(+), 102 deletions(-)

-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH libmlx5 v2 1/7] Add infrastructure for resource identification
       [not found] ` <1454441391-30494-1-git-send-email-majd-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2016-02-02 19:29   ` Majd Dibbiny
  2016-02-02 19:29   ` [PATCH libmlx5 v2 2/7] Add resource tracking database Majd Dibbiny
                     ` (6 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Majd Dibbiny @ 2016-02-02 19:29 UTC (permalink / raw)
  To: yishaih-VPRAkNaXOzVWk0Htik3J/w
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, matanb-VPRAkNaXOzVWk0Htik3J/w,
	talal-VPRAkNaXOzVWk0Htik3J/w, majd-VPRAkNaXOzVWk0Htik3J/w,
	cl-vYTEC60ixJUAvxtiuMwx3w, Haggai Abramovsky

From: Haggai Abramovsky <hagaya-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

Add a new struct member, mlx5_resource, to each tracked object
(mlx5_srq and mlx5_qp), so that the object can be retrieved in
poll_cq when needed.
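
The "This struct must be first" comments below matter because later patches
cast a stored mlx5_resource pointer straight back to its containing object;
patch 3 of this series adds helpers along these lines:

    /* Valid only because rsc is the first member of mlx5_qp (and likewise
     * mlx5_srq), i.e. the cast is container_of() with a zero offset. */
    static inline struct mlx5_qp *rsc_to_mqp(struct mlx5_resource *rsc)
    {
            return (struct mlx5_qp *)rsc;
    }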

Reviewed-by: Yishai Hadas <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Haggai Abramovsky <hagaya-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Majd Dibbiny <majd-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 src/mlx5.h | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/src/mlx5.h b/src/mlx5.h
index 9181ec5..a930bac 100644
--- a/src/mlx5.h
+++ b/src/mlx5.h
@@ -228,6 +228,18 @@ enum mlx5_alloc_type {
 	MLX5_ALLOC_TYPE_ALL
 };
 
+enum mlx5_rsc_type {
+	MLX5_RSC_TYPE_QP,
+	MLX5_RSC_TYPE_XSRQ,
+	MLX5_RSC_TYPE_SRQ,
+	MLX5_RSC_TYPE_INVAL,
+};
+
+struct mlx5_resource {
+	enum mlx5_rsc_type	type;
+	uint32_t		rsn;
+};
+
 struct mlx5_device {
 	struct verbs_device	verbs_dev;
 	int			page_size;
@@ -341,6 +353,7 @@ struct mlx5_cq {
 };
 
 struct mlx5_srq {
+	struct mlx5_resource            rsc;  /* This struct must be first */
 	struct verbs_srq		vsrq;
 	struct mlx5_buf			buf;
 	struct mlx5_spinlock		lock;
@@ -392,6 +405,7 @@ struct mlx5_mr {
 };
 
 struct mlx5_qp {
+	struct mlx5_resource            rsc; /* This struct must be first */
 	struct verbs_qp			verbs_qp;
 	struct ibv_qp		       *ibv_qp;
 	struct mlx5_buf                 buf;
@@ -491,7 +505,9 @@ static inline struct mlx5_cq *to_mcq(struct ibv_cq *ibcq)
 
 static inline struct mlx5_srq *to_msrq(struct ibv_srq *ibsrq)
 {
-	return (struct mlx5_srq *)ibsrq;
+	struct verbs_srq *vsrq = (struct verbs_srq *)ibsrq;
+
+	return container_of(vsrq, struct mlx5_srq, vsrq);
 }
 
 static inline struct mlx5_qp *to_mqp(struct ibv_qp *ibqp)
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH libmlx5 v2 2/7] Add resource tracking database
       [not found] ` <1454441391-30494-1-git-send-email-majd-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2016-02-02 19:29   ` [PATCH libmlx5 v2 1/7] Add infrastructure for resource identification Majd Dibbiny
@ 2016-02-02 19:29   ` Majd Dibbiny
  2016-02-02 19:29   ` [PATCH libmlx5 v2 3/7] Add new poll_cq according to the new CQE format Majd Dibbiny
                     ` (5 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Majd Dibbiny @ 2016-02-02 19:29 UTC (permalink / raw)
  To: yishaih-VPRAkNaXOzVWk0Htik3J/w
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, matanb-VPRAkNaXOzVWk0Htik3J/w,
	talal-VPRAkNaXOzVWk0Htik3J/w, majd-VPRAkNaXOzVWk0Htik3J/w,
	cl-vYTEC60ixJUAvxtiuMwx3w, Haggai Abramovsky

From: Haggai Abramovsky <hagaya-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

Add a new database that stores the context of all QPs and XSRQs.
Insertions into and deletions from the database are keyed by the
object's user-index.

This database allows poll_one to retrieve the objects (QPs and
XSRQs) by their user-index.
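
The database is a two-level table indexed by the 24-bit user-index; with the
constants added below (a 12-bit shift), a lookup is roughly the following
sketch (ctx is the mlx5_context and 0x5034 is just an example value):

    uint32_t uidx = 0x5034;                    /* example user-index           */
    int tind = uidx >> MLX5_UIDX_TABLE_SHIFT;  /* 0x5: which sub-table         */
    int slot = uidx & MLX5_UIDX_TABLE_MASK;    /* 0x34: slot inside that table */
    struct mlx5_resource *rsc = ctx->uidx_table[tind].table[slot];

Sub-tables are allocated lazily and reference-counted, so a context with only
a few QPs/XSRQs allocates at most one 4K-entry table.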

Reviewed-by: Yishai Hadas <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Haggai Abramovsky <hagaya-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Majd Dibbiny <majd-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 src/mlx5.c | 76 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 src/mlx5.h | 24 ++++++++++++++++++++
 2 files changed, 100 insertions(+)

diff --git a/src/mlx5.c b/src/mlx5.c
index e44898a..877df12 100644
--- a/src/mlx5.c
+++ b/src/mlx5.c
@@ -127,6 +127,81 @@ static int read_number_from_line(const char *line, int *value)
 	*value = atoi(ptr);
 	return 0;
 }
+/**
+ * The function looks for the first free user-index in all the
+ * user-index tables. If all are used, returns -1, otherwise
+ * a valid user-index.
+ * In case the reference count of the table is zero, it means the
+ * table is not in use and wasn't allocated yet, therefore the
+ * mlx5_store_uidx allocates the table, and increment the reference
+ * count on the table.
+ */
+static int32_t get_free_uidx(struct mlx5_context *ctx)
+{
+	int32_t tind;
+	int32_t i;
+
+	for (tind = 0; tind < MLX5_UIDX_TABLE_SIZE; tind++) {
+		if (ctx->uidx_table[tind].refcnt < MLX5_UIDX_TABLE_MASK)
+			break;
+	}
+
+	if (tind == MLX5_UIDX_TABLE_SIZE)
+		return -1;
+
+	if (!ctx->uidx_table[tind].refcnt)
+		return tind << MLX5_UIDX_TABLE_SHIFT;
+
+	for (i = 0; i < MLX5_UIDX_TABLE_MASK + 1; i++) {
+		if (!ctx->uidx_table[tind].table[i])
+			break;
+	}
+
+	return (tind << MLX5_UIDX_TABLE_SHIFT) | i;
+}
+
+int32_t mlx5_store_uidx(struct mlx5_context *ctx, void *rsc)
+{
+	int32_t tind;
+	int32_t ret = -1;
+	int32_t uidx;
+
+	pthread_mutex_lock(&ctx->uidx_table_mutex);
+	uidx = get_free_uidx(ctx);
+	if (uidx < 0)
+		goto out;
+
+	tind = uidx >> MLX5_UIDX_TABLE_SHIFT;
+
+	if (!ctx->uidx_table[tind].refcnt) {
+		ctx->uidx_table[tind].table = calloc(MLX5_UIDX_TABLE_MASK + 1,
+						     sizeof(void *));
+		if (!ctx->uidx_table[tind].table)
+			goto out;
+	}
+
+	++ctx->uidx_table[tind].refcnt;
+	ctx->uidx_table[tind].table[uidx & MLX5_UIDX_TABLE_MASK] = rsc;
+	ret = uidx;
+
+out:
+	pthread_mutex_unlock(&ctx->uidx_table_mutex);
+	return ret;
+}
+
+void mlx5_clear_uidx(struct mlx5_context *ctx, uint32_t uidx)
+{
+	int tind = uidx >> MLX5_UIDX_TABLE_SHIFT;
+
+	pthread_mutex_lock(&ctx->uidx_table_mutex);
+
+	if (!--ctx->uidx_table[tind].refcnt)
+		free(ctx->uidx_table[tind].table);
+	else
+		ctx->uidx_table[tind].table[uidx & MLX5_UIDX_TABLE_MASK] = NULL;
+
+	pthread_mutex_unlock(&ctx->uidx_table_mutex);
+}
 
 static int mlx5_is_sandy_bridge(int *num_cores)
 {
@@ -535,6 +610,7 @@ static int mlx5_init_context(struct verbs_device *vdev,
 
 	pthread_mutex_init(&context->qp_table_mutex, NULL);
 	pthread_mutex_init(&context->srq_table_mutex, NULL);
+	pthread_mutex_init(&context->uidx_table_mutex, NULL);
 	for (i = 0; i < MLX5_QP_TABLE_SIZE; ++i)
 		context->qp_table[i].refcnt = 0;
 
diff --git a/src/mlx5.h b/src/mlx5.h
index a930bac..2691031 100644
--- a/src/mlx5.h
+++ b/src/mlx5.h
@@ -165,6 +165,12 @@ enum {
 };
 
 enum {
+	MLX5_UIDX_TABLE_SHIFT		= 12,
+	MLX5_UIDX_TABLE_MASK		= (1 << MLX5_UIDX_TABLE_SHIFT) - 1,
+	MLX5_UIDX_TABLE_SIZE		= 1 << (24 - MLX5_UIDX_TABLE_SHIFT),
+};
+
+enum {
 	MLX5_SRQ_TABLE_SHIFT		= 12,
 	MLX5_SRQ_TABLE_MASK		= (1 << MLX5_SRQ_TABLE_SHIFT) - 1,
 	MLX5_SRQ_TABLE_SIZE		= 1 << (24 - MLX5_SRQ_TABLE_SHIFT),
@@ -275,6 +281,12 @@ struct mlx5_context {
 	}				srq_table[MLX5_SRQ_TABLE_SIZE];
 	pthread_mutex_t			srq_table_mutex;
 
+	struct {
+		struct mlx5_resource  **table;
+		int                     refcnt;
+	}				uidx_table[MLX5_UIDX_TABLE_SIZE];
+	pthread_mutex_t                 uidx_table_mutex;
+
 	void			       *uar[MLX5_MAX_UAR_PAGES];
 	struct mlx5_spinlock		lock32;
 	struct mlx5_db_page	       *db_list;
@@ -616,6 +628,8 @@ void mlx5_set_sq_sizes(struct mlx5_qp *qp, struct ibv_qp_cap *cap,
 struct mlx5_qp *mlx5_find_qp(struct mlx5_context *ctx, uint32_t qpn);
 int mlx5_store_qp(struct mlx5_context *ctx, uint32_t qpn, struct mlx5_qp *qp);
 void mlx5_clear_qp(struct mlx5_context *ctx, uint32_t qpn);
+int32_t mlx5_store_uidx(struct mlx5_context *ctx, void *rsc);
+void mlx5_clear_uidx(struct mlx5_context *ctx, uint32_t uidx);
 struct mlx5_srq *mlx5_find_srq(struct mlx5_context *ctx, uint32_t srqn);
 int mlx5_store_srq(struct mlx5_context *ctx, uint32_t srqn,
 		   struct mlx5_srq *srq);
@@ -640,6 +654,16 @@ int mlx5_close_xrcd(struct ibv_xrcd *ib_xrcd);
 struct ibv_srq *mlx5_create_srq_ex(struct ibv_context *context,
 				   struct ibv_srq_init_attr_ex *attr);
 
+static inline void *mlx5_find_uidx(struct mlx5_context *ctx, uint32_t uidx)
+{
+	int tind = uidx >> MLX5_UIDX_TABLE_SHIFT;
+
+	if (likely(ctx->uidx_table[tind].refcnt))
+		return ctx->uidx_table[tind].table[uidx & MLX5_UIDX_TABLE_MASK];
+
+	return NULL;
+}
+
 static inline int mlx5_spin_lock(struct mlx5_spinlock *lock)
 {
 	if (!mlx5_single_threaded)
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH libmlx5 v2 3/7] Add new poll_cq according to the new CQE format
       [not found] ` <1454441391-30494-1-git-send-email-majd-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2016-02-02 19:29   ` [PATCH libmlx5 v2 1/7] Add infrastructure for resource identification Majd Dibbiny
  2016-02-02 19:29   ` [PATCH libmlx5 v2 2/7] Add resource tracking database Majd Dibbiny
@ 2016-02-02 19:29   ` Majd Dibbiny
  2016-02-02 19:29   ` [PATCH libmlx5 v2 4/7] Add QP and XSRQ create/destroy flow with user index Majd Dibbiny
                     ` (4 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Majd Dibbiny @ 2016-02-02 19:29 UTC (permalink / raw)
  To: yishaih-VPRAkNaXOzVWk0Htik3J/w
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, matanb-VPRAkNaXOzVWk0Htik3J/w,
	talal-VPRAkNaXOzVWk0Htik3J/w, majd-VPRAkNaXOzVWk0Htik3J/w,
	cl-vYTEC60ixJUAvxtiuMwx3w, Haggai Abramovsky

From: Haggai Abramovsky <hagaya-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

When working with CQE version 1, the CQE format is different,
so we need a new poll_cq function that understands the new format.

Later patches in this series make the library negotiate the CQE
version when allocating the user-context and pick the matching
poll_cq for that context.
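
That selection lands later in the series (patch 4): once the negotiated CQE
version is known at context-initialization time, the matching parser is
installed, roughly:

    if (context->cqe_version == 1)
            mlx5_ctx_ops.poll_cq = mlx5_poll_cq_v1;  /* CQE-version-1 parser */
    /* otherwise the default mlx5_poll_cq keeps handling CQE version 0 */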

Reviewed-by: Yishai Hadas <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Haggai Abramovsky <hagaya-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Majd Dibbiny <majd-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 src/cq.c   | 226 +++++++++++++++++++++++++++++++++++++++++++++++++------------
 src/mlx5.h |  11 +++
 2 files changed, 194 insertions(+), 43 deletions(-)

diff --git a/src/cq.c b/src/cq.c
index 231b278..eadc2c7 100644
--- a/src/cq.c
+++ b/src/cq.c
@@ -116,7 +116,7 @@ struct mlx5_cqe64 {
 	uint16_t	slid;
 	uint32_t	flags_rqpn;
 	uint8_t		rsvd28[4];
-	uint32_t	srqn;
+	uint32_t	srqn_uidx;
 	uint32_t	imm_inval_pkey;
 	uint8_t		rsvd40[4];
 	uint32_t	byte_cnt;
@@ -362,26 +362,153 @@ static void mlx5_get_cycles(uint64_t *cycles)
 }
 #endif
 
-static int mlx5_poll_one(struct mlx5_cq *cq,
-			 struct mlx5_qp **cur_qp,
+static inline struct mlx5_qp *get_req_context(struct mlx5_context *mctx,
+					      struct mlx5_resource **cur_rsc,
+					      uint32_t rsn, int cqe_ver)
+					      __attribute__((always_inline));
+static inline struct mlx5_qp *get_req_context(struct mlx5_context *mctx,
+					      struct mlx5_resource **cur_rsc,
+					      uint32_t rsn, int cqe_ver)
+{
+	if (!*cur_rsc || (rsn != (*cur_rsc)->rsn))
+		*cur_rsc = cqe_ver ? mlx5_find_uidx(mctx, rsn) :
+				      (struct mlx5_resource *)mlx5_find_qp(mctx, rsn);
+
+	return rsc_to_mqp(*cur_rsc);
+}
+
+static inline int get_resp_ctx_v1(struct mlx5_context *mctx,
+				  struct mlx5_resource **cur_rsc,
+				  struct mlx5_srq **cur_srq,
+				  uint32_t uidx, uint8_t *is_srq)
+				  __attribute__((always_inline));
+static inline int get_resp_ctx_v1(struct mlx5_context *mctx,
+				  struct mlx5_resource **cur_rsc,
+				  struct mlx5_srq **cur_srq,
+				  uint32_t uidx, uint8_t *is_srq)
+{
+	struct mlx5_qp *mqp;
+
+	if (!*cur_rsc || (uidx != (*cur_rsc)->rsn)) {
+		*cur_rsc = mlx5_find_uidx(mctx, uidx);
+		if (unlikely(!*cur_rsc))
+			return CQ_POLL_ERR;
+	}
+
+	switch ((*cur_rsc)->type) {
+	case MLX5_RSC_TYPE_QP:
+		mqp = rsc_to_mqp(*cur_rsc);
+		if (mqp->verbs_qp.qp.srq) {
+			*cur_srq = to_msrq(mqp->verbs_qp.qp.srq);
+			*is_srq = 1;
+		}
+		break;
+	case MLX5_RSC_TYPE_XSRQ:
+		*cur_srq = rsc_to_msrq(*cur_rsc);
+		*is_srq = 1;
+		break;
+	default:
+		return CQ_POLL_ERR;
+	}
+
+	return CQ_OK;
+}
+
+static inline int get_qp_ctx(struct mlx5_context *mctx,
+			     struct mlx5_resource **cur_rsc,
+			     uint32_t qpn)
+			       __attribute__((always_inline));
+static inline int get_qp_ctx(struct mlx5_context *mctx,
+			     struct mlx5_resource **cur_rsc,
+			     uint32_t qpn)
+{
+	if (!*cur_rsc || (qpn != (*cur_rsc)->rsn)) {
+		/*
+		 * We do not have to take the QP table lock here,
+		 * because CQs will be locked while QPs are removed
+		 * from the table.
+		 */
+		*cur_rsc = (struct mlx5_resource *)mlx5_find_qp(mctx, qpn);
+		if (unlikely(!*cur_rsc))
+			return CQ_POLL_ERR;
+	}
+
+	return CQ_OK;
+}
+
+static inline int get_srq_ctx(struct mlx5_context *mctx,
+			      struct mlx5_srq **cur_srq,
+			      uint32_t srqn_uidx)
+			      __attribute__((always_inline));
+static inline int get_srq_ctx(struct mlx5_context *mctx,
+			      struct mlx5_srq **cur_srq,
+			      uint32_t srqn)
+{
+	if (!*cur_srq || (srqn != (*cur_srq)->srqn)) {
+		*cur_srq = mlx5_find_srq(mctx, srqn);
+		if (unlikely(!*cur_srq))
+			return CQ_POLL_ERR;
+	}
+
+	return CQ_OK;
+}
+
+static inline int get_cur_rsc(struct mlx5_context *mctx,
+			      int cqe_ver,
+			      uint32_t qpn,
+			      uint32_t srqn_uidx,
+			      struct mlx5_resource **cur_rsc,
+			      struct mlx5_srq **cur_srq,
+			      uint8_t *is_srq)
+{
+	int err;
+
+	if (cqe_ver) {
+		err = get_resp_ctx_v1(mctx, cur_rsc, cur_srq, srqn_uidx,
+				      is_srq);
+	} else {
+		if (srqn_uidx) {
+			*is_srq = 1;
+			err = get_srq_ctx(mctx, cur_srq, srqn_uidx);
+		} else {
+			err = get_qp_ctx(mctx, cur_rsc, qpn);
+		}
+	}
+
+	return err;
+
+}
+
+static inline int mlx5_poll_one(struct mlx5_cq *cq,
+			 struct mlx5_resource **cur_rsc,
+			 struct mlx5_srq **cur_srq,
+			 struct ibv_wc *wc, int cqe_ver)
+			 __attribute__((always_inline));
+static inline int mlx5_poll_one(struct mlx5_cq *cq,
+			 struct mlx5_resource **cur_rsc,
 			 struct mlx5_srq **cur_srq,
-			 struct ibv_wc *wc)
+			 struct ibv_wc *wc, int cqe_ver)
 {
 	struct mlx5_cqe64 *cqe64;
 	struct mlx5_wq *wq;
 	uint16_t wqe_ctr;
 	void *cqe;
 	uint32_t qpn;
-	uint32_t srqn;
+	uint32_t srqn_uidx;
 	int idx;
 	uint8_t opcode;
 	struct mlx5_err_cqe *ecqe;
 	int err;
+	struct mlx5_qp *mqp;
+	struct mlx5_context *mctx;
+	uint8_t is_srq = 0;
 
 	cqe = next_cqe_sw(cq);
 	if (!cqe)
 		return CQ_EMPTY;
 
+	mctx = to_mctx(cq->ibv_cq.context);
+
 	cqe64 = (cq->cqe_sz == 64) ? cqe : cqe + 64;
 
 	++cq->cons_index;
@@ -396,51 +523,33 @@ static int mlx5_poll_one(struct mlx5_cq *cq,
 
 #ifdef MLX5_DEBUG
 	if (mlx5_debug_mask & MLX5_DBG_CQ_CQE) {
-		FILE *fp = to_mctx(cq->ibv_cq.context)->dbg_fp;
+		FILE *fp = mctx->dbg_fp;
 
 		mlx5_dbg(fp, MLX5_DBG_CQ_CQE, "dump cqe for cqn 0x%x:\n", cq->cqn);
 		dump_cqe(fp, cqe64);
 	}
 #endif
 
-	srqn = ntohl(cqe64->srqn) & 0xffffff;
 	qpn = ntohl(cqe64->sop_drop_qpn) & 0xffffff;
-	if (srqn) {
-		if (!*cur_srq || (srqn != (*cur_srq)->srqn)) {
-			*cur_srq = mlx5_find_srq(to_mctx(cq->ibv_cq.context),
-						 srqn);
-			if (unlikely(!*cur_srq))
-				return CQ_POLL_ERR;
-		}
-	} else {
-		if (!*cur_qp || (qpn != (*cur_qp)->ibv_qp->qp_num)) {
-			/*
-			 * We do not have to take the QP table lock here,
-			 * because CQs will be locked while QPs are removed
-			 * from the table.
-			 */
-			*cur_qp = mlx5_find_qp(to_mctx(cq->ibv_cq.context),
-					       qpn);
-			if (unlikely(!*cur_qp))
-				return CQ_POLL_ERR;
-		}
-	}
-
 	wc->wc_flags = 0;
 	wc->qp_num = qpn;
-
 	opcode = cqe64->op_own >> 4;
 	switch (opcode) {
 	case MLX5_CQE_REQ:
-		wq = &(*cur_qp)->sq;
+		mqp = get_req_context(mctx, cur_rsc,
+				      (cqe_ver ? (ntohl(cqe64->srqn_uidx) & 0xffffff) : qpn),
+				      cqe_ver);
+		if (unlikely(!mqp))
+			return CQ_POLL_ERR;
+		wq = &mqp->sq;
 		wqe_ctr = ntohs(cqe64->wqe_counter);
 		idx = wqe_ctr & (wq->wqe_cnt - 1);
 		handle_good_req(wc, cqe64);
 		if (cqe64->op_own & MLX5_INLINE_SCATTER_32)
-			err = mlx5_copy_to_send_wqe(*cur_qp, wqe_ctr, cqe,
+			err = mlx5_copy_to_send_wqe(mqp, wqe_ctr, cqe,
 						    wc->byte_len);
 		else if (cqe64->op_own & MLX5_INLINE_SCATTER_64)
-			err = mlx5_copy_to_send_wqe(*cur_qp, wqe_ctr, cqe - 1,
+			err = mlx5_copy_to_send_wqe(mqp, wqe_ctr, cqe - 1,
 						    wc->byte_len);
 		else
 			err = 0;
@@ -453,20 +562,27 @@ static int mlx5_poll_one(struct mlx5_cq *cq,
 	case MLX5_CQE_RESP_SEND:
 	case MLX5_CQE_RESP_SEND_IMM:
 	case MLX5_CQE_RESP_SEND_INV:
-		wc->status = handle_responder(wc, cqe64, *cur_qp,
-					      srqn ? *cur_srq : NULL);
+		srqn_uidx = ntohl(cqe64->srqn_uidx) & 0xffffff;
+		err = get_cur_rsc(mctx, cqe_ver, qpn, srqn_uidx, cur_rsc,
+				  cur_srq, &is_srq);
+		if (unlikely(err))
+			return CQ_POLL_ERR;
+
+		wc->status = handle_responder(wc, cqe64, rsc_to_mqp(*cur_rsc),
+					      is_srq ? *cur_srq : NULL);
 		break;
 	case MLX5_CQE_RESIZE_CQ:
 		break;
 	case MLX5_CQE_REQ_ERR:
 	case MLX5_CQE_RESP_ERR:
+		srqn_uidx = ntohl(cqe64->srqn_uidx) & 0xffffff;
 		ecqe = (struct mlx5_err_cqe *)cqe64;
 		mlx5_handle_error_cqe(ecqe, wc);
 		if (unlikely(ecqe->syndrome != MLX5_CQE_SYNDROME_WR_FLUSH_ERR &&
 			     ecqe->syndrome != MLX5_CQE_SYNDROME_TRANSPORT_RETRY_EXC_ERR)) {
-			FILE *fp = to_mctx(cq->ibv_cq.context)->dbg_fp;
+			FILE *fp = mctx->dbg_fp;
 			fprintf(fp, PFX "%s: got completion with error:\n",
-				to_mctx(cq->ibv_cq.context)->hostname);
+				mctx->hostname);
 			dump_cqe(fp, ecqe);
 			if (mlx5_freeze_on_error_cqe) {
 				fprintf(fp, PFX "freezing at poll cq...");
@@ -476,18 +592,28 @@ static int mlx5_poll_one(struct mlx5_cq *cq,
 		}
 
 		if (opcode == MLX5_CQE_REQ_ERR) {
-			wq = &(*cur_qp)->sq;
+			mqp = get_req_context(mctx, cur_rsc,
+					      (cqe_ver ? srqn_uidx : qpn), cqe_ver);
+			if (unlikely(!mqp))
+				return CQ_POLL_ERR;
+			wq = &mqp->sq;
 			wqe_ctr = ntohs(cqe64->wqe_counter);
 			idx = wqe_ctr & (wq->wqe_cnt - 1);
 			wc->wr_id = wq->wrid[idx];
 			wq->tail = wq->wqe_head[idx] + 1;
 		} else {
-			if (*cur_srq) {
+			err = get_cur_rsc(mctx, cqe_ver, qpn, srqn_uidx,
+					  cur_rsc, cur_srq, &is_srq);
+			if (unlikely(err))
+				return CQ_POLL_ERR;
+
+			if (is_srq) {
 				wqe_ctr = ntohs(cqe64->wqe_counter);
 				wc->wr_id = (*cur_srq)->wrid[wqe_ctr];
 				mlx5_free_srq_wqe(*cur_srq, wqe_ctr);
 			} else {
-				wq = &(*cur_qp)->rq;
+				mqp = rsc_to_mqp(*cur_rsc);
+				wq = &mqp->rq;
 				wc->wr_id = wq->wrid[wq->tail & (wq->wqe_cnt - 1)];
 				++wq->tail;
 			}
@@ -498,10 +624,14 @@ static int mlx5_poll_one(struct mlx5_cq *cq,
 	return CQ_OK;
 }
 
-int mlx5_poll_cq(struct ibv_cq *ibcq, int ne, struct ibv_wc *wc)
+static inline int poll_cq(struct ibv_cq *ibcq, int ne,
+		      struct ibv_wc *wc, int cqe_ver)
+		      __attribute__((always_inline));
+static inline int poll_cq(struct ibv_cq *ibcq, int ne,
+		      struct ibv_wc *wc, int cqe_ver)
 {
 	struct mlx5_cq *cq = to_mcq(ibcq);
-	struct mlx5_qp *qp = NULL;
+	struct mlx5_resource *rsc = NULL;
 	struct mlx5_srq *srq = NULL;
 	int npolled;
 	int err = CQ_OK;
@@ -519,7 +649,7 @@ int mlx5_poll_cq(struct ibv_cq *ibcq, int ne, struct ibv_wc *wc)
 	mlx5_spin_lock(&cq->lock);
 
 	for (npolled = 0; npolled < ne; ++npolled) {
-		err = mlx5_poll_one(cq, &qp, &srq, wc + npolled);
+		err = mlx5_poll_one(cq, &rsc, &srq, wc + npolled, cqe_ver);
 		if (err != CQ_OK)
 			break;
 	}
@@ -551,6 +681,16 @@ int mlx5_poll_cq(struct ibv_cq *ibcq, int ne, struct ibv_wc *wc)
 	return err == CQ_POLL_ERR ? err : npolled;
 }
 
+int mlx5_poll_cq(struct ibv_cq *ibcq, int ne, struct ibv_wc *wc)
+{
+	return poll_cq(ibcq, ne, wc, 0);
+}
+
+int mlx5_poll_cq_v1(struct ibv_cq *ibcq, int ne, struct ibv_wc *wc)
+{
+	return poll_cq(ibcq, ne, wc, 1);
+}
+
 int mlx5_arm_cq(struct ibv_cq *ibvcq, int solicited)
 {
 	struct mlx5_cq *cq = to_mcq(ibvcq);
@@ -622,7 +762,7 @@ void __mlx5_cq_clean(struct mlx5_cq *cq, uint32_t rsn, struct mlx5_srq *srq)
 		cqe = get_cqe(cq, prod_index & cq->ibv_cq.cqe);
 		cqe64 = (cq->cqe_sz == 64) ? cqe : cqe + 64;
 		if (is_equal_rsn(cqe64, rsn)) {
-			if (srq && (ntohl(cqe64->srqn) & 0xffffff))
+			if (srq && (ntohl(cqe64->srqn_uidx) & 0xffffff))
 				mlx5_free_srq_wqe(srq, ntohs(cqe64->wqe_counter));
 			++nfreed;
 		} else if (nfreed) {
diff --git a/src/mlx5.h b/src/mlx5.h
index 2691031..8190062 100644
--- a/src/mlx5.h
+++ b/src/mlx5.h
@@ -544,6 +544,16 @@ static inline int max_int(int a, int b)
 	return a > b ? a : b;
 }
 
+static inline struct mlx5_qp *rsc_to_mqp(struct mlx5_resource *rsc)
+{
+	return (struct mlx5_qp *)rsc;
+}
+
+static inline struct mlx5_srq *rsc_to_msrq(struct mlx5_resource *rsc)
+{
+	return (struct mlx5_srq *)rsc;
+}
+
 int mlx5_alloc_buf(struct mlx5_buf *buf, size_t size, int page_size);
 void mlx5_free_buf(struct mlx5_buf *buf);
 int mlx5_alloc_buf_contig(struct mlx5_context *mctx, struct mlx5_buf *buf,
@@ -590,6 +600,7 @@ int mlx5_free_cq_buf(struct mlx5_context *ctx, struct mlx5_buf *buf);
 int mlx5_resize_cq(struct ibv_cq *cq, int cqe);
 int mlx5_destroy_cq(struct ibv_cq *cq);
 int mlx5_poll_cq(struct ibv_cq *cq, int ne, struct ibv_wc *wc);
+int mlx5_poll_cq_v1(struct ibv_cq *cq, int ne, struct ibv_wc *wc);
 int mlx5_arm_cq(struct ibv_cq *cq, int solicited);
 void mlx5_cq_event(struct ibv_cq *cq);
 void __mlx5_cq_clean(struct mlx5_cq *cq, uint32_t qpn, struct mlx5_srq *srq);
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH libmlx5 v2 4/7] Add QP and XSRQ create/destroy flow with user index
       [not found] ` <1454441391-30494-1-git-send-email-majd-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
                     ` (2 preceding siblings ...)
  2016-02-02 19:29   ` [PATCH libmlx5 v2 3/7] Add new poll_cq according to the new CQE format Majd Dibbiny
@ 2016-02-02 19:29   ` Majd Dibbiny
  2016-02-02 19:29   ` [PATCH libmlx5 v2 5/7] Work with CQE version 1 Majd Dibbiny
                     ` (3 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Majd Dibbiny @ 2016-02-02 19:29 UTC (permalink / raw)
  To: yishaih-VPRAkNaXOzVWk0Htik3J/w
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, matanb-VPRAkNaXOzVWk0Htik3J/w,
	talal-VPRAkNaXOzVWk0Htik3J/w, majd-VPRAkNaXOzVWk0Htik3J/w,
	cl-vYTEC60ixJUAvxtiuMwx3w, Haggai Abramovsky

From: Haggai Abramovsky <hagaya-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

When working with CQE version 1, the library allocates a user-index
for each new QP/XSRQ and passes it to the kernel.

When a QP/XSRQ is destroyed, the library frees its user-index so
it can be reused.

To enforce that user space uses the new CQE version:
1. If both the device and user space support CQE version 1, the
kernel expects a valid user-index when creating QPs and XSRQs;
otherwise the creation fails.
2. If user space supports CQE version 1 (has the uidx field in the
commands) but the device doesn't, user space should pass 0xffffff
to confirm it is aware that it is working with CQE version 0.

At this stage, the library still doesn't work with CQE version 1,
and therefore it passes a user-index of 0xffffff.
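
Condensed from the create_qp() changes below, the per-creation decision is
roughly the following sketch (not the literal diff; the error label is a
placeholder):

    if (!ctx->cqe_version) {
            cmd.uidx = 0xffffff;        /* CQE version 0: sentinel user-index */
    } else if (!is_xrc_tgt(attr->qp_type)) {
            int32_t usr_idx = mlx5_store_uidx(ctx, qp); /* allocate an index */

            if (usr_idx < 0)
                    goto err;           /* no free user-index left */
            cmd.uidx = usr_idx;
    }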

Reviewed-by: Yishai Hadas <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Haggai Abramovsky <hagaya-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Majd Dibbiny <majd-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 src/cq.c       |  47 ++++++++++++++++++--
 src/mlx5-abi.h |   5 +++
 src/mlx5.c     |  10 +++++
 src/mlx5.h     |   1 +
 src/verbs.c    | 137 +++++++++++++++++++++++++++++++++++++++++++--------------
 5 files changed, 163 insertions(+), 37 deletions(-)

diff --git a/src/cq.c b/src/cq.c
index eadc2c7..79a200f 100644
--- a/src/cq.c
+++ b/src/cq.c
@@ -732,6 +732,47 @@ static int is_equal_rsn(struct mlx5_cqe64 *cqe64, uint32_t rsn)
 	return rsn == (ntohl(cqe64->sop_drop_qpn) & 0xffffff);
 }
 
+static inline int is_equal_uidx(struct mlx5_cqe64 *cqe64, uint32_t uidx)
+{
+	return uidx == (ntohl(cqe64->srqn_uidx) & 0xffffff);
+}
+
+static inline int is_responder(uint8_t opcode)
+{
+	switch (opcode) {
+	case MLX5_CQE_RESP_WR_IMM:
+	case MLX5_CQE_RESP_SEND:
+	case MLX5_CQE_RESP_SEND_IMM:
+	case MLX5_CQE_RESP_SEND_INV:
+	case MLX5_CQE_RESP_ERR:
+		return 1;
+	}
+
+	return 0;
+}
+
+static inline int free_res_cqe(struct mlx5_cqe64 *cqe64, uint32_t rsn,
+			       struct mlx5_srq *srq, int cqe_version)
+{
+	if (cqe_version) {
+		if (is_equal_uidx(cqe64, rsn)) {
+			if (srq && is_responder(cqe64->op_own >> 4))
+				mlx5_free_srq_wqe(srq,
+						  ntohs(cqe64->wqe_counter));
+			return 1;
+		}
+	} else {
+		if (is_equal_rsn(cqe64, rsn)) {
+			if (srq && (ntohl(cqe64->srqn_uidx) & 0xffffff))
+				mlx5_free_srq_wqe(srq,
+						  ntohs(cqe64->wqe_counter));
+			return 1;
+		}
+	}
+
+	return 0;
+}
+
 void __mlx5_cq_clean(struct mlx5_cq *cq, uint32_t rsn, struct mlx5_srq *srq)
 {
 	uint32_t prod_index;
@@ -739,6 +780,7 @@ void __mlx5_cq_clean(struct mlx5_cq *cq, uint32_t rsn, struct mlx5_srq *srq)
 	struct mlx5_cqe64 *cqe64, *dest64;
 	void *cqe, *dest;
 	uint8_t owner_bit;
+	int cqe_version;
 
 	if (!cq)
 		return;
@@ -758,12 +800,11 @@ void __mlx5_cq_clean(struct mlx5_cq *cq, uint32_t rsn, struct mlx5_srq *srq)
 	 * Now sweep backwards through the CQ, removing CQ entries
 	 * that match our QP by copying older entries on top of them.
 	 */
+	cqe_version = (to_mctx(cq->ibv_cq.context))->cqe_version;
 	while ((int) --prod_index - (int) cq->cons_index >= 0) {
 		cqe = get_cqe(cq, prod_index & cq->ibv_cq.cqe);
 		cqe64 = (cq->cqe_sz == 64) ? cqe : cqe + 64;
-		if (is_equal_rsn(cqe64, rsn)) {
-			if (srq && (ntohl(cqe64->srqn_uidx) & 0xffffff))
-				mlx5_free_srq_wqe(srq, ntohs(cqe64->wqe_counter));
+		if (free_res_cqe(cqe64, rsn, srq, cqe_version)) {
 			++nfreed;
 		} else if (nfreed) {
 			dest = get_cqe(cq, (prod_index + nfreed) & cq->ibv_cq.cqe);
diff --git a/src/mlx5-abi.h b/src/mlx5-abi.h
index 31bbc8f..4b21a17 100644
--- a/src/mlx5-abi.h
+++ b/src/mlx5-abi.h
@@ -108,6 +108,9 @@ struct mlx5_create_srq_ex {
 	__u64				buf_addr;
 	__u64				db_addr;
 	__u32				flags;
+	__u32				reserved;
+	__u32                           uidx;
+	__u32                           reserved1;
 };
 
 struct mlx5_create_qp {
@@ -118,6 +121,8 @@ struct mlx5_create_qp {
 	__u32				rq_wqe_count;
 	__u32				rq_wqe_shift;
 	__u32				flags;
+	__u32                           uidx;
+	__u32                           reserved;
 };
 
 struct mlx5_create_qp_resp {
diff --git a/src/mlx5.c b/src/mlx5.c
index 877df12..b942815 100644
--- a/src/mlx5.c
+++ b/src/mlx5.c
@@ -608,12 +608,22 @@ static int mlx5_init_context(struct verbs_device *vdev,
 	context->max_recv_wr	= resp.max_recv_wr;
 	context->max_srq_recv_wr = resp.max_srq_recv_wr;
 
+	if (context->cqe_version) {
+		if (context->cqe_version == 1)
+			mlx5_ctx_ops.poll_cq = mlx5_poll_cq_v1;
+		else
+			 goto err_free_bf;
+	}
+
 	pthread_mutex_init(&context->qp_table_mutex, NULL);
 	pthread_mutex_init(&context->srq_table_mutex, NULL);
 	pthread_mutex_init(&context->uidx_table_mutex, NULL);
 	for (i = 0; i < MLX5_QP_TABLE_SIZE; ++i)
 		context->qp_table[i].refcnt = 0;
 
+	for (i = 0; i < MLX5_QP_TABLE_SIZE; ++i)
+		context->uidx_table[i].refcnt = 0;
+
 	context->db_list = NULL;
 
 	pthread_mutex_init(&context->db_list_mutex, NULL);
diff --git a/src/mlx5.h b/src/mlx5.h
index 8190062..cda7c71 100644
--- a/src/mlx5.h
+++ b/src/mlx5.h
@@ -306,6 +306,7 @@ struct mlx5_context {
 	char				hostname[40];
 	struct mlx5_spinlock            hugetlb_lock;
 	struct list_head                hugetlb_list;
+	int				cqe_version;
 };
 
 struct mlx5_bitmap {
diff --git a/src/verbs.c b/src/verbs.c
index 68b31f7..64f7694 100644
--- a/src/verbs.c
+++ b/src/verbs.c
@@ -53,6 +53,11 @@
 
 int mlx5_single_threaded = 0;
 
+static inline int is_xrc_tgt(int type)
+{
+	return type == IBV_QPT_XRC_RECV;
+}
+
 int mlx5_query_device(struct ibv_context *context, struct ibv_device_attr *attr)
 {
 	struct ibv_query_device cmd;
@@ -505,6 +510,8 @@ struct ibv_srq *mlx5_create_srq(struct ibv_pd *pd,
 	pthread_mutex_unlock(&ctx->srq_table_mutex);
 
 	srq->srqn = resp.srqn;
+	srq->rsc.rsn = resp.srqn;
+	srq->rsc.type = MLX5_RSC_TYPE_SRQ;
 
 	return ibsrq;
 
@@ -545,16 +552,22 @@ int mlx5_query_srq(struct ibv_srq *srq,
 int mlx5_destroy_srq(struct ibv_srq *srq)
 {
 	int ret;
+	struct mlx5_srq *msrq = to_msrq(srq);
+	struct mlx5_context *ctx = to_mctx(srq->context);
 
 	ret = ibv_cmd_destroy_srq(srq);
 	if (ret)
 		return ret;
 
-	mlx5_clear_srq(to_mctx(srq->context), to_msrq(srq)->srqn);
-	mlx5_free_db(to_mctx(srq->context), to_msrq(srq)->db);
-	mlx5_free_buf(&to_msrq(srq)->buf);
-	free(to_msrq(srq)->wrid);
-	free(to_msrq(srq));
+	if (ctx->cqe_version && msrq->rsc.type == MLX5_RSC_TYPE_XSRQ)
+		mlx5_clear_uidx(ctx, msrq->rsc.rsn);
+	else
+		mlx5_clear_srq(ctx, msrq->srqn);
+
+	mlx5_free_db(ctx, msrq->db);
+	mlx5_free_buf(&msrq->buf);
+	free(msrq->wrid);
+	free(msrq);
 
 	return 0;
 }
@@ -882,6 +895,7 @@ struct ibv_qp *create_qp(struct ibv_context *context,
 	int				ret;
 	struct mlx5_context	       *ctx = to_mctx(context);
 	struct ibv_qp		       *ibqp;
+	uint32_t			usr_idx = 0;
 #ifdef MLX5_DEBUG
 	FILE *fp = ctx->dbg_fp;
 #endif
@@ -938,24 +952,38 @@ struct ibv_qp *create_qp(struct ibv_context *context,
 	cmd.rq_wqe_count = qp->rq.wqe_cnt;
 	cmd.rq_wqe_shift = qp->rq.wqe_shift;
 
-	pthread_mutex_lock(&ctx->qp_table_mutex);
+	if (!ctx->cqe_version) {
+		cmd.uidx = 0xffffff;
+		pthread_mutex_lock(&ctx->qp_table_mutex);
+	} else if (!is_xrc_tgt(attr->qp_type)) {
+		usr_idx = mlx5_store_uidx(ctx, qp);
+		if (usr_idx < 0) {
+			mlx5_dbg(fp, MLX5_DBG_QP, "Couldn't find free user index\n");
+			goto err_rq_db;
+		}
+
+		cmd.uidx = usr_idx;
+	}
 
 	ret = ibv_cmd_create_qp_ex(context, &qp->verbs_qp, sizeof(qp->verbs_qp),
 				   attr, &cmd.ibv_cmd, sizeof(cmd),
 				   &resp.ibv_resp, sizeof(resp));
 	if (ret) {
 		mlx5_dbg(fp, MLX5_DBG_QP, "ret %d\n", ret);
-		goto err_rq_db;
+		goto err_free_uidx;
 	}
 
-	if (qp->sq.wqe_cnt || qp->rq.wqe_cnt) {
-		ret = mlx5_store_qp(ctx, ibqp->qp_num, qp);
-		if (ret) {
-			mlx5_dbg(fp, MLX5_DBG_QP, "ret %d\n", ret);
-			goto err_destroy;
+	if (!ctx->cqe_version) {
+		if (qp->sq.wqe_cnt || qp->rq.wqe_cnt) {
+			ret = mlx5_store_qp(ctx, ibqp->qp_num, qp);
+			if (ret) {
+				mlx5_dbg(fp, MLX5_DBG_QP, "ret %d\n", ret);
+				goto err_destroy;
+			}
 		}
+
+		pthread_mutex_unlock(&ctx->qp_table_mutex);
 	}
-	pthread_mutex_unlock(&ctx->qp_table_mutex);
 
 	map_uuar(context, qp, resp.uuar_index);
 
@@ -969,13 +997,22 @@ struct ibv_qp *create_qp(struct ibv_context *context,
 	attr->cap.max_recv_wr = qp->rq.max_post;
 	attr->cap.max_recv_sge = qp->rq.max_gs;
 
+	qp->rsc.type = MLX5_RSC_TYPE_QP;
+	qp->rsc.rsn = (ctx->cqe_version && !is_xrc_tgt(attr->qp_type)) ?
+		      usr_idx : ibqp->qp_num;
+
 	return ibqp;
 
 err_destroy:
 	ibv_cmd_destroy_qp(ibqp);
 
+err_free_uidx:
+	if (!ctx->cqe_version)
+		pthread_mutex_unlock(&to_mctx(context)->qp_table_mutex);
+	else if (!is_xrc_tgt(attr->qp_type))
+		mlx5_clear_uidx(ctx, usr_idx);
+
 err_rq_db:
-	pthread_mutex_unlock(&to_mctx(context)->qp_table_mutex);
 	mlx5_free_db(to_mctx(context), qp->db);
 
 err_free_qp_buf:
@@ -1046,29 +1083,38 @@ static void mlx5_unlock_cqs(struct ibv_qp *qp)
 int mlx5_destroy_qp(struct ibv_qp *ibqp)
 {
 	struct mlx5_qp *qp = to_mqp(ibqp);
+	struct mlx5_context *ctx = to_mctx(ibqp->context);
 	int ret;
 
-	pthread_mutex_lock(&to_mctx(ibqp->context)->qp_table_mutex);
+	if (!ctx->cqe_version)
+		pthread_mutex_lock(&ctx->qp_table_mutex);
+
 	ret = ibv_cmd_destroy_qp(ibqp);
 	if (ret) {
-		pthread_mutex_unlock(&to_mctx(ibqp->context)->qp_table_mutex);
+		if (!ctx->cqe_version)
+			pthread_mutex_unlock(&ctx->qp_table_mutex);
 		return ret;
 	}
 
 	mlx5_lock_cqs(ibqp);
 
-	__mlx5_cq_clean(to_mcq(ibqp->recv_cq), ibqp->qp_num,
+	__mlx5_cq_clean(to_mcq(ibqp->recv_cq), qp->rsc.rsn,
 			ibqp->srq ? to_msrq(ibqp->srq) : NULL);
 	if (ibqp->send_cq != ibqp->recv_cq)
-		__mlx5_cq_clean(to_mcq(ibqp->send_cq), ibqp->qp_num, NULL);
+		__mlx5_cq_clean(to_mcq(ibqp->send_cq), qp->rsc.rsn, NULL);
 
-	if (qp->sq.wqe_cnt || qp->rq.wqe_cnt)
-		mlx5_clear_qp(to_mctx(ibqp->context), ibqp->qp_num);
+	if (!ctx->cqe_version) {
+		if (qp->sq.wqe_cnt || qp->rq.wqe_cnt)
+			mlx5_clear_qp(ctx, ibqp->qp_num);
+	}
 
 	mlx5_unlock_cqs(ibqp);
-	pthread_mutex_unlock(&to_mctx(ibqp->context)->qp_table_mutex);
+	if (!ctx->cqe_version)
+		pthread_mutex_unlock(&ctx->qp_table_mutex);
+	else if (!is_xrc_tgt(ibqp->qp_type))
+		mlx5_clear_uidx(ctx, qp->rsc.rsn);
 
-	mlx5_free_db(to_mctx(ibqp->context), qp->db);
+	mlx5_free_db(ctx, qp->db);
 	mlx5_free_qp_buf(qp);
 	free(qp);
 
@@ -1108,11 +1154,12 @@ int mlx5_modify_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr,
 	    (attr_mask & IBV_QP_STATE) &&
 	    attr->qp_state == IBV_QPS_RESET) {
 		if (qp->recv_cq) {
-			mlx5_cq_clean(to_mcq(qp->recv_cq), qp->qp_num,
+			mlx5_cq_clean(to_mcq(qp->recv_cq), to_mqp(qp)->rsc.rsn,
 				      qp->srq ? to_msrq(qp->srq) : NULL);
 		}
 		if (qp->send_cq != qp->recv_cq && qp->send_cq)
-			mlx5_cq_clean(to_mcq(qp->send_cq), qp->qp_num, NULL);
+			mlx5_cq_clean(to_mcq(qp->send_cq),
+				      to_mqp(qp)->rsc.rsn, NULL);
 
 		mlx5_init_qp_indices(to_mqp(qp));
 		db = to_mqp(qp)->db;
@@ -1231,9 +1278,13 @@ mlx5_create_xrc_srq(struct ibv_context *context,
 	struct mlx5_create_srq_ex cmd;
 	struct mlx5_create_srq_resp resp;
 	struct mlx5_srq *msrq;
-	struct mlx5_context *ctx;
+	struct mlx5_context *ctx = to_mctx(context);
 	int max_sge;
 	struct ibv_srq *ibsrq;
+	int uidx;
+#ifdef MLX5_DEBUG
+	FILE *fp = ctx->dbg_fp;
+#endif
 
 	msrq = calloc(1, sizeof(*msrq));
 	if (!msrq)
@@ -1244,8 +1295,6 @@ mlx5_create_xrc_srq(struct ibv_context *context,
 	memset(&cmd, 0, sizeof(cmd));
 	memset(&resp, 0, sizeof(resp));
 
-	ctx = to_mctx(context);
-
 	if (mlx5_spinlock_init(&msrq->lock)) {
 		fprintf(stderr, "%s-%d:\n", __func__, __LINE__);
 		goto err;
@@ -1297,28 +1346,48 @@ mlx5_create_xrc_srq(struct ibv_context *context,
 		cmd.flags = MLX5_SRQ_FLAG_SIGNATURE;
 
 	attr->attr.max_sge = msrq->max_gs;
-	pthread_mutex_lock(&ctx->srq_table_mutex);
+	if (ctx->cqe_version) {
+		uidx = mlx5_store_uidx(ctx, msrq);
+		if (uidx < 0) {
+			mlx5_dbg(fp, MLX5_DBG_QP, "Couldn't find free user index\n");
+			goto err_free_db;
+		}
+		cmd.uidx = uidx;
+	} else {
+		cmd.uidx = 0xffffff;
+		pthread_mutex_lock(&ctx->srq_table_mutex);
+	}
+
 	err = ibv_cmd_create_srq_ex(context, &msrq->vsrq, sizeof(msrq->vsrq),
 				    attr, &cmd.ibv_cmd, sizeof(cmd),
 				    &resp.ibv_resp, sizeof(resp));
 	if (err)
-		goto err_free_db;
+		goto err_free_uidx;
 
-	err = mlx5_store_srq(to_mctx(context), resp.srqn, msrq);
-	if (err)
-		goto err_destroy;
+	if (!ctx->cqe_version) {
+		err = mlx5_store_srq(to_mctx(context), resp.srqn, msrq);
+		if (err)
+			goto err_destroy;
 
-	pthread_mutex_unlock(&ctx->srq_table_mutex);
+		pthread_mutex_unlock(&ctx->srq_table_mutex);
+	}
 
 	msrq->srqn = resp.srqn;
+	msrq->rsc.type = MLX5_RSC_TYPE_XSRQ;
+	msrq->rsc.rsn = ctx->cqe_version ? cmd.uidx : resp.srqn;
 
 	return ibsrq;
 
 err_destroy:
 	ibv_cmd_destroy_srq(ibsrq);
 
+err_free_uidx:
+	if (ctx->cqe_version)
+		mlx5_clear_uidx(ctx, cmd.uidx);
+	else
+		pthread_mutex_unlock(&ctx->srq_table_mutex);
+
 err_free_db:
-	pthread_mutex_unlock(&ctx->srq_table_mutex);
 	mlx5_free_db(ctx, msrq->db);
 
 err_free:
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH libmlx5 v2 5/7] Work with CQE version 1
       [not found] ` <1454441391-30494-1-git-send-email-majd-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
                     ` (3 preceding siblings ...)
  2016-02-02 19:29   ` [PATCH libmlx5 v2 4/7] Add QP and XSRQ create/destroy flow with user index Majd Dibbiny
@ 2016-02-02 19:29   ` Majd Dibbiny
  2016-02-02 19:29   ` [PATCH libmlx5 v2 6/7] Allocate separate RQ and SQ buffers for Raw Packet QP Majd Dibbiny
                     ` (2 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Majd Dibbiny @ 2016-02-02 19:29 UTC (permalink / raw)
  To: yishaih-VPRAkNaXOzVWk0Htik3J/w
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, matanb-VPRAkNaXOzVWk0Htik3J/w,
	talal-VPRAkNaXOzVWk0Htik3J/w, majd-VPRAkNaXOzVWk0Htik3J/w,
	cl-vYTEC60ixJUAvxtiuMwx3w, Haggai Abramovsky

From: Haggai Abramovsky <hagaya-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

From now on, if the kernel supports CQE version 1, the library will
choose to work with it.

Reviewed-by: Yishai Hadas <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Haggai Abramovsky <hagaya-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Majd Dibbiny <majd-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 src/mlx5-abi.h | 13 +++++++++++--
 src/mlx5.c     |  4 ++++
 src/mlx5.h     |  5 +++++
 3 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/src/mlx5-abi.h b/src/mlx5-abi.h
index 4b21a17..6427cca 100644
--- a/src/mlx5-abi.h
+++ b/src/mlx5-abi.h
@@ -55,7 +55,11 @@ struct mlx5_alloc_ucontext {
 	__u32				total_num_uuars;
 	__u32				num_low_latency_uuars;
 	__u32				flags;
-	__u32				reserved;
+	__u32				comp_mask;
+	__u8				cqe_version;
+	__u8				reserved0;
+	__u16				reserved1;
+	__u32				reserved2;
 };
 
 struct mlx5_alloc_ucontext_resp {
@@ -70,7 +74,12 @@ struct mlx5_alloc_ucontext_resp {
 	__u32				max_recv_wr;
 	__u32				max_srq_recv_wr;
 	__u16				num_ports;
-	__u16				reserved;
+	__u16				reserved1;
+	__u32				comp_mask;
+	__u32				response_length;
+	__u8				cqe_version;
+	__u8				reserved2;
+	__u16				reserved3;
 };
 
 struct mlx5_alloc_pd_resp {
diff --git a/src/mlx5.c b/src/mlx5.c
index b942815..0469a78 100644
--- a/src/mlx5.c
+++ b/src/mlx5.c
@@ -590,8 +590,11 @@ static int mlx5_init_context(struct verbs_device *vdev,
 	}
 
 	memset(&req, 0, sizeof(req));
+	memset(&resp, 0, sizeof(resp));
+
 	req.total_num_uuars = tot_uuars;
 	req.num_low_latency_uuars = low_lat_uuars;
+	req.cqe_version = MLX5_CQE_VERSION_V1;
 	if (ibv_cmd_get_context(&context->ibv_ctx, &req.ibv_req, sizeof req,
 				&resp.ibv_resp, sizeof resp))
 		goto err_free_bf;
@@ -608,6 +611,7 @@ static int mlx5_init_context(struct verbs_device *vdev,
 	context->max_recv_wr	= resp.max_recv_wr;
 	context->max_srq_recv_wr = resp.max_srq_recv_wr;
 
+	context->cqe_version = resp.cqe_version;
 	if (context->cqe_version) {
 		if (context->cqe_version == 1)
 			mlx5_ctx_ops.poll_cq = mlx5_poll_cq_v1;
diff --git a/src/mlx5.h b/src/mlx5.h
index cda7c71..6b982c4 100644
--- a/src/mlx5.h
+++ b/src/mlx5.h
@@ -120,6 +120,11 @@ enum {
 	MLX5_MMAP_GET_CONTIGUOUS_PAGES_CMD = 1
 };
 
+enum {
+	MLX5_CQE_VERSION_V0	= 0,
+	MLX5_CQE_VERSION_V1	= 1,
+};
+
 #define MLX5_CQ_PREFIX "MLX_CQ"
 #define MLX5_QP_PREFIX "MLX_QP"
 #define MLX5_MR_PREFIX "MLX_MR"
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH libmlx5 v2 6/7] Allocate separate RQ and SQ buffers for Raw Packet QP
       [not found] ` <1454441391-30494-1-git-send-email-majd-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
                     ` (4 preceding siblings ...)
  2016-02-02 19:29   ` [PATCH libmlx5 v2 5/7] Work with CQE version 1 Majd Dibbiny
@ 2016-02-02 19:29   ` Majd Dibbiny
  2016-02-02 19:29   ` [PATCH libmlx5 v2 7/7] Add Raw Packet QP data-path functionality Majd Dibbiny
  2016-02-03 17:21   ` [PATCH libmlx5 v2 0/7] Raw Packet QP for mlx5 v2 Yishai Hadas
  7 siblings, 0 replies; 9+ messages in thread
From: Majd Dibbiny @ 2016-02-02 19:29 UTC (permalink / raw)
  To: yishaih-VPRAkNaXOzVWk0Htik3J/w
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, matanb-VPRAkNaXOzVWk0Htik3J/w,
	talal-VPRAkNaXOzVWk0Htik3J/w, majd-VPRAkNaXOzVWk0Htik3J/w,
	cl-vYTEC60ixJUAvxtiuMwx3w

For a Raw Packet QP, the RQ and SQ buffers aren't contiguous,
so we need to allocate each of them separately and pass the SQ
buffer address to the kernel.
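
Conceptually, the single work-queue buffer that used to hold the RQ followed
by the SQ is now split in create_qp(); a condensed sketch of the split done
below:

    if (attr->qp_type == IBV_QPT_RAW_PACKET) {
            qp->buf_size    = qp->sq.offset;       /* first buffer: RQ only  */
            qp->sq_buf_size = ret - qp->buf_size;  /* SQ gets its own buffer */
            qp->sq.offset   = 0;                   /* SQ starts at offset 0  */
    }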

Reviewed-by: Yishai Hadas <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Majd Dibbiny <majd-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 src/mlx5-abi.h |  2 ++
 src/mlx5.h     |  4 +++
 src/qp.c       |  4 +--
 src/verbs.c    | 82 +++++++++++++++++++++++++++++++++++++++++++++++++++-------
 src/wqe.h      | 11 ++++++++
 5 files changed, 92 insertions(+), 11 deletions(-)

diff --git a/src/mlx5-abi.h b/src/mlx5-abi.h
index 6427cca..f4247e0 100644
--- a/src/mlx5-abi.h
+++ b/src/mlx5-abi.h
@@ -132,6 +132,8 @@ struct mlx5_create_qp {
 	__u32				flags;
 	__u32                           uidx;
 	__u32                           reserved;
+	/* SQ buffer address - used for Raw Packet QP */
+	__u64                           sq_buf_addr;
 };
 
 struct mlx5_create_qp_resp {
diff --git a/src/mlx5.h b/src/mlx5.h
index 6b982c4..4fa0f46 100644
--- a/src/mlx5.h
+++ b/src/mlx5.h
@@ -427,8 +427,12 @@ struct mlx5_qp {
 	struct verbs_qp			verbs_qp;
 	struct ibv_qp		       *ibv_qp;
 	struct mlx5_buf                 buf;
+	void				*sq_start;
 	int                             max_inline_data;
 	int                             buf_size;
+	/* For Raw Packet QP, use different buffers for the SQ and RQ */
+	struct mlx5_buf                 sq_buf;
+	int				sq_buf_size;
 	struct mlx5_bf		       *bf;
 
 	uint8_t	                        sq_signal_bits;
diff --git a/src/qp.c b/src/qp.c
index 67ded0d..8556714 100644
--- a/src/qp.c
+++ b/src/qp.c
@@ -145,7 +145,7 @@ int mlx5_copy_to_send_wqe(struct mlx5_qp *qp, int idx, void *buf, int size)
 
 void *mlx5_get_send_wqe(struct mlx5_qp *qp, int n)
 {
-	return qp->buf.buf + qp->sq.offset + (n << MLX5_SEND_WQE_SHIFT);
+	return qp->sq_start + (n << MLX5_SEND_WQE_SHIFT);
 }
 
 void mlx5_init_qp_indices(struct mlx5_qp *qp)
@@ -214,7 +214,7 @@ static void mlx5_bf_copy(unsigned long long *dst, unsigned long long *src,
 		*dst++ = *src++;
 		bytecnt -= 8 * sizeof(unsigned long long);
 		if (unlikely(src == qp->sq.qend))
-			src = qp->buf.buf + qp->sq.offset;
+			src = qp->sq_start;
 	}
 }
 
diff --git a/src/verbs.c b/src/verbs.c
index 64f7694..ead4540 100644
--- a/src/verbs.c
+++ b/src/verbs.c
@@ -600,6 +600,11 @@ static int sq_overhead(enum ibv_qp_type	qp_type)
 			sizeof(struct mlx5_wqe_raddr_seg);
 		break;
 
+	case IBV_QPT_RAW_PACKET:
+		size = sizeof(struct mlx5_wqe_ctrl_seg) +
+			sizeof(struct mlx5_wqe_eth_seg);
+		break;
+
 	default:
 		return -EINVAL;
 	}
@@ -802,7 +807,8 @@ static const char *qptype2key(enum ibv_qp_type type)
 }
 
 static int mlx5_alloc_qp_buf(struct ibv_context *context,
-			     struct ibv_qp_cap *cap, struct mlx5_qp *qp,
+			     struct ibv_qp_init_attr_ex *attr,
+			     struct mlx5_qp *qp,
 			     int size)
 {
 	int err;
@@ -856,8 +862,26 @@ static int mlx5_alloc_qp_buf(struct ibv_context *context,
 
 	memset(qp->buf.buf, 0, qp->buf_size);
 
-	return 0;
+	if (attr->qp_type == IBV_QPT_RAW_PACKET) {
+		size_t aligned_sq_buf_size = align(qp->sq_buf_size,
+						   to_mdev(context->device)->page_size);
+		/* For Raw Packet QP, allocate a separate buffer for the SQ */
+		err = mlx5_alloc_prefered_buf(to_mctx(context), &qp->sq_buf,
+					      aligned_sq_buf_size,
+					      to_mdev(context->device)->page_size,
+					      alloc_type,
+					      MLX5_QP_PREFIX);
+		if (err) {
+			err = -ENOMEM;
+			goto rq_buf;
+		}
 
+		memset(qp->sq_buf.buf, 0, aligned_sq_buf_size);
+	}
+
+	return 0;
+rq_buf:
+	mlx5_free_actual_buf(to_mctx(qp->verbs_qp.qp.context), &qp->buf);
 ex_wrid:
 	if (qp->rq.wrid)
 		free(qp->rq.wrid);
@@ -876,6 +900,10 @@ static void mlx5_free_qp_buf(struct mlx5_qp *qp)
 	struct mlx5_context *ctx = to_mctx(qp->ibv_qp->context);
 
 	mlx5_free_actual_buf(ctx, &qp->buf);
+
+	if (qp->sq_buf.buf)
+		mlx5_free_actual_buf(ctx, &qp->sq_buf);
+
 	if (qp->rq.wrid)
 		free(qp->rq.wrid);
 
@@ -922,15 +950,31 @@ struct ibv_qp *create_qp(struct ibv_context *context,
 		errno = -ret;
 		goto err;
 	}
-	qp->buf_size = ret;
 
-	if (mlx5_alloc_qp_buf(context, &attr->cap, qp, ret)) {
+	if (attr->qp_type == IBV_QPT_RAW_PACKET) {
+		qp->buf_size = qp->sq.offset;
+		qp->sq_buf_size = ret - qp->buf_size;
+		qp->sq.offset = 0;
+	} else {
+		qp->buf_size = ret;
+		qp->sq_buf_size = 0;
+	}
+
+	if (mlx5_alloc_qp_buf(context, attr, qp, ret)) {
 		mlx5_dbg(fp, MLX5_DBG_QP, "\n");
 		goto err;
 	}
 
-	qp->sq.qend = qp->buf.buf + qp->sq.offset +
-		(qp->sq.wqe_cnt << qp->sq.wqe_shift);
+	if (attr->qp_type == IBV_QPT_RAW_PACKET) {
+		qp->sq_start = qp->sq_buf.buf;
+		qp->sq.qend = qp->sq_buf.buf +
+				(qp->sq.wqe_cnt << qp->sq.wqe_shift);
+	} else {
+		qp->sq_start = qp->buf.buf + qp->sq.offset;
+		qp->sq.qend = qp->buf.buf + qp->sq.offset +
+				(qp->sq.wqe_cnt << qp->sq.wqe_shift);
+	}
+
 	mlx5_init_qp_indices(qp);
 
 	if (mlx5_spinlock_init(&qp->sq.lock) ||
@@ -947,6 +991,8 @@ struct ibv_qp *create_qp(struct ibv_context *context,
 	qp->db[MLX5_SND_DBR] = 0;
 
 	cmd.buf_addr = (uintptr_t) qp->buf.buf;
+	cmd.sq_buf_addr = (attr->qp_type == IBV_QPT_RAW_PACKET) ?
+			  (uintptr_t) qp->sq_buf.buf : 0;
 	cmd.db_addr  = (uintptr_t) qp->db;
 	cmd.sq_wqe_count = qp->sq.wqe_cnt;
 	cmd.rq_wqe_count = qp->rq.wqe_cnt;
@@ -1145,6 +1191,7 @@ int mlx5_modify_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr,
 		   int attr_mask)
 {
 	struct ibv_modify_qp cmd;
+	struct mlx5_qp *mqp = to_mqp(qp);
 	int ret;
 	uint32_t *db;
 
@@ -1154,19 +1201,36 @@ int mlx5_modify_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr,
 	    (attr_mask & IBV_QP_STATE) &&
 	    attr->qp_state == IBV_QPS_RESET) {
 		if (qp->recv_cq) {
-			mlx5_cq_clean(to_mcq(qp->recv_cq), to_mqp(qp)->rsc.rsn,
+			mlx5_cq_clean(to_mcq(qp->recv_cq), mqp->rsc.rsn,
 				      qp->srq ? to_msrq(qp->srq) : NULL);
 		}
 		if (qp->send_cq != qp->recv_cq && qp->send_cq)
 			mlx5_cq_clean(to_mcq(qp->send_cq),
 				      to_mqp(qp)->rsc.rsn, NULL);
 
-		mlx5_init_qp_indices(to_mqp(qp));
-		db = to_mqp(qp)->db;
+		mlx5_init_qp_indices(mqp);
+		db = mqp->db;
 		db[MLX5_RCV_DBR] = 0;
 		db[MLX5_SND_DBR] = 0;
 	}
 
+	/*
+	 * When the Raw Packet QP is in INIT state, its RQ
+	 * underneath is already in RDY, which means it can
+	 * receive packets. According to the IB spec, a QP can't
+	 * receive packets until moved to RTR state. To achieve this,
+	 * for Raw Packet QPs, we update the doorbell record
+	 * once the QP is moved to RTR.
+	 */
+	if (!ret &&
+	    (attr_mask & IBV_QP_STATE) &&
+	    attr->qp_state == IBV_QPS_RTR &&
+	    qp->qp_type == IBV_QPT_RAW_PACKET) {
+		mlx5_spin_lock(&mqp->rq.lock);
+		mqp->db[MLX5_RCV_DBR] = htonl(mqp->rq.head & 0xffff);
+		mlx5_spin_unlock(&mqp->rq.lock);
+	}
+
 	return ret;
 }
 
diff --git a/src/wqe.h b/src/wqe.h
index bd50d9a..b875104 100644
--- a/src/wqe.h
+++ b/src/wqe.h
@@ -70,6 +70,17 @@ struct mlx5_eqe_qp_srq {
 	uint32_t	qp_srq_n;
 };
 
+struct mlx5_wqe_eth_seg {
+	uint32_t	rsvd0;
+	uint8_t		cs_flags;
+	uint8_t		rsvd1;
+	uint16_t	mss;
+	uint32_t	rsvd2;
+	uint16_t	inline_hdr_sz;
+	uint8_t		inline_hdr_start[2];
+	uint8_t		inline_hdr[16];
+};
+
 struct mlx5_wqe_ctrl_seg {
 	uint32_t	opmod_idx_opcode;
 	uint32_t	qpn_ds;
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH libmlx5 v2 7/7] Add Raw Packet QP data-path functionality
       [not found] ` <1454441391-30494-1-git-send-email-majd-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
                     ` (5 preceding siblings ...)
  2016-02-02 19:29   ` [PATCH libmlx5 v2 6/7] Allocate separate RQ and SQ buffers for Raw Packet QP Majd Dibbiny
@ 2016-02-02 19:29   ` Majd Dibbiny
  2016-02-03 17:21   ` [PATCH libmlx5 v2 0/7] Raw Packet QP for mlx5 v2 Yishai Hadas
  7 siblings, 0 replies; 9+ messages in thread
From: Majd Dibbiny @ 2016-02-02 19:29 UTC (permalink / raw)
  To: yishaih-VPRAkNaXOzVWk0Htik3J/w
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, matanb-VPRAkNaXOzVWk0Htik3J/w,
	talal-VPRAkNaXOzVWk0Htik3J/w, majd-VPRAkNaXOzVWk0Htik3J/w,
	cl-vYTEC60ixJUAvxtiuMwx3w

A Raw Ethernet WQE is composed of the following segments:
1. Control segment
2. Eth segment
3. Data segment

The Eth segment contains the packet headers and information for
stateless offloads. When posting a Raw Ethernet WQE, the library must
copy the L2 headers into the Eth segment, otherwise the packet is
dropped on TX. The Data segment should carry all the payload except
the headers that were copied into the Eth segment.
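
From the application's point of view nothing special is needed beyond putting
the complete frame, starting with the Ethernet header, into the scatter list;
a hedged sketch of a single-SGE post (frame, frame_len, mr and qp are assumed
to be set up already):

    struct ibv_sge sge = {
            .addr   = (uintptr_t)frame,   /* frame starts with the L2 header */
            .length = frame_len,
            .lkey   = mr->lkey,
    };
    struct ibv_send_wr wr = {
            .sg_list    = &sge,
            .num_sge    = 1,
            .opcode     = IBV_WR_SEND,
            .send_flags = IBV_SEND_SIGNALED,
    };
    struct ibv_send_wr *bad_wr;
    int ret = ibv_post_send(qp, &wr, &bad_wr);

The library then copies the leading L2 header bytes out of the first SGE into
the Eth segment and scatters the remainder into Data segments, as implemented
in the patch below.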

A Raw Packet QP is composed of an RQ and an SQ in the hardware. Once
the QP is modified to INIT, its RQ is already in the RDY state and can
therefore receive packets. To avoid this behavior, which contradicts
the IB spec, we don't update the doorbell record while the QP isn't yet
in RTR state; once the QP is modified to RTR, the doorbell record is
updated, which allows packets to be received.

Reviewed-by: Yishai Hadas <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Majd Dibbiny <majd-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 src/qp.c  | 114 +++++++++++++++++++++++++++++++++++++++++++++++++++++++-------
 src/wqe.h |   9 +++++
 2 files changed, 110 insertions(+), 13 deletions(-)

diff --git a/src/qp.c b/src/qp.c
index 8556714..60f2bdc 100644
--- a/src/qp.c
+++ b/src/qp.c
@@ -188,11 +188,12 @@ static void set_datagram_seg(struct mlx5_wqe_datagram_seg *dseg,
 	dseg->av.key.qkey.qkey = htonl(wr->wr.ud.remote_qkey);
 }
 
-static void set_data_ptr_seg(struct mlx5_wqe_data_seg *dseg, struct ibv_sge *sg)
+static void set_data_ptr_seg(struct mlx5_wqe_data_seg *dseg, struct ibv_sge *sg,
+			     int offset)
 {
-	dseg->byte_count = htonl(sg->length);
+	dseg->byte_count = htonl(sg->length - offset);
 	dseg->lkey       = htonl(sg->lkey);
-	dseg->addr       = htonll(sg->addr);
+	dseg->addr       = htonll(sg->addr + offset);
 }
 
 /*
@@ -230,7 +231,8 @@ static uint32_t send_ieth(struct ibv_send_wr *wr)
 }
 
 static int set_data_inl_seg(struct mlx5_qp *qp, struct ibv_send_wr *wr,
-			    void *wqe, int *sz)
+			    void *wqe, int *sz,
+			    struct mlx5_sg_copy_ptr *sg_copy_ptr)
 {
 	struct mlx5_wqe_inline_seg *seg;
 	void *addr;
@@ -239,13 +241,15 @@ static int set_data_inl_seg(struct mlx5_qp *qp, struct ibv_send_wr *wr,
 	int inl = 0;
 	void *qend = qp->sq.qend;
 	int copy;
+	int offset = sg_copy_ptr->offset;
 
 	seg = wqe;
 	wqe += sizeof *seg;
-	for (i = 0; i < wr->num_sge; ++i) {
-		addr = (void *) (unsigned long)(wr->sg_list[i].addr);
-		len  = wr->sg_list[i].length;
+	for (i = sg_copy_ptr->index; i < wr->num_sge; ++i) {
+		addr = (void *) (unsigned long)(wr->sg_list[i].addr + offset);
+		len  = wr->sg_list[i].length - offset;
 		inl += len;
+		offset = 0;
 
 		if (unlikely(inl > qp->max_inline_data)) {
 			errno = ENOMEM;
@@ -317,6 +321,63 @@ void *mlx5_get_atomic_laddr(struct mlx5_qp *qp, uint16_t idx, int *byte_count)
 	return addr;
 }
 
+static inline int copy_eth_inline_headers(struct ibv_qp *ibqp,
+					  struct ibv_send_wr *wr,
+					  struct mlx5_wqe_eth_seg *eseg,
+					  struct mlx5_sg_copy_ptr *sg_copy_ptr)
+{
+	int inl_hdr_size = MLX5_ETH_L2_INLINE_HEADER_SIZE;
+	int inl_hdr_copy_size = 0;
+	int j = 0;
+#ifdef MLX5_DEBUG
+	FILE *fp = to_mctx(ibqp->context)->dbg_fp;
+#endif
+
+	if (unlikely(wr->num_sge < 1)) {
+		mlx5_dbg(fp, MLX5_DBG_QP_SEND, "illegal num_sge: %d, minimum is 1\n",
+			 wr->num_sge);
+		return EINVAL;
+	}
+
+	if (likely(wr->sg_list[0].length >= MLX5_ETH_L2_INLINE_HEADER_SIZE)) {
+		inl_hdr_copy_size = MLX5_ETH_L2_INLINE_HEADER_SIZE;
+		memcpy(eseg->inline_hdr_start,
+		       (void *)(uintptr_t)wr->sg_list[0].addr,
+		       inl_hdr_copy_size);
+	} else {
+		for (j = 0; j < wr->num_sge && inl_hdr_size > 0; ++j) {
+			inl_hdr_copy_size = min(wr->sg_list[j].length,
+						inl_hdr_size);
+			memcpy(eseg->inline_hdr_start +
+			       (MLX5_ETH_L2_INLINE_HEADER_SIZE - inl_hdr_size),
+			       (void *)(uintptr_t)wr->sg_list[j].addr,
+			       inl_hdr_copy_size);
+			inl_hdr_size -= inl_hdr_copy_size;
+		}
+		if (unlikely(inl_hdr_size)) {
+			mlx5_dbg(fp, MLX5_DBG_QP_SEND, "Ethernet headers < 16 bytes\n");
+			return EINVAL;
+		}
+		--j;
+	}
+
+
+	eseg->inline_hdr_sz = htons(MLX5_ETH_L2_INLINE_HEADER_SIZE);
+
+	/* If we copied all the sge into the inline-headers, then we need to
+	 * start copying from the next sge into the data-segment.
+	 */
+	if (unlikely(wr->sg_list[j].length == inl_hdr_copy_size)) {
+		++j;
+		inl_hdr_copy_size = 0;
+	}
+
+	sg_copy_ptr->index = j;
+	sg_copy_ptr->offset = inl_hdr_copy_size;
+
+	return 0;
+}
+
 int mlx5_post_send(struct ibv_qp *ibqp, struct ibv_send_wr *wr,
 			  struct ibv_send_wr **bad_wr)
 {
@@ -325,6 +386,7 @@ int mlx5_post_send(struct ibv_qp *ibqp, struct ibv_send_wr *wr,
 	void *seg;
 	struct mlx5_wqe_ctrl_seg *ctrl = NULL;
 	struct mlx5_wqe_data_seg *dpseg;
+	struct mlx5_sg_copy_ptr sg_copy_ptr = {.index = 0, .offset = 0};
 	int nreq;
 	int inl = 0;
 	int err = 0;
@@ -438,6 +500,22 @@ int mlx5_post_send(struct ibv_qp *ibqp, struct ibv_send_wr *wr,
 				seg = mlx5_get_send_wqe(qp, 0);
 			break;
 
+		case IBV_QPT_RAW_PACKET:
+			memset(seg, 0, sizeof(struct mlx5_wqe_eth_seg));
+
+			err = copy_eth_inline_headers(ibqp, wr, seg, &sg_copy_ptr);
+			if (unlikely(err)) {
+				*bad_wr = wr;
+				mlx5_dbg(fp, MLX5_DBG_QP_SEND,
+					 "copy_eth_inline_headers failed, err: %d\n",
+					 err);
+				goto out;
+			}
+
+			seg += sizeof(struct mlx5_wqe_eth_seg);
+			size += sizeof(struct mlx5_wqe_eth_seg) / 16;
+			break;
+
 		default:
 			break;
 		}
@@ -445,7 +523,7 @@ int mlx5_post_send(struct ibv_qp *ibqp, struct ibv_send_wr *wr,
 		if (wr->send_flags & IBV_SEND_INLINE && wr->num_sge) {
 			int uninitialized_var(sz);
 
-			err = set_data_inl_seg(qp, wr, seg, &sz);
+			err = set_data_inl_seg(qp, wr, seg, &sz, &sg_copy_ptr);
 			if (unlikely(err)) {
 				*bad_wr = wr;
 				mlx5_dbg(fp, MLX5_DBG_QP_SEND,
@@ -456,13 +534,15 @@ int mlx5_post_send(struct ibv_qp *ibqp, struct ibv_send_wr *wr,
 			size += sz;
 		} else {
 			dpseg = seg;
-			for (i = 0; i < wr->num_sge; ++i) {
+			for (i = sg_copy_ptr.index; i < wr->num_sge; ++i) {
 				if (unlikely(dpseg == qend)) {
 					seg = mlx5_get_send_wqe(qp, 0);
 					dpseg = seg;
 				}
 				if (likely(wr->sg_list[i].length)) {
-					set_data_ptr_seg(dpseg, wr->sg_list + i);
+					set_data_ptr_seg(dpseg, wr->sg_list + i,
+							 sg_copy_ptr.offset);
+					sg_copy_ptr.offset = 0;
 					++dpseg;
 					size += sizeof(struct mlx5_wqe_data_seg) / 16;
 				}
@@ -586,7 +666,7 @@ int mlx5_post_recv(struct ibv_qp *ibqp, struct ibv_recv_wr *wr,
 		for (i = 0, j = 0; i < wr->num_sge; ++i) {
 			if (unlikely(!wr->sg_list[i].length))
 				continue;
-			set_data_ptr_seg(scat + j++, wr->sg_list + i);
+			set_data_ptr_seg(scat + j++, wr->sg_list + i, 0);
 		}
 
 		if (j < qp->rq.max_gs) {
@@ -613,8 +693,16 @@ out:
 		 * doorbell record.
 		 */
 		wmb();
-
-		qp->db[MLX5_RCV_DBR] = htonl(qp->rq.head & 0xffff);
+		/*
+		 * For Raw Packet QP, avoid updating the doorbell record
+		 * as long as the QP isn't in RTR state, to avoid receiving
+		 * packets in illegal states.
+		 * This is only for Raw Packet QPs since they are represented
+		 * differently in the hardware.
+		 */
+		if (likely(!(ibqp->qp_type == IBV_QPT_RAW_PACKET &&
+			     ibqp->state < IBV_QPS_RTR)))
+			qp->db[MLX5_RCV_DBR] = htonl(qp->rq.head & 0xffff);
 	}
 
 	mlx5_spin_unlock(&qp->rq.lock);
diff --git a/src/wqe.h b/src/wqe.h
index b875104..eaaf7a6 100644
--- a/src/wqe.h
+++ b/src/wqe.h
@@ -60,6 +60,11 @@ struct mlx5_wqe_data_seg {
 	uint64_t		addr;
 };
 
+struct mlx5_sg_copy_ptr {
+	int	index;
+	int	offset;
+};
+
 struct mlx5_eqe_comp {
 	uint32_t	reserved[6];
 	uint32_t	cqn;
@@ -70,6 +75,10 @@ struct mlx5_eqe_qp_srq {
 	uint32_t	qp_srq_n;
 };
 
+enum {
+	MLX5_ETH_L2_INLINE_HEADER_SIZE	= 18,
+};
+
 struct mlx5_wqe_eth_seg {
 	uint32_t	rsvd0;
 	uint8_t		cs_flags;
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH libmlx5 v2 0/7] Raw Packet QP for mlx5 v2
       [not found] ` <1454441391-30494-1-git-send-email-majd-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
                     ` (6 preceding siblings ...)
  2016-02-02 19:29   ` [PATCH libmlx5 v2 7/7] Add Raw Packet QP data-path functionality Majd Dibbiny
@ 2016-02-03 17:21   ` Yishai Hadas
  7 siblings, 0 replies; 9+ messages in thread
From: Yishai Hadas @ 2016-02-03 17:21 UTC (permalink / raw)
  To: Majd Dibbiny
  Cc: yishaih-VPRAkNaXOzVWk0Htik3J/w,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, matanb-VPRAkNaXOzVWk0Htik3J/w,
	talal-VPRAkNaXOzVWk0Htik3J/w, cl-vYTEC60ixJUAvxtiuMwx3w

On 2/2/2016 9:29 PM, Majd Dibbiny wrote:
> Hi Yishai,
>
> This patch set adds support for Raw Packet QP for libmlx5.
>
> Raw Packet QP enables the user to send and receive raw packets. The user is
> responsible of building the packet including the headers.
>
> Raw Packet QP works with non-default CQE version, and in order to tie
> CQE to Raw Packet QP, we need to provide a user-index in the creation of QPs
> when working with non-default CQE version.
>
> The first 5 patches add support for CQE version 1 for QPs and XSRQs. The later
> patches add support for Raw Packet QP (control and data path).
>
> Changes from v1:
> 1. Fix compilation errors when compiling in debug mode
>
> Haggai Abramovsky (5):
>    Add infrastructure for resource identification
>    Add resource tracking database
>    Add new poll_cq according to the new CQE format
>    Add QP and XSRQ create/destroy flow with user index
>    Work with CQE version 1
>
> Majd Dibbiny (2):
>    Allocate separate RQ and SQ buffers for Raw Packet QP
>    Add Raw Packet QP data-path functionality
>
>   src/cq.c       | 271 +++++++++++++++++++++++++++++++++++++++++++++++----------
>   src/mlx5-abi.h |  20 ++++-
>   src/mlx5.c     |  90 +++++++++++++++++++
>   src/mlx5.h     |  63 +++++++++++++-
>   src/qp.c       | 114 ++++++++++++++++++++----
>   src/verbs.c    | 210 +++++++++++++++++++++++++++++++++++---------
>   src/wqe.h      |  20 +++++
>   7 files changed, 686 insertions(+), 102 deletions(-)
>

Series was applied, thanks.

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2016-02-03 17:21 UTC | newest]

Thread overview: 9+ messages
2016-02-02 19:29 [PATCH libmlx5 v2 0/7] Raw Packet QP for mlx5 v2 Majd Dibbiny
     [not found] ` <1454441391-30494-1-git-send-email-majd-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2016-02-02 19:29   ` [PATCH libmlx5 v2 1/7] Add infrastructure for resource identification Majd Dibbiny
2016-02-02 19:29   ` [PATCH libmlx5 v2 2/7] Add resource tracking database Majd Dibbiny
2016-02-02 19:29   ` [PATCH libmlx5 v2 3/7] Add new poll_cq according to the new CQE format Majd Dibbiny
2016-02-02 19:29   ` [PATCH libmlx5 v2 4/7] Add QP and XSRQ create/destroy flow with user index Majd Dibbiny
2016-02-02 19:29   ` [PATCH libmlx5 v2 5/7] Work with CQE version 1 Majd Dibbiny
2016-02-02 19:29   ` [PATCH libmlx5 v2 6/7] Allocate separate RQ and SQ buffers for Raw Packet QP Majd Dibbiny
2016-02-02 19:29   ` [PATCH libmlx5 v2 7/7] Add Raw Packet QP data-path functionality Majd Dibbiny
2016-02-03 17:21   ` [PATCH libmlx5 v2 0/7] Raw Packet QP for mlx5 v2 Yishai Hadas
