* [PATCH libmlx5 0/7] Completion timestamping
@ 2016-06-01 13:47 Yishai Hadas
       [not found] ` <1464788882-1876-1-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 22+ messages in thread
From: Yishai Hadas @ 2016-06-01 13:47 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	yishaih-VPRAkNaXOzVWk0Htik3J/w, matanb-VPRAkNaXOzVWk0Htik3J/w,
	majd-VPRAkNaXOzVWk0Htik3J/w, talal-VPRAkNaXOzVWk0Htik3J/w,
	jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/

Hi Doug,

This series from Matan and me implements the libibverbs
'Completion timestamping' API.

It can serve as vendor code that justifies the new API, from both
clarity and performance aspects.

As already pointed out, benchmarks we ran in our test lab found that the
new approach generally matches the performance of the current API and is
never worse. Since the new API also enables extending the polled fields,
we can say that overall it is a better API than the legacy one.

Yishai

Matan Barak (4):
  Refactor mlx5_poll_one
  Add support for creating an extended CQ
  Add ibv_query_rt_values support
  Use configuration symbol for always in-line

Yishai Hadas (3):
  Add lazy CQ polling
  Add inline functions to read completion's attributes
  Add ability to poll CQs through iterator's style API

 Makefile.am                 |   1 +
 configure.ac                |   3 +
 m4/ax_gcc_func_attribute.m4 | 223 +++++++++++++
 src/cq.c                    | 745 ++++++++++++++++++++++++++++++++++++++++----
 src/mlx5-abi.h              |   5 +
 src/mlx5.c                  |  38 +++
 src/mlx5.h                  |  34 +-
 src/verbs.c                 | 139 ++++++++-
 8 files changed, 1106 insertions(+), 82 deletions(-)
 create mode 100644 m4/ax_gcc_func_attribute.m4

-- 
1.8.3.1

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH libmlx5 1/7] Refactor mlx5_poll_one
       [not found] ` <1464788882-1876-1-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2016-06-01 13:47   ` Yishai Hadas
       [not found]     ` <1464788882-1876-2-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2016-06-01 13:47   ` [PATCH libmlx5 2/7] Add lazy CQ polling Yishai Hadas
                     ` (5 subsequent siblings)
  6 siblings, 1 reply; 22+ messages in thread
From: Yishai Hadas @ 2016-06-01 13:47 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	yishaih-VPRAkNaXOzVWk0Htik3J/w, matanb-VPRAkNaXOzVWk0Htik3J/w,
	majd-VPRAkNaXOzVWk0Htik3J/w, talal-VPRAkNaXOzVWk0Htik3J/w,
	jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/

From: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

Since downstream patches aim to provide lazy CQE polling, which lets
the user poll the CQE's attributes via inline functions, we refactor
mlx5_poll_one:
* Pass an out pointer instead of writing to the wc directly as part of
  handle_error_cqe.
* Introduce mlx5_get_next_cqe, which will be used to advance the
  CQE iterator.

Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Reviewed-by: Yishai Hadas <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 src/cq.c | 112 ++++++++++++++++++++++++++++++++++++++-------------------------
 1 file changed, 67 insertions(+), 45 deletions(-)

diff --git a/src/cq.c b/src/cq.c
index ce18ac9..d3f2ada 100644
--- a/src/cq.c
+++ b/src/cq.c
@@ -298,54 +298,52 @@ static void dump_cqe(FILE *fp, void *buf)
 }
 
 static void mlx5_handle_error_cqe(struct mlx5_err_cqe *cqe,
-				  struct ibv_wc *wc)
+				  enum ibv_wc_status *pstatus)
 {
 	switch (cqe->syndrome) {
 	case MLX5_CQE_SYNDROME_LOCAL_LENGTH_ERR:
-		wc->status = IBV_WC_LOC_LEN_ERR;
+		*pstatus = IBV_WC_LOC_LEN_ERR;
 		break;
 	case MLX5_CQE_SYNDROME_LOCAL_QP_OP_ERR:
-		wc->status = IBV_WC_LOC_QP_OP_ERR;
+		*pstatus = IBV_WC_LOC_QP_OP_ERR;
 		break;
 	case MLX5_CQE_SYNDROME_LOCAL_PROT_ERR:
-		wc->status = IBV_WC_LOC_PROT_ERR;
+		*pstatus = IBV_WC_LOC_PROT_ERR;
 		break;
 	case MLX5_CQE_SYNDROME_WR_FLUSH_ERR:
-		wc->status = IBV_WC_WR_FLUSH_ERR;
+		*pstatus = IBV_WC_WR_FLUSH_ERR;
 		break;
 	case MLX5_CQE_SYNDROME_MW_BIND_ERR:
-		wc->status = IBV_WC_MW_BIND_ERR;
+		*pstatus = IBV_WC_MW_BIND_ERR;
 		break;
 	case MLX5_CQE_SYNDROME_BAD_RESP_ERR:
-		wc->status = IBV_WC_BAD_RESP_ERR;
+		*pstatus = IBV_WC_BAD_RESP_ERR;
 		break;
 	case MLX5_CQE_SYNDROME_LOCAL_ACCESS_ERR:
-		wc->status = IBV_WC_LOC_ACCESS_ERR;
+		*pstatus = IBV_WC_LOC_ACCESS_ERR;
 		break;
 	case MLX5_CQE_SYNDROME_REMOTE_INVAL_REQ_ERR:
-		wc->status = IBV_WC_REM_INV_REQ_ERR;
+		*pstatus = IBV_WC_REM_INV_REQ_ERR;
 		break;
 	case MLX5_CQE_SYNDROME_REMOTE_ACCESS_ERR:
-		wc->status = IBV_WC_REM_ACCESS_ERR;
+		*pstatus = IBV_WC_REM_ACCESS_ERR;
 		break;
 	case MLX5_CQE_SYNDROME_REMOTE_OP_ERR:
-		wc->status = IBV_WC_REM_OP_ERR;
+		*pstatus = IBV_WC_REM_OP_ERR;
 		break;
 	case MLX5_CQE_SYNDROME_TRANSPORT_RETRY_EXC_ERR:
-		wc->status = IBV_WC_RETRY_EXC_ERR;
+		*pstatus = IBV_WC_RETRY_EXC_ERR;
 		break;
 	case MLX5_CQE_SYNDROME_RNR_RETRY_EXC_ERR:
-		wc->status = IBV_WC_RNR_RETRY_EXC_ERR;
+		*pstatus = IBV_WC_RNR_RETRY_EXC_ERR;
 		break;
 	case MLX5_CQE_SYNDROME_REMOTE_ABORTED_ERR:
-		wc->status = IBV_WC_REM_ABORT_ERR;
+		*pstatus = IBV_WC_REM_ABORT_ERR;
 		break;
 	default:
-		wc->status = IBV_WC_GENERAL_ERR;
+		*pstatus = IBV_WC_GENERAL_ERR;
 		break;
 	}
-
-	wc->vendor_err = cqe->vendor_err_synd;
 }
 
 #if defined(__x86_64__) || defined (__i386__)
@@ -504,36 +502,21 @@ static inline int get_cur_rsc(struct mlx5_context *mctx,
 
 }
 
-static inline int mlx5_poll_one(struct mlx5_cq *cq,
-			 struct mlx5_resource **cur_rsc,
-			 struct mlx5_srq **cur_srq,
-			 struct ibv_wc *wc, int cqe_ver)
-			 __attribute__((always_inline));
-static inline int mlx5_poll_one(struct mlx5_cq *cq,
-			 struct mlx5_resource **cur_rsc,
-			 struct mlx5_srq **cur_srq,
-			 struct ibv_wc *wc, int cqe_ver)
+static inline int mlx5_get_next_cqe(struct mlx5_cq *cq,
+				    struct mlx5_cqe64 **pcqe64,
+				    void **pcqe)
+				    __attribute__((always_inline));
+static inline int mlx5_get_next_cqe(struct mlx5_cq *cq,
+				    struct mlx5_cqe64 **pcqe64,
+				    void **pcqe)
 {
-	struct mlx5_cqe64 *cqe64;
-	struct mlx5_wq *wq;
-	uint16_t wqe_ctr;
 	void *cqe;
-	uint32_t qpn;
-	uint32_t srqn_uidx;
-	int idx;
-	uint8_t opcode;
-	struct mlx5_err_cqe *ecqe;
-	int err;
-	struct mlx5_qp *mqp;
-	struct mlx5_context *mctx;
-	uint8_t is_srq = 0;
+	struct mlx5_cqe64 *cqe64;
 
 	cqe = next_cqe_sw(cq);
 	if (!cqe)
 		return CQ_EMPTY;
 
-	mctx = to_mctx(cq->ibv_cq.context);
-
 	cqe64 = (cq->cqe_sz == 64) ? cqe : cqe + 64;
 
 	++cq->cons_index;
@@ -547,14 +530,52 @@ static inline int mlx5_poll_one(struct mlx5_cq *cq,
 	rmb();
 
 #ifdef MLX5_DEBUG
-	if (mlx5_debug_mask & MLX5_DBG_CQ_CQE) {
-		FILE *fp = mctx->dbg_fp;
+	{
+		struct mlx5_context *mctx = to_mctx(cq->ibv_cq.context);
+
+		if (mlx5_debug_mask & MLX5_DBG_CQ_CQE) {
+			FILE *fp = mctx->dbg_fp;
 
-		mlx5_dbg(fp, MLX5_DBG_CQ_CQE, "dump cqe for cqn 0x%x:\n", cq->cqn);
-		dump_cqe(fp, cqe64);
+			mlx5_dbg(fp, MLX5_DBG_CQ_CQE, "dump cqe for cqn 0x%x:\n", cq->cqn);
+			dump_cqe(fp, cqe64);
+		}
 	}
 #endif
+	*pcqe64 = cqe64;
+	*pcqe = cqe;
+
+	return CQ_OK;
+}
+
+static inline int mlx5_poll_one(struct mlx5_cq *cq,
+				struct mlx5_resource **cur_rsc,
+				struct mlx5_srq **cur_srq,
+				struct ibv_wc *wc, int cqe_ver)
+				__attribute__((always_inline));
+static inline int mlx5_poll_one(struct mlx5_cq *cq,
+				struct mlx5_resource **cur_rsc,
+				struct mlx5_srq **cur_srq,
+				struct ibv_wc *wc, int cqe_ver)
+{
+	struct mlx5_cqe64 *cqe64;
+	struct mlx5_wq *wq;
+	uint16_t wqe_ctr;
+	void *cqe;
+	uint32_t qpn;
+	uint32_t srqn_uidx;
+	int idx;
+	uint8_t opcode;
+	struct mlx5_err_cqe *ecqe;
+	int err;
+	struct mlx5_qp *mqp;
+	struct mlx5_context *mctx;
+	uint8_t is_srq = 0;
 
+	err = mlx5_get_next_cqe(cq, &cqe64, &cqe);
+	if (err == CQ_EMPTY)
+		return err;
+
+	mctx = to_mctx(cq->ibv_cq.context);
 	qpn = ntohl(cqe64->sop_drop_qpn) & 0xffffff;
 	wc->wc_flags = 0;
 	wc->qp_num = qpn;
@@ -602,7 +623,8 @@ static inline int mlx5_poll_one(struct mlx5_cq *cq,
 	case MLX5_CQE_RESP_ERR:
 		srqn_uidx = ntohl(cqe64->srqn_uidx) & 0xffffff;
 		ecqe = (struct mlx5_err_cqe *)cqe64;
-		mlx5_handle_error_cqe(ecqe, wc);
+		mlx5_handle_error_cqe(ecqe, &wc->status);
+		wc->vendor_err = ecqe->vendor_err_synd;
 		if (unlikely(ecqe->syndrome != MLX5_CQE_SYNDROME_WR_FLUSH_ERR &&
 			     ecqe->syndrome != MLX5_CQE_SYNDROME_TRANSPORT_RETRY_EXC_ERR)) {
 			FILE *fp = mctx->dbg_fp;
-- 
1.8.3.1



* [PATCH libmlx5 2/7] Add lazy CQ polling
       [not found] ` <1464788882-1876-1-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2016-06-01 13:47   ` [PATCH libmlx5 1/7] Refactor mlx5_poll_one Yishai Hadas
@ 2016-06-01 13:47   ` Yishai Hadas
  2016-06-01 13:47   ` [PATCH libmlx5 3/7] Add inline functions to read completion's attributes Yishai Hadas
                     ` (4 subsequent siblings)
  6 siblings, 0 replies; 22+ messages in thread
From: Yishai Hadas @ 2016-06-01 13:47 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	yishaih-VPRAkNaXOzVWk0Htik3J/w, matanb-VPRAkNaXOzVWk0Htik3J/w,
	majd-VPRAkNaXOzVWk0Htik3J/w, talal-VPRAkNaXOzVWk0Htik3J/w,
	jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/

Currently, when a user wants to poll a CQ for completions, they have no
choice but to get the whole work completion (WC). This has several
implications - for example:
* Extending the WC is limited, as adding new fields makes the WC
  larger and could take more cache lines.
* Every field is copied to the WC - even fields that the user
  doesn't care about.

This patch adds support for handling the CQE in a lazy manner.
The new lazy mode is going to be called in downstream patches.

We parse only the fields that are mandatory in order to process the
CQE, such as its type, status and wr_id.

To share code with the legacy mode without a performance penalty, the
legacy code was refactored and the 'always_inline' mechanism was used
so that branch conditions are dropped at compile time.

Signed-off-by: Yishai Hadas <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 src/cq.c    | 176 +++++++++++++++++++++++++++++++++++++++++++++++++-----------
 src/mlx5.h  |  11 +++-
 src/verbs.c |   6 +--
 3 files changed, 157 insertions(+), 36 deletions(-)

diff --git a/src/cq.c b/src/cq.c
index d3f2ada..a056787 100644
--- a/src/cq.c
+++ b/src/cq.c
@@ -219,6 +219,54 @@ static void handle_good_req(struct ibv_wc *wc, struct mlx5_cqe64 *cqe)
 	}
 }
 
+static inline void handle_good_req_lazy(struct mlx5_cqe64 *cqe, uint32_t *pwc_byte_len)
+{
+	switch (ntohl(cqe->sop_drop_qpn) >> 24) {
+	case MLX5_OPCODE_RDMA_READ:
+		*pwc_byte_len  = ntohl(cqe->byte_cnt);
+		break;
+	case MLX5_OPCODE_ATOMIC_CS:
+	case MLX5_OPCODE_ATOMIC_FA:
+		*pwc_byte_len  = 8;
+		break;
+	}
+}
+
+static inline int handle_responder_lazy(struct mlx5_cq *cq, struct mlx5_cqe64 *cqe,
+					struct mlx5_qp *qp, struct mlx5_srq *srq)
+{
+	uint16_t	wqe_ctr;
+	struct mlx5_wq *wq;
+	int err = IBV_WC_SUCCESS;
+
+	if (srq) {
+		wqe_ctr = ntohs(cqe->wqe_counter);
+		cq->ibv_cq.wr_id = srq->wrid[wqe_ctr];
+		mlx5_free_srq_wqe(srq, wqe_ctr);
+		if (cqe->op_own & MLX5_INLINE_SCATTER_32)
+			err = mlx5_copy_to_recv_srq(srq, wqe_ctr, cqe,
+						    ntohl(cqe->byte_cnt));
+		else if (cqe->op_own & MLX5_INLINE_SCATTER_64)
+			err = mlx5_copy_to_recv_srq(srq, wqe_ctr, cqe - 1,
+						    ntohl(cqe->byte_cnt));
+	} else {
+		wq	  = &qp->rq;
+		wqe_ctr = wq->tail & (wq->wqe_cnt - 1);
+		cq->ibv_cq.wr_id = wq->wrid[wqe_ctr];
+		++wq->tail;
+		if (qp->qp_cap_cache & MLX5_RX_CSUM_VALID)
+			cq->flags |= MLX5_CQ_FLAGS_RX_CSUM_VALID;
+		if (cqe->op_own & MLX5_INLINE_SCATTER_32)
+			err = mlx5_copy_to_recv_wqe(qp, wqe_ctr, cqe,
+						    ntohl(cqe->byte_cnt));
+		else if (cqe->op_own & MLX5_INLINE_SCATTER_64)
+			err = mlx5_copy_to_recv_wqe(qp, wqe_ctr, cqe - 1,
+						    ntohl(cqe->byte_cnt));
+	}
+
+	return err;
+}
+
 static int handle_responder(struct ibv_wc *wc, struct mlx5_cqe64 *cqe,
 			    struct mlx5_qp *qp, struct mlx5_srq *srq)
 {
@@ -547,41 +595,49 @@ static inline int mlx5_get_next_cqe(struct mlx5_cq *cq,
 	return CQ_OK;
 }
 
-static inline int mlx5_poll_one(struct mlx5_cq *cq,
-				struct mlx5_resource **cur_rsc,
-				struct mlx5_srq **cur_srq,
-				struct ibv_wc *wc, int cqe_ver)
-				__attribute__((always_inline));
-static inline int mlx5_poll_one(struct mlx5_cq *cq,
-				struct mlx5_resource **cur_rsc,
-				struct mlx5_srq **cur_srq,
-				struct ibv_wc *wc, int cqe_ver)
+static inline int mlx5_parse_cqe(struct mlx5_cq *cq,
+				 struct mlx5_cqe64 *cqe64,
+				 void *cqe,
+				 struct mlx5_resource **cur_rsc,
+				 struct mlx5_srq **cur_srq,
+				 struct ibv_wc *wc,
+				 int cqe_ver, int lazy)
+				 __attribute__((always_inline));
+static inline int mlx5_parse_cqe(struct mlx5_cq *cq,
+				 struct mlx5_cqe64 *cqe64,
+				 void *cqe,
+				 struct mlx5_resource **cur_rsc,
+				 struct mlx5_srq **cur_srq,
+				 struct ibv_wc *wc,
+				 int cqe_ver, int lazy)
 {
-	struct mlx5_cqe64 *cqe64;
 	struct mlx5_wq *wq;
 	uint16_t wqe_ctr;
-	void *cqe;
 	uint32_t qpn;
 	uint32_t srqn_uidx;
 	int idx;
 	uint8_t opcode;
 	struct mlx5_err_cqe *ecqe;
-	int err;
+	int err = 0;
 	struct mlx5_qp *mqp;
 	struct mlx5_context *mctx;
 	uint8_t is_srq = 0;
 
-	err = mlx5_get_next_cqe(cq, &cqe64, &cqe);
-	if (err == CQ_EMPTY)
-		return err;
-
-	mctx = to_mctx(cq->ibv_cq.context);
+	mctx = to_mctx(ibv_cq_ex_to_cq(&cq->ibv_cq)->context);
 	qpn = ntohl(cqe64->sop_drop_qpn) & 0xffffff;
-	wc->wc_flags = 0;
-	wc->qp_num = qpn;
+	if (lazy) {
+		cq->cqe64 = cqe64;
+		cq->flags &= (~MLX5_CQ_FLAGS_RX_CSUM_VALID);
+	} else {
+		wc->wc_flags = 0;
+		wc->qp_num = qpn;
+	}
+
 	opcode = cqe64->op_own >> 4;
 	switch (opcode) {
 	case MLX5_CQE_REQ:
+	{
+		uint32_t uninitialized_var(wc_byte_len);
 		mqp = get_req_context(mctx, cur_rsc,
 				      (cqe_ver ? (ntohl(cqe64->srqn_uidx) & 0xffffff) : qpn),
 				      cqe_ver);
@@ -590,20 +646,29 @@ static inline int mlx5_poll_one(struct mlx5_cq *cq,
 		wq = &mqp->sq;
 		wqe_ctr = ntohs(cqe64->wqe_counter);
 		idx = wqe_ctr & (wq->wqe_cnt - 1);
-		handle_good_req(wc, cqe64);
+		if (lazy)
+			handle_good_req_lazy(cqe64, &wc_byte_len);
+		else
+			handle_good_req(wc, cqe64);
+
 		if (cqe64->op_own & MLX5_INLINE_SCATTER_32)
 			err = mlx5_copy_to_send_wqe(mqp, wqe_ctr, cqe,
-						    wc->byte_len);
+						    lazy ? wc_byte_len : wc->byte_len);
 		else if (cqe64->op_own & MLX5_INLINE_SCATTER_64)
 			err = mlx5_copy_to_send_wqe(mqp, wqe_ctr, cqe - 1,
-						    wc->byte_len);
-		else
-			err = 0;
+						     lazy ? wc_byte_len : wc->byte_len);
+
+		if (lazy) {
+			cq->ibv_cq.wr_id = wq->wrid[idx];
+			cq->ibv_cq.status = err;
+		} else {
+			wc->wr_id = wq->wrid[idx];
+			wc->status = err;
+		}
 
-		wc->wr_id = wq->wrid[idx];
 		wq->tail = wq->wqe_head[idx] + 1;
-		wc->status = err;
 		break;
+	}
 	case MLX5_CQE_RESP_WR_IMM:
 	case MLX5_CQE_RESP_SEND:
 	case MLX5_CQE_RESP_SEND_IMM:
@@ -614,7 +679,12 @@ static inline int mlx5_poll_one(struct mlx5_cq *cq,
 		if (unlikely(err))
 			return CQ_POLL_ERR;
 
-		wc->status = handle_responder(wc, cqe64, rsc_to_mqp(*cur_rsc),
+		if (lazy)
+			cq->ibv_cq.status = handle_responder_lazy(cq, cqe64,
+							      rsc_to_mqp(*cur_rsc),
+							      is_srq ? *cur_srq : NULL);
+		else
+			wc->status = handle_responder(wc, cqe64, rsc_to_mqp(*cur_rsc),
 					      is_srq ? *cur_srq : NULL);
 		break;
 	case MLX5_CQE_RESIZE_CQ:
@@ -623,8 +693,9 @@ static inline int mlx5_poll_one(struct mlx5_cq *cq,
 	case MLX5_CQE_RESP_ERR:
 		srqn_uidx = ntohl(cqe64->srqn_uidx) & 0xffffff;
 		ecqe = (struct mlx5_err_cqe *)cqe64;
-		mlx5_handle_error_cqe(ecqe, &wc->status);
-		wc->vendor_err = ecqe->vendor_err_synd;
+		mlx5_handle_error_cqe(ecqe, lazy ? &cq->ibv_cq.status : &wc->status);
+		if (!lazy)
+			wc->vendor_err = ecqe->vendor_err_synd;
 		if (unlikely(ecqe->syndrome != MLX5_CQE_SYNDROME_WR_FLUSH_ERR &&
 			     ecqe->syndrome != MLX5_CQE_SYNDROME_TRANSPORT_RETRY_EXC_ERR)) {
 			FILE *fp = mctx->dbg_fp;
@@ -646,7 +717,10 @@ static inline int mlx5_poll_one(struct mlx5_cq *cq,
 			wq = &mqp->sq;
 			wqe_ctr = ntohs(cqe64->wqe_counter);
 			idx = wqe_ctr & (wq->wqe_cnt - 1);
-			wc->wr_id = wq->wrid[idx];
+			if (lazy)
+				cq->ibv_cq.wr_id = wq->wrid[idx];
+			else
+				wc->wr_id = wq->wrid[idx];
 			wq->tail = wq->wqe_head[idx] + 1;
 		} else {
 			err = get_cur_rsc(mctx, cqe_ver, qpn, srqn_uidx,
@@ -656,12 +730,18 @@ static inline int mlx5_poll_one(struct mlx5_cq *cq,
 
 			if (is_srq) {
 				wqe_ctr = ntohs(cqe64->wqe_counter);
-				wc->wr_id = (*cur_srq)->wrid[wqe_ctr];
+				if (lazy)
+					cq->ibv_cq.wr_id = (*cur_srq)->wrid[wqe_ctr];
+				else
+					wc->wr_id = (*cur_srq)->wrid[wqe_ctr];
 				mlx5_free_srq_wqe(*cur_srq, wqe_ctr);
 			} else {
 				mqp = rsc_to_mqp(*cur_rsc);
 				wq = &mqp->rq;
-				wc->wr_id = wq->wrid[wq->tail & (wq->wqe_cnt - 1)];
+				if (lazy)
+					cq->ibv_cq.wr_id = wq->wrid[wq->tail & (wq->wqe_cnt - 1)];
+				else
+					wc->wr_id = wq->wrid[wq->tail & (wq->wqe_cnt - 1)];
 				++wq->tail;
 			}
 		}
@@ -671,6 +751,38 @@ static inline int mlx5_poll_one(struct mlx5_cq *cq,
 	return CQ_OK;
 }
 
+static inline int mlx5_parse_lazy_cqe(struct mlx5_cq *cq,
+				      struct mlx5_cqe64 *cqe64,
+				      void *cqe, int cqe_ver)
+				      __attribute__((always_inline));
+static inline int mlx5_parse_lazy_cqe(struct mlx5_cq *cq,
+				      struct mlx5_cqe64 *cqe64,
+				      void *cqe, int cqe_ver)
+{
+	return mlx5_parse_cqe(cq, cqe64, cqe, &cq->cur_rsc, &cq->cur_srq, NULL, cqe_ver, 1);
+}
+
+static inline int mlx5_poll_one(struct mlx5_cq *cq,
+				struct mlx5_resource **cur_rsc,
+				struct mlx5_srq **cur_srq,
+				struct ibv_wc *wc, int cqe_ver)
+				__attribute__((always_inline));
+static inline int mlx5_poll_one(struct mlx5_cq *cq,
+				struct mlx5_resource **cur_rsc,
+				struct mlx5_srq **cur_srq,
+				struct ibv_wc *wc, int cqe_ver)
+{
+	struct mlx5_cqe64 *cqe64;
+	void *cqe;
+	int err;
+
+	err = mlx5_get_next_cqe(cq, &cqe64, &cqe);
+	if (err == CQ_EMPTY)
+		return err;
+
+	return mlx5_parse_cqe(cq, cqe64, cqe, cur_rsc, cur_srq, wc, cqe_ver, 0);
+}
+
 static inline int poll_cq(struct ibv_cq *ibcq, int ne,
 		      struct ibv_wc *wc, int cqe_ver)
 		      __attribute__((always_inline));
diff --git a/src/mlx5.h b/src/mlx5.h
index e91e519..99bee10 100644
--- a/src/mlx5.h
+++ b/src/mlx5.h
@@ -364,8 +364,13 @@ enum {
 	MLX5_CQ_ARM_DB	= 1,
 };
 
+enum {
+	MLX5_CQ_FLAGS_RX_CSUM_VALID = 1 << 0,
+};
+
 struct mlx5_cq {
-	struct ibv_cq			ibv_cq;
+	/* ibv_cq should always be subset of ibv_cq_ex */
+	struct ibv_cq_ex		ibv_cq;
 	struct mlx5_buf			buf_a;
 	struct mlx5_buf			buf_b;
 	struct mlx5_buf		       *active_buf;
@@ -384,6 +389,10 @@ struct mlx5_cq {
 	uint64_t			stall_last_count;
 	int				stall_adaptive_enable;
 	int				stall_cycles;
+	struct mlx5_resource		*cur_rsc;
+	struct mlx5_srq			*cur_srq;
+	struct mlx5_cqe64		*cqe64;
+	uint32_t			flags;
 };
 
 struct mlx5_srq {
diff --git a/src/verbs.c b/src/verbs.c
index e7aad5f..e78d2a5 100644
--- a/src/verbs.c
+++ b/src/verbs.c
@@ -328,8 +328,8 @@ struct ibv_cq *mlx5_create_cq(struct ibv_context *context, int cqe,
 	cmd.cqe_size = cqe_sz;
 
 	ret = ibv_cmd_create_cq(context, ncqe - 1, channel, comp_vector,
-				&cq->ibv_cq, &cmd.ibv_cmd, sizeof cmd,
-				&resp.ibv_resp, sizeof resp);
+				ibv_cq_ex_to_cq(&cq->ibv_cq), &cmd.ibv_cmd,
+				sizeof(cmd), &resp.ibv_resp, sizeof(resp));
 	if (ret) {
 		mlx5_dbg(fp, MLX5_DBG_CQ, "ret %d\n", ret);
 		goto err_db;
@@ -342,7 +342,7 @@ struct ibv_cq *mlx5_create_cq(struct ibv_context *context, int cqe,
 	cq->stall_adaptive_enable = to_mctx(context)->stall_adaptive_enable;
 	cq->stall_cycles = to_mctx(context)->stall_cycles;
 
-	return &cq->ibv_cq;
+	return ibv_cq_ex_to_cq(&cq->ibv_cq);
 
 err_db:
 	mlx5_free_db(to_mctx(context), cq->dbrec);
-- 
1.8.3.1



* [PATCH libmlx5 3/7] Add inline functions to read completion's attributes
       [not found] ` <1464788882-1876-1-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2016-06-01 13:47   ` [PATCH libmlx5 1/7] Refactor mlx5_poll_one Yishai Hadas
  2016-06-01 13:47   ` [PATCH libmlx5 2/7] Add lazy CQ polling Yishai Hadas
@ 2016-06-01 13:47   ` Yishai Hadas
       [not found]     ` <1464788882-1876-4-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2016-06-01 13:47   ` [PATCH libmlx5 4/7] Add ability to poll CQs through iterator's style API Yishai Hadas
                     ` (3 subsequent siblings)
  6 siblings, 1 reply; 22+ messages in thread
From: Yishai Hadas @ 2016-06-01 13:47 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	yishaih-VPRAkNaXOzVWk0Htik3J/w, matanb-VPRAkNaXOzVWk0Htik3J/w,
	majd-VPRAkNaXOzVWk0Htik3J/w, talal-VPRAkNaXOzVWk0Htik3J/w,
	jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/

Add inline functions in order to read the various completion
attributes. These functions are assigned into the ibv_cq_ex structure
so that the user can read the completion's attributes.

Signed-off-by: Yishai Hadas <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 src/cq.c | 121 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 121 insertions(+)

diff --git a/src/cq.c b/src/cq.c
index a056787..188b34e 100644
--- a/src/cq.c
+++ b/src/cq.c
@@ -850,6 +850,127 @@ int mlx5_poll_cq_v1(struct ibv_cq *ibcq, int ne, struct ibv_wc *wc)
 	return poll_cq(ibcq, ne, wc, 1);
 }
 
+static inline enum ibv_wc_opcode mlx5_cq_read_wc_opcode(struct ibv_cq_ex *ibcq)
+{
+	struct mlx5_cq *cq = to_mcq(ibv_cq_ex_to_cq(ibcq));
+
+	switch (cq->cqe64->op_own >> 4) {
+	case MLX5_CQE_RESP_WR_IMM:
+		return IBV_WC_RECV_RDMA_WITH_IMM;
+	case MLX5_CQE_RESP_SEND:
+		return IBV_WC_RECV;
+	case MLX5_CQE_RESP_SEND_IMM:
+		return IBV_WC_RECV;
+	case MLX5_CQE_REQ:
+		switch (ntohl(cq->cqe64->sop_drop_qpn) >> 24) {
+		case MLX5_OPCODE_RDMA_WRITE_IMM:
+		case MLX5_OPCODE_RDMA_WRITE:
+			return IBV_WC_RDMA_WRITE;
+		case MLX5_OPCODE_SEND_IMM:
+		case MLX5_OPCODE_SEND:
+		case MLX5_OPCODE_SEND_INVAL:
+			return IBV_WC_SEND;
+		case MLX5_OPCODE_RDMA_READ:
+			return IBV_WC_RDMA_READ;
+		case MLX5_OPCODE_ATOMIC_CS:
+			return IBV_WC_COMP_SWAP;
+		case MLX5_OPCODE_ATOMIC_FA:
+			return IBV_WC_FETCH_ADD;
+		case MLX5_OPCODE_BIND_MW:
+			return IBV_WC_BIND_MW;
+		}
+	}
+	fprintf(stderr, "un-expected opcode in cqe\n");
+	return 0;
+}
+
+static inline uint32_t mlx5_cq_read_wc_qp_num(struct ibv_cq_ex *ibcq)
+{
+	struct mlx5_cq *cq = to_mcq(ibv_cq_ex_to_cq(ibcq));
+
+	return ntohl(cq->cqe64->sop_drop_qpn) & 0xffffff;
+}
+
+static inline int mlx5_cq_read_wc_flags(struct ibv_cq_ex *ibcq)
+{
+	struct mlx5_cq *cq = to_mcq(ibv_cq_ex_to_cq(ibcq));
+	int wc_flags = 0;
+
+	if (cq->flags & MLX5_CQ_FLAGS_RX_CSUM_VALID)
+		wc_flags = (!!(cq->cqe64->hds_ip_ext & MLX5_CQE_L4_OK) &
+				 !!(cq->cqe64->hds_ip_ext & MLX5_CQE_L3_OK) &
+				 (get_cqe_l3_hdr_type(cq->cqe64) ==
+				  MLX5_CQE_L3_HDR_TYPE_IPV4)) <<
+				IBV_WC_IP_CSUM_OK_SHIFT;
+
+	switch (cq->cqe64->op_own >> 4) {
+	case MLX5_CQE_RESP_WR_IMM:
+	case MLX5_CQE_RESP_SEND_IMM:
+		wc_flags	|= IBV_WC_WITH_IMM;
+		break;
+	}
+
+	wc_flags |= ((ntohl(cq->cqe64->flags_rqpn) >> 28) & 3) ? IBV_WC_GRH : 0;
+	return wc_flags;
+}
+
+static inline uint32_t mlx5_cq_read_wc_byte_len(struct ibv_cq_ex *ibcq)
+{
+	struct mlx5_cq *cq = to_mcq(ibv_cq_ex_to_cq(ibcq));
+
+	return ntohl(cq->cqe64->byte_cnt);
+}
+
+static inline uint32_t mlx5_cq_read_wc_vendor_err(struct ibv_cq_ex *ibcq)
+{
+	struct mlx5_cq *cq = to_mcq(ibv_cq_ex_to_cq(ibcq));
+	struct mlx5_err_cqe *ecqe = (struct mlx5_err_cqe *)cq->cqe64;
+
+	return ecqe->vendor_err_synd;
+}
+
+static inline uint32_t mlx5_cq_read_wc_imm_data(struct ibv_cq_ex *ibcq)
+{
+	struct mlx5_cq *cq = to_mcq(ibv_cq_ex_to_cq(ibcq));
+
+	return cq->cqe64->imm_inval_pkey;
+}
+
+static inline uint32_t mlx5_cq_read_wc_slid(struct ibv_cq_ex *ibcq)
+{
+	struct mlx5_cq *cq = to_mcq(ibv_cq_ex_to_cq(ibcq));
+
+	return (uint32_t)ntohs(cq->cqe64->slid);
+}
+
+static inline uint8_t mlx5_cq_read_wc_sl(struct ibv_cq_ex *ibcq)
+{
+	struct mlx5_cq *cq = to_mcq(ibv_cq_ex_to_cq(ibcq));
+
+	return (ntohl(cq->cqe64->flags_rqpn) >> 24) & 0xf;
+}
+
+static inline uint32_t mlx5_cq_read_wc_src_qp(struct ibv_cq_ex *ibcq)
+{
+	struct mlx5_cq *cq = to_mcq(ibv_cq_ex_to_cq(ibcq));
+
+	return ntohl(cq->cqe64->flags_rqpn) & 0xffffff;
+}
+
+static inline uint8_t mlx5_cq_read_wc_dlid_path_bits(struct ibv_cq_ex *ibcq)
+{
+	struct mlx5_cq *cq = to_mcq(ibv_cq_ex_to_cq(ibcq));
+
+	return cq->cqe64->ml_path & 0x7f;
+}
+
+static inline uint64_t mlx5_cq_read_wc_completion_ts(struct ibv_cq_ex *ibcq)
+{
+	struct mlx5_cq *cq = to_mcq(ibv_cq_ex_to_cq(ibcq));
+
+	return ntohll(cq->cqe64->timestamp);
+}
+
 int mlx5_arm_cq(struct ibv_cq *ibvcq, int solicited)
 {
 	struct mlx5_cq *cq = to_mcq(ibvcq);
-- 
1.8.3.1



* [PATCH libmlx5 4/7] Add ability to poll CQs through iterator's style API
       [not found] ` <1464788882-1876-1-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
                     ` (2 preceding siblings ...)
  2016-06-01 13:47   ` [PATCH libmlx5 3/7] Add inline functions to read completion's attributes Yishai Hadas
@ 2016-06-01 13:47   ` Yishai Hadas
  2016-06-01 13:48   ` [PATCH libmlx5 5/7] Add support for creating an extended CQ Yishai Hadas
                     ` (2 subsequent siblings)
  6 siblings, 0 replies; 22+ messages in thread
From: Yishai Hadas @ 2016-06-01 13:47 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	yishaih-VPRAkNaXOzVWk0Htik3J/w, matanb-VPRAkNaXOzVWk0Htik3J/w,
	majd-VPRAkNaXOzVWk0Htik3J/w, talal-VPRAkNaXOzVWk0Htik3J/w,
	jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/

The new poll CQ API is iterator based.
The user calls start_poll_cq and next_poll_cq, queries whichever valid
and initialized attributes are needed (initialized attributes are
attributes which were stated when the CQ was created) and calls
end_poll_cq at the end.

This patch implements this methodology in the mlx5 user-space vendor
driver. In order to make start and end efficient, we use specialized
functions for every case - locked, adaptive stall and their
combinations. This spares 'if' conditions in the data path.

Signed-off-by: Yishai Hadas <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 src/cq.c   | 255 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 src/mlx5.h |   2 +
 2 files changed, 257 insertions(+)

diff --git a/src/cq.c b/src/cq.c
index 188b34e..4fa0cf1 100644
--- a/src/cq.c
+++ b/src/cq.c
@@ -840,6 +840,261 @@ static inline int poll_cq(struct ibv_cq *ibcq, int ne,
 	return err == CQ_POLL_ERR ? err : npolled;
 }
 
+enum  polling_mode {
+	POLLING_MODE_NO_STALL,
+	POLLING_MODE_STALL,
+	POLLING_MODE_STALL_ADAPTIVE
+};
+
+static inline void mlx5_end_poll(struct ibv_cq_ex *ibcq,
+				 int lock, enum polling_mode stall)
+				 __attribute__((always_inline));
+static inline void mlx5_end_poll(struct ibv_cq_ex *ibcq,
+				 int lock, enum polling_mode stall)
+{
+	struct mlx5_cq *cq = to_mcq(ibv_cq_ex_to_cq(ibcq));
+
+	update_cons_index(cq);
+
+	if (lock)
+		mlx5_spin_unlock(&cq->lock);
+
+	if (stall) {
+		if (stall == POLLING_MODE_STALL_ADAPTIVE) {
+			if (!(cq->flags & MLX5_CQ_FLAGS_FOUND_CQES)) {
+				cq->stall_cycles = max(cq->stall_cycles - mlx5_stall_cq_dec_step,
+						       mlx5_stall_cq_poll_min);
+				mlx5_get_cycles(&cq->stall_last_count);
+			} else if (cq->flags & MLX5_CQ_FLAGS_EMPTY_DURING_POLL) {
+				cq->stall_cycles = min(cq->stall_cycles + mlx5_stall_cq_inc_step,
+						       mlx5_stall_cq_poll_max);
+				mlx5_get_cycles(&cq->stall_last_count);
+			} else {
+				cq->stall_cycles = max(cq->stall_cycles - mlx5_stall_cq_dec_step,
+						       mlx5_stall_cq_poll_min);
+				cq->stall_last_count = 0;
+			}
+		} else if (!(cq->flags & MLX5_CQ_FLAGS_FOUND_CQES)) {
+			cq->stall_next_poll = 1;
+		}
+
+		cq->flags &= ~(MLX5_CQ_FLAGS_FOUND_CQES | MLX5_CQ_FLAGS_EMPTY_DURING_POLL);
+	}
+}
+
+static inline int mlx5_start_poll(struct ibv_cq_ex *ibcq, struct ibv_poll_cq_attr *attr,
+				  int lock, enum polling_mode stall, int cqe_version)
+				  __attribute__((always_inline));
+static inline int mlx5_start_poll(struct ibv_cq_ex *ibcq, struct ibv_poll_cq_attr *attr,
+				  int lock, enum polling_mode stall, int cqe_version)
+{
+	struct mlx5_cq *cq = to_mcq(ibv_cq_ex_to_cq(ibcq));
+	struct mlx5_cqe64 *cqe64;
+	void *cqe;
+	int err;
+
+	if (unlikely(attr->comp_mask))
+		return EINVAL;
+
+	if (stall) {
+		if (stall == POLLING_MODE_STALL_ADAPTIVE) {
+			if (cq->stall_last_count)
+				mlx5_stall_cycles_poll_cq(cq->stall_last_count + cq->stall_cycles);
+		} else if (cq->stall_next_poll) {
+			cq->stall_next_poll = 0;
+			mlx5_stall_poll_cq();
+		}
+	}
+
+	if (lock)
+		mlx5_spin_lock(&cq->lock);
+
+	cq->cur_rsc = NULL;
+	cq->cur_srq = NULL;
+
+	err = mlx5_get_next_cqe(cq, &cqe64, &cqe);
+	if (err == CQ_EMPTY) {
+		if (lock)
+			mlx5_spin_unlock(&cq->lock);
+
+		if (stall) {
+			if (stall == POLLING_MODE_STALL_ADAPTIVE) {
+				cq->stall_cycles = max(cq->stall_cycles - mlx5_stall_cq_dec_step,
+						mlx5_stall_cq_poll_min);
+				mlx5_get_cycles(&cq->stall_last_count);
+			} else {
+				cq->stall_next_poll = 1;
+			}
+		}
+
+		return ENOENT;
+	}
+
+	if (stall)
+		cq->flags |= MLX5_CQ_FLAGS_FOUND_CQES;
+
+	err = mlx5_parse_lazy_cqe(cq, cqe64, cqe, cqe_version);
+	if (lock && err)
+		mlx5_spin_unlock(&cq->lock);
+
+	if (stall && err) {
+		if (stall == POLLING_MODE_STALL_ADAPTIVE) {
+			cq->stall_cycles = max(cq->stall_cycles - mlx5_stall_cq_dec_step,
+						mlx5_stall_cq_poll_min);
+			cq->stall_last_count = 0;
+		}
+
+		cq->flags &= ~(MLX5_CQ_FLAGS_FOUND_CQES);
+	}
+
+	return err;
+}
+
+static inline int mlx5_next_poll(struct ibv_cq_ex *ibcq,
+				 enum polling_mode stall, int cqe_version)
+				 __attribute__((always_inline));
+static inline int mlx5_next_poll(struct ibv_cq_ex *ibcq,
+				 enum polling_mode stall,
+				 int cqe_version)
+{
+	struct mlx5_cq *cq = to_mcq(ibv_cq_ex_to_cq(ibcq));
+	struct mlx5_cqe64 *cqe64;
+	void *cqe;
+	int err;
+
+	err = mlx5_get_next_cqe(cq, &cqe64, &cqe);
+	if (err == CQ_EMPTY) {
+		if (stall == POLLING_MODE_STALL_ADAPTIVE)
+			cq->flags |= MLX5_CQ_FLAGS_EMPTY_DURING_POLL;
+
+		return ENOENT;
+	}
+
+	return mlx5_parse_lazy_cqe(cq, cqe64, cqe, cqe_version);
+}
+
+static inline int mlx5_next_poll_adaptive_stall_enable_v0(struct ibv_cq_ex *ibcq)
+{
+	return mlx5_next_poll(ibcq, POLLING_MODE_STALL_ADAPTIVE, 0);
+}
+
+static inline int mlx5_next_poll_adaptive_stall_enable_v1(struct ibv_cq_ex *ibcq)
+{
+	return mlx5_next_poll(ibcq, POLLING_MODE_STALL_ADAPTIVE, 1);
+}
+
+static inline int mlx5_next_poll_v0(struct ibv_cq_ex *ibcq)
+{
+	return mlx5_next_poll(ibcq, 0, 0);
+}
+
+static inline int mlx5_next_poll_v1(struct ibv_cq_ex *ibcq)
+{
+	return mlx5_next_poll(ibcq, 0, 1);
+}
+
+static inline int mlx5_start_poll_v0(struct ibv_cq_ex *ibcq,
+				     struct ibv_poll_cq_attr *attr)
+{
+	return mlx5_start_poll(ibcq, attr, 0, 0, 0);
+}
+
+static inline int mlx5_start_poll_v1(struct ibv_cq_ex *ibcq,
+				     struct ibv_poll_cq_attr *attr)
+{
+	return mlx5_start_poll(ibcq, attr, 0, 0, 1);
+}
+
+static inline int mlx5_start_poll_v0_lock(struct ibv_cq_ex *ibcq,
+					  struct ibv_poll_cq_attr *attr)
+{
+	return mlx5_start_poll(ibcq, attr, 1, 0, 0);
+}
+
+static inline int mlx5_start_poll_v1_lock(struct ibv_cq_ex *ibcq,
+					  struct ibv_poll_cq_attr *attr)
+{
+	return mlx5_start_poll(ibcq, attr, 1, 0, 1);
+}
+
+static inline int mlx5_start_poll_adaptive_stall_enable_v0_lock(struct ibv_cq_ex *ibcq,
+								struct ibv_poll_cq_attr *attr)
+{
+	return mlx5_start_poll(ibcq, attr, 1, POLLING_MODE_STALL_ADAPTIVE, 0);
+}
+
+static inline int mlx5_start_poll_nonadaptive_stall_enable_v0_lock(struct ibv_cq_ex *ibcq,
+								   struct ibv_poll_cq_attr *attr)
+{
+	return mlx5_start_poll(ibcq, attr, 1, POLLING_MODE_STALL, 0);
+}
+
+static inline int mlx5_start_poll_adaptive_stall_enable_v1_lock(struct ibv_cq_ex *ibcq,
+								struct ibv_poll_cq_attr *attr)
+{
+	return mlx5_start_poll(ibcq, attr, 1, POLLING_MODE_STALL_ADAPTIVE, 1);
+}
+
+static inline int mlx5_start_poll_nonadaptive_stall_enable_v1_lock(struct ibv_cq_ex *ibcq,
+								   struct ibv_poll_cq_attr *attr)
+{
+	return mlx5_start_poll(ibcq, attr, 1, POLLING_MODE_STALL, 1);
+}
+
+static inline int mlx5_start_poll_nonadaptive_stall_enable_v0(struct ibv_cq_ex *ibcq,
+							      struct ibv_poll_cq_attr *attr)
+{
+	return mlx5_start_poll(ibcq, attr, 0, POLLING_MODE_STALL, 0);
+}
+
+static inline int mlx5_start_poll_adaptive_stall_enable_v0(struct ibv_cq_ex *ibcq,
+							   struct ibv_poll_cq_attr *attr)
+{
+	return mlx5_start_poll(ibcq, attr, 0, POLLING_MODE_STALL_ADAPTIVE, 0);
+}
+
+static inline int mlx5_start_poll_adaptive_stall_enable_v1(struct ibv_cq_ex *ibcq,
+							   struct ibv_poll_cq_attr *attr)
+{
+	return mlx5_start_poll(ibcq, attr, 0, POLLING_MODE_STALL_ADAPTIVE, 1);
+}
+
+static inline int mlx5_start_poll_nonadaptive_stall_enable_v1(struct ibv_cq_ex *ibcq,
+							      struct ibv_poll_cq_attr *attr)
+{
+	return mlx5_start_poll(ibcq, attr, 0, POLLING_MODE_STALL, 1);
+}
+
+static inline void mlx5_end_poll_adaptive_stall_enable_unlock(struct ibv_cq_ex *ibcq)
+{
+	mlx5_end_poll(ibcq, 1, POLLING_MODE_STALL_ADAPTIVE);
+}
+
+static inline void mlx5_end_poll_nonadaptive_stall_enable_unlock(struct ibv_cq_ex *ibcq)
+{
+	mlx5_end_poll(ibcq, 1, POLLING_MODE_STALL);
+}
+
+static inline void mlx5_end_poll_adaptive_stall_enable(struct ibv_cq_ex *ibcq)
+{
+	mlx5_end_poll(ibcq, 0, POLLING_MODE_STALL_ADAPTIVE);
+}
+
+static inline void mlx5_end_poll_nonadaptive_stall_enable(struct ibv_cq_ex *ibcq)
+{
+	mlx5_end_poll(ibcq, 0, POLLING_MODE_STALL);
+}
+
+static inline void mlx5_end_poll_nop(struct ibv_cq_ex *ibcq)
+{
+	mlx5_end_poll(ibcq, 0, 0);
+}
+
+static inline void mlx5_end_poll_unlock(struct ibv_cq_ex *ibcq)
+{
+	mlx5_end_poll(ibcq, 1, 0);
+}
+
 int mlx5_poll_cq(struct ibv_cq *ibcq, int ne, struct ibv_wc *wc)
 {
 	return poll_cq(ibcq, ne, wc, 0);
diff --git a/src/mlx5.h b/src/mlx5.h
index 99bee10..15510cf 100644
--- a/src/mlx5.h
+++ b/src/mlx5.h
@@ -366,6 +366,8 @@ enum {
 
 enum {
 	MLX5_CQ_FLAGS_RX_CSUM_VALID = 1 << 0,
+	MLX5_CQ_FLAGS_EMPTY_DURING_POLL = 1 << 1,
+	MLX5_CQ_FLAGS_FOUND_CQES = 1 << 2,
 };
 
 struct mlx5_cq {
-- 
1.8.3.1



* [PATCH libmlx5 5/7] Add support for creating an extended CQ
       [not found] ` <1464788882-1876-1-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
                     ` (3 preceding siblings ...)
  2016-06-01 13:47   ` [PATCH libmlx5 4/7] Add ability to poll CQs through iterator's style API Yishai Hadas
@ 2016-06-01 13:48   ` Yishai Hadas
       [not found]     ` <1464788882-1876-6-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2016-06-01 13:48   ` [PATCH libmlx5 6/7] Add ibv_query_rt_values support Yishai Hadas
  2016-06-01 13:48   ` [PATCH libmlx5 7/7] Use configuration symbol for always in-line Yishai Hadas
  6 siblings, 1 reply; 22+ messages in thread
From: Yishai Hadas @ 2016-06-01 13:48 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	yishaih-VPRAkNaXOzVWk0Htik3J/w, matanb-VPRAkNaXOzVWk0Htik3J/w,
	majd-VPRAkNaXOzVWk0Htik3J/w, talal-VPRAkNaXOzVWk0Htik3J/w,
	jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/

From: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

This patch adds support for creating an extended CQ.
This means we support:
- The new polling mechanism.
- A single-threaded CQ, which thus doesn't waste CPU cycles on locking.
- Getting the completion timestamp from the CQ.

Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Reviewed-by: Yishai Hadas <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 src/cq.c    | 109 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 src/mlx5.c  |   1 +
 src/mlx5.h  |   5 +++
 src/verbs.c |  91 +++++++++++++++++++++++++++++++++++++++++---------
 4 files changed, 190 insertions(+), 16 deletions(-)

diff --git a/src/cq.c b/src/cq.c
index 4fa0cf1..de91f07 100644
--- a/src/cq.c
+++ b/src/cq.c
@@ -1226,6 +1226,115 @@ static inline uint64_t mlx5_cq_read_wc_completion_ts(struct ibv_cq_ex *ibcq)
 	return ntohll(cq->cqe64->timestamp);
 }
 
+void mlx5_cq_fill_pfns(struct mlx5_cq *cq, const struct ibv_cq_init_attr_ex *cq_attr)
+{
+	struct mlx5_context *mctx = to_mctx(ibv_cq_ex_to_cq(&cq->ibv_cq)->context);
+
+	if (mctx->cqe_version) {
+		if (cq->flags & MLX5_CQ_FLAGS_SINGLE_THREADED) {
+			if (cq->stall_enable) {
+				if (cq->stall_adaptive_enable) {
+					cq->ibv_cq.start_poll =
+						mlx5_start_poll_adaptive_stall_enable_v1;
+					cq->ibv_cq.end_poll =
+						mlx5_end_poll_adaptive_stall_enable;
+				} else {
+					cq->ibv_cq.start_poll =
+						mlx5_start_poll_nonadaptive_stall_enable_v1;
+					cq->ibv_cq.end_poll =
+						mlx5_end_poll_nonadaptive_stall_enable;
+				}
+			} else {
+				cq->ibv_cq.start_poll = mlx5_start_poll_v1;
+				cq->ibv_cq.end_poll = mlx5_end_poll_nop;
+			}
+		} else {
+			if (cq->stall_enable) {
+				if (cq->stall_adaptive_enable) {
+					cq->ibv_cq.start_poll =
+						mlx5_start_poll_adaptive_stall_enable_v1_lock;
+					cq->ibv_cq.end_poll =
+						mlx5_end_poll_adaptive_stall_enable_unlock;
+				} else {
+					cq->ibv_cq.start_poll =
+						mlx5_start_poll_nonadaptive_stall_enable_v1_lock;
+					cq->ibv_cq.end_poll =
+						mlx5_end_poll_nonadaptive_stall_enable_unlock;
+				}
+			} else {
+				cq->ibv_cq.start_poll = mlx5_start_poll_v1_lock;
+				cq->ibv_cq.end_poll = mlx5_end_poll_unlock;
+			}
+		}
+
+		if (!cq->stall_adaptive_enable)
+			cq->ibv_cq.next_poll = mlx5_next_poll_v1;
+		else
+			cq->ibv_cq.next_poll = mlx5_next_poll_adaptive_stall_enable_v1;
+	} else {
+		if (cq->flags & MLX5_CQ_FLAGS_SINGLE_THREADED) {
+			if (cq->stall_enable) {
+				if (cq->stall_adaptive_enable) {
+					cq->ibv_cq.start_poll =
+						mlx5_start_poll_adaptive_stall_enable_v0;
+					cq->ibv_cq.end_poll =
+						mlx5_end_poll_adaptive_stall_enable;
+				} else {
+					cq->ibv_cq.start_poll =
+						mlx5_start_poll_nonadaptive_stall_enable_v0;
+					cq->ibv_cq.end_poll =
+						mlx5_end_poll_nonadaptive_stall_enable;
+				}
+			} else {
+				cq->ibv_cq.start_poll = mlx5_start_poll_v0;
+				cq->ibv_cq.end_poll = mlx5_end_poll_nop;
+			}
+		} else {
+			if (cq->stall_enable) {
+				if (cq->stall_adaptive_enable) {
+					cq->ibv_cq.start_poll =
+						mlx5_start_poll_adaptive_stall_enable_v0_lock;
+					cq->ibv_cq.end_poll =
+						mlx5_end_poll_adaptive_stall_enable_unlock;
+				} else {
+					cq->ibv_cq.start_poll =
+						mlx5_start_poll_nonadaptive_stall_enable_v0_lock;
+					cq->ibv_cq.end_poll =
+						mlx5_end_poll_nonadaptive_stall_enable_unlock;
+				}
+			} else {
+				cq->ibv_cq.start_poll = mlx5_start_poll_v0_lock;
+				cq->ibv_cq.end_poll = mlx5_end_poll_unlock;
+			}
+		}
+
+		if (!cq->stall_adaptive_enable)
+			cq->ibv_cq.next_poll = mlx5_next_poll_v0;
+		else
+			cq->ibv_cq.next_poll = mlx5_next_poll_adaptive_stall_enable_v0;
+	}
+
+	cq->ibv_cq.read_opcode = mlx5_cq_read_wc_opcode;
+	cq->ibv_cq.read_vendor_err = mlx5_cq_read_wc_vendor_err;
+	cq->ibv_cq.read_wc_flags = mlx5_cq_read_wc_flags;
+	if (cq_attr->wc_flags & IBV_WC_EX_WITH_BYTE_LEN)
+		cq->ibv_cq.read_byte_len = mlx5_cq_read_wc_byte_len;
+	if (cq_attr->wc_flags & IBV_WC_EX_WITH_IMM)
+		cq->ibv_cq.read_imm_data = mlx5_cq_read_wc_imm_data;
+	if (cq_attr->wc_flags & IBV_WC_EX_WITH_QP_NUM)
+		cq->ibv_cq.read_qp_num = mlx5_cq_read_wc_qp_num;
+	if (cq_attr->wc_flags & IBV_WC_EX_WITH_SRC_QP)
+		cq->ibv_cq.read_src_qp = mlx5_cq_read_wc_src_qp;
+	if (cq_attr->wc_flags & IBV_WC_EX_WITH_SLID)
+		cq->ibv_cq.read_slid = mlx5_cq_read_wc_slid;
+	if (cq_attr->wc_flags & IBV_WC_EX_WITH_SL)
+		cq->ibv_cq.read_sl = mlx5_cq_read_wc_sl;
+	if (cq_attr->wc_flags & IBV_WC_EX_WITH_DLID_PATH_BITS)
+		cq->ibv_cq.read_dlid_path_bits = mlx5_cq_read_wc_dlid_path_bits;
+	if (cq_attr->wc_flags & IBV_WC_EX_WITH_COMPLETION_TIMESTAMP)
+		cq->ibv_cq.read_completion_ts = mlx5_cq_read_wc_completion_ts;
+}
+
 int mlx5_arm_cq(struct ibv_cq *ibvcq, int solicited)
 {
 	struct mlx5_cq *cq = to_mcq(ibvcq);
diff --git a/src/mlx5.c b/src/mlx5.c
index 7bd01bd..d7a6a8f 100644
--- a/src/mlx5.c
+++ b/src/mlx5.c
@@ -702,6 +702,7 @@ static int mlx5_init_context(struct verbs_device *vdev,
 	verbs_set_ctx_op(v_ctx, query_device_ex, mlx5_query_device_ex);
 	verbs_set_ctx_op(v_ctx, ibv_create_flow, ibv_cmd_create_flow);
 	verbs_set_ctx_op(v_ctx, ibv_destroy_flow, ibv_cmd_destroy_flow);
+	verbs_set_ctx_op(v_ctx, create_cq_ex, mlx5_create_cq_ex);
 
 	memset(&device_attr, 0, sizeof(device_attr));
 	if (!mlx5_query_device(ctx, &device_attr)) {
diff --git a/src/mlx5.h b/src/mlx5.h
index 15510cf..506ec0a 100644
--- a/src/mlx5.h
+++ b/src/mlx5.h
@@ -368,6 +368,8 @@ enum {
 	MLX5_CQ_FLAGS_RX_CSUM_VALID = 1 << 0,
 	MLX5_CQ_FLAGS_EMPTY_DURING_POLL = 1 << 1,
 	MLX5_CQ_FLAGS_FOUND_CQES = 1 << 2,
+	MLX5_CQ_FLAGS_EXTENDED = 1 << 3,
+	MLX5_CQ_FLAGS_SINGLE_THREADED = 1 << 4,
 };
 
 struct mlx5_cq {
@@ -635,6 +637,9 @@ int mlx5_dereg_mr(struct ibv_mr *mr);
 struct ibv_cq *mlx5_create_cq(struct ibv_context *context, int cqe,
 			       struct ibv_comp_channel *channel,
 			       int comp_vector);
+struct ibv_cq_ex *mlx5_create_cq_ex(struct ibv_context *context,
+				    struct ibv_cq_init_attr_ex *cq_attr);
+void mlx5_cq_fill_pfns(struct mlx5_cq *cq, const struct ibv_cq_init_attr_ex *cq_attr);
 int mlx5_alloc_cq_buf(struct mlx5_context *mctx, struct mlx5_cq *cq,
 		      struct mlx5_buf *buf, int nent, int cqe_sz);
 int mlx5_free_cq_buf(struct mlx5_context *ctx, struct mlx5_buf *buf);
diff --git a/src/verbs.c b/src/verbs.c
index e78d2a5..6f2ef00 100644
--- a/src/verbs.c
+++ b/src/verbs.c
@@ -254,9 +254,22 @@ static int qp_sig_enabled(void)
 	return 0;
 }
 
-struct ibv_cq *mlx5_create_cq(struct ibv_context *context, int cqe,
-			      struct ibv_comp_channel *channel,
-			      int comp_vector)
+enum {
+	CREATE_CQ_SUPPORTED_WC_FLAGS = IBV_WC_STANDARD_FLAGS	|
+				       IBV_WC_EX_WITH_COMPLETION_TIMESTAMP
+};
+
+enum {
+	CREATE_CQ_SUPPORTED_COMP_MASK = IBV_CQ_INIT_ATTR_MASK_FLAGS
+};
+
+enum {
+	CREATE_CQ_SUPPORTED_FLAGS = IBV_CREATE_CQ_ATTR_SINGLE_THREADED
+};
+
+static struct ibv_cq_ex *create_cq(struct ibv_context *context,
+				   const struct ibv_cq_init_attr_ex *cq_attr,
+				   int cq_alloc_flags)
 {
 	struct mlx5_create_cq		cmd;
 	struct mlx5_create_cq_resp	resp;
@@ -268,12 +281,33 @@ struct ibv_cq *mlx5_create_cq(struct ibv_context *context, int cqe,
 	FILE *fp = to_mctx(context)->dbg_fp;
 #endif
 
-	if (!cqe) {
-		mlx5_dbg(fp, MLX5_DBG_CQ, "\n");
+	if (!cq_attr->cqe) {
+		mlx5_dbg(fp, MLX5_DBG_CQ, "CQE invalid\n");
+		errno = EINVAL;
+		return NULL;
+	}
+
+	if (cq_attr->comp_mask & ~CREATE_CQ_SUPPORTED_COMP_MASK) {
+		mlx5_dbg(fp, MLX5_DBG_CQ,
+			 "Unsupported comp_mask for create_cq\n");
+		errno = EINVAL;
+		return NULL;
+	}
+
+	if (cq_attr->comp_mask & IBV_CQ_INIT_ATTR_MASK_FLAGS &&
+	    cq_attr->flags & ~CREATE_CQ_SUPPORTED_FLAGS) {
+		mlx5_dbg(fp, MLX5_DBG_CQ,
+			 "Unsupported creation flags requested for create_cq\n");
 		errno = EINVAL;
 		return NULL;
 	}
 
+	if (cq_attr->wc_flags & ~CREATE_CQ_SUPPORTED_WC_FLAGS) {
+		mlx5_dbg(fp, MLX5_DBG_CQ, "\n");
+		errno = ENOTSUP;
+		return NULL;
+	}
+
 	cq =  calloc(1, sizeof *cq);
 	if (!cq) {
 		mlx5_dbg(fp, MLX5_DBG_CQ, "\n");
@@ -286,15 +320,8 @@ struct ibv_cq *mlx5_create_cq(struct ibv_context *context, int cqe,
 	if (mlx5_spinlock_init(&cq->lock))
 		goto err;
 
-	/* The additional entry is required for resize CQ */
-	if (cqe <= 0) {
-		mlx5_dbg(fp, MLX5_DBG_CQ, "\n");
-		errno = EINVAL;
-		goto err_spl;
-	}
-
-	ncqe = align_queue_size(cqe + 1);
-	if ((ncqe > (1 << 24)) || (ncqe < (cqe + 1))) {
+	ncqe = align_queue_size(cq_attr->cqe + 1);
+	if ((ncqe > (1 << 24)) || (ncqe < (cq_attr->cqe + 1))) {
 		mlx5_dbg(fp, MLX5_DBG_CQ, "ncqe %d\n", ncqe);
 		errno = EINVAL;
 		goto err_spl;
@@ -322,12 +349,17 @@ struct ibv_cq *mlx5_create_cq(struct ibv_context *context, int cqe,
 	cq->dbrec[MLX5_CQ_ARM_DB]	= 0;
 	cq->arm_sn			= 0;
 	cq->cqe_sz			= cqe_sz;
+	cq->flags			= cq_alloc_flags;
 
+	if (cq_attr->comp_mask & IBV_CQ_INIT_ATTR_MASK_FLAGS &&
+	    cq_attr->flags & IBV_CREATE_CQ_ATTR_SINGLE_THREADED)
+		cq->flags |= MLX5_CQ_FLAGS_SINGLE_THREADED;
 	cmd.buf_addr = (uintptr_t) cq->buf_a.buf;
 	cmd.db_addr  = (uintptr_t) cq->dbrec;
 	cmd.cqe_size = cqe_sz;
 
-	ret = ibv_cmd_create_cq(context, ncqe - 1, channel, comp_vector,
+	ret = ibv_cmd_create_cq(context, ncqe - 1, cq_attr->channel,
+				cq_attr->comp_vector,
 				ibv_cq_ex_to_cq(&cq->ibv_cq), &cmd.ibv_cmd,
 				sizeof(cmd), &resp.ibv_resp, sizeof(resp));
 	if (ret) {
@@ -342,7 +374,10 @@ struct ibv_cq *mlx5_create_cq(struct ibv_context *context, int cqe,
 	cq->stall_adaptive_enable = to_mctx(context)->stall_adaptive_enable;
 	cq->stall_cycles = to_mctx(context)->stall_cycles;
 
-	return ibv_cq_ex_to_cq(&cq->ibv_cq);
+	if (cq_alloc_flags & MLX5_CQ_FLAGS_EXTENDED)
+		mlx5_cq_fill_pfns(cq, cq_attr);
+
+	return &cq->ibv_cq;
 
 err_db:
 	mlx5_free_db(to_mctx(context), cq->dbrec);
@@ -359,6 +394,30 @@ err:
 	return NULL;
 }
 
+struct ibv_cq *mlx5_create_cq(struct ibv_context *context, int cqe,
+			      struct ibv_comp_channel *channel,
+			      int comp_vector)
+{
+	struct ibv_cq_ex *cq;
+	struct ibv_cq_init_attr_ex cq_attr = {.cqe = cqe, .channel = channel,
+						.comp_vector = comp_vector,
+						.wc_flags = IBV_WC_STANDARD_FLAGS};
+
+	if (cqe <= 0) {
+		errno = EINVAL;
+		return NULL;
+	}
+
+	cq = create_cq(context, &cq_attr, 0);
+	return cq ? ibv_cq_ex_to_cq(cq) : NULL;
+}
+
+struct ibv_cq_ex *mlx5_create_cq_ex(struct ibv_context *context,
+				    struct ibv_cq_init_attr_ex *cq_attr)
+{
+	return create_cq(context, cq_attr, MLX5_CQ_FLAGS_EXTENDED);
+}
+
 int mlx5_resize_cq(struct ibv_cq *ibcq, int cqe)
 {
 	struct mlx5_cq *cq = to_mcq(ibcq);
-- 
1.8.3.1



* [PATCH libmlx5 6/7] Add ibv_query_rt_values support
       [not found] ` <1464788882-1876-1-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
                     ` (4 preceding siblings ...)
  2016-06-01 13:48   ` [PATCH libmlx5 5/7] Add support for creating an extended CQ Yishai Hadas
@ 2016-06-01 13:48   ` Yishai Hadas
  2016-06-01 13:48   ` [PATCH libmlx5 7/7] Use configuration symbol for always in-line Yishai Hadas
  6 siblings, 0 replies; 22+ messages in thread
From: Yishai Hadas @ 2016-06-01 13:48 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	yishaih-VPRAkNaXOzVWk0Htik3J/w, matanb-VPRAkNaXOzVWk0Htik3J/w,
	majd-VPRAkNaXOzVWk0Htik3J/w, talal-VPRAkNaXOzVWk0Htik3J/w,
	jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/

From: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

In order to query the HCA's current core clock, libmlx5 should
support the ibv_query_rt_values verb. Querying the hardware's cycles
register is done by mmapping this register into user space.
Therefore, when libmlx5 initializes, we mmap the cycles register.
This assumes the machine's architecture places the PCI and memory in
the same address space.
The page offset is passed through the init_context vendor data.

Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Reviewed-by: Yishai Hadas <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 src/mlx5-abi.h |  5 +++++
 src/mlx5.c     | 37 +++++++++++++++++++++++++++++++++++++
 src/mlx5.h     | 10 +++++++++-
 src/verbs.c    | 46 ++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 97 insertions(+), 1 deletion(-)

diff --git a/src/mlx5-abi.h b/src/mlx5-abi.h
index e2815c0..b57fd55 100644
--- a/src/mlx5-abi.h
+++ b/src/mlx5-abi.h
@@ -62,6 +62,10 @@ struct mlx5_alloc_ucontext {
 	__u32				reserved2;
 };
 
+enum mlx5_ib_alloc_ucontext_resp_mask {
+	MLX5_IB_ALLOC_UCONTEXT_RESP_MASK_CORE_CLOCK_OFFSET = 1UL << 0,
+};
+
 struct mlx5_alloc_ucontext_resp {
 	struct ibv_get_context_resp	ibv_resp;
 	__u32				qp_tab_size;
@@ -80,6 +84,7 @@ struct mlx5_alloc_ucontext_resp {
 	__u8				cqe_version;
 	__u8				reserved2;
 	__u16				reserved3;
+	__u64				hca_core_clock_offset;
 };
 
 struct mlx5_alloc_pd_resp {
diff --git a/src/mlx5.c b/src/mlx5.c
index d7a6a8f..2d3f9d9 100644
--- a/src/mlx5.c
+++ b/src/mlx5.c
@@ -555,6 +555,30 @@ static int mlx5_cmd_get_context(struct mlx5_context *context,
 				   &resp->ibv_resp, resp_len);
 }
 
+static int mlx5_map_internal_clock(struct mlx5_device *mdev,
+				   struct ibv_context *ibv_ctx)
+{
+	struct mlx5_context *context = to_mctx(ibv_ctx);
+	void *hca_clock_page;
+	off_t offset = 0;
+
+	set_command(MLX5_MMAP_GET_CORE_CLOCK_CMD, &offset);
+	hca_clock_page = mmap(NULL, mdev->page_size,
+			      PROT_READ, MAP_SHARED, ibv_ctx->cmd_fd,
+			      mdev->page_size * offset);
+
+	if (hca_clock_page == MAP_FAILED) {
+		fprintf(stderr, PFX
+			"Warning: Timestamp available,\n"
+			"but failed to mmap() hca core clock page.\n");
+		return -1;
+	}
+
+	context->hca_core_clock = hca_clock_page +
+		(context->core_clock.offset & (mdev->page_size - 1));
+	return 0;
+}
+
 static int mlx5_init_context(struct verbs_device *vdev,
 			     struct ibv_context *ctx, int cmd_fd)
 {
@@ -683,6 +707,15 @@ static int mlx5_init_context(struct verbs_device *vdev,
 		context->bfs[j].uuarn = j;
 	}
 
+	context->hca_core_clock = NULL;
+	if (resp.response_length + sizeof(resp.ibv_resp) >=
+	    offsetof(struct mlx5_alloc_ucontext_resp, hca_core_clock_offset) +
+	    sizeof(resp.hca_core_clock_offset) &&
+	    resp.comp_mask & MLX5_IB_ALLOC_UCONTEXT_RESP_MASK_CORE_CLOCK_OFFSET) {
+		context->core_clock.offset = resp.hca_core_clock_offset;
+		mlx5_map_internal_clock(mdev, ctx);
+	}
+
 	mlx5_spinlock_init(&context->lock32);
 
 	context->prefer_bf = get_always_bf();
@@ -700,6 +733,7 @@ static int mlx5_init_context(struct verbs_device *vdev,
 	verbs_set_ctx_op(v_ctx, create_srq_ex, mlx5_create_srq_ex);
 	verbs_set_ctx_op(v_ctx, get_srq_num, mlx5_get_srq_num);
 	verbs_set_ctx_op(v_ctx, query_device_ex, mlx5_query_device_ex);
+	verbs_set_ctx_op(v_ctx, query_rt_values, mlx5_query_rt_values);
 	verbs_set_ctx_op(v_ctx, ibv_create_flow, ibv_cmd_create_flow);
 	verbs_set_ctx_op(v_ctx, ibv_destroy_flow, ibv_cmd_destroy_flow);
 	verbs_set_ctx_op(v_ctx, create_cq_ex, mlx5_create_cq_ex);
@@ -742,6 +776,9 @@ static void mlx5_cleanup_context(struct verbs_device *device,
 		if (context->uar[i])
 			munmap(context->uar[i], page_size);
 	}
+	if (context->hca_core_clock)
+		munmap(context->hca_core_clock - context->core_clock.offset,
+		       page_size);
 	close_debug_file(context);
 }
 
diff --git a/src/mlx5.h b/src/mlx5.h
index 506ec0a..78357d3 100644
--- a/src/mlx5.h
+++ b/src/mlx5.h
@@ -117,7 +117,8 @@ enum {
 
 enum {
 	MLX5_MMAP_GET_REGULAR_PAGES_CMD    = 0,
-	MLX5_MMAP_GET_CONTIGUOUS_PAGES_CMD = 1
+	MLX5_MMAP_GET_CONTIGUOUS_PAGES_CMD = 1,
+	MLX5_MMAP_GET_CORE_CLOCK_CMD    = 5
 };
 
 enum {
@@ -328,6 +329,11 @@ struct mlx5_context {
 	uint8_t				cached_link_layer[MLX5_MAX_PORTS_NUM];
 	int				cached_device_cap_flags;
 	enum ibv_atomic_cap		atomic_cap;
+	struct {
+		uint64_t                offset;
+		uint64_t                mask;
+	} core_clock;
+	void			       *hca_core_clock;
 };
 
 struct mlx5_bitmap {
@@ -620,6 +626,8 @@ int mlx5_query_device_ex(struct ibv_context *context,
 			 const struct ibv_query_device_ex_input *input,
 			 struct ibv_device_attr_ex *attr,
 			 size_t attr_size);
+int mlx5_query_rt_values(struct ibv_context *context,
+			 struct ibv_values_ex *values);
 struct ibv_qp *mlx5_create_qp_ex(struct ibv_context *context,
 				 struct ibv_qp_init_attr_ex *attr);
 int mlx5_query_port(struct ibv_context *context, uint8_t port,
diff --git a/src/verbs.c b/src/verbs.c
index 6f2ef00..e8873da 100644
--- a/src/verbs.c
+++ b/src/verbs.c
@@ -79,6 +79,52 @@ int mlx5_query_device(struct ibv_context *context, struct ibv_device_attr *attr)
 	return 0;
 }
 
+#define READL(ptr) (*((uint32_t *)(ptr)))
+static int mlx5_read_clock(struct ibv_context *context, uint64_t *cycles)
+{
+	unsigned int clockhi, clocklo, clockhi1;
+	int i;
+	struct mlx5_context *ctx = to_mctx(context);
+
+	if (!ctx->hca_core_clock)
+		return -EOPNOTSUPP;
+
+	/* Handle wraparound */
+	for (i = 0; i < 2; i++) {
+		clockhi = ntohl(READL(ctx->hca_core_clock));
+		clocklo = ntohl(READL(ctx->hca_core_clock + 4));
+		clockhi1 = ntohl(READL(ctx->hca_core_clock));
+		if (clockhi == clockhi1)
+			break;
+	}
+
+	*cycles = (uint64_t)clockhi << 32 | (uint64_t)clocklo;
+
+	return 0;
+}
+
+int mlx5_query_rt_values(struct ibv_context *context,
+			 struct ibv_values_ex *values)
+{
+	uint32_t comp_mask = 0;
+	int err = 0;
+
+	if (values->comp_mask & IBV_VALUES_MASK_RAW_CLOCK) {
+		uint64_t cycles;
+
+		err = mlx5_read_clock(context, &cycles);
+		if (!err) {
+			values->raw_clock.tv_sec = 0;
+			values->raw_clock.tv_nsec = cycles;
+			comp_mask |= IBV_VALUES_MASK_RAW_CLOCK;
+		}
+	}
+
+	values->comp_mask = comp_mask;
+
+	return err;
+}
+
 int mlx5_query_port(struct ibv_context *context, uint8_t port,
 		     struct ibv_port_attr *attr)
 {
-- 
1.8.3.1



* [PATCH libmlx5 7/7] Use configuration symbol for always in-line
       [not found] ` <1464788882-1876-1-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
                     ` (5 preceding siblings ...)
  2016-06-01 13:48   ` [PATCH libmlx5 6/7] Add ibv_query_rt_values support Yishai Hadas
@ 2016-06-01 13:48   ` Yishai Hadas
       [not found]     ` <1464788882-1876-8-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  6 siblings, 1 reply; 22+ messages in thread
From: Yishai Hadas @ 2016-06-01 13:48 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	yishaih-VPRAkNaXOzVWk0Htik3J/w, matanb-VPRAkNaXOzVWk0Htik3J/w,
	majd-VPRAkNaXOzVWk0Htik3J/w, talal-VPRAkNaXOzVWk0Htik3J/w,
	jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/

From: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

In order to be compiler-agnostic, we add the required check for the
always_inline function attribute to the autoconf script.

Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Reviewed-by: Yishai Hadas <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 Makefile.am                 |   1 +
 configure.ac                |   3 +
 m4/ax_gcc_func_attribute.m4 | 223 ++++++++++++++++++++++++++++++++++++++++++++
 src/cq.c                    |  24 ++---
 src/mlx5.h                  |   6 ++
 5 files changed, 245 insertions(+), 12 deletions(-)
 create mode 100644 m4/ax_gcc_func_attribute.m4

diff --git a/Makefile.am b/Makefile.am
index d44a4bc..39ca65d 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -1,4 +1,5 @@
 AM_CFLAGS = -g -Wall -D_GNU_SOURCE
+ACLOCAL_AMFLAGS = -I m4
 
 mlx5_version_script = @MLX5_VERSION_SCRIPT@
 
diff --git a/configure.ac b/configure.ac
index fca0b46..8ece649 100644
--- a/configure.ac
+++ b/configure.ac
@@ -6,6 +6,7 @@ AC_CONFIG_SRCDIR([src/mlx5.h])
 AC_CONFIG_AUX_DIR(config)
 AC_CONFIG_HEADER(config.h)
 AM_INIT_AUTOMAKE([foreign])
+AC_CONFIG_MACRO_DIR([m4])
 m4_ifdef([AM_SILENT_RULES], [AM_SILENT_RULES([yes])])
 
 AC_PROG_LIBTOOL
@@ -84,6 +85,8 @@ AC_CACHE_CHECK(whether ld accepts --version-script, ac_cv_version_script,
         ac_cv_version_script=no
     fi])
 
+AX_GCC_FUNC_ATTRIBUTE([always_inline])
+
 if test $ac_cv_version_script = yes; then
     MLX5_VERSION_SCRIPT='-Wl,--version-script=$(srcdir)/src/mlx5.map'
 else
diff --git a/m4/ax_gcc_func_attribute.m4 b/m4/ax_gcc_func_attribute.m4
new file mode 100644
index 0000000..c788ca9
--- /dev/null
+++ b/m4/ax_gcc_func_attribute.m4
@@ -0,0 +1,223 @@
+# ===========================================================================
+#   http://www.gnu.org/software/autoconf-archive/ax_gcc_func_attribute.html
+# ===========================================================================
+#
+# SYNOPSIS
+#
+#   AX_GCC_FUNC_ATTRIBUTE(ATTRIBUTE)
+#
+# DESCRIPTION
+#
+#   This macro checks if the compiler supports one of GCC's function
+#   attributes; many other compilers also provide function attributes with
+#   the same syntax. Compiler warnings are used to detect supported
+#   attributes as unsupported ones are ignored by default so quieting
+#   warnings when using this macro will yield false positives.
+#
+#   The ATTRIBUTE parameter holds the name of the attribute to be checked.
+#
+#   If ATTRIBUTE is supported define HAVE_FUNC_ATTRIBUTE_<ATTRIBUTE>.
+#
+#   The macro caches its result in the ax_cv_have_func_attribute_<attribute>
+#   variable.
+#
+#   The macro currently supports the following function attributes:
+#
+#    alias
+#    aligned
+#    alloc_size
+#    always_inline
+#    artificial
+#    cold
+#    const
+#    constructor
+#    constructor_priority for constructor attribute with priority
+#    deprecated
+#    destructor
+#    dllexport
+#    dllimport
+#    error
+#    externally_visible
+#    flatten
+#    format
+#    format_arg
+#    gnu_inline
+#    hot
+#    ifunc
+#    leaf
+#    malloc
+#    noclone
+#    noinline
+#    nonnull
+#    noreturn
+#    nothrow
+#    optimize
+#    pure
+#    unused
+#    used
+#    visibility
+#    warning
+#    warn_unused_result
+#    weak
+#    weakref
+#
+#   Unsuppored function attributes will be tested with a prototype returning
+#   an int and not accepting any arguments and the result of the check might
+#   be wrong or meaningless so use with care.
+#
+# LICENSE
+#
+#   Copyright (c) 2013 Gabriele Svelto <gabriele.svelto-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
+#
+#   Copying and distribution of this file, with or without modification, are
+#   permitted in any medium without royalty provided the copyright notice
+#   and this notice are preserved.  This file is offered as-is, without any
+#   warranty.
+
+#serial 3
+
+AC_DEFUN([AX_GCC_FUNC_ATTRIBUTE], [
+    AS_VAR_PUSHDEF([ac_var], [ax_cv_have_func_attribute_$1])
+
+    AC_CACHE_CHECK([for __attribute__(($1))], [ac_var], [
+        AC_LINK_IFELSE([AC_LANG_PROGRAM([
+            m4_case([$1],
+                [alias], [
+                    int foo( void ) { return 0; }
+                    int bar( void ) __attribute__(($1("foo")));
+                ],
+                [aligned], [
+                    int foo( void ) __attribute__(($1(32)));
+                ],
+                [alloc_size], [
+                    void *foo(int a) __attribute__(($1(1)));
+                ],
+                [always_inline], [
+                    inline __attribute__(($1)) int foo( void ) { return 0; }
+                ],
+                [artificial], [
+                    inline __attribute__(($1)) int foo( void ) { return 0; }
+                ],
+                [cold], [
+                    int foo( void ) __attribute__(($1));
+                ],
+                [const], [
+                    int foo( void ) __attribute__(($1));
+                ],
+                [constructor_priority], [
+                    int foo( void ) __attribute__((__constructor__(65535/2)));
+                ],
+                [constructor], [
+                    int foo( void ) __attribute__(($1));
+                ],
+                [deprecated], [
+                    int foo( void ) __attribute__(($1("")));
+                ],
+                [destructor], [
+                    int foo( void ) __attribute__(($1));
+                ],
+                [dllexport], [
+                    __attribute__(($1)) int foo( void ) { return 0; }
+                ],
+                [dllimport], [
+                    int foo( void ) __attribute__(($1));
+                ],
+                [error], [
+                    int foo( void ) __attribute__(($1("")));
+                ],
+                [externally_visible], [
+                    int foo( void ) __attribute__(($1));
+                ],
+                [flatten], [
+                    int foo( void ) __attribute__(($1));
+                ],
+                [format], [
+                    int foo(const char *p, ...) __attribute__(($1(printf, 1, 2)));
+                ],
+                [format_arg], [
+                    char *foo(const char *p) __attribute__(($1(1)));
+                ],
+                [gnu_inline], [
+                    inline __attribute__(($1)) int foo( void ) { return 0; }
+                ],
+                [hot], [
+                    int foo( void ) __attribute__(($1));
+                ],
+                [ifunc], [
+                    int my_foo( void ) { return 0; }
+                    static int (*resolve_foo(void))(void) { return my_foo; }
+                    int foo( void ) __attribute__(($1("resolve_foo")));
+                ],
+                [leaf], [
+                    __attribute__(($1)) int foo( void ) { return 0; }
+                ],
+                [malloc], [
+                    void *foo( void ) __attribute__(($1));
+                ],
+                [noclone], [
+                    int foo( void ) __attribute__(($1));
+                ],
+                [noinline], [
+                    __attribute__(($1)) int foo( void ) { return 0; }
+                ],
+                [nonnull], [
+                    int foo(char *p) __attribute__(($1(1)));
+                ],
+                [noreturn], [
+                    void foo( void ) __attribute__(($1));
+                ],
+                [nothrow], [
+                    int foo( void ) __attribute__(($1));
+                ],
+                [optimize], [
+                    __attribute__(($1(3))) int foo( void ) { return 0; }
+                ],
+                [pure], [
+                    int foo( void ) __attribute__(($1));
+                ],
+                [unused], [
+                    int foo( void ) __attribute__(($1));
+                ],
+                [used], [
+                    int foo( void ) __attribute__(($1));
+                ],
+                [visibility], [
+                    int foo_def( void ) __attribute__(($1("default")));
+                    int foo_hid( void ) __attribute__(($1("hidden")));
+                    int foo_int( void ) __attribute__(($1("internal")));
+                    int foo_pro( void ) __attribute__(($1("protected")));
+                ],
+                [warning], [
+                    int foo( void ) __attribute__(($1("")));
+                ],
+                [warn_unused_result], [
+                    int foo( void ) __attribute__(($1));
+                ],
+                [weak], [
+                    int foo( void ) __attribute__(($1));
+                ],
+                [weakref], [
+                    static int foo( void ) { return 0; }
+                    static int bar( void ) __attribute__(($1("foo")));
+                ],
+                [
+                 m4_warn([syntax], [Unsupported attribute $1, the test may fail])
+                 int foo( void ) __attribute__(($1));
+                ]
+            )], [])
+            ],
+            dnl GCC doesn't exit with an error if an unknown attribute is
+            dnl provided but only outputs a warning, so accept the attribute
+            dnl only if no warnings were issued.
+            [AS_IF([test -s conftest.err],
+                [AS_VAR_SET([ac_var], [no])],
+                [AS_VAR_SET([ac_var], [yes])])],
+            [AS_VAR_SET([ac_var], [no])])
+    ])
+
+    AS_IF([test yes = AS_VAR_GET([ac_var])],
+        [AC_DEFINE_UNQUOTED(AS_TR_CPP(HAVE_FUNC_ATTRIBUTE_$1), 1,
+            [Define to 1 if the system has the `$1' function attribute])], [])
+
+    AS_VAR_POPDEF([ac_var])
+])
diff --git a/src/cq.c b/src/cq.c
index de91f07..be66cfd 100644
--- a/src/cq.c
+++ b/src/cq.c
@@ -436,7 +436,7 @@ static void mlx5_get_cycles(uint64_t *cycles)
 static inline struct mlx5_qp *get_req_context(struct mlx5_context *mctx,
 					      struct mlx5_resource **cur_rsc,
 					      uint32_t rsn, int cqe_ver)
-					      __attribute__((always_inline));
+					      ALWAYS_INLINE;
 static inline struct mlx5_qp *get_req_context(struct mlx5_context *mctx,
 					      struct mlx5_resource **cur_rsc,
 					      uint32_t rsn, int cqe_ver)
@@ -452,7 +452,7 @@ static inline int get_resp_ctx_v1(struct mlx5_context *mctx,
 				  struct mlx5_resource **cur_rsc,
 				  struct mlx5_srq **cur_srq,
 				  uint32_t uidx, uint8_t *is_srq)
-				  __attribute__((always_inline));
+				  ALWAYS_INLINE;
 static inline int get_resp_ctx_v1(struct mlx5_context *mctx,
 				  struct mlx5_resource **cur_rsc,
 				  struct mlx5_srq **cur_srq,
@@ -488,7 +488,7 @@ static inline int get_resp_ctx_v1(struct mlx5_context *mctx,
 static inline int get_qp_ctx(struct mlx5_context *mctx,
 			     struct mlx5_resource **cur_rsc,
 			     uint32_t qpn)
-			       __attribute__((always_inline));
+			     ALWAYS_INLINE;
 static inline int get_qp_ctx(struct mlx5_context *mctx,
 			     struct mlx5_resource **cur_rsc,
 			     uint32_t qpn)
@@ -510,7 +510,7 @@ static inline int get_qp_ctx(struct mlx5_context *mctx,
 static inline int get_srq_ctx(struct mlx5_context *mctx,
 			      struct mlx5_srq **cur_srq,
 			      uint32_t srqn_uidx)
-			      __attribute__((always_inline));
+			      ALWAYS_INLINE;
 static inline int get_srq_ctx(struct mlx5_context *mctx,
 			      struct mlx5_srq **cur_srq,
 			      uint32_t srqn)
@@ -553,7 +553,7 @@ static inline int get_cur_rsc(struct mlx5_context *mctx,
 static inline int mlx5_get_next_cqe(struct mlx5_cq *cq,
 				    struct mlx5_cqe64 **pcqe64,
 				    void **pcqe)
-				    __attribute__((always_inline));
+				    ALWAYS_INLINE;
 static inline int mlx5_get_next_cqe(struct mlx5_cq *cq,
 				    struct mlx5_cqe64 **pcqe64,
 				    void **pcqe)
@@ -602,7 +602,7 @@ static inline int mlx5_parse_cqe(struct mlx5_cq *cq,
 				 struct mlx5_srq **cur_srq,
 				 struct ibv_wc *wc,
 				 int cqe_ver, int lazy)
-				 __attribute__((always_inline));
+				 ALWAYS_INLINE;
 static inline int mlx5_parse_cqe(struct mlx5_cq *cq,
 				 struct mlx5_cqe64 *cqe64,
 				 void *cqe,
@@ -754,7 +754,7 @@ static inline int mlx5_parse_cqe(struct mlx5_cq *cq,
 static inline int mlx5_parse_lazy_cqe(struct mlx5_cq *cq,
 				      struct mlx5_cqe64 *cqe64,
 				      void *cqe, int cqe_ver)
-				      __attribute__((always_inline));
+				      ALWAYS_INLINE;
 static inline int mlx5_parse_lazy_cqe(struct mlx5_cq *cq,
 				      struct mlx5_cqe64 *cqe64,
 				      void *cqe, int cqe_ver)
@@ -766,7 +766,7 @@ static inline int mlx5_poll_one(struct mlx5_cq *cq,
 				struct mlx5_resource **cur_rsc,
 				struct mlx5_srq **cur_srq,
 				struct ibv_wc *wc, int cqe_ver)
-				__attribute__((always_inline));
+				ALWAYS_INLINE;
 static inline int mlx5_poll_one(struct mlx5_cq *cq,
 				struct mlx5_resource **cur_rsc,
 				struct mlx5_srq **cur_srq,
@@ -785,7 +785,7 @@ static inline int mlx5_poll_one(struct mlx5_cq *cq,
 
 static inline int poll_cq(struct ibv_cq *ibcq, int ne,
 		      struct ibv_wc *wc, int cqe_ver)
-		      __attribute__((always_inline));
+		      ALWAYS_INLINE;
 static inline int poll_cq(struct ibv_cq *ibcq, int ne,
 		      struct ibv_wc *wc, int cqe_ver)
 {
@@ -848,7 +848,7 @@ enum  polling_mode {
 
 static inline void mlx5_end_poll(struct ibv_cq_ex *ibcq,
 				 int lock, enum polling_mode stall)
-				 __attribute__((always_inline));
+				 ALWAYS_INLINE;
 static inline void mlx5_end_poll(struct ibv_cq_ex *ibcq,
 				 int lock, enum polling_mode stall)
 {
@@ -884,7 +884,7 @@ static inline void mlx5_end_poll(struct ibv_cq_ex *ibcq,
 
 static inline int mlx5_start_poll(struct ibv_cq_ex *ibcq, struct ibv_poll_cq_attr *attr,
 				  int lock, enum polling_mode stall, int cqe_version)
-				  __attribute__((always_inline));
+				  ALWAYS_INLINE;
 static inline int mlx5_start_poll(struct ibv_cq_ex *ibcq, struct ibv_poll_cq_attr *attr,
 				  int lock, enum polling_mode stall, int cqe_version)
 {
@@ -952,7 +952,7 @@ static inline int mlx5_start_poll(struct ibv_cq_ex *ibcq, struct ibv_poll_cq_att
 
 static inline int mlx5_next_poll(struct ibv_cq_ex *ibcq,
 				 enum polling_mode stall, int cqe_version)
-				 __attribute__((always_inline));
+				 ALWAYS_INLINE;
 static inline int mlx5_next_poll(struct ibv_cq_ex *ibcq,
 				 enum polling_mode stall,
 				 int cqe_version)
diff --git a/src/mlx5.h b/src/mlx5.h
index 78357d3..124c8fe 100644
--- a/src/mlx5.h
+++ b/src/mlx5.h
@@ -107,6 +107,12 @@
 
 #define HIDDEN		__attribute__((visibility("hidden")))
 
+#ifdef HAVE_FUNC_ATTRIBUTE_ALWAYS_INLINE
+#define ALWAYS_INLINE __attribute__((always_inline))
+#else
+#define ALWAYS_INLINE
+#endif
+
 #define PFX		"mlx5: "
 
 
-- 
1.8.3.1

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH libmlx5 7/7] Use configuration symbol for always in-line
       [not found]     ` <1464788882-1876-8-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2016-06-01 16:10       ` Jason Gunthorpe
       [not found]         ` <20160601161055.GA15186-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 22+ messages in thread
From: Jason Gunthorpe @ 2016-06-01 16:10 UTC (permalink / raw)
  To: Yishai Hadas
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, matanb-VPRAkNaXOzVWk0Htik3J/w,
	majd-VPRAkNaXOzVWk0Htik3J/w, talal-VPRAkNaXOzVWk0Htik3J/w

On Wed, Jun 01, 2016 at 04:48:02PM +0300, Yishai Hadas wrote:

>  #define HIDDEN		__attribute__((visibility("hidden")))

If you are going to do one you might as well do them all...

BTW, it is strange to use visibility("hidden"), usually you'd use
-fvisibility=hidden during the compile and then use
visibility("default") only on the one or two functions that need to be
public.

Much safer.

> +#ifdef HAVE_FUNC_ATTRIBUTE_ALWAYS_INLINE
> +#define ALWAYS_INLINE __attribute__((always_inline))
> +#else
> +#define ALWAYS_INLINE
> +#endif


Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH libmlx5 7/7] Use configuration symbol for always in-line
       [not found]         ` <20160601161055.GA15186-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2016-06-01 16:15           ` Matan Barak (External)
       [not found]             ` <07a22f6a-eea9-201c-2c64-f51eb557a1d3-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 22+ messages in thread
From: Matan Barak (External) @ 2016-06-01 16:15 UTC (permalink / raw)
  To: Jason Gunthorpe, Yishai Hadas
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, majd-VPRAkNaXOzVWk0Htik3J/w,
	talal-VPRAkNaXOzVWk0Htik3J/w

On 01/06/2016 19:10, Jason Gunthorpe wrote:
> On Wed, Jun 01, 2016 at 04:48:02PM +0300, Yishai Hadas wrote:
>
>>  #define HIDDEN		__attribute__((visibility("hidden")))
>
> If you are going to do one you might as well do them all...
>
> BTW, it is strange to use visibility("hidden"), usually you'd use
> -fvisibility=hidden during the compile and then use
> visibility("default") only on the one or two functions that need to be
> public.
>
> Much safer.
>

It wasn't added by this series :)
Actually, it isn't used in libmlx5 so it should be dropped.

>> +#ifdef HAVE_FUNC_ATTRIBUTE_ALWAYS_INLINE
>> +#define ALWAYS_INLINE __attribute__((always_inline))
>> +#else
>> +#define ALWAYS_INLINE
>> +#endif
>
>
> Jason
>

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH libmlx5 3/7] Add inline functions to read completion's attributes
       [not found]     ` <1464788882-1876-4-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2016-06-01 16:24       ` Jason Gunthorpe
  0 siblings, 0 replies; 22+ messages in thread
From: Jason Gunthorpe @ 2016-06-01 16:24 UTC (permalink / raw)
  To: Yishai Hadas
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, matanb-VPRAkNaXOzVWk0Htik3J/w,
	majd-VPRAkNaXOzVWk0Htik3J/w, talal-VPRAkNaXOzVWk0Htik3J/w

On Wed, Jun 01, 2016 at 04:47:58PM +0300, Yishai Hadas wrote:
> Add inline functions in order to read various completion's
> attributes. These functions will be assigned in the ibv_cq_ex
> structure in order to allow the user to read the completion's
> attributes.

Did you look at using the following for these?

__attribute__((optimize("-fno-omit-frame-pointer")))

To help get rid of the stack frame?

> Signed-off-by: Yishai Hadas <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
>  src/cq.c | 121 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 121 insertions(+)
> 
> diff --git a/src/cq.c b/src/cq.c
> index a056787..188b34e 100644
> +++ b/src/cq.c
> @@ -850,6 +850,127 @@ int mlx5_poll_cq_v1(struct ibv_cq *ibcq, int ne, struct ibv_wc *wc)
>  	return poll_cq(ibcq, ne, wc, 1);
>  }
>  
> +static inline enum ibv_wc_opcode mlx5_cq_read_wc_opcode(struct ibv_cq_ex *ibcq)
> +{
> +	fprintf(stderr, "un-expected opcode in cqe\n");

Something like this can be very expensive; depending on the version of
gcc, it may force a stack frame to be created for all callers. You
should audit the assembly to be sure that gcc is not creating
stack frames.

> +static inline uint32_t mlx5_cq_read_wc_qp_num(struct ibv_cq_ex *ibcq)

You need to go through everything you've written and make sure it is
const-correct.
You should also annotate with restrict.

Doing this properly can increase performance by allowing the caller to
avoid reloads.

> +{
> +	struct mlx5_cq *cq = to_mcq(ibv_cq_ex_to_cq(ibcq));

You might want to look at having the caller pass in 'cq' from a member
in ibv_cq_ex. That way the caller can hold the cq in a register
instead of constantly reloading it like this - I'm assuming to_mcq is
not just a container_of??

> +static inline int mlx5_cq_read_wc_flags(struct ibv_cq_ex *ibcq)
> +{
> +	struct mlx5_cq *cq = to_mcq(ibv_cq_ex_to_cq(ibcq));
> +	int wc_flags = 0;
> +
> +	if (cq->flags & MLX5_CQ_FLAGS_RX_CSUM_VALID)
> +		wc_flags = (!!(cq->cqe64->hds_ip_ext & MLX5_CQE_L4_OK) &
> +				 !!(cq->cqe64->hds_ip_ext & MLX5_CQE_L3_OK) &
> +				 (get_cqe_l3_hdr_type(cq->cqe64) ==
> +				  MLX5_CQE_L3_HDR_TYPE_IPV4)) <<
> +				IBV_WC_IP_CSUM_OK_SHIFT;
> +
> +	switch (cq->cqe64->op_own >> 4) {
> +	case MLX5_CQE_RESP_WR_IMM:
> +	case MLX5_CQE_RESP_SEND_IMM:
> +		wc_flags	|= IBV_WC_WITH_IMM;
> +		break;
> +	}
> +
> +	wc_flags |= ((ntohl(cq->cqe64->flags_rqpn) >> 28) & 3) ? IBV_WC_GRH : 0;
> +	return wc_flags;
> +}

There is a disappointing amount of branching here...

Anyhow, if there was any doubt that we could use a generic inline
scheme, I think this patch puts it to rest: the work being done in
most of the accessors is fairly complex.

Jason

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH libmlx5 7/7] Use configuration symbol for always in-line
       [not found]             ` <07a22f6a-eea9-201c-2c64-f51eb557a1d3-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2016-06-01 16:28               ` Jason Gunthorpe
       [not found]                 ` <20160601162818.GC15186-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 22+ messages in thread
From: Jason Gunthorpe @ 2016-06-01 16:28 UTC (permalink / raw)
  To: Matan Barak (External)
  Cc: Yishai Hadas, dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, majd-VPRAkNaXOzVWk0Htik3J/w,
	talal-VPRAkNaXOzVWk0Htik3J/w

On Wed, Jun 01, 2016 at 07:15:35PM +0300, Matan Barak (External) wrote:

> It wasn't added by this series :)

No, but the autodetection of attributes was added. It doesn't make
sense to autodetect only one attribute. Don't do it at all or do it
properly; adding non-functional theoretical 'portability' crap is a
bad idea.

> Actually, it isn't used in libmlx5 so it should be dropped.

Make sure that nm --dynamic on libmlx5.so shows the provider is only
defining the one or two entry point symbols.

Jason

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH libmlx5 7/7] Use configuration symbol for always in-line
       [not found]                 ` <20160601162818.GC15186-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2016-06-01 16:58                   ` Matan Barak
  0 siblings, 0 replies; 22+ messages in thread
From: Matan Barak @ 2016-06-01 16:58 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Yishai Hadas, dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, majd-VPRAkNaXOzVWk0Htik3J/w,
	talal-VPRAkNaXOzVWk0Htik3J/w

On 01/06/2016 19:28, Jason Gunthorpe wrote:
> On Wed, Jun 01, 2016 at 07:15:35PM +0300, Matan Barak (External) wrote:
>
>> It wasn't added by this series :)
>
> No, but the autodetection of attributes was added. It doesn't make
> sense to autodetect only one attribute. Don't do it at all or do it
> properly, adding non-functional theoretical 'portability' crap is a
> bad idea.
>

Agree, there are 3 uses of __attribute__ in the code:
1. hidden - not used in code, should be dropped. Cleaning unused code 
isn't related to this patch.
2. always_inline - used and checked, optional.
3. constructor - if the compiler doesn't support it, nothing will be 
able to allocate and populate the dev structure, so it's mandatory. It's 
probably more correct to check that it's available in configure.ac.

>> Actually, it isn't used in libmlx5 so it should be dropped.
>
> Make sure that nm --dynamic on libmlx5.so shows the provider is only
> definining the one or two entry point symbols.

No mlx5 symbols there. The registration is currently done through
static __attribute__((constructor)) void mlx5_register_driver(void)
{
         verbs_register_driver("mlx5", mlx5_driver_init);
}

>
> Jason
>

Matan

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH libmlx5 1/7] Refactor mlx5_poll_one
       [not found]     ` <1464788882-1876-2-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2016-06-01 18:39       ` Jason Gunthorpe
       [not found]         ` <20160601183906.GA3471-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 22+ messages in thread
From: Jason Gunthorpe @ 2016-06-01 18:39 UTC (permalink / raw)
  To: Yishai Hadas
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, matanb-VPRAkNaXOzVWk0Htik3J/w,
	majd-VPRAkNaXOzVWk0Htik3J/w, talal-VPRAkNaXOzVWk0Htik3J/w

On Wed, Jun 01, 2016 at 04:47:56PM +0300, Yishai Hadas wrote:
>  static void mlx5_handle_error_cqe(struct mlx5_err_cqe *cqe,
> -				  struct ibv_wc *wc)
> +				  enum ibv_wc_status *pstatus)
>  {

Yuk

 static enum ibv_wc_status mlx5_get_error_status(const struct mlx5_err_cqe *cqe)

>  	case MLX5_CQE_SYNDROME_LOCAL_LENGTH_ERR:
> -		wc->status = IBV_WC_LOC_LEN_ERR;
> +		*pstatus = IBV_WC_LOC_LEN_ERR;
>  		break;

	case MLX5_CQE_SYNDROME_LOCAL_LENGTH_ERR:
	        return IBV_WC_LOC_LEN_ERR;

Jason

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH libmlx5 5/7] Add support for creating an extended CQ
       [not found]     ` <1464788882-1876-6-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2016-06-01 18:58       ` Jason Gunthorpe
       [not found]         ` <20160601185856.GB3471-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 22+ messages in thread
From: Jason Gunthorpe @ 2016-06-01 18:58 UTC (permalink / raw)
  To: Yishai Hadas
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, matanb-VPRAkNaXOzVWk0Htik3J/w,
	majd-VPRAkNaXOzVWk0Htik3J/w, talal-VPRAkNaXOzVWk0Htik3J/w

On Wed, Jun 01, 2016 at 04:48:00PM +0300, Yishai Hadas wrote:
> From: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> 
> This patch adds the support for creating an extended CQ.
> This means we support:
> - The new polling mechanism.
> - A CQ which is single threaded and by thus doesn't waste CPU cycles on locking.
> - Getting completion timestamp from the CQ.

I'm still very much of the opinion that extended CQs should only allow
compatible QP's to be added.

> +	if (mctx->cqe_version) {
> +		if (cq->flags & MLX5_CQ_FLAGS_SINGLE_THREADED) {
> +			if (cq->stall_enable) {
> +				if (cq->stall_adaptive_enable) {
> +					cq->ibv_cq.start_poll =
> +						mlx5_start_poll_adaptive_stall_enable_v1;
> +					cq->ibv_cq.end_poll =
> +						mlx5_end_poll_adaptive_stall_enable;

[..]

I feel like this sort of thing is going to show up in every driver :|

Maybe use a more tidy scheme:

#define SINGLE_THREADED BIT(0)
#define STALL BIT(1)
#define V1 BIT(2)
#define ADAPTIVE BIT(3)
static const struct ops[] =
{
[V1 | SINGLE_THREADED | STALL] = {
  .start_poll = &mlx5_start_poll_nonadaptive_stall_enable_v1 ,
  .end_poll = &mlx5_end_poll_nonadaptive_stall_enable,
  .next_poll = &mlx5_next_poll_v0,
  },
[..]
}

const struct op *poll_ops = &ps[
           (cq->stall_adaptive_enable?ADAPTIVE:0) |
           (mctx->cqe_version?V1:0) |
           (cq->flags & MLX5_CQ_FLAGS_SINGLE_THREADED?SINGLE_THREADED:0) |
	   (cq->stall_enable?STALL:0)];

BTW, you may want to use C++ for some of this function replication
stuff.

C++ function templates allow the construction of perfect code like
always_inline does, but more directly and without so much hassle to
trigger the always_inline behaviour.

The idiom looks like this:

template <bool SINGLE_THREADED,bool STALL,bool ADAPTIVE>
static void mlx_start_poll(...)
{
   if (!SINGLE_THREADED)
     lock(..)
}

.start_poll = &mlx_start_poll<false,false,false>
.start_poll = &mlx_start_poll<true,false,false>

The compiler will create a unique mlx_start_poll function for each
combination of template arguments and then fully optimize each one
treating the template argument as a compile-time constant - full dead
code removal, etc.

This is probably much easier to understand than all the hand coded
versions.. But not for everyone of course..

Jason

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH libmlx5 1/7] Refactor mlx5_poll_one
       [not found]         ` <20160601183906.GA3471-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2016-06-02  8:13           ` Matan Barak
  0 siblings, 0 replies; 22+ messages in thread
From: Matan Barak @ 2016-06-02  8:13 UTC (permalink / raw)
  To: Jason Gunthorpe, Yishai Hadas
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, majd-VPRAkNaXOzVWk0Htik3J/w,
	talal-VPRAkNaXOzVWk0Htik3J/w

On 01/06/2016 21:39, Jason Gunthorpe wrote:
> On Wed, Jun 01, 2016 at 04:47:56PM +0300, Yishai Hadas wrote:
>>  static void mlx5_handle_error_cqe(struct mlx5_err_cqe *cqe,
>> -				  struct ibv_wc *wc)
>> +				  enum ibv_wc_status *pstatus)
>>  {
>
> Yuk
>
>  static enum ibv_wc_status mlx5_get_error_status(const struct mlx5_err_cqe *cqe)
>

Yeah, that's better.
Thanks.

>>  	case MLX5_CQE_SYNDROME_LOCAL_LENGTH_ERR:
>> -		wc->status = IBV_WC_LOC_LEN_ERR;
>> +		*pstatus = IBV_WC_LOC_LEN_ERR;
>>  		break;
>
> 	case MLX5_CQE_SYNDROME_LOCAL_LENGTH_ERR:
> 	        return IBV_WC_LOC_LEN_ERR;
>
> Jason
>


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH libmlx5 5/7] Add support for creating an extended CQ
       [not found]         ` <20160601185856.GB3471-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2016-06-02  8:40           ` Matan Barak
  0 siblings, 0 replies; 22+ messages in thread
From: Matan Barak @ 2016-06-02  8:40 UTC (permalink / raw)
  To: Jason Gunthorpe, Yishai Hadas
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, majd-VPRAkNaXOzVWk0Htik3J/w,
	talal-VPRAkNaXOzVWk0Htik3J/w

On 01/06/2016 21:58, Jason Gunthorpe wrote:
> On Wed, Jun 01, 2016 at 04:48:00PM +0300, Yishai Hadas wrote:
>> From: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
>>
>> This patch adds the support for creating an extended CQ.
>> This means we support:
>> - The new polling mechanism.
>> - A CQ which is single threaded and by thus doesn't waste CPU cycles on locking.
>> - Getting completion timestamp from the CQ.
>
> I'm still very much of the opinion that extended CQs should only allow
> compatible QP's to be added.
>
>> +	if (mctx->cqe_version) {
>> +		if (cq->flags & MLX5_CQ_FLAGS_SINGLE_THREADED) {
>> +			if (cq->stall_enable) {
>> +				if (cq->stall_adaptive_enable) {
>> +					cq->ibv_cq.start_poll =
>> +						mlx5_start_poll_adaptive_stall_enable_v1;
>> +					cq->ibv_cq.end_poll =
>> +						mlx5_end_poll_adaptive_stall_enable;
>
> [..]
>
> I feel like this sort of thing is going to show up in every driver :|
>
> Maybe use a more tidy scheme:
>
> #define SINGLE_THREADED BIT(0)
> #define STALL BIT(1)
> #define V1 BIT(2)
> #define ADAPTIVE BIT(3)
> static const struct ops[] =
> {
> [V1 | SINGLE_THREADED | STALL] = {
>   .start_poll = &mlx5_start_poll_nonadaptive_stall_enable_v1 ,
>   .end_poll = &mlx5_end_poll_nonadaptive_stall_enable,
>   .next_poll = &mlx5_next_poll_v0,
>   },
> [..]
> }
>
> const struct op *poll_ops = &ps[
>            (cq->stall_adaptive_enable?ADAPTIVE:0) |
>            (mctx->cqe_version?V1:0) |
>            (cq->flags & MLX5_CQ_FLAGS_SINGLE_THREADED?SINGLE_THREADED:0) |
> 	   (cq->stall_enable?STALL:0)];
>

Yep, that looks better.

> BTW, you may want to use C++ for some of this function replication
> stuff.
>
> C++ function templates allow the construction of perfect code like
> always_inline does, but more directly and without so much hassle to
> trigger the always_inline behaviour.
>
> The idiom looks like this:
>
> template <bool SINGLE_THREADED,bool STALL,bool ADAPTIVE>
> static void mlx_start_poll(...)
> {
>    if (!SINGLE_THREADED)
>      lock(..)
> }
>
> .start_poll = &mlx_start_poll<false,false,false>
> .start_poll = &mlx_start_poll<true,false,false>
>
> The compiler will create a unique mlx_start_poll function for each
> combination of template arguments and then fully optimize each one
> treating the template argument as a compile-time constant - full dead
> code removal, etc.

Yeah, I know templates could fit in really well here, but I wanted to 
stay with C code.
Hypothetically speaking, if we go to C++, this could work really well too:

class no_lock {
public:
	static inline void lock(struct mlx5_cq *cq)
	{
	};
	static inline void unlock(struct mlx5_cq *cq)
	{
	};
};

class lock {
public:
	static inline void lock(struct mlx5_cq *cq)
	{
		mlx5_spin_lock(&cq->lock);
	};
	static inline void unlock(struct mlx5_cq *cq)
	{
		mlx5_spin_unlock(&cq->lock);
	};
};

template <class lock, class stall>
static void mlx_start_poll(...)
{
	lock::lock(mcq);
	....
}

Even better, a lock object could acquire the spinlock in its ctor and
release it automatically in its dtor.

>
> This is probably much easier to understand than all the hand coded
> versions.. But not for everyone of course..
>
> Jason
>

Thanks for the review.

Matan

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH libmlx5 0/7] Completion timestamping
       [not found]     ` <5649FA3F.2050209-CLs1Zie5N5HQT0dZR+AlfA@public.gmane.org>
@ 2015-11-16 16:30       ` Matan Barak
  0 siblings, 0 replies; 22+ messages in thread
From: Matan Barak @ 2015-11-16 16:30 UTC (permalink / raw)
  To: Tom Talpey
  Cc: Matan Barak, Eli Cohen, linux-rdma, Doug Ledford,
	Eran Ben Elisha, Christoph Lameter

On Mon, Nov 16, 2015 at 5:46 PM, Tom Talpey <tom-CLs1Zie5N5HQT0dZR+AlfA@public.gmane.org> wrote:
> On 11/15/2015 7:30 AM, Matan Barak wrote:
>>
>> This series adds support for completion timestamp. In order to
>> support this feature, several extended verbs were implemented
>> (as instructed in libibverbs).
>
>
> Can you describe what these timestamps are actually for? It's not
> clear at all from the comments. I'm assuming they are for some sort
> of fine-grained statistics? Are they purely for userspace consumers?
>
>

Completion timestamps could be used for various things. For example,
applications could use them for ordering (packet x came before packet
y), for measuring the delta (in raw cycles) between two packets
(benchmarking), or between an event (for example, the call to
ibv_post_send) and its respective WQE.
The times are given in raw cycles. The frequency of the respective
timer is reported by the extended ibv_query_device verb, but this
isn't system time, just another form of raw time that can be
converted to nanoseconds.

We have currently implemented only the control path in the kernel, so
it's for userspace consumers only. However, implementing it in the
kernel shouldn't be hard.


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH libmlx5 0/7] Completion timestamping
       [not found] ` <1447590634-12858-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2015-11-15 18:11   ` Christoph Lameter
@ 2015-11-16 15:46   ` Tom Talpey
       [not found]     ` <5649FA3F.2050209-CLs1Zie5N5HQT0dZR+AlfA@public.gmane.org>
  1 sibling, 1 reply; 22+ messages in thread
From: Tom Talpey @ 2015-11-16 15:46 UTC (permalink / raw)
  To: Matan Barak, Eli Cohen
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Doug Ledford, Eran Ben Elisha,
	Christoph Lameter

On 11/15/2015 7:30 AM, Matan Barak wrote:
> This series adds support for completion timestamp. In order to
> support this feature, several extended verbs were implemented
> (as instructed in libibverbs).

Can you describe what these timestamps are actually for? It's not
clear at all from the comments. I'm assuming they are for some sort
of fine-grained statistics? Are they purely for userspace consumers?


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH libmlx5 0/7] Completion timestamping
       [not found]     ` <alpine.DEB.2.20.1511151209290.31074-wcBtFHqTun5QOdAKl3ChDw@public.gmane.org>
@ 2015-11-16  9:07       ` Matan Barak
  0 siblings, 0 replies; 22+ messages in thread
From: Matan Barak @ 2015-11-16  9:07 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Matan Barak, Eli Cohen, linux-rdma, Doug Ledford, Eran Ben Elisha

On Sun, Nov 15, 2015 at 8:11 PM, Christoph Lameter <cl-vYTEC60ixJUAvxtiuMwx3w@public.gmane.org> wrote:
> On Sun, 15 Nov 2015, Matan Barak wrote:
>
>> This series adds support for completion timestamp. In order to
>> support this feature, several extended verbs were implemented
>> (as instructed in libibverbs).
>
> This is the portion that
> implements timestamping for libmlx5 and this patchset depends on another
> one that needs to be merged into libibverbs.
>
> Right?
>

Yeah - good point.
This series depends on '[PATCH libibverbs 0/5] Completion
timestamping' and is rebased above '[PATCH libmlx5 v1 0/5] Support CQE
versions'.



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH libmlx5 0/7] Completion timestamping
       [not found] ` <1447590634-12858-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2015-11-15 18:11   ` Christoph Lameter
       [not found]     ` <alpine.DEB.2.20.1511151209290.31074-wcBtFHqTun5QOdAKl3ChDw@public.gmane.org>
  2015-11-16 15:46   ` Tom Talpey
  1 sibling, 1 reply; 22+ messages in thread
From: Christoph Lameter @ 2015-11-15 18:11 UTC (permalink / raw)
  To: Matan Barak
  Cc: Eli Cohen, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Doug Ledford,
	Eran Ben Elisha

On Sun, 15 Nov 2015, Matan Barak wrote:

> This series adds support for completion timestamps. In order to
> support this feature, several extended verbs were implemented
> (as instructed in libibverbs).

This is the portion that
implements timestamping for libmlx5, and this patchset depends on another
one that needs to be merged into libibverbs.

Right?



* [PATCH libmlx5 0/7] Completion timestamping
@ 2015-11-15 12:30 Matan Barak
       [not found] ` <1447590634-12858-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 22+ messages in thread
From: Matan Barak @ 2015-11-15 12:30 UTC (permalink / raw)
  To: Eli Cohen
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Doug Ledford, Matan Barak,
	Eran Ben Elisha, Christoph Lameter

Hi Eli,

This series adds support for completion timestamps. In order to
support this feature, several extended verbs were implemented
(as instructed in libibverbs).

ibv_query_device_ex was extended to support reading the
hca_core_clock and the timestamp mask. The same verb was extended
with vendor-dependent data that is used to map the
HCA's free-running clock register.
When libmlx5 initializes, it tries to mmap this free-running
clock register. This mapping is used to implement
ibv_query_values_ex efficiently.
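For reference, turning a raw completion timestamp (device clock
cycles) into wall-clock units is just a unit conversion against the
reported core-clock frequency. A minimal self-contained sketch,
assuming the frequency is reported in kHz via hca_core_clock (the
helper name cycles_to_ns is illustrative, not part of the API):

```c
#include <assert.h>
#include <stdint.h>

/* Sketch: convert a raw completion timestamp (device clock cycles)
 * to nanoseconds, given the core-clock frequency in kHz as read via
 * ibv_query_device_ex.  Assumes no wraparound handling; real code
 * must also apply the reported timestamp mask. */
static uint64_t cycles_to_ns(uint64_t cycles, uint64_t hca_core_clock_khz)
{
    /* cycles / (khz * 1000) seconds == cycles * 1000000 / khz ns */
    return cycles * 1000000ULL / hca_core_clock_khz;
}
```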

To support CQ completion timestamp reporting, we implement the
ibv_create_cq_ex verb. This verb is used both to create a CQ
that supports timestamps and to state which fields should be
returned via the WC. Returning this data is done by implementing
ibv_poll_cq_ex. For every field the user requested via the CQ's
wc_flags, we populate the WC according to the carried network
operation and WC status.
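The copy-only-requested-fields idea can be sketched in plain C
(toy types and names, not the real libmlx5 structures or ABI):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy sketch: the user picks wc_flags at CQ creation, and the poll
 * path copies only the requested fields out of the CQE into the WC.
 * All names and layouts here are illustrative. */
enum { WC_FLAG_BYTE_LEN = 1u << 0, WC_FLAG_TIMESTAMP = 1u << 1 };

struct demo_cqe { uint32_t byte_len; uint64_t timestamp; };
struct demo_wc  { uint32_t byte_len; uint64_t timestamp; };

static void demo_poll_one(uint64_t wc_flags, const struct demo_cqe *cqe,
                          struct demo_wc *wc)
{
    memset(wc, 0, sizeof(*wc));
    if (wc_flags & WC_FLAG_BYTE_LEN)
        wc->byte_len = cqe->byte_len;    /* copied only if requested */
    if (wc_flags & WC_FLAG_TIMESTAMP)
        wc->timestamp = cqe->timestamp;
}
```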

Last but not least, ibv_poll_cq_ex was optimized to eliminate
the if statements and OR operations for common combinations of WC
fields. This is done by inlining a custom poll_one_ex function
specialized for each such combination.
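The specialization trick can be illustrated with a small
self-contained pattern: a generic always_inline helper takes the
flags, a thin wrapper per common combination passes them as
compile-time constants so the compiler can fold the branches away,
and a function pointer chosen once at CQ creation dispatches to the
right wrapper. Names here are illustrative, not the libmlx5 ones:

```c
#include <assert.h>
#include <stdint.h>

enum { F_BYTE_LEN = 1u << 0, F_TS = 1u << 1 };

struct toy_cqe { uint32_t byte_len; uint64_t ts; };
struct toy_wc  { uint32_t byte_len; uint64_t ts; };

/* Generic helper: when 'flags' is a compile-time constant at the
 * call site and the call is inlined, the branches disappear. */
static inline __attribute__((always_inline))
void toy_poll_one_ex(uint64_t flags, const struct toy_cqe *c,
                     struct toy_wc *w)
{
    w->byte_len = 0;
    w->ts = 0;
    if (flags & F_BYTE_LEN) w->byte_len = c->byte_len;
    if (flags & F_TS)       w->ts = c->ts;
}

/* One specialized entry point per common wc_flags combination. */
static void toy_poll_ts_only(const struct toy_cqe *c, struct toy_wc *w)
{
    toy_poll_one_ex(F_TS, c, w);
}

typedef void (*toy_poll_fn)(const struct toy_cqe *, struct toy_wc *);

/* Chosen once, e.g. at CQ creation, based on the requested flags;
 * the hot poll loop then calls through the pointer with no tests. */
static toy_poll_fn toy_select_poller(uint64_t flags)
{
    return flags == F_TS ? toy_poll_ts_only : 0;
}
```

A real implementation would keep a generic fallback for flag
combinations that have no dedicated specialization.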

Thanks,
Matan

Matan Barak (7):
  Add timestamp support query_device_ex
  Add ibv_poll_cq_ex support
  Add timestamp support for ibv_poll_cq_ex
  Add ibv_create_cq_ex support
  Add ibv_query_values support
  Optimize poll_cq
  Add always_inline check

 configure.ac   |  17 +
 src/cq.c       | 959 ++++++++++++++++++++++++++++++++++++++++++++++++---------
 src/mlx5-abi.h |   9 +
 src/mlx5.c     |  44 +++
 src/mlx5.h     |  46 ++-
 src/verbs.c    | 153 ++++++++-
 6 files changed, 1073 insertions(+), 155 deletions(-)

-- 
2.1.0



end of thread, other threads:[~2016-06-02  8:40 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-06-01 13:47 [PATCH libmlx5 0/7] Completion timestamping Yishai Hadas
     [not found] ` <1464788882-1876-1-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2016-06-01 13:47   ` [PATCH libmlx5 1/7] Refactor mlx5_poll_one Yishai Hadas
     [not found]     ` <1464788882-1876-2-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2016-06-01 18:39       ` Jason Gunthorpe
     [not found]         ` <20160601183906.GA3471-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2016-06-02  8:13           ` Matan Barak
2016-06-01 13:47   ` [PATCH libmlx5 2/7] Add lazy CQ polling Yishai Hadas
2016-06-01 13:47   ` [PATCH libmlx5 3/7] Add inline functions to read completion's attributes Yishai Hadas
     [not found]     ` <1464788882-1876-4-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2016-06-01 16:24       ` Jason Gunthorpe
2016-06-01 13:47   ` [PATCH libmlx5 4/7] Add ability to poll CQs through iterator's style API Yishai Hadas
2016-06-01 13:48   ` [PATCH libmlx5 5/7] Add support for creating an extended CQ Yishai Hadas
     [not found]     ` <1464788882-1876-6-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2016-06-01 18:58       ` Jason Gunthorpe
     [not found]         ` <20160601185856.GB3471-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2016-06-02  8:40           ` Matan Barak
2016-06-01 13:48   ` [PATCH libmlx5 6/7] Add ibv_query_rt_values support Yishai Hadas
2016-06-01 13:48   ` [PATCH libmlx5 7/7] Use configuration symbol for always in-line Yishai Hadas
     [not found]     ` <1464788882-1876-8-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2016-06-01 16:10       ` Jason Gunthorpe
     [not found]         ` <20160601161055.GA15186-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2016-06-01 16:15           ` Matan Barak (External)
     [not found]             ` <07a22f6a-eea9-201c-2c64-f51eb557a1d3-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2016-06-01 16:28               ` Jason Gunthorpe
     [not found]                 ` <20160601162818.GC15186-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2016-06-01 16:58                   ` Matan Barak
  -- strict thread matches above, loose matches on Subject: below --
2015-11-15 12:30 [PATCH libmlx5 0/7] Completion timestamping Matan Barak
     [not found] ` <1447590634-12858-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2015-11-15 18:11   ` Christoph Lameter
     [not found]     ` <alpine.DEB.2.20.1511151209290.31074-wcBtFHqTun5QOdAKl3ChDw@public.gmane.org>
2015-11-16  9:07       ` Matan Barak
2015-11-16 15:46   ` Tom Talpey
     [not found]     ` <5649FA3F.2050209-CLs1Zie5N5HQT0dZR+AlfA@public.gmane.org>
2015-11-16 16:30       ` Matan Barak
