* [PATCH libmlx5 1/7] Add timestamp support for query_device_ex
[not found] ` <1447590634-12858-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2015-11-15 12:30 ` Matan Barak
2015-11-15 12:30 ` [PATCH libmlx5 2/7] Add ibv_poll_cq_ex support Matan Barak
` (7 subsequent siblings)
8 siblings, 0 replies; 13+ messages in thread
From: Matan Barak @ 2015-11-15 12:30 UTC (permalink / raw)
To: Eli Cohen
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Doug Ledford, Matan Barak,
Eran Ben Elisha, Christoph Lameter
Add timestamp support to the query_device extended verb.
This is necessary in order to support hca_core_clock and
timestamp_mask. In addition, hca_core_clock_offset is added
to the vendor-specific part of the response in order to map
the cycles register correctly.
Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
src/mlx5-abi.h | 9 +++++++++
src/mlx5.h | 8 ++++++++
src/verbs.c | 42 ++++++++++++++++++++++++++++++++++--------
3 files changed, 51 insertions(+), 8 deletions(-)
diff --git a/src/mlx5-abi.h b/src/mlx5-abi.h
index c2490a5..97dfeec 100644
--- a/src/mlx5-abi.h
+++ b/src/mlx5-abi.h
@@ -165,8 +165,17 @@ struct mlx5_query_device_ex {
struct ibv_query_device_ex ibv_cmd;
};
+enum query_device_resp_mask {
+ QUERY_DEVICE_RESP_MASK_TIMESTAMP = 1UL << 0,
+};
+
struct mlx5_query_device_ex_resp {
struct ibv_query_device_resp_ex ibv_resp;
+ struct {
+ uint32_t comp_mask;
+ uint32_t response_length;
+ uint64_t hca_core_clock_offset;
+ };
};
#endif /* MLX4_ABI_H */
diff --git a/src/mlx5.h b/src/mlx5.h
index b57c7c7..325e07b 100644
--- a/src/mlx5.h
+++ b/src/mlx5.h
@@ -307,6 +307,10 @@ struct mlx5_context {
struct mlx5_spinlock hugetlb_lock;
struct list_head hugetlb_list;
uint8_t cqe_version;
+ struct {
+ uint64_t offset;
+ uint64_t mask;
+ } core_clock;
};
struct mlx5_bitmap {
@@ -576,6 +580,10 @@ void mlx5_free_db(struct mlx5_context *context, uint32_t *db);
int mlx5_query_device(struct ibv_context *context,
struct ibv_device_attr *attr);
+int _mlx5_query_device_ex(struct ibv_context *context,
+ const struct ibv_query_device_ex_input *input,
+ struct ibv_device_attr_ex *attr, size_t attr_size,
+ uint32_t *comp_mask);
int mlx5_query_device_ex(struct ibv_context *context,
const struct ibv_query_device_ex_input *input,
struct ibv_device_attr_ex *attr,
diff --git a/src/verbs.c b/src/verbs.c
index 92f273d..4c054f1 100644
--- a/src/verbs.c
+++ b/src/verbs.c
@@ -1475,10 +1475,10 @@ struct ibv_srq *mlx5_create_srq_ex(struct ibv_context *context,
return NULL;
}
-int mlx5_query_device_ex(struct ibv_context *context,
- const struct ibv_query_device_ex_input *input,
- struct ibv_device_attr_ex *attr,
- size_t attr_size)
+int _mlx5_query_device_ex(struct ibv_context *context,
+ const struct ibv_query_device_ex_input *input,
+ struct ibv_device_attr_ex *attr,
+ size_t attr_size, uint32_t *comp_mask)
{
struct mlx5_query_device_ex_resp resp;
struct mlx5_query_device_ex cmd;
@@ -1493,10 +1493,19 @@ int mlx5_query_device_ex(struct ibv_context *context,
memset(&resp, 0, sizeof(resp));
err = ibv_cmd_query_device_ex(context, input, attr, attr_size,
&raw_fw_ver, &cmd.ibv_cmd, sizeof(cmd.ibv_cmd),
- sizeof(cmd), &resp.ibv_resp, sizeof(resp),
- sizeof(resp.ibv_resp));
- if (err)
- return err;
+ sizeof(cmd), &resp.ibv_resp,
+ sizeof(resp.ibv_resp), sizeof(resp));
+ if (err) {
+ err = ibv_cmd_query_device_ex(context, input, attr, attr_size,
+ &raw_fw_ver, &cmd.ibv_cmd,
+ sizeof(cmd.ibv_cmd),
+ sizeof(cmd.ibv_cmd),
+ &resp.ibv_resp,
+ sizeof(resp.ibv_resp),
+ sizeof(resp.ibv_resp));
+ if (err)
+ return err;
+ }
major = (raw_fw_ver >> 32) & 0xffff;
minor = (raw_fw_ver >> 16) & 0xffff;
@@ -1505,5 +1514,22 @@ int mlx5_query_device_ex(struct ibv_context *context,
snprintf(a->fw_ver, sizeof(a->fw_ver), "%d.%d.%04d",
major, minor, sub_minor);
+ if (resp.comp_mask & QUERY_DEVICE_RESP_MASK_TIMESTAMP &&
+ resp.response_length >= (offsetof(typeof(resp), hca_core_clock_offset) +
+ sizeof(resp.hca_core_clock_offset) -
+ sizeof(resp.ibv_resp)))
+ to_mctx(context)->core_clock.offset =
+ resp.hca_core_clock_offset;
+
+ if (comp_mask)
+ *comp_mask = resp.comp_mask;
+
return 0;
}
+
+int mlx5_query_device_ex(struct ibv_context *context,
+ const struct ibv_query_device_ex_input *input,
+ struct ibv_device_attr_ex *attr, size_t attr_size)
+{
+ return _mlx5_query_device_ex(context, input, attr, attr_size, NULL);
+}
--
2.1.0
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH libmlx5 2/7] Add ibv_poll_cq_ex support
[not found] ` <1447590634-12858-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2015-11-15 12:30 ` [PATCH libmlx5 1/7] Add timestamp support for query_device_ex Matan Barak
@ 2015-11-15 12:30 ` Matan Barak
2015-11-15 12:30 ` [PATCH libmlx5 3/7] Add timestamp support for ibv_poll_cq_ex Matan Barak
` (6 subsequent siblings)
8 siblings, 0 replies; 13+ messages in thread
From: Matan Barak @ 2015-11-15 12:30 UTC (permalink / raw)
To: Eli Cohen
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Doug Ledford, Matan Barak,
Eran Ben Elisha, Christoph Lameter
The extended poll_cq verb writes only the work completion fields
the user asked for. Add support for this extended verb.
Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
src/cq.c | 699 +++++++++++++++++++++++++++++++++++++++++++++++++------------
src/mlx5.c | 5 +
src/mlx5.h | 14 ++
3 files changed, 584 insertions(+), 134 deletions(-)
diff --git a/src/cq.c b/src/cq.c
index 32f0dd4..0185696 100644
--- a/src/cq.c
+++ b/src/cq.c
@@ -200,6 +200,85 @@ static void handle_good_req(struct ibv_wc *wc, struct mlx5_cqe64 *cqe)
}
}
+union wc_buffer {
+ uint8_t *b8;
+ uint16_t *b16;
+ uint32_t *b32;
+ uint64_t *b64;
+};
+
+static inline void handle_good_req_ex(struct ibv_wc_ex *wc_ex,
+ union wc_buffer *pwc_buffer,
+ struct mlx5_cqe64 *cqe,
+ uint64_t wc_flags,
+ uint32_t qpn)
+{
+ union wc_buffer wc_buffer = *pwc_buffer;
+
+ switch (ntohl(cqe->sop_drop_qpn) >> 24) {
+ case MLX5_OPCODE_RDMA_WRITE_IMM:
+ wc_ex->wc_flags |= IBV_WC_EX_IMM;
+ case MLX5_OPCODE_RDMA_WRITE:
+ wc_ex->opcode = IBV_WC_RDMA_WRITE;
+ if (wc_flags & IBV_WC_EX_WITH_BYTE_LEN)
+ wc_buffer.b32++;
+ if (wc_flags & IBV_WC_EX_WITH_IMM)
+ wc_buffer.b32++;
+ break;
+ case MLX5_OPCODE_SEND_IMM:
+ wc_ex->wc_flags |= IBV_WC_EX_IMM;
+ case MLX5_OPCODE_SEND:
+ case MLX5_OPCODE_SEND_INVAL:
+ wc_ex->opcode = IBV_WC_SEND;
+ if (wc_flags & IBV_WC_EX_WITH_BYTE_LEN)
+ wc_buffer.b32++;
+ if (wc_flags & IBV_WC_EX_WITH_IMM)
+ wc_buffer.b32++;
+ break;
+ case MLX5_OPCODE_RDMA_READ:
+ wc_ex->opcode = IBV_WC_RDMA_READ;
+ if (wc_flags & IBV_WC_EX_WITH_BYTE_LEN) {
+ *wc_buffer.b32++ = ntohl(cqe->byte_cnt);
+ wc_ex->wc_flags |= IBV_WC_EX_WITH_BYTE_LEN;
+ }
+ if (wc_flags & IBV_WC_EX_WITH_IMM)
+ wc_buffer.b32++;
+ break;
+ case MLX5_OPCODE_ATOMIC_CS:
+ wc_ex->opcode = IBV_WC_COMP_SWAP;
+ if (wc_flags & IBV_WC_EX_WITH_BYTE_LEN) {
+ *wc_buffer.b32++ = 8;
+ wc_ex->wc_flags |= IBV_WC_EX_WITH_BYTE_LEN;
+ }
+ if (wc_flags & IBV_WC_EX_WITH_IMM)
+ wc_buffer.b32++;
+ break;
+ case MLX5_OPCODE_ATOMIC_FA:
+ wc_ex->opcode = IBV_WC_FETCH_ADD;
+ if (wc_flags & IBV_WC_EX_WITH_BYTE_LEN) {
+ *wc_buffer.b32++ = 8;
+ wc_ex->wc_flags |= IBV_WC_EX_WITH_BYTE_LEN;
+ }
+ if (wc_flags & IBV_WC_EX_WITH_IMM)
+ wc_buffer.b32++;
+ break;
+ case MLX5_OPCODE_BIND_MW:
+ wc_ex->opcode = IBV_WC_BIND_MW;
+ if (wc_flags & IBV_WC_EX_WITH_BYTE_LEN)
+ wc_buffer.b32++;
+ if (wc_flags & IBV_WC_EX_WITH_IMM)
+ wc_buffer.b32++;
+ break;
+ }
+
+ if (wc_flags & IBV_WC_EX_WITH_QP_NUM) {
+ *wc_buffer.b32++ = qpn;
+ wc_ex->wc_flags |= IBV_WC_EX_WITH_QP_NUM;
+ }
+
+ *pwc_buffer = wc_buffer;
+}
+
static int handle_responder(struct ibv_wc *wc, struct mlx5_cqe64 *cqe,
struct mlx5_qp *qp, struct mlx5_srq *srq)
{
@@ -262,6 +341,103 @@ static int handle_responder(struct ibv_wc *wc, struct mlx5_cqe64 *cqe,
return IBV_WC_SUCCESS;
}
+static inline int handle_responder_ex(struct ibv_wc_ex *wc_ex,
+ union wc_buffer *pwc_buffer,
+ struct mlx5_cqe64 *cqe,
+ struct mlx5_qp *qp, struct mlx5_srq *srq,
+ uint64_t wc_flags, uint32_t qpn)
+{
+ uint16_t wqe_ctr;
+ struct mlx5_wq *wq;
+ uint8_t g;
+ union wc_buffer wc_buffer = *pwc_buffer;
+ int err = 0;
+ uint32_t byte_len = ntohl(cqe->byte_cnt);
+
+ if (wc_flags & IBV_WC_EX_WITH_BYTE_LEN) {
+ *wc_buffer.b32++ = byte_len;
+ wc_ex->wc_flags |= IBV_WC_EX_WITH_BYTE_LEN;
+ }
+ if (srq) {
+ wqe_ctr = ntohs(cqe->wqe_counter);
+ wc_ex->wr_id = srq->wrid[wqe_ctr];
+ mlx5_free_srq_wqe(srq, wqe_ctr);
+ if (cqe->op_own & MLX5_INLINE_SCATTER_32)
+ err = mlx5_copy_to_recv_srq(srq, wqe_ctr, cqe,
+ byte_len);
+ else if (cqe->op_own & MLX5_INLINE_SCATTER_64)
+ err = mlx5_copy_to_recv_srq(srq, wqe_ctr, cqe - 1,
+ byte_len);
+ } else {
+ wq = &qp->rq;
+ wqe_ctr = wq->tail & (wq->wqe_cnt - 1);
+ wc_ex->wr_id = wq->wrid[wqe_ctr];
+ ++wq->tail;
+ if (cqe->op_own & MLX5_INLINE_SCATTER_32)
+ err = mlx5_copy_to_recv_wqe(qp, wqe_ctr, cqe,
+ byte_len);
+ else if (cqe->op_own & MLX5_INLINE_SCATTER_64)
+ err = mlx5_copy_to_recv_wqe(qp, wqe_ctr, cqe - 1,
+ byte_len);
+ }
+ if (err)
+ return err;
+
+ switch (cqe->op_own >> 4) {
+ case MLX5_CQE_RESP_WR_IMM:
+ wc_ex->opcode = IBV_WC_RECV_RDMA_WITH_IMM;
+ wc_ex->wc_flags |= IBV_WC_EX_IMM;
+ if (wc_flags & IBV_WC_EX_WITH_IMM) {
+ *wc_buffer.b32++ = ntohl(cqe->imm_inval_pkey);
+ wc_ex->wc_flags |= IBV_WC_EX_WITH_IMM;
+ }
+ break;
+ case MLX5_CQE_RESP_SEND:
+ wc_ex->opcode = IBV_WC_RECV;
+ if (wc_flags & IBV_WC_EX_WITH_IMM)
+ wc_buffer.b32++;
+ break;
+ case MLX5_CQE_RESP_SEND_IMM:
+ wc_ex->opcode = IBV_WC_RECV;
+ wc_ex->wc_flags |= IBV_WC_EX_IMM;
+ if (wc_flags & IBV_WC_EX_WITH_IMM) {
+ *wc_buffer.b32++ = ntohl(cqe->imm_inval_pkey);
+ wc_ex->wc_flags |= IBV_WC_EX_WITH_IMM;
+ }
+ break;
+ }
+ if (wc_flags & IBV_WC_EX_WITH_QP_NUM) {
+ *wc_buffer.b32++ = qpn;
+ wc_ex->wc_flags |= IBV_WC_EX_WITH_QP_NUM;
+ }
+ if (wc_flags & IBV_WC_EX_WITH_SRC_QP) {
+ *wc_buffer.b32++ = ntohl(cqe->flags_rqpn) & 0xffffff;
+ wc_ex->wc_flags |= IBV_WC_EX_WITH_SRC_QP;
+ }
+ if (wc_flags & IBV_WC_EX_WITH_PKEY_INDEX) {
+ *wc_buffer.b16++ = ntohl(cqe->imm_inval_pkey) & 0xffff;
+ wc_ex->wc_flags |= IBV_WC_EX_WITH_PKEY_INDEX;
+ }
+ if (wc_flags & IBV_WC_EX_WITH_SLID) {
+ *wc_buffer.b16++ = ntohs(cqe->slid);
+ wc_ex->wc_flags |= IBV_WC_EX_WITH_SLID;
+ }
+ if (wc_flags & IBV_WC_EX_WITH_SL) {
+ *wc_buffer.b8++ = (ntohl(cqe->flags_rqpn) >> 24) & 0xf;
+ wc_ex->wc_flags |= IBV_WC_EX_WITH_SL;
+ }
+ if (wc_flags & IBV_WC_EX_WITH_DLID_PATH_BITS) {
+ *wc_buffer.b8++ = cqe->ml_path & 0x7f;
+ wc_ex->wc_flags |= IBV_WC_EX_WITH_DLID_PATH_BITS;
+ }
+
+ g = (ntohl(cqe->flags_rqpn) >> 28) & 3;
+ wc_ex->wc_flags |= g ? IBV_WC_EX_GRH : 0;
+
+ *pwc_buffer = wc_buffer;
+ return IBV_WC_SUCCESS;
+}
+
static void dump_cqe(FILE *fp, void *buf)
{
uint32_t *p = buf;
@@ -273,54 +449,55 @@ static void dump_cqe(FILE *fp, void *buf)
}
static void mlx5_handle_error_cqe(struct mlx5_err_cqe *cqe,
- struct ibv_wc *wc)
+ uint32_t *pwc_status,
+ uint32_t *pwc_vendor_err)
{
switch (cqe->syndrome) {
case MLX5_CQE_SYNDROME_LOCAL_LENGTH_ERR:
- wc->status = IBV_WC_LOC_LEN_ERR;
+ *pwc_status = IBV_WC_LOC_LEN_ERR;
break;
case MLX5_CQE_SYNDROME_LOCAL_QP_OP_ERR:
- wc->status = IBV_WC_LOC_QP_OP_ERR;
+ *pwc_status = IBV_WC_LOC_QP_OP_ERR;
break;
case MLX5_CQE_SYNDROME_LOCAL_PROT_ERR:
- wc->status = IBV_WC_LOC_PROT_ERR;
+ *pwc_status = IBV_WC_LOC_PROT_ERR;
break;
case MLX5_CQE_SYNDROME_WR_FLUSH_ERR:
- wc->status = IBV_WC_WR_FLUSH_ERR;
+ *pwc_status = IBV_WC_WR_FLUSH_ERR;
break;
case MLX5_CQE_SYNDROME_MW_BIND_ERR:
- wc->status = IBV_WC_MW_BIND_ERR;
+ *pwc_status = IBV_WC_MW_BIND_ERR;
break;
case MLX5_CQE_SYNDROME_BAD_RESP_ERR:
- wc->status = IBV_WC_BAD_RESP_ERR;
+ *pwc_status = IBV_WC_BAD_RESP_ERR;
break;
case MLX5_CQE_SYNDROME_LOCAL_ACCESS_ERR:
- wc->status = IBV_WC_LOC_ACCESS_ERR;
+ *pwc_status = IBV_WC_LOC_ACCESS_ERR;
break;
case MLX5_CQE_SYNDROME_REMOTE_INVAL_REQ_ERR:
- wc->status = IBV_WC_REM_INV_REQ_ERR;
+ *pwc_status = IBV_WC_REM_INV_REQ_ERR;
break;
case MLX5_CQE_SYNDROME_REMOTE_ACCESS_ERR:
- wc->status = IBV_WC_REM_ACCESS_ERR;
+ *pwc_status = IBV_WC_REM_ACCESS_ERR;
break;
case MLX5_CQE_SYNDROME_REMOTE_OP_ERR:
- wc->status = IBV_WC_REM_OP_ERR;
+ *pwc_status = IBV_WC_REM_OP_ERR;
break;
case MLX5_CQE_SYNDROME_TRANSPORT_RETRY_EXC_ERR:
- wc->status = IBV_WC_RETRY_EXC_ERR;
+ *pwc_status = IBV_WC_RETRY_EXC_ERR;
break;
case MLX5_CQE_SYNDROME_RNR_RETRY_EXC_ERR:
- wc->status = IBV_WC_RNR_RETRY_EXC_ERR;
+ *pwc_status = IBV_WC_RNR_RETRY_EXC_ERR;
break;
case MLX5_CQE_SYNDROME_REMOTE_ABORTED_ERR:
- wc->status = IBV_WC_REM_ABORT_ERR;
+ *pwc_status = IBV_WC_REM_ABORT_ERR;
break;
default:
- wc->status = IBV_WC_GENERAL_ERR;
+ *pwc_status = IBV_WC_GENERAL_ERR;
break;
}
- wc->vendor_err = cqe->vendor_err_synd;
+ *pwc_vendor_err = cqe->vendor_err_synd;
}
#if defined(__x86_64__) || defined (__i386__)
@@ -453,6 +630,171 @@ static inline int get_srq_ctx(struct mlx5_context *mctx,
return CQ_OK;
}
+static inline void dump_cqe_debug(FILE *fp, struct mlx5_cqe64 *cqe64)
+ __attribute__((always_inline));
+static inline void dump_cqe_debug(FILE *fp, struct mlx5_cqe64 *cqe64)
+{
+#ifdef MLX5_DEBUG
+ if (mlx5_debug_mask & MLX5_DBG_CQ_CQE) {
+ mlx5_dbg(fp, MLX5_DBG_CQ_CQE, "dump cqe:\n");
+ dump_cqe(fp, cqe64);
+ }
+#endif
+}
+
+inline int mlx5_poll_one_cqe_req(struct mlx5_cq *cq,
+ struct mlx5_resource **cur_rsc,
+ void *cqe, uint32_t qpn, int cqe_ver,
+ uint64_t *wr_id) __attribute__((always_inline));
+inline int mlx5_poll_one_cqe_req(struct mlx5_cq *cq,
+ struct mlx5_resource **cur_rsc,
+ void *cqe, uint32_t qpn, int cqe_ver,
+ uint64_t *wr_id)
+{
+ struct mlx5_context *mctx = to_mctx(cq->ibv_cq.context);
+ struct mlx5_qp *mqp = NULL;
+ struct mlx5_cqe64 *cqe64 = (cq->cqe_sz == 64) ? cqe : cqe + 64;
+ uint32_t byte_len = ntohl(cqe64->byte_cnt);
+ struct mlx5_wq *wq;
+ uint16_t wqe_ctr;
+ int err;
+ int idx;
+
+ mqp = get_req_context(mctx, cur_rsc,
+ (cqe_ver ? (ntohl(cqe64->srqn_uidx) & 0xffffff) : qpn),
+ cqe_ver);
+ if (unlikely(!mqp))
+ return CQ_POLL_ERR;
+ wq = &mqp->sq;
+ wqe_ctr = ntohs(cqe64->wqe_counter);
+ idx = wqe_ctr & (wq->wqe_cnt - 1);
+ if (cqe64->op_own & MLX5_INLINE_SCATTER_32)
+ err = mlx5_copy_to_send_wqe(mqp, wqe_ctr, cqe,
+ byte_len);
+ else if (cqe64->op_own & MLX5_INLINE_SCATTER_64)
+ err = mlx5_copy_to_send_wqe(mqp, wqe_ctr, cqe - 1,
+ byte_len);
+ else
+ err = 0;
+
+ wq->tail = wq->wqe_head[idx] + 1;
+ *wr_id = wq->wrid[idx];
+
+ return err;
+}
+
+inline int mlx5_poll_one_cqe_resp(struct mlx5_context *mctx,
+ struct mlx5_resource **cur_rsc,
+ struct mlx5_srq **cur_srq,
+ struct mlx5_cqe64 *cqe64, int cqe_ver,
+ uint32_t qpn, int *is_srq)
+ __attribute__((always_inline));
+inline int mlx5_poll_one_cqe_resp(struct mlx5_context *mctx,
+ struct mlx5_resource **cur_rsc,
+ struct mlx5_srq **cur_srq,
+ struct mlx5_cqe64 *cqe64, int cqe_ver,
+ uint32_t qpn, int *is_srq)
+{
+ uint32_t srqn_uidx = ntohl(cqe64->srqn_uidx) & 0xffffff;
+ int err;
+
+ if (cqe_ver) {
+ err = get_resp_cxt_v1(mctx, cur_rsc, cur_srq, srqn_uidx, is_srq);
+ } else {
+ if (srqn_uidx) {
+ err = get_srq_ctx(mctx, cur_srq, srqn_uidx);
+ *is_srq = 1;
+ } else {
+ err = get_resp_ctx(mctx, cur_rsc, qpn);
+ }
+ }
+
+ return err;
+}
+
+inline int mlx5_poll_one_cqe_err(struct mlx5_context *mctx,
+ struct mlx5_resource **cur_rsc,
+ struct mlx5_srq **cur_srq,
+ struct mlx5_cqe64 *cqe64, int cqe_ver,
+ uint32_t qpn, uint32_t *pwc_status,
+ uint32_t *pwc_vendor_err,
+ uint64_t *pwc_wr_id, uint8_t opcode)
+ __attribute__((always_inline));
+inline int mlx5_poll_one_cqe_err(struct mlx5_context *mctx,
+ struct mlx5_resource **cur_rsc,
+ struct mlx5_srq **cur_srq,
+ struct mlx5_cqe64 *cqe64, int cqe_ver,
+ uint32_t qpn, uint32_t *pwc_status,
+ uint32_t *pwc_vendor_err,
+ uint64_t *pwc_wr_id, uint8_t opcode)
+{
+ uint32_t srqn_uidx = ntohl(cqe64->srqn_uidx) & 0xffffff;
+ struct mlx5_err_cqe *ecqe = (struct mlx5_err_cqe *)cqe64;
+ int err = CQ_OK;
+
+ mlx5_handle_error_cqe(ecqe, pwc_status, pwc_vendor_err);
+ if (unlikely(ecqe->syndrome != MLX5_CQE_SYNDROME_WR_FLUSH_ERR &&
+ ecqe->syndrome != MLX5_CQE_SYNDROME_TRANSPORT_RETRY_EXC_ERR)) {
+ FILE *fp = mctx->dbg_fp;
+
+ fprintf(fp, PFX "%s: got completion with error:\n",
+ mctx->hostname);
+ dump_cqe(fp, ecqe);
+ if (mlx5_freeze_on_error_cqe) {
+ fprintf(fp, PFX "freezing at poll cq...");
+ while (1)
+ sleep(10);
+ }
+ }
+
+ if (opcode == MLX5_CQE_REQ_ERR) {
+ struct mlx5_qp *mqp = NULL;
+ struct mlx5_wq *wq;
+ uint16_t wqe_ctr;
+ int idx;
+
+ mqp = get_req_context(mctx, cur_rsc, (cqe_ver ? srqn_uidx : qpn), cqe_ver);
+ if (unlikely(!mqp))
+ return CQ_POLL_ERR;
+ wq = &mqp->sq;
+ wqe_ctr = ntohs(cqe64->wqe_counter);
+ idx = wqe_ctr & (wq->wqe_cnt - 1);
+ *pwc_wr_id = wq->wrid[idx];
+ wq->tail = wq->wqe_head[idx] + 1;
+ } else {
+ int is_srq = 0;
+
+ if (cqe_ver) {
+ err = get_resp_cxt_v1(mctx, cur_rsc, cur_srq, srqn_uidx, &is_srq);
+ } else {
+ if (srqn_uidx) {
+ err = get_srq_ctx(mctx, cur_srq, srqn_uidx);
+ is_srq = 1;
+ } else {
+ err = get_resp_ctx(mctx, cur_rsc, qpn);
+ }
+ }
+ if (unlikely(err))
+ return CQ_POLL_ERR;
+
+ if (is_srq) {
+ uint16_t wqe_ctr = ntohs(cqe64->wqe_counter);
+
+ *pwc_wr_id = (*cur_srq)->wrid[wqe_ctr];
+ mlx5_free_srq_wqe(*cur_srq, wqe_ctr);
+ } else {
+ struct mlx5_qp *mqp = rsc_to_mqp(*cur_rsc);
+ struct mlx5_wq *wq;
+
+ wq = &mqp->rq;
+ *pwc_wr_id = wq->wrid[wq->tail & (wq->wqe_cnt - 1)];
+ ++wq->tail;
+ }
+ }
+
+ return err;
+}
+
static inline int mlx5_poll_one(struct mlx5_cq *cq,
struct mlx5_resource **cur_rsc,
struct mlx5_srq **cur_srq,
@@ -464,17 +806,10 @@ static inline int mlx5_poll_one(struct mlx5_cq *cq,
struct ibv_wc *wc, int cqe_ver)
{
struct mlx5_cqe64 *cqe64;
- struct mlx5_wq *wq;
- uint16_t wqe_ctr;
void *cqe;
uint32_t qpn;
- uint32_t srqn_uidx;
- int idx;
uint8_t opcode;
- struct mlx5_err_cqe *ecqe;
int err;
- int is_srq = 0;
- struct mlx5_qp *mqp = NULL;
struct mlx5_context *mctx = to_mctx(cq->ibv_cq.context);
cqe = next_cqe_sw(cq);
@@ -494,137 +829,165 @@ static inline int mlx5_poll_one(struct mlx5_cq *cq,
*/
rmb();
-#ifdef MLX5_DEBUG
- if (mlx5_debug_mask & MLX5_DBG_CQ_CQE) {
- FILE *fp = mctx->dbg_fp;
-
- mlx5_dbg(fp, MLX5_DBG_CQ_CQE, "dump cqe for cqn 0x%x:\n", cq->cqn);
- dump_cqe(fp, cqe64);
- }
-#endif
+ dump_cqe_debug(mctx->dbg_fp, cqe64);
qpn = ntohl(cqe64->sop_drop_qpn) & 0xffffff;
wc->wc_flags = 0;
switch (opcode) {
case MLX5_CQE_REQ:
- mqp = get_req_context(mctx, cur_rsc,
- (cqe_ver ? (ntohl(cqe64->srqn_uidx) & 0xffffff) : qpn),
- cqe_ver);
- if (unlikely(!mqp))
- return CQ_POLL_ERR;
- wq = &mqp->sq;
- wqe_ctr = ntohs(cqe64->wqe_counter);
- idx = wqe_ctr & (wq->wqe_cnt - 1);
+ err = mlx5_poll_one_cqe_req(cq, cur_rsc, cqe, qpn, cqe_ver,
+ &wc->wr_id);
handle_good_req(wc, cqe64);
- if (cqe64->op_own & MLX5_INLINE_SCATTER_32)
- err = mlx5_copy_to_send_wqe(mqp, wqe_ctr, cqe,
- wc->byte_len);
- else if (cqe64->op_own & MLX5_INLINE_SCATTER_64)
- err = mlx5_copy_to_send_wqe(mqp, wqe_ctr, cqe - 1,
- wc->byte_len);
- else
- err = 0;
-
- wc->wr_id = wq->wrid[idx];
- wq->tail = wq->wqe_head[idx] + 1;
wc->status = err;
break;
+
case MLX5_CQE_RESP_WR_IMM:
case MLX5_CQE_RESP_SEND:
case MLX5_CQE_RESP_SEND_IMM:
- case MLX5_CQE_RESP_SEND_INV:
- srqn_uidx = ntohl(cqe64->srqn_uidx) & 0xffffff;
- if (cqe_ver) {
- err = get_resp_cxt_v1(mctx, cur_rsc, cur_srq, srqn_uidx, &is_srq);
- } else {
- if (srqn_uidx) {
- err = get_srq_ctx(mctx, cur_srq, srqn_uidx);
- is_srq = 1;
- } else {
- err = get_resp_ctx(mctx, cur_rsc, qpn);
- }
- }
+ case MLX5_CQE_RESP_SEND_INV: {
+ int is_srq;
+
+ err = mlx5_poll_one_cqe_resp(mctx, cur_rsc, cur_srq, cqe64,
+ cqe_ver, qpn, &is_srq);
if (unlikely(err))
return err;
wc->status = handle_responder(wc, cqe64, rsc_to_mqp(*cur_rsc),
is_srq ? *cur_srq : NULL);
break;
+ }
case MLX5_CQE_RESIZE_CQ:
break;
case MLX5_CQE_REQ_ERR:
case MLX5_CQE_RESP_ERR:
- srqn_uidx = ntohl(cqe64->srqn_uidx) & 0xffffff;
- ecqe = (struct mlx5_err_cqe *)cqe64;
- mlx5_handle_error_cqe(ecqe, wc);
- if (unlikely(ecqe->syndrome != MLX5_CQE_SYNDROME_WR_FLUSH_ERR &&
- ecqe->syndrome != MLX5_CQE_SYNDROME_TRANSPORT_RETRY_EXC_ERR)) {
- FILE *fp = mctx->dbg_fp;
- fprintf(fp, PFX "%s: got completion with error:\n",
- mctx->hostname);
- dump_cqe(fp, ecqe);
- if (mlx5_freeze_on_error_cqe) {
- fprintf(fp, PFX "freezing at poll cq...");
- while (1)
- sleep(10);
- }
- }
+ err = mlx5_poll_one_cqe_err(mctx, cur_rsc, cur_srq, cqe64,
+ cqe_ver, qpn, &wc->status,
+ &wc->vendor_err, &wc->wr_id,
+ opcode);
+ if (err != CQ_OK)
+ return err;
+ break;
+ }
- if (opcode == MLX5_CQE_REQ_ERR) {
- mqp = get_req_context(mctx, cur_rsc, (cqe_ver ? srqn_uidx : qpn), cqe_ver);
- if (unlikely(!mqp))
- return CQ_POLL_ERR;
- wq = &mqp->sq;
- wqe_ctr = ntohs(cqe64->wqe_counter);
- idx = wqe_ctr & (wq->wqe_cnt - 1);
- wc->wr_id = wq->wrid[idx];
- wq->tail = wq->wqe_head[idx] + 1;
- } else {
- if (cqe_ver) {
- err = get_resp_cxt_v1(mctx, cur_rsc, cur_srq, srqn_uidx, &is_srq);
- } else {
- if (srqn_uidx) {
- err = get_srq_ctx(mctx, cur_srq, srqn_uidx);
- is_srq = 1;
- } else {
- err = get_resp_ctx(mctx, cur_rsc, qpn);
- }
- }
- if (unlikely(err))
- return CQ_POLL_ERR;
+ wc->qp_num = qpn;
+ return CQ_OK;
+}
- if (is_srq) {
- wqe_ctr = ntohs(cqe64->wqe_counter);
- wc->wr_id = (*cur_srq)->wrid[wqe_ctr];
- mlx5_free_srq_wqe(*cur_srq, wqe_ctr);
- } else {
- mqp = rsc_to_mqp(*cur_rsc);
- wq = &mqp->rq;
- wc->wr_id = wq->wrid[wq->tail & (wq->wqe_cnt - 1)];
- ++wq->tail;
- }
- }
+inline int mlx5_poll_one_ex(struct mlx5_cq *cq,
+ struct mlx5_resource **cur_rsc,
+ struct mlx5_srq **cur_srq,
+ struct ibv_wc_ex **pwc_ex, uint64_t wc_flags,
+ int cqe_ver)
+{
+ struct mlx5_cqe64 *cqe64;
+ void *cqe;
+ uint32_t qpn;
+ uint8_t opcode;
+ int err;
+ struct mlx5_context *mctx = to_mctx(cq->ibv_cq.context);
+ struct ibv_wc_ex *wc_ex = *pwc_ex;
+ union wc_buffer wc_buffer;
+
+ cqe = next_cqe_sw(cq);
+ if (!cqe)
+ return CQ_EMPTY;
+
+ cqe64 = (cq->cqe_sz == 64) ? cqe : cqe + 64;
+
+ opcode = cqe64->op_own >> 4;
+ ++cq->cons_index;
+
+ VALGRIND_MAKE_MEM_DEFINED(cqe64, sizeof *cqe64);
+
+ /*
+ * Make sure we read CQ entry contents after we've checked the
+ * ownership bit.
+ */
+ rmb();
+
+ dump_cqe_debug(mctx->dbg_fp, cqe64);
+
+ qpn = ntohl(cqe64->sop_drop_qpn) & 0xffffff;
+ wc_buffer.b64 = (uint64_t *)&wc_ex->buffer;
+ wc_ex->wc_flags = 0;
+ wc_ex->reserved = 0;
+
+ switch (opcode) {
+ case MLX5_CQE_REQ:
+ err = mlx5_poll_one_cqe_req(cq, cur_rsc, cqe, qpn, cqe_ver,
+ &wc_ex->wr_id);
+ handle_good_req_ex(wc_ex, &wc_buffer, cqe64, wc_flags, qpn);
+ wc_ex->status = err;
+ if (wc_flags & IBV_WC_EX_WITH_SRC_QP)
+ wc_buffer.b32++;
+ if (wc_flags & IBV_WC_EX_WITH_PKEY_INDEX)
+ wc_buffer.b16++;
+ if (wc_flags & IBV_WC_EX_WITH_SLID)
+ wc_buffer.b16++;
+ if (wc_flags & IBV_WC_EX_WITH_SL)
+ wc_buffer.b8++;
+ if (wc_flags & IBV_WC_EX_WITH_DLID_PATH_BITS)
+ wc_buffer.b8++;
+ break;
+
+ case MLX5_CQE_RESP_WR_IMM:
+ case MLX5_CQE_RESP_SEND:
+ case MLX5_CQE_RESP_SEND_IMM:
+ case MLX5_CQE_RESP_SEND_INV: {
+ int is_srq;
+
+ err = mlx5_poll_one_cqe_resp(mctx, cur_rsc, cur_srq, cqe64,
+ cqe_ver, qpn, &is_srq);
+ if (unlikely(err))
+ return err;
+
+ wc_ex->status = handle_responder_ex(wc_ex, &wc_buffer, cqe64,
+ rsc_to_mqp(*cur_rsc),
+ is_srq ? *cur_srq : NULL,
+ wc_flags, qpn);
break;
}
+ case MLX5_CQE_REQ_ERR:
+ case MLX5_CQE_RESP_ERR:
+ err = mlx5_poll_one_cqe_err(mctx, cur_rsc, cur_srq, cqe64,
+ cqe_ver, qpn, &wc_ex->status,
+ &wc_ex->vendor_err, &wc_ex->wr_id,
+ opcode);
+ if (err != CQ_OK)
+ return err;
- wc->qp_num = qpn;
+ case MLX5_CQE_RESIZE_CQ:
+ if (wc_flags & IBV_WC_EX_WITH_BYTE_LEN)
+ wc_buffer.b32++;
+ if (wc_flags & IBV_WC_EX_WITH_IMM)
+ wc_buffer.b32++;
+ if (wc_flags & IBV_WC_EX_WITH_QP_NUM) {
+ *wc_buffer.b32++ = qpn;
+ wc_ex->wc_flags |= IBV_WC_EX_WITH_QP_NUM;
+ }
+ if (wc_flags & IBV_WC_EX_WITH_SRC_QP)
+ wc_buffer.b32++;
+ if (wc_flags & IBV_WC_EX_WITH_PKEY_INDEX)
+ wc_buffer.b16++;
+ if (wc_flags & IBV_WC_EX_WITH_SLID)
+ wc_buffer.b16++;
+ if (wc_flags & IBV_WC_EX_WITH_SL)
+ wc_buffer.b8++;
+ if (wc_flags & IBV_WC_EX_WITH_DLID_PATH_BITS)
+ wc_buffer.b8++;
+ break;
+ }
+ *pwc_ex = (struct ibv_wc_ex *)((uintptr_t)(wc_buffer.b8 + sizeof(uint64_t) - 1) &
+ ~(sizeof(uint64_t) - 1));
return CQ_OK;
}
-static inline int poll_cq(struct ibv_cq *ibcq, int ne,
- struct ibv_wc *wc, int cqe_ver)
- __attribute__((always_inline));
-static inline int poll_cq(struct ibv_cq *ibcq, int ne,
- struct ibv_wc *wc, int cqe_ver)
+static inline void mlx5_poll_cq_stall_start(struct mlx5_cq *cq)
+__attribute__((always_inline));
+static inline void mlx5_poll_cq_stall_start(struct mlx5_cq *cq)
{
- struct mlx5_cq *cq = to_mcq(ibcq);
- struct mlx5_resource *rsc = NULL;
- struct mlx5_srq *srq = NULL;
- int npolled;
- int err = CQ_OK;
-
if (cq->stall_enable) {
if (cq->stall_adaptive_enable) {
if (cq->stall_last_count)
@@ -634,19 +997,13 @@ static inline int poll_cq(struct ibv_cq *ibcq, int ne,
mlx5_stall_poll_cq();
}
}
+}
- mlx5_spin_lock(&cq->lock);
-
- for (npolled = 0; npolled < ne; ++npolled) {
- err = mlx5_poll_one(cq, &rsc, &srq, wc + npolled, cqe_ver);
- if (err != CQ_OK)
- break;
- }
-
- update_cons_index(cq);
-
- mlx5_spin_unlock(&cq->lock);
-
+static inline void mlx5_poll_cq_stall_end(struct mlx5_cq *cq, int ne,
+ int npolled, int err) __attribute__((always_inline));
+static inline void mlx5_poll_cq_stall_end(struct mlx5_cq *cq, int ne,
+ int npolled, int err)
+{
if (cq->stall_enable) {
if (cq->stall_adaptive_enable) {
if (npolled == 0) {
@@ -666,6 +1023,34 @@ static inline int poll_cq(struct ibv_cq *ibcq, int ne,
cq->stall_next_poll = 1;
}
}
+}
+
+static inline int poll_cq(struct ibv_cq *ibcq, int ne,
+ struct ibv_wc *wc, int cqe_ver)
+ __attribute__((always_inline));
+static inline int poll_cq(struct ibv_cq *ibcq, int ne,
+ struct ibv_wc *wc, int cqe_ver)
+{
+ struct mlx5_cq *cq = to_mcq(ibcq);
+ struct mlx5_resource *rsc = NULL;
+ struct mlx5_srq *srq = NULL;
+ int npolled;
+ int err = CQ_OK;
+
+ mlx5_poll_cq_stall_start(cq);
+ mlx5_spin_lock(&cq->lock);
+
+ for (npolled = 0; npolled < ne; ++npolled) {
+ err = mlx5_poll_one(cq, &rsc, &srq, wc + npolled, cqe_ver);
+ if (err != CQ_OK)
+ break;
+ }
+
+ update_cons_index(cq);
+
+ mlx5_spin_unlock(&cq->lock);
+
+ mlx5_poll_cq_stall_end(cq, ne, npolled, err);
return err == CQ_POLL_ERR ? err : npolled;
}
@@ -680,6 +1065,52 @@ int mlx5_poll_cq_v1(struct ibv_cq *ibcq, int ne, struct ibv_wc *wc)
return poll_cq(ibcq, ne, wc, 1);
}
+static inline int poll_cq_ex(struct ibv_cq *ibcq, struct ibv_wc_ex *wc,
+ struct ibv_poll_cq_ex_attr *attr, int cqe_ver)
+{
+ struct mlx5_cq *cq = to_mcq(ibcq);
+ struct mlx5_resource *rsc = NULL;
+ struct mlx5_srq *srq = NULL;
+ int npolled;
+ int err = CQ_OK;
+ int (*poll_fn)(struct mlx5_cq *cq, struct mlx5_resource **rsc,
+ struct mlx5_srq **cur_srq,
+ struct ibv_wc_ex **pwc_ex, uint64_t wc_flags,
+ int cqe_ver) =
+ cq->poll_one;
+ uint64_t wc_flags = cq->wc_flags;
+ unsigned int ne = attr->max_entries;
+
+ mlx5_poll_cq_stall_start(cq);
+ mlx5_spin_lock(&cq->lock);
+
+ for (npolled = 0; npolled < ne; ++npolled) {
+ err = poll_fn(cq, &rsc, &srq, &wc, wc_flags, cqe_ver);
+ if (err != CQ_OK)
+ break;
+ }
+
+ update_cons_index(cq);
+
+ mlx5_spin_unlock(&cq->lock);
+
+ mlx5_poll_cq_stall_end(cq, ne, npolled, err);
+
+ return err == CQ_POLL_ERR ? err : npolled;
+}
+
+int mlx5_poll_cq_ex(struct ibv_cq *ibcq, struct ibv_wc_ex *wc,
+ struct ibv_poll_cq_ex_attr *attr)
+{
+ return poll_cq_ex(ibcq, wc, attr, 0);
+}
+
+int mlx5_poll_cq_v1_ex(struct ibv_cq *ibcq, struct ibv_wc_ex *wc,
+ struct ibv_poll_cq_ex_attr *attr)
+{
+ return poll_cq_ex(ibcq, wc, attr, 1);
+}
+
int mlx5_arm_cq(struct ibv_cq *ibvcq, int solicited)
{
struct mlx5_cq *cq = to_mcq(ibvcq);
diff --git a/src/mlx5.c b/src/mlx5.c
index 5e9b61c..eac332b 100644
--- a/src/mlx5.c
+++ b/src/mlx5.c
@@ -664,6 +664,11 @@ static int mlx5_init_context(struct verbs_device *vdev,
verbs_set_ctx_op(v_ctx, create_srq_ex, mlx5_create_srq_ex);
verbs_set_ctx_op(v_ctx, get_srq_num, mlx5_get_srq_num);
verbs_set_ctx_op(v_ctx, query_device_ex, mlx5_query_device_ex);
+ if (context->cqe_version == 1)
+ verbs_set_ctx_op(v_ctx, poll_cq_ex, mlx5_poll_cq_v1_ex);
+ else
+ verbs_set_ctx_op(v_ctx, poll_cq_ex, mlx5_poll_cq_ex);
+
return 0;
diff --git a/src/mlx5.h b/src/mlx5.h
index 325e07b..e27e79c 100644
--- a/src/mlx5.h
+++ b/src/mlx5.h
@@ -349,6 +349,11 @@ enum {
struct mlx5_cq {
struct ibv_cq ibv_cq;
+ uint64_t wc_flags;
+ int (*poll_one)(struct mlx5_cq *cq, struct mlx5_resource **cur_rsc,
+ struct mlx5_srq **cur_srq,
+ struct ibv_wc_ex **pwc_ex, uint64_t wc_flags,
+ int cqe_ver);
struct mlx5_buf buf_a;
struct mlx5_buf buf_b;
struct mlx5_buf *active_buf;
@@ -603,6 +608,15 @@ int mlx5_dereg_mr(struct ibv_mr *mr);
struct ibv_cq *mlx5_create_cq(struct ibv_context *context, int cqe,
struct ibv_comp_channel *channel,
int comp_vector);
+int mlx5_poll_cq_ex(struct ibv_cq *ibcq, struct ibv_wc_ex *wc,
+ struct ibv_poll_cq_ex_attr *attr);
+int mlx5_poll_cq_v1_ex(struct ibv_cq *ibcq, struct ibv_wc_ex *wc,
+ struct ibv_poll_cq_ex_attr *attr);
+int mlx5_poll_one_ex(struct mlx5_cq *cq,
+ struct mlx5_resource **cur_rsc,
+ struct mlx5_srq **cur_srq,
+ struct ibv_wc_ex **pwc_ex, uint64_t wc_flags,
+ int cqe_ver);
int mlx5_alloc_cq_buf(struct mlx5_context *mctx, struct mlx5_cq *cq,
struct mlx5_buf *buf, int nent, int cqe_sz);
int mlx5_free_cq_buf(struct mlx5_context *ctx, struct mlx5_buf *buf);
--
2.1.0
* [PATCH libmlx5 3/7] Add timestamp support for ibv_poll_cq_ex
[not found] ` <1447590634-12858-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2015-11-15 12:30 ` [PATCH libmlx5 1/7] Add timestamp support for query_device_ex Matan Barak
2015-11-15 12:30 ` [PATCH libmlx5 2/7] Add ibv_poll_cq_ex support Matan Barak
@ 2015-11-15 12:30 ` Matan Barak
2015-11-15 12:30 ` [PATCH libmlx5 4/7] Add ibv_create_cq_ex support Matan Barak
` (5 subsequent siblings)
8 siblings, 0 replies; 13+ messages in thread
From: Matan Barak @ 2015-11-15 12:30 UTC (permalink / raw)
To: Eli Cohen
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Doug Ledford, Matan Barak,
Eran Ben Elisha, Christoph Lameter
Add support for filling the timestamp field in ibv_poll_cq_ex
(if the user requested it).
Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
src/cq.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/src/cq.c b/src/cq.c
index 0185696..5e06990 100644
--- a/src/cq.c
+++ b/src/cq.c
@@ -913,6 +913,11 @@ inline int mlx5_poll_one_ex(struct mlx5_cq *cq,
wc_ex->wc_flags = 0;
wc_ex->reserved = 0;
+ if (wc_flags & IBV_WC_EX_WITH_COMPLETION_TIMESTAMP) {
+ *wc_buffer.b64++ = ntohll(cqe64->timestamp);
+ wc_ex->wc_flags |= IBV_WC_EX_WITH_COMPLETION_TIMESTAMP;
+ }
+
switch (opcode) {
case MLX5_CQE_REQ:
err = mlx5_poll_one_cqe_req(cq, cur_rsc, cqe, qpn, cqe_ver,
--
2.1.0
* [PATCH libmlx5 4/7] Add ibv_create_cq_ex support
[not found] ` <1447590634-12858-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
` (2 preceding siblings ...)
2015-11-15 12:30 ` [PATCH libmlx5 3/7] Add timestamp support for ibv_poll_cq_ex Matan Barak
@ 2015-11-15 12:30 ` Matan Barak
2015-11-15 12:30 ` [PATCH libmlx5 5/7] Add ibv_query_values support Matan Barak
` (4 subsequent siblings)
8 siblings, 0 replies; 13+ messages in thread
From: Matan Barak @ 2015-11-15 12:30 UTC (permalink / raw)
To: Eli Cohen
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Doug Ledford, Matan Barak,
Eran Ben Elisha, Christoph Lameter
In order to create a CQ that supports timestamps, the user needs
to specify the timestamp flag for ibv_create_cq_ex.
Add support for ibv_create_cq_ex to the mlx5 vendor library.
Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
src/mlx5.c | 1 +
src/mlx5.h | 2 ++
src/verbs.c | 68 +++++++++++++++++++++++++++++++++++++++++++++++++++++--------
3 files changed, 63 insertions(+), 8 deletions(-)
diff --git a/src/mlx5.c b/src/mlx5.c
index eac332b..229d99d 100644
--- a/src/mlx5.c
+++ b/src/mlx5.c
@@ -664,6 +664,7 @@ static int mlx5_init_context(struct verbs_device *vdev,
verbs_set_ctx_op(v_ctx, create_srq_ex, mlx5_create_srq_ex);
verbs_set_ctx_op(v_ctx, get_srq_num, mlx5_get_srq_num);
verbs_set_ctx_op(v_ctx, query_device_ex, mlx5_query_device_ex);
+ verbs_set_ctx_op(v_ctx, create_cq_ex, mlx5_create_cq_ex);
if (context->cqe_version && context->cqe_version == 1)
verbs_set_ctx_op(v_ctx, poll_cq_ex, mlx5_poll_cq_v1_ex);
else
diff --git a/src/mlx5.h b/src/mlx5.h
index e27e79c..66dc4a9 100644
--- a/src/mlx5.h
+++ b/src/mlx5.h
@@ -608,6 +608,8 @@ int mlx5_dereg_mr(struct ibv_mr *mr);
struct ibv_cq *mlx5_create_cq(struct ibv_context *context, int cqe,
struct ibv_comp_channel *channel,
int comp_vector);
+struct ibv_cq *mlx5_create_cq_ex(struct ibv_context *context,
+ struct ibv_create_cq_attr_ex *cq_attr);
int mlx5_poll_cq_ex(struct ibv_cq *ibcq, struct ibv_wc_ex *wc,
struct ibv_poll_cq_ex_attr *attr);
int mlx5_poll_cq_v1_ex(struct ibv_cq *ibcq, struct ibv_wc_ex *wc,
diff --git a/src/verbs.c b/src/verbs.c
index 4c054f1..76885f3 100644
--- a/src/verbs.c
+++ b/src/verbs.c
@@ -240,9 +240,21 @@ static int qp_sig_enabled(void)
return 0;
}
-struct ibv_cq *mlx5_create_cq(struct ibv_context *context, int cqe,
- struct ibv_comp_channel *channel,
- int comp_vector)
+enum {
+ CREATE_CQ_SUPPORTED_WC_FLAGS = IBV_WC_STANDARD_FLAGS |
+ IBV_WC_EX_WITH_COMPLETION_TIMESTAMP
+};
+
+enum {
+ CREATE_CQ_SUPPORTED_COMP_MASK = IBV_CREATE_CQ_ATTR_FLAGS
+};
+
+enum {
+ CREATE_CQ_SUPPORTED_FLAGS = IBV_CREATE_CQ_ATTR_COMPLETION_TIMESTAMP
+};
+
+static struct ibv_cq *create_cq(struct ibv_context *context,
+ const struct ibv_create_cq_attr_ex *cq_attr)
{
struct mlx5_create_cq cmd;
struct mlx5_create_cq_resp resp;
@@ -254,12 +266,31 @@ struct ibv_cq *mlx5_create_cq(struct ibv_context *context, int cqe,
FILE *fp = to_mctx(context)->dbg_fp;
#endif
- if (!cqe) {
+ if (!cq_attr->cqe) {
+ mlx5_dbg(fp, MLX5_DBG_CQ, "\n");
+ errno = EINVAL;
+ return NULL;
+ }
+
+ if (cq_attr->comp_mask & ~CREATE_CQ_SUPPORTED_COMP_MASK) {
+ mlx5_dbg(fp, MLX5_DBG_CQ, "\n");
+ errno = EINVAL;
+ return NULL;
+ }
+
+ if (cq_attr->comp_mask & IBV_CREATE_CQ_ATTR_FLAGS &&
+ cq_attr->flags & ~CREATE_CQ_SUPPORTED_FLAGS) {
mlx5_dbg(fp, MLX5_DBG_CQ, "\n");
errno = EINVAL;
return NULL;
}
+ if (cq_attr->wc_flags & ~CREATE_CQ_SUPPORTED_WC_FLAGS) {
+ mlx5_dbg(fp, MLX5_DBG_CQ, "\n");
+ errno = ENOTSUP;
+ return NULL;
+ }
+
cq = calloc(1, sizeof *cq);
if (!cq) {
mlx5_dbg(fp, MLX5_DBG_CQ, "\n");
@@ -273,14 +304,14 @@ struct ibv_cq *mlx5_create_cq(struct ibv_context *context, int cqe,
goto err;
/* The additional entry is required for resize CQ */
- if (cqe <= 0) {
+ if (cq_attr->cqe <= 0) {
mlx5_dbg(fp, MLX5_DBG_CQ, "\n");
errno = EINVAL;
goto err_spl;
}
- ncqe = align_queue_size(cqe + 1);
- if ((ncqe > (1 << 24)) || (ncqe < (cqe + 1))) {
+ ncqe = align_queue_size(cq_attr->cqe + 1);
+ if ((ncqe > (1 << 24)) || (ncqe < (cq_attr->cqe + 1))) {
mlx5_dbg(fp, MLX5_DBG_CQ, "ncqe %d\n", ncqe);
errno = EINVAL;
goto err_spl;
@@ -313,7 +344,8 @@ struct ibv_cq *mlx5_create_cq(struct ibv_context *context, int cqe,
cmd.db_addr = (uintptr_t) cq->dbrec;
cmd.cqe_size = cqe_sz;
- ret = ibv_cmd_create_cq(context, ncqe - 1, channel, comp_vector,
+ ret = ibv_cmd_create_cq(context, ncqe - 1, cq_attr->channel,
+ cq_attr->comp_vector,
&cq->ibv_cq, &cmd.ibv_cmd, sizeof cmd,
&resp.ibv_resp, sizeof resp);
if (ret) {
@@ -328,6 +360,9 @@ struct ibv_cq *mlx5_create_cq(struct ibv_context *context, int cqe,
cq->stall_adaptive_enable = to_mctx(context)->stall_adaptive_enable;
cq->stall_cycles = to_mctx(context)->stall_cycles;
+ cq->wc_flags = cq_attr->wc_flags;
+ cq->poll_one = mlx5_poll_one_ex;
+
return &cq->ibv_cq;
err_db:
@@ -345,6 +380,23 @@ err:
return NULL;
}
+struct ibv_cq *mlx5_create_cq(struct ibv_context *context, int cqe,
+ struct ibv_comp_channel *channel,
+ int comp_vector)
+{
+ struct ibv_create_cq_attr_ex cq_attr = {.cqe = cqe, .channel = channel,
+ .comp_vector = comp_vector,
+ .wc_flags = IBV_WC_STANDARD_FLAGS};
+
+ return create_cq(context, &cq_attr);
+}
+
+struct ibv_cq *mlx5_create_cq_ex(struct ibv_context *context,
+ struct ibv_create_cq_attr_ex *cq_attr)
+{
+ return create_cq(context, cq_attr);
+}
+
int mlx5_resize_cq(struct ibv_cq *ibcq, int cqe)
{
struct mlx5_cq *cq = to_mcq(ibcq);
--
2.1.0
* [PATCH libmlx5 5/7] Add ibv_query_values support
[not found] ` <1447590634-12858-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
` (3 preceding siblings ...)
2015-11-15 12:30 ` [PATCH libmlx5 4/7] Add ibv_create_cq_ex support Matan Barak
@ 2015-11-15 12:30 ` Matan Barak
2015-11-15 12:30 ` [PATCH libmlx5 6/7] Optimize poll_cq Matan Barak
` (3 subsequent siblings)
8 siblings, 0 replies; 13+ messages in thread
From: Matan Barak @ 2015-11-15 12:30 UTC (permalink / raw)
To: Eli Cohen
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Doug Ledford, Matan Barak,
Eran Ben Elisha, Christoph Lameter
In order to query the HCA's current core clock, libmlx5 needs to
support the ibv_query_values verb. The hardware's cycles register
is queried by mmaping it into user-space; therefore, libmlx5 maps
the cycles register when it initializes.
This assumes the machine's architecture places PCI and memory in
the same address space.
Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
src/mlx5.c | 38 ++++++++++++++++++++++++++++++++++++++
src/mlx5.h | 6 +++++-
src/verbs.c | 46 ++++++++++++++++++++++++++++++++++++++++++++++
3 files changed, 89 insertions(+), 1 deletion(-)
diff --git a/src/mlx5.c b/src/mlx5.c
index 229d99d..81d1da2 100644
--- a/src/mlx5.c
+++ b/src/mlx5.c
@@ -524,6 +524,30 @@ static int single_threaded_app(void)
return 0;
}
+static int mlx5_map_internal_clock(struct mlx5_device *mdev,
+ struct ibv_context *ibv_ctx)
+{
+ struct mlx5_context *context = to_mctx(ibv_ctx);
+ void *hca_clock_page;
+ off_t offset = 0;
+
+ set_command(MLX5_MMAP_GET_CORE_CLOCK_CMD, &offset);
+ hca_clock_page = mmap(NULL, mdev->page_size,
+ PROT_READ, MAP_SHARED, ibv_ctx->cmd_fd,
+ mdev->page_size * offset);
+
+ if (hca_clock_page == MAP_FAILED) {
+ fprintf(stderr, PFX
+ "Warning: Timestamp available,\n"
+ "but failed to mmap() hca core clock page.\n");
+ return -1;
+ }
+
+ context->hca_core_clock = hca_clock_page +
+ (context->core_clock.offset & (mdev->page_size - 1));
+ return 0;
+}
+
static int mlx5_init_context(struct verbs_device *vdev,
struct ibv_context *ctx, int cmd_fd)
{
@@ -539,6 +563,10 @@ static int mlx5_init_context(struct verbs_device *vdev,
off_t offset;
struct mlx5_device *mdev;
struct verbs_context *v_ctx;
+ int ret;
+ uint32_t comp_mask;
+ struct ibv_device_attr_ex dev_attrs;
+ struct ibv_query_device_ex_input dev_attrs_input = {.comp_mask = 0};
mdev = to_mdev(&vdev->device);
v_ctx = verbs_get_ctx(ctx);
@@ -647,6 +675,12 @@ static int mlx5_init_context(struct verbs_device *vdev,
context->bfs[j].uuarn = j;
}
+ context->hca_core_clock = NULL;
+ ret = _mlx5_query_device_ex(ctx, &dev_attrs_input, &dev_attrs,
+ sizeof(dev_attrs), &comp_mask);
+ if (!ret && comp_mask & QUERY_DEVICE_RESP_MASK_TIMESTAMP)
+ mlx5_map_internal_clock(mdev, ctx);
+
mlx5_spinlock_init(&context->lock32);
context->prefer_bf = get_always_bf();
@@ -664,6 +698,7 @@ static int mlx5_init_context(struct verbs_device *vdev,
verbs_set_ctx_op(v_ctx, create_srq_ex, mlx5_create_srq_ex);
verbs_set_ctx_op(v_ctx, get_srq_num, mlx5_get_srq_num);
verbs_set_ctx_op(v_ctx, query_device_ex, mlx5_query_device_ex);
+ verbs_set_ctx_op(v_ctx, query_values, mlx5_query_values);
verbs_set_ctx_op(v_ctx, create_cq_ex, mlx5_create_cq_ex);
if (context->cqe_version && context->cqe_version == 1)
verbs_set_ctx_op(v_ctx, poll_cq_ex, mlx5_poll_cq_v1_ex);
@@ -697,6 +732,9 @@ static void mlx5_cleanup_context(struct verbs_device *device,
if (context->uar[i])
munmap(context->uar[i], page_size);
}
+ if (context->hca_core_clock)
+ munmap(context->hca_core_clock - context->core_clock.offset,
+ page_size);
close_debug_file(context);
}
diff --git a/src/mlx5.h b/src/mlx5.h
index 66dc4a9..818fe85 100644
--- a/src/mlx5.h
+++ b/src/mlx5.h
@@ -117,7 +117,8 @@ enum {
enum {
MLX5_MMAP_GET_REGULAR_PAGES_CMD = 0,
- MLX5_MMAP_GET_CONTIGUOUS_PAGES_CMD = 1
+ MLX5_MMAP_GET_CONTIGUOUS_PAGES_CMD = 1,
+ MLX5_MMAP_GET_CORE_CLOCK_CMD = 5
};
#define MLX5_CQ_PREFIX "MLX_CQ"
@@ -311,6 +312,7 @@ struct mlx5_context {
uint64_t offset;
uint64_t mask;
} core_clock;
+ void *hca_core_clock;
};
struct mlx5_bitmap {
@@ -593,6 +595,8 @@ int mlx5_query_device_ex(struct ibv_context *context,
const struct ibv_query_device_ex_input *input,
struct ibv_device_attr_ex *attr,
size_t attr_size);
+int mlx5_query_values(struct ibv_context *context,
+ struct ibv_values_ex *values);
struct ibv_qp *mlx5_create_qp_ex(struct ibv_context *context,
struct ibv_qp_init_attr_ex *attr);
int mlx5_query_port(struct ibv_context *context, uint8_t port,
diff --git a/src/verbs.c b/src/verbs.c
index 76885f3..50955ae 100644
--- a/src/verbs.c
+++ b/src/verbs.c
@@ -79,6 +79,52 @@ int mlx5_query_device(struct ibv_context *context, struct ibv_device_attr *attr)
return 0;
}
+#define READL(ptr) (*((uint32_t *)(ptr)))
+static int mlx5_read_clock(struct ibv_context *context, uint64_t *cycles)
+{
+ unsigned int clockhi, clocklo, clockhi1;
+ int i;
+ struct mlx5_context *ctx = to_mctx(context);
+
+ if (!ctx->hca_core_clock)
+ return -EOPNOTSUPP;
+
+ /* Handle wraparound */
+ for (i = 0; i < 2; i++) {
+ clockhi = ntohl(READL(ctx->hca_core_clock));
+ clocklo = ntohl(READL(ctx->hca_core_clock + 4));
+ clockhi1 = ntohl(READL(ctx->hca_core_clock));
+ if (clockhi == clockhi1)
+ break;
+ }
+
+ *cycles = (uint64_t)clockhi << 32 | (uint64_t)clocklo;
+
+ return 0;
+}
+
+int mlx5_query_values(struct ibv_context *context,
+ struct ibv_values_ex *values)
+{
+ uint32_t comp_mask = 0;
+ int err = 0;
+
+ if (values->comp_mask & IBV_VALUES_MASK_RAW_CLOCK) {
+ uint64_t cycles;
+
+ err = mlx5_read_clock(context, &cycles);
+ if (!err) {
+ values->raw_clock.tv_sec = 0;
+ values->raw_clock.tv_nsec = cycles;
+ comp_mask |= IBV_VALUES_MASK_RAW_CLOCK;
+ }
+ }
+
+ values->comp_mask = comp_mask;
+
+ return err;
+}
+
int mlx5_query_port(struct ibv_context *context, uint8_t port,
struct ibv_port_attr *attr)
{
--
2.1.0
* [PATCH libmlx5 6/7] Optimize poll_cq
[not found] ` <1447590634-12858-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
` (4 preceding siblings ...)
2015-11-15 12:30 ` [PATCH libmlx5 5/7] Add ibv_query_values support Matan Barak
@ 2015-11-15 12:30 ` Matan Barak
2015-11-15 12:30 ` [PATCH libmlx5 7/7] Add always_inline check Matan Barak
` (2 subsequent siblings)
8 siblings, 0 replies; 13+ messages in thread
From: Matan Barak @ 2015-11-15 12:30 UTC (permalink / raw)
To: Eli Cohen
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Doug Ledford, Matan Barak,
Eran Ben Elisha, Christoph Lameter
The current ibv_poll_cq_ex mechanism has to test at runtime whether
each field was requested. In order to avoid this penalty, add
optimized functions specialized for common wc_flags combinations.
Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
src/cq.c | 363 +++++++++++++++++++++++++++++++++++++++++++++++++-----------
src/mlx5.h | 10 ++
src/verbs.c | 9 +-
3 files changed, 310 insertions(+), 72 deletions(-)
diff --git a/src/cq.c b/src/cq.c
index 5e06990..fcb4237 100644
--- a/src/cq.c
+++ b/src/cq.c
@@ -41,6 +41,7 @@
#include <netinet/in.h>
#include <string.h>
#include <errno.h>
+#include <assert.h>
#include <unistd.h>
#include <infiniband/opcode.h>
@@ -207,73 +208,91 @@ union wc_buffer {
uint64_t *b64;
};
+#define IS_IN_WC_FLAGS(yes, no, maybe, flag) (((yes) & (flag)) || \
+ (!((no) & (flag)) && \
+ ((maybe) & (flag))))
static inline void handle_good_req_ex(struct ibv_wc_ex *wc_ex,
union wc_buffer *pwc_buffer,
struct mlx5_cqe64 *cqe,
uint64_t wc_flags,
- uint32_t qpn)
+ uint64_t wc_flags_yes,
+ uint64_t wc_flags_no,
+ uint32_t qpn, uint64_t *wc_flags_out)
{
union wc_buffer wc_buffer = *pwc_buffer;
switch (ntohl(cqe->sop_drop_qpn) >> 24) {
case MLX5_OPCODE_RDMA_WRITE_IMM:
- wc_ex->wc_flags |= IBV_WC_EX_IMM;
+ *wc_flags_out |= IBV_WC_EX_IMM;
case MLX5_OPCODE_RDMA_WRITE:
wc_ex->opcode = IBV_WC_RDMA_WRITE;
- if (wc_flags & IBV_WC_EX_WITH_BYTE_LEN)
+ if (IS_IN_WC_FLAGS(wc_flags_yes, wc_flags_no, wc_flags,
+ IBV_WC_EX_WITH_BYTE_LEN))
wc_buffer.b32++;
- if (wc_flags & IBV_WC_EX_WITH_IMM)
+ if (IS_IN_WC_FLAGS(wc_flags_yes, wc_flags_no, wc_flags,
+ IBV_WC_EX_WITH_IMM))
wc_buffer.b32++;
break;
case MLX5_OPCODE_SEND_IMM:
- wc_ex->wc_flags |= IBV_WC_EX_IMM;
+ *wc_flags_out |= IBV_WC_EX_IMM;
case MLX5_OPCODE_SEND:
case MLX5_OPCODE_SEND_INVAL:
wc_ex->opcode = IBV_WC_SEND;
- if (wc_flags & IBV_WC_EX_WITH_BYTE_LEN)
+ if (IS_IN_WC_FLAGS(wc_flags_yes, wc_flags_no, wc_flags,
+ IBV_WC_EX_WITH_BYTE_LEN))
wc_buffer.b32++;
- if (wc_flags & IBV_WC_EX_WITH_IMM)
+ if (IS_IN_WC_FLAGS(wc_flags_yes, wc_flags_no, wc_flags,
+ IBV_WC_EX_WITH_IMM))
wc_buffer.b32++;
break;
case MLX5_OPCODE_RDMA_READ:
wc_ex->opcode = IBV_WC_RDMA_READ;
- if (wc_flags & IBV_WC_EX_WITH_BYTE_LEN) {
+ if (IS_IN_WC_FLAGS(wc_flags_yes, wc_flags_no, wc_flags,
+ IBV_WC_EX_WITH_BYTE_LEN)) {
*wc_buffer.b32++ = ntohl(cqe->byte_cnt);
- wc_ex->wc_flags |= IBV_WC_EX_WITH_BYTE_LEN;
+ *wc_flags_out |= IBV_WC_EX_WITH_BYTE_LEN;
}
- if (wc_flags & IBV_WC_EX_WITH_IMM)
+ if (IS_IN_WC_FLAGS(wc_flags_yes, wc_flags_no, wc_flags,
+ IBV_WC_EX_WITH_IMM))
wc_buffer.b32++;
break;
case MLX5_OPCODE_ATOMIC_CS:
wc_ex->opcode = IBV_WC_COMP_SWAP;
- if (wc_flags & IBV_WC_EX_WITH_BYTE_LEN) {
+ if (IS_IN_WC_FLAGS(wc_flags_yes, wc_flags_no, wc_flags,
+ IBV_WC_EX_WITH_BYTE_LEN)) {
*wc_buffer.b32++ = 8;
- wc_ex->wc_flags |= IBV_WC_EX_WITH_BYTE_LEN;
+ *wc_flags_out |= IBV_WC_EX_WITH_BYTE_LEN;
}
- if (wc_flags & IBV_WC_EX_WITH_IMM)
+ if (IS_IN_WC_FLAGS(wc_flags_yes, wc_flags_no, wc_flags,
+ IBV_WC_EX_WITH_IMM))
wc_buffer.b32++;
break;
case MLX5_OPCODE_ATOMIC_FA:
wc_ex->opcode = IBV_WC_FETCH_ADD;
- if (wc_flags & IBV_WC_EX_WITH_BYTE_LEN) {
+ if (IS_IN_WC_FLAGS(wc_flags_yes, wc_flags_no, wc_flags,
+ IBV_WC_EX_WITH_BYTE_LEN)) {
*wc_buffer.b32++ = 8;
- wc_ex->wc_flags |= IBV_WC_EX_WITH_BYTE_LEN;
+ *wc_flags_out |= IBV_WC_EX_WITH_BYTE_LEN;
}
- if (wc_flags & IBV_WC_EX_WITH_IMM)
+ if (IS_IN_WC_FLAGS(wc_flags_yes, wc_flags_no, wc_flags,
+ IBV_WC_EX_WITH_IMM))
wc_buffer.b32++;
break;
case MLX5_OPCODE_BIND_MW:
wc_ex->opcode = IBV_WC_BIND_MW;
- if (wc_flags & IBV_WC_EX_WITH_BYTE_LEN)
+ if (IS_IN_WC_FLAGS(wc_flags_yes, wc_flags_no, wc_flags,
+ IBV_WC_EX_WITH_BYTE_LEN))
wc_buffer.b32++;
- if (wc_flags & IBV_WC_EX_WITH_IMM)
+ if (IS_IN_WC_FLAGS(wc_flags_yes, wc_flags_no, wc_flags,
+ IBV_WC_EX_WITH_IMM))
wc_buffer.b32++;
break;
}
- if (wc_flags & IBV_WC_EX_WITH_QP_NUM) {
+ if (IS_IN_WC_FLAGS(wc_flags_yes, wc_flags_no, wc_flags,
+ IBV_WC_EX_WITH_QP_NUM)) {
*wc_buffer.b32++ = qpn;
- wc_ex->wc_flags |= IBV_WC_EX_WITH_QP_NUM;
+ *wc_flags_out |= IBV_WC_EX_WITH_QP_NUM;
}
*pwc_buffer = wc_buffer;
@@ -345,7 +364,9 @@ static inline int handle_responder_ex(struct ibv_wc_ex *wc_ex,
union wc_buffer *pwc_buffer,
struct mlx5_cqe64 *cqe,
struct mlx5_qp *qp, struct mlx5_srq *srq,
- uint64_t wc_flags, uint32_t qpn)
+ uint64_t wc_flags, uint64_t wc_flags_yes,
+ uint64_t wc_flags_no, uint32_t qpn,
+ uint64_t *wc_flags_out)
{
uint16_t wqe_ctr;
struct mlx5_wq *wq;
@@ -354,9 +375,10 @@ static inline int handle_responder_ex(struct ibv_wc_ex *wc_ex,
int err = 0;
uint32_t byte_len = ntohl(cqe->byte_cnt);
- if (wc_flags & IBV_WC_EX_WITH_BYTE_LEN) {
+ if (IS_IN_WC_FLAGS(wc_flags_yes, wc_flags_no, wc_flags,
+ IBV_WC_EX_WITH_BYTE_LEN)) {
*wc_buffer.b32++ = byte_len;
- wc_ex->wc_flags |= IBV_WC_EX_WITH_BYTE_LEN;
+ *wc_flags_out |= IBV_WC_EX_WITH_BYTE_LEN;
}
if (srq) {
wqe_ctr = ntohs(cqe->wqe_counter);
@@ -386,53 +408,62 @@ static inline int handle_responder_ex(struct ibv_wc_ex *wc_ex,
switch (cqe->op_own >> 4) {
case MLX5_CQE_RESP_WR_IMM:
wc_ex->opcode = IBV_WC_RECV_RDMA_WITH_IMM;
- wc_ex->wc_flags = IBV_WC_EX_IMM;
- if (wc_flags & IBV_WC_EX_WITH_IMM) {
+ *wc_flags_out = IBV_WC_EX_IMM;
+ if (IS_IN_WC_FLAGS(wc_flags_yes, wc_flags_no, wc_flags,
+ IBV_WC_EX_WITH_IMM)) {
*wc_buffer.b32++ = ntohl(cqe->byte_cnt);
- wc_ex->wc_flags |= IBV_WC_EX_WITH_IMM;
+ *wc_flags_out |= IBV_WC_EX_WITH_IMM;
}
break;
case MLX5_CQE_RESP_SEND:
wc_ex->opcode = IBV_WC_RECV;
- if (wc_flags & IBV_WC_EX_WITH_IMM)
+ if (IS_IN_WC_FLAGS(wc_flags_yes, wc_flags_no, wc_flags,
+ IBV_WC_EX_WITH_IMM))
wc_buffer.b32++;
break;
case MLX5_CQE_RESP_SEND_IMM:
wc_ex->opcode = IBV_WC_RECV;
- wc_ex->wc_flags = IBV_WC_EX_WITH_IMM;
- if (wc_flags & IBV_WC_EX_WITH_IMM) {
+ *wc_flags_out = IBV_WC_EX_WITH_IMM;
+ if (IS_IN_WC_FLAGS(wc_flags_yes, wc_flags_no, wc_flags,
+ IBV_WC_EX_WITH_IMM)) {
*wc_buffer.b32++ = ntohl(cqe->imm_inval_pkey);
- wc_ex->wc_flags |= IBV_WC_EX_WITH_IMM;
+ *wc_flags_out |= IBV_WC_EX_WITH_IMM;
}
break;
}
- if (wc_flags & IBV_WC_EX_WITH_QP_NUM) {
+ if (IS_IN_WC_FLAGS(wc_flags_yes, wc_flags_no, wc_flags,
+ IBV_WC_EX_WITH_QP_NUM)) {
*wc_buffer.b32++ = qpn;
- wc_ex->wc_flags |= IBV_WC_EX_WITH_QP_NUM;
+ *wc_flags_out |= IBV_WC_EX_WITH_QP_NUM;
}
- if (wc_flags & IBV_WC_EX_WITH_SRC_QP) {
+ if (IS_IN_WC_FLAGS(wc_flags_yes, wc_flags_no, wc_flags,
+ IBV_WC_EX_WITH_SRC_QP)) {
*wc_buffer.b32++ = ntohl(cqe->flags_rqpn) & 0xffffff;
- wc_ex->wc_flags |= IBV_WC_EX_WITH_SRC_QP;
+ *wc_flags_out |= IBV_WC_EX_WITH_SRC_QP;
}
- if (wc_flags & IBV_WC_EX_WITH_PKEY_INDEX) {
+ if (IS_IN_WC_FLAGS(wc_flags_yes, wc_flags_no, wc_flags,
+ IBV_WC_EX_WITH_PKEY_INDEX)) {
*wc_buffer.b16++ = ntohl(cqe->imm_inval_pkey) & 0xffff;
- wc_ex->wc_flags |= IBV_WC_EX_WITH_PKEY_INDEX;
+ *wc_flags_out |= IBV_WC_EX_WITH_PKEY_INDEX;
}
- if (wc_flags & IBV_WC_EX_WITH_SLID) {
+ if (IS_IN_WC_FLAGS(wc_flags_yes, wc_flags_no, wc_flags,
+ IBV_WC_EX_WITH_SLID)) {
*wc_buffer.b16++ = ntohs(cqe->slid);
- wc_ex->wc_flags |= IBV_WC_EX_WITH_SLID;
+ *wc_flags_out |= IBV_WC_EX_WITH_SLID;
}
- if (wc_flags & IBV_WC_EX_WITH_SL) {
+ if (IS_IN_WC_FLAGS(wc_flags_yes, wc_flags_no, wc_flags,
+ IBV_WC_EX_WITH_SL)) {
*wc_buffer.b8++ = (ntohl(cqe->flags_rqpn) >> 24) & 0xf;
- wc_ex->wc_flags |= IBV_WC_EX_WITH_SL;
+ *wc_flags_out |= IBV_WC_EX_WITH_SL;
}
- if (wc_flags & IBV_WC_EX_WITH_DLID_PATH_BITS) {
+ if (IS_IN_WC_FLAGS(wc_flags_yes, wc_flags_no, wc_flags,
+ IBV_WC_EX_WITH_DLID_PATH_BITS)) {
*wc_buffer.b8++ = cqe->ml_path & 0x7f;
- wc_ex->wc_flags |= IBV_WC_EX_WITH_DLID_PATH_BITS;
+ *wc_flags_out |= IBV_WC_EX_WITH_DLID_PATH_BITS;
}
g = (ntohl(cqe->flags_rqpn) >> 28) & 3;
- wc_ex->wc_flags |= g ? IBV_WC_EX_GRH : 0;
+ *wc_flags_out |= g ? IBV_WC_EX_GRH : 0;
*pwc_buffer = wc_buffer;
return IBV_WC_SUCCESS;
@@ -795,6 +826,9 @@ inline int mlx5_poll_one_cqe_err(struct mlx5_context *mctx,
return err;
}
+#define IS_IN_WC_FLAGS(yes, no, maybe, flag) (((yes) & (flag)) || \
+ (!((no) & (flag)) && \
+ ((maybe) & (flag))))
static inline int mlx5_poll_one(struct mlx5_cq *cq,
struct mlx5_resource **cur_rsc,
struct mlx5_srq **cur_srq,
@@ -874,11 +908,21 @@ static inline int mlx5_poll_one(struct mlx5_cq *cq,
return CQ_OK;
}
-inline int mlx5_poll_one_ex(struct mlx5_cq *cq,
- struct mlx5_resource **cur_rsc,
- struct mlx5_srq **cur_srq,
- struct ibv_wc_ex **pwc_ex, uint64_t wc_flags,
- int cqe_ver)
+static inline int _mlx5_poll_one_ex(struct mlx5_cq *cq,
+ struct mlx5_resource **cur_rsc,
+ struct mlx5_srq **cur_srq,
+ struct ibv_wc_ex **pwc_ex,
+ uint64_t wc_flags,
+ uint64_t wc_flags_yes, uint64_t wc_flags_no,
+ int cqe_ver)
+ __attribute__((always_inline));
+static inline int _mlx5_poll_one_ex(struct mlx5_cq *cq,
+ struct mlx5_resource **cur_rsc,
+ struct mlx5_srq **cur_srq,
+ struct ibv_wc_ex **pwc_ex,
+ uint64_t wc_flags,
+ uint64_t wc_flags_yes, uint64_t wc_flags_no,
+ int cqe_ver)
{
struct mlx5_cqe64 *cqe64;
void *cqe;
@@ -888,6 +932,7 @@ inline int mlx5_poll_one_ex(struct mlx5_cq *cq,
struct mlx5_context *mctx = to_mctx(cq->ibv_cq.context);
struct ibv_wc_ex *wc_ex = *pwc_ex;
union wc_buffer wc_buffer;
+ uint64_t wc_flags_out = 0;
cqe = next_cqe_sw(cq);
if (!cqe)
@@ -913,26 +958,34 @@ inline int mlx5_poll_one_ex(struct mlx5_cq *cq,
wc_ex->wc_flags = 0;
wc_ex->reserved = 0;
- if (wc_flags & IBV_WC_EX_WITH_COMPLETION_TIMESTAMP) {
+ if (IS_IN_WC_FLAGS(wc_flags_yes, wc_flags_no, wc_flags,
+ IBV_WC_EX_WITH_COMPLETION_TIMESTAMP)) {
*wc_buffer.b64++ = ntohll(cqe64->timestamp);
- wc_ex->wc_flags |= IBV_WC_EX_WITH_COMPLETION_TIMESTAMP;
+ wc_flags_out |= IBV_WC_EX_WITH_COMPLETION_TIMESTAMP;
}
switch (opcode) {
case MLX5_CQE_REQ:
err = mlx5_poll_one_cqe_req(cq, cur_rsc, cqe, qpn, cqe_ver,
&wc_ex->wr_id);
- handle_good_req_ex(wc_ex, &wc_buffer, cqe64, wc_flags, qpn);
+ handle_good_req_ex(wc_ex, &wc_buffer, cqe64, wc_flags,
+ wc_flags_yes, wc_flags_no, qpn,
+ &wc_flags_out);
wc_ex->status = err;
- if (wc_flags & IBV_WC_EX_WITH_SRC_QP)
+ if (IS_IN_WC_FLAGS(wc_flags_yes, wc_flags_no, wc_flags,
+ IBV_WC_EX_WITH_SRC_QP))
wc_buffer.b32++;
- if (wc_flags & IBV_WC_EX_WITH_PKEY_INDEX)
+ if (IS_IN_WC_FLAGS(wc_flags_yes, wc_flags_no, wc_flags,
+ IBV_WC_EX_WITH_PKEY_INDEX))
wc_buffer.b16++;
- if (wc_flags & IBV_WC_EX_WITH_SLID)
+ if (IS_IN_WC_FLAGS(wc_flags_yes, wc_flags_no, wc_flags,
+ IBV_WC_EX_WITH_SLID))
wc_buffer.b16++;
- if (wc_flags & IBV_WC_EX_WITH_SL)
+ if (IS_IN_WC_FLAGS(wc_flags_yes, wc_flags_no, wc_flags,
+ IBV_WC_EX_WITH_SL))
wc_buffer.b8++;
- if (wc_flags & IBV_WC_EX_WITH_DLID_PATH_BITS)
+ if (IS_IN_WC_FLAGS(wc_flags_yes, wc_flags_no, wc_flags,
+ IBV_WC_EX_WITH_DLID_PATH_BITS))
wc_buffer.b8++;
break;
@@ -950,7 +1003,9 @@ inline int mlx5_poll_one_ex(struct mlx5_cq *cq,
wc_ex->status = handle_responder_ex(wc_ex, &wc_buffer, cqe64,
rsc_to_mqp(*cur_rsc),
is_srq ? *cur_srq : NULL,
- wc_flags, qpn);
+ wc_flags, wc_flags_yes,
+ wc_flags_no, qpn,
+ &wc_flags_out);
break;
}
case MLX5_CQE_REQ_ERR:
@@ -963,32 +1018,208 @@ inline int mlx5_poll_one_ex(struct mlx5_cq *cq,
return err;
case MLX5_CQE_RESIZE_CQ:
- if (wc_flags & IBV_WC_EX_WITH_BYTE_LEN)
+ if (IS_IN_WC_FLAGS(wc_flags_yes, wc_flags_no, wc_flags,
+ IBV_WC_EX_WITH_BYTE_LEN))
wc_buffer.b32++;
- if (wc_flags & IBV_WC_EX_WITH_IMM)
+ if (IS_IN_WC_FLAGS(wc_flags_yes, wc_flags_no, wc_flags,
+ IBV_WC_EX_WITH_IMM))
wc_buffer.b32++;
- if (wc_flags & IBV_WC_EX_WITH_QP_NUM) {
+ if (IS_IN_WC_FLAGS(wc_flags_yes, wc_flags_no, wc_flags,
+ IBV_WC_EX_WITH_QP_NUM)) {
*wc_buffer.b32++ = qpn;
- wc_ex->wc_flags |= IBV_WC_EX_WITH_QP_NUM;
+ wc_flags_out |= IBV_WC_EX_WITH_QP_NUM;
}
- if (wc_flags & IBV_WC_EX_WITH_SRC_QP)
+ if (IS_IN_WC_FLAGS(wc_flags_yes, wc_flags_no, wc_flags,
+ IBV_WC_EX_WITH_SRC_QP))
wc_buffer.b32++;
- if (wc_flags & IBV_WC_EX_WITH_PKEY_INDEX)
+ if (IS_IN_WC_FLAGS(wc_flags_yes, wc_flags_no, wc_flags,
+ IBV_WC_EX_WITH_PKEY_INDEX))
wc_buffer.b16++;
- if (wc_flags & IBV_WC_EX_WITH_SLID)
+ if (IS_IN_WC_FLAGS(wc_flags_yes, wc_flags_no, wc_flags,
+ IBV_WC_EX_WITH_SLID))
wc_buffer.b16++;
- if (wc_flags & IBV_WC_EX_WITH_SL)
+ if (IS_IN_WC_FLAGS(wc_flags_yes, wc_flags_no, wc_flags,
+ IBV_WC_EX_WITH_SL))
wc_buffer.b8++;
- if (wc_flags & IBV_WC_EX_WITH_DLID_PATH_BITS)
+ if (IS_IN_WC_FLAGS(wc_flags_yes, wc_flags_no, wc_flags,
+ IBV_WC_EX_WITH_DLID_PATH_BITS))
wc_buffer.b8++;
break;
}
+ wc_ex->wc_flags = wc_flags_out;
*pwc_ex = (struct ibv_wc_ex *)((uintptr_t)(wc_buffer.b8 + sizeof(uint64_t) - 1) &
~(sizeof(uint64_t) - 1));
return CQ_OK;
}
+int mlx5_poll_one_ex(struct mlx5_cq *cq,
+ struct mlx5_resource **cur_rsc,
+ struct mlx5_srq **cur_srq,
+ struct ibv_wc_ex **pwc_ex, uint64_t wc_flags,
+ int cqe_ver)
+{
+ return _mlx5_poll_one_ex(cq, cur_rsc, cur_srq, pwc_ex, wc_flags, 0, 0,
+ cqe_ver);
+}
+
+#define MLX5_POLL_ONE_EX_WC_FLAGS_NAME(wc_flags_yes, wc_flags_no) \
+ mlx5_poll_one_ex_custom##wc_flags_yes ## _ ## wc_flags_no
+
+/* The compiler will create one function per wc_flags combination. Since
+ * _mlx5_poll_one_ex is always inlined (for compilers that support it),
+ * the compiler drops the if statements and merges all wc_flags_out ORs/ANDs.
+ */
+#define MLX5_POLL_ONE_EX_WC_FLAGS(wc_flags_yes, wc_flags_no) \
+static int MLX5_POLL_ONE_EX_WC_FLAGS_NAME(wc_flags_yes, wc_flags_no) \
+ (struct mlx5_cq *cq, \
+ struct mlx5_resource **cur_rsc,\
+ struct mlx5_srq **cur_srq, \
+ struct ibv_wc_ex **pwc_ex, \
+ uint64_t wc_flags, \
+ int cqe_ver) \
+{ \
+ return _mlx5_poll_one_ex(cq, cur_rsc, cur_srq, pwc_ex, wc_flags, \
+ wc_flags_yes, wc_flags_no, cqe_ver); \
+}
+
+/*
+ Since we use the preprocessor here, we have to calculate the OR value
+ ourselves:
+ IBV_WC_EX_GRH = 1 << 0,
+ IBV_WC_EX_IMM = 1 << 1,
+ IBV_WC_EX_WITH_BYTE_LEN = 1 << 2,
+ IBV_WC_EX_WITH_IMM = 1 << 3,
+ IBV_WC_EX_WITH_QP_NUM = 1 << 4,
+ IBV_WC_EX_WITH_SRC_QP = 1 << 5,
+ IBV_WC_EX_WITH_PKEY_INDEX = 1 << 6,
+ IBV_WC_EX_WITH_SLID = 1 << 7,
+ IBV_WC_EX_WITH_SL = 1 << 8,
+ IBV_WC_EX_WITH_DLID_PATH_BITS = 1 << 9,
+ IBV_WC_EX_WITH_COMPLETION_TIMESTAMP = 1 << 10,
+*/
+
+/* Bitwise or of all flags between IBV_WC_EX_WITH_BYTE_LEN and
+ * IBV_WC_EX_WITH_COMPLETION_TIMESTAMP.
+ */
+#define SUPPORTED_WC_ALL_FLAGS 2045
+/* Bitwise or of all flags between IBV_WC_EX_WITH_BYTE_LEN and
+ * IBV_WC_EX_WITH_DLID_PATH_BITS (all the fields that are available
+ * in the legacy WC).
+ */
+#define SUPPORTED_WC_STD_FLAGS 1020
+
+#define OPTIMIZE_POLL_CQ /* All maybe - must be in table! */ \
+ OP(0, 0) SEP \
+ /* No options */ \
+ OP(0, SUPPORTED_WC_ALL_FLAGS) SEP \
+ /* All options */ \
+ OP(SUPPORTED_WC_ALL_FLAGS, 0) SEP \
+ /* All standard options */ \
+ OP(SUPPORTED_WC_STD_FLAGS, 1024) SEP \
+ /* Just Bytelen - for DPDK */ \
+ OP(4, 1016) SEP \
+ /* Timestamp only, for FSI */ \
+ OP(1024, 1020) SEP
+
+#define OP MLX5_POLL_ONE_EX_WC_FLAGS
+#define SEP ;
+
+/* Declare optimized poll_one function for popular scenarios. Each function
+ * has a name of
+ * mlx5_poll_one_ex_custom<supported_wc_flags>_<not_supported_wc_flags>.
+ * Since the supported and not supported wc_flags are given beforehand,
+ * the compiler could optimize the if and or statements and create optimized
+ * code.
+ */
+OPTIMIZE_POLL_CQ
+
+#define ADD_POLL_ONE(_wc_flags_yes, _wc_flags_no) \
+ {.wc_flags_yes = _wc_flags_yes, \
+ .wc_flags_no = _wc_flags_no, \
+ .fn = MLX5_POLL_ONE_EX_WC_FLAGS_NAME( \
+ _wc_flags_yes, _wc_flags_no) \
+ }
+
+#undef OP
+#undef SEP
+#define OP ADD_POLL_ONE
+#define SEP ,
+
+struct {
+ int (*fn)(struct mlx5_cq *cq,
+ struct mlx5_resource **cur_rsc,
+ struct mlx5_srq **cur_srq,
+ struct ibv_wc_ex **pwc_ex, uint64_t wc_flags,
+ int cqe_ver);
+ uint64_t wc_flags_yes;
+ uint64_t wc_flags_no;
+} mlx5_poll_one_ex_fns[] = {
+ /* This array contains all the custom poll_one functions. Every entry
+ * in this array looks like:
+ * {.wc_flags_yes = <flags that are always in the wc>,
+ * .wc_flags_no = <flags that are never in the wc>,
+ * .fn = <the custom poll one function>}.
+ * The .fn function is optimized according to the .wc_flags_yes and
+ * .wc_flags_no flags. Other flags have the "if statement".
+ */
+ OPTIMIZE_POLL_CQ
+};
+
+/* This function gets wc_flags as an argument and returns a function pointer
+ * of type * int (*fn)(struct mlx5_cq *cq,
+ struct mlx5_resource **cur_rsc,
+ struct mlx5_srq **cur_srq,
+ struct ibv_wc_ex **pwc_ex, uint64_t wc_flags,
+ int cqe_ver);
+ * The returned function is one of the custom poll one functions declared in
+ * mlx5_poll_one_ex_fns. The function is chosen as the function which the
+ * number of wc_flags_maybe bits (the fields that aren't in the yes/no parts)
+ * is the smallest.
+ */
+int (*mlx5_get_poll_one_fn(uint64_t wc_flags))(struct mlx5_cq *cq,
+ struct mlx5_resource **cur_rsc,
+ struct mlx5_srq **cur_srq,
+ struct ibv_wc_ex **pwc_ex, uint64_t wc_flags,
+ int cqe_ver)
+{
+ unsigned int i = 0;
+ uint8_t min_bits = -1;
+ int min_index = 0xff;
+
+ for (i = 0;
+ i < sizeof(mlx5_poll_one_ex_fns) / sizeof(mlx5_poll_one_ex_fns[0]);
+ i++) {
+ uint64_t bits;
+ uint8_t nbits;
+
+ /* Can't have required flags in "no" */
+ if (wc_flags & mlx5_poll_one_ex_fns[i].wc_flags_no)
+ continue;
+
+ /* Can't have not required flags in yes */
+ if (~wc_flags & mlx5_poll_one_ex_fns[i].wc_flags_yes)
+ continue;
+
+ /* Number of wc_flags_maybe. See above comment for more details */
+ bits = (wc_flags ^ mlx5_poll_one_ex_fns[i].wc_flags_yes) |
+ ((~wc_flags ^ mlx5_poll_one_ex_fns[i].wc_flags_no) &
+ CREATE_CQ_SUPPORTED_WC_FLAGS);
+
+ nbits = ibv_popcount64(bits);
+
+ /* Look for the minimum number of bits */
+ if (nbits < min_bits) {
+ min_bits = nbits;
+ min_index = i;
+ }
+ }
+
+ assert(min_index >= 0);
+
+ return mlx5_poll_one_ex_fns[min_index].fn;
+}
+
static inline void mlx5_poll_cq_stall_start(struct mlx5_cq *cq)
__attribute__((always_inline));
static inline void mlx5_poll_cq_stall_start(struct mlx5_cq *cq)
diff --git a/src/mlx5.h b/src/mlx5.h
index 818fe85..9287fbd 100644
--- a/src/mlx5.h
+++ b/src/mlx5.h
@@ -109,6 +109,10 @@
#define PFX "mlx5: "
+enum {
+ CREATE_CQ_SUPPORTED_WC_FLAGS = IBV_WC_STANDARD_FLAGS |
+ IBV_WC_EX_WITH_COMPLETION_TIMESTAMP
+};
enum {
MLX5_IB_MMAP_CMD_SHIFT = 8,
@@ -623,6 +627,12 @@ int mlx5_poll_one_ex(struct mlx5_cq *cq,
struct mlx5_srq **cur_srq,
struct ibv_wc_ex **pwc_ex, uint64_t wc_flags,
int cqe_ver);
+int (*mlx5_get_poll_one_fn(uint64_t wc_flags))(struct mlx5_cq *cq,
+ struct mlx5_resource **cur_rsc,
+ struct mlx5_srq **cur_srq,
+ struct ibv_wc_ex **pwc_ex,
+ uint64_t wc_flags,
+ int cqe_ver);
int mlx5_alloc_cq_buf(struct mlx5_context *mctx, struct mlx5_cq *cq,
struct mlx5_buf *buf, int nent, int cqe_sz);
int mlx5_free_cq_buf(struct mlx5_context *ctx, struct mlx5_buf *buf);
diff --git a/src/verbs.c b/src/verbs.c
index 50955ae..86d0951 100644
--- a/src/verbs.c
+++ b/src/verbs.c
@@ -287,11 +287,6 @@ static int qp_sig_enabled(void)
}
enum {
- CREATE_CQ_SUPPORTED_WC_FLAGS = IBV_WC_STANDARD_FLAGS |
- IBV_WC_EX_WITH_COMPLETION_TIMESTAMP
-};
-
-enum {
CREATE_CQ_SUPPORTED_COMP_MASK = IBV_CREATE_CQ_ATTR_FLAGS
};
@@ -407,7 +402,9 @@ static struct ibv_cq *create_cq(struct ibv_context *context,
cq->stall_cycles = to_mctx(context)->stall_cycles;
cq->wc_flags = cq_attr->wc_flags;
- cq->poll_one = mlx5_poll_one_ex;
+ cq->poll_one = mlx5_get_poll_one_fn(cq->wc_flags);
+ if (!cq->poll_one)
+ cq->poll_one = mlx5_poll_one_ex;
return &cq->ibv_cq;
--
2.1.0
* [PATCH libmlx5 7/7] Add always_inline check
[not found] ` <1447590634-12858-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
` (5 preceding siblings ...)
2015-11-15 12:30 ` [PATCH libmlx5 6/7] Optimize poll_cq Matan Barak
@ 2015-11-15 12:30 ` Matan Barak
2015-11-15 18:11 ` [PATCH libmlx5 0/7] Completion timestamping Christoph Lameter
2015-11-16 15:46 ` Tom Talpey
8 siblings, 0 replies; 13+ messages in thread
From: Matan Barak @ 2015-11-15 12:30 UTC (permalink / raw)
To: Eli Cohen
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Doug Ledford, Matan Barak,
Eran Ben Elisha, Christoph Lameter
__attribute__((always_inline)) isn't supported by every compiler, so
add a check to configure.ac and use the attribute only when it is
available.
Also inline the other poll_one data path functions in order to
eliminate runtime "if" statements.
Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
configure.ac | 17 +++++++++++++++++
src/cq.c | 42 +++++++++++++++++++++++++++++-------------
src/mlx5.h | 6 ++++++
3 files changed, 52 insertions(+), 13 deletions(-)
diff --git a/configure.ac b/configure.ac
index fca0b46..50b4f9c 100644
--- a/configure.ac
+++ b/configure.ac
@@ -65,6 +65,23 @@ AC_CHECK_FUNC(ibv_read_sysfs_file, [],
AC_MSG_ERROR([ibv_read_sysfs_file() not found. libmlx5 requires libibverbs >= 1.0.3.]))
AC_CHECK_FUNCS(ibv_dontfork_range ibv_dofork_range ibv_register_driver)
+AC_MSG_CHECKING("always inline")
+CFLAGS_BAK="$CFLAGS"
+CFLAGS="$CFLAGS -Werror"
+AC_COMPILE_IFELSE([AC_LANG_PROGRAM([[
+ static inline int f(void)
+ __attribute__((always_inline));
+ static inline int f(void)
+ {
+ return 1;
+ }
+]],[[
+ int a = f();
+ a = a;
+]])], [AC_MSG_RESULT([yes]) AC_DEFINE([HAVE_ALWAYS_INLINE], [1], [Define if __attribute((always_inline)).])],
+[AC_MSG_RESULT([no])])
+CFLAGS="$CFLAGS_BAK"
+
dnl Now check if for libibverbs 1.0 vs 1.1
dummy=if$$
cat <<IBV_VERSION > $dummy.c
diff --git a/src/cq.c b/src/cq.c
index fcb4237..41751b7 100644
--- a/src/cq.c
+++ b/src/cq.c
@@ -218,6 +218,14 @@ static inline void handle_good_req_ex(struct ibv_wc_ex *wc_ex,
uint64_t wc_flags_yes,
uint64_t wc_flags_no,
uint32_t qpn, uint64_t *wc_flags_out)
+ ALWAYS_INLINE;
+static inline void handle_good_req_ex(struct ibv_wc_ex *wc_ex,
+ union wc_buffer *pwc_buffer,
+ struct mlx5_cqe64 *cqe,
+ uint64_t wc_flags,
+ uint64_t wc_flags_yes,
+ uint64_t wc_flags_no,
+ uint32_t qpn, uint64_t *wc_flags_out)
{
union wc_buffer wc_buffer = *pwc_buffer;
@@ -367,6 +375,14 @@ static inline int handle_responder_ex(struct ibv_wc_ex *wc_ex,
uint64_t wc_flags, uint64_t wc_flags_yes,
uint64_t wc_flags_no, uint32_t qpn,
uint64_t *wc_flags_out)
+ ALWAYS_INLINE;
+static inline int handle_responder_ex(struct ibv_wc_ex *wc_ex,
+ union wc_buffer *pwc_buffer,
+ struct mlx5_cqe64 *cqe,
+ struct mlx5_qp *qp, struct mlx5_srq *srq,
+ uint64_t wc_flags, uint64_t wc_flags_yes,
+ uint64_t wc_flags_no, uint32_t qpn,
+ uint64_t *wc_flags_out)
{
uint16_t wqe_ctr;
struct mlx5_wq *wq;
@@ -573,7 +589,7 @@ static void mlx5_get_cycles(uint64_t *cycles)
static inline struct mlx5_qp *get_req_context(struct mlx5_context *mctx,
struct mlx5_resource **cur_rsc,
uint32_t rsn, int cqe_ver)
- __attribute__((always_inline));
+ ALWAYS_INLINE;
static inline struct mlx5_qp *get_req_context(struct mlx5_context *mctx,
struct mlx5_resource **cur_rsc,
uint32_t rsn, int cqe_ver)
@@ -589,7 +605,7 @@ static inline int get_resp_cxt_v1(struct mlx5_context *mctx,
struct mlx5_resource **cur_rsc,
struct mlx5_srq **cur_srq,
uint32_t uidx, int *is_srq)
- __attribute__((always_inline));
+ ALWAYS_INLINE;
static inline int get_resp_cxt_v1(struct mlx5_context *mctx,
struct mlx5_resource **cur_rsc,
struct mlx5_srq **cur_srq,
@@ -625,7 +641,7 @@ static inline int get_resp_cxt_v1(struct mlx5_context *mctx,
static inline int get_resp_ctx(struct mlx5_context *mctx,
struct mlx5_resource **cur_rsc,
uint32_t qpn)
- __attribute__((always_inline));
+ ALWAYS_INLINE;
static inline int get_resp_ctx(struct mlx5_context *mctx,
struct mlx5_resource **cur_rsc,
uint32_t qpn)
@@ -647,7 +663,7 @@ static inline int get_resp_ctx(struct mlx5_context *mctx,
static inline int get_srq_ctx(struct mlx5_context *mctx,
struct mlx5_srq **cur_srq,
uint32_t srqn_uidx)
- __attribute__((always_inline));
+ ALWAYS_INLINE;
static inline int get_srq_ctx(struct mlx5_context *mctx,
struct mlx5_srq **cur_srq,
uint32_t srqn)
@@ -662,7 +678,7 @@ static inline int get_srq_ctx(struct mlx5_context *mctx,
}
static inline void dump_cqe_debug(FILE *fp, struct mlx5_cqe64 *cqe64)
- __attribute__((always_inline));
+ ALWAYS_INLINE;
static inline void dump_cqe_debug(FILE *fp, struct mlx5_cqe64 *cqe64)
{
#ifdef MLX5_DEBUG
@@ -676,7 +692,7 @@ static inline void dump_cqe_debug(FILE *fp, struct mlx5_cqe64 *cqe64)
inline int mlx5_poll_one_cqe_req(struct mlx5_cq *cq,
struct mlx5_resource **cur_rsc,
void *cqe, uint32_t qpn, int cqe_ver,
- uint64_t *wr_id) __attribute__((always_inline));
+ uint64_t *wr_id) ALWAYS_INLINE;
inline int mlx5_poll_one_cqe_req(struct mlx5_cq *cq,
struct mlx5_resource **cur_rsc,
void *cqe, uint32_t qpn, int cqe_ver,
@@ -719,7 +735,7 @@ inline int mlx5_poll_one_cqe_resp(struct mlx5_context *mctx,
struct mlx5_srq **cur_srq,
struct mlx5_cqe64 *cqe64, int cqe_ver,
uint32_t qpn, int *is_srq)
- __attribute__((always_inline));
+ ALWAYS_INLINE;
inline int mlx5_poll_one_cqe_resp(struct mlx5_context *mctx,
struct mlx5_resource **cur_rsc,
struct mlx5_srq **cur_srq,
@@ -750,7 +766,7 @@ inline int mlx5_poll_one_cqe_err(struct mlx5_context *mctx,
uint32_t qpn, uint32_t *pwc_status,
uint32_t *pwc_vendor_err,
uint64_t *pwc_wr_id, uint8_t opcode)
- __attribute__((always_inline));
+ ALWAYS_INLINE;
inline int mlx5_poll_one_cqe_err(struct mlx5_context *mctx,
struct mlx5_resource **cur_rsc,
struct mlx5_srq **cur_srq,
@@ -833,7 +849,7 @@ static inline int mlx5_poll_one(struct mlx5_cq *cq,
struct mlx5_resource **cur_rsc,
struct mlx5_srq **cur_srq,
struct ibv_wc *wc, int cqe_ver)
- __attribute__((always_inline));
+ ALWAYS_INLINE;
static inline int mlx5_poll_one(struct mlx5_cq *cq,
struct mlx5_resource **cur_rsc,
struct mlx5_srq **cur_srq,
@@ -915,7 +931,7 @@ static inline int _mlx5_poll_one_ex(struct mlx5_cq *cq,
uint64_t wc_flags,
uint64_t wc_flags_yes, uint64_t wc_flags_no,
int cqe_ver)
- __attribute__((always_inline));
+ ALWAYS_INLINE;
static inline int _mlx5_poll_one_ex(struct mlx5_cq *cq,
struct mlx5_resource **cur_rsc,
struct mlx5_srq **cur_srq,
@@ -1221,7 +1237,7 @@ int (*mlx5_get_poll_one_fn(uint64_t wc_flags))(struct mlx5_cq *cq,
}
static inline void mlx5_poll_cq_stall_start(struct mlx5_cq *cq)
-__attribute__((always_inline));
+ALWAYS_INLINE;
static inline void mlx5_poll_cq_stall_start(struct mlx5_cq *cq)
{
if (cq->stall_enable) {
@@ -1236,7 +1252,7 @@ static inline void mlx5_poll_cq_stall_start(struct mlx5_cq *cq)
}
static inline void mlx5_poll_cq_stall_end(struct mlx5_cq *cq, int ne,
- int npolled, int err) __attribute__((always_inline));
+ int npolled, int err) ALWAYS_INLINE;
static inline void mlx5_poll_cq_stall_end(struct mlx5_cq *cq, int ne,
int npolled, int err)
{
@@ -1263,7 +1279,7 @@ static inline void mlx5_poll_cq_stall_end(struct mlx5_cq *cq, int ne,
static inline int poll_cq(struct ibv_cq *ibcq, int ne,
struct ibv_wc *wc, int cqe_ver)
- __attribute__((always_inline));
+ ALWAYS_INLINE;
static inline int poll_cq(struct ibv_cq *ibcq, int ne,
struct ibv_wc *wc, int cqe_ver)
{
diff --git a/src/mlx5.h b/src/mlx5.h
index 9287fbd..d733224 100644
--- a/src/mlx5.h
+++ b/src/mlx5.h
@@ -114,6 +114,12 @@ enum {
IBV_WC_EX_WITH_COMPLETION_TIMESTAMP
};
+#ifdef HAVE_ALWAYS_INLINE
+#define ALWAYS_INLINE __attribute__((always_inline))
+#else
+#define ALWAYS_INLINE
+#endif
+
enum {
MLX5_IB_MMAP_CMD_SHIFT = 8,
MLX5_IB_MMAP_CMD_MASK = 0xff,
--
2.1.0
* Re: [PATCH libmlx5 0/7] Completion timestamping
[not found] ` <1447590634-12858-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
` (6 preceding siblings ...)
2015-11-15 12:30 ` [PATCH libmlx5 7/7] Add always_inline check Matan Barak
@ 2015-11-15 18:11 ` Christoph Lameter
[not found] ` <alpine.DEB.2.20.1511151209290.31074-wcBtFHqTun5QOdAKl3ChDw@public.gmane.org>
2015-11-16 15:46 ` Tom Talpey
8 siblings, 1 reply; 13+ messages in thread
From: Christoph Lameter @ 2015-11-15 18:11 UTC (permalink / raw)
To: Matan Barak
Cc: Eli Cohen, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Doug Ledford,
Eran Ben Elisha
On Sun, 15 Nov 2015, Matan Barak wrote:
> This series adds support for completion timestamp. In order to
> support this feature, several extended verbs were implemented
> (as instructed in libibverbs).
This is the portion that implements timestamping for libmlx5, and this
patchset depends on another one that needs to be merged into libibverbs.
Right?
* Re: [PATCH libmlx5 0/7] Completion timestamping
[not found] ` <1447590634-12858-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
` (7 preceding siblings ...)
2015-11-15 18:11 ` [PATCH libmlx5 0/7] Completion timestamping Christoph Lameter
@ 2015-11-16 15:46 ` Tom Talpey
[not found] ` <5649FA3F.2050209-CLs1Zie5N5HQT0dZR+AlfA@public.gmane.org>
8 siblings, 1 reply; 13+ messages in thread
From: Tom Talpey @ 2015-11-16 15:46 UTC (permalink / raw)
To: Matan Barak, Eli Cohen
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Doug Ledford, Eran Ben Elisha,
Christoph Lameter
On 11/15/2015 7:30 AM, Matan Barak wrote:
> This series adds support for completion timestamp. In order to
> support this feature, several extended verbs were implemented
> (as instructed in libibverbs).
Can you describe what these timestamps are actually for? It's not
clear at all from the comments. I'm assuming they are for some sort
of fine-grained statistics? Are they purely for userspace consumers?