* [PATCH for-next V1 0/5] User-space time-stamping support for mlx5_ib
@ 2015-12-03 15:44 Matan Barak
[not found] ` <1449157463-25313-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
0 siblings, 1 reply; 13+ messages in thread
From: Matan Barak @ 2015-12-03 15:44 UTC (permalink / raw)
To: Eli Cohen
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Doug Ledford, Eran Ben Elisha,
Yann Droneaud, Matan Barak
Hi Eli,
This patch-set adds user-space support for time-stamping in mlx5_ib.
It implements the necessary API:
(a) ib_create_cq_ex - Add support for CQ creation flags
(b) ib_query_device - return timestamp_mask and hca_core_clock.
We also add support for mmaping the HCA's free running clock.
In order to do so, we use the response of the vendor's extended
part in init_ucontext. This allows us to pass the page offset
of the free running clock register to the user-space driver.
In order to implement it in a future extensible manner, we use the
same mechanism of verbs extensions to the mlx5 vendor part as well.
Regards,
Matan
Changes from v0:
* Limit mmap PAGE_SIZE to 4K (security wise).
* Optimize ib_is_udata_cleared.
* Pass hca_core_clock_offset in the vendor's response part of init_ucontext.
Matan Barak (5):
IB/mlx5: Add create_cq extended command
IB/core: Add ib_is_udata_cleared
IB/mlx5: Add support for hca_core_clock and timestamp_mask
IB/mlx5: Add hca_core_clock_offset to udata in init_ucontext
IB/mlx5: Mmap the HCA's core clock register to user-space
drivers/infiniband/hw/mlx5/cq.c | 7 ++++
drivers/infiniband/hw/mlx5/main.c | 67 +++++++++++++++++++++++++++++++-----
drivers/infiniband/hw/mlx5/mlx5_ib.h | 7 +++-
drivers/infiniband/hw/mlx5/user.h | 12 +++++--
include/linux/mlx5/device.h | 7 ++--
include/linux/mlx5/mlx5_ifc.h | 9 +++--
include/rdma/ib_verbs.h | 67 ++++++++++++++++++++++++++++++++++++
7 files changed, 160 insertions(+), 16 deletions(-)
--
2.1.0
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH for-next V1 1/5] IB/mlx5: Add create_cq extended command
[not found] ` <1449157463-25313-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2015-12-03 15:44 ` Matan Barak
2015-12-03 15:44 ` [PATCH for-next V1 2/5] IB/core: Add ib_is_udata_cleared Matan Barak
` (3 subsequent siblings)
4 siblings, 0 replies; 13+ messages in thread
From: Matan Barak @ 2015-12-03 15:44 UTC (permalink / raw)
To: Eli Cohen
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Doug Ledford, Eran Ben Elisha,
Yann Droneaud, Matan Barak
In order to create a CQ that supports timestamp, mlx5 needs to
support the extended create CQ command with the timestamp flag.
Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Reviewed-by: Eli Cohen <eli-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
drivers/infiniband/hw/mlx5/cq.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/drivers/infiniband/hw/mlx5/cq.c b/drivers/infiniband/hw/mlx5/cq.c
index 3ce5cfa7..a9a7921 100644
--- a/drivers/infiniband/hw/mlx5/cq.c
+++ b/drivers/infiniband/hw/mlx5/cq.c
@@ -760,6 +760,10 @@ static void destroy_cq_kernel(struct mlx5_ib_dev *dev, struct mlx5_ib_cq *cq)
mlx5_db_free(dev->mdev, &cq->db);
}
+enum {
+ CQ_CREATE_FLAGS_SUPPORTED = IB_CQ_FLAGS_TIMESTAMP_COMPLETION
+};
+
struct ib_cq *mlx5_ib_create_cq(struct ib_device *ibdev,
const struct ib_cq_init_attr *attr,
struct ib_ucontext *context,
@@ -783,6 +787,9 @@ struct ib_cq *mlx5_ib_create_cq(struct ib_device *ibdev,
if (entries < 0)
return ERR_PTR(-EINVAL);
+ if (attr->flags & ~CQ_CREATE_FLAGS_SUPPORTED)
+ return ERR_PTR(-EOPNOTSUPP);
+
entries = roundup_pow_of_two(entries + 1);
if (entries > (1 << MLX5_CAP_GEN(dev->mdev, log_max_cq_sz)))
return ERR_PTR(-EINVAL);
--
2.1.0
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH for-next V1 2/5] IB/core: Add ib_is_udata_cleared
[not found] ` <1449157463-25313-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2015-12-03 15:44 ` [PATCH for-next V1 1/5] IB/mlx5: Add create_cq extended command Matan Barak
@ 2015-12-03 15:44 ` Matan Barak
[not found] ` <1449157463-25313-3-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2015-12-03 15:44 ` [PATCH for-next V1 3/5] IB/mlx5: Add support for hca_core_clock and timestamp_mask Matan Barak
` (2 subsequent siblings)
4 siblings, 1 reply; 13+ messages in thread
From: Matan Barak @ 2015-12-03 15:44 UTC (permalink / raw)
To: Eli Cohen
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Doug Ledford, Eran Ben Elisha,
Yann Droneaud, Matan Barak
Extending core and vendor verb commands require us to check that the
unknown part of the user's given command is all zeros.
Adding ib_is_udata_cleared in order to do so.
Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Reviewed-by: Moshe Lazer <moshel-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
include/rdma/ib_verbs.h | 67 +++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 67 insertions(+)
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 31fb409..0ad89e3 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1947,6 +1947,73 @@ static inline int ib_copy_to_udata(struct ib_udata *udata, void *src, size_t len
return copy_to_user(udata->outbuf, src, len) ? -EFAULT : 0;
}
+#define IB_UDATA_ELEMENT_CLEARED(type, ptr, len, expected) \
+ ({type v; \
+ typeof(ptr) __ptr = ptr; \
+ \
+ ptr = (void *)ptr + sizeof(type); \
+ len -= sizeof(type); \
+ !copy_from_user(&v, __ptr, sizeof(v)) && (v == expected); })
+
+static inline bool ib_is_udata_cleared(struct ib_udata *udata,
+ u8 cleared_char,
+ size_t offset,
+ size_t len)
+{
+ const void __user *p = udata->inbuf + offset;
+#ifdef CONFIG_64BIT
+ u64 expected = cleared_char;
+#else
+ u32 expected = cleared_char;
+#endif
+
+ if (len > USHRT_MAX)
+ return false;
+
+ if (len && (uintptr_t)p & 1)
+ if (!IB_UDATA_ELEMENT_CLEARED(u8, p, len, expected))
+ return false;
+
+ expected = expected << 8 | expected;
+ if (len >= 2 && (uintptr_t)p & 2)
+ if (!IB_UDATA_ELEMENT_CLEARED(u16, p, len, expected))
+ return false;
+
+ expected = expected << 16 | expected;
+#ifdef CONFIG_64BIT
+ if (len >= 4 && (uintptr_t)p & 4)
+ if (!IB_UDATA_ELEMENT_CLEARED(u32, p, len, expected))
+ return false;
+
+ expected = expected << 32 | expected;
+#define IB_UDATA_CLEAR_LOOP_TYPE u64
+#else
+#define IB_UDATA_CLEAR_LOOP_TYPE u32
+#endif
+ while (len >= sizeof(IB_UDATA_CLEAR_LOOP_TYPE))
+ if (!IB_UDATA_ELEMENT_CLEARED(IB_UDATA_CLEAR_LOOP_TYPE, p, len,
+ expected))
+ return false;
+
+#ifdef CONFIG_64BIT
+ expected = expected >> 32;
+ if (len >= 4 && (uintptr_t)p & 4)
+ if (!IB_UDATA_ELEMENT_CLEARED(u32, p, len, expected))
+ return false;
+#endif
+ expected = expected >> 16;
+ if (len >= 2 && (uintptr_t)p & 2)
+ if (!IB_UDATA_ELEMENT_CLEARED(u16, p, len, expected))
+ return false;
+
+ expected = expected >> 8;
+ if (len)
+ if (!IB_UDATA_ELEMENT_CLEARED(u8, p, len, expected))
+ return false;
+
+ return true;
+}
+
/**
* ib_modify_qp_is_ok - Check that the supplied attribute mask
* contains all required attributes and no attributes not allowed for
--
2.1.0
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH for-next V1 3/5] IB/mlx5: Add support for hca_core_clock and timestamp_mask
[not found] ` <1449157463-25313-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2015-12-03 15:44 ` [PATCH for-next V1 1/5] IB/mlx5: Add create_cq extended command Matan Barak
2015-12-03 15:44 ` [PATCH for-next V1 2/5] IB/core: Add ib_is_udata_cleared Matan Barak
@ 2015-12-03 15:44 ` Matan Barak
2015-12-03 15:44 ` [PATCH for-next V1 4/5] IB/mlx5: Add hca_core_clock_offset to udata in init_ucontext Matan Barak
2015-12-03 15:44 ` [PATCH for-next V1 5/5] IB/mlx5: Mmap the HCA's core clock register to user-space Matan Barak
4 siblings, 0 replies; 13+ messages in thread
From: Matan Barak @ 2015-12-03 15:44 UTC (permalink / raw)
To: Eli Cohen
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Doug Ledford, Eran Ben Elisha,
Yann Droneaud, Matan Barak
Reporting the hca_core_clock (in kHZ) and the timestamp_mask in
query_device extended verb. timestamp_mask is used by users in order
to know what is the valid range of the raw timestamps, while
hca_core_clock reports the clock frequency that is used for
timestamps.
Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Reviewed-by: Moshe Lazer <moshel-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
drivers/infiniband/hw/mlx5/main.c | 2 ++
include/linux/mlx5/mlx5_ifc.h | 9 ++++++---
2 files changed, 8 insertions(+), 3 deletions(-)
diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index 9b77058..8aa0330 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -504,6 +504,8 @@ static int mlx5_ib_query_device(struct ib_device *ibdev,
props->max_total_mcast_qp_attach = props->max_mcast_qp_attach *
props->max_mcast_grp;
props->max_map_per_fmr = INT_MAX; /* no limit in ConnectIB */
+ props->hca_core_clock = MLX5_CAP_GEN(mdev, device_frequency_khz);
+ props->timestamp_mask = 0x7FFFFFFFFFFFFFFFULL;
#ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
if (MLX5_CAP_GEN(mdev, pg))
diff --git a/include/linux/mlx5/mlx5_ifc.h b/include/linux/mlx5/mlx5_ifc.h
index af51cd2..c57e975 100644
--- a/include/linux/mlx5/mlx5_ifc.h
+++ b/include/linux/mlx5/mlx5_ifc.h
@@ -792,15 +792,18 @@ struct mlx5_ifc_cmd_hca_cap_bits {
u8 reserved_63[0x8];
u8 log_uar_page_sz[0x10];
- u8 reserved_64[0x100];
+ u8 reserved_64[0x20];
+ u8 device_frequency_mhz[0x20];
+ u8 device_frequency_khz[0x20];
+ u8 reserved_65[0xa0];
- u8 reserved_65[0x1f];
+ u8 reserved_66[0x1f];
u8 cqe_zip[0x1];
u8 cqe_zip_timeout[0x10];
u8 cqe_zip_max_num[0x10];
- u8 reserved_66[0x220];
+ u8 reserved_67[0x220];
};
enum {
--
2.1.0
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH for-next V1 4/5] IB/mlx5: Add hca_core_clock_offset to udata in init_ucontext
[not found] ` <1449157463-25313-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
` (2 preceding siblings ...)
2015-12-03 15:44 ` [PATCH for-next V1 3/5] IB/mlx5: Add support for hca_core_clock and timestamp_mask Matan Barak
@ 2015-12-03 15:44 ` Matan Barak
2015-12-03 15:44 ` [PATCH for-next V1 5/5] IB/mlx5: Mmap the HCA's core clock register to user-space Matan Barak
4 siblings, 0 replies; 13+ messages in thread
From: Matan Barak @ 2015-12-03 15:44 UTC (permalink / raw)
To: Eli Cohen
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Doug Ledford, Eran Ben Elisha,
Yann Droneaud, Matan Barak
Pass hca_core_clock_offset to user-space is mandatory in order to
let the user-space read the free-running clock register from the
right offset in the memory mapped page.
Passing this value is done by changing the vendor's command
and response of init_ucontext to be in extensible form.
Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Reviewed-By: Moshe Lazer <moshel-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
drivers/infiniband/hw/mlx5/main.c | 37 ++++++++++++++++++++++++++++--------
drivers/infiniband/hw/mlx5/mlx5_ib.h | 3 +++
drivers/infiniband/hw/mlx5/user.h | 12 ++++++++++--
include/linux/mlx5/device.h | 7 +++++--
4 files changed, 47 insertions(+), 12 deletions(-)
diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index 8aa0330..e4ce010 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -796,8 +796,8 @@ static struct ib_ucontext *mlx5_ib_alloc_ucontext(struct ib_device *ibdev,
struct ib_udata *udata)
{
struct mlx5_ib_dev *dev = to_mdev(ibdev);
- struct mlx5_ib_alloc_ucontext_req_v2 req;
- struct mlx5_ib_alloc_ucontext_resp resp;
+ struct mlx5_ib_alloc_ucontext_req_v2 req = {};
+ struct mlx5_ib_alloc_ucontext_resp resp = {};
struct mlx5_ib_ucontext *context;
struct mlx5_uuar_info *uuari;
struct mlx5_uar *uars;
@@ -812,20 +812,19 @@ static struct ib_ucontext *mlx5_ib_alloc_ucontext(struct ib_device *ibdev,
if (!dev->ib_active)
return ERR_PTR(-EAGAIN);
- memset(&req, 0, sizeof(req));
reqlen = udata->inlen - sizeof(struct ib_uverbs_cmd_hdr);
if (reqlen == sizeof(struct mlx5_ib_alloc_ucontext_req))
ver = 0;
- else if (reqlen == sizeof(struct mlx5_ib_alloc_ucontext_req_v2))
+ else if (reqlen >= sizeof(struct mlx5_ib_alloc_ucontext_req_v2))
ver = 2;
else
return ERR_PTR(-EINVAL);
- err = ib_copy_from_udata(&req, udata, reqlen);
+ err = ib_copy_from_udata(&req, udata, min(reqlen, sizeof(req)));
if (err)
return ERR_PTR(err);
- if (req.flags || req.reserved)
+ if (req.flags)
return ERR_PTR(-EINVAL);
if (req.total_num_uuars > MLX5_MAX_UUARS)
@@ -834,6 +833,14 @@ static struct ib_ucontext *mlx5_ib_alloc_ucontext(struct ib_device *ibdev,
if (req.total_num_uuars == 0)
return ERR_PTR(-EINVAL);
+ if (req.comp_mask)
+ return ERR_PTR(-EOPNOTSUPP);
+
+ if (reqlen > sizeof(req) &&
+ !ib_is_udata_cleared(udata, '\0', sizeof(req),
+ udata->inlen - sizeof(req)))
+ return ERR_PTR(-EOPNOTSUPP);
+
req.total_num_uuars = ALIGN(req.total_num_uuars,
MLX5_NON_FP_BF_REGS_PER_PAGE);
if (req.num_low_latency_uuars > req.total_num_uuars - 1)
@@ -849,6 +856,8 @@ static struct ib_ucontext *mlx5_ib_alloc_ucontext(struct ib_device *ibdev,
resp.max_send_wqebb = 1 << MLX5_CAP_GEN(dev->mdev, log_max_qp_sz);
resp.max_recv_wr = 1 << MLX5_CAP_GEN(dev->mdev, log_max_qp_sz);
resp.max_srq_recv_wr = 1 << MLX5_CAP_GEN(dev->mdev, log_max_srq_sz);
+ resp.response_length = min(offsetof(typeof(resp), response_length) +
+ sizeof(resp.response_length), udata->outlen);
context = kzalloc(sizeof(*context), GFP_KERNEL);
if (!context)
@@ -899,8 +908,20 @@ static struct ib_ucontext *mlx5_ib_alloc_ucontext(struct ib_device *ibdev,
resp.tot_uuars = req.total_num_uuars;
resp.num_ports = MLX5_CAP_GEN(dev->mdev, num_ports);
- err = ib_copy_to_udata(udata, &resp,
- sizeof(resp) - sizeof(resp.reserved));
+
+ if (field_avail(typeof(resp), reserved2, udata->outlen))
+ resp.response_length += sizeof(resp.reserved2);
+
+ if (field_avail(typeof(resp), hca_core_clock_offset, udata->outlen)) {
+ resp.comp_mask |=
+ MLX5_IB_ALLOC_UCONTEXT_RESP_MASK_CORE_CLOCK_OFFSET;
+ resp.hca_core_clock_offset =
+ offsetof(struct mlx5_init_seg, internal_timer_h) %
+ PAGE_SIZE;
+ resp.response_length += sizeof(resp.hca_core_clock_offset);
+ }
+
+ err = ib_copy_to_udata(udata, &resp, resp.response_length);
if (err)
goto out_uars;
diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
index b0deeb3..b2a6643 100644
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
+++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
@@ -55,6 +55,9 @@ pr_err("%s:%s:%d:(pid %d): " format, (dev)->ib_dev.name, __func__, \
pr_warn("%s:%s:%d:(pid %d): " format, (dev)->ib_dev.name, __func__, \
__LINE__, current->pid, ##arg)
+#define field_avail(type, fld, sz) (offsetof(type, fld) + \
+ sizeof(((type *)0)->fld) <= (sz))
+
enum {
MLX5_IB_MMAP_CMD_SHIFT = 8,
MLX5_IB_MMAP_CMD_MASK = 0xff,
diff --git a/drivers/infiniband/hw/mlx5/user.h b/drivers/infiniband/hw/mlx5/user.h
index 76fb7b9..e22a35f 100644
--- a/drivers/infiniband/hw/mlx5/user.h
+++ b/drivers/infiniband/hw/mlx5/user.h
@@ -66,7 +66,11 @@ struct mlx5_ib_alloc_ucontext_req_v2 {
__u32 total_num_uuars;
__u32 num_low_latency_uuars;
__u32 flags;
- __u32 reserved;
+ __u32 comp_mask;
+};
+
+enum mlx5_ib_alloc_ucontext_resp_mask {
+ MLX5_IB_ALLOC_UCONTEXT_RESP_MASK_CORE_CLOCK_OFFSET = 1UL << 0,
};
struct mlx5_ib_alloc_ucontext_resp {
@@ -80,7 +84,11 @@ struct mlx5_ib_alloc_ucontext_resp {
__u32 max_recv_wr;
__u32 max_srq_recv_wr;
__u16 num_ports;
- __u16 reserved;
+ __u16 reserved1;
+ __u32 comp_mask;
+ __u32 response_length;
+ __u32 reserved2;
+ __u64 hca_core_clock_offset;
};
struct mlx5_ib_alloc_pd_resp {
diff --git a/include/linux/mlx5/device.h b/include/linux/mlx5/device.h
index 14b6e7e..841d7de 100644
--- a/include/linux/mlx5/device.h
+++ b/include/linux/mlx5/device.h
@@ -461,9 +461,12 @@ struct mlx5_init_seg {
__be32 cmd_dbell;
__be32 rsvd1[121];
struct health_buffer health;
- __be32 rsvd2[884];
+ __be32 rsvd2[880];
+ __be32 internal_timer_h;
+ __be32 internal_timer_l;
+ __be32 rsvd3[2];
__be32 health_counter;
- __be32 rsvd3[1019];
+ __be32 rsvd4[1019];
__be64 ieee1588_clk;
__be32 ieee1588_clk_type;
__be32 clr_intx;
--
2.1.0
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH for-next V1 5/5] IB/mlx5: Mmap the HCA's core clock register to user-space
[not found] ` <1449157463-25313-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
` (3 preceding siblings ...)
2015-12-03 15:44 ` [PATCH for-next V1 4/5] IB/mlx5: Add hca_core_clock_offset to udata in init_ucontext Matan Barak
@ 2015-12-03 15:44 ` Matan Barak
4 siblings, 0 replies; 13+ messages in thread
From: Matan Barak @ 2015-12-03 15:44 UTC (permalink / raw)
To: Eli Cohen
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Doug Ledford, Eran Ben Elisha,
Yann Droneaud, Matan Barak
In order to read the HCA's current cycles register, we need
to map it to user-space. Add support to map this register
via mmap command.
Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Reviewed-By: Moshe Lazer <moshel-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
drivers/infiniband/hw/mlx5/main.c | 28 ++++++++++++++++++++++++++++
drivers/infiniband/hw/mlx5/mlx5_ib.h | 4 +++-
2 files changed, 31 insertions(+), 1 deletion(-)
diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index e4ce010..41d916c 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -1024,6 +1024,34 @@ static int mlx5_ib_mmap(struct ib_ucontext *ibcontext, struct vm_area_struct *vm
case MLX5_IB_MMAP_GET_CONTIGUOUS_PAGES:
return -ENOSYS;
+ case MLX5_IB_MMAP_CORE_CLOCK:
+ {
+ phys_addr_t pfn;
+
+ if (vma->vm_end - vma->vm_start != PAGE_SIZE)
+ return -EINVAL;
+
+ if (vma->vm_flags & (VM_WRITE | VM_EXEC))
+ return -EPERM;
+
+ /* Don't expose to user-space information it shouldn't have */
+ if (PAGE_SIZE > 4096)
+ return -EOPNOTSUPP;
+
+ vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
+ pfn = (dev->mdev->iseg_base +
+ offsetof(struct mlx5_init_seg, internal_timer_h)) >>
+ PAGE_SHIFT;
+ if (io_remap_pfn_range(vma, vma->vm_start, pfn,
+ PAGE_SIZE, vma->vm_page_prot))
+ return -EAGAIN;
+
+ mlx5_ib_dbg(dev, "mapped internal timer at 0x%lx, PA 0x%llx\n",
+ vma->vm_start,
+ (unsigned long long)pfn << PAGE_SHIFT);
+ break;
+ }
+
default:
return -EINVAL;
}
diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
index b2a6643..1303487 100644
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
+++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
@@ -65,7 +65,9 @@ enum {
enum mlx5_ib_mmap_cmd {
MLX5_IB_MMAP_REGULAR_PAGE = 0,
- MLX5_IB_MMAP_GET_CONTIGUOUS_PAGES = 1, /* always last */
+ MLX5_IB_MMAP_GET_CONTIGUOUS_PAGES = 1,
+ /* 5 is chosen in order to be compatible with old versions of libmlx5 */
+ MLX5_IB_MMAP_CORE_CLOCK = 5,
};
enum {
--
2.1.0
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH for-next V1 2/5] IB/core: Add ib_is_udata_cleared
[not found] ` <1449157463-25313-3-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2015-12-07 13:18 ` Haggai Eran
[not found] ` <56658718.8080308-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
0 siblings, 1 reply; 13+ messages in thread
From: Haggai Eran @ 2015-12-07 13:18 UTC (permalink / raw)
To: Matan Barak, Eli Cohen
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Doug Ledford, Eran Ben Elisha,
Yann Droneaud
On 12/03/2015 05:44 PM, Matan Barak wrote:
> Extending core and vendor verb commands require us to check that the
> unknown part of the user's given command is all zeros.
> Adding ib_is_udata_cleared in order to do so.
>
Why not copy the data into kernel space and run memchr_inv() on it?
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH for-next V1 2/5] IB/core: Add ib_is_udata_cleared
[not found] ` <56658718.8080308-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2015-12-10 14:59 ` Matan Barak
[not found] ` <CAAKD3BC7bhKC+jKV3aH9W+pcBwV19PJZYwXdZue6hNew=CJW9g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
0 siblings, 1 reply; 13+ messages in thread
From: Matan Barak @ 2015-12-10 14:59 UTC (permalink / raw)
To: Haggai Eran
Cc: Matan Barak, Eli Cohen, linux-rdma, Doug Ledford,
Eran Ben Elisha, Yann Droneaud
On Mon, Dec 7, 2015 at 3:18 PM, Haggai Eran <haggaie-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote:
> On 12/03/2015 05:44 PM, Matan Barak wrote:
>> Extending core and vendor verb commands require us to check that the
>> unknown part of the user's given command is all zeros.
>> Adding ib_is_udata_cleared in order to do so.
>>
>
> Why not copy the data into kernel space and run memchr_inv() on it?
>
Probably less efficient, isn't it?
I know it isn't data path, but we'll execute this code in all extended
functions (sometimes even more than once).
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH for-next V1 2/5] IB/core: Add ib_is_udata_cleared
[not found] ` <CAAKD3BC7bhKC+jKV3aH9W+pcBwV19PJZYwXdZue6hNew=CJW9g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-12-10 15:20 ` Haggai Eran
[not found] ` <5669983E.5000200-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
0 siblings, 1 reply; 13+ messages in thread
From: Haggai Eran @ 2015-12-10 15:20 UTC (permalink / raw)
To: Matan Barak
Cc: Matan Barak, Eli Cohen, linux-rdma, Doug Ledford,
Eran Ben Elisha, Yann Droneaud
On 10/12/2015 16:59, Matan Barak wrote:
> On Mon, Dec 7, 2015 at 3:18 PM, Haggai Eran <haggaie-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote:
>> On 12/03/2015 05:44 PM, Matan Barak wrote:
>>> Extending core and vendor verb commands require us to check that the
>>> unknown part of the user's given command is all zeros.
>>> Adding ib_is_udata_cleared in order to do so.
>>>
>>
>> Why not copy the data into kernel space and run memchr_inv() on it?
>>
>
> Probably less efficient, isn't it?
Why do you think it is less efficient?
I'm not sure calling copy_from_user multiple times is very efficient.
For once, you are calling access_ok multiple times. I guess it depends
on the amount of data you are copying.
> I know it isn't data path, but we'll execute this code in all extended
> functions (sometimes even more than once).
Do you think it is important enough to maintain our own copy of
memchr_inv()?
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH for-next V1 2/5] IB/core: Add ib_is_udata_cleared
[not found] ` <5669983E.5000200-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2015-12-10 17:29 ` Matan Barak
[not found] ` <CAAKD3BDfobwroOnvoxPMTA5zL=wxY3KDL=FAvq5EdUOkFG_F+Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
0 siblings, 1 reply; 13+ messages in thread
From: Matan Barak @ 2015-12-10 17:29 UTC (permalink / raw)
To: Haggai Eran
Cc: Matan Barak, Eli Cohen, linux-rdma, Doug Ledford,
Eran Ben Elisha, Yann Droneaud
On Thu, Dec 10, 2015 at 5:20 PM, Haggai Eran <haggaie-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote:
> On 10/12/2015 16:59, Matan Barak wrote:
>> On Mon, Dec 7, 2015 at 3:18 PM, Haggai Eran <haggaie-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote:
>>> On 12/03/2015 05:44 PM, Matan Barak wrote:
>>>> Extending core and vendor verb commands require us to check that the
>>>> unknown part of the user's given command is all zeros.
>>>> Adding ib_is_udata_cleared in order to do so.
>>>>
>>>
>>> Why not copy the data into kernel space and run memchr_inv() on it?
>>>
>>
>> Probably less efficient, isn't it?
> Why do you think it is less efficient?
>
> I'm not sure calling copy_from_user multiple times is very efficient.
> For once, you are calling access_ok multiple times. I guess it depends
> on the amount of data you are copying.
>
Isn't access_ok pretty cheap?
It calls __chk_range_not_ok which on x86 seems like a very cheap
function and __chk_user_ptr which is a compiler check.
I guess most kernel-user implementation will be pretty much in sync,
so we'll possibly call it for a few/dozens of bytes. In that case, I
think this implementation is a bit faster.
>> I know it isn't data path, but we'll execute this code in all extended
>> functions (sometimes even more than once).
> Do you think it is important enough to maintain our own copy of
> memchr_inv()?
>
True, I'm not sure it's important enough, but do you think it's that
complicated?
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH for-next V1 2/5] IB/core: Add ib_is_udata_cleared
[not found] ` <CAAKD3BDfobwroOnvoxPMTA5zL=wxY3KDL=FAvq5EdUOkFG_F+Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-12-13 15:47 ` Haggai Eran
[not found] ` <566D9315.9050807-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
0 siblings, 1 reply; 13+ messages in thread
From: Haggai Eran @ 2015-12-13 15:47 UTC (permalink / raw)
To: Matan Barak
Cc: Matan Barak, Eli Cohen, linux-rdma, Doug Ledford,
Eran Ben Elisha, Yann Droneaud
On 10/12/2015 19:29, Matan Barak wrote:
> On Thu, Dec 10, 2015 at 5:20 PM, Haggai Eran <haggaie-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote:
>> On 10/12/2015 16:59, Matan Barak wrote:
>>> On Mon, Dec 7, 2015 at 3:18 PM, Haggai Eran <haggaie-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote:
>>>> On 12/03/2015 05:44 PM, Matan Barak wrote:
>>>>> Extending core and vendor verb commands require us to check that the
>>>>> unknown part of the user's given command is all zeros.
>>>>> Adding ib_is_udata_cleared in order to do so.
>>>>>
>>>>
>>>> Why not copy the data into kernel space and run memchr_inv() on it?
>>>>
>>>
>>> Probably less efficient, isn't it?
>> Why do you think it is less efficient?
>>
>> I'm not sure calling copy_from_user multiple times is very efficient.
>> For once, you are calling access_ok multiple times. I guess it depends
>> on the amount of data you are copying.
>>
>
> Isn't access_ok pretty cheap?
> It calls __chk_range_not_ok which on x86 seems like a very cheap
> function and __chk_user_ptr which is a compiler check.
> I guess most kernel-user implementation will be pretty much in sync,
> so we'll possibly call it for a few/dozens of bytes. In that case, I
> think this implementation is a bit faster.
>
>>> I know it isn't data path, but we'll execute this code in all extended
>>> functions (sometimes even more than once).
>> Do you think it is important enough to maintain our own copy of
>> memchr_inv()?
>>
>
> True, I'm not sure it's important enough, but do you think it's that
> complicated?
It is complicated in my opinion. It is 67 lines of code, it's
architecture dependent and relies on preprocessor macros and conditional
code. I think this kind of stuff belongs in lib/string.c and not in the
RDMA stack.
Haggai
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH for-next V1 2/5] IB/core: Add ib_is_udata_cleared
[not found] ` <566D9315.9050807-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2015-12-14 16:31 ` Matan Barak
2015-12-15 14:13 ` Haggai Eran
0 siblings, 1 reply; 13+ messages in thread
From: Matan Barak @ 2015-12-14 16:31 UTC (permalink / raw)
To: Haggai Eran
Cc: Matan Barak, Eli Cohen, linux-rdma, Doug Ledford,
Eran Ben Elisha, Yann Droneaud
On Sun, Dec 13, 2015 at 5:47 PM, Haggai Eran <haggaie-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote:
> On 10/12/2015 19:29, Matan Barak wrote:
>> On Thu, Dec 10, 2015 at 5:20 PM, Haggai Eran <haggaie-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote:
>>> On 10/12/2015 16:59, Matan Barak wrote:
>>>> On Mon, Dec 7, 2015 at 3:18 PM, Haggai Eran <haggaie-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote:
>>>>> On 12/03/2015 05:44 PM, Matan Barak wrote:
>>>>>> Extending core and vendor verb commands require us to check that the
>>>>>> unknown part of the user's given command is all zeros.
>>>>>> Adding ib_is_udata_cleared in order to do so.
>>>>>>
>>>>>
>>>>> Why not copy the data into kernel space and run memchr_inv() on it?
>>>>>
>>>>
>>>> Probably less efficient, isn't it?
>>> Why do you think it is less efficient?
>>>
>>> I'm not sure calling copy_from_user multiple times is very efficient.
>>> For once, you are calling access_ok multiple times. I guess it depends
>>> on the amount of data you are copying.
>>>
>>
>> Isn't access_ok pretty cheap?
>> It calls __chk_range_not_ok which on x86 seems like a very cheap
>> function and __chk_user_ptr which is a compiler check.
>> I guess most kernel-user implementation will be pretty much in sync,
>> so we'll possibly call it for a few/dozens of bytes. In that case, I
>> think this implementation is a bit faster.
>>
>>>> I know it isn't data path, but we'll execute this code in all extended
>>>> functions (sometimes even more than once).
>>> Do you think it is important enough to maintain our own copy of
>>> memchr_inv()?
>>>
>>
>> True, I'm not sure it's important enough, but do you think it's that
>> complicated?
>
> It is complicated in my opinion. It is 67 lines of code, it's
> architecture dependent and relies on preprocessor macros and conditional
> code. I think this kind of stuff belongs in lib/string.c and not in the
> RDMA stack.
>
I'm not sure regarding the string.c location, as it deals with user
buffers, but in order not to
be dependent on this, I'll change this code to the following.
static inline bool ib_is_udata_cleared(struct ib_udata *udata,
u8 cleared_char,
size_t offset,
size_t len)
{
const void __user *p = udata->inbuf + offset;
bool ret = false;
u8 *buf;
if (len > USHRT_MAX)
return false;
buf = kmalloc(len, GFP_KERNEL);
if (!buf)
return false;
if (copy_from_user(buf, p, len))
goto free;
ret = !memchr_inv(buf, cleared_char, len);
free:
kfree(buf);
return ret;
}
> Haggai
Regards,
Matan
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH for-next V1 2/5] IB/core: Add ib_is_udata_cleared
2015-12-14 16:31 ` Matan Barak
@ 2015-12-15 14:13 ` Haggai Eran
0 siblings, 0 replies; 13+ messages in thread
From: Haggai Eran @ 2015-12-15 14:13 UTC (permalink / raw)
To: Matan Barak
Cc: Matan Barak, Eli Cohen, linux-rdma, Doug Ledford,
Eran Ben Elisha, Yann Droneaud
On 14/12/2015 18:31, Matan Barak wrote:
> I'm not sure regarding the string.c location, as it deals with user
> buffers, but in order not to
> be dependent on this, I'll change this code to the following.
>
> static inline bool ib_is_udata_cleared(struct ib_udata *udata,
> u8 cleared_char,
> size_t offset,
> size_t len)
> {
> const void __user *p = udata->inbuf + offset;
> bool ret = false;
> u8 *buf;
>
> if (len > USHRT_MAX)
> return false;
>
> buf = kmalloc(len, GFP_KERNEL);
> if (!buf)
> return false;
>
> if (copy_from_user(buf, p, len))
> goto free;
>
> ret = !memchr_inv(buf, cleared_char, len);
>
> free:
> kfree(buf);
> return ret;
> }
Looks good to me.
Thanks,
Haggai
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2015-12-15 14:13 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-12-03 15:44 [PATCH for-next V1 0/5] User-space time-stamping support for mlx5_ib Matan Barak
[not found] ` <1449157463-25313-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2015-12-03 15:44 ` [PATCH for-next V1 1/5] IB/mlx5: Add create_cq extended command Matan Barak
2015-12-03 15:44 ` [PATCH for-next V1 2/5] IB/core: Add ib_is_udata_cleared Matan Barak
[not found] ` <1449157463-25313-3-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2015-12-07 13:18 ` Haggai Eran
[not found] ` <56658718.8080308-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2015-12-10 14:59 ` Matan Barak
[not found] ` <CAAKD3BC7bhKC+jKV3aH9W+pcBwV19PJZYwXdZue6hNew=CJW9g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-12-10 15:20 ` Haggai Eran
[not found] ` <5669983E.5000200-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2015-12-10 17:29 ` Matan Barak
[not found] ` <CAAKD3BDfobwroOnvoxPMTA5zL=wxY3KDL=FAvq5EdUOkFG_F+Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-12-13 15:47 ` Haggai Eran
[not found] ` <566D9315.9050807-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2015-12-14 16:31 ` Matan Barak
2015-12-15 14:13 ` Haggai Eran
2015-12-03 15:44 ` [PATCH for-next V1 3/5] IB/mlx5: Add support for hca_core_clock and timestamp_mask Matan Barak
2015-12-03 15:44 ` [PATCH for-next V1 4/5] IB/mlx5: Add hca_core_clock_offset to udata in init_ucontext Matan Barak
2015-12-03 15:44 ` [PATCH for-next V1 5/5] IB/mlx5: Mmap the HCA's core clock register to user-space Matan Barak
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.