* [PATCH v2 for-next 0/7] Add support for multicast loopback prevention to mlx4 @ 2015-10-15 11:44 Eran Ben Elisha [not found] ` <1444909482-17113-1-git-send-email-eranbe-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> 0 siblings, 1 reply; 29+ messages in thread From: Eran Ben Elisha @ 2015-10-15 11:44 UTC (permalink / raw) To: Doug Ledford Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Or Gerlitz, Eran Ben Elisha, Christoph Lameter Hi Doug, This patch-set adds a new implementation for multicast loopback prevention for mlx4 driver. The current implementation is very limited, especially if link layer is Ethernet. The new implementation is based on HW feature of dropping incoming multicast packets if the sender QP counter index is equal to the receiver counter index. Patch 0001 extends ib_uverbs_create_qp in order to allow receiving the multicast loopback flag at create flags. Patch 0002 allows setting IB_QP_CREATE_BLOCK_MULTICAST_LOOPBACK in create_flag field both from uverbs and ib_create_qp. Patch 0003 adds an infrastructure for the counters' loopback prevention in the mlx4_core. Patch 0004 modifies mlx4_en QPs to use the new loopback prevention mode. Patches 0005-0007 implements this feature for mlx4_ib driver. Changes from v0: Move loopback assignment outside the for loop according to Yuval's comment rebase over to-be-rebased/for-4.3 Changes from v1: rebase over k.o/for-4.4 Fix cover letter patch description - was out of order Tested-by: Christoph Lameter <cl-vYTEC60ixJUAvxtiuMwx3w@public.gmane.org> Thanks, Eran. Eran Ben Elisha (5): IB/core: Extend ib_uverbs_create_qp IB/core: Allow setting create flags in QP init attribute IB/mlx4: Add IB counters table IB/mlx4: Add counter based implementation for QP multicast loopback block IB/mlx4: Add support for blocking multicast loopback QP creation user flag Maor Gottlieb (2): net/mlx4_core: Add support for filtering multicast loopback net/mlx4_en: Implement mcast loopback prevention for ETH qps drivers/infiniband/core/uverbs.h | 1 + drivers/infiniband/core/uverbs_cmd.c | 259 +++++++++++++++------ drivers/infiniband/core/uverbs_main.c | 1 + drivers/infiniband/hw/mlx4/mad.c | 25 +- drivers/infiniband/hw/mlx4/main.c | 66 ++++-- drivers/infiniband/hw/mlx4/mlx4_ib.h | 10 +- drivers/infiniband/hw/mlx4/qp.c | 88 ++++++- drivers/net/ethernet/mellanox/mlx4/en_main.c | 22 ++ drivers/net/ethernet/mellanox/mlx4/en_resources.c | 25 ++ drivers/net/ethernet/mellanox/mlx4/fw.c | 6 + drivers/net/ethernet/mellanox/mlx4/mlx4_en.h | 3 +- drivers/net/ethernet/mellanox/mlx4/qp.c | 19 +- .../net/ethernet/mellanox/mlx4/resource_tracker.c | 30 ++- include/linux/mlx4/device.h | 2 + include/linux/mlx4/qp.h | 24 +- include/uapi/rdma/ib_user_verbs.h | 26 +++ 16 files changed, 498 insertions(+), 109 deletions(-) -- 1.8.3.1 -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 29+ messages in thread
[parent not found: <1444909482-17113-1-git-send-email-eranbe-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>]
* [PATCH v2 for-next 1/7] IB/core: Extend ib_uverbs_create_qp [not found] ` <1444909482-17113-1-git-send-email-eranbe-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> @ 2015-10-15 11:44 ` Eran Ben Elisha [not found] ` <1444909482-17113-2-git-send-email-eranbe-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> 2015-10-15 11:44 ` [PATCH v2 for-next 2/7] IB/core: Allow setting create flags in QP init attribute Eran Ben Elisha ` (5 subsequent siblings) 6 siblings, 1 reply; 29+ messages in thread From: Eran Ben Elisha @ 2015-10-15 11:44 UTC (permalink / raw) To: Doug Ledford Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Or Gerlitz, Eran Ben Elisha, Christoph Lameter ib_uverbs_ex_create_qp follows the extension verbs mechanism. New features (for example, QP creation flags field which is added in a downstream patch) could used via user-space libraries without breaking the ABI. Signed-off-by: Eran Ben Elisha <eranbe-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> --- drivers/infiniband/core/uverbs.h | 1 + drivers/infiniband/core/uverbs_cmd.c | 259 +++++++++++++++++++++++++--------- drivers/infiniband/core/uverbs_main.c | 1 + include/uapi/rdma/ib_user_verbs.h | 26 ++++ 4 files changed, 222 insertions(+), 65 deletions(-) diff --git a/drivers/infiniband/core/uverbs.h b/drivers/infiniband/core/uverbs.h index 3863d33..94bbd8c 100644 --- a/drivers/infiniband/core/uverbs.h +++ b/drivers/infiniband/core/uverbs.h @@ -272,5 +272,6 @@ IB_UVERBS_DECLARE_EX_CMD(create_flow); IB_UVERBS_DECLARE_EX_CMD(destroy_flow); IB_UVERBS_DECLARE_EX_CMD(query_device); IB_UVERBS_DECLARE_EX_CMD(create_cq); +IB_UVERBS_DECLARE_EX_CMD(create_qp); #endif /* UVERBS_H */ diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c index be4cb9f..e795d59 100644 --- a/drivers/infiniband/core/uverbs_cmd.c +++ b/drivers/infiniband/core/uverbs_cmd.c @@ -1741,66 +1741,65 @@ ssize_t ib_uverbs_destroy_cq(struct ib_uverbs_file *file, return in_len; } -ssize_t ib_uverbs_create_qp(struct ib_uverbs_file *file, - struct ib_device *ib_dev, - const char __user *buf, int in_len, - int out_len) -{ - struct ib_uverbs_create_qp cmd; - struct ib_uverbs_create_qp_resp resp; - struct ib_udata udata; - struct ib_uqp_object *obj; - struct ib_device *device; - struct ib_pd *pd = NULL; - struct ib_xrcd *xrcd = NULL; - struct ib_uobject *uninitialized_var(xrcd_uobj); - struct ib_cq *scq = NULL, *rcq = NULL; - struct ib_srq *srq = NULL; - struct ib_qp *qp; - struct ib_qp_init_attr attr; - int ret; - - if (out_len < sizeof resp) - return -ENOSPC; - - if (copy_from_user(&cmd, buf, sizeof cmd)) - return -EFAULT; +static int create_qp(struct ib_uverbs_file *file, + struct ib_udata *ucore, + struct ib_udata *uhw, + struct ib_uverbs_ex_create_qp *cmd, + size_t cmd_sz, + int (*cb)(struct ib_uverbs_file *file, + struct ib_uverbs_ex_create_qp_resp *resp, + struct ib_udata *udata), + void *context) +{ + struct ib_uqp_object *obj; + struct ib_device *device; + struct ib_pd *pd = NULL; + struct ib_xrcd *xrcd = NULL; + struct ib_uobject *uninitialized_var(xrcd_uobj); + struct ib_cq *scq = NULL, *rcq = NULL; + struct ib_srq *srq = NULL; + struct ib_qp *qp; + char *buf; + struct ib_qp_init_attr attr; + struct ib_uverbs_ex_create_qp_resp resp; + int ret; - if (cmd.qp_type == IB_QPT_RAW_PACKET && !capable(CAP_NET_RAW)) + if (cmd->qp_type == IB_QPT_RAW_PACKET && !capable(CAP_NET_RAW)) return -EPERM; - INIT_UDATA(&udata, buf + sizeof cmd, - (unsigned long) cmd.response + sizeof resp, - in_len - sizeof cmd, out_len - sizeof resp); - obj = kzalloc(sizeof *obj, GFP_KERNEL); if (!obj) return -ENOMEM; - init_uobj(&obj->uevent.uobject, cmd.user_handle, file->ucontext, &qp_lock_class); + init_uobj(&obj->uevent.uobject, cmd->user_handle, file->ucontext, + &qp_lock_class); down_write(&obj->uevent.uobject.mutex); - if (cmd.qp_type == IB_QPT_XRC_TGT) { - xrcd = idr_read_xrcd(cmd.pd_handle, file->ucontext, &xrcd_uobj); + if (cmd->qp_type == IB_QPT_XRC_TGT) { + xrcd = idr_read_xrcd(cmd->pd_handle, file->ucontext, + &xrcd_uobj); if (!xrcd) { ret = -EINVAL; goto err_put; } device = xrcd->device; } else { - if (cmd.qp_type == IB_QPT_XRC_INI) { - cmd.max_recv_wr = cmd.max_recv_sge = 0; + if (cmd->qp_type == IB_QPT_XRC_INI) { + cmd->max_recv_wr = 0; + cmd->max_recv_sge = 0; } else { - if (cmd.is_srq) { - srq = idr_read_srq(cmd.srq_handle, file->ucontext); + if (cmd->is_srq) { + srq = idr_read_srq(cmd->srq_handle, + file->ucontext); if (!srq || srq->srq_type != IB_SRQT_BASIC) { ret = -EINVAL; goto err_put; } } - if (cmd.recv_cq_handle != cmd.send_cq_handle) { - rcq = idr_read_cq(cmd.recv_cq_handle, file->ucontext, 0); + if (cmd->recv_cq_handle != cmd->send_cq_handle) { + rcq = idr_read_cq(cmd->recv_cq_handle, + file->ucontext, 0); if (!rcq) { ret = -EINVAL; goto err_put; @@ -1808,9 +1807,9 @@ ssize_t ib_uverbs_create_qp(struct ib_uverbs_file *file, } } - scq = idr_read_cq(cmd.send_cq_handle, file->ucontext, !!rcq); + scq = idr_read_cq(cmd->send_cq_handle, file->ucontext, !!rcq); rcq = rcq ?: scq; - pd = idr_read_pd(cmd.pd_handle, file->ucontext); + pd = idr_read_pd(cmd->pd_handle, file->ucontext); if (!pd || !scq) { ret = -EINVAL; goto err_put; @@ -1825,31 +1824,54 @@ ssize_t ib_uverbs_create_qp(struct ib_uverbs_file *file, attr.recv_cq = rcq; attr.srq = srq; attr.xrcd = xrcd; - attr.sq_sig_type = cmd.sq_sig_all ? IB_SIGNAL_ALL_WR : IB_SIGNAL_REQ_WR; - attr.qp_type = cmd.qp_type; + attr.sq_sig_type = cmd->sq_sig_all ? IB_SIGNAL_ALL_WR : + IB_SIGNAL_REQ_WR; + attr.qp_type = cmd->qp_type; attr.create_flags = 0; - attr.cap.max_send_wr = cmd.max_send_wr; - attr.cap.max_recv_wr = cmd.max_recv_wr; - attr.cap.max_send_sge = cmd.max_send_sge; - attr.cap.max_recv_sge = cmd.max_recv_sge; - attr.cap.max_inline_data = cmd.max_inline_data; + attr.cap.max_send_wr = cmd->max_send_wr; + attr.cap.max_recv_wr = cmd->max_recv_wr; + attr.cap.max_send_sge = cmd->max_send_sge; + attr.cap.max_recv_sge = cmd->max_recv_sge; + attr.cap.max_inline_data = cmd->max_inline_data; obj->uevent.events_reported = 0; INIT_LIST_HEAD(&obj->uevent.event_list); INIT_LIST_HEAD(&obj->mcast_list); - if (cmd.qp_type == IB_QPT_XRC_TGT) + if (cmd_sz >= offsetof(typeof(*cmd), create_flags) + + sizeof(cmd->create_flags)) + attr.create_flags = cmd->create_flags; + + if (attr.create_flags) { + ret = -EINVAL; + goto err_put; + } + + if (cmd->comp_mask) { + ret = -EINVAL; + goto err_put; + } + + buf = (void *)cmd + sizeof(*cmd); + if (cmd_sz > sizeof(*cmd)) + if (!(buf[0] == 0 && !memcmp(buf, buf + 1, + cmd_sz - sizeof(*cmd) - 1))) { + ret = -EINVAL; + goto err_put; + } + + if (cmd->qp_type == IB_QPT_XRC_TGT) qp = ib_create_qp(pd, &attr); else - qp = device->create_qp(pd, &attr, &udata); + qp = device->create_qp(pd, &attr, uhw); if (IS_ERR(qp)) { ret = PTR_ERR(qp); goto err_put; } - if (cmd.qp_type != IB_QPT_XRC_TGT) { + if (cmd->qp_type != IB_QPT_XRC_TGT) { qp->real_qp = qp; qp->device = device; qp->pd = pd; @@ -1875,19 +1897,20 @@ ssize_t ib_uverbs_create_qp(struct ib_uverbs_file *file, goto err_destroy; memset(&resp, 0, sizeof resp); - resp.qpn = qp->qp_num; - resp.qp_handle = obj->uevent.uobject.id; - resp.max_recv_sge = attr.cap.max_recv_sge; - resp.max_send_sge = attr.cap.max_send_sge; - resp.max_recv_wr = attr.cap.max_recv_wr; - resp.max_send_wr = attr.cap.max_send_wr; - resp.max_inline_data = attr.cap.max_inline_data; + resp.base.qpn = qp->qp_num; + resp.base.qp_handle = obj->uevent.uobject.id; + resp.base.max_recv_sge = attr.cap.max_recv_sge; + resp.base.max_send_sge = attr.cap.max_send_sge; + resp.base.max_recv_wr = attr.cap.max_recv_wr; + resp.base.max_send_wr = attr.cap.max_send_wr; + resp.base.max_inline_data = attr.cap.max_inline_data; - if (copy_to_user((void __user *) (unsigned long) cmd.response, - &resp, sizeof resp)) { - ret = -EFAULT; - goto err_copy; - } + resp.response_length = offsetof(typeof(resp), response_length) + + sizeof(resp.response_length); + + ret = cb(file, &resp, ucore); + if (ret) + goto err_cb; if (xrcd) { obj->uxrcd = container_of(xrcd_uobj, struct ib_uxrcd_object, @@ -1913,9 +1936,8 @@ ssize_t ib_uverbs_create_qp(struct ib_uverbs_file *file, up_write(&obj->uevent.uobject.mutex); - return in_len; - -err_copy: + return 0; +err_cb: idr_remove_uobj(&ib_uverbs_qp_idr, &obj->uevent.uobject); err_destroy: @@ -1937,6 +1959,113 @@ err_put: return ret; } +static int ib_uverbs_create_qp_cb(struct ib_uverbs_file *file, + struct ib_uverbs_ex_create_qp_resp *resp, + struct ib_udata *ucore) +{ + if (ib_copy_to_udata(ucore, &resp->base, sizeof(resp->base))) + return -EFAULT; + + return 0; +} + +ssize_t ib_uverbs_create_qp(struct ib_uverbs_file *file, + struct ib_device *ib_dev, + const char __user *buf, int in_len, + int out_len) +{ + struct ib_uverbs_create_qp cmd; + struct ib_uverbs_ex_create_qp cmd_ex; + struct ib_udata ucore; + struct ib_udata uhw; + ssize_t resp_size = sizeof(struct ib_uverbs_create_qp_resp); + int err; + + if (out_len < resp_size) + return -ENOSPC; + + if (copy_from_user(&cmd, buf, sizeof(cmd))) + return -EFAULT; + + INIT_UDATA(&ucore, buf, (unsigned long)cmd.response, sizeof(cmd), + resp_size); + INIT_UDATA(&uhw, buf + sizeof(cmd), + (unsigned long)cmd.response + resp_size, + in_len - sizeof(cmd), out_len - resp_size); + + memset(&cmd_ex, 0, sizeof(cmd_ex)); + cmd_ex.user_handle = cmd.user_handle; + cmd_ex.pd_handle = cmd.pd_handle; + cmd_ex.send_cq_handle = cmd.send_cq_handle; + cmd_ex.recv_cq_handle = cmd.recv_cq_handle; + cmd_ex.srq_handle = cmd.srq_handle; + cmd_ex.max_send_wr = cmd.max_send_wr; + cmd_ex.max_recv_wr = cmd.max_recv_wr; + cmd_ex.max_send_sge = cmd.max_send_sge; + cmd_ex.max_recv_sge = cmd.max_recv_sge; + cmd_ex.max_inline_data = cmd.max_inline_data; + cmd_ex.sq_sig_all = cmd.sq_sig_all; + cmd_ex.qp_type = cmd.qp_type; + cmd_ex.is_srq = cmd.is_srq; + + err = create_qp(file, &ucore, &uhw, &cmd_ex, + offsetof(typeof(cmd_ex), is_srq) + + sizeof(cmd.is_srq), ib_uverbs_create_qp_cb, + NULL); + + if (err) + return err; + + return in_len; +} + +static int ib_uverbs_ex_create_qp_cb(struct ib_uverbs_file *file, + struct ib_uverbs_ex_create_qp_resp *resp, + struct ib_udata *ucore) +{ + if (ib_copy_to_udata(ucore, resp, resp->response_length)) + return -EFAULT; + + return 0; +} + +int ib_uverbs_ex_create_qp(struct ib_uverbs_file *file, + struct ib_device *ib_dev, + struct ib_udata *ucore, + struct ib_udata *uhw) +{ + struct ib_uverbs_ex_create_qp_resp resp; + struct ib_uverbs_ex_create_qp cmd = {0}; + int err; + + if (ucore->inlen < (offsetof(typeof(cmd), comp_mask) + + sizeof(cmd.comp_mask))) + return -EINVAL; + + err = ib_copy_from_udata(&cmd, ucore, min(sizeof(cmd), ucore->inlen)); + if (err) + return err; + + if (cmd.comp_mask) + return -EINVAL; + + if (cmd.reserved) + return -EINVAL; + + if (ucore->outlen < (offsetof(typeof(resp), response_length) + + sizeof(resp.response_length))) + return -ENOSPC; + + err = create_qp(file, ucore, uhw, &cmd, + min(ucore->inlen, sizeof(cmd)), + ib_uverbs_ex_create_qp_cb, NULL); + + if (err) + return err; + + return 0; +} + ssize_t ib_uverbs_open_qp(struct ib_uverbs_file *file, struct ib_device *ib_dev, const char __user *buf, int in_len, int out_len) diff --git a/drivers/infiniband/core/uverbs_main.c b/drivers/infiniband/core/uverbs_main.c index c29a660..e3ef288 100644 --- a/drivers/infiniband/core/uverbs_main.c +++ b/drivers/infiniband/core/uverbs_main.c @@ -127,6 +127,7 @@ static int (*uverbs_ex_cmd_table[])(struct ib_uverbs_file *file, [IB_USER_VERBS_EX_CMD_DESTROY_FLOW] = ib_uverbs_ex_destroy_flow, [IB_USER_VERBS_EX_CMD_QUERY_DEVICE] = ib_uverbs_ex_query_device, [IB_USER_VERBS_EX_CMD_CREATE_CQ] = ib_uverbs_ex_create_cq, + [IB_USER_VERBS_EX_CMD_CREATE_QP] = ib_uverbs_ex_create_qp, }; static void ib_uverbs_add_one(struct ib_device *device); diff --git a/include/uapi/rdma/ib_user_verbs.h b/include/uapi/rdma/ib_user_verbs.h index 978841e..8126c14 100644 --- a/include/uapi/rdma/ib_user_verbs.h +++ b/include/uapi/rdma/ib_user_verbs.h @@ -92,6 +92,7 @@ enum { enum { IB_USER_VERBS_EX_CMD_QUERY_DEVICE = IB_USER_VERBS_CMD_QUERY_DEVICE, IB_USER_VERBS_EX_CMD_CREATE_CQ = IB_USER_VERBS_CMD_CREATE_CQ, + IB_USER_VERBS_EX_CMD_CREATE_QP = IB_USER_VERBS_CMD_CREATE_QP, IB_USER_VERBS_EX_CMD_CREATE_FLOW = IB_USER_VERBS_CMD_THRESHOLD, IB_USER_VERBS_EX_CMD_DESTROY_FLOW, }; @@ -516,6 +517,25 @@ struct ib_uverbs_create_qp { __u64 driver_data[0]; }; +struct ib_uverbs_ex_create_qp { + __u64 user_handle; + __u32 pd_handle; + __u32 send_cq_handle; + __u32 recv_cq_handle; + __u32 srq_handle; + __u32 max_send_wr; + __u32 max_recv_wr; + __u32 max_send_sge; + __u32 max_recv_sge; + __u32 max_inline_data; + __u8 sq_sig_all; + __u8 qp_type; + __u8 is_srq; + __u8 reserved; + __u32 comp_mask; + __u32 create_flags; +}; + struct ib_uverbs_open_qp { __u64 response; __u64 user_handle; @@ -538,6 +558,12 @@ struct ib_uverbs_create_qp_resp { __u32 reserved; }; +struct ib_uverbs_ex_create_qp_resp { + struct ib_uverbs_create_qp_resp base; + __u32 comp_mask; + __u32 response_length; +}; + /* * This struct needs to remain a multiple of 8 bytes to keep the * alignment of the modify QP parameters. -- 1.8.3.1 -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply related [flat|nested] 29+ messages in thread
[parent not found: <1444909482-17113-2-git-send-email-eranbe-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>]
* Re: [PATCH v2 for-next 1/7] IB/core: Extend ib_uverbs_create_qp [not found] ` <1444909482-17113-2-git-send-email-eranbe-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> @ 2015-10-21 8:34 ` Haggai Eran [not found] ` <56274E02.7010002-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> 2015-10-21 9:53 ` Sagi Grimberg 1 sibling, 1 reply; 29+ messages in thread From: Haggai Eran @ 2015-10-21 8:34 UTC (permalink / raw) To: Eran Ben Elisha, Doug Ledford Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Or Gerlitz, Christoph Lameter On 15/10/2015 14:44, Eran Ben Elisha wrote: > ib_uverbs_ex_create_qp follows the extension verbs > mechanism. New features (for example, QP creation flags > field which is added in a downstream patch) could used > via user-space libraries without breaking the ABI. > > Signed-off-by: Eran Ben Elisha <eranbe-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> > --- > diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c > index be4cb9f..e795d59 100644 > --- a/drivers/infiniband/core/uverbs_cmd.c > +++ b/drivers/infiniband/core/uverbs_cmd.c > @@ -1741,66 +1741,65 @@ ssize_t ib_uverbs_destroy_cq(struct ib_uverbs_file *file, > return in_len; > } > > -ssize_t ib_uverbs_create_qp(struct ib_uverbs_file *file, > - struct ib_device *ib_dev, > - const char __user *buf, int in_len, > - int out_len) > -{ > - struct ib_uverbs_create_qp cmd; > - struct ib_uverbs_create_qp_resp resp; > - struct ib_udata udata; > - struct ib_uqp_object *obj; > - struct ib_device *device; > - struct ib_pd *pd = NULL; > - struct ib_xrcd *xrcd = NULL; > - struct ib_uobject *uninitialized_var(xrcd_uobj); > - struct ib_cq *scq = NULL, *rcq = NULL; > - struct ib_srq *srq = NULL; > - struct ib_qp *qp; > - struct ib_qp_init_attr attr; > - int ret; > - > - if (out_len < sizeof resp) > - return -ENOSPC; > - > - if (copy_from_user(&cmd, buf, sizeof cmd)) > - return -EFAULT; > +static int create_qp(struct ib_uverbs_file *file, > + struct ib_udata *ucore, > + struct ib_udata *uhw, > + struct ib_uverbs_ex_create_qp *cmd, > + size_t cmd_sz, > + int (*cb)(struct ib_uverbs_file *file, > + struct ib_uverbs_ex_create_qp_resp *resp, > + struct ib_udata *udata), > + void *context) > +{ > + struct ib_uqp_object *obj; > + struct ib_device *device; > + struct ib_pd *pd = NULL; > + struct ib_xrcd *xrcd = NULL; > + struct ib_uobject *uninitialized_var(xrcd_uobj); > + struct ib_cq *scq = NULL, *rcq = NULL; > + struct ib_srq *srq = NULL; > + struct ib_qp *qp; > + char *buf; > + struct ib_qp_init_attr attr; > + struct ib_uverbs_ex_create_qp_resp resp; > + int ret; > > - if (cmd.qp_type == IB_QPT_RAW_PACKET && !capable(CAP_NET_RAW)) > + if (cmd->qp_type == IB_QPT_RAW_PACKET && !capable(CAP_NET_RAW)) > return -EPERM; > > - INIT_UDATA(&udata, buf + sizeof cmd, > - (unsigned long) cmd.response + sizeof resp, > - in_len - sizeof cmd, out_len - sizeof resp); > - > obj = kzalloc(sizeof *obj, GFP_KERNEL); > if (!obj) > return -ENOMEM; > > - init_uobj(&obj->uevent.uobject, cmd.user_handle, file->ucontext, &qp_lock_class); > + init_uobj(&obj->uevent.uobject, cmd->user_handle, file->ucontext, > + &qp_lock_class); > down_write(&obj->uevent.uobject.mutex); > > - if (cmd.qp_type == IB_QPT_XRC_TGT) { > - xrcd = idr_read_xrcd(cmd.pd_handle, file->ucontext, &xrcd_uobj); > + if (cmd->qp_type == IB_QPT_XRC_TGT) { > + xrcd = idr_read_xrcd(cmd->pd_handle, file->ucontext, > + &xrcd_uobj); > if (!xrcd) { > ret = -EINVAL; > goto err_put; > } > device = xrcd->device; > } else { > - if (cmd.qp_type == IB_QPT_XRC_INI) { > - cmd.max_recv_wr = cmd.max_recv_sge = 0; > + if (cmd->qp_type == IB_QPT_XRC_INI) { > + cmd->max_recv_wr = 0; > + cmd->max_recv_sge = 0; > } else { > - if (cmd.is_srq) { > - srq = idr_read_srq(cmd.srq_handle, file->ucontext); > + if (cmd->is_srq) { > + srq = idr_read_srq(cmd->srq_handle, > + file->ucontext); > if (!srq || srq->srq_type != IB_SRQT_BASIC) { > ret = -EINVAL; > goto err_put; > } > } > > - if (cmd.recv_cq_handle != cmd.send_cq_handle) { > - rcq = idr_read_cq(cmd.recv_cq_handle, file->ucontext, 0); > + if (cmd->recv_cq_handle != cmd->send_cq_handle) { > + rcq = idr_read_cq(cmd->recv_cq_handle, > + file->ucontext, 0); > if (!rcq) { > ret = -EINVAL; > goto err_put; > @@ -1808,9 +1807,9 @@ ssize_t ib_uverbs_create_qp(struct ib_uverbs_file *file, > } > } > > - scq = idr_read_cq(cmd.send_cq_handle, file->ucontext, !!rcq); > + scq = idr_read_cq(cmd->send_cq_handle, file->ucontext, !!rcq); > rcq = rcq ?: scq; > - pd = idr_read_pd(cmd.pd_handle, file->ucontext); > + pd = idr_read_pd(cmd->pd_handle, file->ucontext); > if (!pd || !scq) { > ret = -EINVAL; > goto err_put; > @@ -1825,31 +1824,54 @@ ssize_t ib_uverbs_create_qp(struct ib_uverbs_file *file, > attr.recv_cq = rcq; > attr.srq = srq; > attr.xrcd = xrcd; > - attr.sq_sig_type = cmd.sq_sig_all ? IB_SIGNAL_ALL_WR : IB_SIGNAL_REQ_WR; > - attr.qp_type = cmd.qp_type; > + attr.sq_sig_type = cmd->sq_sig_all ? IB_SIGNAL_ALL_WR : > + IB_SIGNAL_REQ_WR; > + attr.qp_type = cmd->qp_type; > attr.create_flags = 0; > > - attr.cap.max_send_wr = cmd.max_send_wr; > - attr.cap.max_recv_wr = cmd.max_recv_wr; > - attr.cap.max_send_sge = cmd.max_send_sge; > - attr.cap.max_recv_sge = cmd.max_recv_sge; > - attr.cap.max_inline_data = cmd.max_inline_data; > + attr.cap.max_send_wr = cmd->max_send_wr; > + attr.cap.max_recv_wr = cmd->max_recv_wr; > + attr.cap.max_send_sge = cmd->max_send_sge; > + attr.cap.max_recv_sge = cmd->max_recv_sge; > + attr.cap.max_inline_data = cmd->max_inline_data; > > obj->uevent.events_reported = 0; > INIT_LIST_HEAD(&obj->uevent.event_list); > INIT_LIST_HEAD(&obj->mcast_list); > > - if (cmd.qp_type == IB_QPT_XRC_TGT) > + if (cmd_sz >= offsetof(typeof(*cmd), create_flags) + > + sizeof(cmd->create_flags)) > + attr.create_flags = cmd->create_flags; > + > + if (attr.create_flags) { > + ret = -EINVAL; > + goto err_put; > + } > + > + if (cmd->comp_mask) { Do you need this check in addition to the check in ib_uverbs_ex_create_qp()? > + ret = -EINVAL; > + goto err_put; > + } > + > + buf = (void *)cmd + sizeof(*cmd); > + if (cmd_sz > sizeof(*cmd)) > + if (!(buf[0] == 0 && !memcmp(buf, buf + 1, > + cmd_sz - sizeof(*cmd) - 1))) { > + ret = -EINVAL; > + goto err_put; > + } > + > + if (cmd->qp_type == IB_QPT_XRC_TGT) > qp = ib_create_qp(pd, &attr); > else > - qp = device->create_qp(pd, &attr, &udata); > + qp = device->create_qp(pd, &attr, uhw); > > if (IS_ERR(qp)) { > ret = PTR_ERR(qp); > goto err_put; > } > > - if (cmd.qp_type != IB_QPT_XRC_TGT) { > + if (cmd->qp_type != IB_QPT_XRC_TGT) { > qp->real_qp = qp; > qp->device = device; > qp->pd = pd; > @@ -1875,19 +1897,20 @@ ssize_t ib_uverbs_create_qp(struct ib_uverbs_file *file, > goto err_destroy; > > memset(&resp, 0, sizeof resp); > - resp.qpn = qp->qp_num; > - resp.qp_handle = obj->uevent.uobject.id; > - resp.max_recv_sge = attr.cap.max_recv_sge; > - resp.max_send_sge = attr.cap.max_send_sge; > - resp.max_recv_wr = attr.cap.max_recv_wr; > - resp.max_send_wr = attr.cap.max_send_wr; > - resp.max_inline_data = attr.cap.max_inline_data; > + resp.base.qpn = qp->qp_num; > + resp.base.qp_handle = obj->uevent.uobject.id; > + resp.base.max_recv_sge = attr.cap.max_recv_sge; > + resp.base.max_send_sge = attr.cap.max_send_sge; > + resp.base.max_recv_wr = attr.cap.max_recv_wr; > + resp.base.max_send_wr = attr.cap.max_send_wr; > + resp.base.max_inline_data = attr.cap.max_inline_data; > > - if (copy_to_user((void __user *) (unsigned long) cmd.response, > - &resp, sizeof resp)) { > - ret = -EFAULT; > - goto err_copy; > - } > + resp.response_length = offsetof(typeof(resp), response_length) + > + sizeof(resp.response_length); > + > + ret = cb(file, &resp, ucore); > + if (ret) > + goto err_cb; > > if (xrcd) { > obj->uxrcd = container_of(xrcd_uobj, struct ib_uxrcd_object, > @@ -1913,9 +1936,8 @@ ssize_t ib_uverbs_create_qp(struct ib_uverbs_file *file, > > up_write(&obj->uevent.uobject.mutex); > > - return in_len; > - > -err_copy: > + return 0; > +err_cb: > idr_remove_uobj(&ib_uverbs_qp_idr, &obj->uevent.uobject); > > err_destroy: > @@ -1937,6 +1959,113 @@ err_put: > return ret; > } > > +static int ib_uverbs_create_qp_cb(struct ib_uverbs_file *file, > + struct ib_uverbs_ex_create_qp_resp *resp, > + struct ib_udata *ucore) > +{ > + if (ib_copy_to_udata(ucore, &resp->base, sizeof(resp->base))) > + return -EFAULT; > + > + return 0; > +} > + > +ssize_t ib_uverbs_create_qp(struct ib_uverbs_file *file, > + struct ib_device *ib_dev, > + const char __user *buf, int in_len, > + int out_len) > +{ > + struct ib_uverbs_create_qp cmd; > + struct ib_uverbs_ex_create_qp cmd_ex; > + struct ib_udata ucore; > + struct ib_udata uhw; > + ssize_t resp_size = sizeof(struct ib_uverbs_create_qp_resp); > + int err; I would expect a check here that in_len >= sizeof(cmd). But I see the previous code didn't have it either. Any reason adding the check would break user-space? > + > + if (out_len < resp_size) > + return -ENOSPC; > + > + if (copy_from_user(&cmd, buf, sizeof(cmd))) > + return -EFAULT; > + > + INIT_UDATA(&ucore, buf, (unsigned long)cmd.response, sizeof(cmd), > + resp_size); > + INIT_UDATA(&uhw, buf + sizeof(cmd), > + (unsigned long)cmd.response + resp_size, > + in_len - sizeof(cmd), out_len - resp_size); > + > + memset(&cmd_ex, 0, sizeof(cmd_ex)); > + cmd_ex.user_handle = cmd.user_handle; > + cmd_ex.pd_handle = cmd.pd_handle; > + cmd_ex.send_cq_handle = cmd.send_cq_handle; > + cmd_ex.recv_cq_handle = cmd.recv_cq_handle; > + cmd_ex.srq_handle = cmd.srq_handle; > + cmd_ex.max_send_wr = cmd.max_send_wr; > + cmd_ex.max_recv_wr = cmd.max_recv_wr; > + cmd_ex.max_send_sge = cmd.max_send_sge; > + cmd_ex.max_recv_sge = cmd.max_recv_sge; > + cmd_ex.max_inline_data = cmd.max_inline_data; > + cmd_ex.sq_sig_all = cmd.sq_sig_all; > + cmd_ex.qp_type = cmd.qp_type; > + cmd_ex.is_srq = cmd.is_srq; > + > + err = create_qp(file, &ucore, &uhw, &cmd_ex, > + offsetof(typeof(cmd_ex), is_srq) + > + sizeof(cmd.is_srq), ib_uverbs_create_qp_cb, > + NULL); > + > + if (err) > + return err; > + > + return in_len; > +} -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 29+ messages in thread
[parent not found: <56274E02.7010002-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>]
* Re: [PATCH v2 for-next 1/7] IB/core: Extend ib_uverbs_create_qp [not found] ` <56274E02.7010002-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> @ 2015-10-21 13:46 ` eran ben elisha [not found] ` <CAKHjkj=k6QSKj0gMe0z5=Kjbe2e7cVCbPhkq9Z457gpvnRyitg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 29+ messages in thread From: eran ben elisha @ 2015-10-21 13:46 UTC (permalink / raw) To: Haggai Eran Cc: Eran Ben Elisha, Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Or Gerlitz, Christoph Lameter On Wed, Oct 21, 2015 at 11:34 AM, Haggai Eran <haggaie-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote: > On 15/10/2015 14:44, Eran Ben Elisha wrote: >> ib_uverbs_ex_create_qp follows the extension verbs >> mechanism. New features (for example, QP creation flags >> field which is added in a downstream patch) could used >> via user-space libraries without breaking the ABI. >> >> Signed-off-by: Eran Ben Elisha <eranbe-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> >> --- > >> diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c >> index be4cb9f..e795d59 100644 >> --- a/drivers/infiniband/core/uverbs_cmd.c >> +++ b/drivers/infiniband/core/uverbs_cmd.c >> @@ -1741,66 +1741,65 @@ ssize_t ib_uverbs_destroy_cq(struct ib_uverbs_file *file, >> return in_len; >> } >> >> -ssize_t ib_uverbs_create_qp(struct ib_uverbs_file *file, >> - struct ib_device *ib_dev, >> - const char __user *buf, int in_len, >> - int out_len) >> -{ >> - struct ib_uverbs_create_qp cmd; >> - struct ib_uverbs_create_qp_resp resp; >> - struct ib_udata udata; >> - struct ib_uqp_object *obj; >> - struct ib_device *device; >> - struct ib_pd *pd = NULL; >> - struct ib_xrcd *xrcd = NULL; >> - struct ib_uobject *uninitialized_var(xrcd_uobj); >> - struct ib_cq *scq = NULL, *rcq = NULL; >> - struct ib_srq *srq = NULL; >> - struct ib_qp *qp; >> - struct ib_qp_init_attr attr; >> - int ret; >> - >> - if (out_len < sizeof resp) >> - return -ENOSPC; >> - >> - if (copy_from_user(&cmd, buf, sizeof cmd)) >> - return -EFAULT; >> +static int create_qp(struct ib_uverbs_file *file, >> + struct ib_udata *ucore, >> + struct ib_udata *uhw, >> + struct ib_uverbs_ex_create_qp *cmd, >> + size_t cmd_sz, >> + int (*cb)(struct ib_uverbs_file *file, >> + struct ib_uverbs_ex_create_qp_resp *resp, >> + struct ib_udata *udata), >> + void *context) >> +{ >> + struct ib_uqp_object *obj; >> + struct ib_device *device; >> + struct ib_pd *pd = NULL; >> + struct ib_xrcd *xrcd = NULL; >> + struct ib_uobject *uninitialized_var(xrcd_uobj); >> + struct ib_cq *scq = NULL, *rcq = NULL; >> + struct ib_srq *srq = NULL; >> + struct ib_qp *qp; >> + char *buf; >> + struct ib_qp_init_attr attr; >> + struct ib_uverbs_ex_create_qp_resp resp; >> + int ret; >> >> - if (cmd.qp_type == IB_QPT_RAW_PACKET && !capable(CAP_NET_RAW)) >> + if (cmd->qp_type == IB_QPT_RAW_PACKET && !capable(CAP_NET_RAW)) >> return -EPERM; >> >> - INIT_UDATA(&udata, buf + sizeof cmd, >> - (unsigned long) cmd.response + sizeof resp, >> - in_len - sizeof cmd, out_len - sizeof resp); >> - >> obj = kzalloc(sizeof *obj, GFP_KERNEL); >> if (!obj) >> return -ENOMEM; >> >> - init_uobj(&obj->uevent.uobject, cmd.user_handle, file->ucontext, &qp_lock_class); >> + init_uobj(&obj->uevent.uobject, cmd->user_handle, file->ucontext, >> + &qp_lock_class); >> down_write(&obj->uevent.uobject.mutex); >> >> - if (cmd.qp_type == IB_QPT_XRC_TGT) { >> - xrcd = idr_read_xrcd(cmd.pd_handle, file->ucontext, &xrcd_uobj); >> + if (cmd->qp_type == IB_QPT_XRC_TGT) { >> + xrcd = idr_read_xrcd(cmd->pd_handle, file->ucontext, >> + &xrcd_uobj); >> if (!xrcd) { >> ret = -EINVAL; >> goto err_put; >> } >> device = xrcd->device; >> } else { >> - if (cmd.qp_type == IB_QPT_XRC_INI) { >> - cmd.max_recv_wr = cmd.max_recv_sge = 0; >> + if (cmd->qp_type == IB_QPT_XRC_INI) { >> + cmd->max_recv_wr = 0; >> + cmd->max_recv_sge = 0; >> } else { >> - if (cmd.is_srq) { >> - srq = idr_read_srq(cmd.srq_handle, file->ucontext); >> + if (cmd->is_srq) { >> + srq = idr_read_srq(cmd->srq_handle, >> + file->ucontext); >> if (!srq || srq->srq_type != IB_SRQT_BASIC) { >> ret = -EINVAL; >> goto err_put; >> } >> } >> >> - if (cmd.recv_cq_handle != cmd.send_cq_handle) { >> - rcq = idr_read_cq(cmd.recv_cq_handle, file->ucontext, 0); >> + if (cmd->recv_cq_handle != cmd->send_cq_handle) { >> + rcq = idr_read_cq(cmd->recv_cq_handle, >> + file->ucontext, 0); >> if (!rcq) { >> ret = -EINVAL; >> goto err_put; >> @@ -1808,9 +1807,9 @@ ssize_t ib_uverbs_create_qp(struct ib_uverbs_file *file, >> } >> } >> >> - scq = idr_read_cq(cmd.send_cq_handle, file->ucontext, !!rcq); >> + scq = idr_read_cq(cmd->send_cq_handle, file->ucontext, !!rcq); >> rcq = rcq ?: scq; >> - pd = idr_read_pd(cmd.pd_handle, file->ucontext); >> + pd = idr_read_pd(cmd->pd_handle, file->ucontext); >> if (!pd || !scq) { >> ret = -EINVAL; >> goto err_put; >> @@ -1825,31 +1824,54 @@ ssize_t ib_uverbs_create_qp(struct ib_uverbs_file *file, >> attr.recv_cq = rcq; >> attr.srq = srq; >> attr.xrcd = xrcd; >> - attr.sq_sig_type = cmd.sq_sig_all ? IB_SIGNAL_ALL_WR : IB_SIGNAL_REQ_WR; >> - attr.qp_type = cmd.qp_type; >> + attr.sq_sig_type = cmd->sq_sig_all ? IB_SIGNAL_ALL_WR : >> + IB_SIGNAL_REQ_WR; >> + attr.qp_type = cmd->qp_type; >> attr.create_flags = 0; >> >> - attr.cap.max_send_wr = cmd.max_send_wr; >> - attr.cap.max_recv_wr = cmd.max_recv_wr; >> - attr.cap.max_send_sge = cmd.max_send_sge; >> - attr.cap.max_recv_sge = cmd.max_recv_sge; >> - attr.cap.max_inline_data = cmd.max_inline_data; >> + attr.cap.max_send_wr = cmd->max_send_wr; >> + attr.cap.max_recv_wr = cmd->max_recv_wr; >> + attr.cap.max_send_sge = cmd->max_send_sge; >> + attr.cap.max_recv_sge = cmd->max_recv_sge; >> + attr.cap.max_inline_data = cmd->max_inline_data; >> >> obj->uevent.events_reported = 0; >> INIT_LIST_HEAD(&obj->uevent.event_list); >> INIT_LIST_HEAD(&obj->mcast_list); >> >> - if (cmd.qp_type == IB_QPT_XRC_TGT) >> + if (cmd_sz >= offsetof(typeof(*cmd), create_flags) + >> + sizeof(cmd->create_flags)) >> + attr.create_flags = cmd->create_flags; >> + >> + if (attr.create_flags) { >> + ret = -EINVAL; >> + goto err_put; >> + } >> + >> + if (cmd->comp_mask) { > Do you need this check in addition to the check in ib_uverbs_ex_create_qp()? You are right, I will remove this extra check and resend. > >> + ret = -EINVAL; >> + goto err_put; >> + } >> + >> + buf = (void *)cmd + sizeof(*cmd); >> + if (cmd_sz > sizeof(*cmd)) >> + if (!(buf[0] == 0 && !memcmp(buf, buf + 1, >> + cmd_sz - sizeof(*cmd) - 1))) { >> + ret = -EINVAL; >> + goto err_put; >> + } >> + >> + if (cmd->qp_type == IB_QPT_XRC_TGT) >> qp = ib_create_qp(pd, &attr); >> else >> - qp = device->create_qp(pd, &attr, &udata); >> + qp = device->create_qp(pd, &attr, uhw); >> >> if (IS_ERR(qp)) { >> ret = PTR_ERR(qp); >> goto err_put; >> } >> >> - if (cmd.qp_type != IB_QPT_XRC_TGT) { >> + if (cmd->qp_type != IB_QPT_XRC_TGT) { >> qp->real_qp = qp; >> qp->device = device; >> qp->pd = pd; >> @@ -1875,19 +1897,20 @@ ssize_t ib_uverbs_create_qp(struct ib_uverbs_file *file, >> goto err_destroy; >> >> memset(&resp, 0, sizeof resp); >> - resp.qpn = qp->qp_num; >> - resp.qp_handle = obj->uevent.uobject.id; >> - resp.max_recv_sge = attr.cap.max_recv_sge; >> - resp.max_send_sge = attr.cap.max_send_sge; >> - resp.max_recv_wr = attr.cap.max_recv_wr; >> - resp.max_send_wr = attr.cap.max_send_wr; >> - resp.max_inline_data = attr.cap.max_inline_data; >> + resp.base.qpn = qp->qp_num; >> + resp.base.qp_handle = obj->uevent.uobject.id; >> + resp.base.max_recv_sge = attr.cap.max_recv_sge; >> + resp.base.max_send_sge = attr.cap.max_send_sge; >> + resp.base.max_recv_wr = attr.cap.max_recv_wr; >> + resp.base.max_send_wr = attr.cap.max_send_wr; >> + resp.base.max_inline_data = attr.cap.max_inline_data; >> >> - if (copy_to_user((void __user *) (unsigned long) cmd.response, >> - &resp, sizeof resp)) { >> - ret = -EFAULT; >> - goto err_copy; >> - } >> + resp.response_length = offsetof(typeof(resp), response_length) + >> + sizeof(resp.response_length); >> + >> + ret = cb(file, &resp, ucore); >> + if (ret) >> + goto err_cb; >> >> if (xrcd) { >> obj->uxrcd = container_of(xrcd_uobj, struct ib_uxrcd_object, >> @@ -1913,9 +1936,8 @@ ssize_t ib_uverbs_create_qp(struct ib_uverbs_file *file, >> >> up_write(&obj->uevent.uobject.mutex); >> >> - return in_len; >> - >> -err_copy: >> + return 0; >> +err_cb: >> idr_remove_uobj(&ib_uverbs_qp_idr, &obj->uevent.uobject); >> >> err_destroy: >> @@ -1937,6 +1959,113 @@ err_put: >> return ret; >> } >> >> +static int ib_uverbs_create_qp_cb(struct ib_uverbs_file *file, >> + struct ib_uverbs_ex_create_qp_resp *resp, >> + struct ib_udata *ucore) >> +{ >> + if (ib_copy_to_udata(ucore, &resp->base, sizeof(resp->base))) >> + return -EFAULT; >> + >> + return 0; >> +} >> + >> +ssize_t ib_uverbs_create_qp(struct ib_uverbs_file *file, >> + struct ib_device *ib_dev, >> + const char __user *buf, int in_len, >> + int out_len) >> +{ >> + struct ib_uverbs_create_qp cmd; >> + struct ib_uverbs_ex_create_qp cmd_ex; >> + struct ib_udata ucore; >> + struct ib_udata uhw; >> + ssize_t resp_size = sizeof(struct ib_uverbs_create_qp_resp); >> + int err; > > I would expect a check here that in_len >= sizeof(cmd). But I see the > previous code didn't have it either. Any reason adding the check would > break user-space? This patch just refactor in ib_uverbs_create_qp and doesn't change any of it's logic or fix any bug. we can consider such a fix for the future. > >> + >> + if (out_len < resp_size) >> + return -ENOSPC; >> + >> + if (copy_from_user(&cmd, buf, sizeof(cmd))) >> + return -EFAULT; >> + >> + INIT_UDATA(&ucore, buf, (unsigned long)cmd.response, sizeof(cmd), >> + resp_size); >> + INIT_UDATA(&uhw, buf + sizeof(cmd), >> + (unsigned long)cmd.response + resp_size, >> + in_len - sizeof(cmd), out_len - resp_size); >> + >> + memset(&cmd_ex, 0, sizeof(cmd_ex)); >> + cmd_ex.user_handle = cmd.user_handle; >> + cmd_ex.pd_handle = cmd.pd_handle; >> + cmd_ex.send_cq_handle = cmd.send_cq_handle; >> + cmd_ex.recv_cq_handle = cmd.recv_cq_handle; >> + cmd_ex.srq_handle = cmd.srq_handle; >> + cmd_ex.max_send_wr = cmd.max_send_wr; >> + cmd_ex.max_recv_wr = cmd.max_recv_wr; >> + cmd_ex.max_send_sge = cmd.max_send_sge; >> + cmd_ex.max_recv_sge = cmd.max_recv_sge; >> + cmd_ex.max_inline_data = cmd.max_inline_data; >> + cmd_ex.sq_sig_all = cmd.sq_sig_all; >> + cmd_ex.qp_type = cmd.qp_type; >> + cmd_ex.is_srq = cmd.is_srq; >> + >> + err = create_qp(file, &ucore, &uhw, &cmd_ex, >> + offsetof(typeof(cmd_ex), is_srq) + >> + sizeof(cmd.is_srq), ib_uverbs_create_qp_cb, >> + NULL); >> + >> + if (err) >> + return err; >> + >> + return in_len; >> +} > > -- > To unsubscribe from this list: send the line "unsubscribe linux-rdma" in > the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 29+ messages in thread
[parent not found: <CAKHjkj=k6QSKj0gMe0z5=Kjbe2e7cVCbPhkq9Z457gpvnRyitg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH v2 for-next 1/7] IB/core: Extend ib_uverbs_create_qp [not found] ` <CAKHjkj=k6QSKj0gMe0z5=Kjbe2e7cVCbPhkq9Z457gpvnRyitg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2015-10-21 14:42 ` Haggai Eran 0 siblings, 0 replies; 29+ messages in thread From: Haggai Eran @ 2015-10-21 14:42 UTC (permalink / raw) To: eran ben elisha Cc: Eran Ben Elisha, Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Or Gerlitz, Christoph Lameter On 21/10/2015 16:46, eran ben elisha wrote: >>> +ssize_t ib_uverbs_create_qp(struct ib_uverbs_file *file, >>> >> + struct ib_device *ib_dev, >>> >> + const char __user *buf, int in_len, >>> >> + int out_len) >>> >> +{ >>> >> + struct ib_uverbs_create_qp cmd; >>> >> + struct ib_uverbs_ex_create_qp cmd_ex; >>> >> + struct ib_udata ucore; >>> >> + struct ib_udata uhw; >>> >> + ssize_t resp_size = sizeof(struct ib_uverbs_create_qp_resp); >>> >> + int err; >> > >> > I would expect a check here that in_len >= sizeof(cmd). But I see the >> > previous code didn't have it either. Any reason adding the check would >> > break user-space? > This patch just refactor in ib_uverbs_create_qp and doesn't change any > of it's logic or fix any bug. we can consider such a fix for the > future. Fair enough. Haggai -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH v2 for-next 1/7] IB/core: Extend ib_uverbs_create_qp [not found] ` <1444909482-17113-2-git-send-email-eranbe-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> 2015-10-21 8:34 ` Haggai Eran @ 2015-10-21 9:53 ` Sagi Grimberg [not found] ` <56276095.6020803-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> 1 sibling, 1 reply; 29+ messages in thread From: Sagi Grimberg @ 2015-10-21 9:53 UTC (permalink / raw) To: Eran Ben Elisha, Doug Ledford Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Or Gerlitz, Christoph Lameter On 10/15/2015 2:44 PM, Eran Ben Elisha wrote: > ib_uverbs_ex_create_qp follows the extension verbs > mechanism. New features (for example, QP creation flags > field which is added in a downstream patch) could used > via user-space libraries without breaking the ABI. This is an important addition, I'll need it soon for stuff I'm working on... > +struct ib_uverbs_ex_create_qp { > + __u64 user_handle; > + __u32 pd_handle; > + __u32 send_cq_handle; > + __u32 recv_cq_handle; > + __u32 srq_handle; > + __u32 max_send_wr; > + __u32 max_recv_wr; > + __u32 max_send_sge; > + __u32 max_recv_sge; > + __u32 max_inline_data; > + __u8 sq_sig_all; > + __u8 qp_type; > + __u8 is_srq; > + __u8 reserved; > + __u32 comp_mask; > + __u32 create_flags; Can we please make create_flags u64 to begin with? -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 29+ messages in thread
[parent not found: <56276095.6020803-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>]
* Re: [PATCH v2 for-next 1/7] IB/core: Extend ib_uverbs_create_qp [not found] ` <56276095.6020803-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> @ 2015-10-21 10:04 ` Or Gerlitz [not found] ` <56276336.7000704-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> 2015-10-21 10:26 ` Haggai Eran 1 sibling, 1 reply; 29+ messages in thread From: Or Gerlitz @ 2015-10-21 10:04 UTC (permalink / raw) To: Sagi Grimberg, Eran Ben Elisha, Doug Ledford Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Christoph Lameter On 10/21/2015 12:53 PM, Sagi Grimberg wrote: > On 10/15/2015 2:44 PM, Eran Ben Elisha wrote: > > >> +struct ib_uverbs_ex_create_qp { >> + __u64 user_handle; >> + __u32 pd_handle; >> + __u32 send_cq_handle; >> + __u32 recv_cq_handle; >> + __u32 srq_handle; >> + __u32 max_send_wr; >> + __u32 max_recv_wr; >> + __u32 max_send_sge; >> + __u32 max_recv_sge; >> + __u32 max_inline_data; >> + __u8 sq_sig_all; >> + __u8 qp_type; >> + __u8 is_srq; >> + __u8 reserved; >> + __u32 comp_mask; >> + __u32 create_flags; > > Can we please make create_flags u64 to begin with? In the kernel we used 4-5 flags over 10 years, why worry? Or. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 29+ messages in thread
[parent not found: <56276336.7000704-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>]
* Re: [PATCH v2 for-next 1/7] IB/core: Extend ib_uverbs_create_qp [not found] ` <56276336.7000704-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> @ 2015-10-21 14:09 ` Sagi Grimberg [not found] ` <56279CA4.8040201-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> 0 siblings, 1 reply; 29+ messages in thread From: Sagi Grimberg @ 2015-10-21 14:09 UTC (permalink / raw) To: Or Gerlitz, Eran Ben Elisha, Doug Ledford Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Christoph Lameter On 10/21/2015 1:04 PM, Or Gerlitz wrote: > On 10/21/2015 12:53 PM, Sagi Grimberg wrote: >> On 10/15/2015 2:44 PM, Eran Ben Elisha wrote: >> >> >>> +struct ib_uverbs_ex_create_qp { >>> + __u64 user_handle; >>> + __u32 pd_handle; >>> + __u32 send_cq_handle; >>> + __u32 recv_cq_handle; >>> + __u32 srq_handle; >>> + __u32 max_send_wr; >>> + __u32 max_recv_wr; >>> + __u32 max_send_sge; >>> + __u32 max_recv_sge; >>> + __u32 max_inline_data; >>> + __u8 sq_sig_all; >>> + __u8 qp_type; >>> + __u8 is_srq; >>> + __u8 reserved; >>> + __u32 comp_mask; >>> + __u32 create_flags; >> >> Can we please make create_flags u64 to begin with? > > > In the kernel we used 4-5 flags over 10 years, why worry? I don't see why we should assume that we won't occupy enough bits in a long time just for saving extra 4 bytes. But if you are strongly against it then I won't persist here. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 29+ messages in thread
[parent not found: <56279CA4.8040201-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>]
* Re: [PATCH v2 for-next 1/7] IB/core: Extend ib_uverbs_create_qp [not found] ` <56279CA4.8040201-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> @ 2015-10-21 18:15 ` Or Gerlitz [not found] ` <CAJ3xEMgWcpKbKL3jhiK2RP=H6_iY0eriirAXn1TJaSG7u8frAA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 29+ messages in thread From: Or Gerlitz @ 2015-10-21 18:15 UTC (permalink / raw) To: Sagi Grimberg Cc: Or Gerlitz, Eran Ben Elisha, Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Christoph Lameter On Wed, Oct 21, 2015 at 5:09 PM, Sagi Grimberg wrote: > On 10/21/2015 1:04 PM, Or Gerlitz wrote: >> On 10/21/2015 12:53 PM, Sagi Grimberg wrote: >>> On 10/15/2015 2:44 PM, Eran Ben Elisha wrote: >>>> +struct ib_uverbs_ex_create_qp { >>>> + __u64 user_handle; >>>> + __u32 pd_handle; >>>> + __u32 send_cq_handle; >>>> + __u32 recv_cq_handle; >>>> + __u32 srq_handle; >>>> + __u32 max_send_wr; >>>> + __u32 max_recv_wr; >>>> + __u32 max_send_sge; >>>> + __u32 max_recv_sge; >>>> + __u32 max_inline_data; >>>> + __u8 sq_sig_all; >>>> + __u8 qp_type; >>>> + __u8 is_srq; >>>> + __u8 reserved; >>>> + __u32 comp_mask; >>>> + __u32 create_flags; >>> >>> >>> Can we please make create_flags u64 to begin with? >> In the kernel we used 4-5 flags over 10 years, why worry? > I don't see why we should assume that we won't occupy enough bits in a > long time just for saving extra 4 bytes. But if you are strongly against > it then I won't persist here. Again, the kernel stack consuming rate here was < 1 bit a year when averaging over time since this was introduced. So we should be doing well for the coming ~10-20 years with this 32 bit field, and we can easily extend it later, I verified that with Haggai, so yes, don't want 64 bits now. Or. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 29+ messages in thread
[parent not found: <CAJ3xEMgWcpKbKL3jhiK2RP=H6_iY0eriirAXn1TJaSG7u8frAA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH v2 for-next 1/7] IB/core: Extend ib_uverbs_create_qp [not found] ` <CAJ3xEMgWcpKbKL3jhiK2RP=H6_iY0eriirAXn1TJaSG7u8frAA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2015-10-21 18:21 ` Christoph Lameter [not found] ` <alpine.DEB.2.20.1510211319520.10833-wcBtFHqTun5QOdAKl3ChDw@public.gmane.org> 0 siblings, 1 reply; 29+ messages in thread From: Christoph Lameter @ 2015-10-21 18:21 UTC (permalink / raw) To: Or Gerlitz Cc: Sagi Grimberg, Or Gerlitz, Eran Ben Elisha, Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA On Wed, 21 Oct 2015, Or Gerlitz wrote: > Again, the kernel stack consuming rate here was < 1 bit a year when > averaging over time since this was introduced. So we should be doing > well for the coming ~10-20 years with this 32 bit field, and we can > easily extend it later, I verified that with Haggai, so yes, don't > want 64 bits now. Hmmm... We are running out of dev_cap flags on mlx4 already. They are 32 bits. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 29+ messages in thread
[parent not found: <alpine.DEB.2.20.1510211319520.10833-wcBtFHqTun5QOdAKl3ChDw@public.gmane.org>]
* Re: [PATCH v2 for-next 1/7] IB/core: Extend ib_uverbs_create_qp [not found] ` <alpine.DEB.2.20.1510211319520.10833-wcBtFHqTun5QOdAKl3ChDw@public.gmane.org> @ 2015-10-21 18:52 ` Or Gerlitz 0 siblings, 0 replies; 29+ messages in thread From: Or Gerlitz @ 2015-10-21 18:52 UTC (permalink / raw) To: Christoph Lameter Cc: Sagi Grimberg, Or Gerlitz, Eran Ben Elisha, Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA On Wed, Oct 21, 2015 at 9:21 PM, Christoph Lameter <cl-vYTEC60ixJUAvxtiuMwx3w@public.gmane.org> wrote: > Hmmm... We are running out of dev_cap flags on mlx4 already. They are 32 > bits. Christoph, These are QP creation flags (IB_QP_CREATE_XXX) not device capability flags (IB_DEVICE_YYY) Or. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH v2 for-next 1/7] IB/core: Extend ib_uverbs_create_qp [not found] ` <56276095.6020803-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> 2015-10-21 10:04 ` Or Gerlitz @ 2015-10-21 10:26 ` Haggai Eran 1 sibling, 0 replies; 29+ messages in thread From: Haggai Eran @ 2015-10-21 10:26 UTC (permalink / raw) To: Sagi Grimberg, Eran Ben Elisha, Doug Ledford Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Or Gerlitz, Christoph Lameter On 21/10/2015 12:53, Sagi Grimberg wrote: > On 10/15/2015 2:44 PM, Eran Ben Elisha wrote: >> ib_uverbs_ex_create_qp follows the extension verbs >> mechanism. New features (for example, QP creation flags >> field which is added in a downstream patch) could used >> via user-space libraries without breaking the ABI. > > This is an important addition, I'll need it soon for stuff > I'm working on... > >> +struct ib_uverbs_ex_create_qp { >> + __u64 user_handle; >> + __u32 pd_handle; >> + __u32 send_cq_handle; >> + __u32 recv_cq_handle; >> + __u32 srq_handle; >> + __u32 max_send_wr; >> + __u32 max_recv_wr; >> + __u32 max_send_sge; >> + __u32 max_recv_sge; >> + __u32 max_inline_data; >> + __u8 sq_sig_all; >> + __u8 qp_type; >> + __u8 is_srq; >> + __u8 reserved; >> + __u32 comp_mask; >> + __u32 create_flags; > > Can we please make create_flags u64 to begin with? Note that the size of the struct must be a round number of 64-bit words to avoid different sizeof values on 32-bit systems. If you change the size of create_flags you'll need another 32-bit padding field in the struct. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 29+ messages in thread
* [PATCH v2 for-next 2/7] IB/core: Allow setting create flags in QP init attribute [not found] ` <1444909482-17113-1-git-send-email-eranbe-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> 2015-10-15 11:44 ` [PATCH v2 for-next 1/7] IB/core: Extend ib_uverbs_create_qp Eran Ben Elisha @ 2015-10-15 11:44 ` Eran Ben Elisha [not found] ` <1444909482-17113-3-git-send-email-eranbe-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> 2015-10-15 11:44 ` [PATCH v2 for-next 3/7] net/mlx4_core: Add support for filtering multicast loopback Eran Ben Elisha ` (4 subsequent siblings) 6 siblings, 1 reply; 29+ messages in thread From: Eran Ben Elisha @ 2015-10-15 11:44 UTC (permalink / raw) To: Doug Ledford Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Or Gerlitz, Eran Ben Elisha, Christoph Lameter Allow setting IB_QP_CREATE_BLOCK_MULTICAST_LOOPBACK at create_flags in ib_uverbs_create_qp_ex. Signed-off-by: Eran Ben Elisha <eranbe-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> --- drivers/infiniband/core/uverbs_cmd.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c index e795d59..e9bafa3 100644 --- a/drivers/infiniband/core/uverbs_cmd.c +++ b/drivers/infiniband/core/uverbs_cmd.c @@ -1843,7 +1843,7 @@ static int create_qp(struct ib_uverbs_file *file, sizeof(cmd->create_flags)) attr.create_flags = cmd->create_flags; - if (attr.create_flags) { + if (attr.create_flags & ~IB_QP_CREATE_BLOCK_MULTICAST_LOOPBACK) { ret = -EINVAL; goto err_put; } -- 1.8.3.1 -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply related [flat|nested] 29+ messages in thread
[parent not found: <1444909482-17113-3-git-send-email-eranbe-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>]
* Re: [PATCH v2 for-next 2/7] IB/core: Allow setting create flags in QP init attribute [not found] ` <1444909482-17113-3-git-send-email-eranbe-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> @ 2015-10-21 8:46 ` Haggai Eran 0 siblings, 0 replies; 29+ messages in thread From: Haggai Eran @ 2015-10-21 8:46 UTC (permalink / raw) To: Eran Ben Elisha, Doug Ledford Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Or Gerlitz, Christoph Lameter On 15/10/2015 14:44, Eran Ben Elisha wrote: > Allow setting IB_QP_CREATE_BLOCK_MULTICAST_LOOPBACK at create_flags in > ib_uverbs_create_qp_ex. > > Signed-off-by: Eran Ben Elisha <eranbe-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> > --- > drivers/infiniband/core/uverbs_cmd.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c > index e795d59..e9bafa3 100644 > --- a/drivers/infiniband/core/uverbs_cmd.c > +++ b/drivers/infiniband/core/uverbs_cmd.c > @@ -1843,7 +1843,7 @@ static int create_qp(struct ib_uverbs_file *file, > sizeof(cmd->create_flags)) > attr.create_flags = cmd->create_flags; > > - if (attr.create_flags) { > + if (attr.create_flags & ~IB_QP_CREATE_BLOCK_MULTICAST_LOOPBACK) { > ret = -EINVAL; > goto err_put; > } > FWIW Reviewed-by: Haggai Eran <haggaie-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 29+ messages in thread
* [PATCH v2 for-next 3/7] net/mlx4_core: Add support for filtering multicast loopback [not found] ` <1444909482-17113-1-git-send-email-eranbe-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> 2015-10-15 11:44 ` [PATCH v2 for-next 1/7] IB/core: Extend ib_uverbs_create_qp Eran Ben Elisha 2015-10-15 11:44 ` [PATCH v2 for-next 2/7] IB/core: Allow setting create flags in QP init attribute Eran Ben Elisha @ 2015-10-15 11:44 ` Eran Ben Elisha 2015-10-15 11:44 ` [PATCH v2 for-next 4/7] net/mlx4_en: Implement mcast loopback prevention for ETH qps Eran Ben Elisha ` (3 subsequent siblings) 6 siblings, 0 replies; 29+ messages in thread From: Eran Ben Elisha @ 2015-10-15 11:44 UTC (permalink / raw) To: Doug Ledford Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Or Gerlitz, Eran Ben Elisha, Christoph Lameter, Maor Gottlieb From: Maor Gottlieb <maorg-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> Update device capabilities regarding HW filtering multicast loopback support. Add MLX4_UPDATE_QP_ETH_SRC_CHECK_MC_LB attribute to mlx4_update_qp to enable changing QP context to support filtering incoming multicast loopback traffic according the sender's counter index. Set the corresponding bits in QP context to force the loopback source checks if attribute is given and HW supports it. Signed-off-by: Maor Gottlieb <maorg-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> Signed-off-by: Eran Ben Elisha <eranbe-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> --- drivers/net/ethernet/mellanox/mlx4/fw.c | 6 +++++ drivers/net/ethernet/mellanox/mlx4/qp.c | 19 +++++++++++++- .../net/ethernet/mellanox/mlx4/resource_tracker.c | 30 +++++++++++++++++----- include/linux/mlx4/device.h | 2 ++ include/linux/mlx4/qp.h | 24 +++++++++++++---- 5 files changed, 68 insertions(+), 13 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx4/fw.c b/drivers/net/ethernet/mellanox/mlx4/fw.c index e8ec1de..b3be3a0 100644 --- a/drivers/net/ethernet/mellanox/mlx4/fw.c +++ b/drivers/net/ethernet/mellanox/mlx4/fw.c @@ -155,6 +155,8 @@ static void dump_dev_cap_flags2(struct mlx4_dev *dev, u64 flags) [27] = "Port beacon support", [28] = "RX-ALL support", [29] = "802.1ad offload support", + [31] = "Modifying loopback source checks using UPDATE_QP support", + [32] = "Loopback source checks support", }; int i; @@ -964,6 +966,10 @@ int mlx4_QUERY_DEV_CAP(struct mlx4_dev *dev, struct mlx4_dev_cap *dev_cap) MLX4_GET(field32, outbox, QUERY_DEV_CAP_EXT_2_FLAGS_OFFSET); if (field32 & (1 << 16)) dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_UPDATE_QP; + if (field32 & (1 << 18)) + dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_UPDATE_QP_SRC_CHECK_LB; + if (field32 & (1 << 19)) + dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_LB_SRC_CHK; if (field32 & (1 << 26)) dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_VLAN_CONTROL; if (field32 & (1 << 20)) diff --git a/drivers/net/ethernet/mellanox/mlx4/qp.c b/drivers/net/ethernet/mellanox/mlx4/qp.c index 2026863..b162495 100644 --- a/drivers/net/ethernet/mellanox/mlx4/qp.c +++ b/drivers/net/ethernet/mellanox/mlx4/qp.c @@ -436,6 +436,23 @@ int mlx4_update_qp(struct mlx4_dev *dev, u32 qpn, cmd->qp_context.pri_path.grh_mylmc = params->smac_index; } + if (attr & MLX4_UPDATE_QP_ETH_SRC_CHECK_MC_LB) { + if (!(dev->caps.flags2 + & MLX4_DEV_CAP_FLAG2_UPDATE_QP_SRC_CHECK_LB)) { + mlx4_warn(dev, + "Trying to set src check LB, but it isn't supported\n"); + err = -ENOTSUPP; + goto out; + } + pri_addr_path_mask |= + 1ULL << MLX4_UPD_QP_PATH_MASK_ETH_SRC_CHECK_MC_LB; + if (params->flags & + MLX4_UPDATE_QP_PARAMS_FLAGS_ETH_CHECK_MC_LB) { + cmd->qp_context.pri_path.fl |= + MLX4_FL_ETH_SRC_CHECK_MC_LB; + } + } + if (attr & MLX4_UPDATE_QP_VSD) { qp_mask |= 1ULL << MLX4_UPD_QP_MASK_VSD; if (params->flags & MLX4_UPDATE_QP_PARAMS_FLAGS_VSD_ENABLE) @@ -458,7 +475,7 @@ int mlx4_update_qp(struct mlx4_dev *dev, u32 qpn, err = mlx4_cmd(dev, mailbox->dma, qpn & 0xffffff, 0, MLX4_CMD_UPDATE_QP, MLX4_CMD_TIME_CLASS_A, MLX4_CMD_NATIVE); - +out: mlx4_free_cmd_mailbox(dev, mailbox); return err; } diff --git a/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c b/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c index 731423c..502f335 100644 --- a/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c +++ b/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c @@ -770,9 +770,12 @@ static int update_vport_qp_param(struct mlx4_dev *dev, } } + /* preserve IF_COUNTER flag */ + qpc->pri_path.vlan_control &= + MLX4_CTRL_ETH_SRC_CHECK_IF_COUNTER; if (vp_oper->state.link_state == IFLA_VF_LINK_STATE_DISABLE && dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_UPDATE_QP) { - qpc->pri_path.vlan_control = + qpc->pri_path.vlan_control |= MLX4_VLAN_CTRL_ETH_TX_BLOCK_TAGGED | MLX4_VLAN_CTRL_ETH_TX_BLOCK_PRIO_TAGGED | MLX4_VLAN_CTRL_ETH_TX_BLOCK_UNTAGGED | @@ -780,12 +783,12 @@ static int update_vport_qp_param(struct mlx4_dev *dev, MLX4_VLAN_CTRL_ETH_RX_BLOCK_UNTAGGED | MLX4_VLAN_CTRL_ETH_RX_BLOCK_TAGGED; } else if (0 != vp_oper->state.default_vlan) { - qpc->pri_path.vlan_control = + qpc->pri_path.vlan_control |= MLX4_VLAN_CTRL_ETH_TX_BLOCK_TAGGED | MLX4_VLAN_CTRL_ETH_RX_BLOCK_PRIO_TAGGED | MLX4_VLAN_CTRL_ETH_RX_BLOCK_UNTAGGED; } else { /* priority tagged */ - qpc->pri_path.vlan_control = + qpc->pri_path.vlan_control |= MLX4_VLAN_CTRL_ETH_TX_BLOCK_TAGGED | MLX4_VLAN_CTRL_ETH_RX_BLOCK_TAGGED; } @@ -3762,9 +3765,6 @@ int mlx4_INIT2RTR_QP_wrapper(struct mlx4_dev *dev, int slave, update_gid(dev, inbox, (u8)slave); adjust_proxy_tun_qkey(dev, vhcr, qpc); orig_sched_queue = qpc->pri_path.sched_queue; - err = update_vport_qp_param(dev, inbox, slave, qpn); - if (err) - return err; err = get_res(dev, slave, qpn, RES_QP, &qp); if (err) @@ -3774,6 +3774,10 @@ int mlx4_INIT2RTR_QP_wrapper(struct mlx4_dev *dev, int slave, goto out; } + err = update_vport_qp_param(dev, inbox, slave, qpn); + if (err) + goto out; + err = mlx4_DMA_wrapper(dev, slave, vhcr, inbox, outbox, cmd); out: /* if no error, save sched queue value passed in by VF. This is @@ -4208,7 +4212,9 @@ static int add_eth_header(struct mlx4_dev *dev, int slave, } -#define MLX4_UPD_QP_PATH_MASK_SUPPORTED (1ULL << MLX4_UPD_QP_PATH_MASK_MAC_INDEX) +#define MLX4_UPD_QP_PATH_MASK_SUPPORTED ( \ + 1ULL << MLX4_UPD_QP_PATH_MASK_MAC_INDEX |\ + 1ULL << MLX4_UPD_QP_PATH_MASK_ETH_SRC_CHECK_MC_LB) int mlx4_UPDATE_QP_wrapper(struct mlx4_dev *dev, int slave, struct mlx4_vhcr *vhcr, struct mlx4_cmd_mailbox *inbox, @@ -4231,6 +4237,16 @@ int mlx4_UPDATE_QP_wrapper(struct mlx4_dev *dev, int slave, (pri_addr_path_mask & ~MLX4_UPD_QP_PATH_MASK_SUPPORTED)) return -EPERM; + if ((pri_addr_path_mask & + (1ULL << MLX4_UPD_QP_PATH_MASK_ETH_SRC_CHECK_MC_LB)) && + !(dev->caps.flags2 & + MLX4_DEV_CAP_FLAG2_UPDATE_QP_SRC_CHECK_LB)) { + mlx4_warn(dev, + "Src check LB for slave %d isn't supported\n", + slave); + return -ENOTSUPP; + } + /* Just change the smac for the QP */ err = get_res(dev, slave, qpn, RES_QP, &rqp); if (err) { diff --git a/include/linux/mlx4/device.h b/include/linux/mlx4/device.h index baad4cb..dac6872 100644 --- a/include/linux/mlx4/device.h +++ b/include/linux/mlx4/device.h @@ -214,6 +214,8 @@ enum { MLX4_DEV_CAP_FLAG2_IGNORE_FCS = 1LL << 28, MLX4_DEV_CAP_FLAG2_PHV_EN = 1LL << 29, MLX4_DEV_CAP_FLAG2_SKIP_OUTER_VLAN = 1LL << 30, + MLX4_DEV_CAP_FLAG2_UPDATE_QP_SRC_CHECK_LB = 1ULL << 31, + MLX4_DEV_CAP_FLAG2_LB_SRC_CHK = 1ULL << 32, }; enum { diff --git a/include/linux/mlx4/qp.h b/include/linux/mlx4/qp.h index de45a51..fe052e2 100644 --- a/include/linux/mlx4/qp.h +++ b/include/linux/mlx4/qp.h @@ -135,7 +135,10 @@ struct mlx4_rss_context { struct mlx4_qp_path { u8 fl; - u8 vlan_control; + union { + u8 vlan_control; + u8 control; + }; u8 disable_pkey_check; u8 pkey_index; u8 counter_index; @@ -156,9 +159,16 @@ struct mlx4_qp_path { }; enum { /* fl */ - MLX4_FL_CV = 1 << 6, - MLX4_FL_ETH_HIDE_CQE_VLAN = 1 << 2 + MLX4_FL_CV = 1 << 6, + MLX4_FL_ETH_HIDE_CQE_VLAN = 1 << 2, + MLX4_FL_ETH_SRC_CHECK_MC_LB = 1 << 1, + MLX4_FL_ETH_SRC_CHECK_UC_LB = 1 << 0, }; + +enum { /* control */ + MLX4_CTRL_ETH_SRC_CHECK_IF_COUNTER = 1 << 7, +}; + enum { /* vlan_control */ MLX4_VLAN_CTRL_ETH_TX_BLOCK_TAGGED = 1 << 6, MLX4_VLAN_CTRL_ETH_TX_BLOCK_PRIO_TAGGED = 1 << 5, /* 802.1p priority tag */ @@ -254,6 +264,8 @@ enum { MLX4_UPD_QP_PATH_MASK_SCHED_QUEUE = 14 + 32, MLX4_UPD_QP_PATH_MASK_IF_COUNTER_INDEX = 15 + 32, MLX4_UPD_QP_PATH_MASK_FVL_RX = 16 + 32, + MLX4_UPD_QP_PATH_MASK_ETH_SRC_CHECK_UC_LB = 18 + 32, + MLX4_UPD_QP_PATH_MASK_ETH_SRC_CHECK_MC_LB = 19 + 32, }; enum { /* param3 */ @@ -436,11 +448,13 @@ enum mlx4_update_qp_attr { MLX4_UPDATE_QP_VSD = 1 << 1, MLX4_UPDATE_QP_RATE_LIMIT = 1 << 2, MLX4_UPDATE_QP_QOS_VPORT = 1 << 3, - MLX4_UPDATE_QP_SUPPORTED_ATTRS = (1 << 4) - 1 + MLX4_UPDATE_QP_ETH_SRC_CHECK_MC_LB = 1 << 4, + MLX4_UPDATE_QP_SUPPORTED_ATTRS = (1 << 5) - 1 }; enum mlx4_update_qp_params_flags { - MLX4_UPDATE_QP_PARAMS_FLAGS_VSD_ENABLE = 1 << 0, + MLX4_UPDATE_QP_PARAMS_FLAGS_ETH_CHECK_MC_LB = 1 << 0, + MLX4_UPDATE_QP_PARAMS_FLAGS_VSD_ENABLE = 1 << 1, }; struct mlx4_update_qp_params { -- 1.8.3.1 -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH v2 for-next 4/7] net/mlx4_en: Implement mcast loopback prevention for ETH qps [not found] ` <1444909482-17113-1-git-send-email-eranbe-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> ` (2 preceding siblings ...) 2015-10-15 11:44 ` [PATCH v2 for-next 3/7] net/mlx4_core: Add support for filtering multicast loopback Eran Ben Elisha @ 2015-10-15 11:44 ` Eran Ben Elisha 2015-10-15 11:44 ` [PATCH v2 for-next 5/7] IB/mlx4: Add IB counters table Eran Ben Elisha ` (2 subsequent siblings) 6 siblings, 0 replies; 29+ messages in thread From: Eran Ben Elisha @ 2015-10-15 11:44 UTC (permalink / raw) To: Doug Ledford Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Or Gerlitz, Eran Ben Elisha, Christoph Lameter, Maor Gottlieb From: Maor Gottlieb <maorg-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> Set the mcast loopback prevention bit in the QPC for ETH MLX QPs (not RSS QPs), when the firmware supports this feature. In addition, all rx ring QPs need to be updated in order not to enforce loopback checks. This prevents getting packets we sent both from the network stack and the HCA. Loopback prevention is done by comparing the counter indices of the sent and receiving QPs. If they're equal, packets aren't loopback-ed. Signed-off-by: Maor Gottlieb <maorg-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> Signed-off-by: Eran Ben Elisha <eranbe-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> --- drivers/net/ethernet/mellanox/mlx4/en_main.c | 22 ++++++++++++++++++++ drivers/net/ethernet/mellanox/mlx4/en_resources.c | 25 +++++++++++++++++++++++ drivers/net/ethernet/mellanox/mlx4/mlx4_en.h | 3 ++- 3 files changed, 49 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/mellanox/mlx4/en_main.c b/drivers/net/ethernet/mellanox/mlx4/en_main.c index a946e4b..005f910 100644 --- a/drivers/net/ethernet/mellanox/mlx4/en_main.c +++ b/drivers/net/ethernet/mellanox/mlx4/en_main.c @@ -123,6 +123,28 @@ void mlx4_en_update_loopback_state(struct net_device *dev, */ if (mlx4_is_mfunc(priv->mdev->dev) || priv->validate_loopback) priv->flags |= MLX4_EN_FLAG_ENABLE_HW_LOOPBACK; + + mutex_lock(&priv->mdev->state_lock); + if (priv->mdev->dev->caps.flags2 & + MLX4_DEV_CAP_FLAG2_UPDATE_QP_SRC_CHECK_LB && + priv->rss_map.indir_qp.qpn) { + int i; + int err = 0; + int loopback = !!(features & NETIF_F_LOOPBACK); + + for (i = 0; i < priv->rx_ring_num; i++) { + int ret; + + ret = mlx4_en_change_mcast_lb(priv, + &priv->rss_map.qps[i], + loopback); + if (!err) + err = ret; + } + if (err) + mlx4_warn(priv->mdev, "failed to change mcast loopback\n"); + } + mutex_unlock(&priv->mdev->state_lock); } static int mlx4_en_get_profile(struct mlx4_en_dev *mdev) diff --git a/drivers/net/ethernet/mellanox/mlx4/en_resources.c b/drivers/net/ethernet/mellanox/mlx4/en_resources.c index e482fa1b..12aab5a 100644 --- a/drivers/net/ethernet/mellanox/mlx4/en_resources.c +++ b/drivers/net/ethernet/mellanox/mlx4/en_resources.c @@ -69,6 +69,15 @@ void mlx4_en_fill_qp_context(struct mlx4_en_priv *priv, int size, int stride, context->pri_path.counter_index = priv->counter_index; context->cqn_send = cpu_to_be32(cqn); context->cqn_recv = cpu_to_be32(cqn); + if (!rss && + (mdev->dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_LB_SRC_CHK) && + context->pri_path.counter_index != + MLX4_SINK_COUNTER_INDEX(mdev->dev)) { + /* disable multicast loopback to qp with same counter */ + if (!(dev->features & NETIF_F_LOOPBACK)) + context->pri_path.fl |= MLX4_FL_ETH_SRC_CHECK_MC_LB; + context->pri_path.control |= MLX4_CTRL_ETH_SRC_CHECK_IF_COUNTER; + } context->db_rec_addr = cpu_to_be64(priv->res.db.dma << 2); if (!(dev->features & NETIF_F_HW_VLAN_CTAG_RX)) context->param3 |= cpu_to_be32(1 << 30); @@ -80,6 +89,22 @@ void mlx4_en_fill_qp_context(struct mlx4_en_priv *priv, int size, int stride, } } +int mlx4_en_change_mcast_lb(struct mlx4_en_priv *priv, struct mlx4_qp *qp, + int loopback) +{ + int ret; + struct mlx4_update_qp_params qp_params; + + memset(&qp_params, 0, sizeof(qp_params)); + if (!loopback) + qp_params.flags = MLX4_UPDATE_QP_PARAMS_FLAGS_ETH_CHECK_MC_LB; + + ret = mlx4_update_qp(priv->mdev->dev, qp->qpn, + MLX4_UPDATE_QP_ETH_SRC_CHECK_MC_LB, + &qp_params); + + return ret; +} int mlx4_en_map_buffer(struct mlx4_buf *buf) { diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h index defcf8c..c41f151 100644 --- a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h +++ b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h @@ -798,7 +798,8 @@ void mlx4_en_fill_qp_context(struct mlx4_en_priv *priv, int size, int stride, void mlx4_en_sqp_event(struct mlx4_qp *qp, enum mlx4_event event); int mlx4_en_map_buffer(struct mlx4_buf *buf); void mlx4_en_unmap_buffer(struct mlx4_buf *buf); - +int mlx4_en_change_mcast_lb(struct mlx4_en_priv *priv, struct mlx4_qp *qp, + int loopback); void mlx4_en_calc_rx_buf(struct net_device *dev); int mlx4_en_config_rss_steer(struct mlx4_en_priv *priv); void mlx4_en_release_rss_steer(struct mlx4_en_priv *priv); -- 1.8.3.1 -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH v2 for-next 5/7] IB/mlx4: Add IB counters table [not found] ` <1444909482-17113-1-git-send-email-eranbe-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> ` (3 preceding siblings ...) 2015-10-15 11:44 ` [PATCH v2 for-next 4/7] net/mlx4_en: Implement mcast loopback prevention for ETH qps Eran Ben Elisha @ 2015-10-15 11:44 ` Eran Ben Elisha [not found] ` <1444909482-17113-6-git-send-email-eranbe-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> 2015-10-15 11:44 ` [PATCH v2 for-next 6/7] IB/mlx4: Add counter based implementation for QP multicast loopback block Eran Ben Elisha 2015-10-15 11:44 ` [PATCH v2 for-next 7/7] IB/mlx4: Add support for blocking multicast loopback QP creation user flag Eran Ben Elisha 6 siblings, 1 reply; 29+ messages in thread From: Eran Ben Elisha @ 2015-10-15 11:44 UTC (permalink / raw) To: Doug Ledford Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Or Gerlitz, Eran Ben Elisha, Christoph Lameter This is an infrastructure step for allocating and attaching more than one counter to QPs on the same port. Allocate a counters table and manage the insertion and removals of the counters in load and unload of mlx4 IB. Signed-off-by: Eran Ben Elisha <eranbe-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> --- drivers/infiniband/hw/mlx4/mad.c | 25 ++++++++++---- drivers/infiniband/hw/mlx4/main.c | 63 ++++++++++++++++++++++++++++-------- drivers/infiniband/hw/mlx4/mlx4_ib.h | 9 +++++- drivers/infiniband/hw/mlx4/qp.c | 8 +++-- 4 files changed, 81 insertions(+), 24 deletions(-) diff --git a/drivers/infiniband/hw/mlx4/mad.c b/drivers/infiniband/hw/mlx4/mad.c index 1cd75ff..68f2567 100644 --- a/drivers/infiniband/hw/mlx4/mad.c +++ b/drivers/infiniband/hw/mlx4/mad.c @@ -824,18 +824,29 @@ static int iboe_process_mad(struct ib_device *ibdev, int mad_flags, u8 port_num, { struct mlx4_counter counter_stats; struct mlx4_ib_dev *dev = to_mdev(ibdev); - int err; + struct counter_index *tmp_counter; + int err = IB_MAD_RESULT_FAILURE, stats_avail = 0; if (in_mad->mad_hdr.mgmt_class != IB_MGMT_CLASS_PERF_MGMT) return -EINVAL; memset(&counter_stats, 0, sizeof(counter_stats)); - err = mlx4_get_counter_stats(dev->dev, - dev->counters[port_num - 1].index, - &counter_stats, 0); - if (err) - err = IB_MAD_RESULT_FAILURE; - else { + mutex_lock(&dev->counters_table[port_num - 1].mutex); + list_for_each_entry(tmp_counter, + &dev->counters_table[port_num - 1].counters_list, + list) { + err = mlx4_get_counter_stats(dev->dev, + tmp_counter->index, + &counter_stats, 0); + if (err) { + err = IB_MAD_RESULT_FAILURE; + stats_avail = 0; + break; + } + stats_avail = 1; + } + mutex_unlock(&dev->counters_table[port_num - 1].mutex); + if (stats_avail) { memset(out_mad->data, 0, sizeof out_mad->data); switch (counter_stats.counter_mode & 0xf) { case 0: diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c index 38be8dc..232b104 100644 --- a/drivers/infiniband/hw/mlx4/main.c +++ b/drivers/infiniband/hw/mlx4/main.c @@ -1249,6 +1249,22 @@ static int add_gid_entry(struct ib_qp *ibqp, union ib_gid *gid) return 0; } +static void mlx4_ib_delete_counters_table(struct mlx4_ib_dev *ibdev, + struct mlx4_ib_counters *ctr_table) +{ + struct counter_index *counter, *tmp_count; + + mutex_lock(&ctr_table->mutex); + list_for_each_entry_safe(counter, tmp_count, &ctr_table->counters_list, + list) { + if (counter->allocated) + mlx4_counter_free(ibdev->dev, counter->index); + list_del(&counter->list); + kfree(counter); + } + mutex_unlock(&ctr_table->mutex); +} + int mlx4_ib_add_mc(struct mlx4_ib_dev *mdev, struct mlx4_ib_qp *mqp, union ib_gid *gid) { @@ -2133,6 +2149,7 @@ static void *mlx4_ib_add(struct mlx4_dev *dev) int num_req_counters; int allocated; u32 counter_index; + struct counter_index *new_counter_index = NULL; pr_info_once("%s", mlx4_ib_version); @@ -2304,6 +2321,11 @@ static void *mlx4_ib_add(struct mlx4_dev *dev) if (init_node_data(ibdev)) goto err_map; + for (i = 0; i < ibdev->num_ports; ++i) { + mutex_init(&ibdev->counters_table[i].mutex); + INIT_LIST_HEAD(&ibdev->counters_table[i].counters_list); + } + num_req_counters = mlx4_is_bonded(dev) ? 1 : ibdev->num_ports; for (i = 0; i < num_req_counters; ++i) { mutex_init(&ibdev->qp1_proxy_lock[i]); @@ -2322,15 +2344,34 @@ static void *mlx4_ib_add(struct mlx4_dev *dev) counter_index = mlx4_get_default_counter_index(dev, i + 1); } - ibdev->counters[i].index = counter_index; - ibdev->counters[i].allocated = allocated; + new_counter_index = kmalloc(sizeof(*new_counter_index), + GFP_KERNEL); + if (!new_counter_index) { + if (allocated) + mlx4_counter_free(ibdev->dev, counter_index); + goto err_counter; + } + new_counter_index->index = counter_index; + new_counter_index->allocated = allocated; + list_add_tail(&new_counter_index->list, + &ibdev->counters_table[i].counters_list); + ibdev->counters_table[i].default_counter = counter_index; pr_info("counter index %d for port %d allocated %d\n", counter_index, i + 1, allocated); } if (mlx4_is_bonded(dev)) for (i = 1; i < ibdev->num_ports ; ++i) { - ibdev->counters[i].index = ibdev->counters[0].index; - ibdev->counters[i].allocated = 0; + new_counter_index = + kmalloc(sizeof(struct counter_index), + GFP_KERNEL); + if (!new_counter_index) + goto err_counter; + new_counter_index->index = counter_index; + new_counter_index->allocated = 0; + list_add_tail(&new_counter_index->list, + &ibdev->counters_table[i].counters_list); + ibdev->counters_table[i].default_counter = + counter_index; } mlx4_foreach_port(i, dev, MLX4_PORT_TYPE_IB) @@ -2439,12 +2480,9 @@ err_steer_qp_release: mlx4_qp_release_range(dev, ibdev->steer_qpn_base, ibdev->steer_qpn_count); err_counter: - for (i = 0; i < ibdev->num_ports; ++i) { - if (ibdev->counters[i].index != -1 && - ibdev->counters[i].allocated) - mlx4_counter_free(ibdev->dev, - ibdev->counters[i].index); - } + for (i = 0; i < ibdev->num_ports; ++i) + mlx4_ib_delete_counters_table(ibdev, &ibdev->counters_table[i]); + err_map: iounmap(ibdev->uar_map); @@ -2548,9 +2586,8 @@ static void mlx4_ib_remove(struct mlx4_dev *dev, void *ibdev_ptr) iounmap(ibdev->uar_map); for (p = 0; p < ibdev->num_ports; ++p) - if (ibdev->counters[p].index != -1 && - ibdev->counters[p].allocated) - mlx4_counter_free(ibdev->dev, ibdev->counters[p].index); + mlx4_ib_delete_counters_table(ibdev, &ibdev->counters_table[p]); + mlx4_foreach_port(p, dev, MLX4_PORT_TYPE_IB) mlx4_CLOSE_PORT(dev, p); diff --git a/drivers/infiniband/hw/mlx4/mlx4_ib.h b/drivers/infiniband/hw/mlx4/mlx4_ib.h index 1e7b23b..4056dc1 100644 --- a/drivers/infiniband/hw/mlx4/mlx4_ib.h +++ b/drivers/infiniband/hw/mlx4/mlx4_ib.h @@ -528,10 +528,17 @@ struct mlx4_ib_iov_port { }; struct counter_index { + struct list_head list; u32 index; u8 allocated; }; +struct mlx4_ib_counters { + struct list_head counters_list; + struct mutex mutex; /* mutex for accessing counters list */ + u32 default_counter; +}; + struct mlx4_ib_dev { struct ib_device ib_dev; struct mlx4_dev *dev; @@ -550,7 +557,7 @@ struct mlx4_ib_dev { struct mutex cap_mask_mutex; bool ib_active; struct mlx4_ib_iboe iboe; - struct counter_index counters[MLX4_MAX_PORTS]; + struct mlx4_ib_counters counters_table[MLX4_MAX_PORTS]; int *eq_table; struct kobject *iov_parent; struct kobject *ports_parent; diff --git a/drivers/infiniband/hw/mlx4/qp.c b/drivers/infiniband/hw/mlx4/qp.c index 4ad9be3..382913e 100644 --- a/drivers/infiniband/hw/mlx4/qp.c +++ b/drivers/infiniband/hw/mlx4/qp.c @@ -1460,6 +1460,7 @@ static int __mlx4_ib_modify_qp(struct ib_qp *ibqp, int sqd_event; int steer_qp = 0; int err = -EINVAL; + int counter_index; /* APM is not supported under RoCE */ if (attr_mask & IB_QP_ALT_PATH && @@ -1543,9 +1544,10 @@ static int __mlx4_ib_modify_qp(struct ib_qp *ibqp, } if (cur_state == IB_QPS_INIT && new_state == IB_QPS_RTR) { - if (dev->counters[qp->port - 1].index != -1) { - context->pri_path.counter_index = - dev->counters[qp->port - 1].index; + counter_index = + dev->counters_table[qp->port - 1].default_counter; + if (counter_index != -1) { + context->pri_path.counter_index = counter_index; optpar |= MLX4_QP_OPTPAR_COUNTER_INDEX; } else context->pri_path.counter_index = -- 1.8.3.1 -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply related [flat|nested] 29+ messages in thread
[parent not found: <1444909482-17113-6-git-send-email-eranbe-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>]
* Re: [PATCH v2 for-next 5/7] IB/mlx4: Add IB counters table [not found] ` <1444909482-17113-6-git-send-email-eranbe-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> @ 2015-12-24 10:12 ` Sagi Grimberg [not found] ` <567BC52B.4030801-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> 0 siblings, 1 reply; 29+ messages in thread From: Sagi Grimberg @ 2015-12-24 10:12 UTC (permalink / raw) To: Eran Ben Elisha, Doug Ledford Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Or Gerlitz, Christoph Lameter This patch seems to generate a list corruption [1] when I test with Doug's for-4.5 tree. Eran, care to take a look at this? [1]: mlx4_core: Mellanox ConnectX core driver v2.2-1 (Feb, 2014) mlx4_core: Initializing 0000:04:00.0 mlx4_core 0000:04:00.0: PCIe link speed is 8.0GT/s, device supports 8.0GT/s mlx4_core 0000:04:00.0: PCIe link width is x8, device supports x8 <mlx4_ib> mlx4_ib_add: mlx4_ib: Mellanox ConnectX InfiniBand driver v2.2-1 (Feb 2014) <mlx4_ib> mlx4_ib_add: counter index 0 for port 1 allocated 0 <mlx4_ib> mlx4_ib_add: counter index 1 for port 2 allocated 0 BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<ffffffff81274e36>] __list_add+0x26/0xd0 PGD 46da14067 PUD 46daa0067 PMD 0 Oops: 0000 [#1] SMP Modules linked in: mlx4_ib(+) ib_sa ib_mad mlx4_core mlx5_ib mlx5_core ib_core ib_addr netconsole configfs nfsv3 nfs fscache cfg80211 rfkill x86_pkg_temp_thermal coretemp kvm_intel kvm irqbypass crc32c_intel aesni_intel aes_x86_64 glue_helper lrw dm_mod gf128mul ablk_helper cryptd iTCO_wdt iTCO_vendor_support sb_edac shpchp ipmi_si ioatdma lpc_ich mfd_core edac_core pcspkr wmi ipmi_msghandler i2c_i801 acpi_cpufreq nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables ext4 mbcache jbd2 sd_mod isci libsas igb serio_raw ahci ptp pps_core libahci i2c_algo_bit scsi_transport_sas i2c_core dca ipv6 autofs4 [last unloaded: mlx5_core] CPU: 0 PID: 1737 Comm: modprobe Not tainted 4.4.0-rc6+ #107 Hardware name: Supermicro SYS-1027R-WRF/X9DRW, BIOS 3.0a 08/08/2013 task: ffff8804673da800 ti: ffff880466694000 task.ti: ffff880466694000 RIP: 0010:[<ffffffff81274e36>] [<ffffffff81274e36>] __list_add+0x26/0xd0 RSP: 0018:ffff880466697898 EFLAGS: 00010246 RAX: 00000000ffffffff RBX: ffff8804666978c8 RCX: ffff8804673da800 RDX: ffff88086b8539b8 RSI: 0000000000000000 RDI: ffff8804666978c8 RBP: ffff8804666978b8 R08: 0000000000000000 R09: 0000000000000001 R10: 0000000000000000 R11: 000000000000fffe R12: ffff88086b8539b8 R13: 0000000000000000 R14: ffff88086b8539b8 R15: ffff880466697908 FS: 00007f37a02cf700(0000) GS:ffff88047fc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 000000046b6ee000 CR4: 00000000000406f0 Stack: ffff8804673da800 ffff88086b8539b0 ffff8804673da800 ffff88086b8539b4 ffff880466697958 ffffffff8154f7be ffff880466697904 0000000000000292 ffff880466697938 ffffffff81259bc1 0000000000007f49 80000000024000c0 Call Trace: [<ffffffff8154f7be>] __mutex_lock_slowpath+0x6e/0x110 [<ffffffff81259bc1>] ? ida_simple_get+0x91/0x100 [<ffffffff811d354e>] ? kernfs_next_descendant_post+0x1e/0x90 [<ffffffff811d3646>] ? kernfs_activate+0x86/0xf0 [<ffffffff8154f87e>] mutex_lock+0x1e/0x40 [<ffffffffa00fb083>] iboe_process_mad+0x73/0x180 [mlx4_ib] [<ffffffffa00fba36>] mlx4_ib_process_mad+0xd6/0x110 [mlx4_ib] [<ffffffffa06b4703>] get_perf_mad+0x103/0x140 [ib_core] [<ffffffffa06b4764>] get_counter_table+0x24/0x40 [ib_core] [<ffffffff8115846e>] ? __kmalloc+0xde/0xe0 [<ffffffffa06b4895>] add_port+0x115/0x3f0 [ib_core] [<ffffffffa06b4c5e>] ib_device_register_sysfs+0xee/0x160 [ib_core] [<ffffffffa06b5e05>] ib_register_device+0x1d5/0x300 [ib_core] [<ffffffffa010282b>] mlx4_ib_add+0x78b/0xd00 [mlx4_ib] [<ffffffffa08027ce>] mlx4_add_device+0x3e/0xb0 [mlx4_core] [<ffffffffa0802957>] mlx4_register_interface+0x87/0xe0 [mlx4_core] [<ffffffffa0096055>] mlx4_ib_init+0x55/0x72 [mlx4_ib] [<ffffffffa0096000>] ? 0xffffffffa0096000 [<ffffffff81000368>] do_one_initcall+0xa8/0x1c0 [<ffffffff810ca5bf>] do_init_module+0x5f/0x210 [<ffffffff810cc267>] load_module+0x5d7/0x700 [<ffffffff810c97a0>] ? mod_sysfs_teardown+0x140/0x140 [<ffffffff810c91f0>] ? module_sect_show+0x20/0x20 [<ffffffff810cc44b>] SyS_finit_module+0xbb/0xf0 [<ffffffff81551757>] entry_SYSCALL_64_fastpath+0x12/0x6a Code: 90 90 90 90 90 55 48 89 e5 48 83 ec 20 48 89 5d e8 4c 89 65 f0 48 89 fb 4c 89 6d f8 4c 8b 42 08 49 89 f5 49 89 d4 49 39 f0 75 31 <4d> 8b 45 00 4d 39 c4 75 6f 4c 39 e3 74 45 4c 39 eb 74 40 49 89 RIP [<ffffffff81274e36>] __list_add+0x26/0xd0 RSP <ffff880466697898> CR2: 0000000000000000 ---[ end trace 5f4fe0ca857661e6 ]--- -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 29+ messages in thread
[parent not found: <567BC52B.4030801-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>]
* Re: [PATCH v2 for-next 5/7] IB/mlx4: Add IB counters table [not found] ` <567BC52B.4030801-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> @ 2015-12-24 10:30 ` Sagi Grimberg 2015-12-24 10:34 ` Or Gerlitz 1 sibling, 0 replies; 29+ messages in thread From: Sagi Grimberg @ 2015-12-24 10:30 UTC (permalink / raw) To: Eran Ben Elisha, Doug Ledford Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Or Gerlitz, Christoph Lameter Doug, I'm also can't load mlx5 drivers in your tree [1] but I don't know where it's from, it can come from pretty much everything... Now I'm left with no useable HW to test with :( [1]: mlx5_core 0000:06:00.0: firmware version: 12.14.74 mlx5_core 0000:06:00.1: firmware version: 12.14.74 mlx5_ib: Mellanox Connect-IB Infiniband driver v2.2-1 (Feb 2014) command failed, status bad parameter(0x3), syndrome 0x7424da command failed, status bad parameter(0x3), syndrome 0x7424da -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH v2 for-next 5/7] IB/mlx4: Add IB counters table [not found] ` <567BC52B.4030801-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> 2015-12-24 10:30 ` Sagi Grimberg @ 2015-12-24 10:34 ` Or Gerlitz [not found] ` <567BCA3F.7050502-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> 1 sibling, 1 reply; 29+ messages in thread From: Or Gerlitz @ 2015-12-24 10:34 UTC (permalink / raw) To: Sagi Grimberg, Eran Ben Elisha, Doug Ledford Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Christoph Lameter On 12/24/2015 12:12 PM, Sagi Grimberg wrote: > This patch seems to generate a list corruption [1] when I test > with Doug's for-4.5 tree. Eran, care to take a look at this? This patch is part from a series that was introduced in 4.3-rc1 [1], did 4.4-rc5/6 worked for you before you uploaded there further patches? Or. [1] fbfb662 IB/mlx4: Add support for blocking multicast loopback QP creation user flag 7b59f0f IB/mlx4: Add counter based implementation for QP multicast loopback block 3ba8e31 IB/mlx4: Add IB counters table 74194fb net/mlx4_en: Implement mcast loopback prevention for ETH qps 9a89283 net/mlx4_core: Add support for filtering multicast loopback ddf9529 IB/core: Allow setting create flags in QP init attribute 6d8a749 IB/core: Extend ib_uverbs_create_qp -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 29+ messages in thread
[parent not found: <567BCA3F.7050502-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>]
* Re: [PATCH v2 for-next 5/7] IB/mlx4: Add IB counters table [not found] ` <567BCA3F.7050502-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> @ 2015-12-24 10:42 ` Sagi Grimberg [not found] ` <567BCBF8.30006-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> 0 siblings, 1 reply; 29+ messages in thread From: Sagi Grimberg @ 2015-12-24 10:42 UTC (permalink / raw) To: Or Gerlitz, Eran Ben Elisha, Doug Ledford Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Christoph Lameter >> This patch seems to generate a list corruption [1] when I test >> with Doug's for-4.5 tree. Eran, care to take a look at this? > > This patch is part from a series that was introduced in 4.3-rc1 [1], Then something else broke it. Can people check their patches on doug's tree? At the moment it's unusable... -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 29+ messages in thread
[parent not found: <567BCBF8.30006-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>]
* Re: [PATCH v2 for-next 5/7] IB/mlx4: Add IB counters table [not found] ` <567BCBF8.30006-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> @ 2015-12-24 12:38 ` Or Gerlitz [not found] ` <567BE756.9050005-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> 0 siblings, 1 reply; 29+ messages in thread From: Or Gerlitz @ 2015-12-24 12:38 UTC (permalink / raw) To: Sagi Grimberg, Eran Ben Elisha, Doug Ledford Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Christoph Lameter On 12/24/2015 12:42 PM, Sagi Grimberg wrote: > >>> This patch seems to generate a list corruption [1] when I test >>> with Doug's for-4.5 tree. Eran, care to take a look at this? >> >> This patch is part from a series that was introduced in 4.3-rc1 [1], > > Then something else broke it. Can people check their patches on doug's > tree? At the moment it's unusable... Yes, I checked the branch up to commit 882f3b3 "Merge branches '4.5/Or-cleanup' and '4.5/rdma-cq' into k.o/for-4.5" and it works (rping, ibv_rc_pingpong over top of mlx4 VPI) -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 29+ messages in thread
[parent not found: <567BE756.9050005-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>]
* Re: [PATCH v2 for-next 5/7] IB/mlx4: Add IB counters table [not found] ` <567BE756.9050005-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> @ 2015-12-24 14:07 ` Matan Barak [not found] ` <CAAKD3BBsv8Qe=RZvCTOa=0kuSAqYetx4NOok_Tcp5xtUz+Ggtw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 29+ messages in thread From: Matan Barak @ 2015-12-24 14:07 UTC (permalink / raw) To: Or Gerlitz Cc: Sagi Grimberg, Eran Ben Elisha, Doug Ledford, linux-rdma, Christoph Lameter On Thu, Dec 24, 2015 at 2:38 PM, Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote: > On 12/24/2015 12:42 PM, Sagi Grimberg wrote: >> >> >>>> This patch seems to generate a list corruption [1] when I test >>>> with Doug's for-4.5 tree. Eran, care to take a look at this? >>> >>> >>> This patch is part from a series that was introduced in 4.3-rc1 [1], >> >> >> Then something else broke it. Can people check their patches on doug's >> tree? At the moment it's unusable... > Leon and I have checked Doug's tree with mlx4_ib disabled and we didn't encounter any error. We ran ucmatose over IB connection (in mlx5) and it worked flawlessly. > > Yes, I checked the branch up to commit 882f3b3 "Merge branches > '4.5/Or-cleanup' and '4.5/rdma-cq' into k.o/for-4.5" and it works (rping, > ibv_rc_pingpong over top of mlx4 VPI) > > -- > To unsubscribe from this list: send the line "unsubscribe linux-rdma" in > the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 29+ messages in thread
[parent not found: <CAAKD3BBsv8Qe=RZvCTOa=0kuSAqYetx4NOok_Tcp5xtUz+Ggtw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH v2 for-next 5/7] IB/mlx4: Add IB counters table [not found] ` <CAAKD3BBsv8Qe=RZvCTOa=0kuSAqYetx4NOok_Tcp5xtUz+Ggtw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2015-12-24 16:09 ` Matan Barak [not found] ` <CAAKD3BB5-hUC=0bDqxXxuag9TPqQEeMD74paHrAMOc-MsQwEow-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 29+ messages in thread From: Matan Barak @ 2015-12-24 16:09 UTC (permalink / raw) To: Or Gerlitz Cc: Sagi Grimberg, Eran Ben Elisha, Doug Ledford, linux-rdma, Christoph Lameter On Thu, Dec 24, 2015 at 4:07 PM, Matan Barak <matanb-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> wrote: > On Thu, Dec 24, 2015 at 2:38 PM, Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote: >> On 12/24/2015 12:42 PM, Sagi Grimberg wrote: >>> >>> >>>>> This patch seems to generate a list corruption [1] when I test >>>>> with Doug's for-4.5 tree. Eran, care to take a look at this? >>>> >>>> >>>> This patch is part from a series that was introduced in 4.3-rc1 [1], >>> >>> >>> Then something else broke it. Can people check their patches on doug's >>> tree? At the moment it's unusable... >> > > Leon and I have checked Doug's tree with mlx4_ib disabled and we > didn't encounter any error. > We ran ucmatose over IB connection (in mlx5) and it worked flawlessly. > >> >> Yes, I checked the branch up to commit 882f3b3 "Merge branches >> '4.5/Or-cleanup' and '4.5/rdma-cq' into k.o/for-4.5" and it works (rping, >> ibv_rc_pingpong over top of mlx4 VPI) >> Regarding mlx4, Eran and I analyzed it. We didn't test that, but it seems like the bug is introduced in the 64bit counters test. Here's a proposal: diff --git a/drivers/infiniband/core/sysfs.c b/drivers/infiniband/core/sysfs.c index 539040f..8da3c83 100644 --- a/drivers/infiniband/core/sysfs.c +++ b/drivers/infiniband/core/sysfs.c @@ -714,11 +714,12 @@ err: * Figure out which counter table to use depending on * the device capabilities. */ -static struct attribute_group *get_counter_table(struct ib_device *dev) +static struct attribute_group *get_counter_table(struct ib_device *dev, + int port_num) { struct ib_class_port_info cpi; - if (get_perf_mad(dev, 0, IB_PMA_CLASS_PORT_INFO, + if (get_perf_mad(dev, port_num, IB_PMA_CLASS_PORT_INFO, &cpi, 40, sizeof(cpi)) >= 0) { if (cpi.capability_mask && IB_PMA_CLASS_CAP_EXT_WIDTH) @@ -776,7 +777,7 @@ static int add_port(struct ib_device *device, int port_num, goto err_put; } - p->pma_table = get_counter_table(device); + p->pma_table = get_counter_table(device, port_num); ret = sysfs_create_group(&p->kobj, p->pma_table); if (ret) goto err_put_gid_attrs; >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in >> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply related [flat|nested] 29+ messages in thread
[parent not found: <CAAKD3BB5-hUC=0bDqxXxuag9TPqQEeMD74paHrAMOc-MsQwEow-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH v2 for-next 5/7] IB/mlx4: Add IB counters table [not found] ` <CAAKD3BB5-hUC=0bDqxXxuag9TPqQEeMD74paHrAMOc-MsQwEow-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2015-12-25 14:50 ` Hal Rosenstock [not found] ` <567D57A5.2050908-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> 0 siblings, 1 reply; 29+ messages in thread From: Hal Rosenstock @ 2015-12-25 14:50 UTC (permalink / raw) To: Matan Barak, Or Gerlitz Cc: Sagi Grimberg, Eran Ben Elisha, Doug Ledford, linux-rdma, Christoph Lameter On 12/24/2015 11:09 AM, Matan Barak wrote: > On Thu, Dec 24, 2015 at 4:07 PM, Matan Barak <matanb-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> wrote: >> On Thu, Dec 24, 2015 at 2:38 PM, Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote: >>> On 12/24/2015 12:42 PM, Sagi Grimberg wrote: >>>> >>>> >>>>>> This patch seems to generate a list corruption [1] when I test >>>>>> with Doug's for-4.5 tree. Eran, care to take a look at this? >>>>> >>>>> >>>>> This patch is part from a series that was introduced in 4.3-rc1 [1], >>>> >>>> >>>> Then something else broke it. Can people check their patches on doug's >>>> tree? At the moment it's unusable... >>> >> >> Leon and I have checked Doug's tree with mlx4_ib disabled and we >> didn't encounter any error. >> We ran ucmatose over IB connection (in mlx5) and it worked flawlessly. >> >>> >>> Yes, I checked the branch up to commit 882f3b3 "Merge branches >>> '4.5/Or-cleanup' and '4.5/rdma-cq' into k.o/for-4.5" and it works (rping, >>> ibv_rc_pingpong over top of mlx4 VPI) >>> > > Regarding mlx4, Eran and I analyzed it. We didn't test that, but it > seems like the bug is introduced in the 64bit counters test. Here's a > proposal: > > diff --git a/drivers/infiniband/core/sysfs.c b/drivers/infiniband/core/sysfs.c > index 539040f..8da3c83 100644 > --- a/drivers/infiniband/core/sysfs.c > +++ b/drivers/infiniband/core/sysfs.c > @@ -714,11 +714,12 @@ err: > * Figure out which counter table to use depending on > * the device capabilities. > */ > -static struct attribute_group *get_counter_table(struct ib_device *dev) > +static struct attribute_group *get_counter_table(struct ib_device *dev, > + int port_num) > { > struct ib_class_port_info cpi; > > - if (get_perf_mad(dev, 0, IB_PMA_CLASS_PORT_INFO, > + if (get_perf_mad(dev, port_num, IB_PMA_CLASS_PORT_INFO, > &cpi, 40, sizeof(cpi)) >= 0) { Your proposal is similar to earlier version of Christoph's patch but was changed since ClassPortInfo attribute does not have PortSelect field like other PerfMgt attributes which is where this port num would be placed. In ClassPortInfo attribute, that location would be the ClassVersion field that would be set to port number in PerfMgt Get query. -- Hal > > if (cpi.capability_mask && IB_PMA_CLASS_CAP_EXT_WIDTH) > @@ -776,7 +777,7 @@ static int add_port(struct ib_device *device, int port_num, > goto err_put; > } > > - p->pma_table = get_counter_table(device); > + p->pma_table = get_counter_table(device, port_num); > ret = sysfs_create_group(&p->kobj, p->pma_table); > if (ret) > goto err_put_gid_attrs; > > >>> -- >>> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in >>> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org >>> More majordomo info at http://vger.kernel.org/majordomo-info.html > -- > To unsubscribe from this list: send the line "unsubscribe linux-rdma" in > the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 29+ messages in thread
[parent not found: <567D57A5.2050908-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>]
* Re: [PATCH v2 for-next 5/7] IB/mlx4: Add IB counters table [not found] ` <567D57A5.2050908-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> @ 2015-12-25 15:56 ` Hal Rosenstock [not found] ` <567D674B.9050807-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> 0 siblings, 1 reply; 29+ messages in thread From: Hal Rosenstock @ 2015-12-25 15:56 UTC (permalink / raw) To: Matan Barak, Or Gerlitz Cc: Sagi Grimberg, Eran Ben Elisha, Doug Ledford, linux-rdma, Christoph Lameter On 12/25/2015 9:50 AM, Hal Rosenstock wrote: > On 12/24/2015 11:09 AM, Matan Barak wrote: >> On Thu, Dec 24, 2015 at 4:07 PM, Matan Barak <matanb-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> wrote: >>> On Thu, Dec 24, 2015 at 2:38 PM, Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote: >>>> On 12/24/2015 12:42 PM, Sagi Grimberg wrote: >>>>> >>>>> >>>>>>> This patch seems to generate a list corruption [1] when I test >>>>>>> with Doug's for-4.5 tree. Eran, care to take a look at this? >>>>>> >>>>>> >>>>>> This patch is part from a series that was introduced in 4.3-rc1 [1], >>>>> >>>>> >>>>> Then something else broke it. Can people check their patches on doug's >>>>> tree? At the moment it's unusable... >>>> >>> >>> Leon and I have checked Doug's tree with mlx4_ib disabled and we >>> didn't encounter any error. >>> We ran ucmatose over IB connection (in mlx5) and it worked flawlessly. >>> >>>> >>>> Yes, I checked the branch up to commit 882f3b3 "Merge branches >>>> '4.5/Or-cleanup' and '4.5/rdma-cq' into k.o/for-4.5" and it works (rping, >>>> ibv_rc_pingpong over top of mlx4 VPI) >>>> >> >> Regarding mlx4, Eran and I analyzed it. We didn't test that, but it >> seems like the bug is introduced in the 64bit counters test. Here's a >> proposal: >> >> diff --git a/drivers/infiniband/core/sysfs.c b/drivers/infiniband/core/sysfs.c >> index 539040f..8da3c83 100644 >> --- a/drivers/infiniband/core/sysfs.c >> +++ b/drivers/infiniband/core/sysfs.c >> @@ -714,11 +714,12 @@ err: >> * Figure out which counter table to use depending on >> * the device capabilities. >> */ >> -static struct attribute_group *get_counter_table(struct ib_device *dev) >> +static struct attribute_group *get_counter_table(struct ib_device *dev, >> + int port_num) >> { >> struct ib_class_port_info cpi; >> >> - if (get_perf_mad(dev, 0, IB_PMA_CLASS_PORT_INFO, >> + if (get_perf_mad(dev, port_num, IB_PMA_CLASS_PORT_INFO, >> &cpi, 40, sizeof(cpi)) >= 0) { > > Your proposal is similar to earlier version of Christoph's patch but was > changed since ClassPortInfo attribute does not have PortSelect field > like other PerfMgt attributes which is where this port num would be > placed. In ClassPortInfo attribute, that location would be the > ClassVersion field that would be set to port number in PerfMgt Get query. In actuality, I don't think it really matters as this is a Get not a Set and the PMA would do the right thing even if some field in the CPI were stepped on. > -- Hal > >> >> if (cpi.capability_mask && IB_PMA_CLASS_CAP_EXT_WIDTH) >> @@ -776,7 +777,7 @@ static int add_port(struct ib_device *device, int port_num, >> goto err_put; >> } >> >> - p->pma_table = get_counter_table(device); >> + p->pma_table = get_counter_table(device, port_num); >> ret = sysfs_create_group(&p->kobj, p->pma_table); >> if (ret) >> goto err_put_gid_attrs; >> >> >>>> -- >>>> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in >>>> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org >>>> More majordomo info at http://vger.kernel.org/majordomo-info.html >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in >> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html >> -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 29+ messages in thread
[parent not found: <567D674B.9050807-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>]
* Re: [PATCH v2 for-next 5/7] IB/mlx4: Add IB counters table [not found] ` <567D674B.9050807-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> @ 2015-12-27 11:16 ` Matan Barak 0 siblings, 0 replies; 29+ messages in thread From: Matan Barak @ 2015-12-27 11:16 UTC (permalink / raw) To: Hal Rosenstock Cc: Or Gerlitz, Sagi Grimberg, Eran Ben Elisha, Doug Ledford, linux-rdma, Christoph Lameter On Fri, Dec 25, 2015 at 5:56 PM, Hal Rosenstock <hal-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> wrote: > On 12/25/2015 9:50 AM, Hal Rosenstock wrote: >> On 12/24/2015 11:09 AM, Matan Barak wrote: >>> On Thu, Dec 24, 2015 at 4:07 PM, Matan Barak <matanb-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> wrote: >>>> On Thu, Dec 24, 2015 at 2:38 PM, Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote: >>>>> On 12/24/2015 12:42 PM, Sagi Grimberg wrote: >>>>>> >>>>>> >>>>>>>> This patch seems to generate a list corruption [1] when I test >>>>>>>> with Doug's for-4.5 tree. Eran, care to take a look at this? >>>>>>> >>>>>>> >>>>>>> This patch is part from a series that was introduced in 4.3-rc1 [1], >>>>>> >>>>>> >>>>>> Then something else broke it. Can people check their patches on doug's >>>>>> tree? At the moment it's unusable... >>>>> >>>> >>>> Leon and I have checked Doug's tree with mlx4_ib disabled and we >>>> didn't encounter any error. >>>> We ran ucmatose over IB connection (in mlx5) and it worked flawlessly. >>>> >>>>> >>>>> Yes, I checked the branch up to commit 882f3b3 "Merge branches >>>>> '4.5/Or-cleanup' and '4.5/rdma-cq' into k.o/for-4.5" and it works (rping, >>>>> ibv_rc_pingpong over top of mlx4 VPI) >>>>> >>> >>> Regarding mlx4, Eran and I analyzed it. We didn't test that, but it >>> seems like the bug is introduced in the 64bit counters test. Here's a >>> proposal: >>> >>> diff --git a/drivers/infiniband/core/sysfs.c b/drivers/infiniband/core/sysfs.c >>> index 539040f..8da3c83 100644 >>> --- a/drivers/infiniband/core/sysfs.c >>> +++ b/drivers/infiniband/core/sysfs.c >>> @@ -714,11 +714,12 @@ err: >>> * Figure out which counter table to use depending on >>> * the device capabilities. >>> */ >>> -static struct attribute_group *get_counter_table(struct ib_device *dev) >>> +static struct attribute_group *get_counter_table(struct ib_device *dev, >>> + int port_num) >>> { >>> struct ib_class_port_info cpi; >>> >>> - if (get_perf_mad(dev, 0, IB_PMA_CLASS_PORT_INFO, >>> + if (get_perf_mad(dev, port_num, IB_PMA_CLASS_PORT_INFO, >>> &cpi, 40, sizeof(cpi)) >= 0) { >> >> Your proposal is similar to earlier version of Christoph's patch but was >> changed since ClassPortInfo attribute does not have PortSelect field >> like other PerfMgt attributes which is where this port num would be >> placed. In ClassPortInfo attribute, that location would be the >> ClassVersion field that would be set to port number in PerfMgt Get query. > > In actuality, I don't think it really matters as this is a Get not a Set > and the PMA would do the right thing even if some field in the CPI were > stepped on. > Well, it does matter as it calls the vendor driver with port_num = 0. Since the kernel is trusted, the vendor driver expects a valid port number. Giving it an invalid number might result in memory corruptions, as demonstrated in this case. >> -- Hal Matan >> >>> >>> if (cpi.capability_mask && IB_PMA_CLASS_CAP_EXT_WIDTH) >>> @@ -776,7 +777,7 @@ static int add_port(struct ib_device *device, int port_num, >>> goto err_put; >>> } >>> >>> - p->pma_table = get_counter_table(device); >>> + p->pma_table = get_counter_table(device, port_num); >>> ret = sysfs_create_group(&p->kobj, p->pma_table); >>> if (ret) >>> goto err_put_gid_attrs; >>> >>> >>>>> -- >>>>> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in >>>>> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org >>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html >>> -- >>> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in >>> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org >>> More majordomo info at http://vger.kernel.org/majordomo-info.html >>> -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 29+ messages in thread
* [PATCH v2 for-next 6/7] IB/mlx4: Add counter based implementation for QP multicast loopback block [not found] ` <1444909482-17113-1-git-send-email-eranbe-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> ` (4 preceding siblings ...) 2015-10-15 11:44 ` [PATCH v2 for-next 5/7] IB/mlx4: Add IB counters table Eran Ben Elisha @ 2015-10-15 11:44 ` Eran Ben Elisha 2015-10-15 11:44 ` [PATCH v2 for-next 7/7] IB/mlx4: Add support for blocking multicast loopback QP creation user flag Eran Ben Elisha 6 siblings, 0 replies; 29+ messages in thread From: Eran Ben Elisha @ 2015-10-15 11:44 UTC (permalink / raw) To: Doug Ledford Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Or Gerlitz, Eran Ben Elisha, Christoph Lameter Current implementation for MLX4_IB_QP_BLOCK_MULTICAST_LOOPBACK is not supported when link layer is Ethernet. This patch will add counter based implementation for multicast loopback prevention. HW can drop multicast loopback packets if sender QP counter index is equal to receiver QP counter index. If qp flag MLX4_IB_QP_BLOCK_MULTICAST_LOOPBACK is set and link layer is Ethernet, create a new counter and attach it to the QP so it will continue receiving multicast loopback traffic but it's own. The decision if to create a new counter is being made at the qp modification to RTR after the QP's port is set. When QP is destroyed or moved back to reset state, delete the counter. Signed-off-by: Eran Ben Elisha <eranbe-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> --- drivers/infiniband/hw/mlx4/mlx4_ib.h | 1 + drivers/infiniband/hw/mlx4/qp.c | 67 ++++++++++++++++++++++++++++++++++++ 2 files changed, 68 insertions(+) diff --git a/drivers/infiniband/hw/mlx4/mlx4_ib.h b/drivers/infiniband/hw/mlx4/mlx4_ib.h index 4056dc1..086416a 100644 --- a/drivers/infiniband/hw/mlx4/mlx4_ib.h +++ b/drivers/infiniband/hw/mlx4/mlx4_ib.h @@ -320,6 +320,7 @@ struct mlx4_ib_qp { struct list_head qps_list; struct list_head cq_recv_list; struct list_head cq_send_list; + struct counter_index *counter_index; }; struct mlx4_ib_srq { diff --git a/drivers/infiniband/hw/mlx4/qp.c b/drivers/infiniband/hw/mlx4/qp.c index 382913e..2871949 100644 --- a/drivers/infiniband/hw/mlx4/qp.c +++ b/drivers/infiniband/hw/mlx4/qp.c @@ -617,6 +617,18 @@ static int qp0_enabled_vf(struct mlx4_dev *dev, int qpn) return 0; } +static void mlx4_ib_free_qp_counter(struct mlx4_ib_dev *dev, + struct mlx4_ib_qp *qp) +{ + mutex_lock(&dev->counters_table[qp->port - 1].mutex); + mlx4_counter_free(dev->dev, qp->counter_index->index); + list_del(&qp->counter_index->list); + mutex_unlock(&dev->counters_table[qp->port - 1].mutex); + + kfree(qp->counter_index); + qp->counter_index = NULL; +} + static int create_qp_common(struct mlx4_ib_dev *dev, struct ib_pd *pd, struct ib_qp_init_attr *init_attr, struct ib_udata *udata, int sqpn, struct mlx4_ib_qp **caller_qp, @@ -1189,6 +1201,9 @@ int mlx4_ib_destroy_qp(struct ib_qp *qp) mutex_unlock(&dev->qp1_proxy_lock[mqp->port - 1]); } + if (mqp->counter_index) + mlx4_ib_free_qp_counter(dev, mqp); + pd = get_pd(mqp); destroy_qp_common(dev, mqp, !!pd->ibpd.uobject); @@ -1447,6 +1462,40 @@ static int handle_eth_ud_smac_index(struct mlx4_ib_dev *dev, struct mlx4_ib_qp * return 0; } +static int create_qp_lb_counter(struct mlx4_ib_dev *dev, struct mlx4_ib_qp *qp) +{ + struct counter_index *new_counter_index; + int err; + u32 tmp_idx; + + if (rdma_port_get_link_layer(&dev->ib_dev, qp->port) != + IB_LINK_LAYER_ETHERNET || + !(qp->flags & MLX4_IB_QP_BLOCK_MULTICAST_LOOPBACK) || + !(dev->dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_LB_SRC_CHK)) + return 0; + + err = mlx4_counter_alloc(dev->dev, &tmp_idx); + if (err) + return err; + + new_counter_index = kmalloc(sizeof(*new_counter_index), GFP_KERNEL); + if (!new_counter_index) { + mlx4_counter_free(dev->dev, tmp_idx); + return -ENOMEM; + } + + new_counter_index->index = tmp_idx; + new_counter_index->allocated = 1; + qp->counter_index = new_counter_index; + + mutex_lock(&dev->counters_table[qp->port - 1].mutex); + list_add_tail(&new_counter_index->list, + &dev->counters_table[qp->port - 1].counters_list); + mutex_unlock(&dev->counters_table[qp->port - 1].mutex); + + return 0; +} + static int __mlx4_ib_modify_qp(struct ib_qp *ibqp, const struct ib_qp_attr *attr, int attr_mask, enum ib_qp_state cur_state, enum ib_qp_state new_state) @@ -1520,6 +1569,9 @@ static int __mlx4_ib_modify_qp(struct ib_qp *ibqp, context->sq_size_stride = ilog2(qp->sq.wqe_cnt) << 3; context->sq_size_stride |= qp->sq.wqe_shift - 4; + if (new_state == IB_QPS_RESET && qp->counter_index) + mlx4_ib_free_qp_counter(dev, qp); + if (cur_state == IB_QPS_RESET && new_state == IB_QPS_INIT) { context->sq_size_stride |= !!qp->sq_no_prefetch << 7; context->xrcd = cpu_to_be32((u32) qp->xrcdn); @@ -1544,11 +1596,24 @@ static int __mlx4_ib_modify_qp(struct ib_qp *ibqp, } if (cur_state == IB_QPS_INIT && new_state == IB_QPS_RTR) { + err = create_qp_lb_counter(dev, qp); + if (err) + goto out; + counter_index = dev->counters_table[qp->port - 1].default_counter; + if (qp->counter_index) + counter_index = qp->counter_index->index; + if (counter_index != -1) { context->pri_path.counter_index = counter_index; optpar |= MLX4_QP_OPTPAR_COUNTER_INDEX; + if (qp->counter_index) { + context->pri_path.fl |= + MLX4_FL_ETH_SRC_CHECK_MC_LB; + context->pri_path.vlan_control |= + MLX4_CTRL_ETH_SRC_CHECK_IF_COUNTER; + } } else context->pri_path.counter_index = MLX4_SINK_COUNTER_INDEX(dev->dev); @@ -1850,6 +1915,8 @@ static int __mlx4_ib_modify_qp(struct ib_qp *ibqp, } } out: + if (err && qp->counter_index) + mlx4_ib_free_qp_counter(dev, qp); if (err && steer_qp) mlx4_ib_steer_qp_reg(dev, qp, 0); kfree(context); -- 1.8.3.1 -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply related [flat|nested] 29+ messages in thread
* [PATCH v2 for-next 7/7] IB/mlx4: Add support for blocking multicast loopback QP creation user flag [not found] ` <1444909482-17113-1-git-send-email-eranbe-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> ` (5 preceding siblings ...) 2015-10-15 11:44 ` [PATCH v2 for-next 6/7] IB/mlx4: Add counter based implementation for QP multicast loopback block Eran Ben Elisha @ 2015-10-15 11:44 ` Eran Ben Elisha 6 siblings, 0 replies; 29+ messages in thread From: Eran Ben Elisha @ 2015-10-15 11:44 UTC (permalink / raw) To: Doug Ledford Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Or Gerlitz, Eran Ben Elisha, Christoph Lameter MLX4_IB_QP_BLOCK_MULTICAST_LOOPBACK is now supported downstream. In addition, this flag was supported only for IB_QPT_UD, now, with the new implementation it is supported for all QP types. Support IB_USER_VERBS_EX_CMD_CREATE_QP in order to get the flag from user space using the extension create qp command. Signed-off-by: Eran Ben Elisha <eranbe-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> --- drivers/infiniband/hw/mlx4/main.c | 3 ++- drivers/infiniband/hw/mlx4/qp.c | 13 ++++++++----- 2 files changed, 10 insertions(+), 6 deletions(-) diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c index 232b104..8779d26 100644 --- a/drivers/infiniband/hw/mlx4/main.c +++ b/drivers/infiniband/hw/mlx4/main.c @@ -2312,7 +2312,8 @@ static void *mlx4_ib_add(struct mlx4_dev *dev) ibdev->ib_dev.uverbs_ex_cmd_mask |= (1ull << IB_USER_VERBS_EX_CMD_QUERY_DEVICE) | - (1ull << IB_USER_VERBS_EX_CMD_CREATE_CQ); + (1ull << IB_USER_VERBS_EX_CMD_CREATE_CQ) | + (1ull << IB_USER_VERBS_EX_CMD_CREATE_QP); mlx4_ib_alloc_eqs(dev, ibdev); diff --git a/drivers/infiniband/hw/mlx4/qp.c b/drivers/infiniband/hw/mlx4/qp.c index 2871949..ba25a1b 100644 --- a/drivers/infiniband/hw/mlx4/qp.c +++ b/drivers/infiniband/hw/mlx4/qp.c @@ -758,9 +758,6 @@ static int create_qp_common(struct mlx4_ib_dev *dev, struct ib_pd *pd, } else { qp->sq_no_prefetch = 0; - if (init_attr->create_flags & IB_QP_CREATE_BLOCK_MULTICAST_LOOPBACK) - qp->flags |= MLX4_IB_QP_BLOCK_MULTICAST_LOOPBACK; - if (init_attr->create_flags & IB_QP_CREATE_IPOIB_UD_LSO) qp->flags |= MLX4_IB_QP_LSO; @@ -834,6 +831,9 @@ static int create_qp_common(struct mlx4_ib_dev *dev, struct ib_pd *pd, goto err_proxy; } + if (init_attr->create_flags & IB_QP_CREATE_BLOCK_MULTICAST_LOOPBACK) + qp->flags |= MLX4_IB_QP_BLOCK_MULTICAST_LOOPBACK; + err = mlx4_qp_alloc(dev->dev, qpn, &qp->mqp, gfp); if (err) goto err_qpn; @@ -1098,6 +1098,7 @@ struct ib_qp *mlx4_ib_create_qp(struct ib_pd *pd, { struct mlx4_ib_qp *qp = NULL; int err; + int sup_u_create_flags = MLX4_IB_QP_BLOCK_MULTICAST_LOOPBACK; u16 xrcdn = 0; gfp_t gfp; @@ -1121,8 +1122,10 @@ struct ib_qp *mlx4_ib_create_qp(struct ib_pd *pd, } if (init_attr->create_flags && - (udata || - ((init_attr->create_flags & ~(MLX4_IB_SRIOV_SQP | MLX4_IB_QP_CREATE_USE_GFP_NOIO)) && + ((udata && init_attr->create_flags & ~(sup_u_create_flags)) || + ((init_attr->create_flags & ~(MLX4_IB_SRIOV_SQP | + MLX4_IB_QP_CREATE_USE_GFP_NOIO | + MLX4_IB_QP_BLOCK_MULTICAST_LOOPBACK)) && init_attr->qp_type != IB_QPT_UD) || ((init_attr->create_flags & MLX4_IB_SRIOV_SQP) && init_attr->qp_type > IB_QPT_GSI))) -- 1.8.3.1 -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply related [flat|nested] 29+ messages in thread
end of thread, other threads:[~2015-12-27 11:16 UTC | newest] Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2015-10-15 11:44 [PATCH v2 for-next 0/7] Add support for multicast loopback prevention to mlx4 Eran Ben Elisha [not found] ` <1444909482-17113-1-git-send-email-eranbe-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> 2015-10-15 11:44 ` [PATCH v2 for-next 1/7] IB/core: Extend ib_uverbs_create_qp Eran Ben Elisha [not found] ` <1444909482-17113-2-git-send-email-eranbe-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> 2015-10-21 8:34 ` Haggai Eran [not found] ` <56274E02.7010002-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> 2015-10-21 13:46 ` eran ben elisha [not found] ` <CAKHjkj=k6QSKj0gMe0z5=Kjbe2e7cVCbPhkq9Z457gpvnRyitg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 2015-10-21 14:42 ` Haggai Eran 2015-10-21 9:53 ` Sagi Grimberg [not found] ` <56276095.6020803-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> 2015-10-21 10:04 ` Or Gerlitz [not found] ` <56276336.7000704-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> 2015-10-21 14:09 ` Sagi Grimberg [not found] ` <56279CA4.8040201-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> 2015-10-21 18:15 ` Or Gerlitz [not found] ` <CAJ3xEMgWcpKbKL3jhiK2RP=H6_iY0eriirAXn1TJaSG7u8frAA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 2015-10-21 18:21 ` Christoph Lameter [not found] ` <alpine.DEB.2.20.1510211319520.10833-wcBtFHqTun5QOdAKl3ChDw@public.gmane.org> 2015-10-21 18:52 ` Or Gerlitz 2015-10-21 10:26 ` Haggai Eran 2015-10-15 11:44 ` [PATCH v2 for-next 2/7] IB/core: Allow setting create flags in QP init attribute Eran Ben Elisha [not found] ` <1444909482-17113-3-git-send-email-eranbe-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> 2015-10-21 8:46 ` Haggai Eran 2015-10-15 11:44 ` [PATCH v2 for-next 3/7] net/mlx4_core: Add support for filtering multicast loopback Eran Ben Elisha 2015-10-15 11:44 ` [PATCH v2 for-next 4/7] net/mlx4_en: Implement mcast loopback prevention for ETH qps Eran Ben Elisha 2015-10-15 11:44 ` [PATCH v2 for-next 5/7] IB/mlx4: Add IB counters table Eran Ben Elisha [not found] ` <1444909482-17113-6-git-send-email-eranbe-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> 2015-12-24 10:12 ` Sagi Grimberg [not found] ` <567BC52B.4030801-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> 2015-12-24 10:30 ` Sagi Grimberg 2015-12-24 10:34 ` Or Gerlitz [not found] ` <567BCA3F.7050502-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> 2015-12-24 10:42 ` Sagi Grimberg [not found] ` <567BCBF8.30006-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> 2015-12-24 12:38 ` Or Gerlitz [not found] ` <567BE756.9050005-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> 2015-12-24 14:07 ` Matan Barak [not found] ` <CAAKD3BBsv8Qe=RZvCTOa=0kuSAqYetx4NOok_Tcp5xtUz+Ggtw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 2015-12-24 16:09 ` Matan Barak [not found] ` <CAAKD3BB5-hUC=0bDqxXxuag9TPqQEeMD74paHrAMOc-MsQwEow-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 2015-12-25 14:50 ` Hal Rosenstock [not found] ` <567D57A5.2050908-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> 2015-12-25 15:56 ` Hal Rosenstock [not found] ` <567D674B.9050807-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> 2015-12-27 11:16 ` Matan Barak 2015-10-15 11:44 ` [PATCH v2 for-next 6/7] IB/mlx4: Add counter based implementation for QP multicast loopback block Eran Ben Elisha 2015-10-15 11:44 ` [PATCH v2 for-next 7/7] IB/mlx4: Add support for blocking multicast loopback QP creation user flag Eran Ben Elisha
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.