Linux-RDMA Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH 0/4] Random fixes to IB/core
@ 2019-09-16  7:11 Leon Romanovsky
  2019-09-16  7:11 ` [PATCH 1/4] RDMA/cm: Fix memory leak in cm_add/remove_one Leon Romanovsky
                   ` (3 more replies)
  0 siblings, 4 replies; 14+ messages in thread
From: Leon Romanovsky @ 2019-09-16  7:11 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: Leon Romanovsky, RDMA mailing list, Jack Morgenstein, Mark Zhang

From: Leon Romanovsky <leonro@mellanox.com>

Hi,

This is a random set of various fixes in IB/core, which were in my
submission queue before LPC>

Thanks

Jack Morgenstein (2):
  RDMA/cm: Fix memory leak in cm_add/remove_one
  RDMA: Fix double-free in srq creation error flow

Leon Romanovsky (1):
  RDMA/nldev: Reshuffle the code to avoid need to rebind QP in error
    path

Mark Zhang (1):
  RDMA/counter: Prevent QP counter manual binding in auto mode

 drivers/infiniband/core/cm.c         |  3 +++
 drivers/infiniband/core/counters.c   | 12 +++++++++++-
 drivers/infiniband/core/nldev.c      | 12 +++++-------
 drivers/infiniband/core/uverbs_cmd.c |  3 ++-
 4 files changed, 21 insertions(+), 9 deletions(-)

--
2.20.1


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/4] RDMA/cm: Fix memory leak in cm_add/remove_one
  2019-09-16  7:11 [PATCH 0/4] Random fixes to IB/core Leon Romanovsky
@ 2019-09-16  7:11 ` Leon Romanovsky
  2019-09-16 18:45   ` Jason Gunthorpe
                     ` (2 more replies)
  2019-09-16  7:11 ` [PATCH 2/4] RDMA/counter: Prevent QP counter manual binding in auto mode Leon Romanovsky
                   ` (2 subsequent siblings)
  3 siblings, 3 replies; 14+ messages in thread
From: Leon Romanovsky @ 2019-09-16  7:11 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: Leon Romanovsky, RDMA mailing list, Jack Morgenstein, Mark Zhang

From: Jack Morgenstein <jackm@dev.mellanox.co.il>

In the process of moving the debug counters sysfs entries, the commit
mentioned below eliminated the cm_infiniband sysfs directory.

This sysfs directory was tied to the cm_port object allocated in procedure
cm_add_one().

Before the commit below, this cm_port object was freed via a call to
kobject_put(port->kobj) in procedure cm_remove_port_fs().

Since port no longer uses its kobj, kobject_put(port->kobj) was eliminated.
This, however, meant that kfree was never called for the cm_port buffers.

Fix this by adding explicit kfree(port) calls to functions cm_add_one()
and cm_remove_one().

Note: the kfree call in the first chunk below (in the cm_add_one error
flow) fixes an old, undetected memory leak.

Fixes: c87e65cfb97c ("RDMA/cm: Move debug counters to be under relevant IB device")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
 drivers/infiniband/core/cm.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
index da10e6ccb43c..5920c0085d35 100644
--- a/drivers/infiniband/core/cm.c
+++ b/drivers/infiniband/core/cm.c
@@ -4399,6 +4399,7 @@ static void cm_add_one(struct ib_device *ib_device)
 error1:
 	port_modify.set_port_cap_mask = 0;
 	port_modify.clr_port_cap_mask = IB_PORT_CM_SUP;
+	kfree(port);
 	while (--i) {
 		if (!rdma_cap_ib_cm(ib_device, i))
 			continue;
@@ -4407,6 +4408,7 @@ static void cm_add_one(struct ib_device *ib_device)
 		ib_modify_port(ib_device, port->port_num, 0, &port_modify);
 		ib_unregister_mad_agent(port->mad_agent);
 		cm_remove_port_fs(port);
+		kfree(port);
 	}
 free:
 	kfree(cm_dev);
@@ -4460,6 +4462,7 @@ static void cm_remove_one(struct ib_device *ib_device, void *client_data)
 		spin_unlock_irq(&cm.state_lock);
 		ib_unregister_mad_agent(cur_mad_agent);
 		cm_remove_port_fs(port);
+		kfree(port);
 	}

 	kfree(cm_dev);
--
2.20.1


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 2/4] RDMA/counter: Prevent QP counter manual binding in auto mode
  2019-09-16  7:11 [PATCH 0/4] Random fixes to IB/core Leon Romanovsky
  2019-09-16  7:11 ` [PATCH 1/4] RDMA/cm: Fix memory leak in cm_add/remove_one Leon Romanovsky
@ 2019-09-16  7:11 ` Leon Romanovsky
  2019-10-01 14:22   ` Jason Gunthorpe
  2019-09-16  7:11 ` [PATCH 3/4] RDMA/nldev: Reshuffle the code to avoid need to rebind QP in error path Leon Romanovsky
  2019-09-16  7:11 ` [PATCH 4/4] RDMA: Fix double-free in srq creation error flow Leon Romanovsky
  3 siblings, 1 reply; 14+ messages in thread
From: Leon Romanovsky @ 2019-09-16  7:11 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: Leon Romanovsky, RDMA mailing list, Jack Morgenstein, Mark Zhang

From: Mark Zhang <markz@mellanox.com>

If auto mode is configured, manual counter allocation and QP bind is
not allowed.

Fixes: 1bd8e0a9d0fd ("RDMA/counter: Allow manual mode configuration support")
Signed-off-by: Mark Zhang <markz@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
 drivers/infiniband/core/counters.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/core/counters.c b/drivers/infiniband/core/counters.c
index 680ad27f497d..736ab760025d 100644
--- a/drivers/infiniband/core/counters.c
+++ b/drivers/infiniband/core/counters.c
@@ -463,10 +463,15 @@ static struct rdma_counter *rdma_get_counter_by_id(struct ib_device *dev,
 int rdma_counter_bind_qpn(struct ib_device *dev, u8 port,
 			  u32 qp_num, u32 counter_id)
 {
+	struct rdma_port_counter *port_counter;
 	struct rdma_counter *counter;
 	struct ib_qp *qp;
 	int ret;

+	port_counter = &dev->port_data[port].port_counter;
+	if (port_counter->mode.mode == RDMA_COUNTER_MODE_AUTO)
+		return -EINVAL;
+
 	qp = rdma_counter_get_qp(dev, qp_num);
 	if (!qp)
 		return -ENOENT;
@@ -503,6 +508,7 @@ int rdma_counter_bind_qpn(struct ib_device *dev, u8 port,
 int rdma_counter_bind_qpn_alloc(struct ib_device *dev, u8 port,
 				u32 qp_num, u32 *counter_id)
 {
+	struct rdma_port_counter *port_counter;
 	struct rdma_counter *counter;
 	struct ib_qp *qp;
 	int ret;
@@ -510,9 +516,13 @@ int rdma_counter_bind_qpn_alloc(struct ib_device *dev, u8 port,
 	if (!rdma_is_port_valid(dev, port))
 		return -EINVAL;

-	if (!dev->port_data[port].port_counter.hstats)
+	port_counter = &dev->port_data[port].port_counter;
+	if (!port_counter->hstats)
 		return -EOPNOTSUPP;

+	if (port_counter->mode.mode == RDMA_COUNTER_MODE_AUTO)
+		return -EINVAL;
+
 	qp = rdma_counter_get_qp(dev, qp_num);
 	if (!qp)
 		return -ENOENT;
--
2.20.1


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 3/4] RDMA/nldev: Reshuffle the code to avoid need to rebind QP in error path
  2019-09-16  7:11 [PATCH 0/4] Random fixes to IB/core Leon Romanovsky
  2019-09-16  7:11 ` [PATCH 1/4] RDMA/cm: Fix memory leak in cm_add/remove_one Leon Romanovsky
  2019-09-16  7:11 ` [PATCH 2/4] RDMA/counter: Prevent QP counter manual binding in auto mode Leon Romanovsky
@ 2019-09-16  7:11 ` Leon Romanovsky
  2019-09-16 18:48   ` Jason Gunthorpe
  2019-09-16  7:11 ` [PATCH 4/4] RDMA: Fix double-free in srq creation error flow Leon Romanovsky
  3 siblings, 1 reply; 14+ messages in thread
From: Leon Romanovsky @ 2019-09-16  7:11 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: Leon Romanovsky, RDMA mailing list, Jack Morgenstein, Mark Zhang

From: Leon Romanovsky <leonro@mellanox.com>

Properly unwind QP counter rebinding in case of failure.

Fixes: b389327df905 ("RDMA/nldev: Allow counter manual mode configration through RDMA netlink")
Reviewed-by: Mark Zhang <markz@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
 drivers/infiniband/core/nldev.c | 12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/drivers/infiniband/core/nldev.c b/drivers/infiniband/core/nldev.c
index 5e2b7eb0761b..6eb14481a72e 100644
--- a/drivers/infiniband/core/nldev.c
+++ b/drivers/infiniband/core/nldev.c
@@ -1860,24 +1860,22 @@ static int nldev_stat_del_doit(struct sk_buff *skb, struct nlmsghdr *nlh,

 	cntn = nla_get_u32(tb[RDMA_NLDEV_ATTR_STAT_COUNTER_ID]);
 	qpn = nla_get_u32(tb[RDMA_NLDEV_ATTR_RES_LQPN]);
-	ret = rdma_counter_unbind_qpn(device, port, qpn, cntn);
-	if (ret)
-		goto err_unbind;
-
 	if (fill_nldev_handle(msg, device) ||
 	    nla_put_u32(msg, RDMA_NLDEV_ATTR_PORT_INDEX, port) ||
 	    nla_put_u32(msg, RDMA_NLDEV_ATTR_STAT_COUNTER_ID, cntn) ||
 	    nla_put_u32(msg, RDMA_NLDEV_ATTR_RES_LQPN, qpn)) {
 		ret = -EMSGSIZE;
-		goto err_fill;
+		goto err_unbind;
 	}

+	ret = rdma_counter_unbind_qpn(device, port, qpn, cntn);
+	if (ret)
+		goto err_unbind;
+
 	nlmsg_end(msg, nlh);
 	ib_device_put(device);
 	return rdma_nl_unicast(sock_net(skb->sk), msg, NETLINK_CB(skb).portid);

-err_fill:
-	rdma_counter_bind_qpn(device, port, qpn, cntn);
 err_unbind:
 	nlmsg_free(msg);
 err:
--
2.20.1


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 4/4] RDMA: Fix double-free in srq creation error flow
  2019-09-16  7:11 [PATCH 0/4] Random fixes to IB/core Leon Romanovsky
                   ` (2 preceding siblings ...)
  2019-09-16  7:11 ` [PATCH 3/4] RDMA/nldev: Reshuffle the code to avoid need to rebind QP in error path Leon Romanovsky
@ 2019-09-16  7:11 ` Leon Romanovsky
  2019-09-16 17:57   ` Jason Gunthorpe
  3 siblings, 1 reply; 14+ messages in thread
From: Leon Romanovsky @ 2019-09-16  7:11 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: Leon Romanovsky, RDMA mailing list, Jack Morgenstein, Mark Zhang

From: Jack Morgenstein <jackm@dev.mellanox.co.il>

The cited commit introduced a double-free of the srq buffer
in the error flow of procedure __uverbs_create_xsrq().

The problem is that procedure ib_destroy_srq_user() called
in the error flow also frees the srq buffer.

Thus, if uverbs_response() fails in __uverbs_create_srq(),
the srq buffer will be freed twice.

Fixes: 68e326dea1db ("RDMA: Handle SRQ allocations by IB/core")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
 drivers/infiniband/core/uverbs_cmd.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index 8f4fd4fac159..13af88da5f79 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -3482,7 +3482,8 @@ static int __uverbs_create_xsrq(struct uverbs_attr_bundle *attrs,

 err_copy:
 	ib_destroy_srq_user(srq, uverbs_get_cleared_udata(attrs));
-
+	/* It was released in ib_destroy_srq_user */
+	srq = NULL;
 err_free:
 	kfree(srq);
 err_put:
--
2.20.1


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 4/4] RDMA: Fix double-free in srq creation error flow
  2019-09-16  7:11 ` [PATCH 4/4] RDMA: Fix double-free in srq creation error flow Leon Romanovsky
@ 2019-09-16 17:57   ` Jason Gunthorpe
  0 siblings, 0 replies; 14+ messages in thread
From: Jason Gunthorpe @ 2019-09-16 17:57 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Doug Ledford, Leon Romanovsky, RDMA mailing list,
	Jack Morgenstein, Mark Zhang

On Mon, Sep 16, 2019 at 10:11:54AM +0300, Leon Romanovsky wrote:
> From: Jack Morgenstein <jackm@dev.mellanox.co.il>
> 
> The cited commit introduced a double-free of the srq buffer
> in the error flow of procedure __uverbs_create_xsrq().
> 
> The problem is that procedure ib_destroy_srq_user() called
> in the error flow also frees the srq buffer.
> 
> Thus, if uverbs_response() fails in __uverbs_create_srq(),
> the srq buffer will be freed twice.
> 
> Fixes: 68e326dea1db ("RDMA: Handle SRQ allocations by IB/core")
> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
>  drivers/infiniband/core/uverbs_cmd.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
> index 8f4fd4fac159..13af88da5f79 100644
> +++ b/drivers/infiniband/core/uverbs_cmd.c
> @@ -3482,7 +3482,8 @@ static int __uverbs_create_xsrq(struct uverbs_attr_bundle *attrs,
> 
>  err_copy:
>  	ib_destroy_srq_user(srq, uverbs_get_cleared_udata(attrs));
> -
> +	/* It was released in ib_destroy_srq_user */
> +	srq = NULL;

I really don't like that we ended up with such a mess of error unwind.

The proper outcome should be that uobj_alloc_abort() does a full
clean up, including destroying and freeing the HW object if it was
allocated.

When we forced the new uobj system into the write() path this was
never cleaned up, instead the abort just disables the HW object clean
up to avoid alot of code churn, while ioctl does actually clean the HW
object on failure.

Ie only one clean up path for uobjects.

I also wonder if there is another race here, this is also missing the
ib_uverbs_release_uevent() after destroying the HW object, but I don't
know if it could be possible for an event to be stuffed in..

In any event, the double free is clearly bad, so applied to for-next

Thanks,
Jason

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/4] RDMA/cm: Fix memory leak in cm_add/remove_one
  2019-09-16  7:11 ` [PATCH 1/4] RDMA/cm: Fix memory leak in cm_add/remove_one Leon Romanovsky
@ 2019-09-16 18:45   ` Jason Gunthorpe
  2019-09-18 11:36     ` Leon Romanovsky
  2019-09-25  6:16     ` Leon Romanovsky
  2019-10-02 11:58   ` Leon Romanovsky
  2019-10-04 17:59   ` Jason Gunthorpe
  2 siblings, 2 replies; 14+ messages in thread
From: Jason Gunthorpe @ 2019-09-16 18:45 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Doug Ledford, Leon Romanovsky, RDMA mailing list,
	Jack Morgenstein, Mark Zhang

On Mon, Sep 16, 2019 at 10:11:51AM +0300, Leon Romanovsky wrote:
> From: Jack Morgenstein <jackm@dev.mellanox.co.il>
> 
> In the process of moving the debug counters sysfs entries, the commit
> mentioned below eliminated the cm_infiniband sysfs directory.
> 
> This sysfs directory was tied to the cm_port object allocated in procedure
> cm_add_one().
> 
> Before the commit below, this cm_port object was freed via a call to
> kobject_put(port->kobj) in procedure cm_remove_port_fs().
> 
> Since port no longer uses its kobj, kobject_put(port->kobj) was eliminated.
> This, however, meant that kfree was never called for the cm_port buffers.
> 
> Fix this by adding explicit kfree(port) calls to functions cm_add_one()
> and cm_remove_one().
> 
> Note: the kfree call in the first chunk below (in the cm_add_one error
> flow) fixes an old, undetected memory leak.
> 
> Fixes: c87e65cfb97c ("RDMA/cm: Move debug counters to be under relevant IB device")
> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
>  drivers/infiniband/core/cm.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
> index da10e6ccb43c..5920c0085d35 100644
> +++ b/drivers/infiniband/core/cm.c
> @@ -4399,6 +4399,7 @@ static void cm_add_one(struct ib_device *ib_device)
>  error1:
>  	port_modify.set_port_cap_mask = 0;
>  	port_modify.clr_port_cap_mask = IB_PORT_CM_SUP;
> +	kfree(port);
>  	while (--i) {
>  		if (!rdma_cap_ib_cm(ib_device, i))
>  			continue;
> @@ -4407,6 +4408,7 @@ static void cm_add_one(struct ib_device *ib_device)
>  		ib_modify_port(ib_device, port->port_num, 0, &port_modify);
>  		ib_unregister_mad_agent(port->mad_agent);
>  		cm_remove_port_fs(port);
> +		kfree(port);
>  	}
>  free:
>  	kfree(cm_dev);
> @@ -4460,6 +4462,7 @@ static void cm_remove_one(struct ib_device *ib_device, void *client_data)
>  		spin_unlock_irq(&cm.state_lock);
>  		ib_unregister_mad_agent(cur_mad_agent);
>  		cm_remove_port_fs(port);
> +		kfree(port);
>  	}

This whole thing is looking pretty goofy now, and I suspect there are
more error unwind bugs here.

How about this instead:

From e8dad20c7b69436e63b18f16cd9457ea27da5bc1 Mon Sep 17 00:00:00 2001
From: Jack Morgenstein <jackm@dev.mellanox.co.il>
Date: Mon, 16 Sep 2019 10:11:51 +0300
Subject: [PATCH] RDMA/cm: Fix memory leak in cm_add/remove_one

In the process of moving the debug counters sysfs entries, the commit
mentioned below eliminated the cm_infiniband sysfs directory, and created
some missing cases where the port pointers were not being freed as the
kobject_put was also eliminated.

Rework this to not allocate port pointers and consolidate all the error
unwind into one sequence.

This also fixes unlikely racey bugs where error-unwind after unregistering
the MAD handler would miss flushing the WQ and other clean up that is
necessary once concurrency starts.

Fixes: c87e65cfb97c ("RDMA/cm: Move debug counters to be under relevant IB device")
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
---
 drivers/infiniband/core/cm.c | 187 ++++++++++++++++++-----------------
 1 file changed, 94 insertions(+), 93 deletions(-)

diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
index da10e6ccb43cd0..30a764e763dec1 100644
--- a/drivers/infiniband/core/cm.c
+++ b/drivers/infiniband/core/cm.c
@@ -223,7 +223,7 @@ struct cm_device {
 	struct ib_device *ib_device;
 	u8 ack_delay;
 	int going_down;
-	struct cm_port *port[0];
+	struct cm_port port[];
 };
 
 struct cm_av {
@@ -520,7 +520,7 @@ get_cm_port_from_path(struct sa_path_rec *path, const struct ib_gid_attr *attr)
 		read_lock_irqsave(&cm.device_lock, flags);
 		list_for_each_entry(cm_dev, &cm.device_list, list) {
 			if (cm_dev->ib_device == attr->device) {
-				port = cm_dev->port[attr->port_num - 1];
+				port = &cm_dev->port[attr->port_num - 1];
 				break;
 			}
 		}
@@ -539,7 +539,7 @@ get_cm_port_from_path(struct sa_path_rec *path, const struct ib_gid_attr *attr)
 					     sa_conv_pathrec_to_gid_type(path),
 					     NULL);
 			if (!IS_ERR(attr)) {
-				port = cm_dev->port[attr->port_num - 1];
+				port = &cm_dev->port[attr->port_num - 1];
 				break;
 			}
 		}
@@ -4319,23 +4319,99 @@ static void cm_remove_port_fs(struct cm_port *port)
 
 }
 
-static void cm_add_one(struct ib_device *ib_device)
+static void cm_destroy_one_port(struct cm_device *cm_dev, unsigned int port_num)
 {
-	struct cm_device *cm_dev;
-	struct cm_port *port;
+	struct cm_port *port = &cm_dev->port[port_num - 1];
+	struct ib_port_modify port_modify = {
+		.clr_port_cap_mask = IB_PORT_CM_SUP
+	};
+	struct ib_mad_agent *cur_mad_agent;
+	struct cm_id_private *cm_id_priv;
+
+	if (!rdma_cap_ib_cm(cm_dev->ib_device, port_num))
+		return;
+
+	ib_modify_port(cm_dev->ib_device, port_num, 0, &port_modify);
+
+	/* Mark all the cm_id's as not valid */
+	spin_lock_irq(&cm.lock);
+	list_for_each_entry (cm_id_priv, &port->cm_priv_altr_list, altr_list)
+		cm_id_priv->altr_send_port_not_ready = 1;
+	list_for_each_entry (cm_id_priv, &port->cm_priv_prim_list, prim_list)
+		cm_id_priv->prim_send_port_not_ready = 1;
+	spin_unlock_irq(&cm.lock);
+
+	/*
+	 * We flush the queue here after the going_down set, this verifies
+	 * that no new works will be queued in the recv handler, after that we
+	 * can call the unregister_mad_agent
+	 */
+	flush_workqueue(cm.wq);
+
+	spin_lock_irq(&cm.state_lock);
+	cur_mad_agent = port->mad_agent;
+	port->mad_agent = NULL;
+	spin_unlock_irq(&cm.state_lock);
+
+	if (cur_mad_agent)
+		ib_unregister_mad_agent(cur_mad_agent);
+
+	cm_remove_port_fs(port);
+}
+
+static int cm_init_one_port(struct cm_device *cm_dev, unsigned int port_num)
+{
+	struct cm_port *port = &cm_dev->port[port_num - 1];
 	struct ib_mad_reg_req reg_req = {
 		.mgmt_class = IB_MGMT_CLASS_CM,
 		.mgmt_class_version = IB_CM_CLASS_VERSION,
 	};
 	struct ib_port_modify port_modify = {
-		.set_port_cap_mask = IB_PORT_CM_SUP
+		.set_port_cap_mask = IB_PORT_CM_SUP,
 	};
-	unsigned long flags;
 	int ret;
+
+	if (!rdma_cap_ib_cm(cm_dev->ib_device, port_num))
+		return 0;
+
+	set_bit(IB_MGMT_METHOD_SEND, reg_req.method_mask);
+
+	port->cm_dev = cm_dev;
+	port->port_num = port_num;
+
+	INIT_LIST_HEAD(&port->cm_priv_prim_list);
+	INIT_LIST_HEAD(&port->cm_priv_altr_list);
+
+	ret = cm_create_port_fs(port);
+	if (ret)
+		return ret;
+
+	port->mad_agent =
+		ib_register_mad_agent(cm_dev->ib_device, port_num, IB_QPT_GSI,
+				      &reg_req, 0, cm_send_handler,
+				      cm_recv_handler, port, 0);
+	if (IS_ERR(port->mad_agent)) {
+		cm_destroy_one_port(cm_dev, port_num);
+		return PTR_ERR(port->mad_agent);
+	}
+
+	ret = ib_modify_port(cm_dev->ib_device, port_num, 0, &port_modify);
+	if (ret) {
+		cm_destroy_one_port(cm_dev, port_num);
+		return ret;
+	}
+
+	return 0;
+}
+
+static void cm_add_one(struct ib_device *ib_device)
+{
+	struct cm_device *cm_dev;
+	unsigned long flags;
 	int count = 0;
 	u8 i;
 
-	cm_dev = kzalloc(struct_size(cm_dev, port, ib_device->phys_port_cnt),
+	cm_dev = kvzalloc(struct_size(cm_dev, port, ib_device->phys_port_cnt),
 			 GFP_KERNEL);
 	if (!cm_dev)
 		return;
@@ -4344,41 +4420,9 @@ static void cm_add_one(struct ib_device *ib_device)
 	cm_dev->ack_delay = ib_device->attrs.local_ca_ack_delay;
 	cm_dev->going_down = 0;
 
-	set_bit(IB_MGMT_METHOD_SEND, reg_req.method_mask);
 	for (i = 1; i <= ib_device->phys_port_cnt; i++) {
-		if (!rdma_cap_ib_cm(ib_device, i))
-			continue;
-
-		port = kzalloc(sizeof *port, GFP_KERNEL);
-		if (!port)
-			goto error1;
-
-		cm_dev->port[i-1] = port;
-		port->cm_dev = cm_dev;
-		port->port_num = i;
-
-		INIT_LIST_HEAD(&port->cm_priv_prim_list);
-		INIT_LIST_HEAD(&port->cm_priv_altr_list);
-
-		ret = cm_create_port_fs(port);
-		if (ret)
-			goto error1;
-
-		port->mad_agent = ib_register_mad_agent(ib_device, i,
-							IB_QPT_GSI,
-							&reg_req,
-							0,
-							cm_send_handler,
-							cm_recv_handler,
-							port,
-							0);
-		if (IS_ERR(port->mad_agent))
-			goto error2;
-
-		ret = ib_modify_port(ib_device, i, 0, &port_modify);
-		if (ret)
-			goto error3;
-
+		if (!cm_init_one_port(cm_dev, i))
+			goto error;
 		count++;
 	}
 
@@ -4392,35 +4436,16 @@ static void cm_add_one(struct ib_device *ib_device)
 	write_unlock_irqrestore(&cm.device_lock, flags);
 	return;
 
-error3:
-	ib_unregister_mad_agent(port->mad_agent);
-error2:
-	cm_remove_port_fs(port);
-error1:
-	port_modify.set_port_cap_mask = 0;
-	port_modify.clr_port_cap_mask = IB_PORT_CM_SUP;
-	while (--i) {
-		if (!rdma_cap_ib_cm(ib_device, i))
-			continue;
-
-		port = cm_dev->port[i-1];
-		ib_modify_port(ib_device, port->port_num, 0, &port_modify);
-		ib_unregister_mad_agent(port->mad_agent);
-		cm_remove_port_fs(port);
-	}
+error:
+	while (--i)
+		cm_destroy_one_port(cm_dev, i);
 free:
-	kfree(cm_dev);
+	kvfree(cm_dev);
 }
 
 static void cm_remove_one(struct ib_device *ib_device, void *client_data)
 {
 	struct cm_device *cm_dev = client_data;
-	struct cm_port *port;
-	struct cm_id_private *cm_id_priv;
-	struct ib_mad_agent *cur_mad_agent;
-	struct ib_port_modify port_modify = {
-		.clr_port_cap_mask = IB_PORT_CM_SUP
-	};
 	unsigned long flags;
 	int i;
 
@@ -4435,34 +4460,10 @@ static void cm_remove_one(struct ib_device *ib_device, void *client_data)
 	cm_dev->going_down = 1;
 	spin_unlock_irq(&cm.lock);
 
-	for (i = 1; i <= ib_device->phys_port_cnt; i++) {
-		if (!rdma_cap_ib_cm(ib_device, i))
-			continue;
-
-		port = cm_dev->port[i-1];
-		ib_modify_port(ib_device, port->port_num, 0, &port_modify);
-		/* Mark all the cm_id's as not valid */
-		spin_lock_irq(&cm.lock);
-		list_for_each_entry(cm_id_priv, &port->cm_priv_altr_list, altr_list)
-			cm_id_priv->altr_send_port_not_ready = 1;
-		list_for_each_entry(cm_id_priv, &port->cm_priv_prim_list, prim_list)
-			cm_id_priv->prim_send_port_not_ready = 1;
-		spin_unlock_irq(&cm.lock);
-		/*
-		 * We flush the queue here after the going_down set, this
-		 * verify that no new works will be queued in the recv handler,
-		 * after that we can call the unregister_mad_agent
-		 */
-		flush_workqueue(cm.wq);
-		spin_lock_irq(&cm.state_lock);
-		cur_mad_agent = port->mad_agent;
-		port->mad_agent = NULL;
-		spin_unlock_irq(&cm.state_lock);
-		ib_unregister_mad_agent(cur_mad_agent);
-		cm_remove_port_fs(port);
-	}
+	for (i = 1; i <= ib_device->phys_port_cnt; i++)
+		cm_destroy_one_port(cm_dev, i);
 
-	kfree(cm_dev);
+	kvfree(cm_dev);
 }
 
 static int __init ib_cm_init(void)
-- 
2.23.0


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 3/4] RDMA/nldev: Reshuffle the code to avoid need to rebind QP in error path
  2019-09-16  7:11 ` [PATCH 3/4] RDMA/nldev: Reshuffle the code to avoid need to rebind QP in error path Leon Romanovsky
@ 2019-09-16 18:48   ` Jason Gunthorpe
  2019-09-16 18:53     ` Leon Romanovsky
  0 siblings, 1 reply; 14+ messages in thread
From: Jason Gunthorpe @ 2019-09-16 18:48 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Doug Ledford, Leon Romanovsky, RDMA mailing list,
	Jack Morgenstein, Mark Zhang

On Mon, Sep 16, 2019 at 10:11:53AM +0300, Leon Romanovsky wrote:
> From: Leon Romanovsky <leonro@mellanox.com>
> 
> Properly unwind QP counter rebinding in case of failure.

What is the actual problem here? Calling 'bind' in an error
unwind seems insane, is that the issue?
 
> Fixes: b389327df905 ("RDMA/nldev: Allow counter manual mode configration through RDMA netlink")
> Reviewed-by: Mark Zhang <markz@mellanox.com>
> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
>  drivers/infiniband/core/nldev.c | 12 +++++-------
>  1 file changed, 5 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/infiniband/core/nldev.c b/drivers/infiniband/core/nldev.c
> index 5e2b7eb0761b..6eb14481a72e 100644
> +++ b/drivers/infiniband/core/nldev.c
> @@ -1860,24 +1860,22 @@ static int nldev_stat_del_doit(struct sk_buff *skb, struct nlmsghdr *nlh,
> 
>  	cntn = nla_get_u32(tb[RDMA_NLDEV_ATTR_STAT_COUNTER_ID]);
>  	qpn = nla_get_u32(tb[RDMA_NLDEV_ATTR_RES_LQPN]);
> -	ret = rdma_counter_unbind_qpn(device, port, qpn, cntn);
> -	if (ret)
> -		goto err_unbind;
> -
>  	if (fill_nldev_handle(msg, device) ||
>  	    nla_put_u32(msg, RDMA_NLDEV_ATTR_PORT_INDEX, port) ||
>  	    nla_put_u32(msg, RDMA_NLDEV_ATTR_STAT_COUNTER_ID, cntn) ||
>  	    nla_put_u32(msg, RDMA_NLDEV_ATTR_RES_LQPN, qpn)) {
>  		ret = -EMSGSIZE;
> -		goto err_fill;
> +		goto err_unbind;

These label names don't make much sense anymore

Jason

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 3/4] RDMA/nldev: Reshuffle the code to avoid need to rebind QP in error path
  2019-09-16 18:48   ` Jason Gunthorpe
@ 2019-09-16 18:53     ` Leon Romanovsky
  0 siblings, 0 replies; 14+ messages in thread
From: Leon Romanovsky @ 2019-09-16 18:53 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Doug Ledford, RDMA mailing list, Jack Morgenstein, Mark Zhang

On Mon, Sep 16, 2019 at 06:48:24PM +0000, Jason Gunthorpe wrote:
> On Mon, Sep 16, 2019 at 10:11:53AM +0300, Leon Romanovsky wrote:
> > From: Leon Romanovsky <leonro@mellanox.com>
> >
> > Properly unwind QP counter rebinding in case of failure.
>
> What is the actual problem here? Calling 'bind' in an error
> unwind seems insane, is that the issue?

Yep

>
> > Fixes: b389327df905 ("RDMA/nldev: Allow counter manual mode configration through RDMA netlink")
> > Reviewed-by: Mark Zhang <markz@mellanox.com>
> > Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
> >  drivers/infiniband/core/nldev.c | 12 +++++-------
> >  1 file changed, 5 insertions(+), 7 deletions(-)
> >
> > diff --git a/drivers/infiniband/core/nldev.c b/drivers/infiniband/core/nldev.c
> > index 5e2b7eb0761b..6eb14481a72e 100644
> > +++ b/drivers/infiniband/core/nldev.c
> > @@ -1860,24 +1860,22 @@ static int nldev_stat_del_doit(struct sk_buff *skb, struct nlmsghdr *nlh,
> >
> >  	cntn = nla_get_u32(tb[RDMA_NLDEV_ATTR_STAT_COUNTER_ID]);
> >  	qpn = nla_get_u32(tb[RDMA_NLDEV_ATTR_RES_LQPN]);
> > -	ret = rdma_counter_unbind_qpn(device, port, qpn, cntn);
> > -	if (ret)
> > -		goto err_unbind;
> > -
> >  	if (fill_nldev_handle(msg, device) ||
> >  	    nla_put_u32(msg, RDMA_NLDEV_ATTR_PORT_INDEX, port) ||
> >  	    nla_put_u32(msg, RDMA_NLDEV_ATTR_STAT_COUNTER_ID, cntn) ||
> >  	    nla_put_u32(msg, RDMA_NLDEV_ATTR_RES_LQPN, qpn)) {
> >  		ret = -EMSGSIZE;
> > -		goto err_fill;
> > +		goto err_unbind;
>
> These label names don't make much sense anymore
>
> Jason

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/4] RDMA/cm: Fix memory leak in cm_add/remove_one
  2019-09-16 18:45   ` Jason Gunthorpe
@ 2019-09-18 11:36     ` Leon Romanovsky
  2019-09-25  6:16     ` Leon Romanovsky
  1 sibling, 0 replies; 14+ messages in thread
From: Leon Romanovsky @ 2019-09-18 11:36 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Doug Ledford, RDMA mailing list, Jack Morgenstein, Mark Zhang

On Mon, Sep 16, 2019 at 06:45:48PM +0000, Jason Gunthorpe wrote:
> On Mon, Sep 16, 2019 at 10:11:51AM +0300, Leon Romanovsky wrote:
> > From: Jack Morgenstein <jackm@dev.mellanox.co.il>
> >
> > In the process of moving the debug counters sysfs entries, the commit
> > mentioned below eliminated the cm_infiniband sysfs directory.
> >
> > This sysfs directory was tied to the cm_port object allocated in procedure
> > cm_add_one().
> >
> > Before the commit below, this cm_port object was freed via a call to
> > kobject_put(port->kobj) in procedure cm_remove_port_fs().
> >
> > Since port no longer uses its kobj, kobject_put(port->kobj) was eliminated.
> > This, however, meant that kfree was never called for the cm_port buffers.
> >
> > Fix this by adding explicit kfree(port) calls to functions cm_add_one()
> > and cm_remove_one().
> >
> > Note: the kfree call in the first chunk below (in the cm_add_one error
> > flow) fixes an old, undetected memory leak.
> >
> > Fixes: c87e65cfb97c ("RDMA/cm: Move debug counters to be under relevant IB device")
> > Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
> > Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
> >  drivers/infiniband/core/cm.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
> > index da10e6ccb43c..5920c0085d35 100644
> > +++ b/drivers/infiniband/core/cm.c
> > @@ -4399,6 +4399,7 @@ static void cm_add_one(struct ib_device *ib_device)
> >  error1:
> >  	port_modify.set_port_cap_mask = 0;
> >  	port_modify.clr_port_cap_mask = IB_PORT_CM_SUP;
> > +	kfree(port);
> >  	while (--i) {
> >  		if (!rdma_cap_ib_cm(ib_device, i))
> >  			continue;
> > @@ -4407,6 +4408,7 @@ static void cm_add_one(struct ib_device *ib_device)
> >  		ib_modify_port(ib_device, port->port_num, 0, &port_modify);
> >  		ib_unregister_mad_agent(port->mad_agent);
> >  		cm_remove_port_fs(port);
> > +		kfree(port);
> >  	}
> >  free:
> >  	kfree(cm_dev);
> > @@ -4460,6 +4462,7 @@ static void cm_remove_one(struct ib_device *ib_device, void *client_data)
> >  		spin_unlock_irq(&cm.state_lock);
> >  		ib_unregister_mad_agent(cur_mad_agent);
> >  		cm_remove_port_fs(port);
> > +		kfree(port);
> >  	}
>
> This whole thing is looking pretty goofy now, and I suspect there are
> more error unwind bugs here.
>
> How about this instead:

It looks OK to me.

Thanks

>
> From e8dad20c7b69436e63b18f16cd9457ea27da5bc1 Mon Sep 17 00:00:00 2001
> From: Jack Morgenstein <jackm@dev.mellanox.co.il>
> Date: Mon, 16 Sep 2019 10:11:51 +0300
> Subject: [PATCH] RDMA/cm: Fix memory leak in cm_add/remove_one
>
> In the process of moving the debug counters sysfs entries, the commit
> mentioned below eliminated the cm_infiniband sysfs directory, and created
> some missing cases where the port pointers were not being freed as the
> kobject_put was also eliminated.
>
> Rework this to not allocate port pointers and consolidate all the error
> unwind into one sequence.
>
> This also fixes unlikely racey bugs where error-unwind after unregistering
> the MAD handler would miss flushing the WQ and other clean up that is
> necessary once concurrency starts.
>
> Fixes: c87e65cfb97c ("RDMA/cm: Move debug counters to be under relevant IB device")
> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
> ---
>  drivers/infiniband/core/cm.c | 187 ++++++++++++++++++-----------------
>  1 file changed, 94 insertions(+), 93 deletions(-)
>
> diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
> index da10e6ccb43cd0..30a764e763dec1 100644
> --- a/drivers/infiniband/core/cm.c
> +++ b/drivers/infiniband/core/cm.c
> @@ -223,7 +223,7 @@ struct cm_device {
>  	struct ib_device *ib_device;
>  	u8 ack_delay;
>  	int going_down;
> -	struct cm_port *port[0];
> +	struct cm_port port[];
>  };
>
>  struct cm_av {
> @@ -520,7 +520,7 @@ get_cm_port_from_path(struct sa_path_rec *path, const struct ib_gid_attr *attr)
>  		read_lock_irqsave(&cm.device_lock, flags);
>  		list_for_each_entry(cm_dev, &cm.device_list, list) {
>  			if (cm_dev->ib_device == attr->device) {
> -				port = cm_dev->port[attr->port_num - 1];
> +				port = &cm_dev->port[attr->port_num - 1];
>  				break;
>  			}
>  		}
> @@ -539,7 +539,7 @@ get_cm_port_from_path(struct sa_path_rec *path, const struct ib_gid_attr *attr)
>  					     sa_conv_pathrec_to_gid_type(path),
>  					     NULL);
>  			if (!IS_ERR(attr)) {
> -				port = cm_dev->port[attr->port_num - 1];
> +				port = &cm_dev->port[attr->port_num - 1];
>  				break;
>  			}
>  		}
> @@ -4319,23 +4319,99 @@ static void cm_remove_port_fs(struct cm_port *port)
>
>  }
>
> -static void cm_add_one(struct ib_device *ib_device)
> +static void cm_destroy_one_port(struct cm_device *cm_dev, unsigned int port_num)
>  {
> -	struct cm_device *cm_dev;
> -	struct cm_port *port;
> +	struct cm_port *port = &cm_dev->port[port_num - 1];
> +	struct ib_port_modify port_modify = {
> +		.clr_port_cap_mask = IB_PORT_CM_SUP
> +	};
> +	struct ib_mad_agent *cur_mad_agent;
> +	struct cm_id_private *cm_id_priv;
> +
> +	if (!rdma_cap_ib_cm(cm_dev->ib_device, port_num))
> +		return;
> +
> +	ib_modify_port(cm_dev->ib_device, port_num, 0, &port_modify);
> +
> +	/* Mark all the cm_id's as not valid */
> +	spin_lock_irq(&cm.lock);
> +	list_for_each_entry (cm_id_priv, &port->cm_priv_altr_list, altr_list)
> +		cm_id_priv->altr_send_port_not_ready = 1;
> +	list_for_each_entry (cm_id_priv, &port->cm_priv_prim_list, prim_list)
> +		cm_id_priv->prim_send_port_not_ready = 1;
> +	spin_unlock_irq(&cm.lock);
> +
> +	/*
> +	 * We flush the queue here after the going_down set, this verifies
> +	 * that no new works will be queued in the recv handler, after that we
> +	 * can call the unregister_mad_agent
> +	 */
> +	flush_workqueue(cm.wq);
> +
> +	spin_lock_irq(&cm.state_lock);
> +	cur_mad_agent = port->mad_agent;
> +	port->mad_agent = NULL;
> +	spin_unlock_irq(&cm.state_lock);
> +
> +	if (cur_mad_agent)
> +		ib_unregister_mad_agent(cur_mad_agent);
> +
> +	cm_remove_port_fs(port);
> +}
> +
> +static int cm_init_one_port(struct cm_device *cm_dev, unsigned int port_num)
> +{
> +	struct cm_port *port = &cm_dev->port[port_num - 1];
>  	struct ib_mad_reg_req reg_req = {
>  		.mgmt_class = IB_MGMT_CLASS_CM,
>  		.mgmt_class_version = IB_CM_CLASS_VERSION,
>  	};
>  	struct ib_port_modify port_modify = {
> -		.set_port_cap_mask = IB_PORT_CM_SUP
> +		.set_port_cap_mask = IB_PORT_CM_SUP,
>  	};
> -	unsigned long flags;
>  	int ret;
> +
> +	if (!rdma_cap_ib_cm(cm_dev->ib_device, port_num))
> +		return 0;
> +
> +	set_bit(IB_MGMT_METHOD_SEND, reg_req.method_mask);
> +
> +	port->cm_dev = cm_dev;
> +	port->port_num = port_num;
> +
> +	INIT_LIST_HEAD(&port->cm_priv_prim_list);
> +	INIT_LIST_HEAD(&port->cm_priv_altr_list);
> +
> +	ret = cm_create_port_fs(port);
> +	if (ret)
> +		return ret;
> +
> +	port->mad_agent =
> +		ib_register_mad_agent(cm_dev->ib_device, port_num, IB_QPT_GSI,
> +				      &reg_req, 0, cm_send_handler,
> +				      cm_recv_handler, port, 0);
> +	if (IS_ERR(port->mad_agent)) {
> +		cm_destroy_one_port(cm_dev, port_num);
> +		return PTR_ERR(port->mad_agent);
> +	}
> +
> +	ret = ib_modify_port(cm_dev->ib_device, port_num, 0, &port_modify);
> +	if (ret) {
> +		cm_destroy_one_port(cm_dev, port_num);
> +		return ret;
> +	}
> +
> +	return 0;
> +}
> +
> +static void cm_add_one(struct ib_device *ib_device)
> +{
> +	struct cm_device *cm_dev;
> +	unsigned long flags;
>  	int count = 0;
>  	u8 i;
>
> -	cm_dev = kzalloc(struct_size(cm_dev, port, ib_device->phys_port_cnt),
> +	cm_dev = kvzalloc(struct_size(cm_dev, port, ib_device->phys_port_cnt),
>  			 GFP_KERNEL);
>  	if (!cm_dev)
>  		return;
> @@ -4344,41 +4420,9 @@ static void cm_add_one(struct ib_device *ib_device)
>  	cm_dev->ack_delay = ib_device->attrs.local_ca_ack_delay;
>  	cm_dev->going_down = 0;
>
> -	set_bit(IB_MGMT_METHOD_SEND, reg_req.method_mask);
>  	for (i = 1; i <= ib_device->phys_port_cnt; i++) {
> -		if (!rdma_cap_ib_cm(ib_device, i))
> -			continue;
> -
> -		port = kzalloc(sizeof *port, GFP_KERNEL);
> -		if (!port)
> -			goto error1;
> -
> -		cm_dev->port[i-1] = port;
> -		port->cm_dev = cm_dev;
> -		port->port_num = i;
> -
> -		INIT_LIST_HEAD(&port->cm_priv_prim_list);
> -		INIT_LIST_HEAD(&port->cm_priv_altr_list);
> -
> -		ret = cm_create_port_fs(port);
> -		if (ret)
> -			goto error1;
> -
> -		port->mad_agent = ib_register_mad_agent(ib_device, i,
> -							IB_QPT_GSI,
> -							&reg_req,
> -							0,
> -							cm_send_handler,
> -							cm_recv_handler,
> -							port,
> -							0);
> -		if (IS_ERR(port->mad_agent))
> -			goto error2;
> -
> -		ret = ib_modify_port(ib_device, i, 0, &port_modify);
> -		if (ret)
> -			goto error3;
> -
> +		if (!cm_init_one_port(cm_dev, i))
> +			goto error;
>  		count++;
>  	}
>
> @@ -4392,35 +4436,16 @@ static void cm_add_one(struct ib_device *ib_device)
>  	write_unlock_irqrestore(&cm.device_lock, flags);
>  	return;
>
> -error3:
> -	ib_unregister_mad_agent(port->mad_agent);
> -error2:
> -	cm_remove_port_fs(port);
> -error1:
> -	port_modify.set_port_cap_mask = 0;
> -	port_modify.clr_port_cap_mask = IB_PORT_CM_SUP;
> -	while (--i) {
> -		if (!rdma_cap_ib_cm(ib_device, i))
> -			continue;
> -
> -		port = cm_dev->port[i-1];
> -		ib_modify_port(ib_device, port->port_num, 0, &port_modify);
> -		ib_unregister_mad_agent(port->mad_agent);
> -		cm_remove_port_fs(port);
> -	}
> +error:
> +	while (--i)
> +		cm_destroy_one_port(cm_dev, i);
>  free:
> -	kfree(cm_dev);
> +	kvfree(cm_dev);
>  }
>
>  static void cm_remove_one(struct ib_device *ib_device, void *client_data)
>  {
>  	struct cm_device *cm_dev = client_data;
> -	struct cm_port *port;
> -	struct cm_id_private *cm_id_priv;
> -	struct ib_mad_agent *cur_mad_agent;
> -	struct ib_port_modify port_modify = {
> -		.clr_port_cap_mask = IB_PORT_CM_SUP
> -	};
>  	unsigned long flags;
>  	int i;
>
> @@ -4435,34 +4460,10 @@ static void cm_remove_one(struct ib_device *ib_device, void *client_data)
>  	cm_dev->going_down = 1;
>  	spin_unlock_irq(&cm.lock);
>
> -	for (i = 1; i <= ib_device->phys_port_cnt; i++) {
> -		if (!rdma_cap_ib_cm(ib_device, i))
> -			continue;
> -
> -		port = cm_dev->port[i-1];
> -		ib_modify_port(ib_device, port->port_num, 0, &port_modify);
> -		/* Mark all the cm_id's as not valid */
> -		spin_lock_irq(&cm.lock);
> -		list_for_each_entry(cm_id_priv, &port->cm_priv_altr_list, altr_list)
> -			cm_id_priv->altr_send_port_not_ready = 1;
> -		list_for_each_entry(cm_id_priv, &port->cm_priv_prim_list, prim_list)
> -			cm_id_priv->prim_send_port_not_ready = 1;
> -		spin_unlock_irq(&cm.lock);
> -		/*
> -		 * We flush the queue here after the going_down set, this
> -		 * verify that no new works will be queued in the recv handler,
> -		 * after that we can call the unregister_mad_agent
> -		 */
> -		flush_workqueue(cm.wq);
> -		spin_lock_irq(&cm.state_lock);
> -		cur_mad_agent = port->mad_agent;
> -		port->mad_agent = NULL;
> -		spin_unlock_irq(&cm.state_lock);
> -		ib_unregister_mad_agent(cur_mad_agent);
> -		cm_remove_port_fs(port);
> -	}
> +	for (i = 1; i <= ib_device->phys_port_cnt; i++)
> +		cm_destroy_one_port(cm_dev, i);
>
> -	kfree(cm_dev);
> +	kvfree(cm_dev);
>  }
>
>  static int __init ib_cm_init(void)
> --
> 2.23.0
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/4] RDMA/cm: Fix memory leak in cm_add/remove_one
  2019-09-16 18:45   ` Jason Gunthorpe
  2019-09-18 11:36     ` Leon Romanovsky
@ 2019-09-25  6:16     ` Leon Romanovsky
  1 sibling, 0 replies; 14+ messages in thread
From: Leon Romanovsky @ 2019-09-25  6:16 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Doug Ledford, RDMA mailing list, Jack Morgenstein, Mark Zhang

On Mon, Sep 16, 2019 at 06:45:48PM +0000, Jason Gunthorpe wrote:
> On Mon, Sep 16, 2019 at 10:11:51AM +0300, Leon Romanovsky wrote:
> > From: Jack Morgenstein <jackm@dev.mellanox.co.il>
> >
> > In the process of moving the debug counters sysfs entries, the commit
> > mentioned below eliminated the cm_infiniband sysfs directory.
> >
> > This sysfs directory was tied to the cm_port object allocated in procedure
> > cm_add_one().
> >
> > Before the commit below, this cm_port object was freed via a call to
> > kobject_put(port->kobj) in procedure cm_remove_port_fs().
> >
> > Since port no longer uses its kobj, kobject_put(port->kobj) was eliminated.
> > This, however, meant that kfree was never called for the cm_port buffers.
> >
> > Fix this by adding explicit kfree(port) calls to functions cm_add_one()
> > and cm_remove_one().
> >
> > Note: the kfree call in the first chunk below (in the cm_add_one error
> > flow) fixes an old, undetected memory leak.
> >
> > Fixes: c87e65cfb97c ("RDMA/cm: Move debug counters to be under relevant IB device")
> > Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
> > Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
> >  drivers/infiniband/core/cm.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
> > index da10e6ccb43c..5920c0085d35 100644
> > +++ b/drivers/infiniband/core/cm.c
> > @@ -4399,6 +4399,7 @@ static void cm_add_one(struct ib_device *ib_device)
> >  error1:
> >  	port_modify.set_port_cap_mask = 0;
> >  	port_modify.clr_port_cap_mask = IB_PORT_CM_SUP;
> > +	kfree(port);
> >  	while (--i) {
> >  		if (!rdma_cap_ib_cm(ib_device, i))
> >  			continue;
> > @@ -4407,6 +4408,7 @@ static void cm_add_one(struct ib_device *ib_device)
> >  		ib_modify_port(ib_device, port->port_num, 0, &port_modify);
> >  		ib_unregister_mad_agent(port->mad_agent);
> >  		cm_remove_port_fs(port);
> > +		kfree(port);
> >  	}
> >  free:
> >  	kfree(cm_dev);
> > @@ -4460,6 +4462,7 @@ static void cm_remove_one(struct ib_device *ib_device, void *client_data)
> >  		spin_unlock_irq(&cm.state_lock);
> >  		ib_unregister_mad_agent(cur_mad_agent);
> >  		cm_remove_port_fs(port);
> > +		kfree(port);
> >  	}
>
> This whole thing is looking pretty goofy now, and I suspect there are
> more error unwind bugs here.
>
> How about this instead:
>
> From e8dad20c7b69436e63b18f16cd9457ea27da5bc1 Mon Sep 17 00:00:00 2001
> From: Jack Morgenstein <jackm@dev.mellanox.co.il>
> Date: Mon, 16 Sep 2019 10:11:51 +0300
> Subject: [PATCH] RDMA/cm: Fix memory leak in cm_add/remove_one
>
> In the process of moving the debug counters sysfs entries, the commit
> mentioned below eliminated the cm_infiniband sysfs directory, and created
> some missing cases where the port pointers were not being freed as the
> kobject_put was also eliminated.
>
> Rework this to not allocate port pointers and consolidate all the error
> unwind into one sequence.
>
> This also fixes unlikely racey bugs where error-unwind after unregistering
> the MAD handler would miss flushing the WQ and other clean up that is
> necessary once concurrency starts.
>
> Fixes: c87e65cfb97c ("RDMA/cm: Move debug counters to be under relevant IB device")
> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
> ---
>  drivers/infiniband/core/cm.c | 187 ++++++++++++++++++-----------------
>  1 file changed, 94 insertions(+), 93 deletions(-)

The proposed patch doesn't pass sanity check and unfortunately, I don't
have time now to debug it.

$ ./rping -V -S 256 -C 100 -p 19663 -P -a 192.168.0.1 -s &
$ ./rping -V -S 256 -C 100 -p 19663 -a 192.168.0.1 -c -I 192.168.0.1
   rdma_connect: Invalid argument
   connect error -1
   <hang>

Thanks

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/4] RDMA/counter: Prevent QP counter manual binding in auto mode
  2019-09-16  7:11 ` [PATCH 2/4] RDMA/counter: Prevent QP counter manual binding in auto mode Leon Romanovsky
@ 2019-10-01 14:22   ` Jason Gunthorpe
  0 siblings, 0 replies; 14+ messages in thread
From: Jason Gunthorpe @ 2019-10-01 14:22 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Doug Ledford, Leon Romanovsky, RDMA mailing list,
	Jack Morgenstein, Mark Zhang

On Mon, Sep 16, 2019 at 10:11:52AM +0300, Leon Romanovsky wrote:
> From: Mark Zhang <markz@mellanox.com>
> 
> If auto mode is configured, manual counter allocation and QP bind is
> not allowed.
> 
> Fixes: 1bd8e0a9d0fd ("RDMA/counter: Allow manual mode configuration support")
> Signed-off-by: Mark Zhang <markz@mellanox.com>
> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
> ---
>  drivers/infiniband/core/counters.c | 12 +++++++++++-
>  1 file changed, 11 insertions(+), 1 deletion(-)

Applied to for-next
 
> --
> 2.20.1

This line ended up in a weird place

Thanks,
Jason

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/4] RDMA/cm: Fix memory leak in cm_add/remove_one
  2019-09-16  7:11 ` [PATCH 1/4] RDMA/cm: Fix memory leak in cm_add/remove_one Leon Romanovsky
  2019-09-16 18:45   ` Jason Gunthorpe
@ 2019-10-02 11:58   ` Leon Romanovsky
  2019-10-04 17:59   ` Jason Gunthorpe
  2 siblings, 0 replies; 14+ messages in thread
From: Leon Romanovsky @ 2019-10-02 11:58 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: RDMA mailing list, Jack Morgenstein, Mark Zhang

On Mon, Sep 16, 2019 at 10:11:51AM +0300, Leon Romanovsky wrote:
> From: Jack Morgenstein <jackm@dev.mellanox.co.il>
>
> In the process of moving the debug counters sysfs entries, the commit
> mentioned below eliminated the cm_infiniband sysfs directory.
>
> This sysfs directory was tied to the cm_port object allocated in procedure
> cm_add_one().
>
> Before the commit below, this cm_port object was freed via a call to
> kobject_put(port->kobj) in procedure cm_remove_port_fs().
>
> Since port no longer uses its kobj, kobject_put(port->kobj) was eliminated.
> This, however, meant that kfree was never called for the cm_port buffers.
>
> Fix this by adding explicit kfree(port) calls to functions cm_add_one()
> and cm_remove_one().
>
> Note: the kfree call in the first chunk below (in the cm_add_one error
> flow) fixes an old, undetected memory leak.
>
> Fixes: c87e65cfb97c ("RDMA/cm: Move debug counters to be under relevant IB device")
> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
> ---
>  drivers/infiniband/core/cm.c | 3 +++
>  1 file changed, 3 insertions(+)
>

Can we take this patch till you will have time to debug better one?

Thanks

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/4] RDMA/cm: Fix memory leak in cm_add/remove_one
  2019-09-16  7:11 ` [PATCH 1/4] RDMA/cm: Fix memory leak in cm_add/remove_one Leon Romanovsky
  2019-09-16 18:45   ` Jason Gunthorpe
  2019-10-02 11:58   ` Leon Romanovsky
@ 2019-10-04 17:59   ` Jason Gunthorpe
  2 siblings, 0 replies; 14+ messages in thread
From: Jason Gunthorpe @ 2019-10-04 17:59 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Doug Ledford, Leon Romanovsky, RDMA mailing list,
	Jack Morgenstein, Mark Zhang

On Mon, Sep 16, 2019 at 10:11:51AM +0300, Leon Romanovsky wrote:
> From: Jack Morgenstein <jackm@dev.mellanox.co.il>
> 
> In the process of moving the debug counters sysfs entries, the commit
> mentioned below eliminated the cm_infiniband sysfs directory.
> 
> This sysfs directory was tied to the cm_port object allocated in procedure
> cm_add_one().
> 
> Before the commit below, this cm_port object was freed via a call to
> kobject_put(port->kobj) in procedure cm_remove_port_fs().
> 
> Since port no longer uses its kobj, kobject_put(port->kobj) was eliminated.
> This, however, meant that kfree was never called for the cm_port buffers.
> 
> Fix this by adding explicit kfree(port) calls to functions cm_add_one()
> and cm_remove_one().
> 
> Note: the kfree call in the first chunk below (in the cm_add_one error
> flow) fixes an old, undetected memory leak.
> 
> Fixes: c87e65cfb97c ("RDMA/cm: Move debug counters to be under relevant IB device")
> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
> ---
>  drivers/infiniband/core/cm.c | 3 +++
>  1 file changed, 3 insertions(+)

Applied to for-rc, thanks

Jason

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, back to index

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-16  7:11 [PATCH 0/4] Random fixes to IB/core Leon Romanovsky
2019-09-16  7:11 ` [PATCH 1/4] RDMA/cm: Fix memory leak in cm_add/remove_one Leon Romanovsky
2019-09-16 18:45   ` Jason Gunthorpe
2019-09-18 11:36     ` Leon Romanovsky
2019-09-25  6:16     ` Leon Romanovsky
2019-10-02 11:58   ` Leon Romanovsky
2019-10-04 17:59   ` Jason Gunthorpe
2019-09-16  7:11 ` [PATCH 2/4] RDMA/counter: Prevent QP counter manual binding in auto mode Leon Romanovsky
2019-10-01 14:22   ` Jason Gunthorpe
2019-09-16  7:11 ` [PATCH 3/4] RDMA/nldev: Reshuffle the code to avoid need to rebind QP in error path Leon Romanovsky
2019-09-16 18:48   ` Jason Gunthorpe
2019-09-16 18:53     ` Leon Romanovsky
2019-09-16  7:11 ` [PATCH 4/4] RDMA: Fix double-free in srq creation error flow Leon Romanovsky
2019-09-16 17:57   ` Jason Gunthorpe

Linux-RDMA Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-rdma/0 linux-rdma/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-rdma linux-rdma/ https://lore.kernel.org/linux-rdma \
		linux-rdma@vger.kernel.org linux-rdma@archiver.kernel.org
	public-inbox-index linux-rdma

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-rdma


AGPL code for this site: git clone https://public-inbox.org/ public-inbox