All of lore.kernel.org
 help / color / mirror / Atom feed
From: Leon Romanovsky <leon@kernel.org>
To: Jason Gunthorpe <jgg@mellanox.com>
Cc: Doug Ledford <dledford@redhat.com>,
	RDMA mailing list <linux-rdma@vger.kernel.org>,
	Jack Morgenstein <jackm@dev.mellanox.co.il>,
	Mark Zhang <markz@mellanox.com>
Subject: Re: [PATCH 1/4] RDMA/cm: Fix memory leak in cm_add/remove_one
Date: Wed, 25 Sep 2019 09:16:17 +0300	[thread overview]
Message-ID: <20190925061617.GV14368@unreal> (raw)
In-Reply-To: <20190916184544.GG2585@mellanox.com>

On Mon, Sep 16, 2019 at 06:45:48PM +0000, Jason Gunthorpe wrote:
> On Mon, Sep 16, 2019 at 10:11:51AM +0300, Leon Romanovsky wrote:
> > From: Jack Morgenstein <jackm@dev.mellanox.co.il>
> >
> > In the process of moving the debug counters sysfs entries, the commit
> > mentioned below eliminated the cm_infiniband sysfs directory.
> >
> > This sysfs directory was tied to the cm_port object allocated in procedure
> > cm_add_one().
> >
> > Before the commit below, this cm_port object was freed via a call to
> > kobject_put(port->kobj) in procedure cm_remove_port_fs().
> >
> > Since port no longer uses its kobj, kobject_put(port->kobj) was eliminated.
> > This, however, meant that kfree was never called for the cm_port buffers.
> >
> > Fix this by adding explicit kfree(port) calls to functions cm_add_one()
> > and cm_remove_one().
> >
> > Note: the kfree call in the first chunk below (in the cm_add_one error
> > flow) fixes an old, undetected memory leak.
> >
> > Fixes: c87e65cfb97c ("RDMA/cm: Move debug counters to be under relevant IB device")
> > Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
> > Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
> >  drivers/infiniband/core/cm.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
> > index da10e6ccb43c..5920c0085d35 100644
> > +++ b/drivers/infiniband/core/cm.c
> > @@ -4399,6 +4399,7 @@ static void cm_add_one(struct ib_device *ib_device)
> >  error1:
> >  	port_modify.set_port_cap_mask = 0;
> >  	port_modify.clr_port_cap_mask = IB_PORT_CM_SUP;
> > +	kfree(port);
> >  	while (--i) {
> >  		if (!rdma_cap_ib_cm(ib_device, i))
> >  			continue;
> > @@ -4407,6 +4408,7 @@ static void cm_add_one(struct ib_device *ib_device)
> >  		ib_modify_port(ib_device, port->port_num, 0, &port_modify);
> >  		ib_unregister_mad_agent(port->mad_agent);
> >  		cm_remove_port_fs(port);
> > +		kfree(port);
> >  	}
> >  free:
> >  	kfree(cm_dev);
> > @@ -4460,6 +4462,7 @@ static void cm_remove_one(struct ib_device *ib_device, void *client_data)
> >  		spin_unlock_irq(&cm.state_lock);
> >  		ib_unregister_mad_agent(cur_mad_agent);
> >  		cm_remove_port_fs(port);
> > +		kfree(port);
> >  	}
>
> This whole thing is looking pretty goofy now, and I suspect there are
> more error unwind bugs here.
>
> How about this instead:
>
> From e8dad20c7b69436e63b18f16cd9457ea27da5bc1 Mon Sep 17 00:00:00 2001
> From: Jack Morgenstein <jackm@dev.mellanox.co.il>
> Date: Mon, 16 Sep 2019 10:11:51 +0300
> Subject: [PATCH] RDMA/cm: Fix memory leak in cm_add/remove_one
>
> In the process of moving the debug counters sysfs entries, the commit
> mentioned below eliminated the cm_infiniband sysfs directory, and created
> some missing cases where the port pointers were not being freed as the
> kobject_put was also eliminated.
>
> Rework this to not allocate port pointers and consolidate all the error
> unwind into one sequence.
>
> This also fixes unlikely racey bugs where error-unwind after unregistering
> the MAD handler would miss flushing the WQ and other clean up that is
> necessary once concurrency starts.
>
> Fixes: c87e65cfb97c ("RDMA/cm: Move debug counters to be under relevant IB device")
> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
> ---
>  drivers/infiniband/core/cm.c | 187 ++++++++++++++++++-----------------
>  1 file changed, 94 insertions(+), 93 deletions(-)

The proposed patch doesn't pass sanity check and unfortunately, I don't
have time now to debug it.

$ ./rping -V -S 256 -C 100 -p 19663 -P -a 192.168.0.1 -s &
$ ./rping -V -S 256 -C 100 -p 19663 -a 192.168.0.1 -c -I 192.168.0.1
   rdma_connect: Invalid argument
   connect error -1
   <hang>

Thanks

  parent reply	other threads:[~2019-09-25  6:16 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-09-16  7:11 [PATCH 0/4] Random fixes to IB/core Leon Romanovsky
2019-09-16  7:11 ` [PATCH 1/4] RDMA/cm: Fix memory leak in cm_add/remove_one Leon Romanovsky
2019-09-16 18:45   ` Jason Gunthorpe
2019-09-18 11:36     ` Leon Romanovsky
2019-09-25  6:16     ` Leon Romanovsky [this message]
2019-10-02 11:58   ` Leon Romanovsky
2019-10-04 17:59   ` Jason Gunthorpe
2019-09-16  7:11 ` [PATCH 2/4] RDMA/counter: Prevent QP counter manual binding in auto mode Leon Romanovsky
2019-10-01 14:22   ` Jason Gunthorpe
2019-09-16  7:11 ` [PATCH 3/4] RDMA/nldev: Reshuffle the code to avoid need to rebind QP in error path Leon Romanovsky
2019-09-16 18:48   ` Jason Gunthorpe
2019-09-16 18:53     ` Leon Romanovsky
2019-09-16  7:11 ` [PATCH 4/4] RDMA: Fix double-free in srq creation error flow Leon Romanovsky
2019-09-16 17:57   ` Jason Gunthorpe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190925061617.GV14368@unreal \
    --to=leon@kernel.org \
    --cc=dledford@redhat.com \
    --cc=jackm@dev.mellanox.co.il \
    --cc=jgg@mellanox.com \
    --cc=linux-rdma@vger.kernel.org \
    --cc=markz@mellanox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.