All of lore.kernel.org
 help / color / mirror / Atom feed
From: Yanjun Zhu <yanjun.zhu@oracle.com>
To: santosh.shilimkar@oracle.com, davem@davemloft.net,
	netdev@vger.kernel.org, linux-rdma@vger.kernel.org,
	rds-devel@oss.oracle.com,
	"haakon.bugge@oracle.com >> Håkon Bugge"
	<haakon.bugge@oracle.com>
Subject: Re: [PATCH 1/1] net: rds: fix memory leak when unload rds_rdma
Date: Mon, 3 Jun 2019 20:43:42 +0800	[thread overview]
Message-ID: <80d57531-761e-9465-23bd-a9b8d9e30e66@oracle.com> (raw)
In-Reply-To: <1559566099-30289-1-git-send-email-yanjun.zhu@oracle.com>

Sorry. Add Håkon Bugge <haakon.bugge@oracle.com>

He told me to notice the memory leak when caches are freed.

Zhu Yanjun

On 2019/6/3 20:48, Zhu Yanjun wrote:
> When KASAN is enabled, after several rds connections are
> created, then "rmmod rds_rdma" is run. The following will
> appear.
>
> "
> BUG rds_ib_incoming (Not tainted): Objects remaining
> in rds_ib_incoming on __kmem_cache_shutdown()
>
> Call Trace:
>   dump_stack+0x71/0xab
>   slab_err+0xad/0xd0
>   __kmem_cache_shutdown+0x17d/0x370
>   shutdown_cache+0x17/0x130
>   kmem_cache_destroy+0x1df/0x210
>   rds_ib_recv_exit+0x11/0x20 [rds_rdma]
>   rds_ib_exit+0x7a/0x90 [rds_rdma]
>   __x64_sys_delete_module+0x224/0x2c0
>   ? __ia32_sys_delete_module+0x2c0/0x2c0
>   do_syscall_64+0x73/0x190
>   entry_SYSCALL_64_after_hwframe+0x44/0xa9
> "
> This is rds connection memory leak. The root cause is:
> When "rmmod rds_rdma" is run, rds_ib_remove_one will call
> rds_ib_dev_shutdown to drop the rds connections.
> rds_ib_dev_shutdown will call rds_conn_drop to drop rds
> connections as below.
> "
> rds_conn_path_drop(&conn->c_path[0], false);
> "
> In the above, destroy is set to false.
> void rds_conn_path_drop(struct rds_conn_path *cp, bool destroy)
> {
>          atomic_set(&cp->cp_state, RDS_CONN_ERROR);
>
>          rcu_read_lock();
>          if (!destroy && rds_destroy_pending(cp->cp_conn)) {
>                  rcu_read_unlock();
>                  return;
>          }
>          queue_work(rds_wq, &cp->cp_down_w);
>          rcu_read_unlock();
> }
> In the above function, destroy is set to false. rds_destroy_pending
> is called. This does not move rds connections to ib_nodev_conns.
> So destroy is set to true to move rds connections to ib_nodev_conns.
> In rds_ib_unregister_client, flush_workqueue is called to make rds_wq
> finsh shutdown rds connections. The function rds_ib_destroy_nodev_conns
> is called to shutdown rds connections finally.
> Then rds_ib_recv_exit is called to destroy slab.
>
> void rds_ib_recv_exit(void)
> {
>          kmem_cache_destroy(rds_ib_incoming_slab);
>          kmem_cache_destroy(rds_ib_frag_slab);
> }
> The above slab memory leak will not occur again.
>
> >From tests,
> 256 rds connections
> [root@ca-dev14 ~]# time rmmod rds_rdma
>
> real    0m16.522s
> user    0m0.000s
> sys     0m8.152s
> 512 rds connections
> [root@ca-dev14 ~]# time rmmod rds_rdma
>
> real    0m32.054s
> user    0m0.000s
> sys     0m15.568s
>
> To rmmod rds_rdma with 256 rds connections, about 16 seconds are needed.
> And with 512 rds connections, about 32 seconds are needed.
> >From ftrace, when one rds connection is destroyed,
>
> "
>   19)               |  rds_conn_destroy [rds]() {
>   19)   7.782 us    |    rds_conn_path_drop [rds]();
>   15)               |  rds_shutdown_worker [rds]() {
>   15)               |    rds_conn_shutdown [rds]() {
>   15)   1.651 us    |      rds_send_path_reset [rds]();
>   15)   7.195 us    |    }
>   15) + 11.434 us   |  }
>   19)   2.285 us    |    rds_cong_remove_conn [rds]();
>   19) * 24062.76 us |  }
> "
> So if many rds connections will be destroyed, this function
> rds_ib_destroy_nodev_conns uses most of time.
>
> Suggested-by: Håkon Bugge <haakon.bugge@oracle.com>
> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
> ---
>   net/rds/ib.c      | 2 +-
>   net/rds/ib_recv.c | 3 +++
>   2 files changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/net/rds/ib.c b/net/rds/ib.c
> index f9baf2d..ec05d91 100644
> --- a/net/rds/ib.c
> +++ b/net/rds/ib.c
> @@ -87,7 +87,7 @@ static void rds_ib_dev_shutdown(struct rds_ib_device *rds_ibdev)
>   
>   	spin_lock_irqsave(&rds_ibdev->spinlock, flags);
>   	list_for_each_entry(ic, &rds_ibdev->conn_list, ib_node)
> -		rds_conn_drop(ic->conn);
> +		rds_conn_path_drop(&ic->conn->c_path[0], true);
>   	spin_unlock_irqrestore(&rds_ibdev->spinlock, flags);
>   }
>   
> diff --git a/net/rds/ib_recv.c b/net/rds/ib_recv.c
> index 8946c89..3cae88c 100644
> --- a/net/rds/ib_recv.c
> +++ b/net/rds/ib_recv.c
> @@ -168,6 +168,7 @@ void rds_ib_recv_free_caches(struct rds_ib_connection *ic)
>   		list_del(&inc->ii_cache_entry);
>   		WARN_ON(!list_empty(&inc->ii_frags));
>   		kmem_cache_free(rds_ib_incoming_slab, inc);
> +		atomic_dec(&rds_ib_allocation);
>   	}
>   
>   	rds_ib_cache_xfer_to_ready(&ic->i_cache_frags);
> @@ -1057,6 +1058,8 @@ int rds_ib_recv_init(void)
>   
>   void rds_ib_recv_exit(void)
>   {
> +	WARN_ON(atomic_read(&rds_ib_allocation));
> +
>   	kmem_cache_destroy(rds_ib_incoming_slab);
>   	kmem_cache_destroy(rds_ib_frag_slab);
>   }

  reply	other threads:[~2019-06-03 12:43 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-06-03 12:48 [PATCH 1/1] net: rds: fix memory leak when unload rds_rdma Zhu Yanjun
2019-06-03 12:48 ` Zhu Yanjun
2019-06-03 12:43 ` Yanjun Zhu [this message]
2019-06-04  6:58   ` Yanjun Zhu
2019-06-06  0:08 ` David Miller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=80d57531-761e-9465-23bd-a9b8d9e30e66@oracle.com \
    --to=yanjun.zhu@oracle.com \
    --cc=davem@davemloft.net \
    --cc=haakon.bugge@oracle.com \
    --cc=linux-rdma@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=rds-devel@oss.oracle.com \
    --cc=santosh.shilimkar@oracle.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.