linux-block.vger.kernel.org archive mirror
From: Ming Lei <ming.lei@redhat.com>
To: Waiman Long <longman@redhat.com>
Cc: "Jens Axboe" <axboe@kernel.dk>, "Tejun Heo" <tj@kernel.org>,
	"Josef Bacik" <josef@toxicpanda.com>,
	"Zefan Li" <lizefan.x@bytedance.com>,
	"Johannes Weiner" <hannes@cmpxchg.org>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	cgroups@vger.kernel.org, linux-block@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	"Michal Koutný" <mkoutny@suse.com>,
	"Dennis Zhou (Facebook)" <dennisszhou@gmail.com>,
	ming.lei@redhat.com
Subject: Re: [PATCH v4 2/2] blk-cgroup: Flush stats at blkgs destruction path
Date: Thu, 2 Feb 2023 12:15:52 +0800
Message-ID: <Y9s4+Nop1eluWmJ4@T590>
In-Reply-To: <20221215033132.230023-3-longman@redhat.com>

On Wed, Dec 14, 2022 at 10:31:32PM -0500, Waiman Long wrote:
> As noted by Michal, the blkg_iostat_set's in the lockless list
> hold reference to blkg's to protect against their removal. Those
> blkg's hold reference to blkcg. When a cgroup is being destroyed,
> cgroup_rstat_flush() is only called at css_release_work_fn() which
> is called when the blkcg reference count reaches 0. This circular
> dependency will prevent blkcg and some blkgs from being freed after
> they are made offline.
> 
> It is less a problem if the cgroup to be destroyed also has other
> controllers like memory that will call cgroup_rstat_flush() which will
> clean up the reference count. If block is the only controller that uses
> rstat, these offline blkcg and blkgs may never be freed leaking more
> and more memory over time.
> 
> To prevent this potential memory leak, a new cgroup_rstat_css_cpu_flush()
> function is added to flush stats for a given css and cpu. This new
> function will be called at blkgs destruction path, blkcg_destroy_blkgs(),
> whenever there are still pending stats to be flushed. This will release
> the references to blkgs allowing them to be freed and indirectly allow
> the freeing of blkcg.
> 
> Fixes: 3b8cc6298724 ("blk-cgroup: Optimize blkcg_rstat_flush()")
> Signed-off-by: Waiman Long <longman@redhat.com>
> Acked-by: Tejun Heo <tj@kernel.org>
> ---
>  block/blk-cgroup.c     | 16 ++++++++++++++++
>  include/linux/cgroup.h |  1 +
>  kernel/cgroup/rstat.c  | 18 ++++++++++++++++++
>  3 files changed, 35 insertions(+)
> 
> diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
> index ca28306aa1b1..a2a1081d9d1d 100644
> --- a/block/blk-cgroup.c
> +++ b/block/blk-cgroup.c
> @@ -1084,6 +1084,8 @@ struct list_head *blkcg_get_cgwb_list(struct cgroup_subsys_state *css)
>   */
>  static void blkcg_destroy_blkgs(struct blkcg *blkcg)
>  {
> +	int cpu;
> +
>  	/*
>  	 * blkcg_destroy_blkgs() shouldn't be called with all the blkcg
>  	 * references gone.
> @@ -1093,6 +1095,20 @@ static void blkcg_destroy_blkgs(struct blkcg *blkcg)
>  
>  	might_sleep();
>  
> +	/*
> +	 * Flush all the non-empty percpu lockless lists to release the
> +	 * blkg references held by those lists which, in turn, will
> +	 * allow those blkgs to be freed and release their references to
> +	 * blkcg. Otherwise, they may not be freed at all because of this
> +	 * circular dependency resulting in memory leak.
> +	 */
> +	for_each_possible_cpu(cpu) {
> +		struct llist_head *lhead = per_cpu_ptr(blkcg->lhead, cpu);
> +
> +		if (!llist_empty(lhead))
> +			cgroup_rstat_css_cpu_flush(&blkcg->css, cpu);
> +	}

I guess it is possible for a new iostat_cpu to be added just after the
llist_empty() check, in which case this loop would skip that CPU and
leave the new entry unflushed.
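
To make the window concrete, here is a minimal userspace sketch of the
check-then-flush pattern. It is not kernel code and it does not show
that the interleaving can actually occur in blkcg_destroy_blkgs(); it
only replays, in a single thread, the ordering described above. Every
name in it (struct lhead, lhead_add(), flush_if_nonempty(), and so on)
is hypothetical and merely stands in for the llist and rstat helpers
in the quoted patch.

```c
/* Userspace illustration only. All names here are hypothetical. */
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct node {
	struct node *next;
};

struct lhead {
	_Atomic(struct node *) first;
};

static bool lhead_empty(struct lhead *h)
{
	return atomic_load(&h->first) == NULL;
}

/* Lock-free push onto the head of the list. */
static void lhead_add(struct lhead *h, struct node *n)
{
	struct node *old = atomic_load(&h->first);

	do {
		n->next = old;
	} while (!atomic_compare_exchange_weak(&h->first, &old, n));
}

/* Detach every entry at once and return how many were detached. */
static int lhead_flush(struct lhead *h)
{
	struct node *n = atomic_exchange(&h->first, NULL);
	int count = 0;

	for (; n; n = n->next)
		count++;
	return count;
}

/* Same shape as the quoted loop body: flush only when non-empty. */
static int flush_if_nonempty(struct lhead *h)
{
	if (!lhead_empty(h))
		return lhead_flush(h);
	return 0;
}

/* Entry added before the check: the conditional flush picks it up. */
static int entries_flushed_after_early_add(void)
{
	struct lhead h;
	struct node early = { .next = NULL };

	atomic_init(&h.first, NULL);
	lhead_add(&h, &early);
	return flush_if_nonempty(&h);
}

/* Entry added after the check: the conditional flush never sees it. */
static int entries_left_after_late_add(void)
{
	struct lhead h;
	struct node late = { .next = NULL };

	atomic_init(&h.first, NULL);
	flush_if_nonempty(&h);	/* list looks empty, flush is skipped */
	lhead_add(&h, &late);	/* stands in for an add from another CPU */
	return lhead_flush(&h);	/* count what the first pass missed */
}
```

In this sketch entries_flushed_after_early_add() returns 1 and
entries_left_after_late_add() also returns 1, the latter being the
entry that the skipped flush left behind. Whether the real destruction
path can see such a late add depends on what else prevents new stat
updates at that point, which the sketch does not model.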


Thanks,
Ming



Thread overview: 6+ messages
2022-12-15  3:31 [PATCH v4 0/2] blk-cgroup: Fix potential UAF & flush rstat at blkgs destruction path Waiman Long
2022-12-15  3:31 ` [PATCH v4 1/2] bdi, blk-cgroup: Fix potential UAF of blkcg Waiman Long
2022-12-15  3:31 ` [PATCH v4 2/2] blk-cgroup: Flush stats at blkgs destruction path Waiman Long
2023-02-02  4:15   ` Ming Lei [this message]
2023-02-02 22:35     ` Tejun Heo
2023-02-02  3:26 ` [PATCH v4 0/2] blk-cgroup: Fix potential UAF & flush rstat " Waiman Long
