linux-fsdevel.vger.kernel.org archive mirror
From: Tejun Heo <tj@kernel.org>
To: axboe@kernel.dk
Cc: linux-kernel@vger.kernel.org, jack@suse.cz, hch@infradead.org,
	hannes@cmpxchg.org, linux-fsdevel@vger.kernel.org,
	vgoyal@redhat.com, lizefan@huawei.com, cgroups@vger.kernel.org,
	linux-mm@kvack.org, mhocko@suse.cz, clm@fb.com,
	fengguang.wu@intel.com, david@fromorbit.com, gthelen@google.com,
	khlebnikov@yandex-team.ru
Subject: [PATCH v4 11/51] memcg: implement mem_cgroup_css_from_page()
Date: Wed, 27 May 2015 20:00:02 -0400
Message-ID: <20150528000002.GT7099@htj.duckdns.org>
In-Reply-To: <1432329245-5844-12-git-send-email-tj@kernel.org>

From 7cdfc9bd2cafd3f87f9e771fef25630a81c65886 Mon Sep 17 00:00:00 2001
From: Tejun Heo <tj@kernel.org>
Date: Wed, 27 May 2015 19:53:39 -0400

Implement mem_cgroup_css_from_page() which returns the
cgroup_subsys_state of the memcg associated with a given page on the
default hierarchy.  This will be used by cgroup writeback support.

This function assumes that page->mem_cgroup association doesn't change
until the page is released, which is true on the default hierarchy as
long as replace_page_cache_page() is not used.  As the only user of
replace_page_cache_page() is FUSE which won't support cgroup writeback
for the time being, this works for now, and replace_page_cache_page()
will soon be updated so that the invariant actually holds.

Note that the RCU protected page->mem_cgroup access is consistent with
other usages across memcg but ultimately incorrect.  These unlocked
accesses are missing required barriers.  page->mem_cgroup should be
made an RCU pointer and updated and accessed using RCU operations.

v4: Instead of triggering WARN, return the root css on the traditional
    hierarchies.  This makes the function a lot easier to deal with,
    especially as there's no lightweight way to synchronize against
    hierarchy rebinding.

v3: s/mem_cgroup_migrate()/mem_cgroup_css_from_page()/

v2: Trigger WARN if the function is used on the traditional
    hierarchies and add comment about the assumed invariant.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
---
Hello,

Heh, this is turning out to be trickier than I expected.  Because
memcg may be moved between traditional and default hierarchies and
there's no cheap way to synchronize against such rebinding, we can't
simply require the caller to not use this function if memcg is not
associated with the default hierarchy.  Instead, the function now
returns the root css if the associated css is not on the default
hierarchy.
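
To make the fallback rule concrete, here is a minimal userspace sketch
of the selection logic only.  It is not the kernel code: the struct
layouts, the on_dfl flag, root_memcg and the model_*() names are
hypothetical stand-ins for the kernel types, cgroup_on_dfl() and
root_mem_cgroup, and the RCU read-side section is left out entirely.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for the kernel structures. */
struct cgroup {
	bool on_dfl;		/* models what cgroup_on_dfl() would report */
};

struct cgroup_subsys_state {
	struct cgroup *cgroup;
};

struct mem_cgroup {
	struct cgroup_subsys_state css;
};

struct page {
	struct mem_cgroup *mem_cgroup;	/* NULL when the page is uncharged */
};

/* Stand-in for root_mem_cgroup; the flag value is irrelevant to the model. */
static struct cgroup root_cgroup = { .on_dfl = false };
static struct mem_cgroup root_memcg = { .css = { .cgroup = &root_cgroup } };

/* Stand-in for cgroup_on_dfl(). */
static bool model_cgroup_on_dfl(const struct cgroup *cgrp)
{
	return cgrp->on_dfl;
}

/*
 * Models the selection rule: use the page's memcg only if it exists and
 * sits on the default hierarchy, otherwise fall back to the root memcg.
 */
static struct cgroup_subsys_state *model_css_from_page(struct page *page)
{
	struct mem_cgroup *memcg = page->mem_cgroup;

	if (!memcg || !model_cgroup_on_dfl(memcg->css.cgroup))
		memcg = &root_memcg;

	return &memcg->css;
}
```

The point of the fallback is that a caller never has to check which
hierarchy memcg is bound to: it always gets a valid css back, and on a
traditional hierarchy that css is simply the root one.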

While working on this, I noticed that some read accesses to
page->mem_cgroup are RCU protected but the accesses themselves aren't
RCU operations.  This patch follows the same pattern, but the pattern is
broken: these accesses are missing the requisite barriers.  We'll need
to make page->mem_cgroup an RCU pointer and use RCU accessors to
dereference it when accessing it locklessly.
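
As a rough illustration of what "missing barriers" means here, the
following userspace sketch models a published pointer with C11 atomics.
It is an analogy under stated assumptions, not kernel code: in the
kernel the pair would be rcu_assign_pointer() and rcu_dereference(),
whereas the model_*() helpers and the struct layouts below are
hypothetical, and the acquire load is stronger than the dependency
ordering rcu_dereference() actually provides.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins; only the pointer publication matters here. */
struct mem_cgroup {
	int id;
};

struct page {
	_Atomic(struct mem_cgroup *) mem_cgroup;
};

/*
 * Updater side, modelling rcu_assign_pointer(): the release store makes
 * every earlier write to *memcg visible before the pointer itself is.
 */
static void model_assign_memcg(struct page *page, struct mem_cgroup *memcg)
{
	atomic_store_explicit(&page->mem_cgroup, memcg, memory_order_release);
}

/*
 * Reader side, modelling rcu_dereference(): the ordered load ensures
 * that dereferencing the result sees the initialized memcg contents.
 * A plain "page->mem_cgroup" read gives no such ordering guarantee.
 */
static struct mem_cgroup *model_deref_memcg(struct page *page)
{
	return atomic_load_explicit(&page->mem_cgroup, memory_order_acquire);
}
```

A lockless reader would call model_deref_memcg() once, keep the result
in a local variable, and only dereference that local copy.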

Thanks.

 include/linux/memcontrol.h |  1 +
 mm/memcontrol.c            | 33 +++++++++++++++++++++++++++++++++
 2 files changed, 34 insertions(+)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 294498f..637ef62 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -115,6 +115,7 @@ static inline bool mm_match_cgroup(struct mm_struct *mm,
 }
 
 extern struct cgroup_subsys_state *mem_cgroup_css(struct mem_cgroup *memcg);
+extern struct cgroup_subsys_state *mem_cgroup_css_from_page(struct page *page);
 
 struct mem_cgroup *mem_cgroup_iter(struct mem_cgroup *,
 				   struct mem_cgroup *,
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index b22a92b..5c270a0 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -598,6 +598,39 @@ struct cgroup_subsys_state *mem_cgroup_css(struct mem_cgroup *memcg)
 	return &memcg->css;
 }
 
+/**
+ * mem_cgroup_css_from_page - css of the memcg associated with a page
+ * @page: page of interest
+ *
+ * If memcg is bound to the default hierarchy, css of the memcg associated
+ * with @page is returned.  The returned css remains associated with @page
+ * until it is released.
+ *
+ * If memcg is bound to a traditional hierarchy, the css of root_mem_cgroup
+ * is returned.
+ *
+ * XXX: The above description of behavior on the default hierarchy isn't
+ * strictly true yet as replace_page_cache_page() can modify the
+ * association before @page is released even on the default hierarchy;
+ * however, the current and planned usages don't mix the two functions
+ * and replace_page_cache_page() will soon be updated to make the invariant
+ * actually true.
+ */
+struct cgroup_subsys_state *mem_cgroup_css_from_page(struct page *page)
+{
+	struct mem_cgroup *memcg;
+
+	rcu_read_lock();
+
+	memcg = page->mem_cgroup;
+
+	if (!memcg || !cgroup_on_dfl(memcg->css.cgroup))
+		memcg = root_mem_cgroup;
+
+	rcu_read_unlock();
+	return &memcg->css;
+}
+
 static struct mem_cgroup_per_zone *
 mem_cgroup_page_zoneinfo(struct mem_cgroup *memcg, struct page *page)
 {
-- 
2.4.0



Thread overview: 131+ messages
2015-05-22 21:13 [PATCHSET 1/3 v4 block/for-4.2/core] writeback: cgroup writeback support Tejun Heo
2015-05-22 21:13 ` [PATCH 01/51] page_writeback: revive cancel_dirty_page() in a restricted form Tejun Heo
2015-05-22 21:13 ` [PATCH 02/51] memcg: add per cgroup dirty page accounting Tejun Heo
2015-05-22 21:13 ` [PATCH 03/51] blkcg: move block/blk-cgroup.h to include/linux/blk-cgroup.h Tejun Heo
2015-05-22 21:13 ` [PATCH 04/51] update !CONFIG_BLK_CGROUP dummies in include/linux/blk-cgroup.h Tejun Heo
2015-05-22 21:13 ` [PATCH 05/51] blkcg: always create the blkcg_gq for the root blkcg Tejun Heo
2015-05-22 21:13 ` [PATCH 06/51] memcg: add mem_cgroup_root_css Tejun Heo
2015-06-17 14:56   ` Michal Hocko
     [not found]     ` <20150617145642.GI25056-2MMpYkNvuYDjFM9bn6wA6Q@public.gmane.org>
2015-06-17 18:25       ` Tejun Heo
     [not found]         ` <20150617182500.GI22637-qYNAdHglDFBN0TnZuCh8vA@public.gmane.org>
2015-06-18 11:12           ` Michal Hocko
2015-06-18 17:49             ` Tejun Heo
     [not found]               ` <20150618174930.GA12934-qYNAdHglDFBN0TnZuCh8vA@public.gmane.org>
2015-06-19  9:18                 ` Michal Hocko
     [not found]                   ` <20150619091848.GE4913-2MMpYkNvuYDjFM9bn6wA6Q@public.gmane.org>
2015-06-19 15:17                     ` Tejun Heo
2015-05-22 21:13 ` [PATCH 07/51] blkcg: add blkcg_root_css Tejun Heo
2015-05-22 21:13 ` [PATCH 08/51] cgroup, block: implement task_get_css() and use it in bio_associate_current() Tejun Heo
2015-05-22 21:13 ` [PATCH 09/51] blkcg: implement task_get_blkcg_css() Tejun Heo
2015-05-22 21:13 ` [PATCH 10/51] blkcg: implement bio_associate_blkcg() Tejun Heo
2015-05-22 21:13 ` [PATCH 11/51] memcg: implement mem_cgroup_css_from_page() Tejun Heo
     [not found]   ` <1432329245-5844-12-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2015-05-22 23:28     ` Johannes Weiner
2015-05-24 21:24       ` Tejun Heo
     [not found]         ` <20150524212440.GD7099-piEFEHQLUPpN0TnZuCh8vA@public.gmane.org>
2015-05-27 12:58           ` Johannes Weiner
2015-05-27 16:13     ` [PATCH v2 " Tejun Heo
2015-05-27 17:09       ` Johannes Weiner
     [not found]         ` <20150527170955.GA25324-druUgvl0LCNAfugRpC6u6w@public.gmane.org>
2015-05-27 17:48           ` Tejun Heo
2015-05-27 17:57     ` [PATCH v3 " Tejun Heo
2015-05-28  0:00   ` Tejun Heo [this message]
2015-05-22 21:13 ` [PATCH 12/51] writeback: move backing_dev_info->state into bdi_writeback Tejun Heo
2015-05-22 21:13 ` [PATCH 13/51] writeback: move backing_dev_info->bdi_stat[] " Tejun Heo
2015-05-22 21:13 ` [PATCH 14/51] writeback: move bandwidth related fields from backing_dev_info " Tejun Heo
2015-05-22 21:13 ` [PATCH 15/51] writeback: s/bdi/wb/ in mm/page-writeback.c Tejun Heo
2015-05-22 21:13 ` [PATCH 16/51] writeback: move backing_dev_info->wb_lock and ->worklist into bdi_writeback Tejun Heo
2015-06-07  0:49   ` Sasha Levin
2015-06-08  5:57     ` [PATCH block/for-4.2-writeback] v9fs: fix error handling in v9fs_session_init() Tejun Heo
2015-06-08 15:10       ` Jens Axboe
2015-05-22 21:13 ` [PATCH 17/51] writeback: reorganize mm/backing-dev.c Tejun Heo
2015-05-22 21:13 ` [PATCH 18/51] writeback: separate out include/linux/backing-dev-defs.h Tejun Heo
2015-05-22 21:13 ` [PATCH 19/51] bdi: make inode_to_bdi() inline Tejun Heo
2015-06-30  6:47   ` Jan Kara
2015-05-22 21:13 ` [PATCH 20/51] writeback: add @gfp to wb_init() Tejun Heo
2015-05-22 21:13 ` [PATCH 21/51] bdi: separate out congested state into a separate struct Tejun Heo
2015-06-30  9:21   ` Jan Kara
2015-05-22 21:13 ` [PATCH 22/51] writeback: add {CONFIG|BDI_CAP|FS}_CGROUP_WRITEBACK Tejun Heo
2015-06-30  9:37   ` Jan Kara
2015-07-02  1:10     ` Tejun Heo
2015-07-03 10:49       ` Jan Kara
2015-07-03 17:14         ` Tejun Heo
2015-05-22 21:13 ` [PATCH 23/51] writeback: make backing_dev_info host cgroup-specific bdi_writebacks Tejun Heo
2015-06-30 10:14   ` Jan Kara
2015-05-22 21:13 ` [PATCH 24/51] writeback, blkcg: associate each blkcg_gq with the corresponding bdi_writeback_congested Tejun Heo
     [not found]   ` <1432329245-5844-25-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2015-06-30  9:08     ` Jan Kara
2015-05-22 21:13 ` [PATCH 25/51] writeback: attribute stats to the matching per-cgroup bdi_writeback Tejun Heo
     [not found]   ` <1432329245-5844-26-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2015-06-30 14:17     ` Jan Kara
2015-05-22 21:13 ` [PATCH 26/51] writeback: let balance_dirty_pages() work on the matching cgroup bdi_writeback Tejun Heo
2015-06-30 14:31   ` Jan Kara
2015-07-02  1:26     ` Tejun Heo
2015-05-22 21:13 ` [PATCH 27/51] writeback: make congestion functions per bdi_writeback Tejun Heo
2015-06-30 14:50   ` Jan Kara
2015-05-22 21:13 ` [PATCH 28/51] writeback, blkcg: restructure blk_{set|clear}_queue_congested() Tejun Heo
     [not found]   ` <1432329245-5844-29-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2015-06-30 15:02     ` Jan Kara
2015-07-02  1:38       ` Tejun Heo
     [not found]         ` <20150702013815.GE26440-qYNAdHglDFBN0TnZuCh8vA@public.gmane.org>
2015-07-03 12:16           ` Jan Kara
2015-05-22 21:13 ` [PATCH 29/51] writeback, blkcg: propagate non-root blkcg congestion state Tejun Heo
2015-06-30 15:03   ` Jan Kara
2015-05-22 21:13 ` [PATCH 30/51] writeback: implement and use inode_congested() Tejun Heo
2015-06-30 15:21   ` Jan Kara
2015-07-02  1:46     ` Tejun Heo
2015-07-03 12:17       ` Jan Kara
     [not found]         ` <20150703121721.GJ23329-+0h/O2h83AeN3ZZ/Hiejyg@public.gmane.org>
2015-07-03 17:07           ` Tejun Heo
2015-07-04 15:12         ` [PATCH block/for-4.3] writeback: explain why @inode is allowed to be NULL for inode_congested() Tejun Heo
2015-07-08  8:12           ` Jan Kara
2015-05-22 21:13 ` [PATCH 31/51] writeback: implement WB_has_dirty_io wb_state flag Tejun Heo
2015-06-30 15:42   ` Jan Kara
2015-05-22 21:13 ` [PATCH 32/51] writeback: implement backing_dev_info->tot_write_bandwidth Tejun Heo
     [not found]   ` <1432329245-5844-33-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2015-06-30 16:14     ` Jan Kara
2015-06-30 16:42       ` Jan Kara
2015-05-22 21:13 ` [PATCH 33/51] writeback: make bdi_has_dirty_io() take multiple bdi_writeback's into account Tejun Heo
2015-06-30 16:48   ` Jan Kara
     [not found]     ` <20150630164824.GU7252-+0h/O2h83AeN3ZZ/Hiejyg@public.gmane.org>
2015-07-02  2:01       ` Tejun Heo
2015-05-22 21:13 ` [PATCH 34/51] writeback: don't issue wb_writeback_work if clean Tejun Heo
     [not found]   ` <1432329245-5844-35-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2015-06-30 16:18     ` Jan Kara
2015-05-22 21:13 ` [PATCH 35/51] writeback: make bdi->min/max_ratio handling cgroup writeback aware Tejun Heo
     [not found]   ` <1432329245-5844-36-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2015-07-01  7:00     ` Jan Kara
2015-05-22 21:13 ` [PATCH 36/51] writeback: implement bdi_for_each_wb() Tejun Heo
     [not found]   ` <1432329245-5844-37-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2015-07-01  7:27     ` Jan Kara
2015-07-02  2:22       ` Tejun Heo
     [not found]         ` <20150702022226.GH26440-qYNAdHglDFBN0TnZuCh8vA@public.gmane.org>
2015-07-03 12:26           ` Jan Kara
2015-07-03 17:06             ` Tejun Heo
2015-05-22 21:13 ` [PATCH 37/51] writeback: remove bdi_start_writeback() Tejun Heo
     [not found]   ` <1432329245-5844-38-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2015-07-01  7:30     ` Jan Kara
2015-05-22 21:13 ` [PATCH 38/51] writeback: make laptop_mode_timer_fn() handle multiple bdi_writeback's Tejun Heo
     [not found]   ` <1432329245-5844-39-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2015-07-01  7:32     ` Jan Kara
2015-05-22 21:13 ` [PATCH 39/51] writeback: make writeback_in_progress() take bdi_writeback instead of backing_dev_info Tejun Heo
     [not found]   ` <1432329245-5844-40-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2015-07-01  7:47     ` Jan Kara
2015-07-02  2:28       ` Tejun Heo
2015-05-22 21:13 ` [PATCH 40/51] writeback: make bdi_start_background_writeback() " Tejun Heo
2015-07-01  7:50   ` Jan Kara
2015-07-02  2:29     ` Tejun Heo
     [not found]     ` <20150701075009.GA7252-+0h/O2h83AeN3ZZ/Hiejyg@public.gmane.org>
2015-07-06 19:36       ` [PATCH block/for-4.3] writeback: update writeback tracepoints to report cgroup Tejun Heo
2015-07-08  8:17         ` Jan Kara
2015-05-22 21:13 ` [PATCH 41/51] writeback: make wakeup_flusher_threads() handle multiple bdi_writeback's Tejun Heo
2015-07-01  8:15   ` Jan Kara
2015-07-02  2:37     ` Tejun Heo
2015-07-03 13:02       ` Jan Kara
     [not found]         ` <20150703130213.GM23329-+0h/O2h83AeN3ZZ/Hiejyg@public.gmane.org>
2015-07-03 16:33           ` Tejun Heo
2015-05-22 21:13 ` [PATCH 42/51] writeback: make wakeup_dirtytime_writeback() " Tejun Heo
2015-07-01  8:20   ` Jan Kara
2015-05-22 21:13 ` [PATCH 43/51] writeback: add wb_writeback_work->auto_free Tejun Heo
2015-05-22 21:13 ` [PATCH 44/51] writeback: implement bdi_wait_for_completion() Tejun Heo
     [not found]   ` <1432329245-5844-45-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2015-07-01 16:04     ` Jan Kara
2015-07-02  3:06       ` Tejun Heo
2015-07-03 12:36         ` Jan Kara
2015-07-03 17:02           ` Tejun Heo
2015-07-01 16:09     ` Jan Kara
     [not found]       ` <20150701160918.GH7252-+0h/O2h83AeN3ZZ/Hiejyg@public.gmane.org>
2015-07-02  3:01         ` Tejun Heo
2015-05-22 21:13 ` [PATCH 45/51] writeback: implement wb_wait_for_single_work() Tejun Heo
2015-07-01 19:07   ` Jan Kara
2015-07-02  3:07     ` Tejun Heo
2015-07-03 22:12     ` [PATCH block/for-4.3] writeback: remove wb_writeback_work->single_wait/done Tejun Heo
2015-07-08  8:24       ` Jan Kara
2015-05-22 21:14 ` [PATCH 46/51] writeback: restructure try_writeback_inodes_sb[_nr]() Tejun Heo
2015-05-22 21:14 ` [PATCH 47/51] writeback: make writeback initiation functions handle multiple bdi_writeback's Tejun Heo
2015-05-22 21:14 ` [PATCH 48/51] writeback: dirty inodes against their matching cgroup bdi_writeback's Tejun Heo
     [not found]   ` <1432329245-5844-49-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2015-07-01 19:16     ` Jan Kara
2015-05-22 21:14 ` [PATCH 49/51] buffer, writeback: make __block_write_full_page() honor cgroup writeback Tejun Heo
2015-07-01 19:21   ` Jan Kara
     [not found]     ` <20150701192102.GK7252-+0h/O2h83AeN3ZZ/Hiejyg@public.gmane.org>
2015-07-01 19:28       ` Jan Kara
2015-05-22 21:14 ` [PATCH 50/51] mpage: make __mpage_writepage() " Tejun Heo
2015-07-01 19:26   ` Jan Kara
2015-05-22 21:14 ` [PATCH 51/51] ext2: enable cgroup writeback support Tejun Heo
2015-07-01 19:29   ` Jan Kara
2015-07-02  3:08     ` Tejun Heo
