* [PATCH v2 0/8] memcontrol code cleanup and simplification
@ 2021-04-16  5:13 Muchun Song
  2021-04-16  5:14 ` [PATCH v2 1/8] mm: memcontrol: fix page charging in page replacement Muchun Song
                   ` (7 more replies)
  0 siblings, 8 replies; 13+ messages in thread
From: Muchun Song @ 2021-04-16  5:13 UTC (permalink / raw)
  To: guro, hannes, mhocko, akpm, shakeelb, vdavydov.dev
  Cc: linux-kernel, linux-mm, duanxiongchun, Muchun Song

This patch series was split out of the series at [1]. Since those
patches are pure code cleanups and simplifications, gathering them into
a separate series makes them easier to review.

[1] https://lore.kernel.org/linux-mm/20210409122959.82264-1-songmuchun@bytedance.com/

Changelog in v2:
  1. Collect Acked-by and Reviewed-by tags.
  2. Add a new patch to rename lruvec_holds_page_lru_lock to page_matches_lruvec.
  3. Add a comment to patch 2.

  Thanks to Roman, Johannes, Shakeel and Michal for their reviews.

Muchun Song (8):
  mm: memcontrol: fix page charging in page replacement
  mm: memcontrol: bail out early when !mm in get_mem_cgroup_from_mm
  mm: memcontrol: remove the pgdata parameter of mem_cgroup_page_lruvec
  mm: memcontrol: simplify lruvec_holds_page_lru_lock
  mm: memcontrol: rename lruvec_holds_page_lru_lock to
    page_matches_lruvec
  mm: memcontrol: simplify the logic of objcg pinning memcg
  mm: memcontrol: move obj_cgroup_uncharge_pages() out of css_set_lock
  mm: vmscan: remove noinline_for_stack

 include/linux/memcontrol.h | 42 +++++++++---------------------
 mm/compaction.c            |  2 +-
 mm/memcontrol.c            | 65 +++++++++++++++++++++-------------------------
 mm/swap.c                  |  2 +-
 mm/vmscan.c                |  8 +++---
 mm/workingset.c            |  2 +-
 6 files changed, 49 insertions(+), 72 deletions(-)

-- 
2.11.0



* [PATCH v2 1/8] mm: memcontrol: fix page charging in page replacement
  2021-04-16  5:13 [PATCH v2 0/8] memcontrol code cleanup and simplification Muchun Song
@ 2021-04-16  5:14 ` Muchun Song
  2021-04-16  5:14 ` [PATCH v2 2/8] mm: memcontrol: bail out early when !mm in get_mem_cgroup_from_mm Muchun Song
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 13+ messages in thread
From: Muchun Song @ 2021-04-16  5:14 UTC (permalink / raw)
  To: guro, hannes, mhocko, akpm, shakeelb, vdavydov.dev
  Cc: linux-kernel, linux-mm, duanxiongchun, Muchun Song, Michal Hocko

The pages are not accounted at the root level, so do not charge the
page to the root memcg in page replacement. Since we do not display the
value for the root memcg (mem_cgroup_usage), there should not be any
user-visible problem, but page_counter_cancel() contains a
WARN_ON_ONCE() that could conceivably trigger, so it is better to fix
the imbalance.

Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Roman Gushchin <guro@fb.com>
Acked-by: Michal Hocko <mhocko@suse.com>
---
 mm/memcontrol.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 64ada9e650a5..f229de925aa5 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -6806,9 +6806,11 @@ void mem_cgroup_migrate(struct page *oldpage, struct page *newpage)
 	/* Force-charge the new page. The old one will be freed soon */
 	nr_pages = thp_nr_pages(newpage);
 
-	page_counter_charge(&memcg->memory, nr_pages);
-	if (do_memsw_account())
-		page_counter_charge(&memcg->memsw, nr_pages);
+	if (!mem_cgroup_is_root(memcg)) {
+		page_counter_charge(&memcg->memory, nr_pages);
+		if (do_memsw_account())
+			page_counter_charge(&memcg->memsw, nr_pages);
+	}
 
 	css_get(&memcg->css);
 	commit_charge(newpage, memcg);
-- 
2.11.0
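
For context on the warning mentioned above: page_counter_cancel()
complains when a counter's usage would go negative, i.e. when more
pages are uncharged than were ever charged. Roughly (a simplified
sketch of mm/page_counter.c from memory, not a verbatim quote):

void page_counter_cancel(struct page_counter *counter, unsigned long nr_pages)
{
	long new;

	/* usage is atomic; subtract and inspect the result */
	new = atomic_long_sub_return(nr_pages, &counter->usage);
	/* More uncharges than charges? */
	WARN_ON_ONCE(new < 0);
}

Keeping the charge and uncharge paths symmetric for the root memcg
means nobody has to reason about whether that warning is reachable.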



* [PATCH v2 2/8] mm: memcontrol: bail out early when !mm in get_mem_cgroup_from_mm
  2021-04-16  5:13 [PATCH v2 0/8] memcontrol code cleanup and simplification Muchun Song
  2021-04-16  5:14 ` [PATCH v2 1/8] mm: memcontrol: fix page charging in page replacement Muchun Song
@ 2021-04-16  5:14 ` Muchun Song
  2021-04-16  5:14 ` [PATCH v2 3/8] mm: memcontrol: remove the pgdata parameter of mem_cgroup_page_lruvec Muchun Song
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 13+ messages in thread
From: Muchun Song @ 2021-04-16  5:14 UTC (permalink / raw)
  To: guro, hannes, mhocko, akpm, shakeelb, vdavydov.dev
  Cc: linux-kernel, linux-mm, duanxiongchun, Muchun Song, Michal Hocko

When mm is NULL, there is no need to take the RCU lock and call
css_tryget() for the root memcg, nor to re-check !mm on every iteration
of the while loop. So bail out early when !mm.

Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Roman Gushchin <guro@fb.com>
Acked-by: Michal Hocko <mhocko@suse.com>
---
 mm/memcontrol.c | 25 ++++++++++++++-----------
 1 file changed, 14 insertions(+), 11 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index f229de925aa5..50e3cf1e263e 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -901,20 +901,23 @@ struct mem_cgroup *get_mem_cgroup_from_mm(struct mm_struct *mm)
 	if (mem_cgroup_disabled())
 		return NULL;
 
+	/*
+	 * Page cache insertions can happen without an
+	 * actual mm context, e.g. during disk probing
+	 * on boot, loopback IO, acct() writes etc.
+	 *
+	 * No need to css_get on root memcg as the reference
+	 * counting is disabled on the root level in the
+	 * cgroup core. See CSS_NO_REF.
+	 */
+	if (unlikely(!mm))
+		return root_mem_cgroup;
+
 	rcu_read_lock();
 	do {
-		/*
-		 * Page cache insertions can happen without an
-		 * actual mm context, e.g. during disk probing
-		 * on boot, loopback IO, acct() writes etc.
-		 */
-		if (unlikely(!mm))
+		memcg = mem_cgroup_from_task(rcu_dereference(mm->owner));
+		if (unlikely(!memcg))
 			memcg = root_mem_cgroup;
-		else {
-			memcg = mem_cgroup_from_task(rcu_dereference(mm->owner));
-			if (unlikely(!memcg))
-				memcg = root_mem_cgroup;
-		}
 	} while (!css_tryget(&memcg->css));
 	rcu_read_unlock();
 	return memcg;
-- 
2.11.0
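
Since the diff interleaves old and new lines, the function after this
patch reads roughly as follows (reconstructed from the hunk above; the
local variable declaration at the top of the function is assumed rather
than shown in the hunk):

struct mem_cgroup *get_mem_cgroup_from_mm(struct mm_struct *mm)
{
	struct mem_cgroup *memcg;

	if (mem_cgroup_disabled())
		return NULL;

	/*
	 * Page cache insertions can happen without an actual mm
	 * context, e.g. during disk probing on boot, loopback IO,
	 * acct() writes etc. No css_get() is needed on the root memcg
	 * because reference counting is disabled on the root level in
	 * the cgroup core (CSS_NO_REF).
	 */
	if (unlikely(!mm))
		return root_mem_cgroup;

	rcu_read_lock();
	do {
		memcg = mem_cgroup_from_task(rcu_dereference(mm->owner));
		if (unlikely(!memcg))
			memcg = root_mem_cgroup;
	} while (!css_tryget(&memcg->css));
	rcu_read_unlock();
	return memcg;
}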



* [PATCH v2 3/8] mm: memcontrol: remove the pgdata parameter of mem_cgroup_page_lruvec
  2021-04-16  5:13 [PATCH v2 0/8] memcontrol code cleanup and simplification Muchun Song
  2021-04-16  5:14 ` [PATCH v2 1/8] mm: memcontrol: fix page charging in page replacement Muchun Song
  2021-04-16  5:14 ` [PATCH v2 2/8] mm: memcontrol: bail out early when !mm in get_mem_cgroup_from_mm Muchun Song
@ 2021-04-16  5:14 ` Muchun Song
  2021-04-16  5:14 ` [PATCH v2 4/8] mm: memcontrol: simplify lruvec_holds_page_lru_lock Muchun Song
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 13+ messages in thread
From: Muchun Song @ 2021-04-16  5:14 UTC (permalink / raw)
  To: guro, hannes, mhocko, akpm, shakeelb, vdavydov.dev
  Cc: linux-kernel, linux-mm, duanxiongchun, Muchun Song, Michal Hocko

All callers of mem_cgroup_page_lruvec() pass page_pgdat(page) as the
2nd parameter (except isolate_migratepages_block(), where the local
variable @pgdat is equal to page_pgdat(page) anyway). So
mem_cgroup_page_lruvec() does not need the pgdat parameter; remove it
to simplify the code.

Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Roman Gushchin <guro@fb.com>
Acked-by: Michal Hocko <mhocko@suse.com>
---
 include/linux/memcontrol.h | 10 +++++-----
 mm/compaction.c            |  2 +-
 mm/memcontrol.c            |  9 +++------
 mm/swap.c                  |  2 +-
 mm/workingset.c            |  2 +-
 5 files changed, 11 insertions(+), 14 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index c193be760709..f2a5aaba3577 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -743,13 +743,12 @@ static inline struct lruvec *mem_cgroup_lruvec(struct mem_cgroup *memcg,
 /**
  * mem_cgroup_page_lruvec - return lruvec for isolating/putting an LRU page
  * @page: the page
- * @pgdat: pgdat of the page
  *
  * This function relies on page->mem_cgroup being stable.
  */
-static inline struct lruvec *mem_cgroup_page_lruvec(struct page *page,
-						struct pglist_data *pgdat)
+static inline struct lruvec *mem_cgroup_page_lruvec(struct page *page)
 {
+	pg_data_t *pgdat = page_pgdat(page);
 	struct mem_cgroup *memcg = page_memcg(page);
 
 	VM_WARN_ON_ONCE_PAGE(!memcg && !mem_cgroup_disabled(), page);
@@ -1221,9 +1220,10 @@ static inline struct lruvec *mem_cgroup_lruvec(struct mem_cgroup *memcg,
 	return &pgdat->__lruvec;
 }
 
-static inline struct lruvec *mem_cgroup_page_lruvec(struct page *page,
-						    struct pglist_data *pgdat)
+static inline struct lruvec *mem_cgroup_page_lruvec(struct page *page)
 {
+	pg_data_t *pgdat = page_pgdat(page);
+
 	return &pgdat->__lruvec;
 }
 
diff --git a/mm/compaction.c b/mm/compaction.c
index 8c5028bfbd56..1c500e697c88 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -994,7 +994,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
 		if (!TestClearPageLRU(page))
 			goto isolate_fail_put;
 
-		lruvec = mem_cgroup_page_lruvec(page, pgdat);
+		lruvec = mem_cgroup_page_lruvec(page);
 
 		/* If we already hold the lock, we can skip some rechecking */
 		if (lruvec != locked) {
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 50e3cf1e263e..caf193088beb 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1181,9 +1181,8 @@ void lruvec_memcg_debug(struct lruvec *lruvec, struct page *page)
 struct lruvec *lock_page_lruvec(struct page *page)
 {
 	struct lruvec *lruvec;
-	struct pglist_data *pgdat = page_pgdat(page);
 
-	lruvec = mem_cgroup_page_lruvec(page, pgdat);
+	lruvec = mem_cgroup_page_lruvec(page);
 	spin_lock(&lruvec->lru_lock);
 
 	lruvec_memcg_debug(lruvec, page);
@@ -1194,9 +1193,8 @@ struct lruvec *lock_page_lruvec(struct page *page)
 struct lruvec *lock_page_lruvec_irq(struct page *page)
 {
 	struct lruvec *lruvec;
-	struct pglist_data *pgdat = page_pgdat(page);
 
-	lruvec = mem_cgroup_page_lruvec(page, pgdat);
+	lruvec = mem_cgroup_page_lruvec(page);
 	spin_lock_irq(&lruvec->lru_lock);
 
 	lruvec_memcg_debug(lruvec, page);
@@ -1207,9 +1205,8 @@ struct lruvec *lock_page_lruvec_irq(struct page *page)
 struct lruvec *lock_page_lruvec_irqsave(struct page *page, unsigned long *flags)
 {
 	struct lruvec *lruvec;
-	struct pglist_data *pgdat = page_pgdat(page);
 
-	lruvec = mem_cgroup_page_lruvec(page, pgdat);
+	lruvec = mem_cgroup_page_lruvec(page);
 	spin_lock_irqsave(&lruvec->lru_lock, *flags);
 
 	lruvec_memcg_debug(lruvec, page);
diff --git a/mm/swap.c b/mm/swap.c
index a75a8265302b..e0d5699213cc 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -313,7 +313,7 @@ void lru_note_cost(struct lruvec *lruvec, bool file, unsigned int nr_pages)
 
 void lru_note_cost_page(struct page *page)
 {
-	lru_note_cost(mem_cgroup_page_lruvec(page, page_pgdat(page)),
+	lru_note_cost(mem_cgroup_page_lruvec(page),
 		      page_is_file_lru(page), thp_nr_pages(page));
 }
 
diff --git a/mm/workingset.c b/mm/workingset.c
index b7cdeca5a76d..4f7a306ce75a 100644
--- a/mm/workingset.c
+++ b/mm/workingset.c
@@ -408,7 +408,7 @@ void workingset_activation(struct page *page)
 	memcg = page_memcg_rcu(page);
 	if (!mem_cgroup_disabled() && !memcg)
 		goto out;
-	lruvec = mem_cgroup_page_lruvec(page, page_pgdat(page));
+	lruvec = mem_cgroup_page_lruvec(page);
 	workingset_age_nonresident(lruvec, thp_nr_pages(page));
 out:
 	rcu_read_unlock();
-- 
2.11.0
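
For readers who want the end result in one piece, the CONFIG_MEMCG
variant now reads roughly as below. The final return statement is taken
from the context lines visible in the next patch, so this is a
reconstruction rather than a quote:

static inline struct lruvec *mem_cgroup_page_lruvec(struct page *page)
{
	pg_data_t *pgdat = page_pgdat(page);
	struct mem_cgroup *memcg = page_memcg(page);

	VM_WARN_ON_ONCE_PAGE(!memcg && !mem_cgroup_disabled(), page);

	return mem_cgroup_lruvec(memcg, pgdat);
}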



* [PATCH v2 4/8] mm: memcontrol: simplify lruvec_holds_page_lru_lock
  2021-04-16  5:13 [PATCH v2 0/8] memcontrol code cleanup and simplification Muchun Song
                   ` (2 preceding siblings ...)
  2021-04-16  5:14 ` [PATCH v2 3/8] mm: memcontrol: remove the pgdata parameter of mem_cgroup_page_lruvec Muchun Song
@ 2021-04-16  5:14 ` Muchun Song
  2021-04-16  5:14 ` [PATCH v2 5/8] mm: memcontrol: rename lruvec_holds_page_lru_lock to page_matches_lruvec Muchun Song
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 13+ messages in thread
From: Muchun Song @ 2021-04-16  5:14 UTC (permalink / raw)
  To: guro, hannes, mhocko, akpm, shakeelb, vdavydov.dev
  Cc: linux-kernel, linux-mm, duanxiongchun, Muchun Song, Michal Hocko

We already have the helper lruvec_memcg() to get the memcg from a
lruvec, so there is no need to open-code it in
lruvec_holds_page_lru_lock(); use lruvec_memcg() instead. If
mem_cgroup_disabled() returns false, page_memcg(page) cannot be NULL
for LRU pages, so remove the odd
"memcg = page_memcg(page) ? : root_mem_cgroup" fallback. Also use
lruvec_pgdat() to simplify the code. This leaves a single definition of
the function that works for !CONFIG_MEMCG, CONFIG_MEMCG +
mem_cgroup_disabled() and CONFIG_MEMCG.

Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Roman Gushchin <guro@fb.com>
Acked-by: Michal Hocko <mhocko@suse.com>
---
 include/linux/memcontrol.h | 31 +++++++------------------------
 1 file changed, 7 insertions(+), 24 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index f2a5aaba3577..2fc728492c9b 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -755,22 +755,6 @@ static inline struct lruvec *mem_cgroup_page_lruvec(struct page *page)
 	return mem_cgroup_lruvec(memcg, pgdat);
 }
 
-static inline bool lruvec_holds_page_lru_lock(struct page *page,
-					      struct lruvec *lruvec)
-{
-	pg_data_t *pgdat = page_pgdat(page);
-	const struct mem_cgroup *memcg;
-	struct mem_cgroup_per_node *mz;
-
-	if (mem_cgroup_disabled())
-		return lruvec == &pgdat->__lruvec;
-
-	mz = container_of(lruvec, struct mem_cgroup_per_node, lruvec);
-	memcg = page_memcg(page) ? : root_mem_cgroup;
-
-	return lruvec->pgdat == pgdat && mz->memcg == memcg;
-}
-
 struct mem_cgroup *mem_cgroup_from_task(struct task_struct *p);
 
 struct mem_cgroup *get_mem_cgroup_from_mm(struct mm_struct *mm);
@@ -1227,14 +1211,6 @@ static inline struct lruvec *mem_cgroup_page_lruvec(struct page *page)
 	return &pgdat->__lruvec;
 }
 
-static inline bool lruvec_holds_page_lru_lock(struct page *page,
-					      struct lruvec *lruvec)
-{
-	pg_data_t *pgdat = page_pgdat(page);
-
-	return lruvec == &pgdat->__lruvec;
-}
-
 static inline void lruvec_memcg_debug(struct lruvec *lruvec, struct page *page)
 {
 }
@@ -1516,6 +1492,13 @@ static inline void unlock_page_lruvec_irqrestore(struct lruvec *lruvec,
 	spin_unlock_irqrestore(&lruvec->lru_lock, flags);
 }
 
+static inline bool lruvec_holds_page_lru_lock(struct page *page,
+					      struct lruvec *lruvec)
+{
+	return lruvec_pgdat(lruvec) == page_pgdat(page) &&
+	       lruvec_memcg(lruvec) == page_memcg(page);
+}
+
 /* Don't lock again iff page's lruvec locked */
 static inline struct lruvec *relock_page_lruvec_irq(struct page *page,
 		struct lruvec *locked_lruvec)
-- 
2.11.0
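
Why one definition can cover every configuration, assuming the usual
behaviour that both lruvec_memcg() and page_memcg() return NULL when
memcg is disabled or not compiled in (an assumption worth
double-checking against memcontrol.h rather than taking from here):

static inline bool lruvec_holds_page_lru_lock(struct page *page,
					      struct lruvec *lruvec)
{
	/*
	 * With memcg enabled this compares both the node and the memcg
	 * owning the lruvec. With !CONFIG_MEMCG or
	 * mem_cgroup_disabled(), both sides of the memcg comparison are
	 * NULL, so the check degenerates to the pure pgdat comparison
	 * that the removed special cases used to do.
	 */
	return lruvec_pgdat(lruvec) == page_pgdat(page) &&
	       lruvec_memcg(lruvec) == page_memcg(page);
}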



* [PATCH v2 5/8] mm: memcontrol: rename lruvec_holds_page_lru_lock to page_matches_lruvec
  2021-04-16  5:13 [PATCH v2 0/8] memcontrol code cleanup and simplification Muchun Song
                   ` (3 preceding siblings ...)
  2021-04-16  5:14 ` [PATCH v2 4/8] mm: memcontrol: simplify lruvec_holds_page_lru_lock Muchun Song
@ 2021-04-16  5:14 ` Muchun Song
  2021-04-16  6:01   ` Michal Hocko
                     ` (2 more replies)
  2021-04-16  5:14 ` [PATCH v2 6/8] mm: memcontrol: simplify the logic of objcg pinning memcg Muchun Song
                   ` (2 subsequent siblings)
  7 siblings, 3 replies; 13+ messages in thread
From: Muchun Song @ 2021-04-16  5:14 UTC (permalink / raw)
  To: guro, hannes, mhocko, akpm, shakeelb, vdavydov.dev
  Cc: linux-kernel, linux-mm, duanxiongchun, Muchun Song

lruvec_holds_page_lru_lock() doesn't check anything about locking and is
used to check whether the page belongs to the lruvec. So rename it to
page_matches_lruvec().

Signed-off-by: Muchun Song <songmuchun@bytedance.com>
---
 include/linux/memcontrol.h | 7 +++----
 mm/vmscan.c                | 2 +-
 2 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 2fc728492c9b..40b0c31ea4ba 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -1492,8 +1492,7 @@ static inline void unlock_page_lruvec_irqrestore(struct lruvec *lruvec,
 	spin_unlock_irqrestore(&lruvec->lru_lock, flags);
 }
 
-static inline bool lruvec_holds_page_lru_lock(struct page *page,
-					      struct lruvec *lruvec)
+static inline bool page_matches_lruvec(struct page *page, struct lruvec *lruvec)
 {
 	return lruvec_pgdat(lruvec) == page_pgdat(page) &&
 	       lruvec_memcg(lruvec) == page_memcg(page);
@@ -1504,7 +1503,7 @@ static inline struct lruvec *relock_page_lruvec_irq(struct page *page,
 		struct lruvec *locked_lruvec)
 {
 	if (locked_lruvec) {
-		if (lruvec_holds_page_lru_lock(page, locked_lruvec))
+		if (page_matches_lruvec(page, locked_lruvec))
 			return locked_lruvec;
 
 		unlock_page_lruvec_irq(locked_lruvec);
@@ -1518,7 +1517,7 @@ static inline struct lruvec *relock_page_lruvec_irqsave(struct page *page,
 		struct lruvec *locked_lruvec, unsigned long *flags)
 {
 	if (locked_lruvec) {
-		if (lruvec_holds_page_lru_lock(page, locked_lruvec))
+		if (page_matches_lruvec(page, locked_lruvec))
 			return locked_lruvec;
 
 		unlock_page_lruvec_irqrestore(locked_lruvec, *flags);
diff --git a/mm/vmscan.c b/mm/vmscan.c
index bb8321026c0c..2bc5cf409958 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2062,7 +2062,7 @@ static unsigned noinline_for_stack move_pages_to_lru(struct lruvec *lruvec,
 		 * All pages were isolated from the same lruvec (and isolation
 		 * inhibits memcg migration).
 		 */
-		VM_BUG_ON_PAGE(!lruvec_holds_page_lru_lock(page, lruvec), page);
+		VM_BUG_ON_PAGE(!page_matches_lruvec(page, lruvec), page);
 		add_page_to_lru_list(page, lruvec);
 		nr_pages = thp_nr_pages(page);
 		nr_moved += nr_pages;
-- 
2.11.0



* [PATCH v2 6/8] mm: memcontrol: simplify the logic of objcg pinning memcg
  2021-04-16  5:13 [PATCH v2 0/8] memcontrol code cleanup and simplification Muchun Song
                   ` (4 preceding siblings ...)
  2021-04-16  5:14 ` [PATCH v2 5/8] mm: memcontrol: rename lruvec_holds_page_lru_lock to page_matches_lruvec Muchun Song
@ 2021-04-16  5:14 ` Muchun Song
  2021-04-16  5:14 ` [PATCH v2 7/8] mm: memcontrol: move obj_cgroup_uncharge_pages() out of css_set_lock Muchun Song
  2021-04-16  5:14 ` [PATCH v2 8/8] mm: vmscan: remove noinline_for_stack Muchun Song
  7 siblings, 0 replies; 13+ messages in thread
From: Muchun Song @ 2021-04-16  5:14 UTC (permalink / raw)
  To: guro, hannes, mhocko, akpm, shakeelb, vdavydov.dev
  Cc: linux-kernel, linux-mm, duanxiongchun, Muchun Song

The obj_cgroup_release() and memcg_reparent_objcgs() are serialized by
the css_set_lock, so we do not need to care about objcg->memcg being
released while obj_cgroup_release() is running. There is therefore no
need to pin the memcg before releasing the objcg; remove that pinning
logic to simplify the code.

There are only two places that modify objcg->memcg: the initialization
in memcg_online_kmem() and the objcg reparenting in
memcg_reparent_objcgs(). The two cannot run in parallel, so xchg() is
unnecessary and WRITE_ONCE() is enough.

Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Roman Gushchin <guro@fb.com>
---
 mm/memcontrol.c | 20 ++++++--------------
 1 file changed, 6 insertions(+), 14 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index caf193088beb..c4eebe2a2914 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -261,7 +261,6 @@ static void obj_cgroup_uncharge_pages(struct obj_cgroup *objcg,
 static void obj_cgroup_release(struct percpu_ref *ref)
 {
 	struct obj_cgroup *objcg = container_of(ref, struct obj_cgroup, refcnt);
-	struct mem_cgroup *memcg;
 	unsigned int nr_bytes;
 	unsigned int nr_pages;
 	unsigned long flags;
@@ -291,11 +290,9 @@ static void obj_cgroup_release(struct percpu_ref *ref)
 	nr_pages = nr_bytes >> PAGE_SHIFT;
 
 	spin_lock_irqsave(&css_set_lock, flags);
-	memcg = obj_cgroup_memcg(objcg);
 	if (nr_pages)
 		obj_cgroup_uncharge_pages(objcg, nr_pages);
 	list_del(&objcg->list);
-	mem_cgroup_put(memcg);
 	spin_unlock_irqrestore(&css_set_lock, flags);
 
 	percpu_ref_exit(ref);
@@ -330,17 +327,12 @@ static void memcg_reparent_objcgs(struct mem_cgroup *memcg,
 
 	spin_lock_irq(&css_set_lock);
 
-	/* Move active objcg to the parent's list */
-	xchg(&objcg->memcg, parent);
-	css_get(&parent->css);
-	list_add(&objcg->list, &parent->objcg_list);
-
-	/* Move already reparented objcgs to the parent's list */
-	list_for_each_entry(iter, &memcg->objcg_list, list) {
-		css_get(&parent->css);
-		xchg(&iter->memcg, parent);
-		css_put(&memcg->css);
-	}
+	/* 1) Ready to reparent active objcg. */
+	list_add(&objcg->list, &memcg->objcg_list);
+	/* 2) Reparent active objcg and already reparented objcgs to parent. */
+	list_for_each_entry(iter, &memcg->objcg_list, list)
+		WRITE_ONCE(iter->memcg, parent);
+	/* 3) Move already reparented objcgs to the parent's list */
 	list_splice(&memcg->objcg_list, &parent->objcg_list);
 
 	spin_unlock_irq(&css_set_lock);
-- 
2.11.0
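
The WRITE_ONCE() only makes sense together with its lockless readers:
objcg->memcg is read without css_set_lock via obj_cgroup_memcg(),
which, if memory serves (see include/linux/memcontrol.h), is
essentially:

static inline struct mem_cgroup *obj_cgroup_memcg(struct obj_cgroup *objcg)
{
	/* Paired with WRITE_ONCE() in memcg_reparent_objcgs() */
	return READ_ONCE(objcg->memcg);
}

So WRITE_ONCE()/READ_ONCE() keeps the pointer accesses tear-free, while
the observation that the two writers never run concurrently is what
makes xchg() unnecessary.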



* [PATCH v2 7/8] mm: memcontrol: move obj_cgroup_uncharge_pages() out of css_set_lock
  2021-04-16  5:13 [PATCH v2 0/8] memcontrol code cleanup and simplification Muchun Song
                   ` (5 preceding siblings ...)
  2021-04-16  5:14 ` [PATCH v2 6/8] mm: memcontrol: simplify the logic of objcg pinning memcg Muchun Song
@ 2021-04-16  5:14 ` Muchun Song
  2021-04-16  5:14 ` [PATCH v2 8/8] mm: vmscan: remove noinline_for_stack Muchun Song
  7 siblings, 0 replies; 13+ messages in thread
From: Muchun Song @ 2021-04-16  5:14 UTC (permalink / raw)
  To: guro, hannes, mhocko, akpm, shakeelb, vdavydov.dev
  Cc: linux-kernel, linux-mm, duanxiongchun, Muchun Song

The css_set_lock is used to guard the list of inherited objcgs. So there
is no need to uncharge kernel memory under css_set_lock. Just move it
out of the lock.

Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Roman Gushchin <guro@fb.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
---
 mm/memcontrol.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index c4eebe2a2914..e0c398fe7443 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -289,9 +289,10 @@ static void obj_cgroup_release(struct percpu_ref *ref)
 	WARN_ON_ONCE(nr_bytes & (PAGE_SIZE - 1));
 	nr_pages = nr_bytes >> PAGE_SHIFT;
 
-	spin_lock_irqsave(&css_set_lock, flags);
 	if (nr_pages)
 		obj_cgroup_uncharge_pages(objcg, nr_pages);
+
+	spin_lock_irqsave(&css_set_lock, flags);
 	list_del(&objcg->list);
 	spin_unlock_irqrestore(&css_set_lock, flags);
 
-- 
2.11.0



* [PATCH v2 8/8] mm: vmscan: remove noinline_for_stack
  2021-04-16  5:13 [PATCH v2 0/8] memcontrol code cleanup and simplification Muchun Song
                   ` (6 preceding siblings ...)
  2021-04-16  5:14 ` [PATCH v2 7/8] mm: memcontrol: move obj_cgroup_uncharge_pages() out of css_set_lock Muchun Song
@ 2021-04-16  5:14 ` Muchun Song
  7 siblings, 0 replies; 13+ messages in thread
From: Muchun Song @ 2021-04-16  5:14 UTC (permalink / raw)
  To: guro, hannes, mhocko, akpm, shakeelb, vdavydov.dev
  Cc: linux-kernel, linux-mm, duanxiongchun, Muchun Song, Michal Hocko

noinline_for_stack was introduced by commit 666356297ec4 ("vmscan:
set up pagevec as late as possible in shrink_inactive_list()"); its
purpose was to delay the allocation of the pagevec as long as possible
to save stack memory. But commit 2bcf88796381 ("mm: take pagevecs off
reclaim stack") replaced the pagevecs with lists of pages_to_free, so
noinline_for_stack is no longer needed. Just remove it and let the
compiler decide whether to inline.

Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Roman Gushchin <guro@fb.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
---
 mm/vmscan.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 2bc5cf409958..2d2727b78df9 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2014,8 +2014,8 @@ static int too_many_isolated(struct pglist_data *pgdat, int file,
  *
  * Returns the number of pages moved to the given lruvec.
  */
-static unsigned noinline_for_stack move_pages_to_lru(struct lruvec *lruvec,
-						     struct list_head *list)
+static unsigned int move_pages_to_lru(struct lruvec *lruvec,
+				      struct list_head *list)
 {
 	int nr_pages, nr_moved = 0;
 	LIST_HEAD(pages_to_free);
@@ -2095,7 +2095,7 @@ static int current_may_throttle(void)
  * shrink_inactive_list() is a helper for shrink_node().  It returns the number
  * of reclaimed pages
  */
-static noinline_for_stack unsigned long
+static unsigned long
 shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
 		     struct scan_control *sc, enum lru_list lru)
 {
-- 
2.11.0
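
For anyone wondering what the annotation actually does:
noinline_for_stack is, from memory (see include/linux/compiler_types.h),
nothing more than a self-documenting alias,

/* Prevent inlining purely to limit stack usage of the caller. */
#define noinline_for_stack noinline

so dropping it simply hands the inlining decision back to the compiler,
as the commit message says.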



* Re: [PATCH v2 5/8] mm: memcontrol: rename lruvec_holds_page_lru_lock to page_matches_lruvec
  2021-04-16  5:14 ` [PATCH v2 5/8] mm: memcontrol: rename lruvec_holds_page_lru_lock to page_matches_lruvec Muchun Song
@ 2021-04-16  6:01   ` Michal Hocko
  2021-04-16 15:20   ` Johannes Weiner
  2021-04-16 15:34   ` Shakeel Butt
  2 siblings, 0 replies; 13+ messages in thread
From: Michal Hocko @ 2021-04-16  6:01 UTC (permalink / raw)
  To: Muchun Song
  Cc: guro, hannes, akpm, shakeelb, vdavydov.dev, linux-kernel,
	linux-mm, duanxiongchun

On Fri 16-04-21 13:14:04, Muchun Song wrote:
> lruvec_holds_page_lru_lock() doesn't check anything about locking and is
> used to check whether the page belongs to the lruvec. So rename it to
> page_matches_lruvec().
> 
> Signed-off-by: Muchun Song <songmuchun@bytedance.com>

Acked-by: Michal Hocko <mhocko@suse.com>

Thanks

> ---
>  include/linux/memcontrol.h | 7 +++----
>  mm/vmscan.c                | 2 +-
>  2 files changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index 2fc728492c9b..40b0c31ea4ba 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -1492,8 +1492,7 @@ static inline void unlock_page_lruvec_irqrestore(struct lruvec *lruvec,
>  	spin_unlock_irqrestore(&lruvec->lru_lock, flags);
>  }
>  
> -static inline bool lruvec_holds_page_lru_lock(struct page *page,
> -					      struct lruvec *lruvec)
> +static inline bool page_matches_lruvec(struct page *page, struct lruvec *lruvec)
>  {
>  	return lruvec_pgdat(lruvec) == page_pgdat(page) &&
>  	       lruvec_memcg(lruvec) == page_memcg(page);
> @@ -1504,7 +1503,7 @@ static inline struct lruvec *relock_page_lruvec_irq(struct page *page,
>  		struct lruvec *locked_lruvec)
>  {
>  	if (locked_lruvec) {
> -		if (lruvec_holds_page_lru_lock(page, locked_lruvec))
> +		if (page_matches_lruvec(page, locked_lruvec))
>  			return locked_lruvec;
>  
>  		unlock_page_lruvec_irq(locked_lruvec);
> @@ -1518,7 +1517,7 @@ static inline struct lruvec *relock_page_lruvec_irqsave(struct page *page,
>  		struct lruvec *locked_lruvec, unsigned long *flags)
>  {
>  	if (locked_lruvec) {
> -		if (lruvec_holds_page_lru_lock(page, locked_lruvec))
> +		if (page_matches_lruvec(page, locked_lruvec))
>  			return locked_lruvec;
>  
>  		unlock_page_lruvec_irqrestore(locked_lruvec, *flags);
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index bb8321026c0c..2bc5cf409958 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -2062,7 +2062,7 @@ static unsigned noinline_for_stack move_pages_to_lru(struct lruvec *lruvec,
>  		 * All pages were isolated from the same lruvec (and isolation
>  		 * inhibits memcg migration).
>  		 */
> -		VM_BUG_ON_PAGE(!lruvec_holds_page_lru_lock(page, lruvec), page);
> +		VM_BUG_ON_PAGE(!page_matches_lruvec(page, lruvec), page);
>  		add_page_to_lru_list(page, lruvec);
>  		nr_pages = thp_nr_pages(page);
>  		nr_moved += nr_pages;
> -- 
> 2.11.0

-- 
Michal Hocko
SUSE Labs


* Re: [PATCH v2 5/8] mm: memcontrol: rename lruvec_holds_page_lru_lock to page_matches_lruvec
  2021-04-16  5:14 ` [PATCH v2 5/8] mm: memcontrol: rename lruvec_holds_page_lru_lock to page_matches_lruvec Muchun Song
  2021-04-16  6:01   ` Michal Hocko
@ 2021-04-16 15:20   ` Johannes Weiner
  2021-04-17  2:52     ` [External] " Muchun Song
  2021-04-16 15:34   ` Shakeel Butt
  2 siblings, 1 reply; 13+ messages in thread
From: Johannes Weiner @ 2021-04-16 15:20 UTC (permalink / raw)
  To: Muchun Song
  Cc: guro, mhocko, akpm, shakeelb, vdavydov.dev, linux-kernel,
	linux-mm, duanxiongchun

On Fri, Apr 16, 2021 at 01:14:04PM +0800, Muchun Song wrote:
> lruvec_holds_page_lru_lock() doesn't check anything about locking and is
> used to check whether the page belongs to the lruvec. So rename it to
> page_matches_lruvec().
> 
> Signed-off-by: Muchun Song <songmuchun@bytedance.com>

The rename makes sense, since the previous name was defined by a
specific use case rather than what it does. That said, it did imply a
lock context that makes the test result stable. Without that the
function could use a short comment, IMO. How about:

/* Test requires a stable page->memcg binding, see page_memcg() */

With that,
Acked-by: Johannes Weiner <hannes@cmpxchg.org>


* Re: [PATCH v2 5/8] mm: memcontrol: rename lruvec_holds_page_lru_lock to page_matches_lruvec
  2021-04-16  5:14 ` [PATCH v2 5/8] mm: memcontrol: rename lruvec_holds_page_lru_lock to page_matches_lruvec Muchun Song
  2021-04-16  6:01   ` Michal Hocko
  2021-04-16 15:20   ` Johannes Weiner
@ 2021-04-16 15:34   ` Shakeel Butt
  2 siblings, 0 replies; 13+ messages in thread
From: Shakeel Butt @ 2021-04-16 15:34 UTC (permalink / raw)
  To: Muchun Song
  Cc: Roman Gushchin, Johannes Weiner, Michal Hocko, Andrew Morton,
	Vladimir Davydov, LKML, Linux MM, Xiongchun duan

On Thu, Apr 15, 2021 at 10:16 PM Muchun Song <songmuchun@bytedance.com> wrote:
>
> lruvec_holds_page_lru_lock() doesn't check anything about locking and is
> used to check whether the page belongs to the lruvec. So rename it to
> page_matches_lruvec().
>
> Signed-off-by: Muchun Song <songmuchun@bytedance.com>

Reviewed-by: Shakeel Butt <shakeelb@google.com>


* Re: [External] Re: [PATCH v2 5/8] mm: memcontrol: rename lruvec_holds_page_lru_lock to page_matches_lruvec
  2021-04-16 15:20   ` Johannes Weiner
@ 2021-04-17  2:52     ` Muchun Song
  0 siblings, 0 replies; 13+ messages in thread
From: Muchun Song @ 2021-04-17  2:52 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Roman Gushchin, Michal Hocko, Andrew Morton, Shakeel Butt,
	Vladimir Davydov, LKML, Linux Memory Management List,
	Xiongchun duan

On Fri, Apr 16, 2021 at 11:20 PM Johannes Weiner <hannes@cmpxchg.org> wrote:
>
> On Fri, Apr 16, 2021 at 01:14:04PM +0800, Muchun Song wrote:
> > lruvec_holds_page_lru_lock() doesn't check anything about locking and is
> > used to check whether the page belongs to the lruvec. So rename it to
> > page_matches_lruvec().
> >
> > Signed-off-by: Muchun Song <songmuchun@bytedance.com>
>
> The rename makes sense, since the previous name was defined by a
> specific use case rather than what it does. That said, it did imply a
> lock context that makes the test result stable. Without that the
> function could use a short comment, IMO. How about:
>
> /* Test requires a stable page->memcg binding, see page_memcg() */

Makes sense. I will add this comment.

>
> With that,
> Acked-by: Johannes Weiner <hannes@cmpxchg.org>

Thanks.
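
Putting the rename and the suggested comment together, the helper would
presumably end up as follows (a sketch combining patch 5 with the
comment proposed above, not the final merged code):

/* Test requires a stable page->memcg binding, see page_memcg() */
static inline bool page_matches_lruvec(struct page *page, struct lruvec *lruvec)
{
	return lruvec_pgdat(lruvec) == page_pgdat(page) &&
	       lruvec_memcg(lruvec) == page_memcg(page);
}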

