mm-commits.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Andrew Morton <akpm@linux-foundation.org>
To: alex.shi@linux.alibaba.com, guro@fb.com, hannes@cmpxchg.org,
	hughd@google.com, iamjoonsoo.kim@lge.com, kirill@shutemov.name,
	mhocko@suse.com, mm-commits@vger.kernel.org, shakeelb@google.com
Subject: + mm-memcontrol-fix-stat-corrupting-race-in-charge-moving.patch added to -mm tree
Date: Fri, 08 May 2020 16:19:24 -0700	[thread overview]
Message-ID: <20200508231924.U1LDWB55P%akpm@linux-foundation.org> (raw)
In-Reply-To: <20200507183509.c5ef146c5aaeb118a25a39a8@linux-foundation.org>


The patch titled
     Subject: mm: memcontrol: fix stat-corrupting race in charge moving
has been added to the -mm tree.  Its filename is
     mm-memcontrol-fix-stat-corrupting-race-in-charge-moving.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-memcontrol-fix-stat-corrupting-race-in-charge-moving.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-memcontrol-fix-stat-corrupting-race-in-charge-moving.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Johannes Weiner <hannes@cmpxchg.org>
Subject: mm: memcontrol: fix stat-corrupting race in charge moving

The move_lock is a per-memcg lock, but the VM accounting code that needs
to acquire it comes from the page and follows page->mem_cgroup under RCU
protection.  That means that the page becomes unlocked not when we drop
the move_lock, but when we update page->mem_cgroup.  And that assignment
doesn't imply any memory ordering.  If that pointer write gets reordered
against the reads of the page state - page_mapped, PageDirty etc.  the
state may change while we rely on it being stable and we can end up
corrupting the counters.

Place an SMP memory barrier to make sure we're done with all page state by
the time the new page->mem_cgroup becomes visible.

Also replace the open-coded move_lock with a lock_page_memcg() to make it
more obvious what we're serializing against.

Link: http://lkml.kernel.org/r/20200508183105.225460-3-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Cc: Alex Shi <alex.shi@linux.alibaba.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Roman Gushchin <guro@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/memcontrol.c |   26 ++++++++++++++------------
 1 file changed, 14 insertions(+), 12 deletions(-)

--- a/mm/memcontrol.c~mm-memcontrol-fix-stat-corrupting-race-in-charge-moving
+++ a/mm/memcontrol.c
@@ -5378,7 +5378,6 @@ static int mem_cgroup_move_account(struc
 {
 	struct lruvec *from_vec, *to_vec;
 	struct pglist_data *pgdat;
-	unsigned long flags;
 	unsigned int nr_pages = compound ? hpage_nr_pages(page) : 1;
 	int ret;
 	bool anon;
@@ -5405,18 +5404,13 @@ static int mem_cgroup_move_account(struc
 	from_vec = mem_cgroup_lruvec(from, pgdat);
 	to_vec = mem_cgroup_lruvec(to, pgdat);
 
-	spin_lock_irqsave(&from->move_lock, flags);
+	lock_page_memcg(page);
 
 	if (!anon && page_mapped(page)) {
 		__mod_lruvec_state(from_vec, NR_FILE_MAPPED, -nr_pages);
 		__mod_lruvec_state(to_vec, NR_FILE_MAPPED, nr_pages);
 	}
 
-	/*
-	 * move_lock grabbed above and caller set from->moving_account, so
-	 * mod_memcg_page_state will serialize updates to PageDirty.
-	 * So mapping should be stable for dirty pages.
-	 */
 	if (!anon && PageDirty(page)) {
 		struct address_space *mapping = page_mapping(page);
 
@@ -5432,15 +5426,23 @@ static int mem_cgroup_move_account(struc
 	}
 
 	/*
+	 * All state has been migrated, let's switch to the new memcg.
+	 *
 	 * It is safe to change page->mem_cgroup here because the page
-	 * is referenced, charged, and isolated - we can't race with
-	 * uncharging, charging, migration, or LRU putback.
+	 * is referenced, charged, isolated, and locked: we can't race
+	 * with (un)charging, migration, LRU putback, or anything else
+	 * that would rely on a stable page->mem_cgroup.
+	 *
+	 * Note that lock_page_memcg is a memcg lock, not a page lock,
+	 * to save space. As soon as we switch page->mem_cgroup to a
+	 * new memcg that isn't locked, the above state can change
+	 * concurrently again. Make sure we're truly done with it.
 	 */
+	smp_mb();
 
-	/* caller should have done css_get */
-	page->mem_cgroup = to;
+	page->mem_cgroup = to; 	/* caller should have done css_get */
 
-	spin_unlock_irqrestore(&from->move_lock, flags);
+	__unlock_page_memcg(from);
 
 	ret = 0;
 
_

Patches currently in -mm which might be from hannes@cmpxchg.org are

mm-fix-numa-node-file-count-error-in-replace_page_cache.patch
mm-memcontrol-fix-stat-corrupting-race-in-charge-moving.patch
mm-memcontrol-drop-compound-parameter-from-memcg-charging-api.patch
mm-memcontrol-move-out-cgroup-swaprate-throttling.patch
mm-memcontrol-convert-page-cache-to-a-new-mem_cgroup_charge-api.patch
mm-memcontrol-prepare-uncharging-for-removal-of-private-page-type-counters.patch
mm-memcontrol-prepare-move_account-for-removal-of-private-page-type-counters.patch
mm-memcontrol-prepare-cgroup-vmstat-infrastructure-for-native-anon-counters.patch
mm-memcontrol-switch-to-native-nr_file_pages-and-nr_shmem-counters.patch
mm-memcontrol-switch-to-native-nr_anon_mapped-counter.patch
mm-memcontrol-switch-to-native-nr_anon_thps-counter.patch
mm-memcontrol-convert-anon-and-file-thp-to-new-mem_cgroup_charge-api.patch
mm-memcontrol-drop-unused-try-commit-cancel-charge-api.patch
mm-memcontrol-prepare-swap-controller-setup-for-integration.patch
mm-memcontrol-make-swap-tracking-an-integral-part-of-memory-control.patch
mm-memcontrol-charge-swapin-pages-on-instantiation.patch
mm-memcontrol-delete-unused-lrucare-handling.patch
mm-memcontrol-update-page-mem_cgroup-stability-rules.patch

  parent reply	other threads:[~2020-05-08 23:19 UTC|newest]

Thread overview: 82+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-08  1:35 incoming Andrew Morton
2020-05-08  1:35 ` [patch 01/15] ipc/mqueue.c: change __do_notify() to bypass check_kill_permission() Andrew Morton
2020-05-08  1:35 ` [patch 02/15] mm, memcg: fix error return value of mem_cgroup_css_alloc() Andrew Morton
2020-05-08  1:35 ` [patch 03/15] mm/page_alloc: fix watchdog soft lockups during set_zone_contiguous() Andrew Morton
2020-05-08  1:35 ` [patch 04/15] kernel/kcov.c: fix typos in kcov_remote_start documentation Andrew Morton
2020-05-08  1:35 ` [patch 05/15] scripts/decodecode: fix trapping instruction formatting Andrew Morton
2020-05-08  1:35 ` [patch 06/15] arch/x86/kvm/svm/sev.c: change flag passed to GUP fast in sev_pin_memory() Andrew Morton
2020-05-08  1:35 ` [patch 07/15] eventpoll: fix missing wakeup for ovflist in ep_poll_callback Andrew Morton
2020-05-08  1:36 ` [patch 08/15] scripts/gdb: repair rb_first() and rb_last() Andrew Morton
2020-05-08  1:36 ` [patch 09/15] mm/slub: fix incorrect interpretation of s->offset Andrew Morton
2020-05-08  1:36 ` [patch 10/15] percpu: make pcpu_alloc() aware of current gfp context Andrew Morton
2020-05-08  1:36 ` [patch 11/15] kselftests: introduce new epoll60 testcase for catching lost wakeups Andrew Morton
2020-05-08  1:36 ` [patch 12/15] epoll: atomically remove wait entry on wake up Andrew Morton
2020-05-08  1:36 ` [patch 13/15] mm/vmscan: remove unnecessary argument description of isolate_lru_pages() Andrew Morton
2020-05-08  1:36 ` [patch 14/15] ubsan: disable UBSAN_ALIGNMENT under COMPILE_TEST Andrew Morton
2020-05-08  1:36 ` [patch 15/15] mm: limit boost_watermark on small zones Andrew Morton
2020-05-08  2:08 ` + arch-kunmap-remove-duplicate-kunmap-implementations-fix.patch added to -mm tree Andrew Morton
2020-05-08 20:46 ` + ipc-utilc-sysvipc_find_ipc-incorrectly-updates-position-index-fix.patch " Andrew Morton
2020-05-08 20:50 ` + mm-introduce-external-memory-hinting-api-fix-2.patch " Andrew Morton
2020-05-08 21:32 ` + checkpatch-use-patch-subject-when-reading-from-stdin-fix.patch " Andrew Morton
2020-05-08 23:19 ` + mm-fix-numa-node-file-count-error-in-replace_page_cache.patch " Andrew Morton
2020-05-08 23:19 ` Andrew Morton [this message]
2020-05-08 23:19 ` + mm-memcontrol-drop-compound-parameter-from-memcg-charging-api.patch " Andrew Morton
2020-05-08 23:19 ` + mm-memcontrol-move-out-cgroup-swaprate-throttling.patch " Andrew Morton
2020-05-08 23:19 ` + mm-memcontrol-convert-page-cache-to-a-new-mem_cgroup_charge-api.patch " Andrew Morton
2020-05-08 23:19 ` + mm-memcontrol-prepare-uncharging-for-removal-of-private-page-type-counters.patch " Andrew Morton
2020-05-08 23:19 ` + mm-memcontrol-prepare-move_account-for-removal-of-private-page-type-counters.patch " Andrew Morton
2020-05-08 23:19 ` + mm-memcontrol-prepare-cgroup-vmstat-infrastructure-for-native-anon-counters.patch " Andrew Morton
2020-05-08 23:19 ` + mm-memcontrol-switch-to-native-nr_file_pages-and-nr_shmem-counters.patch " Andrew Morton
2020-05-08 23:19 ` + mm-memcontrol-switch-to-native-nr_anon_mapped-counter.patch " Andrew Morton
2020-05-08 23:19 ` + mm-memcontrol-switch-to-native-nr_anon_thps-counter.patch " Andrew Morton
2020-05-08 23:20 ` + mm-memcontrol-convert-anon-and-file-thp-to-new-mem_cgroup_charge-api.patch " Andrew Morton
2020-05-08 23:20 ` + mm-memcontrol-drop-unused-try-commit-cancel-charge-api.patch " Andrew Morton
2020-05-08 23:20 ` + mm-memcontrol-prepare-swap-controller-setup-for-integration.patch " Andrew Morton
2020-05-08 23:20 ` + mm-memcontrol-make-swap-tracking-an-integral-part-of-memory-control.patch " Andrew Morton
2020-05-08 23:20 ` + mm-memcontrol-charge-swapin-pages-on-instantiation.patch " Andrew Morton
2020-05-08 23:20 ` + mm-memcontrol-document-the-new-swap-control-behavior.patch " Andrew Morton
2020-05-08 23:20 ` + mm-memcontrol-delete-unused-lrucare-handling.patch " Andrew Morton
2020-05-08 23:20 ` + mm-memcontrol-update-page-mem_cgroup-stability-rules.patch " Andrew Morton
2020-05-08 23:38 ` + selftests-vm-pkeys-introduce-powerpc-support-fix.patch " Andrew Morton
2020-05-08 23:38 ` + selftests-vm-pkeys-override-access-right-definitions-on-powerpc-fix.patch " Andrew Morton
2020-05-08 23:40 ` + memcg-expose-root-cgroups-memorystat.patch " Andrew Morton
2020-05-08 23:43 ` + doc-cgroup-update-note-about-conditions-when-oom-killer-is-invoked.patch " Andrew Morton
2020-05-08 23:43 ` + doc-cgroup-update-note-about-conditions-when-oom-killer-is-invoked-fix.patch " Andrew Morton
2020-05-08 23:48 ` + zcomp-use-array_size-for-backends-list.patch " Andrew Morton
2020-05-08 23:53 ` + device-dax-dont-leak-kernel-memory-to-user-space-after-unloading-kmem.patch " Andrew Morton
2020-05-08 23:56 ` + mm-memory_hotplug-introduce-add_memory_driver_managed.patch " Andrew Morton
2020-05-08 23:56 ` + kexec_file-dont-place-kexec-images-on-ioresource_mem_driver_managed.patch " Andrew Morton
2020-05-08 23:56 ` + device-dax-add-memory-via-add_memory_driver_managed.patch " Andrew Morton
2020-05-09  0:45 ` + mm-memory_hotplug-allow-arch-override-of-non-boot-memory-resource-names-fix.patch " Andrew Morton
2020-05-11 19:08 ` + mm-reset-numa-stats-for-boot-pagesets-v3.patch " Andrew Morton
2020-05-11 19:54 ` + mm-shmem-remove-rare-optimization-when-swapin-races-with-hole-punching.patch " Andrew Morton
2020-05-11 20:31 ` [to-be-updated] kexec-prevent-removal-of-memory-in-use-by-a-loaded-kexec-image.patch removed from " Andrew Morton
2020-05-11 20:31 ` [to-be-updated] mm-memory_hotplug-allow-arch-override-of-non-boot-memory-resource-names.patch " Andrew Morton
2020-05-11 20:31 ` [to-be-updated] mm-memory_hotplug-allow-arch-override-of-non-boot-memory-resource-names-fix.patch " Andrew Morton
2020-05-11 20:31 ` [to-be-updated] arm64-memory-give-hotplug-memory-a-different-resource-name.patch " Andrew Morton
2020-05-11 20:38 ` + arm-add-support-for-folded-p4d-page-tables-fix.patch added to " Andrew Morton
2020-05-11 20:40 ` + mm-add-debug_wx-support-fix-2.patch " Andrew Morton
2020-05-11 20:41 ` + mm-add-debug_wx-support-fix-3.patch " Andrew Morton
2020-05-11 20:50 ` + arm64-mm-drop-__have_arch_huge_ptep_get.patch " Andrew Morton
2020-05-11 20:50 ` + mm-hugetlb-define-a-generic-fallback-for-is_hugepage_only_range.patch " Andrew Morton
2020-05-11 20:50 ` + mm-hugetlb-define-a-generic-fallback-for-arch_clear_hugepage_flags.patch " Andrew Morton
2020-05-11 21:00 ` + mm-introduce-external-memory-hinting-api-fix-2-fix.patch " Andrew Morton
2020-05-11 21:02 ` + mm-support-vector-address-ranges-for-process_madvise-fix-fix-fix-fix-fix.patch " Andrew Morton
2020-05-11 21:12 ` [alternative-merged] mm-vmscan-consistent-update-to-pgsteal-and-pgscan.patch removed from " Andrew Morton
2020-05-11 21:21 ` + seq_file-introduce-define_seq_attribute-helper-macro.patch added to " Andrew Morton
2020-05-11 21:21 ` + mm-vmstat-convert-to-use-define_seq_attribute-macro.patch " Andrew Morton
2020-05-11 21:21 ` + kernel-kprobes-convert-to-use-define_seq_attribute-macro.patch " Andrew Morton
2020-05-11 21:22 ` + seq_file-introduce-define_seq_attribute-helper-macro-checkpatch-fixes.patch " Andrew Morton
2020-05-11 21:23 ` + lib-flex_proportionsc-cleanup-__fprop_inc_percpu_max.patch " Andrew Morton
2020-05-11 21:31 ` [alternative-merged] lib-flex_proportionsc-aging-counts-when-fraction-smaller-than-max_frac-fprop_frac_base.patch removed from " Andrew Morton
2020-05-11 21:44 ` + xtensa-add-loglvl-to-show_trace-fix.patch added to " Andrew Morton
2020-05-11 22:44 ` mmotm 2020-05-11-15-43 uploaded Andrew Morton
2020-05-12 21:03 ` + kasan-consistently-disable-debugging-features.patch added to -mm tree Andrew Morton
2020-05-12 21:03 ` + kasan-add-missing-functions-declarations-to-kasanh.patch " Andrew Morton
2020-05-12 21:04 ` + kasan-move-kasan_report-into-reportc.patch " Andrew Morton
2020-05-13 18:31 ` + mm-compaction-avoid-vm_bug_onpageslab-in-page_mapcount.patch " Andrew Morton
2020-05-13 20:56 ` + kernel-sysctl-ignore-out-of-range-taint-bits-introduced-via-kerneltainted.patch " Andrew Morton
2020-05-13 21:26 ` + vfs-keep-inodes-with-page-cache-off-the-inode-shrinker-lru.patch " Andrew Morton
2020-05-13 22:00 ` + mm-memcontrol-convert-anon-and-file-thp-to-new-mem_cgroup_charge-api-fix.patch " Andrew Morton
2020-05-13 22:48 ` + mm-swap-use-prandom_u32_max.patch " Andrew Morton
2020-05-13 22:54 ` + mm-memcontrol-switch-to-native-nr_anon_thps-counter-fix.patch " Andrew Morton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200508231924.U1LDWB55P%akpm@linux-foundation.org \
    --to=akpm@linux-foundation.org \
    --cc=alex.shi@linux.alibaba.com \
    --cc=guro@fb.com \
    --cc=hannes@cmpxchg.org \
    --cc=hughd@google.com \
    --cc=iamjoonsoo.kim@lge.com \
    --cc=kirill@shutemov.name \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mhocko@suse.com \
    --cc=mm-commits@vger.kernel.org \
    --cc=shakeelb@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).