mm-commits Archive on lore.kernel.org
From: Andrew Morton <akpm@linux-foundation.org>
To: akpm@linux-foundation.org, dancol@google.com,
	dave.hansen@intel.com, jannh@google.com, joel@joelfernandes.org,
	linux-mm@kvack.org, mhocko@suse.com, minchan@kernel.org,
	mm-commits@vger.kernel.org, stable@vger.kernel.org,
	torvalds@linux-foundation.org, vbabka@suse.cz
Subject: [patch 06/10] mm: do not allow MADV_PAGEOUT for CoW pages
Date: Sat, 21 Mar 2020 18:22:26 -0700
Message-ID: <20200322012226.yY5JKgG__%akpm@linux-foundation.org>
In-Reply-To: <20200321181954.c0564dfd5514cd742b534884@linux-foundation.org>

From: Michal Hocko <mhocko@suse.com>
Subject: mm: do not allow MADV_PAGEOUT for CoW pages

Jann has brought up a very interesting point [1].  While shared pages are
excluded from MADV_PAGEOUT normally, CoW pages can be easily reclaimed
that way.  This can lead to all sorts of hard-to-debug problems, e.g. the
performance problems outlined by Daniel [2].

There are runtime environments where a substantial amount of memory is
shared among security domains via CoW.  An easy way to reclaim that
memory, which MADV_{COLD,PAGEOUT} offers, can lead either to performance
degradation for the parent process, which might be more privileged, or
even to side-channel attacks.

The feasibility of the latter is not really clear to me TBH, but there is
no real reason for the exposure at this stage.  There seems to be no real
use case that depends on reclaiming CoW memory via madvise, so it is much
easier to simply disallow it, and this is what this patch does.  Put
simply, MADV_{PAGEOUT,COLD} can operate only on exclusively owned memory,
which is a straightforward semantic.

[1] http://lkml.kernel.org/r/CAG48ez0G3JkMq61gUmyQAaCq=_TwHbi1XKzWRooxZkv08PQKuw@mail.gmail.com
[2] http://lkml.kernel.org/r/CAKOZueua_v8jHCpmEtTB6f3i9e2YnmX4mqdYVWhV4E=Z-n+zRQ@mail.gmail.com

Link: http://lkml.kernel.org/r/20200312082248.GS23944@dhcp22.suse.cz
Fixes: 9c276cc65a58 ("mm: introduce MADV_COLD")
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: Jann Horn <jannh@google.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Daniel Colascione <dancol@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: "Joel Fernandes (Google)" <joel@joelfernandes.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/madvise.c |   12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)
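
For anyone who wants to poke at the behaviour described in the changelog
from userspace, here is a minimal sketch, not part of the patch.  The
parent populates a private anonymous mapping, a forked child (which still
shares those pages via CoW) calls madvise(MADV_PAGEOUT) on the range, and
the parent compares mincore() residency before and after.  The helper
names cow_pageout_probe() and resident_pages() and the PROBE_MAX_PAGES
cap are made up for this illustration, and the fallback value for
MADV_PAGEOUT is assumed from the asm-generic uapi header.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef MADV_PAGEOUT
#define MADV_PAGEOUT 21	/* assumed asm-generic uapi value, for older libc headers */
#endif

#define PROBE_MAX_PAGES 64	/* arbitrary cap chosen for this sketch */

/* Count the pages of the range that mincore() reports as resident. */
static long resident_pages(void *addr, size_t npages, size_t pagesz)
{
	unsigned char vec[PROBE_MAX_PAGES];
	long count = 0;
	size_t i;

	if (mincore(addr, npages * pagesz, vec) != 0)
		return -1;
	for (i = 0; i < npages; i++)
		count += vec[i] & 1;	/* bit 0 set means resident */
	return count;
}

/*
 * Hypothetical helper, for illustration only.
 *
 * Returns 0 if the child's madvise() call succeeded, 1 if it failed
 * (e.g. EINVAL on kernels without MADV_PAGEOUT), and -1 on setup errors.
 */
static int cow_pageout_probe(size_t npages, long *before, long *after)
{
	long ps = sysconf(_SC_PAGESIZE);
	size_t pagesz, len;
	int status;
	pid_t pid;
	char *buf;

	if (ps <= 0 || npages == 0 || npages > PROBE_MAX_PAGES)
		return -1;
	pagesz = (size_t)ps;
	len = npages * pagesz;

	buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED)
		return -1;

	memset(buf, 0xaa, len);		/* fault every page in */
	*before = resident_pages(buf, npages, pagesz);

	pid = fork();
	if (pid < 0) {
		munmap(buf, len);
		return -1;
	}
	if (pid == 0) {
		/* Child: the pages are still mapped by the parent as well. */
		_exit(madvise(buf, len, MADV_PAGEOUT) == 0 ? 0 : 1);
	}
	if (waitpid(pid, &status, 0) != pid) {
		munmap(buf, len);
		return -1;
	}

	*after = resident_pages(buf, npages, pagesz);
	munmap(buf, len);
	if (*before < 0 || *after < 0)
		return -1;
	return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : 1;
}
```

Call it as cow_pageout_probe(16, &before, &after) from a small main() and
print both counts.  With this patch applied the expectation is that the
child's request leaves the parent's pages resident, but mincore() is only
a rough indicator, and on a machine without swap the pages stay resident
either way, so treat the numbers as a hint rather than proof.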

--- a/mm/madvise.c~mm-do-not-allow-madv_pageout-for-cow-pages
+++ a/mm/madvise.c
@@ -335,12 +335,14 @@ static int madvise_cold_or_pageout_pte_r
 		}
 
 		page = pmd_page(orig_pmd);
+
+		/* Do not interfere with other mappings of this page */
+		if (page_mapcount(page) != 1)
+			goto huge_unlock;
+
 		if (next - addr != HPAGE_PMD_SIZE) {
 			int err;
 
-			if (page_mapcount(page) != 1)
-				goto huge_unlock;

