Linux-mm Archive on lore.kernel.org
From: Minchan Kim <minchan@kernel.org>
To: Michal Hocko <mhocko@kernel.org>
Cc: Jann Horn <jannh@google.com>, Linux-MM <linux-mm@kvack.org>,
	kernel list <linux-kernel@vger.kernel.org>,
	Daniel Colascione <dancol@google.com>,
	Dave Hansen <dave.hansen@intel.com>,
	"Joel Fernandes (Google)" <joel@joelfernandes.org>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: interaction of MADV_PAGEOUT with CoW anonymous mappings?
Date: Thu, 12 Mar 2020 13:16:02 -0700
Message-ID: <20200312201602.GA68817@google.com>
In-Reply-To: <20200312082248.GS23944@dhcp22.suse.cz>

On Thu, Mar 12, 2020 at 09:22:48AM +0100, Michal Hocko wrote:
> [Cc akpm]
> 
> So what about this?

Thanks, Michal.

I'd like to wait for Jann's reply, since Dave gave his opinion about the vulnerability:
https://lore.kernel.org/linux-mm/cf95db88-968d-fee5-1c15-10d024c09d8a@intel.com/
Jann, could you give your insight on whether it's practically possible?

A really dumb question, to understand the vulnerability:

The attacker would be able to trigger heavy memory consumption so that the
pages get paged out even without MADV_PAGEOUT. I know MADV_PAGEOUT makes
it easier, but he could still do it without MADV_PAGEOUT.
What makes the difference here?

To clarify how MADV_PAGEOUT works:
If another process has accessed the page, so that its page table entry has
the accessed bit set, MADV_PAGEOUT can't page it out.
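
For reference, the scenario under discussion can be sketched from user space
roughly as follows. This is only an illustration, not code from the thread:
the helper name child_pageout_cow() is hypothetical, and the fallback value
for MADV_PAGEOUT assumes the Linux 5.4+ uapi headers. A child created by
fork() shares the parent's private anonymous pages copy-on-write and then
asks the kernel to page them out.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* for madvise() */
#endif
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef MADV_PAGEOUT
#define MADV_PAGEOUT 21 /* assumed value from the Linux uapi headers (5.4+) */
#endif

/*
 * Hypothetical helper: fork a child that asks the kernel to page out
 * [addr, addr + len), a range it shares copy-on-write with the parent.
 *
 * Returns 0 if the child's madvise() succeeded, the child's errno if it
 * failed (e.g. EINVAL on kernels without MADV_PAGEOUT), or -1 if
 * fork() or waitpid() failed.
 */
int child_pageout_cow(void *addr, size_t len)
{
	pid_t pid = fork();

	if (pid < 0)
		return -1;

	if (pid == 0) {
		/* Child: the pages are still shared with the parent here. */
		int rc = madvise(addr, len, MADV_PAGEOUT);

		_exit(rc == 0 ? 0 : (errno & 0xff));
	}

	int status;

	if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
		return -1;

	return WEXITSTATUS(status);
}
```

A caller would mmap() a MAP_PRIVATE | MAP_ANONYMOUS region, write to it, and
pass it to the helper. A return value of 0 only means the advice was accepted;
it does not say whether any page was actually reclaimed, since pages that the
kernel decides to skip are skipped silently.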

> 
> From eca97990372679c097a88164ff4b3d7879b0e127 Mon Sep 17 00:00:00 2001
> From: Michal Hocko <mhocko@suse.com>
> Date: Thu, 12 Mar 2020 09:04:35 +0100
> Subject: [PATCH] mm: do not allow MADV_PAGEOUT for CoW pages
> 
> Jann has brought up a very interesting point [1]. While shared pages are
> excluded from MADV_PAGEOUT normally, CoW pages can be easily reclaimed
> that way. This can lead to all sorts of hard to debug problems. E.g.
> performance problems outlined by Daniel [2]. There are runtime
> environments where there is a substantial memory shared among security
> domains via CoW memory and an easy way to reclaim that memory, which
> MADV_{COLD,PAGEOUT} offers, can lead to either performance degradation
> for the parent process which might be more privileged or even open
> side channel attacks. The feasibility of the latter is not really clear

I am not sure it's a good idea to mention the performance aspect because
it's rather arguable. You and Johannes already pointed that out when I submitted
an early draft that had logic to filter out shared pages for performance
reasons. You suggested that shared pages have a higher chance of being touched,
so if they are really hot pages, they would stay in memory. I agree.

I think the only reason at this moment is the vulnerability.

> to me TBH but there is no real reason for exposure at this stage. It
> seems there is no real use case to depend on reclaiming CoW memory via
> madvise at this stage so it is much easier to simply disallow it and
> this is what this patch does. Put it simply MADV_{PAGEOUT,COLD} can
> operate only on the exclusively owned memory which is a straightforward
> semantic.
> 
> [1] http://lkml.kernel.org/r/CAG48ez0G3JkMq61gUmyQAaCq=_TwHbi1XKzWRooxZkv08PQKuw@mail.gmail.com
> [2] http://lkml.kernel.org/r/CAKOZueua_v8jHCpmEtTB6f3i9e2YnmX4mqdYVWhV4E=Z-n+zRQ@mail.gmail.com
> 
> Signed-off-by: Michal Hocko <mhocko@suse.com>
> ---
>  mm/madvise.c | 12 +++++++++---
>  1 file changed, 9 insertions(+), 3 deletions(-)
> 
> diff --git a/mm/madvise.c b/mm/madvise.c
> index 43b47d3fae02..4bb30ed6c8d2 100644
> --- a/mm/madvise.c
> +++ b/mm/madvise.c
> @@ -335,12 +335,14 @@ static int madvise_cold_or_pageout_pte_range(pmd_t *pmd,
>  		}
>  
>  		page = pmd_page(orig_pmd);
> +
> +		/* Do not interfere with other mappings of this page */


How about this?
/*
 * Page out only singly mapped private pages of anonymous mappings;
 * otherwise, it opens a side channel.
 */

Otherwise, looks good to me.
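
As a side note, the "mapped exactly once" condition can be observed from
user space. The sketch below is an illustration under stated assumptions, not
part of the patch: the helper name page_exclusively_mapped() is hypothetical,
and it relies on /proc/self/pagemap reporting bit 63 (page present) and
bit 56 (page exclusively mapped, since Linux 4.2) as described in the
kernel's pagemap documentation.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* for pread() */
#endif
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Bit positions as documented for /proc/<pid>/pagemap entries. */
#define PM_MMAP_EXCLUSIVE (1ULL << 56) /* page exclusively mapped */
#define PM_PRESENT        (1ULL << 63) /* page present in RAM */

/*
 * Hypothetical helper: report whether the page backing addr is mapped
 * exactly once, according to /proc/self/pagemap.
 *
 * Returns 1 if the page is present and exclusively mapped, 0 if it is
 * present but mapped more than once, and -1 if the page is not present
 * or pagemap could not be read.
 */
int page_exclusively_mapped(const void *addr)
{
	long psz = sysconf(_SC_PAGESIZE);
	uint64_t entry;
	int fd;
	ssize_t n;
	off_t off;

	if (psz <= 0)
		return -1;

	/* One 64-bit entry per virtual page. */
	off = (off_t)((uintptr_t)addr / (uintptr_t)psz) * (off_t)sizeof(entry);

	fd = open("/proc/self/pagemap", O_RDONLY);
	if (fd < 0)
		return -1;

	n = pread(fd, &entry, sizeof(entry), off);
	close(fd);

	if (n != (ssize_t)sizeof(entry) || !(entry & PM_PRESENT))
		return -1;

	return (entry & PM_MMAP_EXCLUSIVE) ? 1 : 0;
}
```

Calling this on a touched private anonymous page before and after fork()
should show the exclusive bit going away while the child still shares the
page. On 32-bit builds the offset computation assumes a 64-bit off_t
(for example via _FILE_OFFSET_BITS=64).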

> +		if (page_mapcount(page) != 1)
> +			goto huge_unlock;
> +
>  		if (next - addr != HPAGE_PMD_SIZE) {
>  			int err;
>  
> -			if (page_mapcount(page) != 1)
> -				goto huge_unlock;
> -
>  			get_page(page);
>  			spin_unlock(ptl);
>  			lock_page(page);
> @@ -426,6 +428,10 @@ static int madvise_cold_or_pageout_pte_range(pmd_t *pmd,
>  			continue;
>  		}
>  
> +		/* Do not interfere with other mappings of this page */
> +		if (page_mapcount(page) != 1)
> +			continue;
> +
>  		VM_BUG_ON_PAGE(PageTransCompound(page), page);
>  
>  		if (pte_young(ptent)) {
> -- 
> 2.24.1
> 
> -- 
> Michal Hocko
> SUSE Labs


