Linux-mm Archive on lore.kernel.org
From: Minchan Kim <minchan@kernel.org>
To: Michal Hocko <mhocko@kernel.org>
Cc: Jann Horn <jannh@google.com>, Linux-MM <linux-mm@kvack.org>,
	kernel list <linux-kernel@vger.kernel.org>,
	Daniel Colascione <dancol@google.com>,
	Dave Hansen <dave.hansen@intel.com>,
	"Joel Fernandes (Google)" <joel@joelfernandes.org>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: interaction of MADV_PAGEOUT with CoW anonymous mappings?
Date: Mon, 16 Mar 2020 18:43:40 -0700
Message-ID: <20200317014340.GA73302@google.com>
In-Reply-To: <20200316092052.GD11482@dhcp22.suse.cz>

On Mon, Mar 16, 2020 at 10:20:52AM +0100, Michal Hocko wrote:
> On Fri 13-03-20 13:59:41, Minchan Kim wrote:
> > On Fri, Mar 13, 2020 at 09:05:46AM +0100, Michal Hocko wrote:
> > > On Thu 12-03-20 19:08:51, Minchan Kim wrote:
> > > > On Thu, Mar 12, 2020 at 09:41:55PM +0100, Michal Hocko wrote:
> > > > > On Thu 12-03-20 13:16:02, Minchan Kim wrote:
> > > > > > On Thu, Mar 12, 2020 at 09:22:48AM +0100, Michal Hocko wrote:
> > > > > [...]
> > > > > > > From eca97990372679c097a88164ff4b3d7879b0e127 Mon Sep 17 00:00:00 2001
> > > > > > > From: Michal Hocko <mhocko@suse.com>
> > > > > > > Date: Thu, 12 Mar 2020 09:04:35 +0100
> > > > > > > Subject: [PATCH] mm: do not allow MADV_PAGEOUT for CoW pages
> > > > > > > 
> > > > > > > Jann has brought up a very interesting point [1]. While shared pages are
> > > > > > > excluded from MADV_PAGEOUT normally, CoW pages can be easily reclaimed
> > > > > > > that way. This can lead to all sorts of hard to debug problems. E.g.
> > > > > > > performance problems outlined by Daniel [2]. There are runtime
> > > > > > > environments where there is a substantial memory shared among security
> > > > > > > domains via CoW memory and an easy way to reclaim that memory, which
> > > > > > > MADV_{COLD,PAGEOUT} offers, can lead to either performance degradation
> > > > > > > for the parent process, which might be more privileged, or even open
> > > > > > > side channel attacks. The feasibility of the latter is not really clear
> > > > > > 
> > > > > > I am not sure it's a good idea to mention performance because
> > > > > > it's rather arguable. You and Johannes already pointed that out when I submitted
> > > > > > an early draft which filtered out shared pages for performance
> > > > > > reasons. You suggested that shared pages have a higher chance of being touched,
> > > > > > so if they are really hot pages, they would stay in memory. I agree.
> > > > > 
> > > > > Yes, the hot memory is likely to be referenced but the point was an
> > > > > unexpected latency because of the major fault. I have to say that I have
> > > > 
> > > > I don't understand your point here. If it's likely to be referenced
> > > > among several processes, it doesn't have the major fault latency.
> > > > What's your point here?
> > > 
> > > a) the particular CoW page might be cold enough to be reclaimed and b)
> > 
> > If it is, that means it's *cold*, so it's really worth reclaiming.
> > 
> > > nothing really prevents MADV_PAGEOUT from being called faster than the
> > > reference bit is set again.
> > 
> > Yes, that's undesirable. I should admit it was not intended when I implemented
> > PAGEOUT. The thing is, page_check_references clears the pte access bit for every
> > process sharing the page, so two MADV_PAGEOUT calls from one process could
> > evict the page. That's the real bug.
> 
> I do not really think this is a bug. This is a side effect of the
> reclaim process and we do not really want MADV_{PAGEOUT,COLD} to behave

No, that's a bug, since we didn't consider the side effect.

> differently here because then the behavior would be even harder to

No, I do want them to differ, because this is a per-process hint. IOW,
what the caller knows applies only to its own context, not to others', so it
shouldn't clear others' ptes. That is the difference between LRU aging and the hint.

> understand.

It's not hard to understand. MADV_PAGEOUT should consider only the caller's
context, since it's a per-process hint (the caller cannot even know others'
context), so it shouldn't affect others.
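
To make the side effect concrete, here is a minimal userspace sketch in C.
It is only a toy model, not kernel code: struct toy_page and the toy_*
helpers are hypothetical names, and the real reclaim path is far more
involved. It assumes a reference check that clears the accessed bit in
every sharer's pte, as described above for page_check_references.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_SHARERS 4

/* Toy model: one physical page plus the accessed bit of each sharer's pte. */
struct toy_page {
	size_t nr_sharers;
	bool accessed[MAX_SHARERS];	/* per-process pte accessed bit */
	bool resident;
};

/*
 * Hypothetical stand-in for the reference check: walk every mapping,
 * clear its accessed bit, and report whether any bit was set.
 */
static bool toy_check_references(struct toy_page *page)
{
	bool referenced = false;

	for (size_t i = 0; i < page->nr_sharers; i++) {
		if (page->accessed[i])
			referenced = true;
		page->accessed[i] = false;	/* clears other processes' bits too */
	}
	return referenced;
}

/* One pageout hint: evict only if the reference check found nothing. */
static bool toy_madv_pageout(struct toy_page *page)
{
	if (!page->resident)
		return false;
	if (toy_check_references(page))
		return false;	/* page kept, but every accessed bit is now clear */
	page->resident = false;
	return true;
}
```

In this model, with two sharers and only the other process having touched
the page, the first toy_madv_pageout() call keeps the page but clears the
other process's accessed bit. A second call issued before that process
touches the page again evicts it.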

Actually, Dave's suggestion would be the correct fix if there
were no side channel attack issue. However, given the attack
issue, a page_mapcount check prevents the problem effectively.
That's why I am no longer against the patch: it fixes
the bug as well as the vulnerability.
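
The effect of a mapcount check can be sketched the same way. This is again
a toy userspace model under stated assumptions, not the patch itself:
struct cow_page and both helpers are hypothetical, and the only idea taken
from this thread is that a page mapped by more than one process is skipped.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a page: how many ptes map it and whether it is in memory. */
struct cow_page {
	int mapcount;
	bool resident;
};

/* Hypothetical filter: only a page mapped exactly once may be paged out. */
static bool cow_pageout_allowed(const struct cow_page *page)
{
	return page->mapcount == 1;
}

/* One pageout hint with the filter applied; returns true if evicted. */
static bool cow_madv_pageout(struct cow_page *page)
{
	if (!page->resident || !cow_pageout_allowed(page))
		return false;	/* shared (CoW) pages are left alone */
	page->resident = false;
	return true;
}
```

With this filter a page still shared between parent and child is never
touched by the hint, so the caller can neither evict it nor disturb the
other sharers' accessed bits. A page that has become private is reclaimed
as before.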



