linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Hugh Dickins <hughd@google.com>
To: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Hugh Dickins <hughd@google.com>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Andrea Arcangeli <aarcange@redhat.com>,
	David Rientjes <rientjes@google.com>,
	Dave Hansen <dave.hansen@intel.com>, Mel Gorman <mgorman@suse.de>,
	Rik van Riel <riel@redhat.com>, Vlastimil Babka <vbabka@suse.cz>,
	Christoph Lameter <cl@gentwo.org>,
	Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
	Steve Capper <steve.capper@linaro.org>,
	"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Michal Hocko <mhocko@suse.cz>,
	Jerome Marchand <jmarchan@redhat.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: page-flags behavior on compound pages: a worry
Date: Thu, 6 Aug 2015 12:24:22 -0700 (PDT)	[thread overview]
Message-ID: <alpine.LSU.2.11.1508061121120.7500@eggly.anvils> (raw)
In-Reply-To: <20150806153259.GA2834@node.dhcp.inet.fi>

On Thu, 6 Aug 2015, Kirill A. Shutemov wrote:
> On Wed, Aug 05, 2015 at 09:15:57PM -0700, Hugh Dickins wrote:
> > Hi Kirill,
> > 
> > I had a nasty thought this morning.
>  
> Tough day.
> 
> I'm trying to wrap my head around this mail and not sure if I succeed
> much. :-|

Sorry for not being clearer.

> > To be more specific: if preemption, or an interrupt, or entry to SMM
> > mode, or whatever, delays this thread somewhere in that compound_head()
> > sequence of instructions, how can we be sure that the "head" returned
> > by compound_head() is good?  We know the page was PageTail just before
> > looking up page->first_page, and we know it was PageTail just after,
> > but we don't know that it was PageTail throughout, and we don't know
> > whether page->first_page is even a good page pointer, or something
> > else from the private/ptl/slab_cache union.
> 
> That looks like a very valid worry to me. For current -mm tree.
> 
> But let's take my refcounting rework into picture.

Okay, let's do so.  I get very confused trying to think based on two
alternative schemes at the same time, so I'm happy to assume your
THP refcounting rework (which certainly has plenty to like in it
from the point of view of cleanup - though at present I think the
mlock splitting leaves it with a net regression in functionality).

That does say that this page-flags rework should not go to Linus
separately from your refcounting rework: I don't think the issues
here are ever likely to break someone's bisection, so it's fine for
the one series to precede the other, but any release should contain
both or neither.

> 
> One thing it simplifies is protection against splitting. Once you've got a
> reference to a page, it cannot be split under you. It makes PageTail() and
> ->first_page stable for most callsites.

Yes, but since you cannot acquire a reference to a tail page itself
(since it has count 0 throughout), don't you mean there that you
already hold a reference to the head?

In which case, why bother to make all the PageFlags operations on
tails redirect to the head: if the caller must hold a reference to
the head, then the caller should apply PageFlags to that head, with
no need for compound_head() redirection inside the operation, just a
VM_BUG_ON(PageTail).

Or so it seems from the outside: perhaps that becomes unworkable
somehow once you try to implement it.

> 
> We can access the page's flags under ptl, without having reference the
> page. And that's fine: ptl protects against splitting too.
> 
> Fast GUP also have a way to protect against split.

Yes and yes.  Perhaps it's those accesses under ptl which took you
in this compound_head-inside-PageFlags direction.  Fast GUP is easy
to do something special in, but there's probably a lot of scattered
PageFlags operations under ptl, which were tiresome to fiddle with
when you came to allow pte mappings of THP subpages.

> 
> IIUC, the only potentially problematic callsites left are physical memory
> scanners. This code requires audit. I'll do that.

Please.

>  
> Do I miss something else?

Probably not; but please check - and I'm afraid you've set things up
so that every use of a PageFlags operation needs to be thought about,
if only briefly.

It's certainly the physical approaches to a page (isolation, compaction,
formerly lumpy reclaim, are there others?  /proc things?) which have
always been very tricky to get right.

I think it was for those that David added the barriered double PageTail
checking.  I wonder if something extra special should be done just there,
in the physical scans; and the barriered double PageTail checking avoided
elsewhere, in the normal places that you reckon are safe already.

Mind you, shifting the unlikely PageTail handling out of line to a
called function would reduce the bloat considerably, then maybe it
wouldn't matter how complicated it gets for the general case.

> 
> > Of course it would be very rare for it to go wrong; and most callsites
> > will obviously be safe for this or that reason; though, sadly, none of
> > them safe from holding a reference to the tail page in question, since
> > its count is frozen at 0 and cannot be grabbed by get_page_unless_zero.
> 
> Do you mean that grabbing head page's ->_count is not enough to protect
> against splitting and freeing tail page under you?

No, I mean that if you know head already then why are you bothering with
tail; and if you only have tail, then locating head in all the cases where
the PageFlags operation might be called may be unsafe in a few of them.

And that it's not possible to acquire a reference to the tail page to
make this safe.  But I accept your point above, that the existence of
a pte in a locked page table amounts to a stable reference, even though
it does not contribute to that tail page's reference count.

> 
> I know a patchset which solves this! ;)

Oh, and I know a patchset which avoids these problems completely,
by not using compound pages at all ;)

Hugh

  reply	other threads:[~2015-08-06 19:25 UTC|newest]

Thread overview: 60+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-03-19 17:08 [PATCH 00/16] Sanitize usage of ->flags and ->mapping for tail pages Kirill A. Shutemov
2015-03-19 17:08 ` [PATCH 01/16] mm: consolidate all page-flags helpers in <linux/page-flags.h> Kirill A. Shutemov
2015-03-23  0:10   ` Hugh Dickins
2015-03-19 17:08 ` [PATCH 02/16] page-flags: trivial cleanup for PageTrans* helpers Kirill A. Shutemov
2015-03-23  0:12   ` Hugh Dickins
2015-03-19 17:08 ` [PATCH 03/16] page-flags: introduce page flags policies wrt compound pages Kirill A. Shutemov
2015-03-20 20:35   ` Andrew Morton
2015-03-20 21:34     ` Kirill A. Shutemov
2015-03-19 17:08 ` [PATCH 04/16] page-flags: define PG_locked behavior on " Kirill A. Shutemov
2015-03-27 15:11   ` Mateusz Krawczuk
2015-03-27 15:13   ` Mateusz Krawczuk
2015-03-27 16:37     ` Kirill A. Shutemov
2015-07-15 20:20   ` Christoph Lameter
2015-08-06  4:15   ` page-flags behavior on compound pages: a worry Hugh Dickins
2015-08-06 15:33     ` Kirill A. Shutemov
2015-08-06 19:24       ` Hugh Dickins [this message]
2015-08-06 20:45         ` Christoph Lameter
2015-08-07 14:50           ` Kirill A. Shutemov
2015-08-07 15:28             ` Christoph Lameter
2015-08-10 11:09               ` Kirill A. Shutemov
2015-08-10 13:50                 ` Christoph Lameter
2015-08-07 14:49         ` Kirill A. Shutemov
2015-08-13  5:10           ` Hugh Dickins
2015-08-12 14:35         ` Kirill A. Shutemov
2015-08-12 14:47           ` Vlastimil Babka
2015-08-12 21:16           ` Andrew Morton
2015-08-12 22:21             ` Kirill A. Shutemov
2015-08-13  4:12               ` Hugh Dickins
2015-03-19 17:08 ` [PATCH 05/16] page-flags: define behavior of FS/IO-related flags on compound pages Kirill A. Shutemov
2015-03-19 18:29   ` Dave Hansen
2015-03-19 20:02     ` Kirill A. Shutemov
2015-03-23  0:02       ` Hugh Dickins
2015-03-23 12:17         ` Kirill A. Shutemov
2015-03-24 22:54           ` Hugh Dickins
2015-03-25 10:23             ` Kirill A. Shutemov
2015-03-25 18:56               ` Hugh Dickins
2015-03-19 17:08 ` [PATCH 06/16] page-flags: define behavior of LRU-related " Kirill A. Shutemov
2015-03-19 17:08 ` [PATCH 07/16] page-flags: define behavior SL*B-related " Kirill A. Shutemov
2015-03-19 17:08 ` [PATCH 08/16] page-flags: define behavior of Xen-related " Kirill A. Shutemov
2015-03-19 17:08 ` [PATCH 09/16] page-flags: define PG_reserved behavior " Kirill A. Shutemov
2020-01-31 15:24   ` Chris Wilson
2020-02-03 15:18     ` Kirill A. Shutemov
2020-02-03 15:24       ` Chris Wilson
2020-02-03 17:10         ` David Hildenbrand
2020-02-03 17:29       ` Christoph Hellwig
2015-03-19 17:08 ` [PATCH 10/16] page-flags: define PG_swapbacked " Kirill A. Shutemov
2015-03-19 17:08 ` [PATCH 11/16] page-flags: define PG_swapcache " Kirill A. Shutemov
2015-03-19 17:08 ` [PATCH 12/16] page-flags: define PG_mlocked " Kirill A. Shutemov
2015-03-19 17:08 ` [PATCH 13/16] page-flags: define PG_uncached " Kirill A. Shutemov
2015-03-19 17:08 ` [PATCH 14/16] page-flags: define PG_uptodate " Kirill A. Shutemov
2015-03-19 17:08 ` [PATCH 15/16] page-flags: look on head page if the flag is encoded in page->mapping Kirill A. Shutemov
2015-03-19 17:08 ` [PATCH 16/16] mm: sanitize page->mapping for tail pages Kirill A. Shutemov
2015-03-23  0:28 ` [PATCH 00/16] Sanitize usage of ->flags and ->mapping " Hugh Dickins
2015-03-23 10:04   ` Kirill A. Shutemov
2015-03-24 23:42     ` Hugh Dickins
2015-03-25 10:55       ` Kirill A. Shutemov
2015-03-24 17:39 ` Konstantin Khlebnikov
2015-03-24 20:04   ` Kirill A. Shutemov
2015-07-15 20:20 ` Christoph Lameter
2015-07-15 21:18   ` Kirill A. Shutemov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.LSU.2.11.1508061121120.7500@eggly.anvils \
    --to=hughd@google.com \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=aneesh.kumar@linux.vnet.ibm.com \
    --cc=cl@gentwo.org \
    --cc=dave.hansen@intel.com \
    --cc=hannes@cmpxchg.org \
    --cc=jmarchan@redhat.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=kirill@shutemov.name \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@suse.de \
    --cc=mhocko@suse.cz \
    --cc=n-horiguchi@ah.jp.nec.com \
    --cc=riel@redhat.com \
    --cc=rientjes@google.com \
    --cc=steve.capper@linaro.org \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).