From: "Kirill A. Shutemov" <kirill@shutemov.name>
To: Hugh Dickins <hughd@google.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
Andrew Morton <akpm@linux-foundation.org>,
Andrea Arcangeli <aarcange@redhat.com>,
David Rientjes <rientjes@google.com>,
Dave Hansen <dave.hansen@intel.com>, Mel Gorman <mgorman@suse.de>,
Rik van Riel <riel@redhat.com>, Vlastimil Babka <vbabka@suse.cz>,
Christoph Lameter <cl@gentwo.org>,
Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
Steve Capper <steve.capper@linaro.org>,
"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Michal Hocko <mhocko@suse.cz>,
Jerome Marchand <jmarchan@redhat.com>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: page-flags behavior on compound pages: a worry
Date: Thu, 6 Aug 2015 18:33:00 +0300
Message-ID: <20150806153259.GA2834@node.dhcp.inet.fi>
In-Reply-To: <alpine.LSU.2.11.1508052001350.6404@eggly.anvils>
On Wed, Aug 05, 2015 at 09:15:57PM -0700, Hugh Dickins wrote:
> Hi Kirill,
>
> I had a nasty thought this morning.
Tough day.
I'm trying to wrap my head around this mail and I'm not sure how well I'm
succeeding. :-|
> Andrew had prodded me gently to re-examine my concerns with your
> page-flags rework in mmotm. I still dislike the bloat (my mm/built-in.o
> text goes up from 478513 to 490183 bytes on a non-DEBUG_VM build); but I
> was hoping to set that aside, to let us move forward.
>
> But looking into the bloat led me to what seems a more serious issue
> with it. I'd tacked a little function on to the end of mm/filemap.c:
>
> bool page_is_locked(struct page *page)
> {
> 	return !!PageLocked(page);
> }
>
> which came out as:
>
> 0000000000003a60 <page_is_locked>:
>     3a60:  48 8b 07     mov    (%rdi),%rax
>     3a63:  55           push   %rbp
>     3a64:  48 89 e5     mov    %rsp,%rbp
>
> [instructions above same as without your patches; those below added by them]
>
>     3a67:  f6 c4 80     test   $0x80,%ah
>     3a6a:  74 10        je     3a7c <page_is_locked+0x1c>
>     3a6c:  48 8b 47 30  mov    0x30(%rdi),%rax
>     3a70:  48 8b 17     mov    (%rdi),%rdx
>     3a73:  80 e6 80     and    $0x80,%dh
>     3a76:  48 0f 44 c7  cmove  %rdi,%rax
>     3a7a:  eb 03        jmp    3a7f <page_is_locked+0x1f>
>     3a7c:  48 89 f8     mov    %rdi,%rax
>     3a7f:  48 8b 00     mov    (%rax),%rax
>
> [instructions above added by your patches; those below same as before]
>
>     3a82:  5d           pop    %rbp
>     3a83:  83 e0 01     and    $0x1,%eax
>     3a86:  c3           retq
>
> The "and $0x80,%dh" looked superfluous at first, but of course it isn't:
> it's from the smp_rmb() in David's 668f9abbd433 "mm: close PageTail race"
> (a later commit refactors compound_head() but doesn't change the story).
>
> And it's that race, or a worse race of that kind, that now worries me.
> Relying on smp_wmb() and smp_rmb() may be all that was needed in the
> case that David was fixing; and (I dare not look at them to audit!)
> all uses of compound_head() in our current v4.2-rc tree may well be
> safe, for this or that contingent reason in each place that it's used.
>
> But there is no locking within compound_head(page) to make it safe
> everywhere, yet your page-flags rework is changing a large number
> of PageWhatever()s and SetPageWhatever()s and ClearPageWhatever()s
> now to do a hidden compound_head(page) beneath the covers.
>
> To be more specific: if preemption, or an interrupt, or entry to SMM
> mode, or whatever, delays this thread somewhere in that compound_head()
> sequence of instructions, how can we be sure that the "head" returned
> by compound_head() is good? We know the page was PageTail just before
> looking up page->first_page, and we know it was PageTail just after,
> but we don't know that it was PageTail throughout, and we don't know
> whether page->first_page is even a good page pointer, or something
> else from the private/ptl/slab_cache union.
That looks like a very valid worry to me, for the current -mm tree.
But let's take my refcounting rework into the picture.
One thing it simplifies is protection against splitting. Once you've got a
reference to a page, it cannot be split under you. It makes PageTail() and
->first_page stable for most callsites.
We can access the page's flags under ptl without holding a reference to
the page. And that's fine: ptl protects against splitting too.
Fast GUP also has a way to protect against a split.
IIUC, the only potentially problematic callsites left are physical memory
scanners. That code requires an audit. I'll do that.
Am I missing something else?
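To make the sequence under discussion concrete, here is a rough userspace
model of the check / load ->first_page / barrier / re-check pattern that
shows up in the disassembly above. It is only a sketch under stated
assumptions, not the kernel code: struct model_page, the model_* helpers and
the MODEL_PG_* names are made up for illustration, the bit positions are
simply read off the listing ("test $0x80,%ah" and "and $0x1,%eax"), and C11
fences stand in for smp_rmb()/smp_wmb().

```c
/* Userspace model only: NOT the kernel's struct page or its helpers. */
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_PG_LOCKED (1UL << 0)   /* assumed from "and $0x1,%eax" */
#define MODEL_PG_TAIL   (1UL << 15)  /* assumed from "test $0x80,%ah" */

struct model_page {
	_Atomic unsigned long flags;
	/* In the kernel this slot is a union with private/ptl/slab_cache. */
	struct model_page *_Atomic first_page;
};

static inline bool model_page_tail(struct model_page *page)
{
	return atomic_load_explicit(&page->flags, memory_order_relaxed)
		& MODEL_PG_TAIL;
}

/* Writer side: publish ->first_page before setting the tail bit. */
static inline void model_make_tail(struct model_page *tail,
				   struct model_page *head)
{
	assert(!model_page_tail(head)); /* a head is never itself a tail */
	atomic_store_explicit(&tail->first_page, head, memory_order_relaxed);
	atomic_thread_fence(memory_order_release); /* stands in for smp_wmb() */
	atomic_fetch_or_explicit(&tail->flags, MODEL_PG_TAIL,
				 memory_order_relaxed);
}

/* Reader side: tail check, pointer load, barrier, tail re-check. */
static inline struct model_page *model_compound_head(struct model_page *page)
{
	if (model_page_tail(page)) {
		struct model_page *head = atomic_load_explicit(
			&page->first_page, memory_order_relaxed);
		atomic_thread_fence(memory_order_acquire); /* ~ smp_rmb() */
		if (model_page_tail(page))
			return head;
	}
	return page;
}

/* A flag test that silently redirects to the head page. */
static inline bool model_page_locked(struct model_page *page)
{
	struct model_page *head = model_compound_head(page);

	return atomic_load_explicit(&head->flags, memory_order_relaxed)
		& MODEL_PG_LOCKED;
}
```

Nothing in model_compound_head() pins the page between the two tail checks,
so the returned pointer is only as stable as the caller's own protection
(a reference, ptl, or whatever fast GUP relies on). That is the part the
callsite has to provide.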
> Of course it would be very rare for it to go wrong; and most callsites
> will obviously be safe for this or that reason; though, sadly, none of
> them safe from holding a reference to the tail page in question, since
> its count is frozen at 0 and cannot be grabbed by get_page_unless_zero.
Do you mean that grabbing the head page's ->_count is not enough to protect
against splitting and freeing the tail page under you?
I know a patchset which solves this! ;)
> But I don't see how it can be safe to rely on compound_head() inside
> a general purpose page-flag function, that we're all accustomed to
> think of as a simple bitop, that can be applied without great care.
>
> Hugh
>
--
Kirill A. Shutemov