From: "Kirill A. Shutemov" <kirill@shutemov.name>
To: Hugh Dickins <hughd@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
Matthew Wilcox <willy@infradead.org>,
linux-mm@kvack.org
Subject: Re: [PATCH 07/10] mm/khugepaged: minor reorderings in collapse_shmem()
Date: Wed, 28 Nov 2018 13:59:12 +0300 [thread overview]
Message-ID: <20181128105911.ggngeqq5xevxpmsk@kshutemo-mobl1> (raw)
In-Reply-To: <alpine.LSU.2.11.1811271121410.4027@eggly.anvils>
On Tue, Nov 27, 2018 at 12:23:32PM -0800, Hugh Dickins wrote:
> On Tue, 27 Nov 2018, Kirill A. Shutemov wrote:
> > On Mon, Nov 26, 2018 at 03:27:52PM -0800, Hugh Dickins wrote:
> > > Several cleanups in collapse_shmem(): most of which probably do not
> > > really matter, beyond doing things in a more familiar and reassuring
> > > order. Simplify the failure gotos in the main loop, and on success
> > > update stats while interrupts still disabled from the last iteration.
> > >
> > > Fixes: f3f0e1d2150b2 ("khugepaged: add support of collapse for tmpfs/shmem pages")
> > > Signed-off-by: Hugh Dickins <hughd@google.com>
> > > Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > > Cc: stable@vger.kernel.org # 4.8+
> > > ---
> > > mm/khugepaged.c | 72 ++++++++++++++++++++++---------------------------
> > > 1 file changed, 32 insertions(+), 40 deletions(-)
> > >
> > > diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> > > index 1c402d33547e..9d4e9ff1af95 100644
> > > --- a/mm/khugepaged.c
> > > +++ b/mm/khugepaged.c
> > > @@ -1329,10 +1329,10 @@ static void collapse_shmem(struct mm_struct *mm,
> > > goto out;
> > > }
> > >
> > > + __SetPageLocked(new_page);
> > > + __SetPageSwapBacked(new_page);
> > > new_page->index = start;
> > > new_page->mapping = mapping;
> > > - __SetPageSwapBacked(new_page);
> > > - __SetPageLocked(new_page);
> > > BUG_ON(!page_ref_freeze(new_page, 1));
> > >
> > > /*
> > > @@ -1366,13 +1366,13 @@ static void collapse_shmem(struct mm_struct *mm,
> > > if (index == start) {
> > > if (!xas_next_entry(&xas, end - 1)) {
> > > result = SCAN_TRUNCATED;
> > > - break;
> > > + goto xa_locked;
> > > }
> > > xas_set(&xas, index);
> > > }
> > > if (!shmem_charge(mapping->host, 1)) {
> > > result = SCAN_FAIL;
> > > - break;
> > > + goto xa_locked;
> > > }
> > > xas_store(&xas, new_page + (index % HPAGE_PMD_NR));
> > > nr_none++;
> > > @@ -1387,13 +1387,12 @@ static void collapse_shmem(struct mm_struct *mm,
> > > result = SCAN_FAIL;
> > > goto xa_unlocked;
> > > }
> > > - xas_lock_irq(&xas);
> > > - xas_set(&xas, index);
> > > } else if (trylock_page(page)) {
> > > get_page(page);
> > > + xas_unlock_irq(&xas);
> > > } else {
> > > result = SCAN_PAGE_LOCK;
> > > - break;
> > > + goto xa_locked;
> > > }
> > >
> > > /*
> >
> > I'm puzzled by locking change here.
>
> The locking change here is to not re-get xas_lock_irq (in shmem_getpage
> case) just before we drop it anyway: you point out that it used to cover
> /*
> * The page must be locked, so we can drop the i_pages lock
> * without racing with truncate.
> */
> VM_BUG_ON_PAGE(!PageLocked(page), page);
> VM_BUG_ON_PAGE(!PageUptodate(page), page);
> VM_BUG_ON_PAGE(PageTransCompound(page), page);
> if (page_mapping(page) != mapping) {
> but now does not.
>
> But the comment you wrote there originally (ah, git blame shows
> that Matthew has made it say i_pages lock instead of tree_lock),
> "The page must be locked, so we can drop...", was correct all along,
> I'm just following what it says.
>
> It would wrong if the trylock_page came after the xas_unlock_irq,
> but it comes before (as before): holding i_pages lock across the
> lookup makes sure we look up the right page (no RCU racing) and
> trylock_page makes sure that it cannot be truncated or hole-punched
> or migrated or whatever from that point on - so can drop i_pages lock.
You are right. I confused myself.
> Actually, I think we could VM_BUG_ON(page_mapping(page) != mapping),
> couldn't we? Not that I propose to make such a change at this stage.
Yeah it should be safe. We may put WARN there.
> > Isn't the change responsible for the bug you are fixing in 09/10?
>
> In which I relaxed the VM_BUG_ON_PAGE(PageTransCompound(page), page)
> that appears in the sequence above.
>
> Well, what I was thinking of in 9/10 was a THP being inserted at some
> stage between selecting this range for collapse and reaching the last
> (usually first) xas_lock_irq(&xas) in the "This will be less messy..."
> loop above: I don't see any locking against that possibility. (And it
> has to be that initial xas_lock_irq(&xas), because once the PageLocked
> head of new_page is inserted in the i_pages tree, there is no more
> chance for truncation and a competing THP to be inserted there.)
>
> So 9/10 would be required anyway; but you're thinking that the page
> we looked up under i_pages lock and got trylock_page on, could then
> become Compound once i_pages lock is dropped? I don't think so: pages
> don't become Compound after they've left the page allocator, do they?
> And if we ever manage to change that, I'm pretty sure it would be with
> page locks held and page refcounts frozen.
> > IIRC, my intend for the locking scheme was to protect against
> > truncate-repopulate race.
> >
> > What do I miss?
>
> The stage in between selecting the range for collapse, and getting
> the initial i_pages lock? Pages not becoming Compound underneath
> you, with or without page lock, with or without i_pages lock? Page
> lock being sufficient protection against truncation and migration?
Agreed on all fronts. Sorry for the noise.
> > The rest of the patch *looks* okay, but I found it hard to follow.
> > Splitting it up would make it easier.
>
> It needs some time, I admit: thanks a lot for persisting with it.
> And thanks (to you and to Matthew) for the speedy Acks elsewhere.
>
> Hugh
--
Kirill A. Shutemov
next prev parent reply other threads:[~2018-11-28 10:59 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-11-26 23:13 [PATCH 00/10] huge_memory,khugepaged tmpfs split/collapse fixes Hugh Dickins
2018-11-26 23:19 ` [PATCH 02/10] mm/huge_memory: splitting set mapping+index before unfreeze Hugh Dickins
2018-11-27 7:26 ` Kirill A. Shutemov
2018-11-26 23:21 ` [PATCH 03/10] mm/huge_memory: fix lockdep complaint on 32-bit i_size_read() Hugh Dickins
2018-11-27 7:27 ` Kirill A. Shutemov
2018-11-26 23:23 ` [PATCH 04/10] mm/khugepaged: collapse_shmem() stop if punched or truncated Hugh Dickins
2018-11-27 7:31 ` Kirill A. Shutemov
2018-11-27 13:26 ` Matthew Wilcox
2018-11-26 23:25 ` [PATCH 05/10] mm/khugepaged: fix crashes due to misaccounted holes Hugh Dickins
2018-11-27 7:35 ` Kirill A. Shutemov
2018-11-26 23:26 ` [PATCH 06/10] mm/khugepaged: collapse_shmem() remember to clear holes Hugh Dickins
2018-11-27 7:38 ` Kirill A. Shutemov
2018-11-26 23:27 ` [PATCH 07/10] mm/khugepaged: minor reorderings in collapse_shmem() Hugh Dickins
2018-11-27 7:59 ` Kirill A. Shutemov
2018-11-27 20:23 ` Hugh Dickins
2018-11-28 10:59 ` Kirill A. Shutemov [this message]
2018-11-28 19:40 ` Hugh Dickins
2018-11-26 23:29 ` [PATCH 08/10] mm/khugepaged: collapse_shmem() without freezing new_page Hugh Dickins
2018-11-27 8:01 ` Kirill A. Shutemov
2018-11-26 23:31 ` [PATCH 09/10] mm/khugepaged: collapse_shmem() do not crash on Compound Hugh Dickins
2018-11-26 23:33 ` [PATCH 10/10] mm/khugepaged: fix the xas_create_range() error path Hugh Dickins
2018-11-27 8:02 ` Kirill A. Shutemov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20181128105911.ggngeqq5xevxpmsk@kshutemo-mobl1 \
--to=kirill@shutemov.name \
--cc=akpm@linux-foundation.org \
--cc=hughd@google.com \
--cc=kirill.shutemov@linux.intel.com \
--cc=linux-mm@kvack.org \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).