All of lore.kernel.org
 help / color / mirror / Atom feed
From: Yang Shi <shy828301@gmail.com>
To: Hugh Dickins <hughd@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Wang Yugui <wangyugui@e16-tech.com>,
	Matthew Wilcox <willy@infradead.org>,
	Naoya Horiguchi <naoya.horiguchi@nec.com>,
	Alistair Popple <apopple@nvidia.com>,
	Ralph Campbell <rcampbell@nvidia.com>, Zi Yan <ziy@nvidia.com>,
	Miaohe Lin <linmiaohe@huawei.com>,
	Minchan Kim <minchan@kernel.org>, Jue Wang <juew@google.com>,
	Peter Xu <peterx@redhat.com>, Jan Kara <jack@suse.cz>,
	Linux MM <linux-mm@kvack.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 1/7] mm/thp: fix __split_huge_pmd_locked() on shmem migration entry
Date: Fri, 4 Jun 2021 11:03:57 -0700	[thread overview]
Message-ID: <CAHbLzkqYyoXm1sAq7yBi3s8PbY127VbbgNGZ-5e-zqZMzFOzWA@mail.gmail.com> (raw)
In-Reply-To: <alpine.LSU.2.11.2106031918400.12760@eggly.anvils>

On Thu, Jun 3, 2021 at 7:23 PM Hugh Dickins <hughd@google.com> wrote:
>
> On Thu, 3 Jun 2021, Yang Shi wrote:
> > On Tue, Jun 1, 2021 at 2:05 PM Hugh Dickins <hughd@google.com> wrote:
> > >
> > > Are there more places that need to be careful about pmd migration entries?
> > > None hit in practice, but several of those is_huge_zero_pmd() tests were
> > > done without checking pmd_present() first: I believe a pmd migration entry
> > > could end up satisfying that test.  Ah, the inversion of swap offset, to
> > > protect against L1TF, makes that impossible on x86; but other arches need
> > > the pmd_present() check, and even x86 ought not to apply pmd_page() to a
> > > swap-like pmd.  Fix those instances; __split_huge_pmd_locked() was not
> > > wrong to be checking with pmd_trans_huge() instead, but I think it's
> > > clearer to use pmd_present() in each instance.
> ...
> > > diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> > > index 63ed6b25deaa..9fb7b47da87e 100644
> > > --- a/mm/huge_memory.c
> > > +++ b/mm/huge_memory.c
> > > @@ -1676,7 +1676,7 @@ int zap_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
> > >                 spin_unlock(ptl);
> > >                 if (is_huge_zero_pmd(orig_pmd))
> > >                         tlb_remove_page_size(tlb, pmd_page(orig_pmd), HPAGE_PMD_SIZE);
> > > -       } else if (is_huge_zero_pmd(orig_pmd)) {
> > > +       } else if (pmd_present(orig_pmd) && is_huge_zero_pmd(orig_pmd)) {
> >
> > If it is a huge zero migration entry, the code would fallback to the
> > "else". But IIUC the "else" case doesn't handle the huge zero page
> > correctly. It may mess up the rss counter.
>
> A huge zero migration entry?  I hope that's not something special
> that I've missed.
>
> Do we ever migrate a huge zero page - and how do we find where it's
> mapped, to insert the migration entries?  But if we do, I thought it
> would use the usual kind of pmd migration entry; and the first check
> in is_pmd_migration_entry() is !pmd_present(pmd).

I overlooked if the huge zero page is migratable or not when I was
writing the comment, just focused on the if/else if/else conditions.

I don't think huge zero page is migratable by a quick look since:
* mempolicy and numa hinting skip huge zero pmd
* other migration callsites just try to migrate LRU pages

>
> (I have to be rather careful to check such details, after getting
> burnt once by pmd_present(): which includes the "huge" bit even when
> not otherwise present, to permit races with pmdp_invalidate().
> I mentioned in private mail that I'd dropped one of my "fixes" because
> it was harmless but mistaken: I had misunderstood pmd_present().)
>
> The point here (see commit message above) is that some unrelated pmd
> migration entry could pass the is_huge_zero_pmd() test, which rushes
> off to use pmd_page() without even checking pmd_present() first.  And
> most of its users have, one way or another, checked pmd_present() first;
> but this place and a couple of others had not.

Thanks for the elaboration. Wondering whether we'd better add some
comments in the code? Someone may submit a fix patch by visual
inspection in the future due to missing these points.

>
> I'm just verifying that it's really a a huge zero pmd before handling
> its case; the "else" still does not need to handle the huge zero page.
>
> Hugh

  reply	other threads:[~2021-06-04 18:05 UTC|newest]

Thread overview: 70+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-01 21:03 [PATCH 0/7] mm/thp: fix THP splitting unmap BUGs and related Hugh Dickins
2021-06-01 21:03 ` Hugh Dickins
2021-06-01 21:05 ` [PATCH 1/7] mm/thp: fix __split_huge_pmd_locked() on shmem migration entry Hugh Dickins
2021-06-01 21:05   ` Hugh Dickins
2021-06-03 21:26   ` Yang Shi
2021-06-03 21:26     ` Yang Shi
2021-06-04  2:22     ` Hugh Dickins
2021-06-04  2:22       ` Hugh Dickins
2021-06-04 18:03       ` Yang Shi [this message]
2021-06-04 18:03         ` Yang Shi
2021-06-04 21:52         ` Hugh Dickins
2021-06-04 21:52           ` Hugh Dickins
2021-06-04 15:34   ` Kirill A. Shutemov
2021-06-04 21:29     ` Hugh Dickins
2021-06-04 21:29       ` Hugh Dickins
2021-06-01 21:07 ` [PATCH 2/7] mm/thp: try_to_unmap() use TTU_SYNC for safe DEBUG_VM splitting Hugh Dickins
2021-06-01 21:07   ` Hugh Dickins
2021-06-02  1:59   ` Alistair Popple
2021-06-03 21:45   ` Yang Shi
2021-06-03 21:45     ` Yang Shi
2021-06-04  2:45     ` Hugh Dickins
2021-06-04  2:45       ` Hugh Dickins
2021-06-04 18:24       ` Yang Shi
2021-06-04 18:24         ` Yang Shi
2021-06-03 21:48   ` Peter Xu
2021-06-04  2:54     ` Hugh Dickins
2021-06-04  2:54       ` Hugh Dickins
2021-06-04 14:48       ` Peter Xu
2021-06-04 22:26         ` Hugh Dickins
2021-06-04 22:26           ` Hugh Dickins
2021-06-04 15:47       ` Kirill A. Shutemov
2021-06-01 21:09 ` [PATCH 3/7] mm/thp: fix vma_address() if virtual address below file offset Hugh Dickins
2021-06-01 21:09   ` Hugh Dickins
2021-06-01 21:30   ` Matthew Wilcox
2021-06-03 21:36     ` Hugh Dickins
2021-06-03 21:36       ` Hugh Dickins
2021-06-03 21:40       ` [PATCH v2 " Hugh Dickins
2021-06-03 21:40         ` Hugh Dickins
2021-06-04 15:53         ` Kirill A. Shutemov
2021-06-04 17:36         ` Matthew Wilcox
2021-06-04 22:35           ` Hugh Dickins
2021-06-04 22:35             ` Hugh Dickins
2021-06-01 21:11 ` [PATCH 4/7] mm/thp: fix page_address_in_vma() on file THP tails Hugh Dickins
2021-06-01 21:11   ` Hugh Dickins
2021-06-01 21:32   ` Matthew Wilcox
2021-06-03 22:06   ` Yang Shi
2021-06-03 22:06     ` Yang Shi
2021-06-04 15:54   ` Kirill A. Shutemov
2021-06-01 21:13 ` [PATCH 5/7] mm/thp: fix page_vma_mapped_walk() if huge page mapped by ptes Hugh Dickins
2021-06-01 21:13   ` Hugh Dickins
2021-06-04 16:24   ` Kirill A. Shutemov
2021-06-04 17:42     ` Matthew Wilcox
2021-06-04 22:56     ` Hugh Dickins
2021-06-04 22:56       ` Hugh Dickins
2021-06-01 21:15 ` [PATCH 6/7] mm/thp: unmap_mapping_page() to fix THP truncate_cleanup_page() Hugh Dickins
2021-06-01 21:15   ` Hugh Dickins
2021-06-04 16:39   ` Kirill A. Shutemov
2021-06-04 23:07     ` Hugh Dickins
2021-06-04 23:07       ` Hugh Dickins
2021-06-01 21:17 ` [PATCH 7/7] mm/thp: remap_page() is only needed on anonymous THP Hugh Dickins
2021-06-01 21:17   ` Hugh Dickins
2021-06-03 22:09   ` Yang Shi
2021-06-03 22:09     ` Yang Shi
2021-06-04 16:41   ` Kirill A. Shutemov
2021-06-02  2:07 ` [PATCH 0/7] mm/thp: fix THP splitting unmap BUGs and related Alistair Popple
2021-06-03 22:21 ` Hugh Dickins
2021-06-03 22:21   ` Hugh Dickins
2021-06-03 23:03   ` Andrew Morton
2021-06-03 22:26 ` [PATCH 6.1/7] mm: thp: replace DEBUG_VM BUG with VM_WARN when unmap fails for split Hugh Dickins
2021-06-03 22:26   ` Hugh Dickins

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAHbLzkqYyoXm1sAq7yBi3s8PbY127VbbgNGZ-5e-zqZMzFOzWA@mail.gmail.com \
    --to=shy828301@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=apopple@nvidia.com \
    --cc=hughd@google.com \
    --cc=jack@suse.cz \
    --cc=juew@google.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=linmiaohe@huawei.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=minchan@kernel.org \
    --cc=naoya.horiguchi@nec.com \
    --cc=peterx@redhat.com \
    --cc=rcampbell@nvidia.com \
    --cc=wangyugui@e16-tech.com \
    --cc=willy@infradead.org \
    --cc=ziy@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.