On 6 Nov 2017, at 15:35, Andrea Arcangeli wrote: > On Mon, Nov 06, 2017 at 10:53:48AM -0500, Zi Yan wrote: >> Thanks for clarifying it. We both agree that !pmd_present(), which means >> PMD migration entry, does not get into userfaultfd_must_wait(), >> then there seems to be no issue with current code yet. >> >> However, the if (!pmd_present(_pmd)) in userfaultfd_must_wait() does not >> match >> the exact condition. How about the patch below? It can catch pmd >> migration entries, >> which are only possible in x86_64 at the moment. >> >> diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c >> index 1c713fd5b3e6..dda25444a6ee 100644 >> --- a/fs/userfaultfd.c >> +++ b/fs/userfaultfd.c >> @@ -294,9 +294,11 @@ static inline bool userfaultfd_must_wait(struct >> userfaultfd_ctx *ctx, >> * pmd_trans_unstable) of the pmd. >> */ >> _pmd = READ_ONCE(*pmd); >> - if (!pmd_present(_pmd)) >> + if (pmd_none(_pmd)) >> goto out; >> >> + VM_BUG_ON(thp_migration_supported() && is_pmd_migration_entry(_pmd)); >> + > > As I wrote in prev email I'm not sure about this invariant to be > correct 100% of the time (plus we'd want a VM_WARN_ON only > here). Specifically, what does prevent try_to_unmap to run on a THP > backed mapping with only the mmap_sem for reading? > Right. I missed the part that the page table lock is released before entering handle_userfault(). The pmd_none() can be mapped elsewhere and migrated, !pmd_present() but not pmd_none() is possible here when THP migration is enabled. > I know what prevents to ever reproduce this in practice though (aside > from the fact the race between the is_swap_pmd() check in the main > page fault and the above check is small) and it's because compaction > won't migrate THP and even the numa faults will not use the migration > entry. So it'd require some more explicit migration numactl while > userfaults are running to ever see an hang in there. > > I think it's a regression since the introduction of THP migration > around commits 84c3fc4e9c563d8fb91cfdf5948da48fe1af34d3 / > 616b8371539a6c487404c3b8fb04078016dab4ba / > 9c670ea37947a82cb6d4df69139f7e46ed71a0ac etc.. before that pmd_none or > !pmd_present used to be equivalent, not the case any longer. Of course > pmd_none would have been better before too. > Right. Ying’s patch is a fix of the regression. Fixes: 84c3fc4e9c563 ("mm: thp: check pmd migration entry in common path") Thanks for pointing all these out. — Best Regards, Yan Zi