On Wed, 2020-10-21 at 16:09 +0800, Xu Yu wrote:

> @@ -1887,6 +1930,7 @@ static int shmem_getpage_gfp(struct inode
> *inode, pgoff_t index,
>  	}
>
>  alloc_huge:
> +	gfp = shmem_hugepage_gfpmask_fixup(gfp, sgp_huge);
>  	page = shmem_alloc_and_acct_page(gfp, inode, index, true);
>  	if (IS_ERR(page)) {
>  alloc_nohuge:

This looks like it could be a bug, because the changed gfp flags
are also used for the non-huge allocation below the alloc_nohuge:
label when the huge allocation fails.

Using a separate huge_gfp variable would solve that issue.

However, your patch also changes the meaning of SHMEM_HUGE_FORCE
from "override mount flags" to "aggressively try reclaim and
compaction", which mixes up the equivalents of the anon THP
sysctl "enabled" and "defrag" settings.

I believe it makes sense to keep the "what should khugepaged
do with these pages?" and "how hard should we try at allocation
time?" settings separate for shmem, the same way they are kept
separate for anonymous memory.

I also suspect it is simplest if shmem takes its "how hard should
we try at allocation time?" setting from the existing "defrag"
sysfs file, instead of giving system administrators two knobs
that they will likely want to set to the same value anyway.

Coincidentally, I have been looking at the same code on and off
for the past month, and also sent a patch to the list yesterday
to fix this issue. I suspect my patch can be simplified a little
more by directly using alloc_hugepage_direct_gfpmask to create
a huge_gfp flag in shmem_getpage_gfp.

https://lore.kernel.org/linux-mm/20201021234846.5cc97e62@imladris.surriel.com/

-- 
All Rights Reversed.
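
To illustrate the gfp concern raised in this review, here is a minimal
userspace sketch, not kernel code. Everything prefixed mock_ or MOCK_ is
a hypothetical stand-in invented for the illustration: the flag values,
the allocator, and the fixup helper (which is only assumed to add flags,
since the quoted hunk does not show its body). It contrasts overwriting
gfp before the huge attempt with keeping a separate huge_gfp.

```c
#include <stdbool.h>

/* Hypothetical stand-ins; these are NOT the kernel's definitions. */
typedef unsigned int gfp_t;
#define MOCK_GFP_BASE     0x1u /* flags passed in by the caller */
#define MOCK_GFP_TRY_HARD 0x2u /* stands in for "reclaim/compact aggressively" */

struct mock_page {
	gfp_t gfp_used; /* flags the successful allocation ran with */
	bool huge;      /* whether a huge page was obtained */
};

/* Assumed behavior: the fixup only adds flags meant for huge allocations. */
static gfp_t mock_hugepage_gfpmask_fixup(gfp_t gfp)
{
	return gfp | MOCK_GFP_TRY_HARD;
}

/* Mock allocator: a huge request fails when no huge page is available. */
static bool mock_alloc(struct mock_page *page, gfp_t gfp, bool huge,
		       bool huge_available)
{
	if (huge && !huge_available)
		return false;
	page->gfp_used = gfp;
	page->huge = huge;
	return true;
}

/* Pattern from the quoted hunk: gfp is overwritten, so the fallback
 * small-page allocation inherits the huge-only flags. */
static struct mock_page getpage_shared_gfp(gfp_t gfp, bool huge_available)
{
	struct mock_page page = {0};

	gfp = mock_hugepage_gfpmask_fixup(gfp);
	if (!mock_alloc(&page, gfp, true, huge_available))
		mock_alloc(&page, gfp, false, huge_available);
	return page;
}

/* Suggested pattern: a separate huge_gfp leaves gfp untouched for the
 * fallback path. */
static struct mock_page getpage_separate_gfp(gfp_t gfp, bool huge_available)
{
	struct mock_page page = {0};
	gfp_t huge_gfp = mock_hugepage_gfpmask_fixup(gfp);

	if (!mock_alloc(&page, huge_gfp, true, huge_available))
		mock_alloc(&page, gfp, false, huge_available);
	return page;
}
```

With huge pages unavailable, the first variant performs the small-page
allocation with the extra flag set, while the second falls back with the
caller's original flags.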