* Re: [PATCH v2 18/35] mm: implement speculative handling in do_anonymous_page()
@ 2022-01-30 18:03 kernel test robot
0 siblings, 0 replies; 6+ messages in thread
From: kernel test robot @ 2022-01-30 18:03 UTC (permalink / raw)
To: kbuild
[-- Attachment #1: Type: text/plain, Size: 22826 bytes --]
CC: llvm@lists.linux.dev
CC: kbuild-all@lists.01.org
In-Reply-To: <20220128131006.67712-19-michel@lespinasse.org>
References: <20220128131006.67712-19-michel@lespinasse.org>
TO: Michel Lespinasse <michel@lespinasse.org>
TO: "Linux-MM" <linux-mm@kvack.org>
TO: linux-kernel@vger.kernel.org
TO: Andrew Morton <akpm@linux-foundation.org>
CC: kernel-team@fb.com
CC: Laurent Dufour <ldufour@linux.ibm.com>
CC: Jerome Glisse <jglisse@google.com>
CC: Peter Zijlstra <peterz@infradead.org>
CC: Michal Hocko <mhocko@suse.com>
CC: Vlastimil Babka <vbabka@suse.cz>
CC: Davidlohr Bueso <dave@stgolabs.net>
Hi Michel,
Thank you for the patch! Perhaps something to improve:
[auto build test WARNING on linus/master]
[also build test WARNING on v5.17-rc1 next-20220128]
[cannot apply to tip/x86/mm arm64/for-next/core powerpc/next hnaz-mm/master]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patches, we suggest using '--base' as documented in
https://git-scm.com/docs/git-format-patch]
url: https://github.com/0day-ci/linux/commits/Michel-Lespinasse/Speculative-page-faults/20220128-212122
base: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 145d9b498fc827b79c1260b4caa29a8e59d4c2b9
:::::: branch date: 2 days ago
:::::: commit date: 2 days ago
config: x86_64-randconfig-c007-20220124 (https://download.01.org/0day-ci/archive/20220131/202201310126.IymdD4Vv-lkp@intel.com/config)
compiler: clang version 14.0.0 (https://github.com/llvm/llvm-project 33b45ee44b1f32ffdbc995e6fec806271b4b3ba4)
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/0day-ci/linux/commit/fa5331bae2e49ce86eff959390b451b7401f9156
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Michel-Lespinasse/Speculative-page-faults/20220128-212122
git checkout fa5331bae2e49ce86eff959390b451b7401f9156
# save the config file to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=x86_64 clang-analyzer
If you fix the issue, kindly add the following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>
clang-analyzer warnings: (new ones prefixed by >>)
mm/memory.c:2576:22: note: Left side of '&&' is false
if (pmd_none(*pmd) && !create)
^
mm/memory.c:2578:7: note: Assuming the condition is true
if (WARN_ON_ONCE(pmd_leaf(*pmd)))
^
include/asm-generic/bug.h:104:23: note: expanded from macro 'WARN_ON_ONCE'
int __ret_warn_on = !!(condition); \
^~~~~~~~~~~~
mm/memory.c:2578:7: note: Taking false branch
if (WARN_ON_ONCE(pmd_leaf(*pmd)))
^
include/asm-generic/bug.h:105:2: note: expanded from macro 'WARN_ON_ONCE'
if (unlikely(__ret_warn_on)) \
^
mm/memory.c:2578:3: note: Taking false branch
if (WARN_ON_ONCE(pmd_leaf(*pmd)))
^
mm/memory.c:2580:8: note: Calling 'pmd_none'
if (!pmd_none(*pmd) && WARN_ON_ONCE(pmd_bad(*pmd))) {
^~~~~~~~~~~~~~
arch/x86/include/asm/pgtable.h:797:2: note: Returning zero, which participates in a condition later
return (val & ~_PAGE_KNL_ERRATUM_MASK) == 0;
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
mm/memory.c:2580:8: note: Returning from 'pmd_none'
if (!pmd_none(*pmd) && WARN_ON_ONCE(pmd_bad(*pmd))) {
^~~~~~~~~~~~~~
mm/memory.c:2580:7: note: Left side of '&&' is true
if (!pmd_none(*pmd) && WARN_ON_ONCE(pmd_bad(*pmd))) {
^
mm/memory.c:2580:26: note: Taking false branch
if (!pmd_none(*pmd) && WARN_ON_ONCE(pmd_bad(*pmd))) {
^
include/asm-generic/bug.h:105:2: note: expanded from macro 'WARN_ON_ONCE'
if (unlikely(__ret_warn_on)) \
^
mm/memory.c:2580:3: note: Taking false branch
if (!pmd_none(*pmd) && WARN_ON_ONCE(pmd_bad(*pmd))) {
^
mm/memory.c:2585:9: note: Calling 'apply_to_pte_range'
err = apply_to_pte_range(mm, pmd, addr, next,
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
mm/memory.c:2520:2: note: 'ptl' declared without an initial value
spinlock_t *ptl;
^~~~~~~~~~~~~~~
mm/memory.c:2522:6: note: 'create' is false
if (create) {
^~~~~~
mm/memory.c:2522:2: note: Taking false branch
if (create) {
^
mm/memory.c:2529:23: note: Assuming the condition is true
mapped_pte = pte = (mm == &init_mm) ?
^~~~~~~~~~~~~~
mm/memory.c:2529:22: note: '?' condition is true
mapped_pte = pte = (mm == &init_mm) ?
^
mm/memory.c:2534:2: note: Taking false branch
BUG_ON(pmd_huge(*pmd));
^
include/asm-generic/bug.h:65:32: note: expanded from macro 'BUG_ON'
#define BUG_ON(condition) do { if (unlikely(condition)) BUG(); } while (0)
^
mm/memory.c:2534:2: note: Loop condition is false. Exiting loop
BUG_ON(pmd_huge(*pmd));
^
include/asm-generic/bug.h:65:27: note: expanded from macro 'BUG_ON'
#define BUG_ON(condition) do { if (unlikely(condition)) BUG(); } while (0)
^
mm/memory.c:2536:2: note: Loop condition is false. Exiting loop
arch_enter_lazy_mmu_mode();
^
include/linux/pgtable.h:985:36: note: expanded from macro 'arch_enter_lazy_mmu_mode'
#define arch_enter_lazy_mmu_mode() do {} while (0)
^
mm/memory.c:2538:6: note: Assuming 'fn' is null
if (fn) {
^~
mm/memory.c:2538:2: note: Taking false branch
if (fn) {
^
mm/memory.c:2549:2: note: Loop condition is false. Exiting loop
arch_leave_lazy_mmu_mode();
^
include/linux/pgtable.h:986:36: note: expanded from macro 'arch_leave_lazy_mmu_mode'
#define arch_leave_lazy_mmu_mode() do {} while (0)
^
mm/memory.c:2551:6: note: Assuming the condition is true
if (mm != &init_mm)
^~~~~~~~~~~~~~
mm/memory.c:2551:2: note: Taking true branch
if (mm != &init_mm)
^
mm/memory.c:2552:3: note: 1st function call argument is an uninitialized value
pte_unmap_unlock(mapped_pte, ptl);
^
include/linux/mm.h:2357:2: note: expanded from macro 'pte_unmap_unlock'
spin_unlock(ptl); \
^ ~~~
>> mm/memory.c:3876:7: warning: Assigned value is garbage or undefined [clang-analyzer-core.uninitialized.Assign]
if (!pte_map_lock(vmf)) {
^
include/linux/mm.h:3418:2: note: expanded from macro 'pte_map_lock'
struct vm_fault *vmf = __vmf; \
^
mm/memory.c:4940:6: note: Assuming the condition is false
if (flags & FAULT_FLAG_SPECULATIVE)
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
mm/memory.c:4940:2: note: Taking false branch
if (flags & FAULT_FLAG_SPECULATIVE)
^
mm/memory.c:4943:2: note: Taking false branch
__set_current_state(TASK_RUNNING);
^
include/linux/sched.h:204:3: note: expanded from macro '__set_current_state'
debug_normal_state_change((state_value)); \
^
include/linux/sched.h:137:3: note: expanded from macro 'debug_normal_state_change'
WARN_ON_ONCE(is_special_task_state(state_value)); \
^
include/asm-generic/bug.h:105:2: note: expanded from macro 'WARN_ON_ONCE'
if (unlikely(__ret_warn_on)) \
^
mm/memory.c:4943:2: note: Loop condition is false. Exiting loop
__set_current_state(TASK_RUNNING);
^
include/linux/sched.h:204:3: note: expanded from macro '__set_current_state'
debug_normal_state_change((state_value)); \
^
include/linux/sched.h:136:2: note: expanded from macro 'debug_normal_state_change'
do { \
^
mm/memory.c:4943:2: note: Left side of '||' is false
__set_current_state(TASK_RUNNING);
^
include/linux/sched.h:205:3: note: expanded from macro '__set_current_state'
WRITE_ONCE(current->__state, (state_value)); \
^
include/asm-generic/rwonce.h:60:2: note: expanded from macro 'WRITE_ONCE'
compiletime_assert_rwonce_type(x); \
^
include/asm-generic/rwonce.h:36:21: note: expanded from macro 'compiletime_assert_rwonce_type'
compiletime_assert(__native_word(t) || sizeof(t) == sizeof(long long), \
^
include/linux/compiler_types.h:313:3: note: expanded from macro '__native_word'
(sizeof(t) == sizeof(char) || sizeof(t) == sizeof(short) || \
^
mm/memory.c:4943:2: note: Left side of '||' is false
__set_current_state(TASK_RUNNING);
^
include/linux/sched.h:205:3: note: expanded from macro '__set_current_state'
WRITE_ONCE(current->__state, (state_value)); \
^
include/asm-generic/rwonce.h:60:2: note: expanded from macro 'WRITE_ONCE'
compiletime_assert_rwonce_type(x); \
^
include/asm-generic/rwonce.h:36:21: note: expanded from macro 'compiletime_assert_rwonce_type'
compiletime_assert(__native_word(t) || sizeof(t) == sizeof(long long), \
^
include/linux/compiler_types.h:313:3: note: expanded from macro '__native_word'
(sizeof(t) == sizeof(char) || sizeof(t) == sizeof(short) || \
^
mm/memory.c:4943:2: note: Left side of '||' is true
__set_current_state(TASK_RUNNING);
^
include/linux/sched.h:205:3: note: expanded from macro '__set_current_state'
WRITE_ONCE(current->__state, (state_value)); \
^
include/asm-generic/rwonce.h:60:2: note: expanded from macro 'WRITE_ONCE'
compiletime_assert_rwonce_type(x); \
^
include/asm-generic/rwonce.h:36:21: note: expanded from macro 'compiletime_assert_rwonce_type'
compiletime_assert(__native_word(t) || sizeof(t) == sizeof(long long), \
^
include/linux/compiler_types.h:314:28: note: expanded from macro '__native_word'
sizeof(t) == sizeof(int) || sizeof(t) == sizeof(long))
^
mm/memory.c:4943:2: note: Taking false branch
__set_current_state(TASK_RUNNING);
^
include/linux/sched.h:205:3: note: expanded from macro '__set_current_state'
WRITE_ONCE(current->__state, (state_value)); \
^
include/asm-generic/rwonce.h:60:2: note: expanded from macro 'WRITE_ONCE'
compiletime_assert_rwonce_type(x); \
^
include/asm-generic/rwonce.h:36:2: note: expanded from macro 'compiletime_assert_rwonce_type'
compiletime_assert(__native_word(t) || sizeof(t) == sizeof(long long), \
^
include/linux/compiler_types.h:346:2: note: expanded from macro 'compiletime_assert'
_compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
^
include/linux/compiler_types.h:334:2: note: expanded from macro '_compiletime_assert'
__compiletime_assert(condition, msg, prefix, suffix)
^
include/linux/compiler_types.h:326:3: note: expanded from macro '__compiletime_assert'
if (!(condition)) \
^
mm/memory.c:4943:2: note: Loop condition is false. Exiting loop
__set_current_state(TASK_RUNNING);
vim +3876 mm/memory.c
^1da177e4c3f41 Linus Torvalds 2005-04-16 3808
^1da177e4c3f41 Linus Torvalds 2005-04-16 3809 /*
c1e8d7c6a7a682 Michel Lespinasse 2020-06-08 3810 * We enter with non-exclusive mmap_lock (to exclude vma changes,
8f4e2101fd7df9 Hugh Dickins 2005-10-29 3811 * but allow concurrent faults), and pte mapped but not yet locked.
c1e8d7c6a7a682 Michel Lespinasse 2020-06-08 3812 * We return with mmap_lock still held, but pte unmapped and unlocked.
^1da177e4c3f41 Linus Torvalds 2005-04-16 3813 */
2b7403035459c7 Souptick Joarder 2018-08-23 3814 static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
^1da177e4c3f41 Linus Torvalds 2005-04-16 3815 {
82b0f8c39a3869 Jan Kara 2016-12-14 3816 struct vm_area_struct *vma = vmf->vma;
e2bf0c1a3180a3 Michel Lespinasse 2022-01-28 3817 struct page *page = NULL;
2b7403035459c7 Souptick Joarder 2018-08-23 3818 vm_fault_t ret = 0;
^1da177e4c3f41 Linus Torvalds 2005-04-16 3819 pte_t entry;
^1da177e4c3f41 Linus Torvalds 2005-04-16 3820
6b7339f4c31ad6 Kirill A. Shutemov 2015-07-06 3821 /* File mapping without ->vm_ops ? */
6b7339f4c31ad6 Kirill A. Shutemov 2015-07-06 3822 if (vma->vm_flags & VM_SHARED)
6b7339f4c31ad6 Kirill A. Shutemov 2015-07-06 3823 return VM_FAULT_SIGBUS;
6b7339f4c31ad6 Kirill A. Shutemov 2015-07-06 3824
7267ec008b5cd8 Kirill A. Shutemov 2016-07-26 3825 /*
7267ec008b5cd8 Kirill A. Shutemov 2016-07-26 3826 * Use pte_alloc() instead of pte_alloc_map(). We can't run
7267ec008b5cd8 Kirill A. Shutemov 2016-07-26 3827 * pte_offset_map() on pmds where a huge pmd might be created
7267ec008b5cd8 Kirill A. Shutemov 2016-07-26 3828 * from a different thread.
7267ec008b5cd8 Kirill A. Shutemov 2016-07-26 3829 *
3e4e28c5a8f01e Michel Lespinasse 2020-06-08 3830 * pte_alloc_map() is safe to use under mmap_write_lock(mm) or when
7267ec008b5cd8 Kirill A. Shutemov 2016-07-26 3831 * parallel threads are excluded by other means.
7267ec008b5cd8 Kirill A. Shutemov 2016-07-26 3832 *
3e4e28c5a8f01e Michel Lespinasse 2020-06-08 3833 * Here we only have mmap_read_lock(mm).
7267ec008b5cd8 Kirill A. Shutemov 2016-07-26 3834 */
4cf58924951ef8 Joel Fernandes (Google 2019-01-03 3835) if (pte_alloc(vma->vm_mm, vmf->pmd))
7267ec008b5cd8 Kirill A. Shutemov 2016-07-26 3836 return VM_FAULT_OOM;
7267ec008b5cd8 Kirill A. Shutemov 2016-07-26 3837
2fce3f44868d85 Michel Lespinasse 2022-01-28 3838 /* See comment in __handle_mm_fault() */
82b0f8c39a3869 Jan Kara 2016-12-14 3839 if (unlikely(pmd_trans_unstable(vmf->pmd)))
7267ec008b5cd8 Kirill A. Shutemov 2016-07-26 3840 return 0;
7267ec008b5cd8 Kirill A. Shutemov 2016-07-26 3841
11ac552477e328 Linus Torvalds 2010-08-14 3842 /* Use the zero-page for reads */
82b0f8c39a3869 Jan Kara 2016-12-14 3843 if (!(vmf->flags & FAULT_FLAG_WRITE) &&
bae473a423f65e Kirill A. Shutemov 2016-07-26 3844 !mm_forbids_zeropage(vma->vm_mm)) {
82b0f8c39a3869 Jan Kara 2016-12-14 3845 entry = pte_mkspecial(pfn_pte(my_zero_pfn(vmf->address),
62eede62dafb4a Hugh Dickins 2009-09-21 3846 vma->vm_page_prot));
e2bf0c1a3180a3 Michel Lespinasse 2022-01-28 3847 } else {
^1da177e4c3f41 Linus Torvalds 2005-04-16 3848 /* Allocate our own private page. */
fa5331bae2e49c Michel Lespinasse 2022-01-28 3849 if (unlikely(!vma->anon_vma)) {
fa5331bae2e49c Michel Lespinasse 2022-01-28 3850 if (vmf->flags & FAULT_FLAG_SPECULATIVE)
fa5331bae2e49c Michel Lespinasse 2022-01-28 3851 return VM_FAULT_RETRY;
fa5331bae2e49c Michel Lespinasse 2022-01-28 3852 if (__anon_vma_prepare(vma))
65500d234e74fc Hugh Dickins 2005-10-29 3853 goto oom;
fa5331bae2e49c Michel Lespinasse 2022-01-28 3854 }
82b0f8c39a3869 Jan Kara 2016-12-14 3855 page = alloc_zeroed_user_highpage_movable(vma, vmf->address);
^1da177e4c3f41 Linus Torvalds 2005-04-16 3856 if (!page)
65500d234e74fc Hugh Dickins 2005-10-29 3857 goto oom;
eb3c24f305e56c Mel Gorman 2015-06-24 3858
8f425e4ed0eb3e Matthew Wilcox (Oracle 2021-06-25 3859) if (mem_cgroup_charge(page_folio(page), vma->vm_mm, GFP_KERNEL))
eb3c24f305e56c Mel Gorman 2015-06-24 3860 goto oom_free_page;
9d82c69438d0df Johannes Weiner 2020-06-03 3861 cgroup_throttle_swaprate(page, GFP_KERNEL);
eb3c24f305e56c Mel Gorman 2015-06-24 3862
52f37629fd3c7b Minchan Kim 2013-04-29 3863 /*
52f37629fd3c7b Minchan Kim 2013-04-29 3864 * The memory barrier inside __SetPageUptodate makes sure that
f4f5329d453704 Wei Yang 2019-11-30 3865 * preceding stores to the page contents become visible before
52f37629fd3c7b Minchan Kim 2013-04-29 3866 * the set_pte_at() write.
52f37629fd3c7b Minchan Kim 2013-04-29 3867 */
0ed361dec36945 Nicholas Piggin 2008-02-04 3868 __SetPageUptodate(page);
^1da177e4c3f41 Linus Torvalds 2005-04-16 3869
65500d234e74fc Hugh Dickins 2005-10-29 3870 entry = mk_pte(page, vma->vm_page_prot);
50c25ee97cf6ab Thomas Bogendoerfer 2021-06-04 3871 entry = pte_sw_mkyoung(entry);
1ac0cb5d0e22d5 Hugh Dickins 2009-09-21 3872 if (vma->vm_flags & VM_WRITE)
1ac0cb5d0e22d5 Hugh Dickins 2009-09-21 3873 entry = pte_mkwrite(pte_mkdirty(entry));
e2bf0c1a3180a3 Michel Lespinasse 2022-01-28 3874 }
8f4e2101fd7df9 Hugh Dickins 2005-10-29 3875
fa5331bae2e49c Michel Lespinasse 2022-01-28 @3876 if (!pte_map_lock(vmf)) {
fa5331bae2e49c Michel Lespinasse 2022-01-28 3877 ret = VM_FAULT_RETRY;
fa5331bae2e49c Michel Lespinasse 2022-01-28 3878 goto release;
fa5331bae2e49c Michel Lespinasse 2022-01-28 3879 }
7df676974359f9 Bibo Mao 2020-05-27 3880 if (!pte_none(*vmf->pte)) {
45ee1834760b3b Michel Lespinasse 2022-01-28 3881 update_mmu_tlb(vma, vmf->address, vmf->pte);
e2bf0c1a3180a3 Michel Lespinasse 2022-01-28 3882 goto unlock;
7df676974359f9 Bibo Mao 2020-05-27 3883 }
9ba6929480088a Hugh Dickins 2009-09-21 3884
6b31d5955cb29a Michal Hocko 2017-08-18 3885 ret = check_stable_address_space(vma->vm_mm);
6b31d5955cb29a Michal Hocko 2017-08-18 3886 if (ret)
e2bf0c1a3180a3 Michel Lespinasse 2022-01-28 3887 goto unlock;
6b31d5955cb29a Michal Hocko 2017-08-18 3888
6b251fc96cf2cd Andrea Arcangeli 2015-09-04 3889 /* Deliver the page fault to userland, check inside PT lock */
6b251fc96cf2cd Andrea Arcangeli 2015-09-04 3890 if (userfaultfd_missing(vma)) {
82b0f8c39a3869 Jan Kara 2016-12-14 3891 pte_unmap_unlock(vmf->pte, vmf->ptl);
e2bf0c1a3180a3 Michel Lespinasse 2022-01-28 3892 if (page)
09cbfeaf1a5a67 Kirill A. Shutemov 2016-04-01 3893 put_page(page);
fa5331bae2e49c Michel Lespinasse 2022-01-28 3894 if (vmf->flags & FAULT_FLAG_SPECULATIVE)
fa5331bae2e49c Michel Lespinasse 2022-01-28 3895 return VM_FAULT_RETRY;
82b0f8c39a3869 Jan Kara 2016-12-14 3896 return handle_userfault(vmf, VM_UFFD_MISSING);
6b251fc96cf2cd Andrea Arcangeli 2015-09-04 3897 }
6b251fc96cf2cd Andrea Arcangeli 2015-09-04 3898
e2bf0c1a3180a3 Michel Lespinasse 2022-01-28 3899 if (page) {
bae473a423f65e Kirill A. Shutemov 2016-07-26 3900 inc_mm_counter_fast(vma->vm_mm, MM_ANONPAGES);
82b0f8c39a3869 Jan Kara 2016-12-14 3901 page_add_new_anon_rmap(page, vma, vmf->address, false);
b518154e59aab3 Joonsoo Kim 2020-08-11 3902 lru_cache_add_inactive_or_unevictable(page, vma);
e2bf0c1a3180a3 Michel Lespinasse 2022-01-28 3903 }
e2bf0c1a3180a3 Michel Lespinasse 2022-01-28 3904
82b0f8c39a3869 Jan Kara 2016-12-14 3905 set_pte_at(vma->vm_mm, vmf->address, vmf->pte, entry);
^1da177e4c3f41 Linus Torvalds 2005-04-16 3906
^1da177e4c3f41 Linus Torvalds 2005-04-16 3907 /* No need to invalidate - it was non-present before */
82b0f8c39a3869 Jan Kara 2016-12-14 3908 update_mmu_cache(vma, vmf->address, vmf->pte);
e2bf0c1a3180a3 Michel Lespinasse 2022-01-28 3909 pte_unmap_unlock(vmf->pte, vmf->ptl);
e2bf0c1a3180a3 Michel Lespinasse 2022-01-28 3910 return 0;
65500d234e74fc Hugh Dickins 2005-10-29 3911 unlock:
82b0f8c39a3869 Jan Kara 2016-12-14 3912 pte_unmap_unlock(vmf->pte, vmf->ptl);
fa5331bae2e49c Michel Lespinasse 2022-01-28 3913 release:
e2bf0c1a3180a3 Michel Lespinasse 2022-01-28 3914 if (page)
09cbfeaf1a5a67 Kirill A. Shutemov 2016-04-01 3915 put_page(page);
e2bf0c1a3180a3 Michel Lespinasse 2022-01-28 3916 return ret;
8a9f3ccd24741b Balbir Singh 2008-02-07 3917 oom_free_page:
09cbfeaf1a5a67 Kirill A. Shutemov 2016-04-01 3918 put_page(page);
65500d234e74fc Hugh Dickins 2005-10-29 3919 oom:
^1da177e4c3f41 Linus Torvalds 2005-04-16 3920 return VM_FAULT_OOM;
^1da177e4c3f41 Linus Torvalds 2005-04-16 3921 }
^1da177e4c3f41 Linus Torvalds 2005-04-16 3922
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v2 18/35] mm: implement speculative handling in do_anonymous_page()
2022-01-28 21:03 ` kernel test robot
@ 2022-01-28 22:08 ` Michel Lespinasse
-1 siblings, 0 replies; 6+ messages in thread
From: Michel Lespinasse @ 2022-01-28 22:08 UTC (permalink / raw)
To: kernel test robot
Cc: Michel Lespinasse, Linux-MM, linux-kernel, Andrew Morton, llvm,
kbuild-all, kernel-team, Laurent Dufour, Jerome Glisse,
Peter Zijlstra, Michal Hocko, Vlastimil Babka, Davidlohr Bueso
On Sat, Jan 29, 2022 at 05:03:53AM +0800, kernel test robot wrote:
> >> mm/memory.c:3876:20: warning: variable 'vmf' is uninitialized when used within its own initialization [-Wuninitialized]
> if (!pte_map_lock(vmf)) {
> ~~~~~~~~~~~~~^~~~
> include/linux/mm.h:3418:25: note: expanded from macro 'pte_map_lock'
> struct vm_fault *vmf = __vmf; \
> ~~~ ^~~~~
> 1 warning generated.
Ah, that's interesting - this works with gcc, but breaks with clang.
The following amended patch should fix this:
(I only added underscores to the pte_map_lock and pte_spinlock macros)
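For readers less familiar with the failure mode, here is a minimal user-space sketch of the macro-shadowing problem and of the renaming fix (the struct and macro bodies below are illustrative stand-ins, not the kernel's actual definitions):

```c
/* Minimal stand-in for struct vm_fault; illustrative only. */
struct vm_fault { int pte; };

/*
 * Broken pattern (what the original macro did): the inner local was
 * named plain "vmf", the same name callers pass in.  pte_map_lock(vmf)
 * then expanded to "struct vm_fault *vmf = vmf;" - the inner vmf
 * shadows the caller's variable and is initialized from itself.  gcc
 * accepted this silently; clang diagnoses it with -Wuninitialized.
 *
 * #define pte_map_lock(__vmf)                    \
 * ({                                             \
 *         struct vm_fault *vmf = __vmf;          \
 *         ...                                    \
 * })
 */

/* Fixed pattern: both inner names differ from anything callers use. */
#define pte_map_lock(____vmf)					\
({								\
	struct vm_fault *__vmf = (____vmf);			\
	__vmf->pte = 1;	/* stands in for the real map + lock */	\
	1;		/* "true": lock acquired */		\
})

/* Mimics a caller such as do_anonymous_page(), whose local is "vmf". */
int demo(void)
{
	struct vm_fault f = { 0 };
	struct vm_fault *vmf = &f;

	if (!pte_map_lock(vmf))
		return -1;
	return f.pte;	/* the macro wrote through the caller's pointer */
}
```

With the fixed naming, the expansion refers to the caller's `vmf` as intended instead of declaring a shadowing self-initialized local.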
------------------------------------ 8< ---------------------------------
mm: add pte_map_lock() and pte_spinlock()
pte_map_lock() and pte_spinlock() are used by fault handlers to ensure
the pte is mapped and locked before they commit the faulted page to the
mm's address space at the end of the fault.
The functions differ in their preconditions; pte_map_lock() expects
the pte to be unmapped prior to the call, while pte_spinlock() expects
it to be already mapped.
In the speculative fault case, the functions verify, after locking the pte,
that the mmap sequence count has not changed since the start of the fault,
and thus that no mmap lock writers have been running concurrently with
the fault. After that point the page table lock serializes any further
races with concurrent mmap lock writers.
If the mmap sequence count check fails, both functions will return false
with the pte being left unmapped and unlocked.
Signed-off-by: Michel Lespinasse <michel@lespinasse.org>
---
include/linux/mm.h | 38 ++++++++++++++++++++++++++
mm/memory.c | 66 ++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 104 insertions(+)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 2e2122bd3da3..80894db6f01a 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3394,5 +3394,43 @@ madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
}
#endif
+#ifdef CONFIG_MMU
+#ifdef CONFIG_SPECULATIVE_PAGE_FAULT
+
+bool __pte_map_lock(struct vm_fault *vmf);
+
+static inline bool pte_map_lock(struct vm_fault *vmf)
+{
+ VM_BUG_ON(vmf->pte);
+ return __pte_map_lock(vmf);
+}
+
+static inline bool pte_spinlock(struct vm_fault *vmf)
+{
+ VM_BUG_ON(!vmf->pte);
+ return __pte_map_lock(vmf);
+}
+
+#else /* !CONFIG_SPECULATIVE_PAGE_FAULT */
+
+#define pte_map_lock(____vmf) \
+({ \
+ struct vm_fault *__vmf = ____vmf; \
+ __vmf->pte = pte_offset_map_lock(__vmf->vma->vm_mm, __vmf->pmd, \
+ __vmf->address, &__vmf->ptl); \
+ true; \
+})
+
+#define pte_spinlock(____vmf) \
+({ \
+ struct vm_fault *__vmf = ____vmf; \
+ __vmf->ptl = pte_lockptr(__vmf->vma->vm_mm, __vmf->pmd); \
+ spin_lock(__vmf->ptl); \
+ true; \
+})
+
+#endif /* CONFIG_SPECULATIVE_PAGE_FAULT */
+#endif /* CONFIG_MMU */
+
#endif /* __KERNEL__ */
#endif /* _LINUX_MM_H */
diff --git a/mm/memory.c b/mm/memory.c
index d0db10bd5bee..1ce837e47395 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2745,6 +2745,72 @@ EXPORT_SYMBOL_GPL(apply_to_existing_page_range);
#define speculative_page_walk_end() local_irq_enable()
#endif
+bool __pte_map_lock(struct vm_fault *vmf)
+{
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+ pmd_t pmdval;
+#endif
+ pte_t *pte = vmf->pte;
+ spinlock_t *ptl;
+
+ if (!(vmf->flags & FAULT_FLAG_SPECULATIVE)) {
+ vmf->ptl = pte_lockptr(vmf->vma->vm_mm, vmf->pmd);
+ if (!pte)
+ vmf->pte = pte_offset_map(vmf->pmd, vmf->address);
+ spin_lock(vmf->ptl);
+ return true;
+ }
+
+ speculative_page_walk_begin();
+ if (!mmap_seq_read_check(vmf->vma->vm_mm, vmf->seq))
+ goto fail;
+ /*
+ * The mmap sequence count check guarantees that the page
+ * tables are still valid at that point, and
+ * speculative_page_walk_begin() ensures that they stay around.
+ */
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+ /*
+ * We check that the pmd value is still the same, to ensure that no
+ * huge page collapse operation is in progress behind our back.
+ */
+ pmdval = READ_ONCE(*vmf->pmd);
+ if (!pmd_same(pmdval, vmf->orig_pmd))
+ goto fail;
+#endif
+ ptl = pte_lockptr(vmf->vma->vm_mm, vmf->pmd);
+ if (!pte)
+ pte = pte_offset_map(vmf->pmd, vmf->address);
+ /*
+ * Try locking the page table.
+ *
+ * Note that we might race against zap_pte_range() which
+ * invalidates TLBs while holding the page table lock.
+ * We are still under the speculative_page_walk_begin() section,
+ * and zap_pte_range() could thus deadlock with us if we tried
+ * using spin_lock() here.
+ *
+ * We also don't want to retry until spin_trylock() succeeds,
+ * because of the starvation potential against a stream of lockers.
+ */
+ if (unlikely(!spin_trylock(ptl)))
+ goto fail;
+ if (!mmap_seq_read_check(vmf->vma->vm_mm, vmf->seq))
+ goto unlock_fail;
+ speculative_page_walk_end();
+ vmf->pte = pte;
+ vmf->ptl = ptl;
+ return true;
+
+unlock_fail:
+ spin_unlock(ptl);
+fail:
+ if (pte)
+ pte_unmap(pte);
+ speculative_page_walk_end();
+ return false;
+}
+
#endif /* CONFIG_SPECULATIVE_PAGE_FAULT */
/*
--
2.20.1
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH v2 18/35] mm: implement speculative handling in do_anonymous_page()
2022-01-28 13:09 ` [PATCH v2 18/35] mm: implement speculative handling in do_anonymous_page() Michel Lespinasse
@ 2022-01-28 21:03 ` kernel test robot
0 siblings, 0 replies; 6+ messages in thread
From: kernel test robot @ 2022-01-28 21:03 UTC (permalink / raw)
To: Michel Lespinasse, Linux-MM, linux-kernel, Andrew Morton
Cc: llvm, kbuild-all, kernel-team, Laurent Dufour, Jerome Glisse,
Peter Zijlstra, Michal Hocko, Vlastimil Babka, Davidlohr Bueso
Hi Michel,
Thank you for the patch! Perhaps something to improve:
[auto build test WARNING on linus/master]
[also build test WARNING on v5.17-rc1 next-20220128]
[cannot apply to tip/x86/mm arm64/for-next/core powerpc/next hnaz-mm/master]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patches, we suggest using '--base' as documented in
https://git-scm.com/docs/git-format-patch]
url: https://github.com/0day-ci/linux/commits/Michel-Lespinasse/Speculative-page-faults/20220128-212122
base: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 145d9b498fc827b79c1260b4caa29a8e59d4c2b9
config: arm-vt8500_v6_v7_defconfig (https://download.01.org/0day-ci/archive/20220129/202201290445.uKuWeLmf-lkp@intel.com/config)
compiler: clang version 14.0.0 (https://github.com/llvm/llvm-project 33b45ee44b1f32ffdbc995e6fec806271b4b3ba4)
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# install arm cross compiling tool for clang build
# apt-get install binutils-arm-linux-gnueabi
# https://github.com/0day-ci/linux/commit/fa5331bae2e49ce86eff959390b451b7401f9156
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Michel-Lespinasse/Speculative-page-faults/20220128-212122
git checkout fa5331bae2e49ce86eff959390b451b7401f9156
# save the config file to linux build tree
mkdir build_dir
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=arm SHELL=/bin/bash
If you fix the issue, kindly add the following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>
All warnings (new ones prefixed by >>):
>> mm/memory.c:3876:20: warning: variable 'vmf' is uninitialized when used within its own initialization [-Wuninitialized]
if (!pte_map_lock(vmf)) {
~~~~~~~~~~~~~^~~~
include/linux/mm.h:3418:25: note: expanded from macro 'pte_map_lock'
struct vm_fault *vmf = __vmf; \
~~~ ^~~~~
1 warning generated.
vim +/vmf +3876 mm/memory.c
3808
3809 /*
3810 * We enter with non-exclusive mmap_lock (to exclude vma changes,
3811 * but allow concurrent faults), and pte mapped but not yet locked.
3812 * We return with mmap_lock still held, but pte unmapped and unlocked.
3813 */
3814 static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
3815 {
3816 struct vm_area_struct *vma = vmf->vma;
3817 struct page *page = NULL;
3818 vm_fault_t ret = 0;
3819 pte_t entry;
3820
3821 /* File mapping without ->vm_ops ? */
3822 if (vma->vm_flags & VM_SHARED)
3823 return VM_FAULT_SIGBUS;
3824
3825 /*
3826 * Use pte_alloc() instead of pte_alloc_map(). We can't run
3827 * pte_offset_map() on pmds where a huge pmd might be created
3828 * from a different thread.
3829 *
3830 * pte_alloc_map() is safe to use under mmap_write_lock(mm) or when
3831 * parallel threads are excluded by other means.
3832 *
3833 * Here we only have mmap_read_lock(mm).
3834 */
3835 if (pte_alloc(vma->vm_mm, vmf->pmd))
3836 return VM_FAULT_OOM;
3837
3838 /* See comment in __handle_mm_fault() */
3839 if (unlikely(pmd_trans_unstable(vmf->pmd)))
3840 return 0;
3841
3842 /* Use the zero-page for reads */
3843 if (!(vmf->flags & FAULT_FLAG_WRITE) &&
3844 !mm_forbids_zeropage(vma->vm_mm)) {
3845 entry = pte_mkspecial(pfn_pte(my_zero_pfn(vmf->address),
3846 vma->vm_page_prot));
3847 } else {
3848 /* Allocate our own private page. */
3849 if (unlikely(!vma->anon_vma)) {
3850 if (vmf->flags & FAULT_FLAG_SPECULATIVE)
3851 return VM_FAULT_RETRY;
3852 if (__anon_vma_prepare(vma))
3853 goto oom;
3854 }
3855 page = alloc_zeroed_user_highpage_movable(vma, vmf->address);
3856 if (!page)
3857 goto oom;
3858
3859 if (mem_cgroup_charge(page_folio(page), vma->vm_mm, GFP_KERNEL))
3860 goto oom_free_page;
3861 cgroup_throttle_swaprate(page, GFP_KERNEL);
3862
3863 /*
3864 * The memory barrier inside __SetPageUptodate makes sure that
3865 * preceding stores to the page contents become visible before
3866 * the set_pte_at() write.
3867 */
3868 __SetPageUptodate(page);
3869
3870 entry = mk_pte(page, vma->vm_page_prot);
3871 entry = pte_sw_mkyoung(entry);
3872 if (vma->vm_flags & VM_WRITE)
3873 entry = pte_mkwrite(pte_mkdirty(entry));
3874 }
3875
> 3876 if (!pte_map_lock(vmf)) {
3877 ret = VM_FAULT_RETRY;
3878 goto release;
3879 }
3880 if (!pte_none(*vmf->pte)) {
3881 update_mmu_tlb(vma, vmf->address, vmf->pte);
3882 goto unlock;
3883 }
3884
3885 ret = check_stable_address_space(vma->vm_mm);
3886 if (ret)
3887 goto unlock;
3888
3889 /* Deliver the page fault to userland, check inside PT lock */
3890 if (userfaultfd_missing(vma)) {
3891 pte_unmap_unlock(vmf->pte, vmf->ptl);
3892 if (page)
3893 put_page(page);
3894 if (vmf->flags & FAULT_FLAG_SPECULATIVE)
3895 return VM_FAULT_RETRY;
3896 return handle_userfault(vmf, VM_UFFD_MISSING);
3897 }
3898
3899 if (page) {
3900 inc_mm_counter_fast(vma->vm_mm, MM_ANONPAGES);
3901 page_add_new_anon_rmap(page, vma, vmf->address, false);
3902 lru_cache_add_inactive_or_unevictable(page, vma);
3903 }
3904
3905 set_pte_at(vma->vm_mm, vmf->address, vmf->pte, entry);
3906
3907 /* No need to invalidate - it was non-present before */
3908 update_mmu_cache(vma, vmf->address, vmf->pte);
3909 pte_unmap_unlock(vmf->pte, vmf->ptl);
3910 return 0;
3911 unlock:
3912 pte_unmap_unlock(vmf->pte, vmf->ptl);
3913 release:
3914 if (page)
3915 put_page(page);
3916 return ret;
3917 oom_free_page:
3918 put_page(page);
3919 oom:
3920 return VM_FAULT_OOM;
3921 }
3922
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
* Re: [PATCH v2 18/35] mm: implement speculative handling in do_anonymous_page()
@ 2022-01-28 21:03 ` kernel test robot
0 siblings, 0 replies; 6+ messages in thread
From: kernel test robot @ 2022-01-28 21:03 UTC (permalink / raw)
To: kbuild-all
* [PATCH v2 18/35] mm: implement speculative handling in do_anonymous_page()
2022-01-28 13:09 [PATCH v2 00/35] Speculative page faults Michel Lespinasse
@ 2022-01-28 13:09 ` Michel Lespinasse
2022-01-28 21:03 ` kernel test robot
0 siblings, 1 reply; 6+ messages in thread
From: Michel Lespinasse @ 2022-01-28 13:09 UTC (permalink / raw)
To: Linux-MM, linux-kernel, Andrew Morton
Cc: kernel-team, Laurent Dufour, Jerome Glisse, Peter Zijlstra,
Michal Hocko, Vlastimil Babka, Davidlohr Bueso, Matthew Wilcox,
Liam Howlett, Rik van Riel, Paul McKenney, Song Liu,
Suren Baghdasaryan, Minchan Kim, Joel Fernandes, David Rientjes,
Axel Rasmussen, Andy Lutomirski, Michel Lespinasse
Change do_anonymous_page() to handle the speculative case.
This involves aborting speculative faults if they have to allocate a new
anon_vma, and using pte_map_lock() instead of pte_offset_map_lock()
to complete the page fault.
Signed-off-by: Michel Lespinasse <michel@lespinasse.org>
---
mm/memory.c | 17 +++++++++++++----
1 file changed, 13 insertions(+), 4 deletions(-)
diff --git a/mm/memory.c b/mm/memory.c
index 1ce837e47395..8d036140634d 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3846,8 +3846,12 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
vma->vm_page_prot));
} else {
/* Allocate our own private page. */
- if (unlikely(anon_vma_prepare(vma)))
- goto oom;
+ if (unlikely(!vma->anon_vma)) {
+ if (vmf->flags & FAULT_FLAG_SPECULATIVE)
+ return VM_FAULT_RETRY;
+ if (__anon_vma_prepare(vma))
+ goto oom;
+ }
page = alloc_zeroed_user_highpage_movable(vma, vmf->address);
if (!page)
goto oom;
@@ -3869,8 +3873,10 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
entry = pte_mkwrite(pte_mkdirty(entry));
}
- vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, vmf->address,
- &vmf->ptl);
+ if (!pte_map_lock(vmf)) {
+ ret = VM_FAULT_RETRY;
+ goto release;
+ }
if (!pte_none(*vmf->pte)) {
update_mmu_tlb(vma, vmf->address, vmf->pte);
goto unlock;
@@ -3885,6 +3891,8 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
pte_unmap_unlock(vmf->pte, vmf->ptl);
if (page)
put_page(page);
+ if (vmf->flags & FAULT_FLAG_SPECULATIVE)
+ return VM_FAULT_RETRY;
return handle_userfault(vmf, VM_UFFD_MISSING);
}
@@ -3902,6 +3910,7 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
return 0;
unlock:
pte_unmap_unlock(vmf->pte, vmf->ptl);
+release:
if (page)
put_page(page);
return ret;
--
2.20.1
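The contract the hunk above relies on can be sketched in a simplified, hypothetical model: a speculative fault must not allocate an anon_vma (that would need mmap_lock), so it bails out with VM_FAULT_RETRY and the caller redoes the fault conventionally. All names and values below are stand-ins, not the kernel's:

```c
#include <assert.h>

enum {
        VM_FAULT_RETRY         = 0x1,
        FAULT_FLAG_SPECULATIVE = 0x2,
};

struct fault_ctx {
        int flags;
        int have_anon_vma;      /* stands in for vma->anon_vma != NULL */
};

static int do_anon_fault(struct fault_ctx *ctx)
{
        if (!ctx->have_anon_vma) {
                if (ctx->flags & FAULT_FLAG_SPECULATIVE)
                        return VM_FAULT_RETRY;  /* abort: no allocation here */
                ctx->have_anon_vma = 1;         /* slow path may allocate */
        }
        return 0;                               /* fault handled */
}

static int handle_fault(struct fault_ctx *ctx)
{
        int ret = do_anon_fault(ctx);

        /* On a speculative bail-out, retry non-speculatively, now
         * allowed to take locks and allocate the anon_vma. */
        if ((ret & VM_FAULT_RETRY) && (ctx->flags & FAULT_FLAG_SPECULATIVE)) {
                ctx->flags &= ~FAULT_FLAG_SPECULATIVE;
                ret = do_anon_fault(ctx);
        }
        return ret;
}
```

The same shape explains the pte_map_lock() change: where pte_offset_map_lock() always succeeds, pte_map_lock() may fail under speculation, and the new `release` label unwinds the already-allocated page before returning VM_FAULT_RETRY.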
end of thread, other threads:[~2022-01-30 18:03 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-01-30 18:03 [PATCH v2 18/35] mm: implement speculative handling in do_anonymous_page() kernel test robot
-- strict thread matches above, loose matches on Subject: below --
2022-01-28 13:09 [PATCH v2 00/35] Speculative page faults Michel Lespinasse
2022-01-28 13:09 ` [PATCH v2 18/35] mm: implement speculative handling in do_anonymous_page() Michel Lespinasse
2022-01-28 21:03 ` kernel test robot
2022-01-28 22:08 ` Michel Lespinasse