From: Miaohe Lin <linmiaohe@huawei.com>
To: kbuild-all@lists.01.org
Subject: Re: [PATCH 2/5] swap: fix do_swap_page() race with swapoff
Date: Fri, 09 Apr 2021 16:46:06 +0800 [thread overview]
Message-ID: <46cd2086-cb4a-cda1-171b-5c10b30f92f5@huawei.com> (raw)
In-Reply-To: <202104090509.oCjbls7L-lkp@intel.com>
[-- Attachment #1: Type: text/plain, Size: 12132 bytes --]
On 2021/4/9 5:37, kernel test robot wrote:
> Hi Miaohe,
>
> Thank you for the patch! Yet something to improve:
>
> [auto build test ERROR on linux/master]
> [also build test ERROR on linus/master hnaz-linux-mm/master v5.12-rc6 next-20210408]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://git-scm.com/docs/git-format-patch]
>
> url: https://github.com/0day-ci/linux/commits/Miaohe-Lin/close-various-race-windows-for-swap/20210408-211224
> base: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 5e46d1b78a03d52306f21f77a4e4a144b6d31486
> config: x86_64-randconfig-a012-20210408 (attached as .config)
> compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project 56ea2e2fdd691136d5e6631fa0e447173694b82c)
> reproduce (this is a W=1 build):
> wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
> chmod +x ~/bin/make.cross
> # install x86_64 cross compiling tool for clang build
> # apt-get install binutils-x86-64-linux-gnu
> # https://github.com/0day-ci/linux/commit/56e65e21c8c9858e36c3bca84006a15fe9b85efd
> git remote add linux-review https://github.com/0day-ci/linux
> git fetch --no-tags linux-review Miaohe-Lin/close-various-race-windows-for-swap/20210408-211224
> git checkout 56e65e21c8c9858e36c3bca84006a15fe9b85efd
> # save the attached .config to linux build tree
> COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=x86_64
>
> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kernel test robot <lkp@intel.com>
>
> All error/warnings (new ones prefixed by >>):
>
>>> mm/memory.c:3300:7: error: implicit declaration of function 'get_swap_device' [-Werror,-Wimplicit-function-declaration]
> si = get_swap_device(entry);
> ^
> mm/memory.c:3300:7: note: did you mean 'get_cpu_device'?
> include/linux/cpu.h:38:23: note: 'get_cpu_device' declared here
> extern struct device *get_cpu_device(unsigned cpu);
> ^
>>> mm/memory.c:3300:5: warning: incompatible integer to pointer conversion assigning to 'struct swap_info_struct *' from 'int' [-Wint-conversion]
> si = get_swap_device(entry);
> ^ ~~~~~~~~~~~~~~~~~~~~~~
>>> mm/memory.c:3483:3: error: implicit declaration of function 'put_swap_device' [-Werror,-Wimplicit-function-declaration]
> put_swap_device(si);
> ^
> mm/memory.c:3483:3: note: did you mean 'get_swap_device'?
> mm/memory.c:3300:7: note: 'get_swap_device' declared here
> si = get_swap_device(entry);
> ^
> 1 warning and 2 errors generated.
>
Many thanks. Will fix it.
>
> vim +/get_swap_device +3300 mm/memory.c
>
> 3258
> 3259 /*
> 3260 * We enter with non-exclusive mmap_lock (to exclude vma changes,
> 3261 * but allow concurrent faults), and pte mapped but not yet locked.
> 3262 * We return with pte unmapped and unlocked.
> 3263 *
> 3264 * We return with the mmap_lock locked or unlocked in the same cases
> 3265 * as does filemap_fault().
> 3266 */
> 3267 vm_fault_t do_swap_page(struct vm_fault *vmf)
> 3268 {
> 3269 struct vm_area_struct *vma = vmf->vma;
> 3270 struct page *page = NULL, *swapcache;
> 3271 struct swap_info_struct *si = NULL;
> 3272 swp_entry_t entry;
> 3273 pte_t pte;
> 3274 int locked;
> 3275 int exclusive = 0;
> 3276 vm_fault_t ret = 0;
> 3277 void *shadow = NULL;
> 3278
> 3279 if (!pte_unmap_same(vma->vm_mm, vmf->pmd, vmf->pte, vmf->orig_pte))
> 3280 goto out;
> 3281
> 3282 entry = pte_to_swp_entry(vmf->orig_pte);
> 3283 if (unlikely(non_swap_entry(entry))) {
> 3284 if (is_migration_entry(entry)) {
> 3285 migration_entry_wait(vma->vm_mm, vmf->pmd,
> 3286 vmf->address);
> 3287 } else if (is_device_private_entry(entry)) {
> 3288 vmf->page = device_private_entry_to_page(entry);
> 3289 ret = vmf->page->pgmap->ops->migrate_to_ram(vmf);
> 3290 } else if (is_hwpoison_entry(entry)) {
> 3291 ret = VM_FAULT_HWPOISON;
> 3292 } else {
> 3293 print_bad_pte(vma, vmf->address, vmf->orig_pte, NULL);
> 3294 ret = VM_FAULT_SIGBUS;
> 3295 }
> 3296 goto out;
> 3297 }
> 3298
> 3299
>> 3300 si = get_swap_device(entry);
> 3301 /* In case we raced with swapoff. */
> 3302 if (unlikely(!si))
> 3303 goto out;
> 3304
> 3305 delayacct_set_flag(DELAYACCT_PF_SWAPIN);
> 3306 page = lookup_swap_cache(entry, vma, vmf->address);
> 3307 swapcache = page;
> 3308
> 3309 if (!page) {
> 3310 struct swap_info_struct *si = swp_swap_info(entry);
> 3311
> 3312 if (data_race(si->flags & SWP_SYNCHRONOUS_IO) &&
> 3313 __swap_count(entry) == 1) {
> 3314 /* skip swapcache */
> 3315 page = alloc_page_vma(GFP_HIGHUSER_MOVABLE, vma,
> 3316 vmf->address);
> 3317 if (page) {
> 3318 int err;
> 3319
> 3320 __SetPageLocked(page);
> 3321 __SetPageSwapBacked(page);
> 3322 set_page_private(page, entry.val);
> 3323
> 3324 /* Tell memcg to use swap ownership records */
> 3325 SetPageSwapCache(page);
> 3326 err = mem_cgroup_charge(page, vma->vm_mm,
> 3327 GFP_KERNEL);
> 3328 ClearPageSwapCache(page);
> 3329 if (err) {
> 3330 ret = VM_FAULT_OOM;
> 3331 goto out_page;
> 3332 }
> 3333
> 3334 shadow = get_shadow_from_swap_cache(entry);
> 3335 if (shadow)
> 3336 workingset_refault(page, shadow);
> 3337
> 3338 lru_cache_add(page);
> 3339 swap_readpage(page, true);
> 3340 }
> 3341 } else {
> 3342 page = swapin_readahead(entry, GFP_HIGHUSER_MOVABLE,
> 3343 vmf);
> 3344 swapcache = page;
> 3345 }
> 3346
> 3347 if (!page) {
> 3348 /*
> 3349 * Back out if somebody else faulted in this pte
> 3350 * while we released the pte lock.
> 3351 */
> 3352 vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd,
> 3353 vmf->address, &vmf->ptl);
> 3354 if (likely(pte_same(*vmf->pte, vmf->orig_pte)))
> 3355 ret = VM_FAULT_OOM;
> 3356 delayacct_clear_flag(DELAYACCT_PF_SWAPIN);
> 3357 goto unlock;
> 3358 }
> 3359
> 3360 /* Had to read the page from swap area: Major fault */
> 3361 ret = VM_FAULT_MAJOR;
> 3362 count_vm_event(PGMAJFAULT);
> 3363 count_memcg_event_mm(vma->vm_mm, PGMAJFAULT);
> 3364 } else if (PageHWPoison(page)) {
> 3365 /*
> 3366 * hwpoisoned dirty swapcache pages are kept for killing
> 3367 * owner processes (which may be unknown at hwpoison time)
> 3368 */
> 3369 ret = VM_FAULT_HWPOISON;
> 3370 delayacct_clear_flag(DELAYACCT_PF_SWAPIN);
> 3371 goto out_release;
> 3372 }
> 3373
> 3374 locked = lock_page_or_retry(page, vma->vm_mm, vmf->flags);
> 3375
> 3376 delayacct_clear_flag(DELAYACCT_PF_SWAPIN);
> 3377 if (!locked) {
> 3378 ret |= VM_FAULT_RETRY;
> 3379 goto out_release;
> 3380 }
> 3381
> 3382 /*
> 3383 * Make sure try_to_free_swap or reuse_swap_page or swapoff did not
> 3384 * release the swapcache from under us. The page pin, and pte_same
> 3385 * test below, are not enough to exclude that. Even if it is still
> 3386 * swapcache, we need to check that the page's swap has not changed.
> 3387 */
> 3388 if (unlikely((!PageSwapCache(page) ||
> 3389 page_private(page) != entry.val)) && swapcache)
> 3390 goto out_page;
> 3391
> 3392 page = ksm_might_need_to_copy(page, vma, vmf->address);
> 3393 if (unlikely(!page)) {
> 3394 ret = VM_FAULT_OOM;
> 3395 page = swapcache;
> 3396 goto out_page;
> 3397 }
> 3398
> 3399 cgroup_throttle_swaprate(page, GFP_KERNEL);
> 3400
> 3401 /*
> 3402 * Back out if somebody else already faulted in this pte.
> 3403 */
> 3404 vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, vmf->address,
> 3405 &vmf->ptl);
> 3406 if (unlikely(!pte_same(*vmf->pte, vmf->orig_pte)))
> 3407 goto out_nomap;
> 3408
> 3409 if (unlikely(!PageUptodate(page))) {
> 3410 ret = VM_FAULT_SIGBUS;
> 3411 goto out_nomap;
> 3412 }
> 3413
> 3414 /*
> 3415 * The page isn't present yet, go ahead with the fault.
> 3416 *
> 3417 * Be careful about the sequence of operations here.
> 3418 * To get its accounting right, reuse_swap_page() must be called
> 3419 * while the page is counted on swap but not yet in mapcount i.e.
> 3420 * before page_add_anon_rmap() and swap_free(); try_to_free_swap()
> 3421 * must be called after the swap_free(), or it will never succeed.
> 3422 */
> 3423
> 3424 inc_mm_counter_fast(vma->vm_mm, MM_ANONPAGES);
> 3425 dec_mm_counter_fast(vma->vm_mm, MM_SWAPENTS);
> 3426 pte = mk_pte(page, vma->vm_page_prot);
> 3427 if ((vmf->flags & FAULT_FLAG_WRITE) && reuse_swap_page(page, NULL)) {
> 3428 pte = maybe_mkwrite(pte_mkdirty(pte), vma);
> 3429 vmf->flags &= ~FAULT_FLAG_WRITE;
> 3430 ret |= VM_FAULT_WRITE;
> 3431 exclusive = RMAP_EXCLUSIVE;
> 3432 }
> 3433 flush_icache_page(vma, page);
> 3434 if (pte_swp_soft_dirty(vmf->orig_pte))
> 3435 pte = pte_mksoft_dirty(pte);
> 3436 if (pte_swp_uffd_wp(vmf->orig_pte)) {
> 3437 pte = pte_mkuffd_wp(pte);
> 3438 pte = pte_wrprotect(pte);
> 3439 }
> 3440 set_pte_at(vma->vm_mm, vmf->address, vmf->pte, pte);
> 3441 arch_do_swap_page(vma->vm_mm, vma, vmf->address, pte, vmf->orig_pte);
> 3442 vmf->orig_pte = pte;
> 3443
> 3444 /* ksm created a completely new copy */
> 3445 if (unlikely(page != swapcache && swapcache)) {
> 3446 page_add_new_anon_rmap(page, vma, vmf->address, false);
> 3447 lru_cache_add_inactive_or_unevictable(page, vma);
> 3448 } else {
> 3449 do_page_add_anon_rmap(page, vma, vmf->address, exclusive);
> 3450 }
> 3451
> 3452 swap_free(entry);
> 3453 if (mem_cgroup_swap_full(page) ||
> 3454 (vma->vm_flags & VM_LOCKED) || PageMlocked(page))
> 3455 try_to_free_swap(page);
> 3456 unlock_page(page);
> 3457 if (page != swapcache && swapcache) {
> 3458 /*
> 3459 * Hold the lock to avoid the swap entry to be reused
> 3460 * until we take the PT lock for the pte_same() check
> 3461 * (to avoid false positives from pte_same). For
> 3462 * further safety release the lock after the swap_free
> 3463 * so that the swap count won't change under a
> 3464 * parallel locked swapcache.
> 3465 */
> 3466 unlock_page(swapcache);
> 3467 put_page(swapcache);
> 3468 }
> 3469
> 3470 if (vmf->flags & FAULT_FLAG_WRITE) {
> 3471 ret |= do_wp_page(vmf);
> 3472 if (ret & VM_FAULT_ERROR)
> 3473 ret &= VM_FAULT_ERROR;
> 3474 goto out;
> 3475 }
> 3476
> 3477 /* No need to invalidate - it was non-present before */
> 3478 update_mmu_cache(vma, vmf->address, vmf->pte);
> 3479 unlock:
> 3480 pte_unmap_unlock(vmf->pte, vmf->ptl);
> 3481 out:
> 3482 if (si)
>> 3483 put_swap_device(si);
> 3484 return ret;
> 3485 out_nomap:
> 3486 pte_unmap_unlock(vmf->pte, vmf->ptl);
> 3487 out_page:
> 3488 unlock_page(page);
> 3489 out_release:
> 3490 put_page(page);
> 3491 if (page != swapcache && swapcache) {
> 3492 unlock_page(swapcache);
> 3493 put_page(swapcache);
> 3494 }
> 3495 if (si)
> 3496 put_swap_device(si);
> 3497 return ret;
> 3498 }
> 3499
>
> ---
> 0-DAY CI Kernel Test Service, Intel Corporation
> https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org
>
next prev parent reply other threads:[~2021-04-09 8:46 UTC|newest]
Thread overview: 73+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-04-08 13:08 [PATCH 0/5] close various race windows for swap Miaohe Lin
2021-04-08 13:08 ` [PATCH 1/5] mm/swapfile: add percpu_ref support " Miaohe Lin
2021-04-12 3:30 ` Huang, Ying
2021-04-12 3:30 ` Huang, Ying
2021-04-12 6:59 ` Miaohe Lin
2021-04-12 7:24 ` Huang, Ying
2021-04-12 7:24 ` Huang, Ying
2021-04-13 12:39 ` Miaohe Lin
2021-04-14 1:17 ` Huang, Ying
2021-04-14 1:17 ` Huang, Ying
2021-04-14 1:58 ` Miaohe Lin
2021-04-14 2:06 ` Huang, Ying
2021-04-14 2:06 ` Huang, Ying
2021-04-14 3:44 ` Dennis Zhou
2021-04-14 3:59 ` Huang, Ying
2021-04-14 3:59 ` Huang, Ying
2021-04-14 4:05 ` Dennis Zhou
2021-04-14 5:44 ` Huang, Ying
2021-04-14 5:44 ` Huang, Ying
2021-04-14 14:53 ` Dennis Zhou
2021-04-15 3:16 ` Miaohe Lin
2021-04-15 4:20 ` Dennis Zhou
2021-04-15 9:17 ` Miaohe Lin
2021-04-15 5:24 ` Huang, Ying
2021-04-15 5:24 ` Huang, Ying
2021-04-15 14:31 ` Dennis Zhou
2021-04-16 0:54 ` Huang, Ying
2021-04-16 0:54 ` Huang, Ying
2021-04-16 2:27 ` Miaohe Lin
2021-04-16 6:25 ` Huang, Ying
2021-04-16 6:25 ` Huang, Ying
2021-04-16 8:30 ` Miaohe Lin
2021-04-08 13:08 ` [PATCH 2/5] swap: fix do_swap_page() race with swapoff Miaohe Lin
2021-04-08 21:34 ` Tim Chen
2021-04-09 8:42 ` Miaohe Lin
2021-04-09 17:17 ` Tim Chen
2021-04-10 3:17 ` Miaohe Lin
2021-04-12 1:44 ` Huang, Ying
2021-04-12 1:44 ` Huang, Ying
2021-04-12 3:24 ` Miaohe Lin
2021-04-08 21:37 ` kernel test robot
2021-04-09 8:46 ` Miaohe Lin [this message]
2021-04-08 22:56 ` kernel test robot
2021-04-13 1:27 ` Huang, Ying
2021-04-13 1:27 ` Huang, Ying
2021-04-13 19:24 ` Tim Chen
2021-04-14 1:04 ` Huang, Ying
2021-04-14 1:04 ` Huang, Ying
2021-04-14 2:20 ` Miaohe Lin
2021-04-14 16:13 ` Tim Chen
2021-04-15 3:19 ` Miaohe Lin
2021-04-14 2:55 ` Miaohe Lin
2021-04-14 3:07 ` Huang, Ying
2021-04-14 3:07 ` Huang, Ying
2021-04-14 3:27 ` Miaohe Lin
2021-04-08 13:08 ` [PATCH 3/5] mm/swap_state: fix get_shadow_from_swap_cache() " Miaohe Lin
2021-04-13 1:33 ` Huang, Ying
2021-04-13 1:33 ` Huang, Ying
2021-04-14 2:42 ` Miaohe Lin
2021-04-08 13:08 ` [PATCH 4/5] mm/swap_state: fix potential faulted in race in swap_ra_info() Miaohe Lin
2021-04-09 8:50 ` Huang, Ying
2021-04-09 8:50 ` Huang, Ying
2021-04-09 9:00 ` Miaohe Lin
2021-04-12 0:55 ` Huang, Ying
2021-04-12 0:55 ` Huang, Ying
2021-04-12 3:17 ` Miaohe Lin
2021-04-08 13:08 ` [PATCH 5/5] mm/swap_state: fix swap_cluster_readahead() race with swapoff Miaohe Lin
2021-04-13 1:36 ` Huang, Ying
2021-04-13 1:36 ` Huang, Ying
2021-04-14 2:43 ` Miaohe Lin
2021-04-08 14:55 ` [PATCH 0/5] close various race windows for swap riteshh
2021-04-09 8:01 ` Miaohe Lin
2021-04-08 20:46 [PATCH 2/5] swap: fix do_swap_page() race with swapoff kernel test robot
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=46cd2086-cb4a-cda1-171b-5c10b30f92f5@huawei.com \
--to=linmiaohe@huawei.com \
--cc=kbuild-all@lists.01.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.