From: Ryan Roberts <ryan.roberts@arm.com>
To: Matthew Wilcox <willy@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>, linux-mm@kvack.org
Subject: Re: [PATCH v3 10/18] mm: Allow non-hugetlb large folios to be batch processed
Date: Sun, 10 Mar 2024 19:59:46 +0000	[thread overview]
Message-ID: <9dfd3b3b-4733-4c7c-b09c-5e6388531e49@arm.com> (raw)
In-Reply-To: <02e820c2-8a1d-42cc-954b-f9e041c4417a@arm.com>

On 10/03/2024 16:31, Ryan Roberts wrote:
> On 10/03/2024 11:11, Matthew Wilcox wrote:
>> On Sun, Mar 10, 2024 at 11:01:06AM +0000, Ryan Roberts wrote:
>>>> So after my patch, instead of calling (in order):
>>>>
>>>> 	page_cache_release(folio);
>>>> 	folio_undo_large_rmappable(folio);
>>>> 	mem_cgroup_uncharge(folio);
>>>> 	free_unref_page()
>>>>
>>>> it calls:
>>>>
>>>> 	__page_cache_release(folio, &lruvec, &flags);
>>>> 	mem_cgroup_uncharge_folios()
>>>> 	folio_undo_large_rmappable(folio);
>>>
>>> I was just looking at this again, and something pops out...
>>>
>>> You have swapped the order of folio_undo_large_rmappable() and
>>> mem_cgroup_uncharge(). But folio_undo_large_rmappable() calls
>>> get_deferred_split_queue() which tries to get the split queue from
>>> folio_memcg(folio) first and falls back to pgdat otherwise. If you are now
>>> calling mem_cgroup_uncharge_folios() first, will that remove the folio from the
>>> cgroup? Then we are operating on the wrong list? (just a guess based on the name
>>> of the function...)
>>
>> Oh my.  You've got it.  This explains everything.  Thank you!
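
(For reference, the queue lookup in question is roughly the following; this is a simplified sketch of get_deferred_split_queue() from mm/huge_memory.c, so exact details may differ by tree:

	static struct deferred_split *get_deferred_split_queue(struct folio *folio)
	{
		struct mem_cgroup *memcg = folio_memcg(folio);
		struct pglist_data *pgdat = NODE_DATA(folio_nid(folio));

		/* While the folio is still charged, use the memcg's queue... */
		if (memcg)
			return &memcg->deferred_split_queue;

		/* ...otherwise fall back to the node's queue. */
		return &pgdat->deferred_split_queue;
	}

So uncharging first switches which split queue the unqueue operates on, while the folio may still be linked on the memcg's list.)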
> 
> I've just taken today's mm-unstable, added your official patch to fix the ordering and applied my large folio swap-out series on top (v4, which I haven't posted yet). In testing that, I'm seeing another oops :-( 
> 
> That's exactly how I discovered the original problem, and was hoping that with your fix, this would unblock me. Given I can only repro this when my changes are on top, I guess my code is most likely buggy, but perhaps you can take a quick look at the oops and tell me what you think?

I've now been able to repro this without any of my code on top - just mm-unstable and your fix for the memcg uncharging ordering issue. So we have a separate, more difficult-to-repro bug. I've discovered CONFIG_DEBUG_LIST, so I've enabled that. I'll try to bisect in the morning, but I suspect it will be slow going.
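
(For anyone unfamiliar with CONFIG_DEBUG_LIST: it makes the list primitives validate an entry's neighbours before unlinking it. A simplified sketch of the check that fires below, based on lib/list_debug.c; the exact function name and report format vary between kernel versions:

	/* Both neighbours must still point back at the entry being removed;
	 * if not, someone else has already unlinked it or corrupted it. */
	static bool list_del_entry_valid(struct list_head *entry)
	{
		struct list_head *prev = entry->prev;
		struct list_head *next = entry->next;

		if (prev->next != entry) {
			pr_err("list_del corruption. prev->next should be %px, but was %px. (prev=%px)\n",
			       entry, prev->next, prev);
			return false;
		}
		if (next->prev != entry) {
			pr_err("list_del corruption. next->prev should be %px, but was %px. (next=%px)\n",
			       entry, next->prev, next);
			return false;
		}
		return true;
	}

Note that in the report below, prev->next points back at prev itself, as if prev had been reinitialised or already deleted with list_del_init().)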

[  390.317982] ------------[ cut here ]------------
[  390.318646] list_del corruption. prev->next should be fffffc00152a9090, but was fffffc002798a490. (prev=fffffc002798a490)
[  390.319895] WARNING: CPU: 28 PID: 3187 at lib/list_debug.c:62 __list_del_entry_valid_or_report+0xe0/0x110
[  390.320957] Modules linked in:
[  390.321295] CPU: 28 PID: 3187 Comm: usemem Not tainted 6.8.0-rc5-00462-gdbdeae0a47d9 #4
[  390.322432] Hardware name: linux,dummy-virt (DT)
[  390.323078] pstate: 604000c5 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[  390.324187] pc : __list_del_entry_valid_or_report+0xe0/0x110
[  390.325156] lr : __list_del_entry_valid_or_report+0xe0/0x110
[  390.326179] sp : ffff800087fcb6e0
[  390.326730] x29: ffff800087fcb6e0 x28: 0000fffff7e00000 x27: ffff0005c1c0a790
[  390.327897] x26: ffff00116f44c010 x25: 0000000000000090 x24: 0000000000000001
[  390.329021] x23: ffff800082e2a660 x22: 00000000000000c0 x21: fffffc00152a9090
[  390.330344] x20: ffff0000c7d30818 x19: fffffc00152a9000 x18: 0000000000000006
[  390.331513] x17: 20747562202c3039 x16: 3039613235313030 x15: 6366666666662065
[  390.332607] x14: 6220646c756f6873 x13: 2930393461383937 x12: 3230306366666666
[  390.333713] x11: 663d766572702820 x10: ffff0013f5e7b7c0 x9 : ffff800080128e84
[  390.334945] x8 : 00000000ffffbfff x7 : ffff0013f5e7b7c0 x6 : 80000000ffffc000
[  390.336235] x5 : ffff0013a58ecd08 x4 : 0000000000000000 x3 : ffff8013235c7000
[  390.337435] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff00010fe79140
[  390.338501] Call trace:
[  390.338800]  __list_del_entry_valid_or_report+0xe0/0x110
[  390.339704]  folio_undo_large_rmappable+0xb8/0x128
[  390.340572]  folios_put_refs+0x1e4/0x200
[  390.341201]  free_pages_and_swap_cache+0xf0/0x178
[  390.342074]  __tlb_batch_free_encoded_pages+0x54/0xf0
[  390.342898]  tlb_flush_mmu+0x5c/0xe0
[  390.343466]  unmap_page_range+0x960/0xe48
[  390.344112]  unmap_single_vma.constprop.0+0x90/0x118
[  390.344948]  unmap_vmas+0x84/0x180
[  390.345576]  unmap_region+0xdc/0x170
[  390.346208]  do_vmi_align_munmap+0x464/0x5f0
[  390.346988]  do_vmi_munmap+0xb4/0x138
[  390.347657]  __vm_munmap+0xa8/0x188
[  390.348061]  __arm64_sys_munmap+0x28/0x40
[  390.348513]  invoke_syscall+0x50/0x128
[  390.348952]  el0_svc_common.constprop.0+0x48/0xf0
[  390.349494]  do_el0_svc+0x24/0x38
[  390.350085]  el0_svc+0x34/0xb8
[  390.350486]  el0t_64_sync_handler+0x100/0x130
[  390.351256]  el0t_64_sync+0x190/0x198
[  390.351823] ---[ end trace 0000000000000000 ]---


> 
> [   96.372503] BUG: Bad page state in process usemem  pfn:be502
> [   96.373336] page: refcount:0 mapcount:0 mapping:000000005abfa8d5 index:0x0 pfn:0xbe502
> [   96.374341] aops:0x0 ino:fffffc0001f940c8
> [   96.374893] flags: 0x7fff8000000000(node=0|zone=0|lastcpupid=0xffff)
> [   96.375653] page_type: 0xffffffff()
> [   96.376071] raw: 007fff8000000000 0000000000000000 fffffc0001f94090 ffff0000c99ee860
> [   96.377055] raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
> [   96.378650] page dumped because: non-NULL mapping
> [   96.379828] Modules linked in: binfmt_misc nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua sch_fq_codel drm efi_pstore ip_tables x_tables autofs4 xfs btrfs blake2b_generic raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor xor_neon raid6_pq libcrc32c raid1 raid0 crct10dif_ce ghash_ce sha2_ce virtio_net sha256_arm64 net_failover sha1_ce virtio_blk failover virtio_scsi virtio_rng aes_neon_bs aes_neon_blk aes_ce_blk aes_ce_cipher
> [   96.386802] CPU: 13 PID: 4713 Comm: usemem Not tainted 6.8.0-rc5-ryarob01-swap-out-v4 #2
> [   96.387691] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
> [   96.388887] Call trace:
> [   96.389348]  dump_backtrace+0x9c/0x128
> [   96.390213]  show_stack+0x20/0x38
> [   96.390688]  dump_stack_lvl+0x78/0xc8
> [   96.391163]  dump_stack+0x18/0x28
> [   96.391545]  bad_page+0x88/0x128
> [   96.391893]  get_page_from_freelist+0xa94/0x1bc0
> [   96.392407]  __alloc_pages+0x194/0x10b0
> [   96.392833]  alloc_pages_mpol+0x98/0x278
> [   96.393278]  vma_alloc_folio+0x74/0xd8
> [   96.393674]  __handle_mm_fault+0x7ac/0x1470
> [   96.394146]  handle_mm_fault+0x70/0x2c8
> [   96.394575]  do_page_fault+0x100/0x530
> [   96.395013]  do_translation_fault+0xa4/0xd0
> [   96.395476]  do_mem_abort+0x4c/0xa8
> [   96.395869]  el0_da+0x30/0xa8
> [   96.396229]  el0t_64_sync_handler+0xb4/0x130
> [   96.396735]  el0t_64_sync+0x1a8/0x1b0
> [   96.397133] Disabling lock debugging due to kernel taint
> [  112.507052] Adding 36700156k swap on /dev/ram0.  Priority:-2 extents:1 across:36700156k SS
> [  113.131515] ------------[ cut here ]------------
> [  113.132190] UBSAN: array-index-out-of-bounds in mm/vmscan.c:1654:14
> [  113.132892] index 7 is out of range for type 'long unsigned int [5]'
> [  113.133617] CPU: 9 PID: 528 Comm: kswapd0 Tainted: G    B              6.8.0-rc5-ryarob01-swap-out-v4 #2
> [  113.134705] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
> [  113.135500] Call trace:
> [  113.135776]  dump_backtrace+0x9c/0x128
> [  113.136218]  show_stack+0x20/0x38
> [  113.136574]  dump_stack_lvl+0x78/0xc8
> [  113.136964]  dump_stack+0x18/0x28
> [  113.137322]  __ubsan_handle_out_of_bounds+0xa0/0xd8
> [  113.137885]  isolate_lru_folios+0x57c/0x658
> [  113.138352]  shrink_lruvec+0x5b4/0xdf8
> [  113.138751]  shrink_node+0x3f0/0x990
> [  113.139152]  balance_pgdat+0x3d0/0x810
> [  113.139579]  kswapd+0x268/0x568
> [  113.139936]  kthread+0x118/0x128
> [  113.140289]  ret_from_fork+0x10/0x20
> [  113.140686] ---[ end trace ]---
> 
> The UBSAN issue reported for mm/vmscan.c:1654 is:
> 
> nr_skipped[folio_zonenum(folio)] += nr_pages;
> 
> nr_skipped is a stack array of 5 elements. So I guess folio_zonenum(folio) is returning 7. That comes from the flags. I guess this is most likely just a side effect of the corrupted folio due to someone writing to it while it's on the free list?
> 
> 
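
(To expand on that last guess: folio_zonenum() just decodes bits near the top of folio->flags, roughly like this simplified sketch of the helpers in include/linux/mm.h:

	/* The zone index lives in the upper flag bits; a folio whose flags
	 * word has been scribbled on can decode to a bogus zone number. */
	static inline enum zone_type folio_zonenum(const struct folio *folio)
	{
		return (folio->flags >> ZONES_PGSHIFT) & ZONES_MASK;
	}

With MAX_NR_ZONES == 5 in this config, hence 'long unsigned int [5]' in the UBSAN report, corrupted high flag bits can easily yield an out-of-range index like 7.)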


