linux-mm.kvack.org archive mirror
From: Hugh Dickins <hughd@google.com>
To: Liam Howlett <liam.howlett@oracle.com>
Cc: Nathan Chancellor <nathan@kernel.org>,
	Qian Cai <quic_qiancai@quicinc.com>,
	 "maple-tree@lists.infradead.org"
	<maple-tree@lists.infradead.org>,
	 "linux-mm@kvack.org" <linux-mm@kvack.org>,
	 "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	 Andrew Morton <akpm@linux-foundation.org>,
	Mark Brown <broonie@kernel.org>,
	 Stephen Rothwell <sfr@canb.auug.org.au>,
	 Suren Baghdasaryan <surenb@google.com>,
	 Matthew Wilcox <willy@infradead.org>,
	Hugh Dickins <hughd@google.com>
Subject: Re: [PATCH v6 00/71] Introducing the Maple Tree
Date: Sat, 26 Feb 2022 15:19:45 -0800 (PST)
Message-ID: <5f8f4f-ad63-eb-fd73-d48748af8a76@google.com>
In-Reply-To: <20220226015803.h4w6y3doe3om2sbc@revolver>

On Sat, 26 Feb 2022, Liam Howlett wrote:
> * Nathan Chancellor <nathan@kernel.org> [220225 18:00]:
> > On Fri, Feb 25, 2022 at 03:46:52PM -0500, Qian Cai wrote:
> > > On Fri, Feb 25, 2022 at 08:23:41PM +0000, Liam Howlett wrote:
> > > > I just booted an arm64 VM with my build and kasan enabled with no issue.
> > > > Could you please send me your config file for the build?
> > > 
> > > On linux-next, I just do:
> > > 
> > > $ make ARCH=arm64 defconfig debug.config [1]
> > > 
> > > Then, I just generate some memory pressure into swapping/OOM Killer to
> > > trigger it.
> > > 
> > > [1] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/kernel/configs/debug.config
> > 
> > Is the stacktrace [1] related to the conflict that Mark encountered [2]
> > while merging the maple and folio trees? Booting a next-20220223 kernel
> > on my Raspberry Pi 3 and 4 shows constant NULL pointer dereferences
> > (just ARCH=arm and ARCH=arm64 defconfigs) and reverting the folio and
> > maple tree merges makes everything work properly again.
> > 
> > [1]: https://lore.kernel.org/r/YhhRrBpXTFolUAKi@qian/
> > [2]: https://lore.kernel.org/r/20220224011653.1380557-1-broonie@kernel.org/
> 
> Maybe?  I'm trying to figure out why it's having issues.. I've not been
> able to reproduce it with just my maple tree branch.  Steven Rostedt
> found a bad commit that has been fixed in either 20220224, I believe
> [1].  It might be best to try next-20220225 and see if you have better
> luck if that's an option.
> 
> [1]:
> https://lore.kernel.org/linux-fsdevel/f6fb6fd4-dcf2-4326-d25e-9a4a9dad5020@fb.com/T/#t

Hi Liam, I think I have the beginnings of an answer for Qian's issue,
details below; but first off, I'd better make my own opinion clear.

I think this series, however good it may be so far, is undertested and
not approaching ready for 5.18, and should not have been pushed via
git tree into 5.17-rc-late-stage linux-next, subverting Andrew's mmotm.

I believe Stephen and Mark should drop it from linux-next for now,
while linux-next stabilizes for 5.18, while you work on testing and
fixing, and resubmit rebased to 5.18-rc1 or -rc2 when you're ready.

Qian's issue, BUG: KASAN: use-after-free in move_vma.isra.0,
with stacktrace implicating mremap.

I don't have KASAN on, but my config on this laptop does have
CONFIG_SLAB=y CONFIG_DEBUG_SLAB=y, and reported "Slab corruption"
during bootup of mmotm 2022-02-24-22-38 (plus Steven Rostedt's fix
to vfs_statx() which you mention in [1], missed from mmotm - not
related to your series of course, but essential for booting here).
Previous mmotm 2022-02-23-21-20 (plus vfs_statx() fix) was okay,
but did not contain the Maple Tree series.

The "vm_area" "Slab corruption" "Single bit error detected" 6b->7b
report is enough to indicate that the VM_ACCOUNT bit gets set in a
freed vma's vm_flags.  And looking through "|= VM_ACCOUNT"s, my
suspicion fell on mm/mremap.c's move_vma().  Which I now see is
implicated in Qian's report too.

mremap move's VM_ACCOUNT accounting is difficult: not losing what's
already charged in case the move fails, accounting the extra without
overcharging, in the face of vmas being split and merged: difficult.

And there's an assumption documented in (now) do_mas_align_munmap()
"Note: mremap's move_vma VM_ACCOUNT handling assumes a partially
unmapped vm_area_struct will remain in use".  My suspicion (not
verified) is that the maple tree changes are now violating that;
and I doubt that fixing it will be easy (I'm not going to try) -
doable, sure, but needs careful thought.  (The way move_vma() masks
out VM_ACCOUNT in a live vma, then adds it back later: implications
for vma merging; all under mmap lock of course, but still awkward.)

Though I did partially verify it, by commenting out the VM_ACCOUNT
adjustments in move_vma(), and booting several times without any
"Slab corruption".  And did also kind-of verify it by booting with
#define VM_ACCOUNT 0: I was interested to see if that bit corruption
could account for all the other bugs I get, but sadly no.

Initially I was building without CONFIG_DEBUG_VM_MAPLE_TREE=y and
CONFIG_DEBUG_MAPLE_TREE=y, but have now switched them on.  Hit bugs
without them and with them, but now they're on, validate_mm() often
catches something (whether it is correct to complain, I haven't
investigated, but I assume so since the debug option is there,
and problems are seen without it).

I say "often": it's very erratic.  Once, a machine booted with mem=1G
ran kernel builds successfully swapping for 4.5 hours before faulting
in __nr_to_section in virt_to_folio ... while doing a __vma_adjust() -
you'll ask me for a full stacktrace, and I'll answer sorry, too many,
please try for yourself.  Another time, for 1.5 hours before hitting
the BUG_ON(is_migration_entry(entry) && !PageLocked(p)) in
pfn_swap_entry_to_page() - suggesting anon_vma locking had not been
right while doing page migration (I was exercising THPs a lot).
But now, can I even get it to complete the boot sequence?

(I happened to be sampling /proc/meminfo during those successful
runs: looking afterwards at those samples, I see Committed_AS growing
steadily; whereas samples saved from pre-maple runs were not: that
would correspond to VM_ACCOUNT vm_enough_memory() charges leaking.)

You're having difficulty reproducing: I suggest trying with different
mem= on the boot command line (some of my loads I run with mem=700M,
some mem=1G, some with the workstation mem 8G) - I get the impression
that different mem= jumbles up memory allocations differently, so
what's harmless in one case is quickly harmful in another.
Or try changing between SLAB and SLUB.

One other thing: I haven't studied the source much, but didn't like
that rwsem_acquire(&mm->mmap_lock.dep_map, 0, 0, _THIS_IP_) hack in
exit_mmap(): sneaked into "mm: Start tracking VMAs with maple tree",
it reverts Suren's 64591e8605d6 "mm: protect free_pgtables with
mmap_lock write lock in exit_mmap", without explaining how it becomes
safe with maple tree (I imagine it's not).  That would have to be a
separate, justified patch if it goes forward.  (The nearby conflict
resolutions in mmotm and next are not quite right there, some stuff
my mm/munlock series deleted has resurfaced: but it's harmless, and
not worth worrying about if maple tree is dropped from linux-next.)

Hugh

