From: Hugh Dickins <hughd@google.com>
To: Liam Howlett <liam.howlett@oracle.com>
Cc: Nathan Chancellor <nathan@kernel.org>,
	Qian Cai <quic_qiancai@quicinc.com>,
	"maple-tree@lists.infradead.org" <maple-tree@lists.infradead.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Mark Brown <broonie@kernel.org>,
	Stephen Rothwell <sfr@canb.auug.org.au>,
	Suren Baghdasaryan <surenb@google.com>,
	Matthew Wilcox <willy@infradead.org>,
	Hugh Dickins <hughd@google.com>
Subject: Re: [PATCH v6 00/71] Introducing the Maple Tree
Date: Sat, 26 Feb 2022 15:19:45 -0800 (PST)
Message-ID: <5f8f4f-ad63-eb-fd73-d48748af8a76@google.com>
In-Reply-To: <20220226015803.h4w6y3doe3om2sbc@revolver>

On Sat, 26 Feb 2022, Liam Howlett wrote:
> * Nathan Chancellor <nathan@kernel.org> [220225 18:00]:
> > On Fri, Feb 25, 2022 at 03:46:52PM -0500, Qian Cai wrote:
> > > On Fri, Feb 25, 2022 at 08:23:41PM +0000, Liam Howlett wrote:
> > > > I just booted an arm64 VM with my build and kasan enabled with no issue.
> > > > Could you please send me your config file for the build?
> > > 
> > > On linux-next, I just do:
> > > 
> > > $ make ARCH=arm64 defconfig debug.config [1]
> > > 
> > > Then, I just generate some memory pressure into swapping/OOM Killer to
> > > trigger it.
> > > 
> > > [1] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/kernel/configs/debug.config
> > 
> > Is the stacktrace [1] related to the conflict that Mark encountered [2]
> > while merging the maple and folio trees? Booting a next-20220223 kernel
> > on my Raspberry Pi 3 and 4 shows constant NULL pointer dereferences
> > (just ARCH=arm and ARCH=arm64 defconfigs) and reverting the folio and
> > maple tree merges makes everything work properly again.
> > 
> > [1]: https://lore.kernel.org/r/YhhRrBpXTFolUAKi@qian/
> > [2]: https://lore.kernel.org/r/20220224011653.1380557-1-broonie@kernel.org/
> 
> Maybe?  I'm trying to figure out why it's having issues.  I've not been
> able to reproduce it with just my maple tree branch.  Steven Rostedt
> found a bad commit that has been fixed as of next-20220224, I believe
> [1].  It might be best to try next-20220225 and see if you have better
> luck, if that's an option.
> 
> [1]:
> https://lore.kernel.org/linux-fsdevel/f6fb6fd4-dcf2-4326-d25e-9a4a9dad5020@fb.com/T/#t

Hi Liam, I think I have the beginnings of an answer for Qian's issue,
details below; but first off, I'd better make my own opinion clear.

I think this series, however good it may be so far, is undertested and
not approaching ready for 5.18, and should not have been pushed via
git tree into 5.17-rc-late-stage linux-next, subverting Andrew's mmotm.

I believe Stephen and Mark should drop it from linux-next for now,
while linux-next stabilizes for 5.18, while you work on testing and
fixing, and resubmit rebased to 5.18-rc1 or -rc2 when you're ready.

Qian's issue is "BUG: KASAN: use-after-free in move_vma.isra.0",
with a stacktrace implicating mremap.

I don't have KASAN on, but my config on this laptop does have
CONFIG_SLAB=y CONFIG_DEBUG_SLAB=y, and reported "Slab corruption"
during bootup of mmotm 2022-02-24-22-38 (plus Steven Rostedt's fix
to vfs_statx() which you mention in [1], missed from mmotm - not
related to your series of course, but essential for booting here).
Previous mmotm 2022-02-23-21-20 (plus vfs_statx() fix) was okay,
but did not contain the Maple Tree series.

The "vm_area" "Slab corruption" "Single bit error detected" 6b->7b
report is enough to indicate that the VM_ACCOUNT bit gets set in a
freed vma's vm_flags.  And looking through "|= VM_ACCOUNT"s, my
suspicion fell on mm/mremap.c's move_vma().  Which I now see is
implicated in Qian's report too.
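
The 6b->7b arithmetic can be checked with a small userspace sketch.
It assumes the kernel's usual values, POISON_FREE = 0x6b (the byte a
debug slab writes over freed objects) and VM_ACCOUNT = 0x00100000;
the helper names are made up for illustration and are not kernel code.

```c
#include <stdint.h>

/* Assumed kernel values: POISON_FREE (poison.h), VM_ACCOUNT (mm.h). */
#define POISON_FREE 0x6bu
#define VM_ACCOUNT  UINT64_C(0x00100000)

/* Build a 64-bit word in which every byte holds the free-poison value. */
static uint64_t poisoned_word(void)
{
    uint64_t w = 0;
    for (int i = 0; i < 8; i++)
        w = (w << 8) | POISON_FREE;
    return w;
}

/*
 * OR `flag` into a poisoned word and report the first byte that no
 * longer matches the poison.  Byte 0 is the least significant byte.
 * Returns the corrupted byte value, or -1 if the word is unchanged.
 */
static int corrupted_byte(uint64_t flag, int *index)
{
    uint64_t w = poisoned_word() | flag;

    for (int i = 0; i < 8; i++) {
        unsigned int b = (unsigned int)((w >> (8 * i)) & 0xff);
        if (b != POISON_FREE) {
            *index = i;
            return (int)b;
        }
    }
    return -1;
}
```

Calling corrupted_byte(VM_ACCOUNT, &idx) yields 0x7b at byte index 2:
0x6b ^ 0x7b is 0x10, and 0x10 shifted up by two bytes is 0x00100000,
which is why a single flipped bit in a poisoned vm_flags points at
VM_ACCOUNT.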

mremap move's VM_ACCOUNT accounting is difficult: not losing what's
already charged in case the move fails, accounting the extra without
overcharging, in the face of vmas being split and merged: difficult.

And there's an assumption documented in (now) do_mas_align_munmap()
"Note: mremap's move_vma VM_ACCOUNT handling assumes a partially
unmapped vm_area_struct will remain in use".  My suspicion (not
verified) is that the maple tree changes are now violating that;
and I doubt that fixing it will be easy (I'm not going to try) -
doable, sure, but needs careful thought.  (The way move_vma() masks
out VM_ACCOUNT in a live vma, then adds it back later: implications
for vma merging; all under mmap lock of course, but still awkward.)
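
To make the suspected failure mode concrete, here is a toy userspace
model of the mask/unmap/restore pattern.  Everything in it (toy_vma,
toy_slot, toy_unmap, toy_move) is hypothetical and heavily simplified;
it is not the kernel's move_vma(), and it only illustrates the
unverified suspicion above: if the unmap step frees the vma instead of
leaving it in use, the later restore lands in freed, poisoned memory.

```c
#include <string.h>

#define VM_ACCOUNT  0x00100000UL
#define POISON_FREE 0x6b

/* Hypothetical stand-in for vm_area_struct: only the field we need. */
struct toy_vma {
    unsigned long vm_flags;
};

/* Hypothetical slab slot: the object plus a "has been freed" marker. */
struct toy_slot {
    struct toy_vma vma;
    int freed;
};

/* Model of freeing with slab debugging: poison the object's memory. */
static void toy_free(struct toy_slot *s)
{
    memset(&s->vma, POISON_FREE, sizeof(s->vma));
    s->freed = 1;
}

/*
 * Model of the unmap step.  keeps_vma != 0 models the documented
 * assumption (the partially unmapped vma remains in use); 0 models an
 * implementation that frees and replaces it.
 */
static void toy_unmap(struct toy_slot *s, int keeps_vma)
{
    if (!keeps_vma)
        toy_free(s);
}

/*
 * Mask VM_ACCOUNT, unmap, then restore the flag through the old
 * pointer.  Returns 1 if the restore wrote into a freed object.
 */
static int toy_move(struct toy_slot *s, int keeps_vma)
{
    int was_accounted = (s->vma.vm_flags & VM_ACCOUNT) != 0;

    s->vma.vm_flags &= ~VM_ACCOUNT;     /* hide the charge from unmap */
    toy_unmap(s, keeps_vma);
    if (was_accounted)
        s->vma.vm_flags |= VM_ACCOUNT;  /* safe only if still live */

    return was_accounted && s->freed;
}
```

With keeps_vma set, the flags come back unchanged; with it clear, the
restore turns one poisoned 0x6b byte into 0x7b, the same signature as
the "Single bit error detected" report.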

Though I did partially verify it, by commenting out the VM_ACCOUNT
adjustments in move_vma(), and booting several times without any
"Slab corruption".  And did also kind-of verify it by booting with
#define VM_ACCOUNT 0: I was interested to see if that bit corruption
could account for all the other bugs I get, but sadly no.

Initially I was building without CONFIG_DEBUG_VM_MAPLE_TREE=y and
CONFIG_DEBUG_MAPLE_TREE=y, but have now switched them on.  Hit bugs
without them and with them, but now they're on, validate_mm() often
catches something (whether it is correct to complain, I haven't
investigated, but I assume so since the debug option is there,
and problems are seen without it).

I say "often": it's very erratic.  Once, a machine booted with mem=1G
ran kernel builds successfully swapping for 4.5 hours before faulting
in __nr_to_section in virt_to_folio ... while doing a __vma_adjust() -
you'll ask me for a full stacktrace, and I'll answer sorry, too many,
please try for yourself.  Another time, for 1.5 hours before hitting
the BUG_ON(is_migration_entry(entry) && !PageLocked(p)) in
pfn_swap_entry_to_page() - suggesting anon_vma locking had not been
right while doing page migration (I was exercising THPs a lot).
But now, can I even get it to complete the boot sequence?

(I happened to be sampling /proc/meminfo during those successful
runs: looking afterwards at those samples, I see Committed_AS growing
steadily; whereas samples saved from pre-maple runs were not: that
would correspond to VM_ACCOUNT vm_enough_memory() charges leaking.)
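
For anyone wanting to repeat that comparison, a minimal sketch of the
sampling side follows.  It assumes the standard "Committed_AS: N kB"
line format of /proc/meminfo; the function names and the "strictly
increasing" test are illustrative choices, not an established tool.

```c
#include <stdio.h>
#include <string.h>

/*
 * Extract Committed_AS (in kB) from /proc/meminfo-formatted text.
 * Returns -1 if the field is missing or malformed.
 */
static long committed_as_kb(const char *meminfo)
{
    const char *key = "Committed_AS:";
    const char *p = meminfo;

    while (p && *p) {
        if (strncmp(p, key, strlen(key)) == 0) {
            long kb;
            if (sscanf(p + strlen(key), "%ld", &kb) == 1)
                return kb;
            return -1;
        }
        p = strchr(p, '\n');    /* advance to the next line, if any */
        if (p)
            p++;
    }
    return -1;
}

/*
 * Crude leak signal: 1 if every sample is larger than the previous
 * one, 0 otherwise (or if there are fewer than two samples).
 */
static int grows_steadily(const long *samples, int n)
{
    if (n < 2)
        return 0;
    for (int i = 1; i < n; i++)
        if (samples[i] <= samples[i - 1])
            return 0;
    return 1;
}
```

In practice one would read /proc/meminfo into a buffer at intervals
during the load, pass each snapshot to committed_as_kb(), and compare
the series from a maple tree kernel against one from a pre-maple
kernel running the same load.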

You're having difficulty reproducing: I suggest trying with different
mem= on the boot command line (some of my loads I run with mem=700M,
some mem=1G, some with the workstation mem 8G) - I get the impression
that different mem= jumbles up memory allocations differently, so
what's harmless in one case is quickly harmful in another.
Or try changing between SLAB and SLUB.

One other thing: I haven't studied the source much, but didn't like
that rwsem_acquire(&mm->mmap_lock.dep_map, 0, 0, _THIS_IP_) hack in
exit_mmap(): sneaked into "mm: Start tracking VMAs with maple tree",
it reverts Suren's 64591e8605d6 "mm: protect free_pgtables with
mmap_lock write lock in exit_mmap", without explaining how it becomes
safe with maple tree (I imagine it's not).  That would have to be a
separate, justified patch if it goes forward.  (The nearby conflict
resolutions in mmotm and next are not quite right there, some stuff
my mm/munlock series deleted has resurfaced: but it's harmless, and
not worth worrying about if maple tree is dropped from linux-next.)

Hugh
