Re: [Bug 216489] New: Machine freezes due to memory lock

From: Matthew Wilcox <willy@infradead.org>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: dev@der-flo.net, linux-mm@kvack.org,
	Uladzislau Rezki <urezki@gmail.com>,
	bugzilla-daemon@kernel.org, Kees Cook <keescook@chromium.org>
Subject: Re: [Bug 216489] New: Machine freezes due to memory lock
Date: Thu, 15 Sep 2022 23:42:17 +0100
Message-ID: <YyOqSWAmAFxx8RCt@casper.infradead.org>
In-Reply-To: <20220915133931.ee0a6c8a86c59a144828eb60@linux-foundation.org>

On Thu, Sep 15, 2022 at 01:39:31PM -0700, Andrew Morton wrote:
> (switched to email.  Please respond via emailed reply-to-all, not via the
> bugzilla web interface).
> 
> On Wed, 14 Sep 2022 15:07:46 +0000 bugzilla-daemon@kernel.org wrote:
> 
> > https://bugzilla.kernel.org/show_bug.cgi?id=216489
> > 
> >             Bug ID: 216489
> >            Summary: Machine freezes due to memory lock
> >            Product: Memory Management
> >            Version: 2.5
> >     Kernel Version: 5.19.8
> >           Hardware: AMD
> >                 OS: Linux
> >               Tree: Mainline
> >             Status: NEW
> >           Severity: high
> >           Priority: P1
> >          Component: Other
> >           Assignee: akpm@linux-foundation.org
> >           Reporter: dev@der-flo.net
> >         Regression: No
> > 
> > Hi all,
> > With Kernel 5.19.x we noticed system freezes. This happens in virtual
> > environments as well as on real hardware.
> > On a real hardware machine we were able to catch the moment of freeze with
> > continuous profiling.
> 
> Thanks.  I forwarded this to Uladzislau and he offered to help.  He said:
> 
> 
> : I can help with debugging. What I need is the steps to reproduce. Could you
> : please clarify if it is easy to hit and what kind of profiling triggers it?
> 
> and
> 
> : I do not think that Matthew Wilcox's commit destroys it but... I see that
> : __vunmap() is invoked by free_work(), thus a caller is in atomic
> : context, including IRQ context.

To keep all of this information together: Florian emailed me off-list,
and I replied, cc'ing Kees.

I asked him to try this patch to determine whether the problem is the
extra load on the spinlock from examining the vmap tree more often:

diff --git a/mm/usercopy.c b/mm/usercopy.c
index c1ee15a98633..76d2d4fb6d22 100644
--- a/mm/usercopy.c
+++ b/mm/usercopy.c
@@ -173,15 +173,6 @@ static inline void check_heap_object(const void *ptr, unsigned long n,
 	}
 
 	if (is_vmalloc_addr(ptr)) {
-		struct vmap_area *area = find_vmap_area(addr);
-
-		if (!area)
-			usercopy_abort("vmalloc", "no area", to_user, 0, n);
-
-		if (n > area->va_end - addr) {
-			offset = addr - area->va_start;
-			usercopy_abort("vmalloc", NULL, to_user, offset, n);
-		}
 		return;
 	}
 

Kees wrote:

} If you can reproduce the hangs, perhaps enable:
} 
} CONFIG_DEBUG_LOCKDEP=y
} CONFIG_DEBUG_ATOMIC_SLEEP=y
} 
} I would expect any hung spinlock to complain very loudly under
} LOCKDEP...

I hope we can keep the remainder of the debugging in this email thread.

> > Specification of the machine where we captured the freeze:
> > Thinkpad T14
> > CPU: AMD Ryzen 7 PRO 4750U
> > Kernel: 5.19.8-200.fc36.x86_64
> > 
> > Stacktrace of kworker/12:3 that is using all resources and causing the freeze:
> > 
> > #   Source Location                 Function Name               Function Line
> > 0   arch/x86/include/asm/vdso/processor.h:13    rep_nop                 11
> > 1   arch/x86/include/asm/vdso/processor.h:18    cpu_relax               16
> > 2   kernel/locking/qspinlock.c:514          native_queued_spin_lock_slowpath    316
> > 3   kernel/locking/qspinlock.c:316          native_queued_spin_lock_slowpath    N/A
> > 4   arch/x86/include/asm/paravirt.h:591     pv_queued_spin_lock_slowpath        588
> > 5   arch/x86/include/asm/qspinlock.h:51     queued_spin_lock_slowpath       49
> > 6   include/asm-generic/qspinlock.h:114     queued_spin_lock            107
> > 7   include/linux/spinlock.h:185            do_raw_spin_lock            182
> > 8   include/linux/spinlock_api_smp.h:134        __raw_spin_lock             130
> > 9   kernel/locking/spinlock.c:154           _raw_spin_lock              152
> > 10  include/linux/spinlock.h:349            spin_lock               347
> > 11  mm/vmalloc.c:1805               find_vmap_area              1801
> > 12  mm/vmalloc.c:2525               find_vm_area                2521
> > 13  mm/vmalloc.c:2639               __vunmap                2628
> > 14  mm/vmalloc.c:97                 free_work               91
> > 15  kernel/workqueue.c:2289             process_one_work            2181
> > 16  kernel/workqueue.c:2436             worker_thread               2378
> > 17  kernel/kthread.c:376                kthread                 330
> > 18  N/A                     ret_from_fork               N/A
> > 
> > The functions in the stack trace shown above rarely change. Only one
> > commit, 993d0b287e2ef7bee2e8b13b0ce4d2b5066f278e, introduces changes to
> > find_vmap_area() for 5.19.
> > 
> > With this change in mind, we looked for stack traces that also make use of
> > this new commit. In a different kernel thread we do notice the use of
> > check_heap_object():
> > 
> > #   Source Location             Function Name           Function Line
> > 0   arch/x86/include/asm/paravirt.h:704 arch_local_irq_enable       702
> > 1   arch/x86/include/asm/irqflags.h:138 arch_local_irq_restore      135
> > 2   kernel/sched/sched.h:1330       raw_spin_rq_unlock_irqrestore   1327
> > 3   kernel/sched/sched.h:1327       raw_spin_rq_unlock_irqrestore   N/A
> > 4   kernel/sched/sched.h:1611       rq_unlock_irqrestore        1607
> > 5   kernel/sched/fair.c:8288        update_blocked_averages     8272
> > 6   kernel/sched/fair.c:11133       run_rebalance_domains       11115
> > 7   kernel/softirq.c:571            __do_softirq            528
> > 8   kernel/softirq.c:445            invoke_softirq          433
> > 9   kernel/softirq.c:650            __irq_exit_rcu          640
> > 10  arch/x86/kernel/apic/apic.c:1106    sysvec_apic_timer_interrupt N/A
> > 11  N/A                 asm_sysvec_apic_timer_interrupt N/A
> > 12  include/linux/mmzone.h:1403     __nr_to_section         1395
> > 13  include/linux/mmzone.h:1488     __pfn_to_section        1486
> > 14  include/linux/mmzone.h:1539     pfn_valid           1524
> > 15  arch/x86/mm/physaddr.c:65       __virt_addr_valid       47
> > 16  mm/usercopy.c:188           check_heap_object       161
> > 17  mm/usercopy.c:250           __check_object_size     212
> > 18  mm/usercopy.c:212           __check_object_size     N/A
> > 19  include/linux/thread_info.h:199     check_object_size       195
> > 20  lib/strncpy_from_user.c:137     strncpy_from_user       113
> > 21  fs/namei.c:150              getname_flags           129
> > 22  fs/namei.c:2896             user_path_at_empty      2893
> > 23  include/linux/namei.h:57        user_path_at            54
> > 24  fs/open.c:446               do_faccessat            420
> > 25  arch/x86/entry/common.c:50      do_syscall_x64          40
> > 26  arch/x86/entry/common.c:80      do_syscall_64           73
> > 27  N/A                 entry_SYSCALL_64_after_hwframe  N/A
> > 
> > We are neither experts in the mm subsystem nor can provide a fix, but wanted to
> > let you know about our findings.
> > 
> > Cheers,
> >  Florian
> > 
> > -- 
> > You may reply to this email to add a comment.
> > 
> > You are receiving this mail because:
> > You are the assignee for the bug.
>