From: Matthew Wilcox <willy@infradead.org>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: dev@der-flo.net, linux-mm@kvack.org,
Uladzislau Rezki <urezki@gmail.com>,
bugzilla-daemon@kernel.org, Kees Cook <keescook@chromium.org>
Subject: Re: [Bug 216489] New: Machine freezes due to memory lock
Date: Thu, 15 Sep 2022 23:42:17 +0100
Message-ID: <YyOqSWAmAFxx8RCt@casper.infradead.org>
In-Reply-To: <20220915133931.ee0a6c8a86c59a144828eb60@linux-foundation.org>
On Thu, Sep 15, 2022 at 01:39:31PM -0700, Andrew Morton wrote:
> (switched to email. Please respond via emailed reply-to-all, not via the
> bugzilla web interface).
>
> On Wed, 14 Sep 2022 15:07:46 +0000 bugzilla-daemon@kernel.org wrote:
>
> > https://bugzilla.kernel.org/show_bug.cgi?id=216489
> >
> > Bug ID: 216489
> > Summary: Machine freezes due to memory lock
> > Product: Memory Management
> > Version: 2.5
> > Kernel Version: 5.19.8
> > Hardware: AMD
> > OS: Linux
> > Tree: Mainline
> > Status: NEW
> > Severity: high
> > Priority: P1
> > Component: Other
> > Assignee: akpm@linux-foundation.org
> > Reporter: dev@der-flo.net
> > Regression: No
> >
> > Hi all,
> > With Kernel 5.19.x we noticed system freezes. This happens in virtual
> > environments as well as on real hardware.
> > On a real hardware machine we were able to catch the moment of freeze with
> > continuous profiling.
>
> Thanks. I forwarded this to Uladzislau and he offered to help. He said:
>
>
> : I can help with debugging. What i need is reproduce steps. Could you
> : please clarify if it is easy to hit and what kind of profiling triggers it?
>
> and
>
> : I do not think that Matthew Wilcox commits destroys it but... I see that
> : __vunmap() is invoked by the free_work() thus a caller is in atomic
> : context including IRQ context.
To keep all of this information together: Florian emailed me off-list
and I replied, cc'ing Kees.
I asked him to try this patch to determine whether the freeze comes from
the extra load on the spinlock caused by examining the vmap tree more often:
diff --git a/mm/usercopy.c b/mm/usercopy.c
index c1ee15a98633..76d2d4fb6d22 100644
--- a/mm/usercopy.c
+++ b/mm/usercopy.c
@@ -173,15 +173,6 @@ static inline void check_heap_object(const void *ptr, unsigned long n,
 	}
 	if (is_vmalloc_addr(ptr)) {
-		struct vmap_area *area = find_vmap_area(addr);
-
-		if (!area)
-			usercopy_abort("vmalloc", "no area", to_user, 0, n);
-
-		if (n > area->va_end - addr) {
-			offset = addr - area->va_start;
-			usercopy_abort("vmalloc", NULL, to_user, offset, n);
-		}
 		return;
 	}
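For anyone following along without the source open, the check that this
patch removes boils down to a bounds test against the vmap area that
contains the address. The following is only a userspace sketch of that
logic, not kernel code: struct area and copy_fits() are hypothetical names
made up for illustration, and the real code looks the area up under a
spinlock in find_vmap_area() and calls usercopy_abort() rather than
returning false.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct vmap_area: covers [va_start, va_end). */
struct area {
	uintptr_t va_start;
	uintptr_t va_end;
};

/*
 * Return true if a copy of n bytes starting at addr stays inside the area.
 * The final comparison mirrors the "n > area->va_end - addr" test in the
 * removed hunk. The first check stands in for the "no area" case, assuming
 * the lookup fails when addr lies outside every area.
 */
static bool copy_fits(const struct area *a, uintptr_t addr, size_t n)
{
	if (!a || addr < a->va_start || addr >= a->va_end)
		return false;
	return n <= a->va_end - addr;
}
```

The test itself is cheap; the cost under discussion is the locked lookup
that has to happen before it on every usercopy touching a vmalloc address.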
Kees wrote:
} If you can reproduce the hangs, perhaps enable:
}
} CONFIG_DEBUG_LOCKDEP=y
} CONFIG_DEBUG_ATOMIC_SLEEP=y
}
} I would expect any hung spinlock to complain very loudly under
} LOCKDEP...
I hope we can keep the remainder of the debugging in this email thread.
> > Specification of the machine where we captured the freeze:
> > Thinkpad T14
> > CPU: AMD Ryzen 7 PRO 4750U
> > Kernel: 5.19.8-200.fc36.x86_64
> >
> > Stacktrace of kworker/12:3 that is using all resources and causing the freeze:
> >
> > # Source Location Function Name Function Line
> > 0 arch/x86/include/asm/vdso/processor.h:13 rep_nop 11
> > 1 arch/x86/include/asm/vdso/processor.h:18 cpu_relax 16
> > 2 kernel/locking/qspinlock.c:514 native_queued_spin_lock_slowpath 316
> > 3 kernel/locking/qspinlock.c:316 native_queued_spin_lock_slowpath N/A
> > 4 arch/x86/include/asm/paravirt.h:591 pv_queued_spin_lock_slowpath 588
> > 5 arch/x86/include/asm/qspinlock.h:51 queued_spin_lock_slowpath 49
> > 6 include/asm-generic/qspinlock.h:114 queued_spin_lock 107
> > 7 include/linux/spinlock.h:185 do_raw_spin_lock 182
> > 8 include/linux/spinlock_api_smp.h:134 __raw_spin_lock 130
> > 9 kernel/locking/spinlock.c:154 _raw_spin_lock 152
> > 10 include/linux/spinlock.h:349 spin_lock 347
> > 11 mm/vmalloc.c:1805 find_vmap_area 1801
> > 12 mm/vmalloc.c:2525 find_vm_area 2521
> > 13 mm/vmalloc.c:2639 __vunmap 2628
> > 14 mm/vmalloc.c:97 free_work 91
> > 15 kernel/workqueue.c:2289 process_one_work 2181
> > 16 kernel/workqueue.c:2436 worker_thread 2378
> > 17 kernel/kthread.c:376 kthread 330
> > 18 N/A ret_from_fork N/A
> >
> > The functions in the above shown stacktrace hardly change. There is only one
> > commit 993d0b287e2ef7bee2e8b13b0ce4d2b5066f278e which introduces changes to
> > find_vmap_area() for 5.19.
> >
> > With this change in mind we looked for stacktraces which make also use of this
> > new commit. And in a different kernel thread we do notice the use of
> > check_heap_object():
> >
> > # Source Location Function Name Function Line
> > 0 arch/x86/include/asm/paravirt.h:704 arch_local_irq_enable 702
> > 1 arch/x86/include/asm/irqflags.h:138 arch_local_irq_restore 135
> > 2 kernel/sched/sched.h:1330 raw_spin_rq_unlock_irqrestore 1327
> > 3 kernel/sched/sched.h:1327 raw_spin_rq_unlock_irqrestore N/A
> > 4 kernel/sched/sched.h:1611 rq_unlock_irqrestore 1607
> > 5 kernel/sched/fair.c:8288 update_blocked_averages 8272
> > 6 kernel/sched/fair.c:11133 run_rebalance_domains 11115
> > 7 kernel/softirq.c:571 __do_softirq 528
> > 8 kernel/softirq.c:445 invoke_softirq 433
> > 9 kernel/softirq.c:650 __irq_exit_rcu 640
> > 10 arch/x86/kernel/apic/apic.c:1106 sysvec_apic_timer_interrupt N/A
> > 11 N/A asm_sysvec_apic_timer_interrupt N/A
> > 12 include/linux/mmzone.h:1403 __nr_to_section 1395
> > 13 include/linux/mmzone.h:1488 __pfn_to_section 1486
> > 14 include/linux/mmzone.h:1539 pfn_valid 1524
> > 15 arch/x86/mm/physaddr.c:65 __virt_addr_valid 47
> > 16 mm/usercopy.c:188 check_heap_object 161
> > 17 mm/usercopy.c:250 __check_object_size 212
> > 18 mm/usercopy.c:212 __check_object_size N/A
> > 19 include/linux/thread_info.h:199 check_object_size 195
> > 20 lib/strncpy_from_user.c:137 strncpy_from_user 113
> > 21 fs/namei.c:150 getname_flags 129
> > 22 fs/namei.c:2896 user_path_at_empty 2893
> > 23 include/linux/namei.h:57 user_path_at 54
> > 24 fs/open.c:446 do_faccessat 420
> > 25 arch/x86/entry/common.c:50 do_syscall_x64 40
> > 26 arch/x86/entry/common.c:80 do_syscall_64 73
> > 27 N/A entry_SYSCALL_64_after_hwframe N/A
> >
> > We are neither experts in the mm subsystem nor can provide a fix, but wanted to
> > let you know about our findings.
> >
> > Cheers,
> > Florian
> >
> > --
> > You may reply to this email to add a comment.
> >
> > You are receiving this mail because:
> > You are the assignee for the bug.
>