From: Alison Chaiken <achaiken@aurora.tech>
To: Joseph Salisbury <joseph.salisbury@canonical.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
linux-rt-users@vger.kernel.org, williams@redhat.com,
rostedt@goodmis.org, tglx@linutronix.de
Subject: Re: System Hang With 5.15.79-rt54 Patch Set
Date: Mon, 13 Mar 2023 08:11:05 -0700
Message-ID: <CAFzL-7sKDBRRoCBFsonhEVB2vhcZ8_GLkk1=se2-v41=GE9HUw@mail.gmail.com>
In-Reply-To: <daaa77a0-5091-1c21-2d53-160220f27594@canonical.com>
> On 2/16/23 12:15, Sebastian Andrzej Siewior wrote:
> > On 2023-01-18 13:52:21 [-0500], Joseph Salisbury wrote:
> >> I'll add more details to this thread as I continue.
> > Any update on this?
> > Does the system really hang? The dmesg says:
> > |[ 8235.110075] INFO: task stress-ng:9466 blocked for more than 122 seconds.
> >
> > which means stress-ng is blocked for quite some time due to I/O
> > according to the backtrace. This appears once for each stress-ng
> > process, 10 times in total. It does not repeat and the system runs at
> > least until
> >
> > | [50733.471625] hid-generic 0003:03F0:7029.0004: input,hidraw1: USB HID v1.10 Mouse [iLO Virtual Keyboard] on usb-0000:01:00.4-1/input1
> >
> > ~11h after that report.
> > Based on that it looks like the system complained about slow I/O but did
> > not hang as it completed its task.
> >
> > Sebastian
On Fri, Mar 10, 2023 at 1:09 PM Joseph Salisbury wrote:
> A bisect has not provided additional detail. This issue does not appear
> to be a regression and appears to have always existed.
>
> I was able to get additional debug info, by enabling
> CONFIG_DEBUG_PREEMPT, CONFIG_PROVE_LOCKING and CONFIG_JBD2_DEBUG.
> Enabling these configs shows a circular locking issue[0] and a call
> trace[1].
>
> I don't think the circular locking report is related. I think you're
> correct that the system is not actually hanging. The interactive
> response makes it seem like it's hung. For example, once the issue
> starts to happen, no other interactive commands can be issued without
> taking at least days (I never waited more than 3 days :-) ). I'm also
> not able to log in or log out while the system "appears" hung. I was
> able to get a sysrq-W while the system was in this state[2].
>
> I think I may have started investigating too deeply at first (by
> bisecting and enabling trace, etc). I stepped back and looked at the
> high level stats. The stress-ng test is started with one process for
> each core, and there are 96 of them. I looked at top[3] during a hang,
> and many of the stress-ng processes are running 'R'. However, a sysrq-q
> also shows many stress-ng processes are 'D' in uninterruptible sleep.
> What also sticks out to me is all the stress-ng processes are running as
> root with a priority of 20. Looking back at one of the call traces[1], I
> see jbd2 stuck in an uninterruptible state:
> ...
> [ 4461.908213] task:journal-offline state:D stack: 0 pid:17541 ppid: 1 flags:0x00000226
> ...
>
>
> The jbd2 kernel thread also runs with a priority of 20[4]. When the
> hang happens, jbd2 is also stuck in an uninterruptible state (as well as
> systemd-journal):
> 1521 root 20 0 0 0 0 D 0.0 0.0 4:10.48 jbd2/sda2-8
> 1593 root 19 -1 64692 15832 14512 D 0.0 0.1 0:01.54 systemd-journal
>
>
>
> I don't yet know why running the test the same way for a generic kernel
> does not cause this behavior when it does for a preempt-rt kernel.
> Maybe it's a case of priority 'Sameness' and not priority inversion :-) ?
>
> I tried to pin all of the stress-ng threads to cores 1-95 and the kernel
> threads to a housekeeping cpu, 0. I recall though that there are certain
> kernel threads that need to run on every core and kworker is one of
> them. Output from cmdline:
> "BOOT_IMAGE=/boot/vmlinuz-5.15.0-1033-realtime
> root=UUID=3583d8c4-d539-439f-9d50-4341675268cc ro console=tty0
> console=ttyS0,115200 skew_tick=1 isolcpus=managed_irq,domain,1-95
> intel_pstate=disable nosoftlockup tsc=nowatchdog
> crashkernel=0M-2G:128M,2G-6G:256M,6G-8G:512M,8G-:768M"
>
> However, even with this pinning, stress-ng ends up running on cpu 0, per
> the ps output[4]. This may be why it is interfering with jbd2.
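
One way to see what is actually allowed on the housekeeping CPU is to
read each task's affinity mask. As far as I know, isolcpus=domain only
takes CPUs 1-95 out of the default scheduling domains, so tasks have to
be placed there explicitly (taskset, sched_setaffinity) and anything
left unpinned stays on CPU 0. A rough sketch, assuming Python 3 on
Linux; the helper names are made up for illustration, and the "1-95"
list and the "stress-ng" prefix are taken from the report above:

```python
import os


def parse_cpu_list(spec):
    """Expand a kernel-style CPU list such as '0,2-4' into a set of ints."""
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-", 1)
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def tasks_allowed_on(cpu, comm_prefix):
    """List (pid, comm, allowed_cpus) for processes whose name starts with
    comm_prefix and whose affinity mask includes the given CPU."""
    hits = []
    try:
        entries = os.listdir("/proc")
    except OSError:
        return hits  # no procfs available
    for entry in entries:
        if not entry.isdigit():
            continue
        pid = int(entry)
        try:
            with open(f"/proc/{pid}/comm") as f:
                comm = f.read().strip()
            if not comm.startswith(comm_prefix):
                continue
            allowed = os.sched_getaffinity(pid)
        except OSError:
            continue  # process exited or is not accessible
        if cpu in allowed:
            hits.append((pid, comm, sorted(allowed)))
    return hits


if __name__ == "__main__":
    isolated = parse_cpu_list("1-95")  # from isolcpus= on the command line
    print("isolated CPUs:", len(isolated))
    for pid, comm, allowed in tasks_allowed_on(0, "stress-ng"):
        print(pid, comm, "may run on CPU 0; allowed:", allowed)
```

This only inspects process-level masks via /proc/<pid>/comm; per-thread
masks would need a walk over /proc/<pid>/task as well.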
>
> I'll see if I can modify the test to run as a non-root user or with a
> lower priority. I could also try bumping the priority of jbd2. Maybe
> one of these would allow the journal to complete its work and the test
> to finish?
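
Before changing priorities it may be worth confirming what they are.
The "20" in top's PR column is, for an ordinary SCHED_OTHER task, just
20 + nice, which matches the two lines quoted above (PR 20 with NI 0,
PR 19 with NI -1), so it does not by itself indicate a real-time
priority. A small sketch, assuming Python 3 on Linux; describe() and
top_pr_for_nice() are names invented here:

```python
import os

# Map the policy constants this platform provides to readable names.
POLICY_NAMES = {
    getattr(os, name): name
    for name in ("SCHED_OTHER", "SCHED_FIFO", "SCHED_RR",
                 "SCHED_BATCH", "SCHED_IDLE")
    if hasattr(os, name)
}


def top_pr_for_nice(nice):
    """PR value top displays for a non-real-time task with this nice value."""
    return 20 + nice


def describe(pid):
    """Return scheduling policy, RT priority and nice value for a pid
    (pid 0 means the calling process)."""
    policy = os.sched_getscheduler(pid)
    return {
        "policy": POLICY_NAMES.get(policy, str(policy)),
        "rt_priority": os.sched_getparam(pid).sched_priority,
        "nice": os.getpriority(os.PRIO_PROCESS, pid),
    }


if __name__ == "__main__":
    try:
        print(describe(0))
    except (OSError, AttributeError) as exc:
        print("could not query scheduling parameters:", exc)
```

Running describe() against the jbd2 and stress-ng pids would show
whether they really share a policy and priority.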
>
> Could it be that the system is not hung, but is just waiting to
> complete I/O, which will never happen since the jbd2 threads are stuck?
> In this case, this is not a bug, but a test that is not configured
> correctly for a real-time system. Does that sound plausible? If you
> think that is the case, I'll talk with the bug reporter and assist them
> with running the test properly for a real-time system.
Have you tried checking for low memory during the test? Maybe the
system is unable to write because of slow page-cache allocations (see
https://www.socallinuxexpo.org/sites/default/files/presentations/Exploring%20%20Linux%20Memory%20Usage%20and%20%20Disk%20IO%20performance%20version%203.pdf)
or perhaps there is massive inter-NUMA-node rebalancing going on in
such a large system? Turning on CONFIG_PSI is a relatively easy
way to monitor memory problems. Also, have you tried connecting to
systemd-journald with GDB during the test, to see what it is doing?
Or tried calculating if the bandwidth to your storage devices is
simply maxed out?
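
For the PSI suggestion: with CONFIG_PSI enabled (and psi=1 on the
command line if the kernel was built with CONFIG_PSI_DEFAULT_DISABLED),
the figures appear under /proc/pressure/. A minimal sketch of sampling
them during the test, assuming Python 3; parse_psi() and read_psi() are
illustrative names, not part of any existing tool:

```python
from pathlib import Path


def parse_psi(text):
    """Parse the contents of a /proc/pressure/* file.

    Each line looks like:
        some avg10=0.00 avg60=0.00 avg300=0.00 total=0
    Returns {"some": {"avg10": ..., ...}, "full": {...}} with float values.
    """
    result = {}
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        kind, fields = parts[0], parts[1:]
        result[kind] = {
            key: float(value)
            for key, value in (field.split("=", 1) for field in fields)
        }
    return result


def read_psi(resource):
    """Read /proc/pressure/<resource>; return None if PSI is unavailable."""
    try:
        return parse_psi(Path(f"/proc/pressure/{resource}").read_text())
    except OSError:
        return None


if __name__ == "__main__":
    for resource in ("memory", "io"):
        psi = read_psi(resource)
        if psi is None:
            print(f"{resource}: PSI not available on this kernel")
            continue
        for kind, values in psi.items():
            print(f"{resource} {kind}: avg10={values['avg10']}% "
                  f"avg60={values['avg60']}%")
```

Sustained non-zero "full" values for memory or io while the system
appears hung would point at stalls on those resources rather than at
the scheduler.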
-- Alison Chaiken
Aurora Innovation
achaiken@aurora.tech
Thread overview: 8+ messages
2022-12-15 18:39 System Hang With 5.15.79-rt54 Patch Set Joseph Salisbury
2022-12-15 23:39 ` Joseph Salisbury
2023-01-04 20:47 ` Joseph Salisbury
2023-01-18 18:52 ` Joseph Salisbury
2023-02-16 17:15 ` Sebastian Andrzej Siewior
2023-03-10 21:09 ` Joseph Salisbury
2023-03-13 15:11 ` Alison Chaiken [this message]
2023-02-28 16:50 ` Joseph Salisbury