All of lore.kernel.org
 help / color / mirror / Atom feed
From: Stefan Metzmacher <metze@samba.org>
To: Steven Rostedt <rostedt@goodmis.org>, Ingo Molnar <mingo@redhat.com>
Cc: linux-trace-devel@vger.kernel.org, io-uring <io-uring@vger.kernel.org>
Subject: Tracing busy processes/threads freezes/stalls the whole machine
Date: Thu, 22 Apr 2021 16:26:57 +0200	[thread overview]
Message-ID: <293cfb1d-8a53-21e1-83c1-cdb6e2f32c65@samba.org> (raw)


Hi Steven, hi Ingo,

I recently tried to analyze the performance of Samba using io-uring.

I was using ubuntu 20.04 with the 5.10.0-1023-oem kernel, which is based on v5.10.25, see:
https://kernel.ubuntu.com/git/kernel-ppa/mirror/ubuntu-oem-5.10-focal.git/log/?h=oem-5.10-prep
trace-cmd is at version 2.8.3-4build1.

In order to find the bottleneck I tried to use (trace-cmd is at version 2.8.3-4build1):

  trace-cmd -e all -P ${pid_of_io_uring_worker}

As a result the server was completely dead immediately.

I tried to reproduce this in a virtual machine (inside virtualbox).

I used a modified 'io_uring-cp' that loops forever, see:
https://github.com/metze-samba/liburing/commit/5e98efed053baf03521692e786c1c55690b04d8e

When I run './io_uring-cp-forever link-cp.c file',
I see a 'io_wq_manager' and a 'io_wqe_worker-0' kernel thread,
while './io_uring-cp-forever link-cp.c file' as well as 'io_wqe_worker-0'
consume about 25% cpu each.

When I run 'trace-cmd -e all -P $pid' for 'io_wqe_worker-0' or 'io_wq_manager'
I can reproduce the problem, then I found that the same seems to happen for
also for other kernel threads e.g. '[kworker/1:1-events]', it seems that
it happens for all kernel threads, which are not completely idle.

Which this:

 From 'top':
   1341 root      20   0    2512    576    508 S  33,4   0,1   0:10.39 io_uring-cp-for
   1343 root      20   0       0      0      0 R  29,8   0,0   0:08.43 io_wqe_worker-0
      7 root      20   0       0      0      0 I   0,3   0,0   0:00.31 kworker/0:1-events

   PID 5 is [kworker/0:0-ata_sff]

# trace-cmd record -e all -P 5'
Hit Ctrl^C to stop recording
^CCPU0 data recorded at offset=0x7b8000
    0 bytes in size
CPU1 data recorded at offset=0x7b8000
    69632 bytes in size

# But
# trace-cmd record -e all -P 7
=> machine unresponsive (no blinking cursor on the console anymore)
On the host 'top' shows that the VirtualBoxVM cpu emulator thread 'EMT-1'
uses 100% cpu, so I guess the guest kernel is in something like an endless
recursion loop. Maybe a trace function recursing to itself?

On the same VM I tried a 5.12rc8 kernel and there I can also reproduce the
problem.

I also managed to reproduce the problem without io-uring, just using:

 while true; do cat linux-image-5.12.0-rc8-dbg_5.12.0-rc8-22_amd64.deb > /dev/null; done

in order to keep some kernel threads moving.
This happens with 5.12rc8 and 5.10.0-1023-oem, but I was not able to
reproduce any of this using the 5.8.0-50-generic kernel, see
https://kernel.ubuntu.com/git/ubuntu/ubuntu-focal.git/log/?h=Ubuntu-hwe-5.8-5.8.0-50.56_20.04.1

I was also able to reproduce this with a ubuntu 21.04 vm using
the 5.11.0-14-generic kernel, see:
https://kernel.ubuntu.com/git/ubuntu/ubuntu-hirsute.git/log/?h=Ubuntu-5.11.0-14.15
On this one I only managed to reproduce the problem with
'./io_uring-cp-forever link-cp.c file', but not with
'while true; do cat linux-image-5.12.0-rc8-dbg_5.12.0-rc8-22_amd64.deb > /dev/null; done'


So it seems the problem was introduced after 5.8 and is not really related to
io-uring. And it may not be purely related to kernel threads.

With this on 5.12-rc8 (again):

  └─tmux: server,903
      ├─bash,904
      │   └─io_uring-cp-for,925 link-cp.c file
      │       ├─{iou-mgr-925},926
      │       └─{iou-wrk-925},927
      └─bash,929
          └─pstree,938 -a -t -p

I was able to to trace once:

root@ub1704-166:~# trace-cmd record -e all -P 925
Hit Ctrl^C to stop recording
^CCPU0 data recorded at offset=0x7b8000
    10842112 bytes in size
CPU1 data recorded at offset=0x120f000
    36450304 bytes in size

But the 2nd run reproduced the problem:
root@ub1704-166:~# trace-cmd record -e all -P 925

I was also able to reproduce it with:

while true; do cat linux-image-5.12.0-rc8-dbg_5.12.0-rc8-22_amd64.deb > /dev/null; done
and
pidofcat=$(pidof cat); echo $pidofcat; trace-cmd record -e all -P $pidofcat

So it seems any busy thread (user or kernel) triggers the problem.

Any ideas what has changed after 5.8?

Thanks!
metze

             reply	other threads:[~2021-04-22 14:27 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-22 14:26 Stefan Metzmacher [this message]
2021-04-22 14:55 ` Tracing busy processes/threads freezes/stalls the whole machine Stefan Metzmacher
2021-05-04 12:26 ` Stefan Metzmacher
2021-05-04 13:24 ` Steven Rostedt
2021-05-04 13:28   ` Stefan Metzmacher
2021-05-04 13:32     ` Steven Rostedt
2021-05-04 13:35     ` Steven Rostedt
2021-05-05  9:50       ` Stefan Metzmacher
2021-05-31 15:39         ` Stefan Metzmacher
2021-07-19 17:07           ` Stefan Metzmacher
2021-07-22 22:43             ` sched_waking vs. set_event_pid crash (Re: Tracing busy processes/threads freezes/stalls the whole machine) Stefan Metzmacher
2021-07-23  1:41               ` Steven Rostedt
2021-07-23  2:51               ` Steven Rostedt
2021-07-23 10:35                 ` Stefan Metzmacher
2021-07-23 11:29                   ` Steven Rostedt
2021-07-23 11:53                     ` Stefan Metzmacher
2021-07-23 12:41                       ` Steven Rostedt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=293cfb1d-8a53-21e1-83c1-cdb6e2f32c65@samba.org \
    --to=metze@samba.org \
    --cc=io-uring@vger.kernel.org \
    --cc=linux-trace-devel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=rostedt@goodmis.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.