From: vcaputo@pengaru.com
To: linux-kernel <linux-kernel@vger.kernel.org>
Cc: timmurray@google.com, tj@kernel.org
Subject: [REGRESSION] (>= v4.12) IO w/dmcrypt causing audio underruns
Date: Wed, 29 Nov 2017 10:39:19 -0800
Message-ID: <20171129183919.GQ692@shells.gnugeneration.com>

[-- Attachment #1: Type: text/plain, Size: 3396 bytes --]

Hello,

Recently I noticed substantial audio dropouts when listening to MP3s in
`cmus` while doing big and churny `git checkout` commands in my linux git
tree.

It's not something I've done much of over the last couple of months, so I
hadn't noticed until yesterday, but I don't remember this being a problem in
recent history.

As there's quite an accumulation of similarly configured and built kernels
in my grub menu, it was trivial to determine approximately when this began:

4.11.0: no dropouts
4.12.0-rc7: dropouts
4.14.0-rc6: dropouts (seem more substantial as well, didn't investigate)

Watching top while this is going on in the various kernel versions, it's
apparent that the kworker behavior changed.  Both the priority and quantity
of running kworker threads are elevated in kernels experiencing dropouts.
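
For anyone wanting to capture the same observation without eyeballing top,
here is a minimal sketch that lists kworker threads and their nice values
from /proc.  It assumes the field layout documented in proc(5), where
priority is field 18 and nice is field 19 of /proc/<pid>/stat; the helper
names `parse_stat` and `list_kworkers` are made up for illustration.

```python
import os


def parse_stat(line):
    """Parse one /proc/<pid>/stat line into (pid, comm, priority, nice)."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so locate it via the first '(' and the last ')'.
    lpar = line.index("(")
    rpar = line.rindex(")")
    pid = int(line[:lpar])
    comm = line[lpar + 1:rpar]
    rest = line[rpar + 1:].split()
    # rest[0] is field 3 (state), so field 18 (priority) is rest[15]
    # and field 19 (nice) is rest[16].
    return pid, comm, int(rest[15]), int(rest[16])


def list_kworkers(proc="/proc"):
    """Return (pid, comm, priority, nice) for every kworker thread found."""
    if not os.path.isdir(proc):
        return []
    found = []
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "stat")) as f:
                info = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # task exited mid-scan or the line was unreadable
        if info[1].startswith("kworker/"):
            found.append(info)
    return found


if __name__ == "__main__":
    for pid, comm, prio, nice in sorted(list_kworkers()):
        marker = "  <- negative nice" if nice < 0 else ""
        print(f"{pid:>7} {comm:<24} prio={prio:<4} nice={nice}{marker}")
```

Running this during the `git checkout` on an affected and an unaffected
kernel and comparing how many kworkers show a negative nice value should
make the difference visible, though it only samples a single instant.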

Searching through the commit history for v4.11..v4.12 uncovered:

commit a1b89132dc4f61071bdeaab92ea958e0953380a1
Author: Tim Murray <timmurray@google.com>
Date:   Fri Apr 21 11:11:36 2017 +0200

    dm crypt: use WQ_HIGHPRI for the IO and crypt workqueues
    
    Running dm-crypt with workqueues at the standard priority results in IO
    competing for CPU time with standard user apps, which can lead to
    pipeline bubbles and seriously degraded performance.  Move to using
    WQ_HIGHPRI workqueues to protect against that.
    
    Signed-off-by: Tim Murray <timmurray@google.com>
    Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
    Signed-off-by: Mike Snitzer <snitzer@redhat.com>

---

Reverting a1b8913 from 4.14.0-rc6, my current kernel, eliminates the
problem completely.

Looking at the diff in that commit, it looks like the commit message isn't
even accurate; not only is the priority of the dmcrypt workqueues being
changed, they're also being made "CPU intensive" workqueues.

This combination appears to result in both elevated scheduling priority and
a greater quantity of participating worker threads, effectively starving any
normal-priority user task during periods of heavy IO on dmcrypt volumes.

I don't know what the right solution is here.  It seems to me we're lacking
the appropriate mechanism for charging CPU resources consumed on behalf of
user processes in kworker threads to the work-causing process.

What effectively happens is my normal `git` user process is able to
greatly amplify what share of CPU it takes from the system by generating IO
on what happens to be a high-priority CPU-intensive storage volume.

It looks potentially complicated to fix properly, but I suspect at its core
this may be a fairly longstanding shortcoming of the page cache and its
asynchronous design, one that has been exacerbated substantially by the
introduction of CPU-intensive storage subsystems like dmcrypt.

If we imagine the whole stack simplified, where all the IO was being done
synchronously in-band, and the dmcrypt kernel code simply ran in the
IO-causing process context, it would be getting charged to the calling
process and scheduled accordingly.  The resource accounting and scheduling
problems all emerge with the page cache, buffered IO, and async background
writeback in a pool of unrelated worker threads, etc.  That's how it
appears to me anyways...
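
To make the accounting argument concrete, here is a toy model, not kernel
code.  Each unit of CPU work records which task caused it and which thread
actually ran it; the only thing that varies is who gets charged.  The task
names and millisecond figures are invented for illustration, and `account`
is a hypothetical helper.

```python
from collections import defaultdict


def account(work_items, charge_to_submitter):
    """Sum CPU time per task.

    work_items is a list of (submitter, runner, cpu_ms) tuples.  When
    charge_to_submitter is False the thread that ran the work is charged,
    as with async workers; when True the task that caused the work is.
    """
    charged = defaultdict(int)
    for submitter, runner, cpu_ms in work_items:
        charged[submitter if charge_to_submitter else runner] += cpu_ms
    return dict(charged)


# Hypothetical sample: git does a little work itself and causes much more
# in two crypto worker threads, while cmus only needs a small slice.
items = [
    ("git", "git", 20),
    ("git", "kworker/0:1H", 40),
    ("git", "kworker/1:1H", 40),
    ("cmus", "cmus", 10),
]

if __name__ == "__main__":
    print("charged to runner:   ", account(items, charge_to_submitter=False))
    print("charged to submitter:", account(items, charge_to_submitter=True))
```

Charged to the runner, git appears to have used 20 ms while the workers
absorb the other 80 ms; charged to the submitter, git is responsible for
all 100 ms and would be scheduled against cmus accordingly.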

The system used is a 1.8 GHz ThinkPad X61s with an 840 EVO SSD, running LVM
on dmcrypt.  The kernel .config is attached in case it's of interest.

Thanks,
Vito Caputo

[-- Attachment #2: config-x61s.gz --]
[-- Type: application/gzip, Size: 25327 bytes --]
