From: Nathan Huckleberry <nhuck@google.com>
Cc: Nathan Huckleberry <nhuck@google.com>,
	Eric Biggers <ebiggers@kernel.org>,
	"Theodore Y. Ts'o" <tytso@mit.edu>,
	fsverity@lists.linux.dev, linux-kernel@vger.kernel.org
Subject: [PATCH v2] fsverity: Remove WQ_UNBOUND from fsverity read workqueue
Date: Fri, 10 Mar 2023 11:33:25 -0800
Message-ID: <20230310193325.620493-1-nhuck@google.com>

WQ_UNBOUND causes significant scheduler latency on ARM64/Android.  This
is problematic for latency sensitive workloads, like I/O
post-processing.

Removing WQ_UNBOUND gives a 96% reduction in fsverity workqueue related
scheduler latency and improves app cold startup times by ~30ms.
WQ_UNBOUND was also removed from the dm-verity workqueue for the same
reason [1].

This code was tested by running Android app startup benchmarks and
measuring how long the fsverity workqueue spent in the runnable state.

Before
Total workqueue scheduler latency: 553800us
After
Total workqueue scheduler latency: 18962us

[1]: https://lore.kernel.org/all/20230202012348.885402-1-nhuck@google.com/

Signed-off-by: Nathan Huckleberry <nhuck@google.com>
---
Changelog:
v1 -> v2:
- Added comment about WQ_UNBOUND
- Added info about related dm-verity patches in commit message

 fs/verity/verify.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/fs/verity/verify.c b/fs/verity/verify.c
index f50e3b5b52c9..782b8b4a24c1 100644
--- a/fs/verity/verify.c
+++ b/fs/verity/verify.c
@@ -387,15 +387,15 @@ EXPORT_SYMBOL_GPL(fsverity_enqueue_verify_work);
 int __init fsverity_init_workqueue(void)
 {
 	/*
-	 * Use an unbound workqueue to allow bios to be verified in parallel
-	 * even when they happen to complete on the same CPU.  This sacrifices
-	 * locality, but it's worthwhile since hashing is CPU-intensive.
-	 *
 	 * Also use a high-priority workqueue to prioritize verification work,
 	 * which blocks reads from completing, over regular application tasks.
+	 *
+	 * This workqueue is not marked as unbound for performance reasons.
+	 * Using an unbound workqueue for crypto operations causes excessive
+	 * scheduler latency on ARM64.
 	 */
 	fsverity_read_workqueue = alloc_workqueue("fsverity_read_queue",
-						  WQ_UNBOUND | WQ_HIGHPRI,
+						  WQ_HIGHPRI,
 						  num_online_cpus());
 	if (!fsverity_read_workqueue)
 		return -ENOMEM;
-- 
2.40.0.rc1.284.g88254d51c5-goog