linux-kernel.vger.kernel.org archive mirror
* [PATCH] fscrypt: use unbound workqueue for decryption
@ 2018-04-20 23:30 Eric Biggers
From: Eric Biggers @ 2018-04-20 23:30 UTC
  To: linux-fscrypt, Theodore Y . Ts'o
  Cc: Jaegeuk Kim, Paul Crowley, Enric Balletbo i Serra,
	Mikulas Patocka, linux-kernel, Eric Biggers

Improve fscrypt read performance by switching the decryption workqueue
from bound to unbound.  With the bound workqueue, when multiple bios
completed on the same CPU, they were decrypted on that same CPU.  But
with the unbound queue, they are now decrypted in parallel on any CPU.

Although fscrypt read performance can be tough to measure due to the
many sources of variation, this change is most beneficial when
decryption is slow, e.g. on CPUs without AES instructions.  For example,
I timed tarring up encrypted directories on f2fs.  On x86 with AES-NI
instructions disabled, the unbound workqueue improved performance by
about 25-35%, using 1 to NUM_CPUs jobs with 4 or 8 CPUs available.  But
with AES-NI enabled, performance was unchanged to within ~2%.

I also did the same test on a quad-core ARM CPU using xts-speck128-neon
encryption.  There performance was usually about 10% better with the
unbound workqueue, bringing it closer to the unencrypted speed.

The unbound workqueue may be worse in some cases due to worse locality,
but I think it's still the better default.  dm-crypt uses an unbound
workqueue by default too, so this change makes fscrypt match.

Signed-off-by: Eric Biggers <ebiggers@google.com>
---
 fs/crypto/crypto.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/fs/crypto/crypto.c b/fs/crypto/crypto.c
index ce654526c0fb..984e190f9b89 100644
--- a/fs/crypto/crypto.c
+++ b/fs/crypto/crypto.c
@@ -427,8 +427,17 @@ int fscrypt_initialize(unsigned int cop_flags)
  */
 static int __init fscrypt_init(void)
 {
+	/*
+	 * Use an unbound workqueue to allow bios to be decrypted in parallel
+	 * even when they happen to complete on the same CPU.  This sacrifices
+	 * locality, but it's worthwhile since decryption is CPU-intensive.
+	 *
+	 * Also use a high-priority workqueue to prioritize decryption work,
+	 * which blocks reads from completing, over regular application tasks.
+	 */
 	fscrypt_read_workqueue = alloc_workqueue("fscrypt_read_queue",
-							WQ_HIGHPRI, 0);
+						 WQ_UNBOUND | WQ_HIGHPRI,
+						 num_online_cpus());
 	if (!fscrypt_read_workqueue)
 		goto fail;
 
-- 
2.17.0.484.g0c8726318c-goog
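
As a rough illustration of the reasoning in the commit message, the following userspace C sketch models how long a batch of bios takes to decrypt when they all complete on the same CPU. It is a toy model and not kernel code: the function names, the fixed per-bio decryption cost, and the assumption that an unbound queue with max_active == num_online_cpus() can keep every CPU busy are all illustrative assumptions. It ignores I/O time, scheduling, and cache locality, so it gives an idealized upper bound rather than the 25-35% and ~10% gains measured above.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Toy model (assumption): every bio costs the same decrypt_us and nothing
 * else (I/O, scheduling, cache misses) takes any time.
 */

/*
 * Bound queue: work runs on the CPU where the bio completed, so n_bios that
 * completed on one CPU are decrypted back to back on that CPU.
 */
uint64_t bound_makespan_us(uint64_t n_bios, uint64_t decrypt_us)
{
	return n_bios * decrypt_us;
}

/*
 * Unbound queue with max_active == n_cpus: up to n_cpus work items can run
 * at once on any CPU, so the batch needs ceil(n_bios / n_cpus) rounds.
 */
uint64_t unbound_makespan_us(uint64_t n_bios, uint64_t decrypt_us,
			     uint64_t n_cpus)
{
	assert(n_cpus > 0);

	uint64_t rounds = (n_bios + n_cpus - 1) / n_cpus;

	return rounds * decrypt_us;
}
```

For example, with 8 bios at a hypothetical 100 us each on 4 CPUs, the bound case takes 800 us in this model and the unbound case takes 200 us. With a single bio the two cases are equal, which matches the intuition that the change only helps when several bios complete on the same CPU.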

* Re: [PATCH] fscrypt: use unbound workqueue for decryption
@ 2018-05-21  0:55 ` Theodore Y. Ts'o
From: Theodore Y. Ts'o @ 2018-05-21  0:55 UTC
  To: Eric Biggers
  Cc: linux-fscrypt, Jaegeuk Kim, Paul Crowley, Enric Balletbo i Serra,
	Mikulas Patocka, linux-kernel

On Fri, Apr 20, 2018 at 04:30:02PM -0700, Eric Biggers wrote:
> Improve fscrypt read performance by switching the decryption workqueue
> from bound to unbound.  With the bound workqueue, when multiple bios
> completed on the same CPU, they were decrypted on that same CPU.  But
> with the unbound queue, they are now decrypted in parallel on any CPU.
> 
> Although fscrypt read performance can be tough to measure due to the
> many sources of variation, this change is most beneficial when
> decryption is slow, e.g. on CPUs without AES instructions.  For example,
> I timed tarring up encrypted directories on f2fs.  On x86 with AES-NI
> instructions disabled, the unbound workqueue improved performance by
> about 25-35%, using 1 to NUM_CPUs jobs with 4 or 8 CPUs available.  But
> with AES-NI enabled, performance was unchanged to within ~2%.
> 
> I also did the same test on a quad-core ARM CPU using xts-speck128-neon
> encryption.  There performance was usually about 10% better with the
> unbound workqueue, bringing it closer to the unencrypted speed.
> 
> The unbound workqueue may be worse in some cases due to worse locality,
> but I think it's still the better default.  dm-crypt uses an unbound
> workqueue by default too, so this change makes fscrypt match.
> 
> Signed-off-by: Eric Biggers <ebiggers@google.com>

Applied, thanks.

						- Ted
