All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] fsverity: Remove WQ_UNBOUND from fsverity read workqueue
@ 2023-03-09 21:37 ` Nathan Huckleberry
  0 siblings, 0 replies; 6+ messages in thread
From: Nathan Huckleberry @ 2023-03-09 21:37 UTC (permalink / raw)
  Cc: Nathan Huckleberry, Eric Biggers, Theodore Y. Ts'o, fsverity,
	linux-kernel

WQ_UNBOUND causes significant scheduler latency on ARM64/Android.  This
is problematic for latency sensitive workloads like I/O post-processing.

Removing WQ_UNBOUND gives a 96% reduction in fsverity workqueue related
scheduler latency and improves app cold startup times by ~30ms.

This code was tested by running Android app startup benchmarks and
measuring how long the fsverity workqueue spent in the ready queue.

Before
Total workqueue scheduler latency: 553800us
After
Total workqueue scheduler latency: 18962us

Change-Id: I693efee541757851ed6d229430111cd763d39067
Signed-off-by: Nathan Huckleberry <nhuck@google.com>
---
 fs/verity/verify.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/verity/verify.c b/fs/verity/verify.c
index f50e3b5b52c9..e8ec37774d63 100644
--- a/fs/verity/verify.c
+++ b/fs/verity/verify.c
@@ -395,7 +395,7 @@ int __init fsverity_init_workqueue(void)
 	 * which blocks reads from completing, over regular application tasks.
 	 */
 	fsverity_read_workqueue = alloc_workqueue("fsverity_read_queue",
-						  WQ_UNBOUND | WQ_HIGHPRI,
+						  WQ_HIGHPRI,
 						  num_online_cpus());
 	if (!fsverity_read_workqueue)
 		return -ENOMEM;
-- 
2.40.0.rc1.284.g88254d51c5-goog


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH] fsverity: Remove WQ_UNBOUND from fsverity read workqueue
@ 2023-03-09 21:37 ` Nathan Huckleberry
  0 siblings, 0 replies; 6+ messages in thread
From: Nathan Huckleberry @ 2023-03-09 21:37 UTC (permalink / raw)
  Cc: Nathan Huckleberry, Eric Biggers, Theodore Y. Ts'o, fsverity,
	linux-kernel

WQ_UNBOUND causes significant scheduler latency on ARM64/Android.  This
is problematic for latency sensitive workloads like I/O post-processing.

Removing WQ_UNBOUND gives a 96% reduction in fsverity workqueue related
scheduler latency and improves app cold startup times by ~30ms.

This code was tested by running Android app startup benchmarks and
measuring how long the fsverity workqueue spent in the ready queue.

Before
Total workqueue scheduler latency: 553800us
After
Total workqueue scheduler latency: 18962us

Change-Id: I693efee541757851ed6d229430111cd763d39067
Signed-off-by: Nathan Huckleberry <nhuck@google.com>
---
 fs/verity/verify.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/verity/verify.c b/fs/verity/verify.c
index f50e3b5b52c9..e8ec37774d63 100644
--- a/fs/verity/verify.c
+++ b/fs/verity/verify.c
@@ -395,7 +395,7 @@ int __init fsverity_init_workqueue(void)
 	 * which blocks reads from completing, over regular application tasks.
 	 */
 	fsverity_read_workqueue = alloc_workqueue("fsverity_read_queue",
-						  WQ_UNBOUND | WQ_HIGHPRI,
+						  WQ_HIGHPRI,
 						  num_online_cpus());
 	if (!fsverity_read_workqueue)
 		return -ENOMEM;
-- 
2.40.0.rc1.284.g88254d51c5-goog


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] fsverity: Remove WQ_UNBOUND from fsverity read workqueue
  2023-03-09 21:37 ` Nathan Huckleberry
  (?)
@ 2023-03-10  5:11 ` Eric Biggers
  2023-03-10  6:55   ` Hillf Danton
  -1 siblings, 1 reply; 6+ messages in thread
From: Eric Biggers @ 2023-03-10  5:11 UTC (permalink / raw)
  To: Nathan Huckleberry; +Cc: Theodore Y. Ts'o, fsverity, linux-kernel

On Thu, Mar 09, 2023 at 01:37:41PM -0800, Nathan Huckleberry wrote:
> WQ_UNBOUND causes significant scheduler latency on ARM64/Android.  This
> is problematic for latency sensitive workloads like I/O post-processing.
> 
> Removing WQ_UNBOUND gives a 96% reduction in fsverity workqueue related
> scheduler latency and improves app cold startup times by ~30ms.

Maybe mention that WQ_UNBOUND was recently removed from the dm-verity workqueue
too, for the same reason?

I'm still amazed that it's such a big improvement!  I don't really need it to
apply this patch, but it would be very interesting to know exactly why the
latency is so bad with WQ_UNBOUND.

> 
> This code was tested by running Android app startup benchmarks and
> measuring how long the fsverity workqueue spent in the ready queue.
> 
> Before
> Total workqueue scheduler latency: 553800us
> After
> Total workqueue scheduler latency: 18962us
> 
> Change-Id: I693efee541757851ed6d229430111cd763d39067

No Change-Id in upstream patches, please.

> diff --git a/fs/verity/verify.c b/fs/verity/verify.c
> index f50e3b5b52c9..e8ec37774d63 100644
> --- a/fs/verity/verify.c
> +++ b/fs/verity/verify.c
> @@ -395,7 +395,7 @@ int __init fsverity_init_workqueue(void)
>  	 * which blocks reads from completing, over regular application tasks.
>  	 */
>  	fsverity_read_workqueue = alloc_workqueue("fsverity_read_queue",
> -						  WQ_UNBOUND | WQ_HIGHPRI,
> +						  WQ_HIGHPRI,
>  						  num_online_cpus());

There's a comment just above here that explains why WQ_UNBOUND is being used.
It needs to be updated to explain why WQ_UNBOUND is *not* being used.

- Eric

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] fsverity: Remove WQ_UNBOUND from fsverity read workqueue
  2023-03-10  5:11 ` Eric Biggers
@ 2023-03-10  6:55   ` Hillf Danton
  2023-03-10 19:09     ` Nathan Huckleberry
  0 siblings, 1 reply; 6+ messages in thread
From: Hillf Danton @ 2023-03-10  6:55 UTC (permalink / raw)
  To: Eric Biggers; +Cc: Nathan Huckleberry, fsverity, linux-kernel

On 9 Mar 2023 21:11:47 -0800 Eric Biggers <ebiggers@kernel.org>
> On Thu, Mar 09, 2023 at 01:37:41PM -0800, Nathan Huckleberry wrote:
> > WQ_UNBOUND causes significant scheduler latency on ARM64/Android.  This
> > is problematic for latency sensitive workloads like I/O post-processing.
> > 
> > Removing WQ_UNBOUND gives a 96% reduction in fsverity workqueue related
> > scheduler latency and improves app cold startup times by ~30ms.
> 
> Maybe mention that WQ_UNBOUND was recently removed from the dm-verity workqueue
> too, for the same reason?
> 
> I'm still amazed that it's such a big improvement!  I don't really need it to
> apply this patch, but it would be very interesting to know exactly why the
> latency is so bad with WQ_UNBOUND.
> 
> > This code was tested by running Android app startup benchmarks and
> > measuring how long the fsverity workqueue spent in the ready queue.
> > 
> > Before
> > Total workqueue scheduler latency: 553800us
> > After
> > Total workqueue scheduler latency: 18962us

Given the gap between data above and the 15253 us in diagram[1], and
the SHA instructions[2], could you specify a bit on your test?

[1] https://lore.kernel.org/linux-erofs/20230106073502.4017276-1-dhavale@google.com/
[2] https://lore.kernel.org/lkml/CAJkfWY490-m6wNubkxiTPsW59sfsQs37Wey279LmiRxKt7aQYg@mail.gmail.com/

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] fsverity: Remove WQ_UNBOUND from fsverity read workqueue
  2023-03-10  6:55   ` Hillf Danton
@ 2023-03-10 19:09     ` Nathan Huckleberry
  2023-03-11  0:15       ` Hillf Danton
  0 siblings, 1 reply; 6+ messages in thread
From: Nathan Huckleberry @ 2023-03-10 19:09 UTC (permalink / raw)
  To: Hillf Danton; +Cc: Eric Biggers, fsverity, linux-kernel

On Fri, Mar 10, 2023 at 12:01 AM Hillf Danton <hdanton@sina.com> wrote:
>
> On 9 Mar 2023 21:11:47 -0800 Eric Biggers <ebiggers@kernel.org>
> > On Thu, Mar 09, 2023 at 01:37:41PM -0800, Nathan Huckleberry wrote:
> > > WQ_UNBOUND causes significant scheduler latency on ARM64/Android.  This
> > > is problematic for latency sensitive workloads like I/O post-processing.
> > >
> > > Removing WQ_UNBOUND gives a 96% reduction in fsverity workqueue related
> > > scheduler latency and improves app cold startup times by ~30ms.
> >
> > Maybe mention that WQ_UNBOUND was recently removed from the dm-verity workqueue
> > too, for the same reason?
> >
> > I'm still amazed that it's such a big improvement!  I don't really need it to
> > apply this patch, but it would be very interesting to know exactly why the
> > latency is so bad with WQ_UNBOUND.

My current guess for the root cause is excessing saving/restoring of
the FPSIMD state.

> >
> > > This code was tested by running Android app startup benchmarks and
> > > measuring how long the fsverity workqueue spent in the ready queue.
> > >
> > > Before
> > > Total workqueue scheduler latency: 553800us
> > > After
> > > Total workqueue scheduler latency: 18962us
>
> Given the gap between data above and the 15253 us in diagram[1], and
> the SHA instructions[2], could you specify a bit on your test?

The test I'm running opens the Android messaging APK which is
validated with fsverity. It opens the messaging app 25 times, dropping
caches each time. The benchmark produces a Perfetto trace which we use
to compute the scheduler latency. We sum up the amount of time that
each fsverity worker spent in the ready state. The test in [1] is
similar, but may be using a different APK. These tests are not in
AOSP, so I can't share a link to them, but I would expect that fio on
a ramdisk would produce similarly good results.

>
> [1] https://lore.kernel.org/linux-erofs/20230106073502.4017276-1-dhavale@google.com/
> [2] https://lore.kernel.org/lkml/CAJkfWY490-m6wNubkxiTPsW59sfsQs37Wey279LmiRxKt7aQYg@mail.gmail.com/

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] fsverity: Remove WQ_UNBOUND from fsverity read workqueue
  2023-03-10 19:09     ` Nathan Huckleberry
@ 2023-03-11  0:15       ` Hillf Danton
  0 siblings, 0 replies; 6+ messages in thread
From: Hillf Danton @ 2023-03-11  0:15 UTC (permalink / raw)
  To: Nathan Huckleberry; +Cc: Eric Biggers, fsverity, linux-kernel

On 10 Mar 2023 11:09:55 -0800 Nathan Huckleberry <nhuck@google.com>
> On Fri, Mar 10, 2023 at 12:01=E2=80=AFAM Hillf Danton <hdanton@sina.com> wr=
> ote:
> >
> > On 9 Mar 2023 21:11:47 -0800 Eric Biggers <ebiggers@kernel.org>
> > > On Thu, Mar 09, 2023 at 01:37:41PM -0800, Nathan Huckleberry wrote:
> > > > WQ_UNBOUND causes significant scheduler latency on ARM64/Android.  Th=
> is
> > > > is problematic for latency sensitive workloads like I/O post-processi=
> ng.
> > > >
> > > > Removing WQ_UNBOUND gives a 96% reduction in fsverity workqueue relat=
> ed
> > > > scheduler latency and improves app cold startup times by ~30ms.
> > >
> > > Maybe mention that WQ_UNBOUND was recently removed from the dm-verity w=
> orkqueue
> > > too, for the same reason?
> > >
> > > I'm still amazed that it's such a big improvement!  I don't really need=
>  it to
> > > apply this patch, but it would be very interesting to know exactly why =
> the
> > > latency is so bad with WQ_UNBOUND.
> 
> My current guess for the root cause is excessing saving/restoring of
> the FPSIMD state.
> 
> > >
> > > > This code was tested by running Android app startup benchmarks and
> > > > measuring how long the fsverity workqueue spent in the ready queue.
> > > >
> > > > Before
> > > > Total workqueue scheduler latency: 553800us
> > > > After
> > > > Total workqueue scheduler latency: 18962us
> >
> > Given the gap between data above and the 15253 us in diagram[1], and
> > the SHA instructions[2], could you specify a bit on your test?
> 
> The test I'm running opens the Android messaging APK which is
> validated with fsverity. It opens the messaging app 25 times, dropping
> caches each time. The benchmark produces a Perfetto trace which we use
> to compute the scheduler latency. We sum up the amount of time that
> each fsverity worker spent in the ready state. The test in [1] is
> similar, but may be using a different APK. These tests are not in
> AOSP, so I can't share a link to them, but I would expect that fio on
> a ramdisk would produce similarly good results.

Thanks for your introduction to the test.
Are the similar results ARM64/Android specific? Could this patch help X86?
> 
> >
> > [1] https://lore.kernel.org/linux-erofs/20230106073502.4017276-1-dhavale@=
> google.com/
> > [2] https://lore.kernel.org/lkml/CAJkfWY490-m6wNubkxiTPsW59sfsQs37Wey279L=
> miRxKt7aQYg@mail.gmail.com/

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2023-03-11  0:20 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-03-09 21:37 [PATCH] fsverity: Remove WQ_UNBOUND from fsverity read workqueue Nathan Huckleberry
2023-03-09 21:37 ` Nathan Huckleberry
2023-03-10  5:11 ` Eric Biggers
2023-03-10  6:55   ` Hillf Danton
2023-03-10 19:09     ` Nathan Huckleberry
2023-03-11  0:15       ` Hillf Danton

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.