All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH] test-coroutine: add baseline test that times the cost of function calls
@ 2014-08-06  9:33 Paolo Bonzini
  2014-08-07  9:33 ` Stefan Hajnoczi
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Paolo Bonzini @ 2014-08-06  9:33 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, ming.lei, stefanha

This can be used to compute the cost of coroutine operations.  In the
end the cost of the function call is a few clock cycles, so it's pretty
cheap for now, but it may become more relevant as the coroutine code
is optimized.

For example, here are the results on my machine:

   Function call 100000000 iterations: 0.173884 s
   Yield 100000000 iterations: 8.445064 s
   Lifecycle 1000000 iterations: 0.098445 s
   Nesting 10000 iterations of 1000 depth each: 7.406431 s

One yield takes 83 nanoseconds, one enter takes 97 nanoseconds,
one coroutine allocation takes (roughly, since some of the allocations
in the nesting test do hit the pool) 739 nanoseconds:

   (8.445064 - 0.173884) * 10^9 / 100000000 = 82.7
   (0.098445 * 100 - 0.173884) * 10^9 / 100000000 = 96.7
   (7.406431 * 10 - 0.173884) * 10^9 / 100000000 = 738.9

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 tests/test-coroutine.c | 24 ++++++++++++++++
 1 files changed, 24 insertions(+)

diff --git a/tests/test-coroutine.c b/tests/test-coroutine.c
index 760636d..6e634f4 100644
--- a/tests/test-coroutine.c
+++ b/tests/test-coroutine.c
@@ -288,6 +288,29 @@ static void perf_yield(void)
         maxcycles, duration);
 }
 
+static __attribute__((noinline)) void dummy(unsigned *i)
+{
+    (*i)--;
+}
+
+static void perf_baseline(void)
+{
+    unsigned int i, maxcycles;
+    double duration;
+
+    maxcycles = 100000000;
+    i = maxcycles;
+
+    g_test_timer_start();
+    while (i > 0) {
+        dummy(&i);
+    }
+    duration = g_test_timer_elapsed();
+
+    g_test_message("Function call %u iterations: %f s\n",
+        maxcycles, duration);
+}
+
 int main(int argc, char **argv)
 {
     g_test_init(&argc, &argv, NULL);
@@ -301,6 +324,7 @@ int main(int argc, char **argv)
         g_test_add_func("/perf/lifecycle", perf_lifecycle);
         g_test_add_func("/perf/nesting", perf_nesting);
         g_test_add_func("/perf/yield", perf_yield);
+        g_test_add_func("/perf/function-call", perf_baseline);
     }
     return g_test_run();
 }
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [Qemu-devel] [PATCH] test-coroutine: add baseline test that times the cost of function calls
  2014-08-06  9:33 [Qemu-devel] [PATCH] test-coroutine: add baseline test that times the cost of function calls Paolo Bonzini
@ 2014-08-07  9:33 ` Stefan Hajnoczi
  2014-08-07 14:08   ` Kevin Wolf
  2014-08-11  0:44 ` Ming Lei
  2014-08-11 10:08 ` Ming Lei
  2 siblings, 1 reply; 5+ messages in thread
From: Stefan Hajnoczi @ 2014-08-07  9:33 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kwolf, ming.lei, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 1198 bytes --]

On Wed, Aug 06, 2014 at 11:33:41AM +0200, Paolo Bonzini wrote:
> This can be used to compute the cost of coroutine operations.  In the
> end the cost of the function call is a few clock cycles, so it's pretty
> cheap for now, but it may become more relevant as the coroutine code
> is optimized.
> 
> For example, here are the results on my machine:
> 
>    Function call 100000000 iterations: 0.173884 s
>    Yield 100000000 iterations: 8.445064 s
>    Lifecycle 1000000 iterations: 0.098445 s
>    Nesting 10000 iterations of 1000 depth each: 7.406431 s
> 
> One yield takes 83 nanoseconds, one enter takes 97 nanoseconds,
> one coroutine allocation takes (roughly, since some of the allocations
> in the nesting test do hit the pool) 739 nanoseconds:
> 
>    (8.445064 - 0.173884) * 10^9 / 100000000 = 82.7
>    (0.098445 * 100 - 0.173884) * 10^9 / 100000000 = 96.7
>    (7.406431 * 10 - 0.173884) * 10^9 / 100000000 = 738.9
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  tests/test-coroutine.c | 24 ++++++++++++++++
>  1 files changed, 24 insertions(+)

Can't hurt to have this as a comparison.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

[-- Attachment #2: Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [Qemu-devel] [PATCH] test-coroutine: add baseline test that times the cost of function calls
  2014-08-07  9:33 ` Stefan Hajnoczi
@ 2014-08-07 14:08   ` Kevin Wolf
  0 siblings, 0 replies; 5+ messages in thread
From: Kevin Wolf @ 2014-08-07 14:08 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: Paolo Bonzini, ming.lei, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 1362 bytes --]

Am 07.08.2014 um 11:33 hat Stefan Hajnoczi geschrieben:
> On Wed, Aug 06, 2014 at 11:33:41AM +0200, Paolo Bonzini wrote:
> > This can be used to compute the cost of coroutine operations.  In the
> > end the cost of the function call is a few clock cycles, so it's pretty
> > cheap for now, but it may become more relevant as the coroutine code
> > is optimized.
> > 
> > For example, here are the results on my machine:
> > 
> >    Function call 100000000 iterations: 0.173884 s
> >    Yield 100000000 iterations: 8.445064 s
> >    Lifecycle 1000000 iterations: 0.098445 s
> >    Nesting 10000 iterations of 1000 depth each: 7.406431 s
> > 
> > One yield takes 83 nanoseconds, one enter takes 97 nanoseconds,
> > one coroutine allocation takes (roughly, since some of the allocations
> > in the nesting test do hit the pool) 739 nanoseconds:
> > 
> >    (8.445064 - 0.173884) * 10^9 / 100000000 = 82.7
> >    (0.098445 * 100 - 0.173884) * 10^9 / 100000000 = 96.7
> >    (7.406431 * 10 - 0.173884) * 10^9 / 100000000 = 738.9
> > 
> > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> > ---
> >  tests/test-coroutine.c | 24 ++++++++++++++++
> >  1 files changed, 24 insertions(+)
> 
> Can't hurt to have this as a comparison.
> 
> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

Thanks, applied to the block branch.

Kevin

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [Qemu-devel] [PATCH] test-coroutine: add baseline test that times the cost of function calls
  2014-08-06  9:33 [Qemu-devel] [PATCH] test-coroutine: add baseline test that times the cost of function calls Paolo Bonzini
  2014-08-07  9:33 ` Stefan Hajnoczi
@ 2014-08-11  0:44 ` Ming Lei
  2014-08-11 10:08 ` Ming Lei
  2 siblings, 0 replies; 5+ messages in thread
From: Ming Lei @ 2014-08-11  0:44 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Kevin Wolf, qemu-devel, Stefan Hajnoczi

Hi,

On Wed, Aug 6, 2014 at 5:33 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> This can be used to compute the cost of coroutine operations.  In the
> end the cost of the function call is a few clock cycles, so it's pretty
> cheap for now, but it may become more relevant as the coroutine code
> is optimized.
>
> For example, here are the results on my machine:
>
>    Function call 100000000 iterations: 0.173884 s
>    Yield 100000000 iterations: 8.445064 s
>    Lifecycle 1000000 iterations: 0.098445 s
>    Nesting 10000 iterations of 1000 depth each: 7.406431 s
>
> One yield takes 83 nanoseconds, one enter takes 97 nanoseconds,
> one coroutine allocation takes (roughly, since some of the allocations
> in the nesting test do hit the pool) 739 nanoseconds:
>
>    (8.445064 - 0.173884) * 10^9 / 100000000 = 82.7
>    (0.098445 * 100 - 0.173884) * 10^9 / 100000000 = 96.7
>    (7.406431 * 10 - 0.173884) * 10^9 / 100000000 = 738.9

IMO, it isn't enough to only consider the cost of coroutine APIs,
given stack is switched back and forth and its usage is inevitable
in reality, the cost of this part should be tested too.

I will post a patch to test costs of coroutine from user view today,
so that the total costs can be observed with using coroutine.

Thanks,

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [Qemu-devel] [PATCH] test-coroutine: add baseline test that times the cost of function calls
  2014-08-06  9:33 [Qemu-devel] [PATCH] test-coroutine: add baseline test that times the cost of function calls Paolo Bonzini
  2014-08-07  9:33 ` Stefan Hajnoczi
  2014-08-11  0:44 ` Ming Lei
@ 2014-08-11 10:08 ` Ming Lei
  2 siblings, 0 replies; 5+ messages in thread
From: Ming Lei @ 2014-08-11 10:08 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Kevin Wolf, qemu-devel, Stefan Hajnoczi

On Wed, Aug 6, 2014 at 5:33 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> This can be used to compute the cost of coroutine operations.  In the
> end the cost of the function call is a few clock cycles, so it's pretty
> cheap for now, but it may become more relevant as the coroutine code
> is optimized.
>
> For example, here are the results on my machine:
>
>    Function call 100000000 iterations: 0.173884 s
>    Yield 100000000 iterations: 8.445064 s
>    Lifecycle 1000000 iterations: 0.098445 s
>    Nesting 10000 iterations of 1000 depth each: 7.406431 s
>
> One yield takes 83 nanoseconds, one enter takes 97 nanoseconds,
> one coroutine allocation takes (roughly, since some of the allocations
> in the nesting test do hit the pool) 739 nanoseconds:
>
>    (8.445064 - 0.173884) * 10^9 / 100000000 = 82.7
>    (0.098445 * 100 - 0.173884) * 10^9 / 100000000 = 96.7
>    (7.406431 * 10 - 0.173884) * 10^9 / 100000000 = 738.9

Thought about it further, the above is _not_ cheap if it is computed correctly,
take block layer for example:

- suppose one block device can reach 300K IOPS, like Kevin's loop
over tmpfs file, so handling one IO takes 3.333us
- for handling one IO, at least two enter and one yield are required in current
implementation, so these three operations take 0.277us(0.083 + 0.097 * 2)
for handling one IO(suppose all allocations hit the pool)
- so coroutine-only operations take 8.31%(0.277/3.333), not mention effect
from switching stack

Some modern storage devices can handle millions of IOPS, so using
coroutine surely will slow down these devices a lot.

Thanks,

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2014-08-11 10:08 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-08-06  9:33 [Qemu-devel] [PATCH] test-coroutine: add baseline test that times the cost of function calls Paolo Bonzini
2014-08-07  9:33 ` Stefan Hajnoczi
2014-08-07 14:08   ` Kevin Wolf
2014-08-11  0:44 ` Ming Lei
2014-08-11 10:08 ` Ming Lei

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.