xen-devel.lists.xenproject.org archive mirror
* [PATCH 00/19] Assorted fixes and improvements to Credit2
@ 2016-06-17 17:32 Dario Faggioli
  2016-06-17 23:08 ` Dario Faggioli
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Dario Faggioli @ 2016-06-17 17:32 UTC (permalink / raw)
  To: xen-devel; +Cc: George Dunlap, Anshul Makkar, David Vrabel, Jan Beulich

Hi everyone,

Here is a collection of pseudo-random fixes and improvements to Credit2.

While working on Soft Affinity and Caps support, I stumbled upon these issues,
one after the other, and decided to take care of them.

It's been hard to test and run benchmarks, due to the "time goes backwards" bug
I uncovered [1], and this is at least part of the reason why the code for
affinity and caps is still missing. I've got it already, but I need to refine a
couple of things, after double checking benchmark results. So, now that we have
Jan's series [2] (thanks! [*]), and that I have managed to run some tests on
this preliminary set of patches, I decided I had better set this first group
free, while working on finishing the rest.

The various patches do a wide range of different things, so please refer to the
individual changelogs for more detailed explanations.

About the numbers I could collect so far, here's the situation. I've run rather
simple benchmarks, such as:
 - Xen build inside a VM. The metric is how long that takes (in seconds), so
   lower is better.
 - Iperf from a VM to its host. The metric is total aggregate throughput, so
   higher is better.

The host is a 16 pCPU / 2 NUMA node Xeon E5620, with 6GB of RAM per node. The
VM had 16 vCPUs and 4GB of memory. Dom0 had 16 vCPUs as well, and 1GB of RAM.

I ran the Xen build once with -j4 --representative of low VM load-- and once
with -j24 --representative of high VM load. For the Iperf test, I only used 8
parallel streams (I wanted to do both 4 and 8, but there was a bug in my
scripts! :-/).

I've run the above both with and without disturbing external (from the point of
view of the VM) load. Such load was generated just by running processes in
dom0. It's rather basic, but it certainly keeps dom0's vCPUs busy and stresses
the scheduler. This "noise", when present, was composed of:
 - 8 (v)CPU hog processes (`yes &> /dev/null'), running in dom0;
 - 4 processes alternating computation and sleep with a duty cycle of 35%.

So, there basically were 12 vCPUs of dom0 kept busy, in a heterogeneous fashion.
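A minimal sketch of what such a noise generator can look like (the 1s period
for the duty-cycled workers is my own assumption; the thread does not include
the actual scripts):

```shell
#!/bin/sh
# Sketch of a dom0 "noise" generator: 8 pure CPU hogs plus 4 workers
# alternating computation and sleep at a 35% duty cycle.

# Busy/idle slice lengths (in ms) of one period, for a given duty
# cycle (in percent): 35% of a 1000ms period -> 350ms busy, 650ms idle.
busy_ms() { echo $(( $1 * $2 / 100 )); }
idle_ms() { echo $(( $2 - $1 * $2 / 100 )); }

start_noise() {
    # 8 (v)CPU hogs, as in the description above.
    for _ in 1 2 3 4 5 6 7 8; do
        yes > /dev/null 2>&1 &
    done
    # 4 duty-cycled workers: ~350ms of CPU burn, then ~650ms of sleep.
    for _ in 1 2 3 4; do
        ( while :; do
              timeout 0.35s yes > /dev/null 2>&1
              sleep 0.65
          done ) &
    done
}
```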

I benchmarked Credit2 with runqueues arranged per-core (the current default)
and per-socket, and also Credit1, for reference. The baseline was current
staging plus Jan's monotonicity series.
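With the SMT-independent runqueue arrangement patch in this series, the
runqueue layout can be chosen at boot time. A hypothetical GRUB fragment
(option name as per the xen-command-line.markdown hunk in the diffstat below;
paths are illustrative):

```
multiboot /boot/xen.gz sched=credit2 credit2_runqueue=socket
module /boot/vmlinuz root=/dev/xvda1
```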

Actual numbers:

|=======================================================================|
| CREDIT 1 (for reference)                                              |
|=======================================================================|
| Xen build, low VM load, no noise    |
|-------------------------------------|
|               32.207                |
|-------------------------------------|---------------------------------|
| Xen build, high VM load, no noise   | Iperf, high VM load, no noise   |
|-------------------------------------|---------------------------------|
|               18.500                |             22.633              |
|-------------------------------------|---------------------------------|
| Xen build, low VM load, with noise  |
|-------------------------------------|
|               38.700                |
|-------------------------------------|---------------------------------|
| Xen build, high VM load, with noise | Iperf, high VM load, with noise |
|-------------------------------------|---------------------------------|
|               80.317                |             21.300              |
|=======================================================================|
| CREDIT 2                                                              |
|=======================================================================|
| Xen build, low VM load, no noise    |
|-------------------------------------|
|            runq=core   runq=socket  |
| baseline     34.543       38.070    |
| patched      35.200       33.433    |
|-------------------------------------|---------------------------------|
| Xen build, high VM load, no noise   | Iperf, high VM load, no noise   |
|-------------------------------------|---------------------------------|
|            runq=core   runq=socket  |           runq=core runq=socket |
| baseline     18.710       19.397    | baseline    21.300     21.933   |
| patched      18.013       18.530    | patched     23.200     23.466   |
|-------------------------------------|---------------------------------|
| Xen build, low VM load, with noise  |
|-------------------------------------|
|            runq=core   runq=socket  |
| baseline     44.483       40.747    |
| patched      45.866       39.493    |
|-------------------------------------|---------------------------------|
| Xen build, high VM load, with noise | Iperf, high VM load, with noise |
|-------------------------------------|---------------------------------|
|            runq=core   runq=socket  |           runq=core runq=socket |
| baseline     41.466       30.630    | baseline    20.333     20.633   |
| patched      36.840       29.080    | patched     19.967     21.000   |
|=======================================================================|

Which, summarizing, means:
 * as far as Credit2 is concerned, applying this series and using runq=socket
   is what _ALWAYS_ provides the best results.
 * when looking at Credit1 vs. patched Credit2 with runq=socket:
  - Xen build, low VM load,  no noise  : Credit1 slightly better
  - Xen build, high VM load, no noise  : on par
  - Xen build, low VM load,  with noise: Credit1 a bit better
  - Xen build, high VM load, with noise: Credit2 _ENORMOUSLY_ better (yes, I
    reran both cases a number of times!)
  - Iperf,     high VM load, no noise  : Credit2 a bit better
  - Iperf,     high VM load, with noise: Credit1 slightly better

So, Credit1 still wins a few rounds, but performance is very, very close, and
this series seems to help narrow the gap (for some of the cases,
significantly).

It also looks like, although rather naive, the 'Xen build, high VM load,
with noise' test case exposed another of those issues with Credit1 (more
investigation is necessary), while Credit2 keeps up just fine.

Another interesting thing to note is that, on Credit2 (with this series), 'Xen
build, high VM load, with noise' turns out to be quicker than 'Xen build, low
VM load, with noise'. This means that using a higher value for `make -j' for a
build, inside a guest, results in a quicker build time, which makes sense... But
that is _NOT_ what happens on Credit1, the whole thing (wildly :-P) hinting at
Credit2 being able to achieve better scalability and better fairness.

In any case, more benchmarking is necessary, and is already planned. More
investigation is also needed to figure out whether, once we have this series,
going back to runq=socket as the default would indeed be the best thing (which
I suspect it will be).

But from all I see, and from all the various perspectives, this series seems a
step in the right direction.

Thanks and Regards,
Dario

[1] http://lists.xen.org/archives/html/xen-devel/2016-06/msg00922.html
[2] http://lists.xen.org/archives/html/xen-devel/2016-06/msg01884.html

[*] Jan, I confirm that, with your series applied, I haven't yet seen any of
those "Time went backwards?" printks from Credit2, as you sort of were
expecting...

---
Dario Faggioli (19):
      xen: sched: leave CPUs doing tasklet work alone.
      xen: sched: make the 'tickled' perf counter clearer
      xen: credit2: insert and tickle don't need a cpu parameter
      xen: credit2: kill useless helper function choose_cpu
      xen: credit2: do not warn if calling burn_credits more than once
      xen: credit2: read NOW() with the proper runq lock held
      xen: credit2: prevent load balancing to go mad if time goes backwards
      xen: credit2: when tickling, check idle cpus first
      xen: credit2: avoid calling __update_svc_load() multiple times on the same vcpu
      xen: credit2: rework load tracking logic
      tools: tracing: adapt Credit2 load tracking events to new format
      xen: credit2: use non-atomic cpumask and bit operations
      xen: credit2: make the code less experimental
      xen: credit2: add yet some more tracing
      xen: credit2: only marshall trace point arguments if tracing enabled
      tools: tracing: deal with new Credit2 events
      xen: credit2: the private scheduler lock can be an rwlock.
      xen: credit2: implement SMT support independent runq arrangement
      xen: credit2: use cpumask_first instead of cpumask_any when choosing cpu


 docs/misc/xen-command-line.markdown |   30 +
 tools/xentrace/formats              |   10 
 tools/xentrace/xenalyze.c           |  103 +++
 xen/common/sched_credit.c           |   22 -
 xen/common/sched_credit2.c          | 1158 +++++++++++++++++++++++++----------
 xen/common/sched_rt.c               |    8 
 xen/include/xen/cpumask.h           |    8 
 xen/include/xen/perfc_defn.h        |    5 
 8 files changed, 973 insertions(+), 371 deletions(-)

--
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


* Re: [PATCH 00/19] Assorted fixes and improvements to Credit2
  2016-06-17 17:32 [PATCH 00/19] Assorted fixes and improvements to Credit2 Dario Faggioli
@ 2016-06-17 23:08 ` Dario Faggioli
  2016-06-20  7:43 ` Jan Beulich
  2016-07-08 10:11 ` George Dunlap
  2 siblings, 0 replies; 6+ messages in thread
From: Dario Faggioli @ 2016-06-17 23:08 UTC (permalink / raw)
  To: xen-devel; +Cc: George Dunlap, Anshul Makkar, David Vrabel, Jan Beulich



On Fri, 2016-06-17 at 19:32 +0200, Dario Faggioli wrote:
> Hi everyone,
> 
Mmm... I'm not sure why, but this time, 'stg mail' only managed to send
the cover letter, then it terminated with no errors! :-O

In any case, I'm resending... apologies for the noise.

Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)




* Re: [PATCH 00/19] Assorted fixes and improvements to Credit2
  2016-06-17 17:32 [PATCH 00/19] Assorted fixes and improvements to Credit2 Dario Faggioli
  2016-06-17 23:08 ` Dario Faggioli
@ 2016-06-20  7:43 ` Jan Beulich
  2016-06-20 11:43   ` Dario Faggioli
  2016-07-08 10:11 ` George Dunlap
  2 siblings, 1 reply; 6+ messages in thread
From: Jan Beulich @ 2016-06-20  7:43 UTC (permalink / raw)
  To: Dario Faggioli; +Cc: George Dunlap, xen-devel, Anshul Makkar, David Vrabel

>>> On 17.06.16 at 19:32, <dario.faggioli@citrix.com> wrote:
> |-------------------------------------|---------------------------------|
> | Xen build, high VM load, with noise | Iperf, high VM load, with noise |
> |-------------------------------------|---------------------------------|
> |            runq=core   runq=socket  |           runq=core runq=socket |
> | baseline     41.466       30.630    | baseline    20.333     20.633   |
> | patched      36.840       29.080    | patched     19.967     21.000   |
> |=======================================================================|
> 
> Which, summarizing, means:
>  * as far as Credit2 is concerned,  applying this series and using runq=socket
>    is what _ALWAYS_ provides the best results.

Always? What about the increase on far the right side of the above
table fragment? It's not a big change, but anyway.

> [*] Jan, I confirm that, with your series applied, I haven't yet seen any of
> those "Time went backwards?" printk from Credit2, as you sort of were
> expecting...

Well, that's better than I had expected then: I didn't really think
they would be gone entirely. How long of an uptime did your tests
cover? As noted in the cover letter, I've observed remaining odd
TSC/stime jumps to increase in range over time, with no explanation
so far.

Also I wonder whether I may translate your statement above to
a Tested-by for part or all of the series (right now there's only a
coding style fix to one of the patches and a slight extension to
the rdtsc_ordered() one pending for an eventual v2).

Jan




* Re: [PATCH 00/19] Assorted fixes and improvements to Credit2
  2016-06-20  7:43 ` Jan Beulich
@ 2016-06-20 11:43   ` Dario Faggioli
  2016-06-20 11:53     ` Jan Beulich
  0 siblings, 1 reply; 6+ messages in thread
From: Dario Faggioli @ 2016-06-20 11:43 UTC (permalink / raw)
  To: Jan Beulich; +Cc: George Dunlap, xen-devel, Anshul Makkar, David Vrabel



On Mon, 2016-06-20 at 01:43 -0600, Jan Beulich wrote:
> > 
> > > 
> > > > 
> > > > On 17.06.16 at 19:32, <dario.faggioli@citrix.com> wrote:
> > > |-------------------------------------|---------------------------------|
> > > | Xen build, high VM load, with noise | Iperf, high VM load, with noise |
> > > |-------------------------------------|---------------------------------|
> > > |            runq=core   runq=socket  |           runq=core runq=socket |
> > > | baseline     41.466       30.630    | baseline    20.333     20.633   |
> > > | patched      36.840       29.080    | patched     19.967     21.000   |
> > > |=======================================================================|
> > Which, summarizing, means:
> >  * as far as Credit2 is concerned,  applying this series and using
> > runq=socket
> >    is what _ALWAYS_ provides the best results.
> Always? What about the increase on far the right side of the above
> table fragment? It's not a big change, but anyway.
> 
Not sure I follow. By 'far the right side' you mean the results of
"Iperf, high VM load, with noise"?

If yes, the 'patched' and 'runq=socket' element shows the highest
value, which in this case is a good thing, because this is Iperf and
the number is the total throughput in Gbps, and the higher it is, the
better.

> > [*] Jan, I confirm that, with your series applied, I haven't yet
> > seen any of
> > those "Time went backwards?" printk from Credit2, as you sort of
> > were
> > expecting...
> Well, that's better than I had expected then: I didn't really think
> they would be gone entirely. How long of an uptime did your tests
> cover? As noted in the cover letter, I've observed remaining odd
> TSC/stime jumps to increase in range over time, with no explanation
> so far.
> 
The total uptime of one run of this benchmarks is a handful of minutes,
so that's probably why I don't see any problem.

> Also I wonder whether I may translate your statement above to
> a Tested-by for part or all of the series (right now there's only a
> coding style fix to one of the patches and a slight extension to
> the rdtsc_ordered() one pending for an eventual v2).
> 
Indeed you can... I was in fact planning to reply directly to the
series' thread with that.

I've applied, and hence tested, the full series.

Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)




* Re: [PATCH 00/19] Assorted fixes and improvements to Credit2
  2016-06-20 11:43   ` Dario Faggioli
@ 2016-06-20 11:53     ` Jan Beulich
  0 siblings, 0 replies; 6+ messages in thread
From: Jan Beulich @ 2016-06-20 11:53 UTC (permalink / raw)
  To: Dario Faggioli; +Cc: George Dunlap, xen-devel, Anshul Makkar, David Vrabel

>>> On 20.06.16 at 13:43, <dario.faggioli@citrix.com> wrote:
> On Mon, 2016-06-20 at 01:43 -0600, Jan Beulich wrote:
>> > 
>> > > 
>> > > > 
>> > > > On 17.06.16 at 19:32, <dario.faggioli@citrix.com> wrote:
>> > > |-------------------------------------|---------------------------------|
>> > > | Xen build, high VM load, with noise | Iperf, high VM load, with noise |
>> > > |-------------------------------------|---------------------------------|
>> > > |            runq=core   runq=socket  |           runq=core runq=socket |
>> > > | baseline     41.466       30.630    | baseline    20.333     20.633   |
>> > > | patched      36.840       29.080    | patched     19.967     21.000   |
>> > > |=======================================================================|
>> > Which, summarizing, means:
>> >  * as far as Credit2 is concerned,  applying this series and using
>> > runq=socket
>> >    is what _ALWAYS_ provides the best results.
>> Always? What about the increase on far the right side of the above
>> table fragment? It's not a big change, but anyway.
>> 
> Not sure I follow. By 'far the right side' you mean the results of
> "Iperf, high VM load, with noise"?
> 
> If yes, the 'patched' and 'runq=socket' element shows the highest
> value, which in this case is a good thing, because this is Iperf and
> the number is the total throughput in Gbps, and the higher it is, the
> better.

Oh, I see. You certainly said so somewhere in the description;
it's not the first time that finding lower-is-better numbers right next
to higher-is-better ones has managed to confuse me.

Jan




* Re: [PATCH 00/19] Assorted fixes and improvements to Credit2
  2016-06-17 17:32 [PATCH 00/19] Assorted fixes and improvements to Credit2 Dario Faggioli
  2016-06-17 23:08 ` Dario Faggioli
  2016-06-20  7:43 ` Jan Beulich
@ 2016-07-08 10:11 ` George Dunlap
  2 siblings, 0 replies; 6+ messages in thread
From: George Dunlap @ 2016-07-08 10:11 UTC (permalink / raw)
  To: Dario Faggioli; +Cc: xen-devel, Anshul Makkar, David Vrabel, Jan Beulich

On Fri, Jun 17, 2016 at 6:32 PM, Dario Faggioli
<dario.faggioli@citrix.com> wrote:
> Hi everyone,
>
> Here you go a collection of pseudo-random fixes and improvement to Credit2.
>
> In the process of working on Soft Affinity and Caps support, I stumbled upon
> them, one after the other, and decided to take care.
>
> It's been hard to test and run benchmark, due to the "time goes backwards" bug
> I uncovered [1], and this is at least part of the reason why the code for
> affinity and caps is still missing. I've got it already, but need to refine a
> couple of things, after double checking benchmark results. So, now that we have
> Jan's series [2] (thanks! [*]), and that I managed to indeed run some tests on
> this preliminary set of patches, I decided I better set this first group free,
> while working on finishing the rest.
>
> The various patches do a wide range of different things, so, please, refer to
> Dario Faggioli (19):

I've pushed the following patches:

>       xen: sched: make the 'tickled' perf counter clearer
>       xen: credit2: insert and tickle don't need a cpu parameter
>       xen: credit2: kill useless helper function choose_cpu
>       xen: credit2: do not warn if calling burn_credits more than once
>       xen: credit2: when tickling, check idle cpus first
>       xen: credit2: avoid calling __update_svc_load() multiple times on the same vcpu
>       xen: credit2: use non-atomic cpumask and bit operations

The ones below either have outstanding comments, or don't apply
without patches which haven't been applied.

>       xen: sched: leave CPUs doing tasklet work alone.
>       xen: credit2: read NOW() with the proper runq lock held
>       xen: credit2: prevent load balancing to go mad if time goes backwards
>       xen: credit2: rework load tracking logic
>       tools: tracing: adapt Credit2 load tracking events to new format
>       xen: credit2: make the code less experimental
>       xen: credit2: add yet some more tracing
>       xen: credit2: only marshall trace point arguments if tracing enabled
>       tools: tracing: deal with new Credit2 events
>       xen: credit2: the private scheduler lock can be an rwlock.
>       xen: credit2: implement SMT support independent runq arrangement
>       xen: credit2: use cpumask_first instead of cpumask_any when choosing cpu

 -George


