linux-kernel.vger.kernel.org archive mirror
* Re: [PATCH] sched: prefer an idle cpu vs an idle sibling for BALANCE_WAKE
       [not found] <1432675865-378571-1-git-send-email-jbacik@fb.com>
@ 2015-05-27 20:09 ` Josef Bacik
  2015-05-27 21:03   ` Rik van Riel
  2015-05-28 11:53   ` Ingo Molnar
  0 siblings, 2 replies; 4+ messages in thread
From: Josef Bacik @ 2015-05-27 20:09 UTC
  To: riel, mingo, peterz, linux-kernel, kernel-team


On 05/26/2015 05:31 PM, Josef Bacik wrote:
> At Facebook we have a pretty heavily multi-threaded application that is
> sensitive to latency.  We have been pulling forward the old SD_WAKE_IDLE code
> because it gives us a pretty significant performance gain (like 20%).  It turns
> out this is because there are cases where the scheduler puts our task on a busy
> CPU when there are idle CPUs in the system.  We verify this by reading the
> cpu_delay_req_avg_us from the scheduler netlink stuff.  With our crappy patch we
> get much lower numbers vs baseline.
>
> SD_BALANCE_WAKE is supposed to find us an idle cpu to run on, however it is just
> looking for an idle sibling, preferring affinity over all else.  This is not
> helpful in all cases, and SD_BALANCE_WAKE's job is to find us an idle cpu, not
> guarantee affinity.  Fix this by first trying to find an idle sibling, and then
> if the cpu is not idle fall through to the logic to find an idle cpu.  With this
> patch we get slightly better performance than with our forward port of
> SD_WAKE_IDLE.  Thanks,
>
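
The policy described in the quoted changelog can be modeled roughly as
follows.  This is only a toy Python model, not the kernel code: pick_cpu,
siblings and idle are hypothetical names, and the real scheduler walks
sched domains and weighs load rather than scanning a flat table.

```python
from typing import Dict, List


def pick_cpu(prev_cpu: int,
             siblings: Dict[int, List[int]],
             idle: Dict[int, bool]) -> int:
    """Toy model of the wakeup policy from the changelog (hypothetical names)."""
    # Step 1: affinity first. Look for an idle CPU among prev_cpu and the
    # CPUs listed as its siblings.
    for cpu in [prev_cpu] + siblings.get(prev_cpu, []):
        if idle[cpu]:
            return cpu
    # Step 2 (the behaviour the patch describes): no idle sibling was found,
    # so fall through to a wider search for any idle CPU in the system.
    for cpu in sorted(idle):
        if idle[cpu]:
            return cpu
    # Nothing is idle anywhere: stay on the previous CPU.
    return prev_cpu


# Hypothetical 4-CPU layout: CPUs 0/1 are siblings, CPUs 2/3 are siblings.
siblings = {0: [1], 1: [0], 2: [3], 3: [2]}
idle = {0: False, 1: False, 2: True, 3: False}
print(pick_cpu(0, siblings, idle))
```

In this model a task last run on CPU 0 lands on CPU 2 when both CPU 0 and
its sibling are busy, instead of queueing behind work on a busy CPU.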

I rigged up a test script to run the perf bench sched tests and give me 
the numbers.  Here are the numbers:

4.0

Messaging: 56.934 Total runtime in seconds
Pipe: 105620.762 ops/sec

4.0 + my patch

Messaging: 47.374
Pipe: 113691.199

So the Messaging test, which behaves somewhat like HHVM, runs ~20% faster
(about 17% less runtime), and pipe throughput is ~8% better.  This is a
2-socket, 16-core box.  I've attached the script I'm using; basically I
run each test 5 times, and for perf bench sched pipe I run NR_CPUS/2
instances in parallel.
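
For reference, the percentages can be reproduced from the figures quoted
above.  The sketch below is plain arithmetic in Python; pct_change and the
variable names are made up for illustration, and only the four measured
values come from this message.

```python
def pct_change(before: float, after: float) -> float:
    """Signed change from before to after, as a percentage of before."""
    return (after - before) / before * 100.0


# Figures quoted in this message (4.0 vs. 4.0 + patch).
messaging_before, messaging_after = 56.934, 47.374  # seconds, lower is better
pipe_before, pipe_after = 105620.762, 113691.199    # ops/sec, higher is better

# Runtime reduction, measured against the baseline runtime.
runtime_drop = -pct_change(messaging_before, messaging_after)
# The same result expressed as a speedup (work per second).
speedup = (messaging_before / messaging_after - 1.0) * 100.0
# Throughput gain for the pipe test.
pipe_gain = pct_change(pipe_before, pipe_after)

print(f"messaging runtime drop: {runtime_drop:.1f}%")
print(f"messaging speedup:      {speedup:.1f}%")
print(f"pipe throughput gain:   {pipe_gain:.1f}%")
```

The messaging figure depends on the direction of the comparison: the
runtime falls by roughly 17%, which is the same thing as running roughly
20% faster.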

If you are interested I'd be happy to show you numbers for our HHVM 
test, but they are less straightforward and require pretty pictures and 
a book of how to read the numbers.  Thanks

Josef


* Re: [PATCH] sched: prefer an idle cpu vs an idle sibling for BALANCE_WAKE
  2015-05-27 20:09 ` [PATCH] sched: prefer an idle cpu vs an idle sibling for BALANCE_WAKE Josef Bacik
@ 2015-05-27 21:03   ` Rik van Riel
  2015-05-27 21:23     ` Josef Bacik
  2015-05-28 11:53   ` Ingo Molnar
  1 sibling, 1 reply; 4+ messages in thread
From: Rik van Riel @ 2015-05-27 21:03 UTC
  To: Josef Bacik, mingo, peterz, linux-kernel, kernel-team

On 05/27/2015 04:09 PM, Josef Bacik wrote:
> On 05/26/2015 05:31 PM, Josef Bacik wrote:

>> SD_BALANCE_WAKE is supposed to find us an idle cpu to run on, however
>> it is just
>> looking for an idle sibling, preferring affinity over all else.  This
>> is not
>> helpful in all cases, and SD_BALANCE_WAKE's job is to find us an idle
>> cpu, not
>> guarantee affinity.  Fix this by first trying to find an idle sibling,
>> and then
>> if the cpu is not idle fall through to the logic to find an idle cpu. 
>> With this
>> patch we get slightly better performance than with our forward port of
>> SD_WAKE_IDLE.  Thanks,
>>
> 
> I rigged up a test script to run the perf bench sched tests and give me
> the numbers.  Here are the numbers
> 
> 4.0
> 
> Messaging: 56.934 Total runtime in seconds
> Pipe: 105620.762 ops/sec
> 
> 4.0 + my patch
> 
> Messaging: 47.374
> Pipe: 113691.199

I did not get the email with your original patch,
either to my inbox or my lkml folder, but I saw the
patch on pastebin, and it looks good.

When you resend it, please feel free to add my

Acked-by: Rik van Riel <riel@redhat.com>

Assuming the version you meant to email yesterday was
the same one that you showed me on pastebin, of course :)


* Re: [PATCH] sched: prefer an idle cpu vs an idle sibling for BALANCE_WAKE
  2015-05-27 21:03   ` Rik van Riel
@ 2015-05-27 21:23     ` Josef Bacik
  0 siblings, 0 replies; 4+ messages in thread
From: Josef Bacik @ 2015-05-27 21:23 UTC
  To: Rik van Riel, mingo, peterz, linux-kernel, kernel-team

On 05/27/2015 05:03 PM, Rik van Riel wrote:
> On 05/27/2015 04:09 PM, Josef Bacik wrote:
>> On 05/26/2015 05:31 PM, Josef Bacik wrote:
>
>>> SD_BALANCE_WAKE is supposed to find us an idle cpu to run on, however
>>> it is just
>>> looking for an idle sibling, preferring affinity over all else.  This
>>> is not
>>> helpful in all cases, and SD_BALANCE_WAKE's job is to find us an idle
>>> cpu, not
>>> guarantee affinity.  Fix this by first trying to find an idle sibling,
>>> and then
>>> if the cpu is not idle fall through to the logic to find an idle cpu.
>>> With this
>>> patch we get slightly better performance than with our forward port of
>>> SD_WAKE_IDLE.  Thanks,
>>>
>>
>> I rigged up a test script to run the perf bench sched tests and give me
>> the numbers.  Here are the numbers
>>
>> 4.0
>>
>> Messaging: 56.934 Total runtime in seconds
>> Pipe: 105620.762 ops/sec
>>
>> 4.0 + my patch
>>
>> Messaging: 47.374
>> Pipe: 113691.199
>
> I did not get the email with your original patch,
> either to my inbox or my lkml folder, but I saw the
> patch on pastebin, and it looks good.
>
> When you resend it, please feel free to add my
>
> Acked-by: Rik van Riel <riel@redhat.com>
>
> Assuming the version you meant to email yesterday was
> the same one that you showed me on pastebin, of course :)
>

Ha, yes, it's the same.  Sorry, I'm not sure what happened.  I've resent
it from a different machine; let me know if you don't get the new one
and I'll just send it from Thunderbird.  Thanks,

Josef


* Re: [PATCH] sched: prefer an idle cpu vs an idle sibling for BALANCE_WAKE
  2015-05-27 20:09 ` [PATCH] sched: prefer an idle cpu vs an idle sibling for BALANCE_WAKE Josef Bacik
  2015-05-27 21:03   ` Rik van Riel
@ 2015-05-28 11:53   ` Ingo Molnar
  1 sibling, 0 replies; 4+ messages in thread
From: Ingo Molnar @ 2015-05-28 11:53 UTC
  To: Josef Bacik
  Cc: riel, mingo, peterz, linux-kernel, kernel-team, Arnaldo Carvalho de Melo


* Josef Bacik <jbacik@fb.com> wrote:

> On 05/26/2015 05:31 PM, Josef Bacik wrote:
> >At Facebook we have a pretty heavily multi-threaded application that is
> >sensitive to latency.  We have been pulling forward the old SD_WAKE_IDLE code
> >because it gives us a pretty significant performance gain (like 20%).  It turns
> >out this is because there are cases where the scheduler puts our task on a busy
> >CPU when there are idle CPUs in the system.  We verify this by reading the
> >cpu_delay_req_avg_us from the scheduler netlink stuff.  With our crappy patch we
> >get much lower numbers vs baseline.
> >
> >SD_BALANCE_WAKE is supposed to find us an idle cpu to run on, however it is just
> >looking for an idle sibling, preferring affinity over all else.  This is not
> >helpful in all cases, and SD_BALANCE_WAKE's job is to find us an idle cpu, not
> >guarantee affinity.  Fix this by first trying to find an idle sibling, and then
> >if the cpu is not idle fall through to the logic to find an idle cpu.  With this
> >patch we get slightly better performance than with our forward port of
> >SD_WAKE_IDLE.  Thanks,
> >
> 
> I rigged up a test script to run the perf bench sched tests and give me the
> numbers.  Here are the numbers
> 
> 4.0
> 
> Messaging: 56.934 Total runtime in seconds
> Pipe: 105620.762 ops/sec
> 
> 4.0 + my patch
> 
> Messaging: 47.374
> Pipe: 113691.199

Btw., with perf bench you don't really need much extra scripting, something like 
this should give you pretty good numbers plus a stddev estimate:

   perf stat --null --repeat 10 perf bench sched messaging -l 10000

on my box this gives:

       4.391469643 seconds time elapsed                                          ( +-  2.81% )

you can adjust the -l value to move the runtime up/down to a value that you think 
runs long enough to give stable results.
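
To make the "+-" figure concrete, here is a minimal Python sketch of how a
mean and a relative error can be derived from repeated runs.  It assumes
(not verified against the perf source here) that the percentage is the
standard error of the mean divided by the mean; mean_and_rel_stderr and
the sample runtimes are hypothetical, not measurements from this thread.

```python
import math
from typing import Sequence, Tuple


def mean_and_rel_stderr(samples: Sequence[float]) -> Tuple[float, float]:
    """Return (mean, standard error of the mean as a percentage of the mean)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two runs to estimate spread")
    mean = sum(samples) / n
    # Unbiased sample variance (divide by n - 1).
    variance = sum((x - mean) ** 2 for x in samples) / (n - 1)
    # Standard error of the mean: spread of the average, not of a single run.
    stderr = math.sqrt(variance / n)
    return mean, 100.0 * stderr / mean


# Hypothetical elapsed times in seconds from five repeated runs.
runs = [4.31, 4.52, 4.40, 4.27, 4.45]
mean, rel = mean_and_rel_stderr(runs)
print(f"{mean:.3f} seconds time elapsed ( +- {rel:.2f}% )")
```

Under that assumption, a longer -l value or more repeats shrinks the error
figure, which is one way to judge whether a run is long enough to be stable.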

Thanks,

	Ingo

