From: Michael Wang <wangyun@linux.vnet.ibm.com>
To: Mike Galbraith <bitbucket@online.de>
Cc: linux-kernel@vger.kernel.org, mingo@redhat.com,
	peterz@infradead.org, mingo@kernel.org, a.p.zijlstra@chello.nl
Subject: Re: [RFC PATCH 0/2] sched: simplify the select_task_rq_fair()
Date: Mon, 21 Jan 2013 15:34:04 +0800	[thread overview]
Message-ID: <50FCEF6C.6010801@linux.vnet.ibm.com> (raw)
In-Reply-To: <1358750523.4994.55.camel@marge.simpson.net>

On 01/21/2013 02:42 PM, Mike Galbraith wrote:
> On Mon, 2013-01-21 at 13:07 +0800, Michael Wang wrote:
> 
>> That seems like the default one; could you please show me the numbers in
>> your datapoints file?
> 
> Yup, I do not touch the workfile.  Datapoints is what you see in the
> tabulated result...
> 
> 1
> 1
> 1
> 5
> 5
> 5
> 10
> 10
> 10
> ...
> 
> so it does three consecutive runs at each load level.  I quiesce the
> box, set governor to performance, echo 250 32000 32 4096
> > /proc/sys/kernel/sem, then ./multitask -nl -f, and point it
> at ./datapoints.

I have changed the "/proc/sys/kernel/sem" to:

2000    2048000 256     1024

and ran a few rounds, but it seems I can't reproduce this issue on my 12-CPU
x86 server:

Tasks    prev jobs/min    post jobs/min
    1        508.39           506.69
    5       2792.63          2792.63
   10       5454.55          5449.64
   20      10262.49         10271.19
   40      18089.55         18184.55
   80      28995.22         28960.57
  160      41365.19         41613.73
  320      53099.67         52767.35
  640      61308.88         61483.83
 1280      66707.95         66484.96
 2560      69736.58         69350.02

Almost nothing changed... I would like to find another machine and run the
test again later.

> 
>> I'm not familiar with this benchmark, but I'd like to give it a try on my
>> server, to see whether it is a generic issue.
> 
> One thing I didn't like about your changes is that you don't ask
> wake_affine() if it's ok to pull cross node or not, which I thought might
> induce imbalance, but twiddling that didn't fix up the collapse, pretty
> much leaving only the balance path.

wake_affine() will be asked before trying to use the idle sibling selected
from the current cpu's domain, won't it? It has just been delayed, since its
cost is too high.

But you reminded me that I missed the case when prev == current; I'm not
sure whether it's the killer, but I will correct it.
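
To make the ordering concrete, here is a rough, self-contained C sketch of
what I mean (illustrative only, not the actual patch; idle_sibling_near()
and wake_affine_says_pull() are made-up stand-ins for the real kernel
queries, and the prev == current branch is just one possible handling):

#include <stdbool.h>
#include <stdio.h>

/* Made-up stand-ins for the real kernel queries, so the sketch compiles. */
static int idle_sibling_near(int cpu)
{
	return cpu;	/* placeholder: pretend the waking CPU itself is idle */
}

static bool wake_affine_says_pull(int waking, int prev)
{
	(void)waking;
	(void)prev;
	return true;	/* placeholder policy */
}

/*
 * Rough sketch of the intended ordering only: search for an idle sibling
 * around the waking CPU first, and consult wake_affine() later, only when
 * using that sibling would mean pulling the task away from prev_cpu.
 */
static int select_cpu_sketch(int waking_cpu, int prev_cpu)
{
	int target = idle_sibling_near(waking_cpu);

	/* The case I missed: wakee last ran on the waking CPU, no pull involved. */
	if (waking_cpu == prev_cpu)
		return target;

	/* wake_affine() is asked before the idle sibling is actually used. */
	if (target != prev_cpu && !wake_affine_says_pull(waking_cpu, prev_cpu))
		target = prev_cpu;

	return target;
}

int main(void)
{
	printf("selected cpu: %d\n", select_cpu_sketch(2, 5));
	return 0;
}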

> 
>>>> And I'm confused about how those new parameter values were figured out
>>>> and how they could help solve the possible issue?
>>>
>>> Oh, that's easy.  I set sched_min_granularity_ns such that last_buddy
>>> kicks in when a third task arrives on a runqueue, and set
>>> sched_wakeup_granularity_ns near minimum that still allows wakeup
>>> preemption to occur.  Combined effect is reduced over-scheduling.
>>
>> That sounds very hard, to catch the timing; whatever, it could be an
>> important clue for the analysis.
> 
> (Play with the knobs with a bunch of different loads, I think you'll
> find that those settings work well)
> 
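
As a side note on how those two knobs interact: as far as I recall, in the
fair class code of this era the task count at which the last/next buddy
hints start to apply tracks sysctl_sched_latency divided by
sysctl_sched_min_granularity (rounded up), so a min_granularity that puts
that ratio at three means the buddy logic engages when the third task
arrives on a runqueue. A tiny arithmetic sketch, approximate and
illustrative only:

#include <stdio.h>

/*
 * Approximate relation only (paraphrased, not copied from fair.c):
 * buddy hints apply once nr_running reaches roughly
 * latency / min_granularity, rounded up.
 */
static unsigned int nr_latency(unsigned long latency_ns,
			       unsigned long min_gran_ns)
{
	return (unsigned int)((latency_ns + min_gran_ns - 1) / min_gran_ns);
}

int main(void)
{
	/* e.g. 12ms latency with a 4ms minimum granularity -> buddies from 3 tasks */
	printf("%u\n", nr_latency(12000000UL, 4000000UL));
	return 0;
}
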
>>>> Do you have any idea about which part in this patch set may cause the issue?
>>>
>>> Nope, I'm as puzzled by that as you are.  When the box had 40 cores,
>>> both virgin and patched showed over-scheduling effects, but not like
>>> this.  With 20 cores, symptoms changed in a most puzzling way, and I
>>> don't see how you'd be directly responsible.
>>
>> Hmm...
>>
>>>
>>>> One change by design is that with the old logic, if it's a wakeup and
>>>> we found an affine sd, the select func will never go into the balance
>>>> path, but the new logic will in some cases; do you think this could be
>>>> a problem?
>>>
>>> Since it's the high load end, where looking for an idle core is most
>>> likely to be a waste of time, it makes sense that entering the balance
>>> path would hurt _some_, it isn't free... except that twiddling the
>>> preemption knobs makes the collapse just go away.  We're still going to
>>> enter that path if all cores are busy, no matter how I twiddle those
>>> knobs.
>>
>> Maybe we could try changing this back to the old way later, after the
>> AIM7 test on my server.
> 
> Yeah, something funny is going on.  I'd like select_idle_sibling() to
> just go away, that task be integrated into one and only one short and
> sweet balance path.  I don't see why find_idlest* needs to continue
> traversal after seeing a zero.  It should be just fine to say gee, we're
> done.

Yes, that's true :)
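
For what it's worth, a minimal, self-contained sketch of that early exit,
modelled on a plain array of per-CPU loads rather than the real sched_group
iteration (illustrative only, not the actual find_idlest_cpu()):

#include <limits.h>
#include <stdio.h>

/*
 * Illustrative early-exit scan: a load of zero cannot be beaten, so the
 * walk can stop right there instead of traversing the whole group.
 */
static int find_idlest_cpu_sketch(const unsigned long *load, int nr_cpus)
{
	unsigned long min_load = ULONG_MAX;
	int idlest = -1;
	int i;

	for (i = 0; i < nr_cpus; i++) {
		if (load[i] == 0)
			return i;		/* gee, we're done */
		if (load[i] < min_load) {
			min_load = load[i];
			idlest = i;
		}
	}
	return idlest;
}

int main(void)
{
	unsigned long load[] = { 512, 128, 0, 256 };

	printf("idlest: %d\n", find_idlest_cpu_sketch(load, 4));
	return 0;
}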

> Hohum, so much for pure test and report, twiddle twiddle tweak,
> bend spindle mutilate ;-)

The scheduler is sometimes impossible to analyze; the only way to prove
anything is painful, endless testing... and usually we still miss something
in the end...

Regards,
Michael Wang


>    
> -Mike
> 

