* Re: [patch] sched/fair: Move se->vruntime normalization state into struct sched_entity @ 2016-05-24 17:04 Paul E. McKenney 2016-05-25 17:49 ` Paul E. McKenney 0 siblings, 1 reply; 11+ messages in thread From: Paul E. McKenney @ 2016-05-24 17:04 UTC (permalink / raw) To: peterz Cc: umgwanakikbuti, mingo, linux-kernel, bsegall, matt, morten.rasmussen, pjt, tglx, byungchul.park, ahh On Mon, May 23, 2016 at 2:19 AM +0200, Peter Zijlstra wrote: > On Sun, May 22, 2016 at 09:00:01AM +0200, Mike Galbraith wrote: > > On Sat, 2016-05-21 at 21:00 +0200, Mike Galbraith wrote: > > > On Sat, 2016-05-21 at 16:04 +0200, Mike Galbraith wrote: > > > > > > > Wakees that were not migrated/normalized eat an unwanted min_vruntime, > > > > and likely take a size XXL latency hit. Big box running master bled > > > > profusely under heavy load until I turned TTWU_QUEUE off. > > > > May as well make it official and against master.today. Fly or die > > little patchlet. > > > > sched/fair: Move se->vruntime normalization state into struct sched_entity > > Does this work? This gets rid of the additional lost wakeups introduced during the merge window, thank you! The pre-existing low-probability lost wakeups still persist, sad to say. Can't have everything, I guess. Tested-by: Paul E.
McKenney <paulmck@linux.vnet.ibm.com> > --- > include/linux/sched.h | 1 + > kernel/sched/core.c | 18 +++++++++++------- > 2 files changed, 12 insertions(+), 7 deletions(-) > > diff --git a/include/linux/sched.h b/include/linux/sched.h > index 1b43b45a22b9..a2001e01b3df 100644 > --- a/include/linux/sched.h > +++ b/include/linux/sched.h > @@ -1534,6 +1534,7 @@ struct task_struct { > unsigned sched_reset_on_fork:1; > unsigned sched_contributes_to_load:1; > unsigned sched_migrated:1; > + unsigned sched_remote_wakeup:1; > unsigned :0; /* force alignment to the next boundary */ > > /* unserialized, strictly 'current' */ > diff --git a/kernel/sched/core.c b/kernel/sched/core.c > index 404c0784b1fc..7f2cae4620c7 100644 > --- a/kernel/sched/core.c > +++ b/kernel/sched/core.c > @@ -1768,13 +1768,15 @@ void sched_ttwu_pending(void) > cookie = lockdep_pin_lock(&rq->lock); > > while (llist) { > + int wake_flags = 0; > + > p = llist_entry(llist, struct task_struct, wake_entry); > llist = llist_next(llist); > - /* > - * See ttwu_queue(); we only call ttwu_queue_remote() when > - * its a x-cpu wakeup. 
> - */ > - ttwu_do_activate(rq, p, WF_MIGRATED, cookie); > + > + if (p->sched_remote_wakeup) > + wake_flags = WF_MIGRATED; > + > + ttwu_do_activate(rq, p, wake_flags, cookie); > } > > lockdep_unpin_lock(&rq->lock, cookie); > @@ -1819,10 +1821,12 @@ void scheduler_ipi(void) > irq_exit(); > } > > -static void ttwu_queue_remote(struct task_struct *p, int cpu) > +static void ttwu_queue_remote(struct task_struct *p, int cpu, int > wake_flags) > { > struct rq *rq = cpu_rq(cpu); > > + p->sched_remote_wakeup = !!(wake_flags & WF_MIGRATED); > + > if (llist_add(&p->wake_entry, &cpu_rq(cpu)->wake_list)) { > if (!set_nr_if_polling(rq->idle)) > smp_send_reschedule(cpu); > @@ -1869,7 +1873,7 @@ static void ttwu_queue(struct task_struct *p, int > cpu, int wake_flags) > #if defined(CONFIG_SMP) > if (sched_feat(TTWU_QUEUE) && !cpus_share_cache(smp_processor_id(), > cpu)) { > sched_clock_cpu(cpu); /* sync clocks x-cpu */ > - ttwu_queue_remote(p, cpu); > + ttwu_queue_remote(p, cpu, wake_flags); > return; > } > #endif ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [patch] sched/fair: Move se->vruntime normalization state into struct sched_entity 2016-05-24 17:04 [patch] sched/fair: Move se->vruntime normalization state into struct sched_entity Paul E. McKenney @ 2016-05-25 17:49 ` Paul E. McKenney 0 siblings, 0 replies; 11+ messages in thread From: Paul E. McKenney @ 2016-05-25 17:49 UTC (permalink / raw) To: peterz Cc: umgwanakikbuti, mingo, linux-kernel, bsegall, matt, morten.rasmussen, pjt, tglx, byungchul.park, ahh On Tue, May 24, 2016 at 10:04:17AM -0700, Paul E. McKenney wrote: > On Mon, May 23, 2016 at 2:19 AM +0200, Peter Zijlstra wrote: > > On Sun, May 22, 2016 at 09:00:01AM +0200, Mike Galbraith wrote: > > > On Sat, 2016-05-21 at 21:00 +0200, Mike Galbraith wrote: > > > > On Sat, 2016-05-21 at 16:04 +0200, Mike Galbraith wrote: > > > > > > > > > Wakees that were not migrated/normalized eat an unwanted min_vruntime, > > > > > and likely take a size XXL latency hit. Big box running master bled > > > > > profusely under heavy load until I turned TTWU_QUEUE off. > > > > > > May as well make it official and against master.today. Fly or die > > > little patchlet. > > > > > > sched/fair: Move se->vruntime normalization state into struct sched_entity > > > > Does this work? > > This gets rid of the additional lost wakeups introduced during the > merge window, thank you! > > The pre-existing low-probability lost wakeups still persist, sad to say > Can't have everything, I guess. However, their behavior has changed. Previously, the TREE03 rcutorture scenario had the most stalls. Now TREE07 appears to be more likely. The corresponding Kconfig fragments are shown below, in case it gives someone ideas about where the problem might be. My current guess, based on insufficient data, is that TREE07 has about an 80-90% chance of seeing a lost wakeup during a two-hour run. 
Thanx, Paul

------------------------------------------------------------------------
TREE03
------------------------------------------------------------------------

CONFIG_SMP=y
CONFIG_NR_CPUS=16
CONFIG_PREEMPT_NONE=n
CONFIG_PREEMPT_VOLUNTARY=n
CONFIG_PREEMPT=y
#CHECK#CONFIG_PREEMPT_RCU=y
CONFIG_HZ_PERIODIC=y
CONFIG_NO_HZ_IDLE=n
CONFIG_NO_HZ_FULL=n
CONFIG_RCU_TRACE=y
CONFIG_HOTPLUG_CPU=y
CONFIG_RCU_FANOUT=2
CONFIG_RCU_FANOUT_LEAF=2
CONFIG_RCU_NOCB_CPU=n
CONFIG_DEBUG_LOCK_ALLOC=n
CONFIG_RCU_BOOST=y
CONFIG_RCU_KTHREAD_PRIO=2
CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
CONFIG_RCU_EXPERT=y

------------------------------------------------------------------------
TREE07
------------------------------------------------------------------------

CONFIG_SMP=y
CONFIG_NR_CPUS=16
CONFIG_CPUMASK_OFFSTACK=y
CONFIG_PREEMPT_NONE=y
CONFIG_PREEMPT_VOLUNTARY=n
CONFIG_PREEMPT=n
#CHECK#CONFIG_TREE_RCU=y
CONFIG_HZ_PERIODIC=n
CONFIG_NO_HZ_IDLE=n
CONFIG_NO_HZ_FULL=y
CONFIG_NO_HZ_FULL_ALL=n
CONFIG_NO_HZ_FULL_SYSIDLE=y
CONFIG_RCU_FAST_NO_HZ=n
CONFIG_RCU_TRACE=y
CONFIG_HOTPLUG_CPU=y
CONFIG_RCU_FANOUT=2
CONFIG_RCU_FANOUT_LEAF=2
CONFIG_RCU_NOCB_CPU=n
CONFIG_DEBUG_LOCK_ALLOC=n
CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
CONFIG_RCU_EXPERT=y

^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 0/3] sched: Fix wakeup preemption regression @ 2016-05-10 17:43 Peter Zijlstra 2016-05-10 17:43 ` [PATCH 2/3] sched,fair: Fix local starvation Peter Zijlstra 0 siblings, 1 reply; 11+ messages in thread From: Peter Zijlstra @ 2016-05-10 17:43 UTC (permalink / raw) To: mingo, linux-kernel Cc: peterz, Pavan Kondeti, Ben Segall, Matt Fleming, Mike Galbraith, Morten Rasmussen, Paul Turner, Thomas Gleixner, byungchul.park, Andrew Hunter A recent commit caused an interactivity/starvation issue because we wrecked rq local wakeup preemption. These patches rectify this while also (hopefully) keeping fixed the problem that led to the faulty patch. Mike, Pavan, could you guys please confirm? ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 2/3] sched,fair: Fix local starvation 2016-05-10 17:43 [PATCH 0/3] sched: Fix wakeup preemption regression Peter Zijlstra @ 2016-05-10 17:43 ` Peter Zijlstra 2016-05-21 14:04 ` Mike Galbraith 0 siblings, 1 reply; 11+ messages in thread From: Peter Zijlstra @ 2016-05-10 17:43 UTC (permalink / raw) To: mingo, linux-kernel Cc: peterz, Pavan Kondeti, Ben Segall, Matt Fleming, Morten Rasmussen, Paul Turner, Thomas Gleixner, byungchul.park, Andrew Hunter, Mike Galbraith [-- Attachment #1: peterz-sched-fix-local-starvation.patch --] [-- Type: text/plain, Size: 5203 bytes --] Mike reported that the recent commit 3a47d5124a95 ("sched/fair: Fix fairness issue on migration") broke interactivity and the signal starve test. The problem is that I assumed ENQUEUE_WAKING was only set when we do a cross-cpu wakeup (migration), which isn't true. This means we now destroy the vruntime history of tasks and wakeup-preemption suffers. Cure this by making my assumption true, only call sched_class::task_waking() when we do a cross-cpu wakeup. This avoids the indirect call in the case we do a local wakeup. 
Cc: Pavan Kondeti <pkondeti@codeaurora.org> Cc: Ben Segall <bsegall@google.com> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Morten Rasmussen <morten.rasmussen@arm.com> Cc: Paul Turner <pjt@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: byungchul.park@lge.com Cc: Andrew Hunter <ahh@google.com> Fixes: 3a47d5124a95 ("sched/fair: Fix fairness issue on migration") Reported-by: Mike Galbraith <mgalbraith@suse.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> --- kernel/sched/core.c | 29 +++++++++++++++++++++-------- kernel/sched/fair.c | 41 ++++++++++++++++++++++++++++++++++++++++- 2 files changed, 61 insertions(+), 9 deletions(-) --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -1709,14 +1709,22 @@ static void ttwu_do_activate(struct rq *rq, struct task_struct *p, int wake_flags, struct pin_cookie cookie) { + int en_flags = ENQUEUE_WAKEUP; + lockdep_assert_held(&rq->lock); #ifdef CONFIG_SMP if (p->sched_contributes_to_load) rq->nr_uninterruptible--; + + /* + * If we migrated; we must have called sched_class::task_waking(). + */ + if (wake_flags & WF_MIGRATED) + en_flags |= ENQUEUE_WAKING; #endif - ttwu_activate(rq, p, ENQUEUE_WAKEUP | ENQUEUE_WAKING); + ttwu_activate(rq, p, en_flags); ttwu_do_wakeup(rq, p, wake_flags, cookie); } @@ -1762,7 +1770,11 @@ void sched_ttwu_pending(void) while (llist) { p = llist_entry(llist, struct task_struct, wake_entry); llist = llist_next(llist); - ttwu_do_activate(rq, p, 0, cookie); + /* + * See ttwu_queue(); we only call ttwu_queue_remote() when + * its a x-cpu wakeup. 
+ */ + ttwu_do_activate(rq, p, WF_MIGRATED, cookie); } lockdep_unpin_lock(&rq->lock, cookie); @@ -1849,7 +1861,7 @@ bool cpus_share_cache(int this_cpu, int } #endif /* CONFIG_SMP */ -static void ttwu_queue(struct task_struct *p, int cpu) +static void ttwu_queue(struct task_struct *p, int cpu, int wake_flags) { struct rq *rq = cpu_rq(cpu); struct pin_cookie cookie; @@ -1864,7 +1876,7 @@ static void ttwu_queue(struct task_struc raw_spin_lock(&rq->lock); cookie = lockdep_pin_lock(&rq->lock); - ttwu_do_activate(rq, p, 0, cookie); + ttwu_do_activate(rq, p, wake_flags, cookie); lockdep_unpin_lock(&rq->lock, cookie); raw_spin_unlock(&rq->lock); } @@ -2034,17 +2046,18 @@ try_to_wake_up(struct task_struct *p, un p->sched_contributes_to_load = !!task_contributes_to_load(p); p->state = TASK_WAKING; - if (p->sched_class->task_waking) - p->sched_class->task_waking(p); - cpu = select_task_rq(p, p->wake_cpu, SD_BALANCE_WAKE, wake_flags); if (task_cpu(p) != cpu) { wake_flags |= WF_MIGRATED; + + if (p->sched_class->task_waking) + p->sched_class->task_waking(p); + set_task_cpu(p, cpu); } #endif /* CONFIG_SMP */ - ttwu_queue(p, cpu); + ttwu_queue(p, cpu, wake_flags); stat: if (schedstat_enabled()) ttwu_stat(p, cpu, wake_flags); --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -3247,6 +3247,37 @@ static inline void check_schedstat_requi #endif } + +/* + * MIGRATION + * + * dequeue + * update_curr() + * update_min_vruntime() + * vruntime -= min_vruntime + * + * enqueue + * update_curr() + * update_min_vruntime() + * vruntime += min_vruntime + * + * this way the vruntime transition between RQs is done when both + * min_vruntime are up-to-date. + * + * WAKEUP (remote) + * + * ->task_waking_fair() + * vruntime -= min_vruntime + * + * enqueue + * update_curr() + * update_min_vruntime() + * vruntime += min_vruntime + * + * this way we don't have the most up-to-date min_vruntime on the originating + * CPU and an up-to-date min_vruntime on the destination CPU. 
+ */ + static void enqueue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags) { @@ -3264,7 +3295,9 @@ enqueue_entity(struct cfs_rq *cfs_rq, st /* * Otherwise, renormalise after, such that we're placed at the current - * moment in time, instead of some random moment in the past. + * moment in time, instead of some random moment in the past. Being + * placed in the past could significantly boost this task to the + * fairness detriment of existing tasks. */ if (renorm && !curr) se->vruntime += cfs_rq->min_vruntime; @@ -4811,6 +4844,12 @@ static unsigned long cpu_avg_load_per_ta return 0; } +/* + * Called to migrate a waking task; as blocked tasks retain absolute vruntime + * the migration needs to deal with this by subtracting the old and adding the + * new min_vruntime -- the latter is done by enqueue_entity() when placing + * the task on the new runqueue. + */ static void task_waking_fair(struct task_struct *p) { struct sched_entity *se = &p->se; ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 2/3] sched,fair: Fix local starvation 2016-05-10 17:43 ` [PATCH 2/3] sched,fair: Fix local starvation Peter Zijlstra @ 2016-05-21 14:04 ` Mike Galbraith 2016-05-21 19:00 ` Mike Galbraith 0 siblings, 1 reply; 11+ messages in thread From: Mike Galbraith @ 2016-05-21 14:04 UTC (permalink / raw) To: Peter Zijlstra, mingo, linux-kernel Cc: Pavan Kondeti, Ben Segall, Matt Fleming, Morten Rasmussen, Paul Turner, Thomas Gleixner, byungchul.park, Andrew Hunter On Tue, 2016-05-10 at 19:43 +0200, Peter Zijlstra wrote: (Evolution authors must either not do patch review, or use some other mailer. Squint hard, this crud really is your patch;) > --- a/kernel/sched/core.c > +++ b/kernel/sched/core.c > > @@ -1762,7 +1770,11 @@ void sched_ttwu_pending(void) > > > while (llist) { > > > > p = llist_entry(llist, struct task_struct, wake_entry); > > > > llist = llist_next(llist); > -> > > ttwu_do_activate(rq, p, 0, cookie); > +> > > /* > +> > > * See ttwu_queue(); we only call ttwu_queue_remote() when > +> > > * its a x-cpu wakeup. > +> > > */ > +> > > ttwu_do_activate(rq, p, WF_MIGRATED, cookie); Wakees that were not migrated/normalized eat an unwanted min_vruntime, and likely take a size XXL latency hit. Big box running master bled profusely under heavy load until I turned TTWU_QUEUE off. -Mike ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 2/3] sched,fair: Fix local starvation 2016-05-21 14:04 ` Mike Galbraith @ 2016-05-21 19:00 ` Mike Galbraith 2016-05-22 7:00 ` [patch] sched/fair: Move se->vruntime normalization state into struct sched_entity Mike Galbraith 0 siblings, 1 reply; 11+ messages in thread From: Mike Galbraith @ 2016-05-21 19:00 UTC (permalink / raw) To: Peter Zijlstra, mingo, linux-kernel Cc: Pavan Kondeti, Ben Segall, Matt Fleming, Morten Rasmussen, Paul Turner, Thomas Gleixner, byungchul.park, Andrew Hunter On Sat, 2016-05-21 at 16:04 +0200, Mike Galbraith wrote: > Wakees that were not migrated/normalized eat an unwanted min_vruntime, > and likely take a size XXL latency hit. Big box running master bled > profusely under heavy load until I turned TTWU_QUEUE off. The below made big box a happy camper again. sched/fair: Move se->vruntime normalization state into struct sched_entity Make ->vruntime normalization state explicit. Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com> --- include/linux/sched.h | 1 + kernel/sched/core.c | 1 + kernel/sched/fair.c | 42 ++++++++++++++---------------------------- 3 files changed, 16 insertions(+), 28 deletions(-) --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -1348,6 +1348,7 @@ struct sched_entity { struct rb_node run_node; struct list_head group_node; unsigned int on_rq; + bool normalized; u64 exec_start; u64 sum_exec_runtime; --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -2259,6 +2259,7 @@ static void __sched_fork(unsigned long c p->se.prev_sum_exec_runtime = 0; p->se.nr_migrations = 0; p->se.vruntime = 0; + p->se.normalized = true; INIT_LIST_HEAD(&p->se.group_node); #ifdef CONFIG_FAIR_GROUP_SCHED --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -3305,14 +3305,13 @@ static inline void check_schedstat_requi static void enqueue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags) { - bool renorm = !(flags & ENQUEUE_WAKEUP) || (flags & ENQUEUE_MIGRATED); bool curr = cfs_rq->curr == se; /* * 
If we're the current task, we must renormalise before calling * update_curr(). */ - if (renorm && curr) + if (se->normalized && curr) se->vruntime += cfs_rq->min_vruntime; update_curr(cfs_rq); @@ -3323,9 +3322,11 @@ enqueue_entity(struct cfs_rq *cfs_rq, st * placed in the past could significantly boost this task to the * fairness detriment of existing tasks. */ - if (renorm && !curr) + if (se->normalized && !curr) se->vruntime += cfs_rq->min_vruntime; + se->normalized = false; + enqueue_entity_load_avg(cfs_rq, se); account_entity_enqueue(cfs_rq, se); update_cfs_shares(cfs_rq); @@ -3422,8 +3423,10 @@ dequeue_entity(struct cfs_rq *cfs_rq, st * update can refer to the ->curr item and we need to reflect this * movement in our normalized position. */ - if (!(flags & DEQUEUE_SLEEP)) + if (!(flags & DEQUEUE_SLEEP)) { se->vruntime -= cfs_rq->min_vruntime; + se->normalized = true; + } /* return excess runtime on last dequeue */ return_cfs_rq_runtime(cfs_rq); @@ -5681,6 +5684,7 @@ static void migrate_task_rq_fair(struct #endif se->vruntime -= min_vruntime; + se->normalized = true; } /* @@ -8591,6 +8595,7 @@ static void task_fork_fair(struct task_s } se->vruntime -= cfs_rq->min_vruntime; + se->normalized = true; raw_spin_unlock_irqrestore(&rq->lock, flags); } @@ -8619,29 +8624,7 @@ prio_changed_fair(struct rq *rq, struct static inline bool vruntime_normalized(struct task_struct *p) { - struct sched_entity *se = &p->se; - - /* - * In both the TASK_ON_RQ_QUEUED and TASK_ON_RQ_MIGRATING cases, - * the dequeue_entity(.flags=0) will already have normalized the - * vruntime. - */ - if (p->on_rq) - return true; - - /* - * When !on_rq, vruntime of the task has usually NOT been normalized. - * But there are some cases where it has already been normalized: - * - * - A forked child which is waiting for being woken up by - * wake_up_new_task(). - * - A task which has been woken up by try_to_wake_up() and - * waiting for actually being woken up by sched_ttwu_pending(). 
- */ - if (!se->sum_exec_runtime || p->state == TASK_WAKING) - return true; - - return false; + return p->se.normalized; } static void detach_task_cfs_rq(struct task_struct *p) @@ -8656,6 +8639,7 @@ static void detach_task_cfs_rq(struct ta */ place_entity(cfs_rq, se, 0); se->vruntime -= cfs_rq->min_vruntime; + se->normalized = true; } /* Catch up with the cfs_rq and remove our load when we leave */ @@ -8678,8 +8662,10 @@ static void attach_task_cfs_rq(struct ta /* Synchronize task with its cfs_rq */ attach_entity_load_avg(cfs_rq, se); - if (!vruntime_normalized(p)) + if (vruntime_normalized(p)) { se->vruntime += cfs_rq->min_vruntime; + se->normalized = false; + } } static void switched_from_fair(struct rq *rq, struct task_struct *p) ^ permalink raw reply [flat|nested] 11+ messages in thread
* [patch] sched/fair: Move se->vruntime normalization state into struct sched_entity 2016-05-21 19:00 ` Mike Galbraith @ 2016-05-22 7:00 ` Mike Galbraith 2016-05-22 9:36 ` Peter Zijlstra 2016-05-23 9:19 ` Peter Zijlstra 0 siblings, 2 replies; 11+ messages in thread From: Mike Galbraith @ 2016-05-22 7:00 UTC (permalink / raw) To: Peter Zijlstra, mingo, linux-kernel Cc: Pavan Kondeti, Ben Segall, Matt Fleming, Morten Rasmussen, Paul Turner, Thomas Gleixner, byungchul.park, Andrew Hunter On Sat, 2016-05-21 at 21:00 +0200, Mike Galbraith wrote: > On Sat, 2016-05-21 at 16:04 +0200, Mike Galbraith wrote: > > > Wakees that were not migrated/normalized eat an unwanted min_vruntime, > > and likely take a size XXL latency hit. Big box running master bled > > profusely under heavy load until I turned TTWU_QUEUE off. May as well make it official and against master.today. Fly or die little patchlet. sched/fair: Move se->vruntime normalization state into struct sched_entity b5179ac70de ceased globally normalizing wakee vruntime in ttwu(), leaving sched_ttwu_pending() with the need to know whether each wakee on wake_list was migrated or not, to pass that on to fair class functions so they can DTRT wrt vruntime normalization. Store vruntime normalization state in struct sched_entity, so fair class functions that need it always have it, and sched_ttwu_pending() again doesn't need to care whether tasks on the wake_list have been migrated or not. Since there are now no consumers of ENQUEUE_MIGRATED, drop it as well. 
master v4.6-8889-gf6c658df6385 virgin 256 49096 71698.99 MB/sec warmup 1 sec latency 1136.488 ms 256 155009 72862.08 MB/sec execute 1 sec latency 3136.900 ms 256 207430 72628.04 MB/sec execute 2 sec latency 4137.001 ms 256 259635 72442.97 MB/sec execute 3 sec latency 5137.105 ms 256 311905 72371.84 MB/sec execute 4 sec latency 6137.214 ms 256 364210 72564.99 MB/sec execute 5 sec latency 7137.323 ms 256 416551 72598.74 MB/sec execute 6 sec latency 5816.895 ms 256 468824 72601.54 MB/sec execute 7 sec latency 6815.386 ms 256 520996 72621.87 MB/sec execute 8 sec latency 7815.499 ms 256 573113 72608.75 MB/sec execute 9 sec latency 8815.609 ms 256 cleanup 10 sec 0 cleanup 10 sec master v4.6-8889-gf6c658df6385 post 256 51527 75357.55 MB/sec warmup 1 sec latency 21.591 ms 256 157610 73188.06 MB/sec execute 1 sec latency 12.985 ms 256 210089 72809.01 MB/sec execute 2 sec latency 11.543 ms 256 262554 72681.86 MB/sec execute 3 sec latency 0.209 ms 256 315432 72798.65 MB/sec execute 4 sec latency 0.206 ms 256 368162 72963.33 MB/sec execute 5 sec latency 8.052 ms 256 420854 72976.50 MB/sec execute 6 sec latency 0.221 ms 256 473420 72953.76 MB/sec execute 7 sec latency 0.198 ms 256 525859 73011.17 MB/sec execute 8 sec latency 2.810 ms 256 578301 73052.84 MB/sec execute 9 sec latency 0.247 ms 256 cleanup 10 sec 0 cleanup 10 sec Fixes: b5179ac70de sched/fair: Prepare to fix fairness problems on migration Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com> --- include/linux/sched.h | 1 kernel/sched/core.c | 6 +---- kernel/sched/fair.c | 60 ++++++++++++++++++++------------------------------ 3 files changed, 28 insertions(+), 39 deletions(-) --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -1319,6 +1319,7 @@ struct sched_entity { struct rb_node run_node; struct list_head group_node; unsigned int on_rq; + bool normalized; u64 exec_start; u64 sum_exec_runtime; --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -1719,9 +1719,6 @@ ttwu_do_activate(struct rq *rq, struct 
t #ifdef CONFIG_SMP if (p->sched_contributes_to_load) rq->nr_uninterruptible--; - - if (wake_flags & WF_MIGRATED) - en_flags |= ENQUEUE_MIGRATED; #endif ttwu_activate(rq, p, en_flags); @@ -1774,7 +1771,7 @@ void sched_ttwu_pending(void) * See ttwu_queue(); we only call ttwu_queue_remote() when * its a x-cpu wakeup. */ - ttwu_do_activate(rq, p, WF_MIGRATED, cookie); + ttwu_do_activate(rq, p, 0, cookie); } lockdep_unpin_lock(&rq->lock, cookie); @@ -2166,6 +2163,7 @@ static void __sched_fork(unsigned long c p->se.prev_sum_exec_runtime = 0; p->se.nr_migrations = 0; p->se.vruntime = 0; + p->se.normalized = true; INIT_LIST_HEAD(&p->se.group_node); #ifdef CONFIG_FAIR_GROUP_SCHED --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -3230,6 +3230,7 @@ place_entity(struct cfs_rq *cfs_rq, stru /* ensure we never gain time by being placed backwards. */ se->vruntime = max_vruntime(se->vruntime, vruntime); + se->normalized = false; } static void check_enqueue_throttle(struct cfs_rq *cfs_rq); @@ -3285,29 +3286,40 @@ static inline void check_schedstat_requi * CPU and an up-to-date min_vruntime on the destination CPU. */ +static void normalize_entity(struct cfs_rq *cfs_rq, struct sched_entity *se) +{ + se->vruntime -= cfs_rq->min_vruntime; + se->normalized = true; +} + +static void renormalize_entity(struct cfs_rq *cfs_rq, struct sched_entity *se) +{ + se->vruntime += cfs_rq->min_vruntime; + se->normalized = false; +} + static void enqueue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags) { - bool renorm = !(flags & ENQUEUE_WAKEUP) || (flags & ENQUEUE_MIGRATED); - bool curr = cfs_rq->curr == se; + bool renorm = se->normalized, curr = cfs_rq->curr == se; /* - * If we're the current task, we must renormalise before calling + * If we're the current task, we must renormalize before calling * update_curr(). 
*/ if (renorm && curr) - se->vruntime += cfs_rq->min_vruntime; + renormalize_entity(cfs_rq, se); update_curr(cfs_rq); /* - * Otherwise, renormalise after, such that we're placed at the current + * Otherwise, renormalize after, such that we're placed at the current * moment in time, instead of some random moment in the past. Being * placed in the past could significantly boost this task to the * fairness detriment of existing tasks. */ if (renorm && !curr) - se->vruntime += cfs_rq->min_vruntime; + renormalize_entity(cfs_rq, se); enqueue_entity_load_avg(cfs_rq, se); account_entity_enqueue(cfs_rq, se); @@ -3406,7 +3418,7 @@ dequeue_entity(struct cfs_rq *cfs_rq, st * movement in our normalized position. */ if (!(flags & DEQUEUE_SLEEP)) - se->vruntime -= cfs_rq->min_vruntime; + normalize_entity(cfs_rq, se); /* return excess runtime on last dequeue */ return_cfs_rq_runtime(cfs_rq); @@ -5408,7 +5420,7 @@ static void migrate_task_rq_fair(struct min_vruntime = cfs_rq->min_vruntime; #endif - se->vruntime -= min_vruntime; + normalize_entity(cfs_rq, se); } /* @@ -8319,7 +8331,7 @@ static void task_fork_fair(struct task_s resched_curr(rq); } - se->vruntime -= cfs_rq->min_vruntime; + normalize_entity(cfs_rq, se); raw_spin_unlock_irqrestore(&rq->lock, flags); } @@ -8348,29 +8360,7 @@ prio_changed_fair(struct rq *rq, struct static inline bool vruntime_normalized(struct task_struct *p) { - struct sched_entity *se = &p->se; - - /* - * In both the TASK_ON_RQ_QUEUED and TASK_ON_RQ_MIGRATING cases, - * the dequeue_entity(.flags=0) will already have normalized the - * vruntime. - */ - if (p->on_rq) - return true; - - /* - * When !on_rq, vruntime of the task has usually NOT been normalized. - * But there are some cases where it has already been normalized: - * - * - A forked child which is waiting for being woken up by - * wake_up_new_task(). - * - A task which has been woken up by try_to_wake_up() and - * waiting for actually being woken up by sched_ttwu_pending(). 
- */ - if (!se->sum_exec_runtime || p->state == TASK_WAKING) - return true; - - return false; + return p->se.normalized; } static void detach_task_cfs_rq(struct task_struct *p) @@ -8384,7 +8374,7 @@ static void detach_task_cfs_rq(struct ta * cause 'unlimited' sleep bonus. */ place_entity(cfs_rq, se, 0); - se->vruntime -= cfs_rq->min_vruntime; + normalize_entity(cfs_rq, se); } /* Catch up with the cfs_rq and remove our load when we leave */ @@ -8407,8 +8397,8 @@ static void attach_task_cfs_rq(struct ta /* Synchronize task with its cfs_rq */ attach_entity_load_avg(cfs_rq, se); - if (!vruntime_normalized(p)) - se->vruntime += cfs_rq->min_vruntime; + if (vruntime_normalized(p)) + renormalize_entity(cfs_rq, se); } static void switched_from_fair(struct rq *rq, struct task_struct *p) ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [patch] sched/fair: Move se->vruntime normalization state into struct sched_entity 2016-05-22 7:00 ` [patch] sched/fair: Move se->vruntime normalization state into struct sched_entity Mike Galbraith @ 2016-05-22 9:36 ` Peter Zijlstra 2016-05-22 9:52 ` Mike Galbraith 2016-05-22 10:33 ` Peter Zijlstra 2016-05-23 9:19 ` Peter Zijlstra 1 sibling, 2 replies; 11+ messages in thread From: Peter Zijlstra @ 2016-05-22 9:36 UTC (permalink / raw) To: Mike Galbraith Cc: mingo, linux-kernel, Pavan Kondeti, Ben Segall, Matt Fleming, Morten Rasmussen, Paul Turner, Thomas Gleixner, byungchul.park, Andrew Hunter On Sun, May 22, 2016 at 09:00:01AM +0200, Mike Galbraith wrote: > On Sat, 2016-05-21 at 21:00 +0200, Mike Galbraith wrote: > > On Sat, 2016-05-21 at 16:04 +0200, Mike Galbraith wrote: > > > > > Wakees that were not migrated/normalized eat an unwanted min_vruntime, > > > and likely take a size XXL latency hit. Big box running master bled > > > profusely under heavy load until I turned TTWU_QUEUE off. > > May as well make it official and against master.today. Fly or die > little patchlet. > > sched/fair: Move se->vruntime normalization state into struct sched_entity Yeah, I used to have a patch like this; but for debugging. I don't particularly like carrying this information other than for verification because it means we either do too much or too little normalization. I'll try and have a look on Monday, but I got some real-life things to sort out first.. ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [patch] sched/fair: Move se->vruntime normalization state into struct sched_entity 2016-05-22 9:36 ` Peter Zijlstra @ 2016-05-22 9:52 ` Mike Galbraith 2016-05-22 10:33 ` Peter Zijlstra 1 sibling, 0 replies; 11+ messages in thread From: Mike Galbraith @ 2016-05-22 9:52 UTC (permalink / raw) To: Peter Zijlstra Cc: mingo, linux-kernel, Pavan Kondeti, Ben Segall, Matt Fleming, Morten Rasmussen, Paul Turner, Thomas Gleixner, byungchul.park, Andrew Hunter On Sun, 2016-05-22 at 11:36 +0200, Peter Zijlstra wrote: > On Sun, May 22, 2016 at 09:00:01AM +0200, Mike Galbraith wrote: > > On Sat, 2016-05-21 at 21:00 +0200, Mike Galbraith wrote: > > > On Sat, 2016-05-21 at 16:04 +0200, Mike Galbraith wrote: > > > > > > > Wakees that were not migrated/normalized eat an unwanted min_vruntime, > > > > and likely take a size XXL latency hit. Big box running master bled > > > > profusely under heavy load until I turned TTWU_QUEUE off. > > > > May as well make it official and against master.today. Fly or die > > little patchlet. > > > > sched/fair: Move se->vruntime normalization state into struct sched_entity > > Yeah, I used to have a patch like this; but for debugging. I don't > particularly like carrying this information other than for verification > because it means we either do too much or too little normalization. > > I'll try and have a look on Monday, but I got some real-life things to > sort out first.. Ok (flush, say hi to the goldfish little patchlet). I don't care how it gets fixed, only that it does, and yours will likely be prettier :) -Mike ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [patch] sched/fair: Move se->vruntime normalization state into struct sched_entity 2016-05-22 9:36 ` Peter Zijlstra 2016-05-22 9:52 ` Mike Galbraith @ 2016-05-22 10:33 ` Peter Zijlstra 1 sibling, 0 replies; 11+ messages in thread From: Peter Zijlstra @ 2016-05-22 10:33 UTC (permalink / raw) To: Mike Galbraith Cc: mingo, linux-kernel, Pavan Kondeti, Ben Segall, Matt Fleming, Morten Rasmussen, Paul Turner, Thomas Gleixner, byungchul.park, Andrew Hunter On Sun, May 22, 2016 at 11:36:38AM +0200, Peter Zijlstra wrote: > > > > sched/fair: Move se->vruntime normalization state into struct sched_entity > > Yeah, I used to have a patch like this; but for debugging. I don't > particularly like carrying this information other than for verification > because it means we either do too much or too little normalization. > > I'll try and have a look on Monday, but I got some real-life things to > sort out first.. Ah, I think I see why we might actually need a variable this time... Brain seems to have worked overtime while pulling weeds in the garden :-) In any case, I'll have a look on Monday. ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [patch] sched/fair: Move se->vruntime normalization state into struct sched_entity 2016-05-22 7:00 ` [patch] sched/fair: Move se->vruntime normalization state into struct sched_entity Mike Galbraith 2016-05-22 9:36 ` Peter Zijlstra @ 2016-05-23 9:19 ` Peter Zijlstra 2016-05-23 9:40 ` Mike Galbraith 1 sibling, 1 reply; 11+ messages in thread From: Peter Zijlstra @ 2016-05-23 9:19 UTC (permalink / raw) To: Mike Galbraith Cc: mingo, linux-kernel, Pavan Kondeti, Ben Segall, Matt Fleming, Morten Rasmussen, Paul Turner, Thomas Gleixner, byungchul.park, Andrew Hunter On Sun, May 22, 2016 at 09:00:01AM +0200, Mike Galbraith wrote: > On Sat, 2016-05-21 at 21:00 +0200, Mike Galbraith wrote: > > On Sat, 2016-05-21 at 16:04 +0200, Mike Galbraith wrote: > > > > > Wakees that were not migrated/normalized eat an unwanted min_vruntime, > > > and likely take a size XXL latency hit. Big box running master bled > > > profusely under heavy load until I turned TTWU_QUEUE off. > > May as well make it official and against master.today. Fly or die > little patchlet. > > sched/fair: Move se->vruntime normalization state into struct sched_entity Does this work? 
---
 include/linux/sched.h |  1 +
 kernel/sched/core.c   | 18 +++++++++++-------
 2 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 1b43b45a22b9..a2001e01b3df 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1534,6 +1534,7 @@ struct task_struct {
 	unsigned sched_reset_on_fork:1;
 	unsigned sched_contributes_to_load:1;
 	unsigned sched_migrated:1;
+	unsigned sched_remote_wakeup:1;
 	unsigned :0; /* force alignment to the next boundary */

 	/* unserialized, strictly 'current' */
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 404c0784b1fc..7f2cae4620c7 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1768,13 +1768,15 @@ void sched_ttwu_pending(void)
 	cookie = lockdep_pin_lock(&rq->lock);

 	while (llist) {
+		int wake_flags = 0;
+
 		p = llist_entry(llist, struct task_struct, wake_entry);
 		llist = llist_next(llist);
-		/*
-		 * See ttwu_queue(); we only call ttwu_queue_remote() when
-		 * its a x-cpu wakeup.
-		 */
-		ttwu_do_activate(rq, p, WF_MIGRATED, cookie);
+
+		if (p->sched_remote_wakeup)
+			wake_flags = WF_MIGRATED;
+
+		ttwu_do_activate(rq, p, wake_flags, cookie);
 	}

 	lockdep_unpin_lock(&rq->lock, cookie);
@@ -1819,10 +1821,12 @@ void scheduler_ipi(void)
 	irq_exit();
 }

-static void ttwu_queue_remote(struct task_struct *p, int cpu)
+static void ttwu_queue_remote(struct task_struct *p, int cpu, int wake_flags)
 {
 	struct rq *rq = cpu_rq(cpu);

+	p->sched_remote_wakeup = !!(wake_flags & WF_MIGRATED);
+
 	if (llist_add(&p->wake_entry, &cpu_rq(cpu)->wake_list)) {
 		if (!set_nr_if_polling(rq->idle))
 			smp_send_reschedule(cpu);
@@ -1869,7 +1873,7 @@ static void ttwu_queue(struct task_struct *p, int cpu, int wake_flags)
 #if defined(CONFIG_SMP)
 	if (sched_feat(TTWU_QUEUE) && !cpus_share_cache(smp_processor_id(), cpu)) {
 		sched_clock_cpu(cpu); /* sync clocks x-cpu */
-		ttwu_queue_remote(p, cpu);
+		ttwu_queue_remote(p, cpu, wake_flags);
 		return;
 	}
 #endif
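The pattern the patch relies on — stash the one flag you will need on the task itself before pushing it onto the remote CPU's wake list, then reconstruct the wake flags when the list is drained — can be sketched in plain userspace C. This is a single-threaded stand-in, not the kernel's code: the names (`queue_remote`, `drain_one`, the `WF_MIGRATED` value) are illustrative, and a plain pointer push replaces the atomic `llist_add()`:

```c
#include <assert.h>
#include <stddef.h>

#define WF_MIGRATED 0x01        /* illustrative value, not the kernel's */

struct task {
	unsigned remote_wakeup:1;   /* mirrors p->sched_remote_wakeup */
	struct task *next;          /* stand-in for the llist wake_entry */
};

static struct task *wake_list;  /* per-CPU in the kernel; one global here */

/* Producer side (cf. ttwu_queue_remote): record the flag, then enqueue.
 * The kernel uses an atomic llist_add() so this is safe cross-CPU. */
static void queue_remote(struct task *p, int wake_flags)
{
	p->remote_wakeup = !!(wake_flags & WF_MIGRATED);
	p->next = wake_list;
	wake_list = p;
}

/* Consumer side (cf. sched_ttwu_pending): pop one task and rebuild its
 * wake_flags from the stored bit. Returns -1 when the list is empty. */
static int drain_one(struct task **out)
{
	struct task *p = wake_list;

	if (!p)
		return -1;
	wake_list = p->next;
	*out = p;
	return p->remote_wakeup ? WF_MIGRATED : 0;
}
```

The point of the shape: the producer and consumer run on different CPUs, so the only per-task state that crosses over is what rides on the task itself — which is why the bug existed (the flag used to be hardcoded to `WF_MIGRATED` on the consumer side) and why a one-bit field is the whole fix.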
* Re: [patch] sched/fair: Move se->vruntime normalization state into struct sched_entity
From: Mike Galbraith @ 2016-05-23  9:40 UTC
To: Peter Zijlstra
Cc: mingo, linux-kernel, Pavan Kondeti, Ben Segall, Matt Fleming,
    Morten Rasmussen, Paul Turner, Thomas Gleixner, byungchul.park,
    Andrew Hunter

On Mon, 2016-05-23 at 11:19 +0200, Peter Zijlstra wrote:
> On Sun, May 22, 2016 at 09:00:01AM +0200, Mike Galbraith wrote:
> > On Sat, 2016-05-21 at 21:00 +0200, Mike Galbraith wrote:
> > > On Sat, 2016-05-21 at 16:04 +0200, Mike Galbraith wrote:
> > >
> > > > Wakees that were not migrated/normalized eat an unwanted min_vruntime,
> > > > and likely take a size XXL latency hit. Big box running master bled
> > > > profusely under heavy load until I turned TTWU_QUEUE off.
> >
> > May as well make it official and against master.today. Fly or die,
> > little patchlet.
> >
> > sched/fair: Move se->vruntime normalization state into struct sched_entity
>
> Does this work?

Yup, bugs--. Kinda funny, I considered ~this way first, but thought
you'd not take that approach.. dang, got it back-assward ;-)

[quoted patch snipped]
* Re: [patch] sched/fair: Move se->vruntime normalization state into struct sched_entity
From: Wanpeng Li @ 2016-05-23 10:13 UTC
To: Mike Galbraith
Cc: Peter Zijlstra, Ingo Molnar, linux-kernel, Pavan Kondeti, Ben Segall,
    Matt Fleming, Morten Rasmussen, Paul Turner, Thomas Gleixner,
    Byungchul Park, Andrew Hunter

2016-05-23 17:40 GMT+08:00 Mike Galbraith <umgwanakikbuti@gmail.com>:
> On Mon, 2016-05-23 at 11:19 +0200, Peter Zijlstra wrote:
>> On Sun, May 22, 2016 at 09:00:01AM +0200, Mike Galbraith wrote:
>> > On Sat, 2016-05-21 at 21:00 +0200, Mike Galbraith wrote:
>> > > On Sat, 2016-05-21 at 16:04 +0200, Mike Galbraith wrote:
>> > > >
>> > > > > Wakees that were not migrated/normalized eat an unwanted min_vruntime,
>> > > > > and likely take a size XXL latency hit. Big box running master bled
>> > > > > profusely under heavy load until I turned TTWU_QUEUE off.
>> > >
>> > > May as well make it official and against master.today. Fly or die,
>> > > little patchlet.
>> > >
>> > > sched/fair: Move se->vruntime normalization state into struct sched_entity
>>
>> Does this work?
>
> Yup, bugs--. Kinda funny, I considered ~this way first, but thought
> you'd not take that approach.. dang, got it back-assward ;-)

This one is nicer.
Reviewed-by: Wanpeng Li <wanpeng.li@hotmail.com>

[quoted patch snipped]

--
Regards,
Wanpeng Li
* Re: [patch] sched/fair: Move se->vruntime normalization state into struct sched_entity
From: Mike Galbraith @ 2016-05-23 10:26 UTC
To: Wanpeng Li
Cc: Peter Zijlstra, Ingo Molnar, linux-kernel, Pavan Kondeti, Ben Segall,
    Matt Fleming, Morten Rasmussen, Paul Turner, Thomas Gleixner,
    Byungchul Park, Andrew Hunter

On Mon, 2016-05-23 at 18:13 +0800, Wanpeng Li wrote:
[quoted context snipped]
> This one is nicer.

Well, the wakeup is a remote wakeup whether the flag is set or not, so
in that regard it's not _squeaky_ clean, but it is dinky, works (yay)..
and doesn't invent helpers with funny names vs. what they actually do.

	-Mike
* Re: [patch] sched/fair: Move se->vruntime normalization state into struct sched_entity
From: Peter Zijlstra @ 2016-05-23 12:28 UTC
To: Mike Galbraith
Cc: mingo, linux-kernel, Pavan Kondeti, Ben Segall, Matt Fleming,
    Morten Rasmussen, Paul Turner, Thomas Gleixner, byungchul.park,
    Andrew Hunter

On Mon, May 23, 2016 at 11:40:35AM +0200, Mike Galbraith wrote:
> Yup, bugs--. Kinda funny, I considered ~this way first, but thought
> you'd not take that approach.. dang, got it back-assward ;-)

Hehe; so the new flag is in a word we've already written to on this
path, and it avoids growing the structure (by an undefined number of
bytes -- see gripes about sizeof(bool)).

I am thinking of doing the thing you did as a debug aid -- then again,
if the Google guys would finally get the global vruntime patches
sorted, we could go and kill all this code.
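Peter's point about not growing the structure can be checked with a toy struct: a fourth one-bit field packs into the same storage unit as its three neighbors, while a `bool` member after the `unsigned :0` alignment marker adds a byte plus tail padding. This is a hedged sketch — the struct names are invented, and exact sizes are implementation-defined (the assertions below hold on common GCC/Clang ABIs, which is exactly the "undefined number of bytes" gripe):

```c
#include <stdbool.h>

/* Three one-bit flags, as task_struct had before the patch. */
struct flags_before {
	unsigned a:1, b:1, c:1;
	unsigned :0;            /* force alignment to the next boundary */
};

/* Adding a fourth bit packs into the same word... */
struct flags_bit {
	unsigned a:1, b:1, c:1, d:1;
	unsigned :0;
};

/* ...whereas a bool after the :0 marker occupies fresh storage,
 * and tail padding typically rounds the struct up further. */
struct flags_bool {
	unsigned a:1, b:1, c:1;
	unsigned :0;
	bool d;
};
```

Hence `sched_remote_wakeup:1` is free, while a `bool sched_remote_wakeup` would have cost real bytes in every `task_struct` in the system.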
end of thread, other threads: [~2016-05-25 17:49 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-05-24 17:04 [patch] sched/fair: Move se->vruntime normalization state into struct sched_entity Paul E. McKenney
2016-05-25 17:49 ` Paul E. McKenney
-- strict thread matches above, loose matches on Subject: below --
2016-05-10 17:43 [PATCH 0/3] sched: Fix wakeup preemption regression Peter Zijlstra
2016-05-10 17:43 ` [PATCH 2/3] sched,fair: Fix local starvation Peter Zijlstra
2016-05-21 14:04 ` Mike Galbraith
2016-05-21 19:00 ` Mike Galbraith
2016-05-22  7:00 ` [patch] sched/fair: Move se->vruntime normalization state into struct sched_entity Mike Galbraith
2016-05-22  9:36 ` Peter Zijlstra
2016-05-22  9:52 ` Mike Galbraith
2016-05-22 10:33 ` Peter Zijlstra
2016-05-23  9:19 ` Peter Zijlstra
2016-05-23  9:40 ` Mike Galbraith
2016-05-23 10:13 ` Wanpeng Li
2016-05-23 10:26 ` Mike Galbraith
2016-05-23 12:28 ` Peter Zijlstra