Date: Tue, 3 Nov 2015 08:45:01 -0800
From: Jacob Pan
To: Peter Zijlstra
Cc: Thomas Gleixner, LKML, Arjan van de Ven, Paul Turner, Len Brown,
 Srinivas Pandruvada, Tim Chen, Andi Kleen, Rafael Wysocki
Subject: Re: [RFC PATCH 3/3] sched: introduce synchronized idle injection
Message-ID: <20151103084501.289ec5d1@yairi>
In-Reply-To: <20151103133120.GD17308@twins.programming.kicks-ass.net>
References: <1446509428-5616-1-git-send-email-jacob.jun.pan@linux.intel.com>
 <1446509428-5616-4-git-send-email-jacob.jun.pan@linux.intel.com>
 <20151103133120.GD17308@twins.programming.kicks-ass.net>

On Tue, 3 Nov 2015 14:31:20 +0100
Peter Zijlstra wrote:

> > @@ -5136,6 +5148,16 @@ pick_next_task_fair(struct rq *rq, struct task_struct *prev)
> >  	struct task_struct *p;
> >  	int new_tasks;
> >  
> > +#ifdef CONFIG_CFS_IDLE_INJECT
> > +	if (cfs_rq->force_throttled &&
> > +	    !idle_cpu(cpu_of(rq)) &&
> > +	    !unlikely(local_softirq_pending())) {
> > +		/* forced idle, pick no task */
> > +		trace_sched_cfs_idle_inject(cpu_of(rq), 1);
> > +		update_curr(cfs_rq);
> > +		return NULL;
> > +	}
> > +#endif
> >  again:
> >  #ifdef CONFIG_FAIR_GROUP_SCHED
> >  	if (!cfs_rq->nr_running)
> 
> So this is horrible...
> This is a fast path, and you just put at least one cachemiss in it, a
> branch (without hint) and some goofy code (wth are we checking
> softirqs?).
> 
softirq is checked here since it is one of the conditions to stop the
sched tick, see can_stop_idle_tick(). But we don't have to check it
here, you are right.

> How about you frob things such that cfs_rq->nr_running == 0 and we'll
> hit the idle: path, at that point you can test if we're forced idle
> and skip the load-balancing attempt.
> 
> There's probably a fair number of icky cases to deal with if you frob
> cfs_rq->nr_running, like the enqueue path which will add to it. We'll
> have to come up with something to not slow that down either.
> 
> The thing is, both schedule and enqueue are very hot and this is code
> that will 'never' run.

Fair enough, I will give that a try. I guess it would be costly and
hard to scale if we were to dequeue/enqueue every se for every period
of injection, plus the locking. Let me get some data first.

I understand we don't want to sacrifice the hot path for code that
almost 'never' runs. But I also have a follow-up plan to use this code
for consolidating/synchronizing idle time during a balanced semi-active
workload. In that case, it may run more often. e.g.

Before:
CPU0 ______||| || |___________| || || |_____
CPU1 _________||| || |_______| || |_______

After:
CPU0 ______||| || |___________| || || |_____
CPU1 ______||| || |___________| || |_______

The goal is to have overlapping idle time if the load is already
balanced. The energy saving can be significant.

Jacob