Date: Tue, 18 Oct 2016 11:07:47 +0200
From: Peter Zijlstra
To: Dietmar Eggemann
Cc: Vincent Guittot, Joseph Salisbury, Ingo Molnar, Linus Torvalds,
    Thomas Gleixner, LKML, Mike Galbraith, omer.akram@canonical.com
Subject: Re: [v4.8-rc1 Regression] sched/fair: Apply more PELT fixes
Message-ID: <20161018090747.GW3142@twins.programming.kicks-ass.net>
In-Reply-To: <94cc6deb-f93e-60ec-5834-e84a8b98e73c@arm.com>

On Mon, Oct 17, 2016 at 11:52:39PM +0100, Dietmar Eggemann wrote:

> Something looks weird related to the use of for_each_possible_cpu(i) in
> online_fair_sched_group() on my i5-3320M CPU (4 logical cpus).
>
> If I print out the cpu id and the cpu masks inside the
> for_each_possible_cpu(i) loop, I get:
>
> [    5.462368] cpu=0 cpu_possible_mask=0-7 cpu_online_mask=0-3 cpu_present_mask=0-3 cpu_active_mask=0-3

OK, you have a buggy BIOS :-) It enumerates too many CPU slots. There is
no reason to have 4 empty CPU slots on a machine that cannot do physical
hotplug.

This also explains why it doesn't show up on many machines; most machines
will not have this, and possible_mask == present_mask == online_mask in
almost all cases.

x86 folk, can we detect the lack of physical hotplug capability, FW_WARN
on this, and lower possible_mask to present_mask?

> [    5.462370] cpu=1 cpu_possible_mask=0-7 cpu_online_mask=0-3 cpu_present_mask=0-3 cpu_active_mask=0-3
> [    5.462370] cpu=2 cpu_possible_mask=0-7 cpu_online_mask=0-3 cpu_present_mask=0-3 cpu_active_mask=0-3
> [    5.462371] cpu=3 cpu_possible_mask=0-7 cpu_online_mask=0-3 cpu_present_mask=0-3 cpu_active_mask=0-3
> [    5.462372] *cpu=4* cpu_possible_mask=0-7 cpu_online_mask=0-3 cpu_present_mask=0-3 cpu_active_mask=0-3
> [    5.462373] *cpu=5* cpu_possible_mask=0-7 cpu_online_mask=0-3 cpu_present_mask=0-3 cpu_active_mask=0-3
> [    5.462374] *cpu=6* cpu_possible_mask=0-7 cpu_online_mask=0-3 cpu_present_mask=0-3 cpu_active_mask=0-3
> [    5.462375] *cpu=7* cpu_possible_mask=0-7 cpu_online_mask=0-3 cpu_present_mask=0-3 cpu_active_mask=0-3
>
> T430:/sys/fs/cgroup/cpu,cpuacct/system.slice# ls -l | grep '^d' | wc -l
> 80
>
> /proc/sched_debug:
>
> cfs_rq[0]:/system.slice
> ...
>   .tg_load_avg                   : 323584
> ...
>
> 80 * 1024 * 4 (non-existent cpu4-cpu7) = 327680 (with a little bit of
> decay, this could be the extra load on the system.slice tg)
>
> Using for_each_online_cpu(i) instead of for_each_possible_cpu(i) in
> online_fair_sched_group() works on this machine, i.e. the .tg_load_avg
> of the system.slice tg is 0 after startup.
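As a quick sanity check of the quoted numbers (a trivial standalone
sketch, not kernel code; the 1024 per group cfs_rq is taken from the
arithmetic above as the assumed initial contribution that never gets
decayed for the never-online CPUs):

#include <stdio.h>

int main(void)
{
	long groups = 80;	/* directories under system.slice */
	long phantom_cpus = 4;	/* cpu4-cpu7: possible but never present */
	long init_load = 1024;	/* assumed initial load_avg per group cfs_rq */

	/* contribution that nothing ever decays out of tg_load_avg */
	printf("%ld\n", groups * phantom_cpus * init_load);	/* 327680 */
	return 0;	/* observed .tg_load_avg was 323584, close to this */
}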
Right, so the reason for using possible_mask is that it avoids having to
deal with hotplug; also, all the per-cpu memory is allocated and present
for !online CPUs anyway, so we might as well set it up properly.

(You might want to start booting your laptop with "possible_cpus=4" to
save some memory, FWIW.)

But yes, we have a bug here too...

/me ponders

So aside from funny BIOSes, this should also show up when creating
cgroups after you have offlined a few CPUs, which I'd think is far more
common.

On IRC you mentioned that adding list_add_leaf_cfs_rq() to
online_fair_sched_group() cures this. That would actually match
unregister_fair_sched_group() doing list_del_leaf_cfs_rq(), and it would
avoid a few instructions on the enqueue path, so that's all good.

I'm just not immediately seeing how it cures things. The only relevant
user of the leaf_cfs_rq list seems to be update_blocked_averages(),
which is called from the load-balance code (idle_balance() and
rebalance_domains()). But neither should be called for offline (or
!present) CPUs.

Humm..
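For illustration, a rough sketch of the change mentioned above (not the
actual patch; list_add_leaf_cfs_rq(), list_del_leaf_cfs_rq() and
post_init_entity_util_avg() are the existing helpers in
kernel/sched/fair.c, but the body of online_fair_sched_group() is
abbreviated here):

void online_fair_sched_group(struct task_group *tg)
{
	struct sched_entity *se;
	struct rq *rq;
	int i;

	for_each_possible_cpu(i) {
		rq = cpu_rq(i);
		se = tg->se[i];

		raw_spin_lock_irq(&rq->lock);
		post_init_entity_util_avg(se);
		/* the addition discussed on IRC, pairing with the
		 * list_del_leaf_cfs_rq() in unregister_fair_sched_group() */
		list_add_leaf_cfs_rq(tg->cfs_rq[i]);
		raw_spin_unlock_irq(&rq->lock);
		/* rest of the body omitted */
	}
}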