From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1753972AbcHVBsX (ORCPT ); Sun, 21 Aug 2016 21:48:23 -0400
Received: from mail-wm0-f67.google.com ([74.125.82.67]:36245 "EHLO
	mail-wm0-f67.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1753093AbcHVBsV (ORCPT ); Sun, 21 Aug 2016 21:48:21 -0400
MIME-Version: 1.0
In-Reply-To: <20160819140307.GA25262@e105550-lin.cambridge.arm.com>
References: <1469453670-2660-1-git-send-email-morten.rasmussen@arm.com>
	<1469453670-2660-11-git-send-email-morten.rasmussen@arm.com>
	<20160815142342.GV6879@twins.programming.kicks-ass.net>
	<20160815154237.GE3391@e105550-lin.cambridge.arm.com>
	<20160818084053.GG3391@e105550-lin.cambridge.arm.com>
	<20160818102438.GA27873@e105550-lin.cambridge.arm.com>
	<20160818134517.GC27873@e105550-lin.cambridge.arm.com>
	<20160819140307.GA25262@e105550-lin.cambridge.arm.com>
From: Wanpeng Li
Date: Mon, 22 Aug 2016 09:48:19 +0800
Message-ID:
Subject: Re: [PATCH v3 10/13] sched/fair: Compute task/cpu utilization at wake-up more correctly
To: Morten Rasmussen
Cc: Peter Zijlstra , Ingo Molnar , Dietmar Eggemann , Yuyang Du ,
	Vincent Guittot , Mike Galbraith , sgurrappadi@nvidia.com,
	Koan-Sin Tan , =?UTF-8?B?5bCP5p6X5pWs5aSq?= ,
	"linux-kernel@vger.kernel.org"
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

2016-08-19 22:03 GMT+08:00 Morten Rasmussen :
> On Fri, Aug 19, 2016 at 09:43:00AM +0800, Wanpeng Li wrote:
>> 2016-08-18 21:45 GMT+08:00 Morten Rasmussen :
>> > I assume you are referring to using task_util_peak() instead of
>> > task_util() in wake_cap()?
>>
>> Yes.
>>
>> >
>> > The peak value should never exceed the util_avg accumulated by the task
>> > last time it ran. So any spike has to be caused by the task accumulating
>> > more utilization last time it ran. We don't know if it is a spike or a more
>>
>> I see.
>>
>> > permanent change in behaviour, so we have to guess. So a spike on an
>> > asymmetric system could cause us to disable wake affine in some
>> > circumstances (either prev_cpu or waker cpu has to have low compute
>> > capacity) for the following wake-up.
>> >
>> > SMP should be unaffected as we should bail out on the previous
>> > condition.
>>
>> Why capacity_orig instead of capacity, since it is checked at each
>> wakeup and the rt class/interrupts may already have taken up much of
>> the cpu utilization?
>
> We could switch to capacity for this condition if we also change the
> spare capacity evaluation in find_idlest_group() to do the same. It
> would allow SMP systems to take the find_idlest_group() route if the
> SD_BALANCE_WAKE flag is set.
>
> The reason why I have avoided capacity and used capacity_orig instead
> is that in previous discussions about scheduling behaviour under
> rt/dl/irq pressure it has not been clear to me whether we want to move
> tasks away from cpus with capacity < capacity_orig or not. The choice
> depends on the use-case.
>
> In some cases taking rt/dl/irq pressure into account is more complicated
> as we don't know the capacities available in a sched_group without
> iterating over all the cpus. However, I don't think it would complicate
> these patches. It is more a question of whether everyone is happy with
> additional conditions in their wake-up path. I guess we could make it a
> sched_feature if people are interested?
>
> In short, I used capacity_orig to play it safe ;-)

Actually, you mixed capacity_orig and capacity when evaluating the max
spare capacity.

Regards,
Wanpeng Li
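
To make the capacity_orig vs capacity question above concrete, here is a
minimal user-space sketch, not kernel code. The struct, the helper names
and the 1280/1024 margin are assumptions chosen for illustration only; they
are not taken from the patch under discussion. It only shows how the same
"does the task fit" test can give different answers depending on which of
the two values it is evaluated against.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-cpu snapshot; field names mirror the terms in the thread. */
struct cpu_cap {
	unsigned long capacity_orig;	/* full compute capacity of the cpu */
	unsigned long capacity;		/* what is left under rt/dl/irq pressure */
};

/* Assumed margin: a task fits if it needs less than about 80% of the cpu. */
#define CAP_MARGIN 1280UL		/* 1.25 on a 1024 scale */

static bool task_fits(unsigned long task_util, unsigned long cap)
{
	return cap * 1024UL >= task_util * CAP_MARGIN;
}

/* Evaluate against capacity_orig: rt/dl/irq pressure is ignored. */
static bool fits_capacity_orig(const struct cpu_cap *c, unsigned long task_util)
{
	assert(c->capacity <= c->capacity_orig);
	return task_fits(task_util, c->capacity_orig);
}

/* Evaluate against capacity: rt/dl/irq pressure is taken into account. */
static bool fits_capacity(const struct cpu_cap *c, unsigned long task_util)
{
	assert(c->capacity <= c->capacity_orig);
	return task_fits(task_util, c->capacity);
}
```

With a big cpu whose capacity_orig is 1024 but whose capacity has dropped
to 400 under pressure, a task with utilization 600 fits according to
fits_capacity_orig() and does not fit according to fits_capacity(). Using
one helper in one condition and the other helper in a later condition is
the kind of mixing pointed out above.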