From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B0B11C4708D for ; Fri, 6 Jan 2023 11:28:37 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231907AbjAFL2g (ORCPT ); Fri, 6 Jan 2023 06:28:36 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33138 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229476AbjAFL2d (ORCPT ); Fri, 6 Jan 2023 06:28:33 -0500 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id A642126C0 for ; Fri, 6 Jan 2023 03:28:31 -0800 (PST) Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 1772A11FB; Fri, 6 Jan 2023 03:29:13 -0800 (PST) Received: from [192.168.178.6] (unknown [172.31.20.19]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id B9FE53F23F; Fri, 6 Jan 2023 03:28:27 -0800 (PST) Message-ID: <1913041e-ee67-1e65-68fa-ef08b97ed9d5@arm.com> Date: Fri, 6 Jan 2023 12:28:26 +0100 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.4.2 Subject: Re: [RFC PATCH v4 1/2] sched/fair: Introduce short duration task check Content-Language: en-US To: Chen Yu Cc: Peter Zijlstra , Vincent Guittot , Tim Chen , Mel Gorman , Juri Lelli , Rik van Riel , Aaron Lu , Abel Wu , K Prateek Nayak , Yicong Yang , "Gautham R . Shenoy" , Ingo Molnar , Steven Rostedt , Ben Segall , Daniel Bristot de Oliveira , Valentin Schneider , Hillf Danton , Honglei Wang , Len Brown , Chen Yu , Tianchen Ding , Joel Fernandes , Josh Don , linux-kernel@vger.kernel.org References: <82689dd6-9db1-dd00-069b-73a637a21126@arm.com> From: Dietmar Eggemann In-Reply-To: Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 06/01/2023 09:34, Chen Yu wrote: > Hi Dietmar, > thanks for reviewing the patch! > On 2023-01-05 at 12:33:16 +0100, Dietmar Eggemann wrote: >> On 16/12/2022 07:11, Chen Yu wrote: >> >> [...] >> >>> @@ -5995,6 +6005,18 @@ enqueue_task_fair(struct rq *rq, struct task_struct *p, int flags) >>> >>> static void set_next_buddy(struct sched_entity *se); >>> >>> +static inline void dur_avg_update(struct task_struct *p, bool task_sleep) >>> +{ >>> + u64 dur; >>> + >>> + if (!task_sleep) >>> + return; >>> + >>> + dur = p->se.sum_exec_runtime - p->se.prev_sum_exec_runtime_vol; >>> + p->se.prev_sum_exec_runtime_vol = p->se.sum_exec_runtime; >> >> Shouldn't se->prev_sum_exec_runtime_vol be set in enqueue_task_fair() >> and not in dequeue_task_fair()->dur_avg_update()? Otherwise `dur` will >> contain sleep time. >> > After the task p is dequeued, p's sum_exec_runtime will not be increased. True. > Unless task p is switched in again, p's sum_exec_runtime will continue to > increase. So dur should not include the sleep time, because we substract Not sure I get this sentence? p's se->sum_exec_runtime will only increase if p is current, so running? > between the sum_exec_runtime rather than rq->clock_task. Not sure if I understand > this correctly? No, you're right. We're not dealing with time snapshots but rather with sum_exec_runtime snapshots. So the value will not change between dequeue and the next enqueue. e ... enqueue_task_fair() d ... dequeue_task_fair() s ... set_next_entity() p ... put_prev_entity() u ... update_curr_fair()->update_curr() p1: ---|---||--|--|---|--|--||--- d es u p s u pd ^ ^ | | (A) (B) Same se->prev_sum_exec_runtime_vol value in (A) and (B). > My original thought was that, record the average run time of every section: > Only consider that task voluntarily relinquishes the CPU. > For example, suppose on CPU1, task p1 and p2 run alternatively: > > --------------------> time > > | p1 runs 1ms | p2 preempt p1 | p1 switch in, runs 0.5ms and blocks | > ^ ^ ^ > |_____________| |_____________________________________| > ^ > | > p1 dequeued > > p1's duration in one section is (1 + 0.5)ms. Because if p2 does not > preempt p1, p1 can run 1.5ms. This reflects the nature of a task, > how long it wishes to run at most. > >> Like we do for se->prev_sum_exec_runtime in set_next_entity() but for >> one `set_next_entity()-put_prev_entity()` run section. >> >> AFAICS, you want to measure the exec_runtime sum over all run sections >> between enqueue and dequeue. > Yes, we tried to record the 'decayed' average exec_runtime for each section. > Say, task p runs for a ms , then p is dequeued and blocks for b ms, and then > runs for c ms, its average duration is 0.875 * a + 0.125 * c , which is > what update_avg() does. OK.