From: Kuyo Chang
To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, Matthias Brugger
Cc: kuyo chang
X-Mailing-List: linux-kernel@vger.kernel.org
Subject: [PATCH 1/1] sched/pelt: Refine the enqueue_load_avg calculation
Date: Mon, 11 Apr 2022 14:16:56 +0800
Message-ID: <20220411061702.22978-1-kuyo.chang@mediatek.com>

From: kuyo chang

I hit the following warning in cfs_rq_is_decayed():

	SCHED_WARN_ON(cfs_rq->avg.load_avg ||
		      cfs_rq->avg.util_avg ||
		      cfs_rq->avg.runnable_avg)

Call trace:
 __update_blocked_fair
 update_blocked_averages
 newidle_balance
 pick_next_task_fair
 __schedule
 schedule
 pipe_read
 vfs_read
 ksys_read

After analyzing the code and adding some debug messages, I found that a
corner case exists in attach_entity_load_avg() which leaves load_sum at
zero while load_avg is not.

Consider se_weight == 88761 (per the sched_prio_to_weight table), and
assume get_pelt_divider() returns 47742 and se->avg.load_avg is 1. The
following calculation of se->avg.load_sum then truncates to zero:

	se->avg.load_sum = div_u64(se->avg.load_avg * se->avg.load_sum,
				   se_weight(se));
	se->avg.load_sum = 1 * 47742 / 88761 = 0

enqueue_load_avg() then does:

	cfs_rq->avg.load_avg += se->avg.load_avg;
	cfs_rq->avg.load_sum += se_weight(se) * se->avg.load_sum;

so the cfs_rq ends up with load_avg == 1 while load_sum == 0, which
triggers the warning above.

Fix this by applying the same approach in enqueue_load_avg() as commit
"sched/pelt: Relax the sync of load_sum with load_avg": clamp
cfs_rq->avg.load_sum so that it never drops below
load_avg * PELT_MIN_DIVIDER.

After long-term testing, the kernel warning is gone and the system runs
as well as before.
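For illustration, the arithmetic above can be reproduced with a minimal
user-space sketch (not kernel code; PELT_MIN_DIVIDER is assumed to be
LOAD_AVG_MAX - 1024 as in kernel/sched/pelt.h, and plain integer
division stands in for div_u64()):

#include <stdio.h>
#include <stdint.h>

#define LOAD_AVG_MAX		47742
#define PELT_MIN_DIVIDER	(LOAD_AVG_MAX - 1024)

int main(void)
{
	const uint32_t se_weight = 88761;	/* from sched_prio_to_weight */

	/* attach_entity_load_avg(): div_u64(1 * 47742, 88761) truncates to 0 */
	uint64_t se_load_avg = 1;
	uint64_t se_load_sum = (se_load_avg * LOAD_AVG_MAX) / se_weight;

	/* old enqueue_load_avg(): cfs_rq gets load_avg == 1 but load_sum == 0 */
	uint64_t cfs_load_avg = 0 + se_load_avg;
	uint64_t cfs_load_sum = 0 + (uint64_t)se_weight * se_load_sum;

	printf("before fix: cfs_rq load_avg=%llu load_sum=%llu\n",
	       (unsigned long long)cfs_load_avg,
	       (unsigned long long)cfs_load_sum);

	/* patched enqueue_load_avg(): clamp load_sum to stay in sync with load_avg */
	if (cfs_load_sum < cfs_load_avg * PELT_MIN_DIVIDER)
		cfs_load_sum = cfs_load_avg * PELT_MIN_DIVIDER;

	printf("after fix:  cfs_rq load_avg=%llu load_sum=%llu\n",
	       (unsigned long long)cfs_load_avg,
	       (unsigned long long)cfs_load_sum);
	return 0;
}

With these numbers the clamp raises the cfs_rq load_sum from 0 to 46718,
so cfs_rq_is_decayed() can no longer observe load_avg != 0 with
load_sum == 0 on this path.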
Signed-off-by: kuyo chang
---
 kernel/sched/fair.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index d4bd299d67ab..30d8b6dba249 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3074,8 +3074,10 @@ account_entity_dequeue(struct cfs_rq *cfs_rq, struct sched_entity *se)
 static inline void
 enqueue_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *se)
 {
-	cfs_rq->avg.load_avg += se->avg.load_avg;
-	cfs_rq->avg.load_sum += se_weight(se) * se->avg.load_sum;
+	add_positive(&cfs_rq->avg.load_avg, se->avg.load_avg);
+	add_positive(&cfs_rq->avg.load_sum, se_weight(se) * se->avg.load_sum);
+	cfs_rq->avg.load_sum = max_t(u32, cfs_rq->avg.load_sum,
+				     cfs_rq->avg.load_avg * PELT_MIN_DIVIDER);
 }
 
 static inline void
-- 
2.18.0