From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=3.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI, SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5F912C10F00 for ; Fri, 6 Mar 2020 12:07:20 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 1D2D220848 for ; Fri, 6 Mar 2020 12:07:19 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="CqMIIU6O" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726490AbgCFMHS (ORCPT ); Fri, 6 Mar 2020 07:07:18 -0500 Received: from mail-lf1-f68.google.com ([209.85.167.68]:37733 "EHLO mail-lf1-f68.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726171AbgCFMHS (ORCPT ); Fri, 6 Mar 2020 07:07:18 -0500 Received: by mail-lf1-f68.google.com with SMTP id j11so1724602lfg.4 for ; Fri, 06 Mar 2020 04:07:16 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=4KPUPSTkugtYA0PbEeoVta4pOw/ai16ArIKK838lBEU=; b=CqMIIU6OgUcYfP+nO15DqV4Gn+28U4l0XKINHu7YR8cz06yd1dTfyQOySiqK8perVe q9cBlZ2hKQbb6Y6/ajVOTatYAlb8vOP2A+OXF4OW7VofwPUW8se8+6H3YntUcm9i03X0 dhmQuNmiKWxKSbFtmu3Dise3GGTlB0YOpqwp7veoFIANAJ/YXBAnSy0VfnbTKvbDT1un de/8n7Nsne1PBues3q5A7DLVutU3rmafGrzSpvI7sWUzRtt1FntXj7ANmZRArJrX4P6N v3sBy6cEI8TxFNCkLrLgmFRuuUUbivLim/IAEzmhkO9x3QDAIpTgcMcIbIzsGSY+zlbK L3Wg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=4KPUPSTkugtYA0PbEeoVta4pOw/ai16ArIKK838lBEU=; b=oHuE0nyiE3FsBrCAk8rrPvvWGYn8S+VwBIIWyGB69I11L63c2ddKZbWP/IQLP/1IAs HsPk6+0ejP8uo3ZyNv1kHq8zUlkZ8u/0+XeLmBHGcqIK4uef9+oAmQ4kVJbvp4Gj7k7k 957SktwsPqM84aMy/5qZgXO2pbE9fZYu+Jvjq8gFQHyqe0RDb4bDPluLizcVUwMK5Gv2 EaJi9sLy8Ng0PwoOjVSEjloUADRHMjElYT4Xlk+Bf1X2QNYAkY6ptW6CM6vgmZ40EA9N BuvCW5/fR4Xpsg05/pSivi58jAIXvW/zOzZpSjNOiJ6wcf/QENnQft7gHUWMUUT2M4c3 W63w== X-Gm-Message-State: ANhLgQ3TIiIilIkEXtztIhdiOzSEGQENBUTj46nbLjcycg263nFlrSGa ehFF9nYZA3kU1rcp0TyuwJATLIa6jV4j4RVvapgAsA== X-Google-Smtp-Source: ADFU+vvQ1aR+zpKusZ09dVvP7kf4bC66fgMS07Kbjp8L2S6t2eYS+496jT2rtl2R7T0gZJhh4p4P35QHGaUh8yls+jk= X-Received: by 2002:a19:c215:: with SMTP id l21mr1732735lfc.95.1583496435711; Fri, 06 Mar 2020 04:07:15 -0800 (PST) MIME-Version: 1.0 References: <20200305172921.22743-1-vincent.guittot@linaro.org> In-Reply-To: From: Vincent Guittot Date: Fri, 6 Mar 2020 13:07:04 +0100 Message-ID: Subject: Re: [PATCH] sched/fair: fix enqueue_task_fair warning To: Dietmar Eggemann Cc: Ingo Molnar , Peter Zijlstra , Juri Lelli , Steven Rostedt , Ben Segall , Mel Gorman , linux-kernel , Christian Borntraeger , "# v4 . 16+" Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, 6 Mar 2020 at 10:12, Vincent Guittot wrote: > > On Thu, 5 Mar 2020 at 20:07, Dietmar Eggemann wrote: > > > > On 05/03/2020 18:29, Vincent Guittot wrote: > > > When a cfs rq is throttled, the latter and its child are removed from the > > > leaf list but their nr_running is not changed which includes staying higher > > > than 1. When a task is enqueued in this throttled branch, the cfs rqs must > > > be added back in order to ensure correct ordering in the list but this can > > > only happens if nr_running == 1. > > > When cfs bandwidth is used, we call unconditionnaly list_add_leaf_cfs_rq() > > > when enqueuing an entity to make sure that the complete branch will be > > > added. > > > > > > Reported-by: Christian Borntraeger > > > Tested-by: Christian Borntraeger > > > Cc: stable@vger.kernel.org #v5.1+ > > > Signed-off-by: Vincent Guittot > > > --- > > > kernel/sched/fair.c | 11 +++++++++-- > > > 1 file changed, 9 insertions(+), 2 deletions(-) > > > > > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c > > > index fcc968669aea..bdc5bb72ab31 100644 > > > --- a/kernel/sched/fair.c > > > +++ b/kernel/sched/fair.c > > > @@ -4117,6 +4117,7 @@ static inline void check_schedstat_required(void) > > > #endif > > > } > > > > > > +static inline bool cfs_bandwidth_used(void); > > > > > > /* > > > * MIGRATION > > > @@ -4195,10 +4196,16 @@ enqueue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags) > > > __enqueue_entity(cfs_rq, se); > > > se->on_rq = 1; > > > > > > - if (cfs_rq->nr_running == 1) { > > > + /* > > > + * When bandwidth control is enabled, cfs might have been removed because of > > > + * a parent been throttled but cfs->nr_running > 1. Try to add it > > > + * unconditionnally. > > > + */ > > > + if (cfs_rq->nr_running == 1 || cfs_bandwidth_used()) > > > list_add_leaf_cfs_rq(cfs_rq); > > > + > > > + if (cfs_rq->nr_running == 1) > > > check_enqueue_throttle(cfs_rq); > > > - } > > > } > > > > > > static void __clear_buddies_last(struct sched_entity *se) > > > > I experimented with an rt-app based setup on Arm64 Juno (6 CPUs): > > > > cgroupv1 hierarchy A/B/C, all CFS bw controlled (30,000/100,000) > > > > I create A/B/C outside rt-app so I can have rt-app runs with an already > > existing taskgroup hierarchy. There is a 4 secs gap between consecutive > > rt-app runs. > > > > The rt-app files contains 6 periodic CFS tasks (25,000/100,000) running > > in /A/B/C, /A/B, /A (3 rt-app task phases). > > > > I get w/ the patch (and the debug patch applied to unthrottle_cfs_rq()): > > > > root@juno:~# > > [ 409.236925] CPU1 path=/A/B on_list=1 nr_running=1 throttled=1 > > [ 409.242682] CPU1 path=/A on_list=0 nr_running=0 throttled=1 > > [ 409.248260] CPU1 path=/ on_list=1 nr_running=0 throttled=0 > > [ 409.253748] ------------[ cut here ]------------ > > [ 409.258365] rq->tmp_alone_branch != &rq->leaf_cfs_rq_list > > [ 409.258382] WARNING: CPU: 1 PID: 0 at kernel/sched/fair.c:380 > > unthrottle_cfs_rq+0x21c/0x2a8 > > ... > > [ 409.275196] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.6.0-rc3-dirty #62 > > [ 409.281990] Hardware name: ARM Juno development board (r0) (DT) > > ... > > [ 409.384644] Call trace: > > [ 409.387089] unthrottle_cfs_rq+0x21c/0x2a8 > > [ 409.391188] distribute_cfs_runtime+0xf4/0x198 > > [ 409.395634] sched_cfs_period_timer+0x134/0x240 > > [ 409.400168] __hrtimer_run_queues+0x10c/0x3c0 > > [ 409.404527] hrtimer_interrupt+0xd4/0x250 > > [ 409.408539] tick_handle_oneshot_broadcast+0x17c/0x208 > > [ 409.413683] sp804_timer_interrupt+0x30/0x40 > > > > If I add the following snippet the issue goes away: If it's fine for you, I'm going to add this in a new version of the patch > > > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c > > index e9fd5379bb7e..5e03be046aba 100644 > > --- a/kernel/sched/fair.c > > +++ b/kernel/sched/fair.c > > @@ -4627,11 +4627,17 @@ void unthrottle_cfs_rq(struct cfs_rq *cfs_rq) > > break; > > } > > > > - assert_list_leaf_cfs_rq(rq); > > - > > if (!se) > > add_nr_running(rq, task_delta); > > will add similar comment as for enqueue_task_fair + /* + * The cfs_rq_throttled() breaks in the above iteration can result in + * incomplete leaf list maintenance, resulting in triggering the assertion + * below. + */ > > + for_each_sched_entity(se) { > > + cfs_rq = cfs_rq_of(se); > > + > > + list_add_leaf_cfs_rq(cfs_rq); > > + } > > Yes make sense. > > > + > > + assert_list_leaf_cfs_rq(rq); > > + > > /* Determine whether we need to wake up potentially idle CPU: */ > > if (rq->curr == rq->idle && rq->cfs.nr_running) > > resched_curr(rq);