From: Vincent Guittot
Date: Fri, 19 Aug 2022 14:35:37 +0200
Subject: Re: [PATCH v2] sched/fair: Introduce priority load balance to reduce interference from IDLE tasks
To: "zhangsong (J)"
Cc: Abel Wu, mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com,
    dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com,
    mgorman@suse.de, bristot@redhat.com, vschneid@redhat.com,
    linux-kernel@vger.kernel.org, kernel test robot
In-Reply-To: <798411ac-6edb-d22c-5378-297268e77b1a@huawei.com>

Hi Zhang,

On Fri, 19 Aug 2022 at 12:54, zhangsong (J) wrote:
>
>
> On 2022/8/18 16:31, Vincent Guittot wrote:
> > On Thursday 18 August 2022 at 10:46:55 (+0800), Abel Wu wrote:
> >> On 8/17/22 8:58 PM, Vincent Guittot wrote:
> >>> On Tue, 16 Aug 2022 at 04:53, zhangsong (J) wrote:
> >>>>
> > ...
> >
> >>>> Yes, this is usually a corner case, but suppose that some non-idle
> >>>> tasks are bound to CPUs 1-2 and idle tasks are bound to CPUs 0-1, so
> >>>> CPU 1 may have many idle tasks and some non-idle tasks, while the idle
> >>>> tasks on CPU 1 cannot be pulled to CPU 2. When load balance is
> >>>> triggered and CPU 2 should pull some tasks from CPU 1, the bad result
> >>>> is that the idle tasks of CPU 1 cannot be migrated, and the non-idle
> >>>> tasks cannot be migrated either because of the env->loop_max
> >>>> constraint.
> >>> env->loop_max adds a break but load_balance will continue with next
> >>> tasks so it also tries to pull your non idle task at the end after
> >>> several breaks.
> >> Loop will be terminated without LBF_NEED_BREAK if exceeds loop_max :)
> > Argh yes, my brain is not yet back from vacation
> > I have been confused by loop_max and loop_break being set to the same
> > value 32
> >
> > Zhang Song, Could you try the patch below ? If it works, I will prepare a
> > clean patch with all tags
> >
> >
> >
> > sched/fair: make sure to try to detach at least one movable task
> >
> > During load balance we try at most env->loop_max times to move a task. But
> > it can happen that the LRU tasks (ie tail of the cfs_tasks list) can't
> > be moved to dst_cpu because of affinity. In this case, loop in the list
> > until we find at least one.
> >
> > Signed-off-by: Vincent Guittot
> > ---
> >  kernel/sched/fair.c | 12 +++++++++---
> >  1 file changed, 9 insertions(+), 3 deletions(-)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index da388657d5ac..02b7b808e186 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -8052,8 +8052,12 @@ static int detach_tasks(struct lb_env *env)
> >                 p = list_last_entry(tasks, struct task_struct, se.group_node);
> >
> >                 env->loop++;
> > -               /* We've more or less seen every task there is, call it quits */
> > -               if (env->loop > env->loop_max)
> > +               /*
> > +                * We've more or less seen every task there is, call it quits
> > +                * unless we haven't found any movable task yet.
> > +                */
> > +               if (env->loop > env->loop_max &&
> > +                   !(env->flags & LBF_ALL_PINNED))
> >                         break;
> >
> >                 /* take a breather every nr_migrate tasks */
> > @@ -10182,7 +10186,9 @@ static int load_balance(int this_cpu, struct rq *this_rq,
> >
> >                 if (env.flags & LBF_NEED_BREAK) {
> >                         env.flags &= ~LBF_NEED_BREAK;
> > -                       goto more_balance;
> > +                       /* Stop if we tried all running tasks */
> > +                       if (env.loop < busiest->nr_running)
> > +                               goto more_balance;
> >                 }
> >
> >                 /*
> > --
> > 2.17.1
>
> Thanks for your reply.
> I have tried your patch and run a test to compare with it; it seems that
> the patch you provided has no effect.
> The test result is below (1000 idle tasks bound to CPUs 0-1 and 10 normal
> tasks bound to CPUs 1-2):
>
> =================================================================
>
> Without patch:
>
>
>           6,777.37 msec cpu-clock           #    1.355 CPUs utilized
>             20,812      context-switches    #    0.003 M/sec
>                  0      cpu-migrations      #    0.000 K/sec
>                  0      page-faults         #    0.000 K/sec
>     13,333,983,148      cycles              #    1.967 GHz
>      6,457,930,305      instructions        #    0.48  insn per cycle
>      2,125,644,649      branches            #  313.639 M/sec
>          1,690,587      branch-misses       #    0.08% of all branches
>
>        5.001931983 seconds time elapsed
>
> With your patch:
>
>
>           6,791.46 msec cpu-clock           #    1.358 CPUs utilized
>             20,996      context-switches    #    0.003 M/sec
>                  0      cpu-migrations      #    0.000 K/sec
>                  0      page-faults         #    0.000 K/sec
>     13,467,573,052      cycles              #    1.983 GHz
>      6,516,989,062      instructions        #    0.48  insn per cycle
>      2,145,139,220      branches            #  315.858 M/sec
>          1,751,454      branch-misses       #    0.08% of all branches
>
>        5.002274267 seconds time elapsed
>
> With my patch:
>
>
>           7,495.14 msec cpu-clock           #    1.499 CPUs utilized
>             23,176      context-switches    #    0.003 M/sec
>                309      cpu-migrations      #    0.041 K/sec
>                  0      page-faults         #    0.000 K/sec
>     14,849,083,489      cycles              #    1.981 GHz
>      7,180,832,268      instructions        #    0.48  insn per cycle
>      2,363,300,644      branches            #  315.311 M/sec
>          1,964,169      branch-misses       #    0.08% of all branches
>
>        5.001713352 seconds time elapsed
> =================================================================
>
> Obviously, when your patch is applied, the cpu-migrations of normal
> tasks is still 0 and the CPU utilization of normal tasks shows no
> improvement compared with no patch applied.
> When my patch is applied, both the cpu-migrations and the CPU utilization
> of normal tasks improve.
> I cannot explain the result with your patch; you can also test it
> yourself.

Do you have more details about the test that you are running ? Do
cpu0-2 share their cache ? Which kind of tasks are the normal and idle
tasks ? always running tasks ?

I'm going to try to reproduce your problem locally

Regards,
Vincent

>
> Best,
> Zhang Song
>
> >
> >>>> This prevents non-idle tasks from achieving more CPU utilization.
> >>> Your problem is not linked to IDLE vs NORMAL tasks but to the large
> >>> number of pinned tasks that can't migrate on CPU2. You can end with
> >>> the same behavior without using IDLE tasks but only NORMAL tasks.
> >> I feel the same thing.
> >>
> >> Best,
> >> Abel
> > .
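
For reference while reproducing: the break condition discussed in the
quoted patch can be pictured with a small user-space toy model. This is
only a sketch under stated assumptions, not the code in
kernel/sched/fair.c. The names scan_tasks(), nr_pinned_first and patched
are hypothetical, the all_pinned variable merely stands in for
LBF_ALL_PINNED, the pinned tasks are assumed to all sit at the tail of the
list (so they are scanned first), and the LBF_NEED_BREAK / loop_break
handling and the imbalance accounting are left out entirely.

```c
#include <assert.h>
#include <stdbool.h>

/* Result of one simulated scan: tasks looked at and movable tasks found. */
struct scan_result {
	int loops;
	int detached;
};

/*
 * Toy model of the scan loop, NOT the kernel implementation.
 * The first nr_pinned_first tasks scanned cannot run on the destination
 * CPU (assumption: they occupy the list tail); the next nr_movable can.
 * patched == false models "break once loops > loop_max";
 * patched == true additionally requires that a movable task was seen.
 */
struct scan_result scan_tasks(int nr_pinned_first, int nr_movable,
			      int loop_max, bool patched)
{
	struct scan_result res = { 0, 0 };
	bool all_pinned = true;	/* stands in for LBF_ALL_PINNED */
	int nr_tasks = nr_pinned_first + nr_movable;

	assert(nr_pinned_first >= 0 && nr_movable >= 0 && loop_max >= 0);

	for (int i = 0; i < nr_tasks; i++) {
		res.loops++;
		if (res.loops > loop_max && !(patched && all_pinned))
			break;
		if (i < nr_pinned_first)
			continue;	/* pinned: cannot move, try the next one */
		all_pinned = false;
		res.detached++;
	}
	return res;
}
```

With 1000 pinned tasks ahead of 10 movable ones and loop_max set to 32,
the unpatched condition gives up after 33 iterations without finding
anything, while the patched condition keeps going until it reaches the
first movable task. The model only shows the intent of the changed
condition; it says nothing about why the measurements above show no
difference on the real system.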