Subject: Re: [RFC PATCH v6 3/4] scheduler: scan idle cpu in cluster for tasks within one LLC
To: Barry Song, tim.c.chen@linux.intel.com, catalin.marinas@arm.com,
    will@kernel.org, rjw@rjwysocki.net, vincent.guittot@linaro.org,
    bp@alien8.de, tglx@linutronix.de, mingo@redhat.com, lenb@kernel.org,
    peterz@infradead.org, rostedt@goodmis.org, bsegall@google.com,
    mgorman@suse.de
Cc: msys.mizuma@gmail.com, valentin.schneider@arm.com,
    gregkh@linuxfoundation.org, jonathan.cameron@huawei.com,
    juri.lelli@redhat.com, mark.rutland@arm.com, sudeep.holla@arm.com,
    aubrey.li@linux.intel.com, linux-arm-kernel@lists.infradead.org,
    linux-kernel@vger.kernel.org, linux-acpi@vger.kernel.org,
    x86@kernel.org, xuwei5@huawei.com, prime.zeng@hisilicon.com,
    guodong.xu@linaro.org, yangyicong@huawei.com, liguozhu@hisilicon.com,
    linuxarm@openeuler.org, hpa@zytor.com
References: <20210420001844.9116-1-song.bao.hua@hisilicon.com>
    <20210420001844.9116-4-song.bao.hua@hisilicon.com>
From: Dietmar Eggemann
Message-ID: <80f489f9-8c88-95d8-8241-f0cfd2c2ac66@arm.com>
Date: Tue, 27 Apr 2021 13:35:35 +0200
In-Reply-To: <20210420001844.9116-4-song.bao.hua@hisilicon.com>

On 20/04/2021 02:18, Barry Song wrote:

[...]

> @@ -5786,11 +5786,12 @@ static void record_wakee(struct task_struct *p)
>   * whatever is irrelevant, spread criteria is apparent partner count exceeds
>   * socket size.
>   */
> -static int wake_wide(struct task_struct *p)
> +static int wake_wide(struct task_struct *p, int cluster)
>  {
>  	unsigned int master = current->wakee_flips;
>  	unsigned int slave = p->wakee_flips;
> -	int factor = __this_cpu_read(sd_llc_size);
> +	int factor = cluster ? __this_cpu_read(sd_cluster_size) :
> +		__this_cpu_read(sd_llc_size);

I don't see that the wake_wide() change has any effect here. None of the
sched domains has SD_BALANCE_WAKE set, so a wakeup (WF_TTWU) can never end
up in the slow path.

Have you seen any difference in what wake_wide() returns for your
`lmbench stream` workload when `sd_cluster_size` is used as the factor
instead of `sd_llc_size`?

I guess for you, wakeups are now subdivided into faster (cluster = 4 CPUs)
and fast (llc = 24 CPUs) via sis(), not into fast (sis()) and slow
(find_idlest_cpu()).

>
>  	if (master < slave)
>  		swap(master, slave);

[...]

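To make the question about the factor concrete, here is a small user-space
model of wake_wide(). It is only a sketch: the final comparison is not part
of the quoted hunk and is reproduced from mainline kernel/sched/fair.c as I
understand it, and the names wake_wide_model, waker_flips and wakee_flips
are made up for illustration. `factor` stands in for either sd_cluster_size
or sd_llc_size.

```c
#include <stdbool.h>

/*
 * User-space model of wake_wide(); this is not kernel code.
 * waker_flips / wakee_flips stand in for current->wakee_flips and
 * p->wakee_flips. factor stands in for sd_cluster_size or sd_llc_size.
 */
static bool wake_wide_model(unsigned int waker_flips,
			    unsigned int wakee_flips,
			    unsigned int factor)
{
	unsigned int master = waker_flips;
	unsigned int slave = wakee_flips;

	/* Keep the larger flip count in master, as swap() does in the hunk. */
	if (master < slave) {
		unsigned int tmp = master;

		master = slave;
		slave = tmp;
	}

	/*
	 * Assumed tail of the function (not shown in the quoted diff):
	 * report "wide" only if both flip counts are large relative to factor.
	 */
	if (slave < factor || master < slave * factor)
		return false;

	return true;
}
```

With flip counts of 100 and 5, the model returns true for a factor of 4
(cluster) and false for a factor of 24 (LLC), so a cluster-sized factor
makes a "wide" verdict more likely. Whether that changes task placement on
a system without SD_BALANCE_WAKE is exactly the open question above.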
> @@ -6745,6 +6748,12 @@ static int find_energy_efficient_cpu(struct task_struct *p, int prev_cpu)
>  	int want_affine = 0;
>  	/* SD_flags and WF_flags share the first nibble */
>  	int sd_flag = wake_flags & 0xF;
> +	/*
> +	 * if cpu and prev_cpu share LLC, consider cluster sibling rather
> +	 * than llc. this is typically true while tasks are bound within
> +	 * one numa
> +	 */
> +	int cluster = sched_cluster_active() && cpus_share_cache(cpu, prev_cpu, 0);

So you changed from scanning the cluster before the LLC to scanning either
the cluster or the LLC, based on whether `this_cpu` and `prev_cpu` share an
LLC. That means you only see an effect when running the workload with
`numactl -N X ...`.

>
>  	if (wake_flags & WF_TTWU) {
>  		record_wakee(p);
> @@ -6756,7 +6765,7 @@ static int find_energy_efficient_cpu(struct task_struct *p, int prev_cpu)
>  			new_cpu = prev_cpu;
>  		}
>
> -		want_affine = !wake_wide(p) && cpumask_test_cpu(cpu, p->cpus_ptr);
> +		want_affine = !wake_wide(p, cluster) && cpumask_test_cpu(cpu, p->cpus_ptr);
>  	}
>
>  	rcu_read_lock();
> @@ -6768,7 +6777,7 @@ static int find_energy_efficient_cpu(struct task_struct *p, int prev_cpu)
>  		if (want_affine && (tmp->flags & SD_WAKE_AFFINE) &&
>  		    cpumask_test_cpu(prev_cpu, sched_domain_span(tmp))) {
>  			if (cpu != prev_cpu)
> -				new_cpu = wake_affine(tmp, p, cpu, prev_cpu, sync);
> +				new_cpu = wake_affine(tmp, p, cpu, prev_cpu, sync, cluster);
>
>  			sd = NULL; /* Prefer wake_affine over balance flags */
>  			break;
> @@ -6785,7 +6794,7 @@ static int find_energy_efficient_cpu(struct task_struct *p, int prev_cpu)
>  		new_cpu = find_idlest_cpu(sd, p, cpu, prev_cpu, sd_flag);
>  	} else if (wake_flags & WF_TTWU) { /* XXX always ? */
>  		/* Fast path */
> -		new_cpu = select_idle_sibling(p, prev_cpu, new_cpu);
> +		new_cpu = select_idle_sibling(p, prev_cpu, new_cpu, cluster);
>
>  		if (want_affine)
>  			current->recent_used_cpu = cpu;

[...]

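The "either cluster or LLC" reading of the new `cluster` variable can be
sketched in user space as follows. This is a toy model under stated
assumptions, not the patch's implementation: it takes the 4-CPU cluster and
24-CPU LLC sizes mentioned earlier in this reply, assumes CPUs are numbered
contiguously within an LLC, and assumes the third argument 0 of
cpus_share_cache() selects the LLC level. The helpers llc_id, share_llc,
use_cluster and scan_width are hypothetical names.

```c
#include <stdbool.h>

/* Assumed topology: 4 CPUs per cluster, 24 CPUs per LLC. */
enum { CPUS_PER_CLUSTER = 4, CPUS_PER_LLC = 24 };

/* Hypothetical helper: assumes contiguous CPU numbering within an LLC. */
static int llc_id(int cpu)
{
	return cpu / CPUS_PER_LLC;
}

/* Stand-in for cpus_share_cache(cpu, prev_cpu, 0), read as "same LLC". */
static bool share_llc(int cpu, int prev_cpu)
{
	return llc_id(cpu) == llc_id(prev_cpu);
}

/*
 * Models the quoted line:
 *   cluster = sched_cluster_active() && cpus_share_cache(cpu, prev_cpu, 0);
 */
static bool use_cluster(bool cluster_active, int cpu, int prev_cpu)
{
	return cluster_active && share_llc(cpu, prev_cpu);
}

/* Number of CPUs the idle scan would cover under this reading. */
static int scan_width(bool cluster_active, int cpu, int prev_cpu)
{
	return use_cluster(cluster_active, cpu, prev_cpu) ?
		CPUS_PER_CLUSTER : CPUS_PER_LLC;
}
```

In this model the narrower cluster scan is chosen only when the waking CPU
and prev_cpu sit in the same LLC, which is consistent with the effect
showing up mainly when the workload is bound with `numactl -N X ...`.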