From: "Song Bao Hua (Barry Song)"
To: Dietmar Eggemann, Vincent Guittot
CC: tim.c.chen@linux.intel.com, catalin.marinas@arm.com, will@kernel.org,
	rjw@rjwysocki.net, bp@alien8.de, tglx@linutronix.de, mingo@redhat.com,
	lenb@kernel.org, peterz@infradead.org,
"rostedt@goodmis.org" , "bsegall@google.com" , "mgorman@suse.de" , "msys.mizuma@gmail.com" , "valentin.schneider@arm.com" , "gregkh@linuxfoundation.org" , Jonathan Cameron , "juri.lelli@redhat.com" , "mark.rutland@arm.com" , "sudeep.holla@arm.com" , "aubrey.li@linux.intel.com" , "linux-arm-kernel@lists.infradead.org" , "linux-kernel@vger.kernel.org" , "linux-acpi@vger.kernel.org" , "x86@kernel.org" , "xuwei (O)" , "Zengtao (B)" , "guodong.xu@linaro.org" , yangyicong , "Liguozhu (Kenneth)" , "linuxarm@openeuler.org" , "hpa@zytor.com" Subject: RE: [RFC PATCH v6 3/4] scheduler: scan idle cpu in cluster for tasks within one LLC Thread-Topic: [RFC PATCH v6 3/4] scheduler: scan idle cpu in cluster for tasks within one LLC Thread-Index: AQHXPa2htgkN1X7dCEatlQZ1LQ756arRBrBQgACVCdCAAreDAIADr3lwgAjj3ACAExY5IA== Date: Tue, 25 May 2021 08:14:45 +0000 Message-ID: References: <20210420001844.9116-1-song.bao.hua@hisilicon.com> <20210420001844.9116-4-song.bao.hua@hisilicon.com> <80f489f9-8c88-95d8-8241-f0cfd2c2ac66@arm.com> <8b5277d9-e367-566d-6bd1-44ac78d21d3f@arm.com> <185746c4d02a485ca8f3509439328b26@hisilicon.com> <4d1f063504b1420c9f836d1f1a7f8e77@hisilicon.com> <142c7192-cde8-6dbe-bb9d-f0fce21ec959@arm.com> <45cce983-79ca-392a-f590-9168da7aefab@arm.com> In-Reply-To: <45cce983-79ca-392a-f590-9168da7aefab@arm.com> Accept-Language: en-GB, en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [10.126.201.248] MIME-Version: 1.0 X-CFilter-Loop: Reflected X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20210525_011502_989421_FF84C8A2 X-CRM114-Status: GOOD ( 31.71 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org > -----Original Message----- > From: Dietmar Eggemann [mailto:dietmar.eggemann@arm.com] > Sent: Friday, May 14, 2021 12:32 AM > To: Song Bao Hua (Barry Song) ; Vincent Guittot > > Cc: tim.c.chen@linux.intel.com; catalin.marinas@arm.com; will@kernel.org; > rjw@rjwysocki.net; bp@alien8.de; tglx@linutronix.de; mingo@redhat.com; > lenb@kernel.org; peterz@infradead.org; rostedt@goodmis.org; > bsegall@google.com; mgorman@suse.de; msys.mizuma@gmail.com; > valentin.schneider@arm.com; gregkh@linuxfoundation.org; Jonathan Cameron > ; juri.lelli@redhat.com; mark.rutland@arm.com; > sudeep.holla@arm.com; aubrey.li@linux.intel.com; > linux-arm-kernel@lists.infradead.org; linux-kernel@vger.kernel.org; > linux-acpi@vger.kernel.org; x86@kernel.org; xuwei (O) ; > Zengtao (B) ; guodong.xu@linaro.org; yangyicong > ; Liguozhu (Kenneth) ; > linuxarm@openeuler.org; hpa@zytor.com > Subject: Re: [RFC PATCH v6 3/4] scheduler: scan idle cpu in cluster for tasks > within one LLC > > On 07/05/2021 15:07, Song Bao Hua (Barry Song) wrote: > > > > > >> -----Original Message----- > >> From: Dietmar Eggemann [mailto:dietmar.eggemann@arm.com] > > [...] > > >> On 03/05/2021 13:35, Song Bao Hua (Barry Song) wrote: > >> > >> [...] > >> > >>>> From: Song Bao Hua (Barry Song) > >> > >> [...] > >> > >>>>> From: Dietmar Eggemann [mailto:dietmar.eggemann@arm.com] > >> > >> [...] 
> >>> From the commit log, the commit made the biggest improvement when
> >>> clients = 2*cpus. In my case, that should be clients=48 for a machine
> >>> whose LLC size is 24.
> >>>
> >>> In Linux, I created a 240MB database and ran "pgbench -c 48 -S -T 20
> >>> pgbench" under two different scenarios:
> >>> 1. page cache always hit, so no real I/O for database reads
> >>> 2. echo 3 > /proc/sys/vm/drop_caches
> >>>
> >>> For case 1, using cluster_size and using llc_size result in a similar
> >>> tps = ~108000, with all 24 cpus at 100% utilization.
> >>>
> >>> For case 2, using llc_size still shows better performance.
> >>>
> >>> tps for each test round (cluster size as factor in wake_wide):
> >>> 1398.450887 1275.020401 1632.542437 1412.241627 1611.095692
> >>> 1381.354294 1539.877146
> >>> avg tps = 1464
> >>>
> >>> tps for each test round (llc size as factor in wake_wide):
> >>> 1718.402983 1443.169823 1502.353823 1607.415861 1597.396924
> >>> 1745.651814 1876.802168
> >>> avg tps = 1641 (+12%)
> >>>
> >>> So it seems using cluster_size as the factor in "slave >= factor &&
> >>> master >= slave * factor" isn't a good choice for my machine, at
> >>> least.
> >>
> >> So SD size = 4 (instead of 24) seems to be too small for `-c 48`.
> >>
> >> Just curious, have you seen any benefit from using wake_wide() with SD
> >> size = 24 (LLC) compared to not using it at all?
> >
> > At least in the benchmark I ran today, I have not seen any benefit from
> > using llc_size. Always returning 0 in wake_wide() seems to be much
> > better.
> >
> > postgres@ubuntu:$ pgbench -i pgbench
> > postgres@pgbench:$ pgbench -T 120 -c 48 pgbench
> >
> > using llc_size, it got to 123 tps
> > always returning 0 in wake_wide(), it got to 158 tps
> >
> > Actually, I really couldn't reproduce the performance improvement the
> > commit "sched: Implement smarter wake-affine logic" mentioned. On the
> > other hand, the commit log didn't present the pgbench command
> > parameters used. I guess the benchmark result will highly depend on
> > the command parameters and disk I/O speed.
>
> I see. And it was a way smaller machine (12 CPUs) back then.
>
> You could run pgbench via mmtests https://github.com/gormanm/mmtests,
> i.e. the `timed-ro-medium` test:
>
> mmtests# ./run-mmtests.sh --config
> ./configs/config-db-pgbench-timed-ro-medium test_tag
>
> /shellpacks/shellpack-bench-pgbench contains all the individual test
> steps. Something you could use as a template for your standalone
> pgbench tests as well.
>
> I ran this test on an Intel Xeon E5-2690 v2 with 40 CPUs and 64GB of
> memory on v5.12 vanilla and w/o wakewide.
> The test uses `scale_factor = 2570` on this machine. I guess this
> relates to ~41GB? At least this was the size of the
> #mmtests/work/testdisk/data/pgdata directory when the test started.

Thanks, Dietmar. Sorry for the slow response; I was on sick leave for
the whole of last week.

I feel it makes much more sense to use mmtests, which sets scale_factor
according to the total memory size and thus takes the impact of the
page cache into account. It also warms the database up for 30 minutes.

I will get more data and compare three cases (see the sketch below):
1. use the cluster size as the wake_wide() factor
2. use the llc size as the wake_wide() factor
3. always return 0 in wake_wide()

and post the results afterwards.
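(To make the three cases concrete, this is wake_wide() as it stands in
v5.12 kernel/sched/fair.c, shown here only as a reference sketch; the
comments mark the one spot the experiments change:)

static int wake_wide(struct task_struct *p)
{
	unsigned int master = current->wakee_flips;
	unsigned int slave = p->wakee_flips;
	/*
	 * Case 2 is this mainline v5.12 line: factor is the LLC size.
	 * Case 1 would plug in the (smaller) cluster size here instead.
	 * Case 3 would simply "return 0;" at the top of the function.
	 */
	int factor = __this_cpu_read(sd_llc_size);

	if (master < slave)
		swap(master, slave);
	/* 1:n waker/wakee detected: wake wide, i.e. skip wake_affine */
	if (slave < factor || master < slave * factor)
		return 0;
	return 1;
}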
> mmtests/work/log# ../../compare-kernels.sh --baseline base --compare
> wo_wakewide | grep ^Hmean
>
> #clients     v5.12 vanilla       v5.12 w/o wakewide
>
> Hmean  1    10903.88 (  0.00%)   10792.59 * -1.02%*
> Hmean  6    28480.60 (  0.00%)   27954.97 * -1.85%*
> Hmean 12    49197.55 (  0.00%)   47758.16 * -2.93%*
> Hmean 22    72902.37 (  0.00%)   71314.01 * -2.18%*
> Hmean 30    75468.16 (  0.00%)   75929.17 *  0.61%*
> Hmean 48    60155.58 (  0.00%)   60471.91 *  0.53%*
> Hmean 80    62202.38 (  0.00%)   60814.76 * -2.23%*
>
> So there are some improvements w/ wakewide, but nothing of the scale
> shown in the original wakewide patch.
>
> I'm not an expert on how to set up these pgbench tests, though. So
> maybe other pgbench-related mmtests configs or some more fine-grained
> tuning can produce bigger diffs?

Thanks
Barry