Date: Thu, 3 Feb 2022 11:09:57 +0100
From: Michal Hocko
To: Waiman Long, Sebastian Andrzej Siewior
Cc: cgroups@vger.kernel.org, linux-mm@kvack.org, Andrew Morton,
    Johannes Weiner, Michal Koutný, Peter Zijlstra, Thomas Gleixner,
    Vladimir Davydov
Subject: Re: [PATCH 3/4] mm/memcg: Add a local_lock_t for IRQ and TASK object.
References: <20220125164337.2071854-1-bigeasy@linutronix.de>
    <20220125164337.2071854-4-bigeasy@linutronix.de>

On Thu 03-02-22 10:54:07, Sebastian Andrzej Siewior wrote:
> On 2022-02-01 16:29:35 [+0100], Michal Hocko wrote:
> > > > Sorry, I know that this all is not really related to your work but if
> > > > the original optimization is solely based on artificial benchmarks then
> > > > I would rather drop it and also make your RT patchset easier.
> > >
> > > Do you have any real-world benchmark in mind? Like something that is
> > > already used for testing/ benchmarking and would fit here?
> >
> > Anything that even remotely resembles a real allocation heavy workload.
>
> So I figured out that build the kernel as user triggers the allocation
> path in_task() and in_interrupt(). I booted a PREEMPT_NONE kernel and
> run "perf stat -r 5 b.sh" where b.sh unpacks a kernel and runs a
> allmodconfig build on /dev/shm. The slow disk should not be a problem.
>
> With the optimisation:
> |  Performance counter stats for './b.sh' (5 runs):
> |
> |        43.367.405,59 msec task-clock          # 30,901 CPUs utilized          ( +- 0,01% )
> |            7.393.238 context-switches         # 170,499 /sec                  ( +- 0,13% )
> |              832.364 cpu-migrations           # 19,196 /sec                   ( +- 0,15% )
> |          625.235.644 page-faults              # 14,419 K/sec                  ( +- 0,00% )
> |  103.822.081.026.160 cycles                   # 2,394 GHz                     ( +- 0,01% )
> |   75.392.684.840.822 stalled-cycles-frontend  # 72,63% frontend cycles idle   ( +- 0,02% )
> |   54.971.177.787.990 stalled-cycles-backend   # 52,95% backend cycles idle    ( +- 0,02% )
> |   69.543.893.308.966 instructions             # 0,67 insn per cycle
> |                                               # 1,08 stalled cycles per insn  ( +- 0,00% )
> |   14.585.269.354.314 branches                 # 336,357 M/sec                 ( +- 0,00% )
> |      558.029.270.966 branch-misses            # 3,83% of all branches         ( +- 0,01% )
> |
> |         1403,441 +- 0,466 seconds time elapsed  ( +- 0,03% )
>
>
> With the optimisation disabled:
> |  Performance counter stats for './b.sh' (5 runs):
> |
> |        43.354.742,31 msec task-clock          # 30,869 CPUs utilized          ( +- 0,01% )
> |            7.394.210 context-switches         # 170,601 /sec                  ( +- 0,06% )
> |              842.835 cpu-migrations           # 19,446 /sec                   ( +- 0,63% )
> |          625.242.341 page-faults              # 14,426 K/sec                  ( +- 0,00% )
> |  103.791.714.272.978 cycles                   # 2,395 GHz                     ( +- 0,01% )
> |   75.369.652.256.425 stalled-cycles-frontend  # 72,64% frontend cycles idle   ( +- 0,01% )
> |   54.947.610.706.450 stalled-cycles-backend   # 52,96% backend cycles idle    ( +- 0,01% )
> |   69.529.388.440.691 instructions             # 0,67 insn per cycle
> |                                               # 1,08 stalled cycles per insn  ( +- 0,01% )
> |   14.584.515.016.870 branches                 # 336,497 M/sec                 ( +- 0,00% )
> |      557.716.885.609 branch-misses            # 3,82% of all branches         ( +- 0,02% )
> |
> |          1404,47 +- 1,05 seconds time elapsed  ( +- 0,08% )
>
> I'm still open to a more specific test ;)

Thanks for this test. I do assume that both have been run inside a
non-root memcg.

Waiman, what was the original motivation for 559271146efc0? As this RT
patch shows, it makes future changes much more complex, and I would
prefer simpler and easier to maintain code over micro-optimizations
that do not have any visible effect on real workloads.

-- 
Michal Hocko
SUSE Labs
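
For reference, the two elapsed-time figures quoted above can be compared with a few lines of arithmetic. This is only a rough sketch: the helper name compare_runs is made up for illustration, and it assumes the "+-" values printed by perf stat can be treated as independent uncertainties and combined in quadrature, which is a simplification for five runs each.

```python
import math


def compare_runs(mean_a, err_a, mean_b, err_b):
    """Compare two measurements given as (mean, +- error).

    Returns the absolute difference (b - a), the difference relative to a
    in percent, the combined error (assumption: errors are independent and
    add in quadrature), and the difference in units of that combined error.
    """
    diff = mean_b - mean_a
    rel_pct = 100.0 * diff / mean_a
    combined = math.hypot(err_a, err_b)
    return diff, rel_pct, combined, abs(diff) / combined


# "seconds time elapsed" values copied from the quoted perf stat output.
with_opt = (1403.441, 0.466)
without_opt = (1404.47, 1.05)

diff, rel_pct, combined, ratio = compare_runs(*with_opt, *without_opt)
print(f"difference:     {diff:.3f} s ({rel_pct:.3f} %)")
print(f"combined error: {combined:.3f} s")
print(f"difference / combined error: {ratio:.2f}")
```

Under that assumption the gap between the two builds is smaller than the combined run-to-run variation, which is consistent with reading the result as "no visible effect" for this workload.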