Date: Tue, 10 May 2022 11:31:28 +0200
From: Borislav Petkov
To: Linus Torvalds
Cc: Mark Hemment, Andrew Morton, the arch/x86 maintainers, Peter Zijlstra,
    patrice.chotard@foss.st.com, Mikulas Patocka, Lukas Czerner,
    Christoph Hellwig, "Darrick J. Wong", Chuck Lever, Hugh Dickins,
    patches@lists.linux.dev, Linux-MM, mm-commits@vger.kernel.org, Mel Gorman
Subject: clear_user (was: [patch 02/14] tmpfs: fix regressions from wider use of ZERO_PAGE)
X-Mailing-List: patches@lists.linux.dev

Lemme fix that subject so that I can find it easier in my avalanche mbox...
On Wed, May 04, 2022 at 02:09:52PM -0700, Linus Torvalds wrote:
> I don't tend to particularly care about "how many times has this been
> called" kind of trace profiles. It's the actual expense in CPU cycles
> I tend to care about.

Yeah, but, I wanted to measure how much perf improvement that would
bring with the git test suite and wanted to know how often clear_user()
is called in conjunction with it. Because the benchmarks I ran would
show very small improvements and a PF benchmark would even show weird
things like slowdowns with higher core counts.

So for a ~6m running test suite, the function gets called under 700K
times, all from padzero:

  <...>-2536 [006] ..... 261.208801: padzero: to: 0x55b0663ed214, size: 3564, cycles: 21900
  <...>-2536 [006] ..... 261.208819: padzero: to: 0x7f061adca078, size: 3976, cycles: 17160
  <...>-2537 [008] ..... 261.211027: padzero: to: 0x5572d019e240, size: 3520, cycles: 23850
  <...>-2537 [008] ..... 261.211049: padzero: to: 0x7f1288dc9078, size: 3976, cycles: 15900
  ...

which is around 1%-ish of the total time and which is consistent with
the benchmark numbers.

So Mel gave me the idea to simply measure how fast the function becomes.
I.e.:

  start = rdtsc_ordered();
  ret = __clear_user(to, n);
  end = rdtsc_ordered();

Computing the mean average of all the samples collected during the test
suite run then shows some improvement:

  clear_user_original:
  Amean: 9219.71 (Sum: 6340154910, samples: 687674)

  fsrm:
  Amean: 8030.63 (Sum: 5522277720, samples: 687652)

That's on Zen3.

I'll run this on Icelake now too.

> I haven't really done serious profiling work for a while (which is
> just as well, because it's one of the things that went backwards when
> I switch to the Zen 2 threadripper for my main machine)

Because of the not as advanced perf support there? Any pain points I can
forward?

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette
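
Editorial sketch, not part of the original mail: the averaging step described above (collect the per-call cycle counts, then take the arithmetic mean) could be reproduced offline roughly as follows. The mail does not show the tooling that was actually used, so the regular expression and the helper names `parse_samples` and `amean` are assumptions made for illustration. The trace lines and the Sum/samples totals are copied from the mail.

```python
import re

# Assumed line shape, taken from the trace excerpt quoted in the mail:
#   ... padzero: to: 0x<addr>, size: <bytes>, cycles: <tsc delta>
SAMPLE_RE = re.compile(r"padzero: to: 0x[0-9a-f]+, size: (\d+), cycles: (\d+)")


def parse_samples(lines):
    """Return (size, cycles) tuples for every line that matches SAMPLE_RE."""
    samples = []
    for line in lines:
        match = SAMPLE_RE.search(line)
        if match:
            samples.append((int(match.group(1)), int(match.group(2))))
    return samples


def amean(total, count):
    """Arithmetic mean from a running sum and a sample count."""
    return total / count


# The four trace lines shown in the mail.
trace = [
    "<...>-2536 [006] ..... 261.208801: padzero: to: 0x55b0663ed214, size: 3564, cycles: 21900",
    "<...>-2536 [006] ..... 261.208819: padzero: to: 0x7f061adca078, size: 3976, cycles: 17160",
    "<...>-2537 [008] ..... 261.211027: padzero: to: 0x5572d019e240, size: 3520, cycles: 23850",
    "<...>-2537 [008] ..... 261.211049: padzero: to: 0x7f1288dc9078, size: 3976, cycles: 15900",
]
cycles = [c for _, c in parse_samples(trace)]
excerpt_mean = amean(sum(cycles), len(cycles))

# Totals reported in the mail for the full test-suite run on Zen3.
orig = amean(6340154910, 687674)   # clear_user_original
fsrm = amean(5522277720, 687652)   # fsrm variant
gain_pct = (orig - fsrm) / orig * 100.0

print(f"excerpt mean: {excerpt_mean:.1f} cycles")
print(f"original: {orig:.2f}  fsrm: {fsrm:.2f}  reduction: {gain_pct:.1f}%")
```

With the totals from the mail this reproduces the two Amean values and puts the reduction in mean cycles per call at roughly 13% on Zen3. The four-line excerpt is too small to say anything on its own; it is only there to show the parsing.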