linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Ingo Molnar <mingo@kernel.org>
To: Ankur Arora <ankur.a.arora@oracle.com>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	kirill@shutemov.name, mhocko@kernel.org,
	boris.ostrovsky@oracle.com, konrad.wilk@oracle.com,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	x86@kernel.org, "H. Peter Anvin" <hpa@zytor.com>,
	Tony Luck <tony.luck@intel.com>,
	Sean Christopherson <sean.j.christopherson@intel.com>,
	Mike Rapoport <rppt@linux.ibm.com>,
	Xiaoyao Li <xiaoyao.li@intel.com>,
	Fenghua Yu <fenghua.yu@intel.com>,
	"Peter Zijlstra (Intel)" <peterz@infradead.org>,
	Dave Hansen <dave.hansen@linux.intel.com>
Subject: Re: [PATCH 7/8] x86/cpu/intel: enable X86_FEATURE_NT_GOOD on Intel Broadwellx
Date: Wed, 14 Oct 2020 17:31:27 +0200	[thread overview]
Message-ID: <20201014153127.GB1424414@gmail.com> (raw)
In-Reply-To: <20201014083300.19077-8-ankur.a.arora@oracle.com>


* Ankur Arora <ankur.a.arora@oracle.com> wrote:

> System:           Oracle X6-2
> CPU:              2 nodes * 10 cores/node * 2 threads/core
> 		  Intel Xeon E5-2630 v4 (Broadwellx, 6:79:1)
> Memory:           256 GB evenly split between nodes
> Microcode:        0xb00002e
> scaling_governor: performance
> L3 size:          25MB
> intel_pstate/no_turbo: 1
> 
> Performance comparison of 'perf bench mem memset -l 1' for x86-64-stosb
> (X86_FEATURE_ERMS) and x86-64-movnt (X86_FEATURE_NT_GOOD):
> 
>               x86-64-stosb (5 runs)     x86-64-movnt (5 runs)       speedup
>               -----------------------   -----------------------     -------
>      size       BW        (   pstdev)          BW   (   pstdev)
> 
>      16MB      17.35 GB/s ( +- 9.27%)    11.83 GB/s ( +- 0.19%)     -31.81%
>     128MB       5.31 GB/s ( +- 0.13%)    11.72 GB/s ( +- 0.44%)    +121.84%
>    1024MB       5.42 GB/s ( +- 0.13%)    11.78 GB/s ( +- 0.03%)    +117.34%
>    4096MB       5.41 GB/s ( +- 0.41%)    11.76 GB/s ( +- 0.07%)    +117.37%

> +	if (c->x86 == 6 && c->x86_model == INTEL_FAM6_BROADWELL_X)
> +		set_cpu_cap(c, X86_FEATURE_NT_GOOD);

So while I agree with how you've done careful measurements to isolate bad 
microarchitectures where non-temporal stores are slow, I do think this 
approach of opt-in doesn't scale and is hard to maintain.

Instead I'd suggest enabling this by default everywhere, and creating a 
X86_FEATURE_NT_BAD quirk table for the bad microarchitectures.

This means that with new microarchitectures we'd get automatic enablement, 
and hopefully chip testing would identify cases where performance isn't as 
good.

I.e. the 'trust but verify' method.

Thanks,

	Ingo

  reply	other threads:[~2020-10-14 15:31 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-14  8:32 [PATCH 0/8] Use uncached writes while clearing gigantic pages Ankur Arora
2020-10-14  8:32 ` [PATCH 1/8] x86/cpuid: add X86_FEATURE_NT_GOOD Ankur Arora
2020-10-14  8:32 ` [PATCH 2/8] x86/asm: add memset_movnti() Ankur Arora
2020-10-14  8:32 ` [PATCH 3/8] perf bench: " Ankur Arora
2020-10-14  8:32 ` [PATCH 4/8] x86/asm: add clear_page_nt() Ankur Arora
2020-10-14 19:56   ` Borislav Petkov
2020-10-14 21:11     ` Ankur Arora
2020-10-14  8:32 ` [PATCH 5/8] x86/clear_page: add clear_page_uncached() Ankur Arora
2020-10-14 11:10   ` kernel test robot
2020-10-14 13:04   ` kernel test robot
2020-10-14 15:45   ` Andy Lutomirski
2020-10-14 19:58     ` Borislav Petkov
2020-10-14 21:07       ` Andy Lutomirski
2020-10-14 21:12         ` Borislav Petkov
2020-10-15  3:37           ` Ankur Arora
2020-10-15 10:35             ` Borislav Petkov
2020-10-15 21:20               ` Ankur Arora
2020-10-16 18:21                 ` Borislav Petkov
2020-10-15  3:21         ` Ankur Arora
2020-10-15 10:40           ` Borislav Petkov
2020-10-15 21:40             ` Ankur Arora
2020-10-14 20:54     ` Ankur Arora
2020-10-14  8:32 ` [PATCH 6/8] mm, clear_huge_page: use clear_page_uncached() for gigantic pages Ankur Arora
2020-10-14 15:28   ` Ingo Molnar
2020-10-14 19:15     ` Ankur Arora
2020-10-14  8:32 ` [PATCH 7/8] x86/cpu/intel: enable X86_FEATURE_NT_GOOD on Intel Broadwellx Ankur Arora
2020-10-14 15:31   ` Ingo Molnar [this message]
2020-10-14 19:23     ` Ankur Arora
2020-10-14  8:32 ` [PATCH 8/8] x86/cpu/amd: enable X86_FEATURE_NT_GOOD on AMD Zen Ankur Arora

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201014153127.GB1424414@gmail.com \
    --to=mingo@kernel.org \
    --cc=ankur.a.arora@oracle.com \
    --cc=boris.ostrovsky@oracle.com \
    --cc=bp@alien8.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=fenghua.yu@intel.com \
    --cc=hpa@zytor.com \
    --cc=kirill@shutemov.name \
    --cc=konrad.wilk@oracle.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rppt@linux.ibm.com \
    --cc=sean.j.christopherson@intel.com \
    --cc=tglx@linutronix.de \
    --cc=tony.luck@intel.com \
    --cc=x86@kernel.org \
    --cc=xiaoyao.li@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).