linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH, x86]: Disable CPA cache flush for selfsnoop targets
@ 2019-07-11  8:12 Uros Bizjak
  2019-07-11 14:39 ` Andy Lutomirski
                   ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: Uros Bizjak @ 2019-07-11  8:12 UTC (permalink / raw)
  To: linux-kernel; +Cc: x86, Andrew Lutomirski, Thomas Gleixner

[-- Attachment #1: Type: text/plain, Size: 1516 bytes --]

Recent patch [1] disabled a self-snoop feature on a list of processor
models with a known errata, so we are confident that the feature
should work on remaining models also for other purposes than to speed
up MTRR programming.

I would like to resurrect an old patch [2] that avoids calling clflush
and wbinvd
to invalidate caches when CPU supports selfsnoop.

The patch was ported to latest Fedora kernel (5.1.16) and tested with
CONFIG_CPA_DEBUG on INTEL_FAM6_IVYBRIDGE_X. The relevant ports of
dmesg show:

...
< hundreds of CPA protect messages, resulting from set_memory_rw CPA
undo test in mm/init_64.c >
CPA  protect  Rodata RO: 0xffffffffbd1fe000 - 0xffffffffbd1fefff PFN
1461fe req 8000000000000063 prevent 0000000000000002
CPA  protect  Rodata RO: 0xffff889c461fe000 - 0xffff889c461fefff PFN
1461fe req 8000000000000063 prevent 0000000000000002
Testing CPA: again
Freeing unused kernel image memory: 2016K
Freeing unused kernel image memory: 4K
x86/mm: Checked W+X mappings: passed, no W+X pages found.
rodata_test: all tests were successful
x86/mm: Checking user space page tables
x86/mm: Checked W+X mappings: passed, no W+X pages found.

and from CPA selftest:

CPA self-test:
 4k 36352 large 4021 gb 0 x 81[ffff889b00098000-ffff889bdf7ff000] miss 133120
 4k 180224 large 3740 gb 0 x 81[ffff889b00098000-ffff889bdf7ff000] miss 133120
 4k 180224 large 3740 gb 0 x 81[ffff889b00098000-ffff889bdf7ff000] miss 133120
ok.

[1] https://lkml.org/lkml/2019/6/27/828
[2] https://lkml.org/lkml/2009/4/8/508

Uros.

[-- Attachment #2: 0001-Disable-CPA-cache-flush-for-selfsnoop-targets.patch --]
[-- Type: text/x-patch, Size: 1003 bytes --]

From 17cbad5d45b6ae26dbcfe02deba597c83cd63a0d Mon Sep 17 00:00:00 2001
From: Uros Bizjak <ubizjak@gmail.com>
Date: Wed, 10 Jul 2019 15:01:44 +0200
Subject: [PATCH] Disable CPA cache flush for selfsnoop targets

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
---
 arch/x86/mm/pageattr.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index 6a9a77a403c9..8893ef5f70cd 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -1725,10 +1725,11 @@ static int change_page_attr_set_clr(unsigned long *addr, int numpages,
 		goto out;
 
 	/*
-	 * No need to flush, when we did not set any of the caching
-	 * attributes:
+	 * No need to flush when CPU supports self snoop or
+	 * when we did not set any of the caching attributes:
 	 */
-	cache = !!pgprot2cachemode(mask_set);
+	cache = !static_cpu_has(X86_FEATURE_SELFSNOOP) &&
+		pgprot2cachemode(mask_set);
 
 	/*
 	 * On error; flush everything to be sure.
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [RFC PATCH, x86]: Disable CPA cache flush for selfsnoop targets
  2019-07-11  8:12 [RFC PATCH, x86]: Disable CPA cache flush for selfsnoop targets Uros Bizjak
@ 2019-07-11 14:39 ` Andy Lutomirski
  2019-07-11 19:00   ` Uros Bizjak
  2019-07-15  8:24 ` Thomas Gleixner
  2019-07-15 16:31 ` Andi Kleen
  2 siblings, 1 reply; 17+ messages in thread
From: Andy Lutomirski @ 2019-07-11 14:39 UTC (permalink / raw)
  To: Uros Bizjak, Dave Hansen, Peter Zijlstra
  Cc: LKML, X86 ML, Andrew Lutomirski, Thomas Gleixner

On Thu, Jul 11, 2019 at 1:13 AM Uros Bizjak <ubizjak@gmail.com> wrote:
>
> Recent patch [1] disabled a self-snoop feature on a list of processor
> models with a known errata, so we are confident that the feature
> should work on remaining models also for other purposes than to speed
> up MTRR programming.
>
> I would like to resurrect an old patch [2] that avoids calling clflush
> and wbinvd
> to invalidate caches when CPU supports selfsnoop.

The big question here is: what are all the reasons that we might need
to flush?  Certainly, for stuff like SEV and MKTME, we need to flush
regardless of any self-snoop capability.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC PATCH, x86]: Disable CPA cache flush for selfsnoop targets
  2019-07-11 14:39 ` Andy Lutomirski
@ 2019-07-11 19:00   ` Uros Bizjak
  0 siblings, 0 replies; 17+ messages in thread
From: Uros Bizjak @ 2019-07-11 19:00 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Dave Hansen, Peter Zijlstra, LKML, X86 ML, Thomas Gleixner

On Thu, Jul 11, 2019 at 4:39 PM Andy Lutomirski <luto@kernel.org> wrote:
>
> On Thu, Jul 11, 2019 at 1:13 AM Uros Bizjak <ubizjak@gmail.com> wrote:
> >
> > Recent patch [1] disabled a self-snoop feature on a list of processor
> > models with a known errata, so we are confident that the feature
> > should work on remaining models also for other purposes than to speed
> > up MTRR programming.
> >
> > I would like to resurrect an old patch [2] that avoids calling clflush
> > and wbinvd
> > to invalidate caches when CPU supports selfsnoop.
>
> The big question here is: what are all the reasons that we might need
> to flush?  Certainly, for stuff like SEV and MKTME, we need to flush
> regardless of any self-snoop capability.

No AMD target defines self-snoop capability, and set_memory_encrypted
forces cache clearing in __set_memory_enc_dec:

    /*
     * Before changing the encryption attribute, we need to flush caches.
     */
    cpa_flush(&cpa, 1);

Uros.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC PATCH, x86]: Disable CPA cache flush for selfsnoop targets
  2019-07-11  8:12 [RFC PATCH, x86]: Disable CPA cache flush for selfsnoop targets Uros Bizjak
  2019-07-11 14:39 ` Andy Lutomirski
@ 2019-07-15  8:24 ` Thomas Gleixner
  2019-07-15 12:04   ` Uros Bizjak
  2019-07-15 16:31 ` Andi Kleen
  2 siblings, 1 reply; 17+ messages in thread
From: Thomas Gleixner @ 2019-07-15  8:24 UTC (permalink / raw)
  To: Uros Bizjak; +Cc: linux-kernel, x86, Andrew Lutomirski

Uros,

On Thu, 11 Jul 2019, Uros Bizjak wrote:
> Recent patch [1] disabled a self-snoop feature on a list of processor
> models with a known errata, so we are confident that the feature
> should work on remaining models also for other purposes than to speed
> up MTRR programming.
> 
> I would like to resurrect an old patch [2] that avoids calling clflush
> and wbinvd
> to invalidate caches when CPU supports selfsnoop.

Please do not attach patches, send them inline and please add a proper
changelog. Just saying 'Disable CPA cache flush for selfsnoop targets' in
the subject line then nada gives absolutely zero information.
 
> The patch was ported to latest Fedora kernel (5.1.16) and tested with
> CONFIG_CPA_DEBUG on INTEL_FAM6_IVYBRIDGE_X. The relevant ports of
> dmesg show:
> 
> ...
> < hundreds of CPA protect messages, resulting from set_memory_rw CPA
> undo test in mm/init_64.c >
> CPA  protect  Rodata RO: 0xffffffffbd1fe000 - 0xffffffffbd1fefff PFN
> 1461fe req 8000000000000063 prevent 0000000000000002
> CPA  protect  Rodata RO: 0xffff889c461fe000 - 0xffff889c461fefff PFN
> 1461fe req 8000000000000063 prevent 0000000000000002
> Testing CPA: again
> Freeing unused kernel image memory: 2016K
> Freeing unused kernel image memory: 4K
> x86/mm: Checked W+X mappings: passed, no W+X pages found.
> rodata_test: all tests were successful
> x86/mm: Checking user space page tables
> x86/mm: Checked W+X mappings: passed, no W+X pages found.
> 
> and from CPA selftest:
> 
> CPA self-test:
>  4k 36352 large 4021 gb 0 x 81[ffff889b00098000-ffff889bdf7ff000] miss 133120
>  4k 180224 large 3740 gb 0 x 81[ffff889b00098000-ffff889bdf7ff000] miss 133120
>  4k 180224 large 3740 gb 0 x 81[ffff889b00098000-ffff889bdf7ff000] miss 133120
> ok.

These outputs are pretty useless simply because the selftest only verifies
the inner workings of CPA itself, but has nothing to do with the
correctness vs. cache flushing.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC PATCH, x86]: Disable CPA cache flush for selfsnoop targets
  2019-07-15  8:24 ` Thomas Gleixner
@ 2019-07-15 12:04   ` Uros Bizjak
  0 siblings, 0 replies; 17+ messages in thread
From: Uros Bizjak @ 2019-07-15 12:04 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: LKML, X86 ML, Andrew Lutomirski

On Mon, Jul 15, 2019 at 10:24 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> Uros,
>
> On Thu, 11 Jul 2019, Uros Bizjak wrote:
> > Recent patch [1] disabled a self-snoop feature on a list of processor
> > models with a known errata, so we are confident that the feature
> > should work on remaining models also for other purposes than to speed
> > up MTRR programming.
> >
> > I would like to resurrect an old patch [2] that avoids calling clflush
> > and wbinvd
> > to invalidate caches when CPU supports selfsnoop.
>
> Please do not attach patches, send them inline and please add a proper
> changelog. Just saying 'Disable CPA cache flush for selfsnoop targets' in
> the subject line then nada gives absolutely zero information.

Thanks for your remarks and instructions!

I'll send a new revision of the patch with expanded ChangeLog later today,
saying something along the lines of:

"CPUs which have self-snooping capability can handle conflicting
memory type across CPUs by snooping its own cache. Commit #fd329f276ecaa
("x86/mtrr: Skip cache flushes on CPUs with cache self-snooping")
avoids cache flushes when MTRR registers are programmed. The Page
Attribute Table (PAT) is a companion feature to the MTRRs, and according
to section 11.12.4 of the Intel 64 and IA 32 Architectures Software
Developer's Manual, if the CPU supports cache self-snooping, it is not
necessary to flush caches when remapping a page that was previously
mapped as a different memory type.

Note that commit #1e03bff360010
("x86/cpu/intel: Clear cache self-snoop capability in CPUs with known errata")
cleared cache self-snoop capability for CPUs where conflicting memory types
lead to unpredictable behavior, machine check errors, or hangs."

> > The patch was ported to latest Fedora kernel (5.1.16) and tested with
> > CONFIG_CPA_DEBUG on INTEL_FAM6_IVYBRIDGE_X. The relevant ports of
> > dmesg show:
> >
> > ...
> > < hundreds of CPA protect messages, resulting from set_memory_rw CPA
> > undo test in mm/init_64.c >
> > CPA  protect  Rodata RO: 0xffffffffbd1fe000 - 0xffffffffbd1fefff PFN
> > 1461fe req 8000000000000063 prevent 0000000000000002
> > CPA  protect  Rodata RO: 0xffff889c461fe000 - 0xffff889c461fefff PFN
> > 1461fe req 8000000000000063 prevent 0000000000000002
> > Testing CPA: again
> > Freeing unused kernel image memory: 2016K
> > Freeing unused kernel image memory: 4K
> > x86/mm: Checked W+X mappings: passed, no W+X pages found.
> > rodata_test: all tests were successful
> > x86/mm: Checking user space page tables
> > x86/mm: Checked W+X mappings: passed, no W+X pages found.
> >
> > and from CPA selftest:
> >
> > CPA self-test:
> >  4k 36352 large 4021 gb 0 x 81[ffff889b00098000-ffff889bdf7ff000] miss 133120
> >  4k 180224 large 3740 gb 0 x 81[ffff889b00098000-ffff889bdf7ff000] miss 133120
> >  4k 180224 large 3740 gb 0 x 81[ffff889b00098000-ffff889bdf7ff000] miss 133120
> > ok.
>
> These outputs are pretty useless simply because the selftest only verifies
> the inner workings of CPA itself, but has nothing to do with the
> correctness vs. cache flushing.

Please note that CONFIG_CPA_DEBUG also spawns a pageattr-test kthread
which remaps a memory page every 30 seconds. I was confident enough to
run the patched kernel (with CONFIG_CPA_DEBUG) on my main workstation
(Ivybridge-X, Fedora 30), already for a week without a single problem.

Uros.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC PATCH, x86]: Disable CPA cache flush for selfsnoop targets
  2019-07-11  8:12 [RFC PATCH, x86]: Disable CPA cache flush for selfsnoop targets Uros Bizjak
  2019-07-11 14:39 ` Andy Lutomirski
  2019-07-15  8:24 ` Thomas Gleixner
@ 2019-07-15 16:31 ` Andi Kleen
  2019-07-15 18:41   ` Thomas Gleixner
  2 siblings, 1 reply; 17+ messages in thread
From: Andi Kleen @ 2019-07-15 16:31 UTC (permalink / raw)
  To: Uros Bizjak; +Cc: linux-kernel, x86, Andrew Lutomirski, Thomas Gleixner

Uros Bizjak <ubizjak@gmail.com> writes:

> Recent patch [1] disabled a self-snoop feature on a list of processor
> models with a known errata, so we are confident that the feature
> should work on remaining models also for other purposes than to speed
> up MTRR programming.

MTRR is very different than TLBs.

From my understanding not flushing with PAT is only safe everywhere when
the memory is only used for coherent devices (like the Internal GPU on
Intel CPUs). We don't have any infrastructure to track or enforce
this unfortunately.

-Andi

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC PATCH, x86]: Disable CPA cache flush for selfsnoop targets
  2019-07-15 16:31 ` Andi Kleen
@ 2019-07-15 18:41   ` Thomas Gleixner
  2019-07-15 19:19     ` Uros Bizjak
  2019-07-15 19:39     ` Andi Kleen
  0 siblings, 2 replies; 17+ messages in thread
From: Thomas Gleixner @ 2019-07-15 18:41 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Uros Bizjak, linux-kernel, x86, Andrew Lutomirski

On Mon, 15 Jul 2019, Andi Kleen wrote:
> Uros Bizjak <ubizjak@gmail.com> writes:
> 
> > Recent patch [1] disabled a self-snoop feature on a list of processor
> > models with a known errata, so we are confident that the feature
> > should work on remaining models also for other purposes than to speed
> > up MTRR programming.
> 
> MTRR is very different than TLBs.
> 
> >From my understanding not flushing with PAT is only safe everywhere when
> the memory is only used for coherent devices (like the Internal GPU on
> Intel CPUs). We don't have any infrastructure to track or enforce
> this unfortunately.

Right, we don't know where the PAT invocation comes from and whether they
are safe to omit flushing the cache. The module load code would be one
obvious candidate.

But unless there is some really worthwhile speedup, e.g. for boot, then
adding some flag to let CPA know about the safe 'no flush' operation might
be not worth it.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC PATCH, x86]: Disable CPA cache flush for selfsnoop targets
  2019-07-15 18:41   ` Thomas Gleixner
@ 2019-07-15 19:19     ` Uros Bizjak
  2019-07-15 19:30       ` Thomas Gleixner
  2019-07-15 19:39     ` Andi Kleen
  1 sibling, 1 reply; 17+ messages in thread
From: Uros Bizjak @ 2019-07-15 19:19 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Andi Kleen, LKML, X86 ML, Andrew Lutomirski

On Mon, Jul 15, 2019 at 8:41 PM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> On Mon, 15 Jul 2019, Andi Kleen wrote:
> > Uros Bizjak <ubizjak@gmail.com> writes:
> >
> > > Recent patch [1] disabled a self-snoop feature on a list of processor
> > > models with a known errata, so we are confident that the feature
> > > should work on remaining models also for other purposes than to speed
> > > up MTRR programming.
> >
> > MTRR is very different than TLBs.
> >
> > >From my understanding not flushing with PAT is only safe everywhere when
> > the memory is only used for coherent devices (like the Internal GPU on
> > Intel CPUs). We don't have any infrastructure to track or enforce
> > this unfortunately.
>
> Right, we don't know where the PAT invocation comes from and whether they
> are safe to omit flushing the cache. The module load code would be one
> obvious candidate.
>
> But unless there is some really worthwhile speedup, e.g. for boot, then
> adding some flag to let CPA know about the safe 'no flush' operation might
> be not worth it.

For the reference, FreeBSD implements this approach, later changed to
use pmap_invalidate_cache_range ifunc (that calls
pmap_invalidate_cache_range_selfsnoop for targets with self-snoop
capability) and pmap_force_invalidate_cache_range [1]. The full
cross-referenced source is at [2].

[1] https://reviews.freebsd.org/D16736
[2] http://fxr.watson.org/fxr/source/amd64/amd64/pmap.c

Uros.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC PATCH, x86]: Disable CPA cache flush for selfsnoop targets
  2019-07-15 19:19     ` Uros Bizjak
@ 2019-07-15 19:30       ` Thomas Gleixner
  2019-07-15 19:38         ` Andi Kleen
  0 siblings, 1 reply; 17+ messages in thread
From: Thomas Gleixner @ 2019-07-15 19:30 UTC (permalink / raw)
  To: Uros Bizjak; +Cc: Andi Kleen, LKML, X86 ML, Andrew Lutomirski

On Mon, 15 Jul 2019, Uros Bizjak wrote:
> On Mon, Jul 15, 2019 at 8:41 PM Thomas Gleixner <tglx@linutronix.de> wrote:
> >
> > On Mon, 15 Jul 2019, Andi Kleen wrote:
> > > Uros Bizjak <ubizjak@gmail.com> writes:
> > >
> > > > Recent patch [1] disabled a self-snoop feature on a list of processor
> > > > models with a known errata, so we are confident that the feature
> > > > should work on remaining models also for other purposes than to speed
> > > > up MTRR programming.
> > >
> > > MTRR is very different than TLBs.
> > >
> > > >From my understanding not flushing with PAT is only safe everywhere when
> > > the memory is only used for coherent devices (like the Internal GPU on
> > > Intel CPUs). We don't have any infrastructure to track or enforce
> > > this unfortunately.
> >
> > Right, we don't know where the PAT invocation comes from and whether they
> > are safe to omit flushing the cache. The module load code would be one
> > obvious candidate.
> >
> > But unless there is some really worthwhile speedup, e.g. for boot, then
> > adding some flag to let CPA know about the safe 'no flush' operation might
> > be not worth it.
> 
> For the reference, FreeBSD implements this approach, later changed to
> use pmap_invalidate_cache_range ifunc (that calls
> pmap_invalidate_cache_range_selfsnoop for targets with self-snoop
> capability) and pmap_force_invalidate_cache_range [1]. The full
> cross-referenced source is at [2].

That does not answer the question whether it's worthwhile to do that.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC PATCH, x86]: Disable CPA cache flush for selfsnoop targets
  2019-07-15 19:30       ` Thomas Gleixner
@ 2019-07-15 19:38         ` Andi Kleen
  2019-07-15 22:12           ` Andy Lutomirski
  0 siblings, 1 reply; 17+ messages in thread
From: Andi Kleen @ 2019-07-15 19:38 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Uros Bizjak, LKML, X86 ML, Andrew Lutomirski

> 
> That does not answer the question whether it's worthwhile to do that.

It's likely worthwhile for (Intel integrated) graphics. 

There was also a recent issue with 3dxp/dax, which uses ioremap in some
cases.

-Andi

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC PATCH, x86]: Disable CPA cache flush for selfsnoop targets
  2019-07-15 18:41   ` Thomas Gleixner
  2019-07-15 19:19     ` Uros Bizjak
@ 2019-07-15 19:39     ` Andi Kleen
  2019-07-15 19:44       ` Andy Lutomirski
  2019-07-15 19:46       ` Thomas Gleixner
  1 sibling, 2 replies; 17+ messages in thread
From: Andi Kleen @ 2019-07-15 19:39 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Uros Bizjak, linux-kernel, x86, Andrew Lutomirski

> Right, we don't know where the PAT invocation comes from and whether they
> are safe to omit flushing the cache. The module load code would be one
> obvious candidate.

Module load just changes the writable/executable status, right? That shouldn't
need to flush in any case because it doesn't change the caching attributes.

-Andi

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC PATCH, x86]: Disable CPA cache flush for selfsnoop targets
  2019-07-15 19:39     ` Andi Kleen
@ 2019-07-15 19:44       ` Andy Lutomirski
  2019-07-15 19:46       ` Thomas Gleixner
  1 sibling, 0 replies; 17+ messages in thread
From: Andy Lutomirski @ 2019-07-15 19:44 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Thomas Gleixner, Uros Bizjak, LKML, X86 ML, Andrew Lutomirski

On Mon, Jul 15, 2019 at 12:39 PM Andi Kleen <ak@linux.intel.com> wrote:
>
> > Right, we don't know where the PAT invocation comes from and whether they
> > are safe to omit flushing the cache. The module load code would be one
> > obvious candidate.
>
> Module load just changes the writable/executable status, right? That shouldn't
> need to flush in any case because it doesn't change the caching attributes.
>

Indeed.  module load should require a single TLB flush and no cache
flushes.  I don't think we're currently efficient enough to do it with
a single TLB flush, but we should be able to...

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC PATCH, x86]: Disable CPA cache flush for selfsnoop targets
  2019-07-15 19:39     ` Andi Kleen
  2019-07-15 19:44       ` Andy Lutomirski
@ 2019-07-15 19:46       ` Thomas Gleixner
  1 sibling, 0 replies; 17+ messages in thread
From: Thomas Gleixner @ 2019-07-15 19:46 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Uros Bizjak, linux-kernel, x86, Andrew Lutomirski

On Mon, 15 Jul 2019, Andi Kleen wrote:

> > Right, we don't know where the PAT invocation comes from and whether they
> > are safe to omit flushing the cache. The module load code would be one
> > obvious candidate.
> 
> Module load just changes the writable/executable status, right? That shouldn't
> need to flush in any case because it doesn't change the caching attributes.

Ah right. We don't flush when the caching attributes are not changed.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC PATCH, x86]: Disable CPA cache flush for selfsnoop targets
  2019-07-15 19:38         ` Andi Kleen
@ 2019-07-15 22:12           ` Andy Lutomirski
  2019-07-15 22:53             ` Andi Kleen
  0 siblings, 1 reply; 17+ messages in thread
From: Andy Lutomirski @ 2019-07-15 22:12 UTC (permalink / raw)
  To: Andi Kleen, Dave Hansen, Peter Zijlstra
  Cc: Thomas Gleixner, Uros Bizjak, LKML, X86 ML, Andrew Lutomirski

On Mon, Jul 15, 2019 at 12:38 PM Andi Kleen <ak@linux.intel.com> wrote:
>
> >
> > That does not answer the question whether it's worthwhile to do that.
>
> It's likely worthwhile for (Intel integrated) graphics.
>
> There was also a recent issue with 3dxp/dax, which uses ioremap in some
> cases.
>


FWIW, I applied this simpler patch:

diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index 6a9a77a403c9..a933f99b176a 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -1729,6 +1729,7 @@ static int change_page_attr_set_clr(unsigned
long *addr, int numpages,
         * attributes:
         */
        cache = !!pgprot2cachemode(mask_set);
+       WARN_ON(cache);

        /*
         * On error; flush everything to be sure.

and booted a VM, including loading a module.  The warning did not
fire.  For the most part, we use PAT for things like ioremap_wc(), but
there's no flush, since there's no preexisting mapping at all.

I haven't tested on a real kernel with i915.  Does i915 really hit
this code path?  Does it happen more than once or twice at boot?

The only case I can think of where this would really matter is DAX, if
anyone uses WT for DAX.

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [RFC PATCH, x86]: Disable CPA cache flush for selfsnoop targets
  2019-07-15 22:12           ` Andy Lutomirski
@ 2019-07-15 22:53             ` Andi Kleen
  2019-07-15 23:10               ` Andy Lutomirski
  0 siblings, 1 reply; 17+ messages in thread
From: Andi Kleen @ 2019-07-15 22:53 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Dave Hansen, Peter Zijlstra, Thomas Gleixner, Uros Bizjak, LKML, X86 ML

> I haven't tested on a real kernel with i915.  Does i915 really hit
> this code path?  Does it happen more than once or twice at boot?

Yes some workloads allocate/free a lot of write combined memory
for graphics objects.

-Andi

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC PATCH, x86]: Disable CPA cache flush for selfsnoop targets
  2019-07-15 22:53             ` Andi Kleen
@ 2019-07-15 23:10               ` Andy Lutomirski
  2019-07-16  0:32                 ` Andi Kleen
  0 siblings, 1 reply; 17+ messages in thread
From: Andy Lutomirski @ 2019-07-15 23:10 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Andy Lutomirski, Dave Hansen, Peter Zijlstra, Thomas Gleixner,
	Uros Bizjak, LKML, X86 ML

On Mon, Jul 15, 2019 at 3:53 PM Andi Kleen <ak@linux.intel.com> wrote:
>
> > I haven't tested on a real kernel with i915.  Does i915 really hit
> > this code path?  Does it happen more than once or twice at boot?
>
> Yes some workloads allocate/free a lot of write combined memory
> for graphics objects.
>

But where does that memory come from?  If it's from device memory
(i.e. memory that's not in the kernel direct map), then, unless I
missed something, we're never changing the cache mode per se -- we're
just ioremap_wc-ing it, which doesn't require a flush.

IOW I'm wondering if there's any workload where this patch makes a difference.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC PATCH, x86]: Disable CPA cache flush for selfsnoop targets
  2019-07-15 23:10               ` Andy Lutomirski
@ 2019-07-16  0:32                 ` Andi Kleen
  0 siblings, 0 replies; 17+ messages in thread
From: Andi Kleen @ 2019-07-16  0:32 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Dave Hansen, Peter Zijlstra, Thomas Gleixner, Uros Bizjak, LKML, X86 ML

On Mon, Jul 15, 2019 at 04:10:36PM -0700, Andy Lutomirski wrote:
> On Mon, Jul 15, 2019 at 3:53 PM Andi Kleen <ak@linux.intel.com> wrote:
> >
> > > I haven't tested on a real kernel with i915.  Does i915 really hit
> > > this code path?  Does it happen more than once or twice at boot?
> >
> > Yes some workloads allocate/free a lot of write combined memory
> > for graphics objects.
> >
> 
> But where does that memory come from?  If it's from device memory
> (i.e. memory that's not in the kernel direct map), then, unless I
> missed something, we're never changing the cache mode per se -- we're
> just ioremap_wc-ing it, which doesn't require a flush.

Integraded graphics doesn't have device memory. There's an reserved
memory area, but a lot of the buffers the GPU works with
come from main memory.


-Andi

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2019-07-16  0:32 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-07-11  8:12 [RFC PATCH, x86]: Disable CPA cache flush for selfsnoop targets Uros Bizjak
2019-07-11 14:39 ` Andy Lutomirski
2019-07-11 19:00   ` Uros Bizjak
2019-07-15  8:24 ` Thomas Gleixner
2019-07-15 12:04   ` Uros Bizjak
2019-07-15 16:31 ` Andi Kleen
2019-07-15 18:41   ` Thomas Gleixner
2019-07-15 19:19     ` Uros Bizjak
2019-07-15 19:30       ` Thomas Gleixner
2019-07-15 19:38         ` Andi Kleen
2019-07-15 22:12           ` Andy Lutomirski
2019-07-15 22:53             ` Andi Kleen
2019-07-15 23:10               ` Andy Lutomirski
2019-07-16  0:32                 ` Andi Kleen
2019-07-15 19:39     ` Andi Kleen
2019-07-15 19:44       ` Andy Lutomirski
2019-07-15 19:46       ` Thomas Gleixner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).