[PATCH 0/6] add support for new persistent memory instructions

linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH 0/6] add support for new persistent memory instructions
@ 2014-11-11 18:43 Ross Zwisler
  2014-11-11 18:43 ` [PATCH 1/6] x86: Add support for the pcommit instruction Ross Zwisler
                   ` (5 more replies)
  0 siblings, 6 replies; 29+ messages in thread
From: Ross Zwisler @ 2014-11-11 18:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ross Zwisler, H Peter Anvin, Ingo Molnar, Thomas Gleixner,
	David Airlie, dri-devel, x86

This patch set adds support for two new persistent memory instructions, pcommit
and clwb.  These instructions were announced in the document "Intel
Architecture Instruction Set Extensions Programming Reference" with reference
number 319433-022.

https://software.intel.com/sites/default/files/managed/0d/53/319433-022.pdf

These patches apply cleanly to v3.18-rc4.

Here are some things of note:

 - As with the clflushopt patches before this, I'm assuming that the addressing
   mode generated by the original clflush instruction will match the new
   clflush instruction with the 0x66 prefix for clflushopt, and for the
   xsaveopt instruction with the 0x66 prefix for clwb.  For all the test cases
   that I've come up with and for the new clwb code generated by this patch
   series, this has proven to be true on my test machine.

 - According to the SDM, xsaveopt has a form where it has a REX.W prefix.  I
   believe that this prefix will not be generated by gcc in x86_64 kernel code.
   Based on this, I don't believe I need to account for this extra prefix when
   dealing with the assembly language created for clwb.  Please correct me if
   I'm wrong.

 - The last three patches in this series update existing uses of clflushopt to
   use clwb instead.  The assertion is that clwb is preferable to clflushopt in
   these cases because after a clwb the cache line will be clean and ready for
   eviction, but that there is a possibility that it might be referenced again
   in the future while it is still in the CPU cache, giving us a performance
   boost.

Cc: H Peter Anvin <h.peter.anvin@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: David Airlie <airlied@linux.ie>
Cc: dri-devel@lists.freedesktop.org
Cc: x86@kernel.org

Ross Zwisler (6):
  x86: Add support for the pcommit instruction
  x86/alternative: Add alternative_io_2
  x86: Add support for the clwb instruction
  x86: Use clwb in clflush_cache_range
  x86: Use clwb in drm_clflush_page
  x86: Use clwb in drm_clflush_virt_range

 arch/x86/include/asm/alternative.h   | 14 ++++++++++++++
 arch/x86/include/asm/cpufeature.h    |  2 ++
 arch/x86/include/asm/special_insns.h | 16 ++++++++++++++++
 arch/x86/mm/pageattr.c               |  8 ++++----
 drivers/gpu/drm/drm_cache.c          | 12 ++++++------
 5 files changed, 42 insertions(+), 10 deletions(-)

-- 
1.9.3


^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 1/6] x86: Add support for the pcommit instruction
  2014-11-11 18:43 [PATCH 0/6] add support for new persistent memory instructions Ross Zwisler
@ 2014-11-11 18:43 ` Ross Zwisler
  2014-11-13  3:25   ` Andy Lutomirski
  2014-11-11 18:43 ` [PATCH 2/6] x86/alternative: Add alternative_io_2 Ross Zwisler
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 29+ messages in thread
From: Ross Zwisler @ 2014-11-11 18:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ross Zwisler, H Peter Anvin, Ingo Molnar, Thomas Gleixner,
	David Airlie, dri-devel, x86

Add support for the new pcommit instruction.  This instruction was
announced in the document "Intel Architecture Instruction Set Extensions
Programming Reference" with reference number 319433-022.

https://software.intel.com/sites/default/files/managed/0d/53/319433-022.pdf

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: H Peter Anvin <h.peter.anvin@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: David Airlie <airlied@linux.ie>
Cc: dri-devel@lists.freedesktop.org
Cc: x86@kernel.org
---
 arch/x86/include/asm/cpufeature.h    | 1 +
 arch/x86/include/asm/special_insns.h | 6 ++++++
 2 files changed, 7 insertions(+)

diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index 0bb1335..b3e6b89 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -225,6 +225,7 @@
 #define X86_FEATURE_RDSEED	( 9*32+18) /* The RDSEED instruction */
 #define X86_FEATURE_ADX		( 9*32+19) /* The ADCX and ADOX instructions */
 #define X86_FEATURE_SMAP	( 9*32+20) /* Supervisor Mode Access Prevention */
+#define X86_FEATURE_PCOMMIT	( 9*32+22) /* PCOMMIT instruction */
 #define X86_FEATURE_CLFLUSHOPT	( 9*32+23) /* CLFLUSHOPT instruction */
 #define X86_FEATURE_AVX512PF	( 9*32+26) /* AVX-512 Prefetch */
 #define X86_FEATURE_AVX512ER	( 9*32+27) /* AVX-512 Exponential and Reciprocal */
diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
index e820c08..1709a2e 100644
--- a/arch/x86/include/asm/special_insns.h
+++ b/arch/x86/include/asm/special_insns.h
@@ -199,6 +199,12 @@ static inline void clflushopt(volatile void *__p)
 		       "+m" (*(volatile char __force *)__p));
 }
 
+static inline void pcommit(void)
+{
+	alternative(ASM_NOP4, ".byte 0x66, 0x0f, 0xae, 0xf8",
+		    X86_FEATURE_PCOMMIT);
+}
+
 #define nop() asm volatile ("nop")
 
 
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 2/6] x86/alternative: Add alternative_io_2
  2014-11-11 18:43 [PATCH 0/6] add support for new persistent memory instructions Ross Zwisler
  2014-11-11 18:43 ` [PATCH 1/6] x86: Add support for the pcommit instruction Ross Zwisler
@ 2014-11-11 18:43 ` Ross Zwisler
  2014-11-11 18:43 ` [PATCH 3/6] x86: Add support for the clwb instruction Ross Zwisler
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 29+ messages in thread
From: Ross Zwisler @ 2014-11-11 18:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ross Zwisler, H Peter Anvin, Ingo Molnar, Thomas Gleixner,
	David Airlie, dri-devel, x86

Add alternative_io_2 in the spirit of alternative_input_2 and
alternative_io.  This will allow us to have instructions with an output
parameter that vary based on two CPU features.

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: H Peter Anvin <h.peter.anvin@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: David Airlie <airlied@linux.ie>
Cc: dri-devel@lists.freedesktop.org
Cc: x86@kernel.org
---
 arch/x86/include/asm/alternative.h | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h
index 473bdbe..7d9ead9 100644
--- a/arch/x86/include/asm/alternative.h
+++ b/arch/x86/include/asm/alternative.h
@@ -180,6 +180,20 @@ static inline int alternatives_text_reserved(void *start, void *end)
 	asm volatile (ALTERNATIVE(oldinstr, newinstr, feature)		\
 		: output : "i" (0), ## input)
 
+/*
+ * This is similar to alternative_io. But it has two features and
+ * respective instructions.
+ *
+ * If CPU has feature2, newinstr2 is used.
+ * Otherwise, if CPU has feature1, newinstr1 is used.
+ * Otherwise, oldinstr is used.
+ */
+#define alternative_io_2(oldinstr, newinstr1, feature1, newinstr2,	     \
+			   feature2, output, input...)			     \
+	asm volatile(ALTERNATIVE_2(oldinstr, newinstr1, feature1,	     \
+		newinstr2, feature2)					     \
+		: output : "i" (0), ## input)
+
 /* Like alternative_io, but for replacing a direct call with another one. */
 #define alternative_call(oldfunc, newfunc, feature, output, input...)	\
 	asm volatile (ALTERNATIVE("call %P[old]", "call %P[new]", feature) \
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 3/6] x86: Add support for the clwb instruction
  2014-11-11 18:43 [PATCH 0/6] add support for new persistent memory instructions Ross Zwisler
  2014-11-11 18:43 ` [PATCH 1/6] x86: Add support for the pcommit instruction Ross Zwisler
  2014-11-11 18:43 ` [PATCH 2/6] x86/alternative: Add alternative_io_2 Ross Zwisler
@ 2014-11-11 18:43 ` Ross Zwisler
  2014-11-11 19:12   ` Borislav Petkov
  2014-11-12 12:39   ` Borislav Petkov
  2014-11-11 18:43 ` [PATCH 4/6] x86: Use clwb in clflush_cache_range Ross Zwisler
                   ` (2 subsequent siblings)
  5 siblings, 2 replies; 29+ messages in thread
From: Ross Zwisler @ 2014-11-11 18:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ross Zwisler, H Peter Anvin, Ingo Molnar, Thomas Gleixner,
	David Airlie, dri-devel, x86

Add support for the new clwb instruction.  This instruction was
announced in the document "Intel Architecture Instruction Set Extensions
Programming Reference" with reference number 319433-022.

https://software.intel.com/sites/default/files/managed/0d/53/319433-022.pdf

Here are some things of note:

 - As with the clflushopt patches before this, I'm assuming that the addressing
   mode generated by the original clflush instruction will match the new
   clflush instruction with the 0x66 prefix for clflushopt, and for the
   xsaveopt instruction with the 0x66 prefix for clwb.  For all the test cases
   that I've come up with and for the new clwb code generated by this patch
   series, this has proven to be true on my test machine.

 - According to the SDM, xsaveopt has a form where it has a REX.W prefix.  I
   believe that this prefix will not be generated by gcc in x86_64 kernel code.
   Based on this, I don't believe I need to account for this extra prefix when
   dealing with the assembly language created for clwb.  Please correct me if
   I'm wrong.

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: H Peter Anvin <h.peter.anvin@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: David Airlie <airlied@linux.ie>
Cc: dri-devel@lists.freedesktop.org
Cc: x86@kernel.org
---
 arch/x86/include/asm/cpufeature.h    |  1 +
 arch/x86/include/asm/special_insns.h | 10 ++++++++++
 2 files changed, 11 insertions(+)

diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index b3e6b89..fbbed34 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -227,6 +227,7 @@
 #define X86_FEATURE_SMAP	( 9*32+20) /* Supervisor Mode Access Prevention */
 #define X86_FEATURE_PCOMMIT	( 9*32+22) /* PCOMMIT instruction */
 #define X86_FEATURE_CLFLUSHOPT	( 9*32+23) /* CLFLUSHOPT instruction */
+#define X86_FEATURE_CLWB	( 9*32+24) /* CLWB instruction */
 #define X86_FEATURE_AVX512PF	( 9*32+26) /* AVX-512 Prefetch */
 #define X86_FEATURE_AVX512ER	( 9*32+27) /* AVX-512 Exponential and Reciprocal */
 #define X86_FEATURE_AVX512CD	( 9*32+28) /* AVX-512 Conflict Detection */
diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
index 1709a2e..a328460 100644
--- a/arch/x86/include/asm/special_insns.h
+++ b/arch/x86/include/asm/special_insns.h
@@ -199,6 +199,16 @@ static inline void clflushopt(volatile void *__p)
 		       "+m" (*(volatile char __force *)__p));
 }
 
+static inline void clwb(volatile void *__p)
+{
+	alternative_io_2(".byte " __stringify(NOP_DS_PREFIX) "; clflush %P0",
+			 ".byte 0x66; clflush %P0",
+			 X86_FEATURE_CLFLUSHOPT,
+			 ".byte 0x66; xsaveopt %P0",
+			 X86_FEATURE_CLWB,
+			 "+m" (*(volatile char __force *)__p));
+}
+
 static inline void pcommit(void)
 {
 	alternative(ASM_NOP4, ".byte 0x66, 0x0f, 0xae, 0xf8",
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 4/6] x86: Use clwb in clflush_cache_range
  2014-11-11 18:43 [PATCH 0/6] add support for new persistent memory instructions Ross Zwisler
                   ` (2 preceding siblings ...)
  2014-11-11 18:43 ` [PATCH 3/6] x86: Add support for the clwb instruction Ross Zwisler
@ 2014-11-11 18:43 ` Ross Zwisler
  2014-11-11 18:43 ` [PATCH 5/6] x86: Use clwb in drm_clflush_page Ross Zwisler
  2014-11-11 18:43 ` [PATCH 6/6] x86: Use clwb in drm_clflush_virt_range Ross Zwisler
  5 siblings, 0 replies; 29+ messages in thread
From: Ross Zwisler @ 2014-11-11 18:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ross Zwisler, H Peter Anvin, Ingo Molnar, Thomas Gleixner,
	David Airlie, dri-devel, x86

If clwb is available on the system, use it in clflush_cache_range.
If clwb is not available, fall back to clflushopt if you can.
If clflushopt is not supported, fall all the way back to clflush.

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: H Peter Anvin <h.peter.anvin@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: David Airlie <airlied@linux.ie>
Cc: dri-devel@lists.freedesktop.org
Cc: x86@kernel.org
---
 arch/x86/mm/pageattr.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index 36de293..5229d45 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -126,8 +126,8 @@ within(unsigned long addr, unsigned long start, unsigned long end)
  * @vaddr:	virtual start address
  * @size:	number of bytes to flush
  *
- * clflushopt is an unordered instruction which needs fencing with mfence or
- * sfence to avoid ordering issues.
+ * clflushopt and clwb are unordered instructions which need fencing with
+ * mfence or sfence to avoid ordering issues.
  */
 void clflush_cache_range(void *vaddr, unsigned int size)
 {
@@ -136,11 +136,11 @@ void clflush_cache_range(void *vaddr, unsigned int size)
 	mb();
 
 	for (; vaddr < vend; vaddr += boot_cpu_data.x86_clflush_size)
-		clflushopt(vaddr);
+		clwb(vaddr);
 	/*
 	 * Flush any possible final partial cacheline:
 	 */
-	clflushopt(vend);
+	clwb(vend);
 
 	mb();
 }
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 5/6] x86: Use clwb in drm_clflush_page
  2014-11-11 18:43 [PATCH 0/6] add support for new persistent memory instructions Ross Zwisler
                   ` (3 preceding siblings ...)
  2014-11-11 18:43 ` [PATCH 4/6] x86: Use clwb in clflush_cache_range Ross Zwisler
@ 2014-11-11 18:43 ` Ross Zwisler
  2014-11-11 18:43 ` [PATCH 6/6] x86: Use clwb in drm_clflush_virt_range Ross Zwisler
  5 siblings, 0 replies; 29+ messages in thread
From: Ross Zwisler @ 2014-11-11 18:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ross Zwisler, H Peter Anvin, Ingo Molnar, Thomas Gleixner,
	David Airlie, dri-devel, x86

If clwb is available on the system, use it in drm_clflush_page.
If clwb is not available, fall back to clflushopt if you can.
If clflushopt is not supported, fall all the way back to clflush.

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: H Peter Anvin <h.peter.anvin@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: David Airlie <airlied@linux.ie>
Cc: dri-devel@lists.freedesktop.org
Cc: x86@kernel.org
---
 drivers/gpu/drm/drm_cache.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/drm_cache.c b/drivers/gpu/drm/drm_cache.c
index a6b6906..aad9d82 100644
--- a/drivers/gpu/drm/drm_cache.c
+++ b/drivers/gpu/drm/drm_cache.c
@@ -34,9 +34,9 @@
 #if defined(CONFIG_X86)
 
 /*
- * clflushopt is an unordered instruction which needs fencing with mfence or
- * sfence to avoid ordering issues.  For drm_clflush_page this fencing happens
- * in the caller.
+ * clwb and clflushopt are unordered instructions which need fencing with
+ * mfence or sfence to avoid ordering issues.  For drm_clflush_page this
+ * fencing happens in the caller.
  */
 static void
 drm_clflush_page(struct page *page)
@@ -50,7 +50,7 @@ drm_clflush_page(struct page *page)
 
 	page_virtual = kmap_atomic(page);
 	for (i = 0; i < PAGE_SIZE; i += size)
-		clflushopt(page_virtual + i);
+		clwb(page_virtual + i);
 	kunmap_atomic(page_virtual);
 }
 
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 6/6] x86: Use clwb in drm_clflush_virt_range
  2014-11-11 18:43 [PATCH 0/6] add support for new persistent memory instructions Ross Zwisler
                   ` (4 preceding siblings ...)
  2014-11-11 18:43 ` [PATCH 5/6] x86: Use clwb in drm_clflush_page Ross Zwisler
@ 2014-11-11 18:43 ` Ross Zwisler
  2014-11-13  3:14   ` Andy Lutomirski
  5 siblings, 1 reply; 29+ messages in thread
From: Ross Zwisler @ 2014-11-11 18:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ross Zwisler, H Peter Anvin, Ingo Molnar, Thomas Gleixner,
	David Airlie, dri-devel, x86

If clwb is available on the system, use it in drm_clflush_virt_range.
If clwb is not available, fall back to clflushopt if you can.
If clflushopt is not supported, fall all the way back to clflush.

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: H Peter Anvin <h.peter.anvin@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: David Airlie <airlied@linux.ie>
Cc: dri-devel@lists.freedesktop.org
Cc: x86@kernel.org
---
 drivers/gpu/drm/drm_cache.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/drm_cache.c b/drivers/gpu/drm/drm_cache.c
index aad9d82..84e9a04 100644
--- a/drivers/gpu/drm/drm_cache.c
+++ b/drivers/gpu/drm/drm_cache.c
@@ -138,8 +138,8 @@ drm_clflush_virt_range(void *addr, unsigned long length)
 		void *end = addr + length;
 		mb();
 		for (; addr < end; addr += boot_cpu_data.x86_clflush_size)
-			clflushopt(addr);
-		clflushopt(end - 1);
+			clwb(addr);
+		clwb(end - 1);
 		mb();
 		return;
 	}
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH 3/6] x86: Add support for the clwb instruction
  2014-11-11 18:43 ` [PATCH 3/6] x86: Add support for the clwb instruction Ross Zwisler
@ 2014-11-11 19:12   ` Borislav Petkov
  2014-11-11 19:19     ` Borislav Petkov
  2014-11-11 19:48     ` Ross Zwisler
  2014-11-12 12:39   ` Borislav Petkov
  1 sibling, 2 replies; 29+ messages in thread
From: Borislav Petkov @ 2014-11-11 19:12 UTC (permalink / raw)
  To: Ross Zwisler
  Cc: linux-kernel, H Peter Anvin, Ingo Molnar, Thomas Gleixner,
	David Airlie, dri-devel, x86

On Tue, Nov 11, 2014 at 11:43:13AM -0700, Ross Zwisler wrote:
> Add support for the new clwb instruction.  This instruction was
> announced in the document "Intel Architecture Instruction Set Extensions
> Programming Reference" with reference number 319433-022.
> 
> https://software.intel.com/sites/default/files/managed/0d/53/319433-022.pdf
> 
> Here are some things of note:
> 
>  - As with the clflushopt patches before this, I'm assuming that the addressing
>    mode generated by the original clflush instruction will match the new
>    clflush instruction with the 0x66 prefix for clflushopt, and for the
>    xsaveopt instruction with the 0x66 prefix for clwb.  For all the test cases
>    that I've come up with and for the new clwb code generated by this patch
>    series, this has proven to be true on my test machine.
> 
>  - According to the SDM, xsaveopt has a form where it has a REX.W prefix.  I
>    believe that this prefix will not be generated by gcc in x86_64 kernel code.
>    Based on this, I don't believe I need to account for this extra prefix when
>    dealing with the assembly language created for clwb.  Please correct me if
>    I'm wrong.
> 
> Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
> Cc: H Peter Anvin <h.peter.anvin@intel.com>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: David Airlie <airlied@linux.ie>
> Cc: dri-devel@lists.freedesktop.org
> Cc: x86@kernel.org
> ---
>  arch/x86/include/asm/cpufeature.h    |  1 +
>  arch/x86/include/asm/special_insns.h | 10 ++++++++++
>  2 files changed, 11 insertions(+)
> 
> diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
> index b3e6b89..fbbed34 100644
> --- a/arch/x86/include/asm/cpufeature.h
> +++ b/arch/x86/include/asm/cpufeature.h
> @@ -227,6 +227,7 @@
>  #define X86_FEATURE_SMAP	( 9*32+20) /* Supervisor Mode Access Prevention */
>  #define X86_FEATURE_PCOMMIT	( 9*32+22) /* PCOMMIT instruction */
>  #define X86_FEATURE_CLFLUSHOPT	( 9*32+23) /* CLFLUSHOPT instruction */
> +#define X86_FEATURE_CLWB	( 9*32+24) /* CLWB instruction */
>  #define X86_FEATURE_AVX512PF	( 9*32+26) /* AVX-512 Prefetch */
>  #define X86_FEATURE_AVX512ER	( 9*32+27) /* AVX-512 Exponential and Reciprocal */
>  #define X86_FEATURE_AVX512CD	( 9*32+28) /* AVX-512 Conflict Detection */
> diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
> index 1709a2e..a328460 100644
> --- a/arch/x86/include/asm/special_insns.h
> +++ b/arch/x86/include/asm/special_insns.h
> @@ -199,6 +199,16 @@ static inline void clflushopt(volatile void *__p)
>  		       "+m" (*(volatile char __force *)__p));
>  }
>  
> +static inline void clwb(volatile void *__p)
> +{
> +	alternative_io_2(".byte " __stringify(NOP_DS_PREFIX) "; clflush %P0",

Any particular reason for using 0x3e as a prefix to have the insns be
the same size or is it simply because CLFLUSH can stomach it?

:-)

> +			 ".byte 0x66; clflush %P0",
> +			 X86_FEATURE_CLFLUSHOPT,
> +			 ".byte 0x66; xsaveopt %P0",

Huh, XSAVEOPT?!? Shouldn't that be CLWB??

> +			 X86_FEATURE_CLWB,
> +			 "+m" (*(volatile char __force *)__p));
> +}
> +
>  static inline void pcommit(void)
>  {
>  	alternative(ASM_NOP4, ".byte 0x66, 0x0f, 0xae, 0xf8",
> -- 
> 1.9.3
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 3/6] x86: Add support for the clwb instruction
  2014-11-11 19:12   ` Borislav Petkov
@ 2014-11-11 19:19     ` Borislav Petkov
  2014-11-11 19:40       ` Ross Zwisler
  2014-11-11 19:48     ` Ross Zwisler
  1 sibling, 1 reply; 29+ messages in thread
From: Borislav Petkov @ 2014-11-11 19:19 UTC (permalink / raw)
  To: Ross Zwisler
  Cc: linux-kernel, H Peter Anvin, Ingo Molnar, Thomas Gleixner,
	David Airlie, dri-devel, x86

On Tue, Nov 11, 2014 at 08:12:39PM +0100, Borislav Petkov wrote:
> > +			 ".byte 0x66; xsaveopt %P0",
> 
> Huh, XSAVEOPT?!? Shouldn't that be CLWB??

Bah, the same opcodes, only 0x66 prefix makes it into CLWB. Could use a
comment I guess.

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 3/6] x86: Add support for the clwb instruction
  2014-11-11 19:19     ` Borislav Petkov
@ 2014-11-11 19:40       ` Ross Zwisler
  2014-11-11 19:46         ` Borislav Petkov
  0 siblings, 1 reply; 29+ messages in thread
From: Ross Zwisler @ 2014-11-11 19:40 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: linux-kernel, H Peter Anvin, Ingo Molnar, Thomas Gleixner,
	David Airlie, dri-devel, x86

On Tue, 2014-11-11 at 20:19 +0100, Borislav Petkov wrote:
> On Tue, Nov 11, 2014 at 08:12:39PM +0100, Borislav Petkov wrote:
> > > +			 ".byte 0x66; xsaveopt %P0",
> > 
> > Huh, XSAVEOPT?!? Shouldn't that be CLWB??
> 
> Bah, the same opcodes, only 0x66 prefix makes it into CLWB. Could use a
> comment I guess.

Yep, it's weird, I know.  :)  I'll add a comment.

Thanks,
- Ross


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 3/6] x86: Add support for the clwb instruction
  2014-11-11 19:40       ` Ross Zwisler
@ 2014-11-11 19:46         ` Borislav Petkov
  2014-11-11 19:54           ` Ross Zwisler
  0 siblings, 1 reply; 29+ messages in thread
From: Borislav Petkov @ 2014-11-11 19:46 UTC (permalink / raw)
  To: Ross Zwisler
  Cc: linux-kernel, H Peter Anvin, Ingo Molnar, Thomas Gleixner,
	David Airlie, dri-devel, x86

On Tue, Nov 11, 2014 at 12:40:00PM -0700, Ross Zwisler wrote:
> Yep, it's weird, I know.  :)

But sure, saving opcode space, makes sense to me.

Btw, I'd still be interested about this:

> +static inline void clwb(volatile void *__p)
> +{
> +     alternative_io_2(".byte " __stringify(NOP_DS_PREFIX) "; clflush %P0",

Any particular reason for using 0x3e as a prefix to have the insns be
the same size or is it simply because CLFLUSH can stomach it?

Thanks.

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 3/6] x86: Add support for the clwb instruction
  2014-11-11 19:12   ` Borislav Petkov
  2014-11-11 19:19     ` Borislav Petkov
@ 2014-11-11 19:48     ` Ross Zwisler
  2014-11-11 19:54       ` Borislav Petkov
  1 sibling, 1 reply; 29+ messages in thread
From: Ross Zwisler @ 2014-11-11 19:48 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: linux-kernel, H Peter Anvin, Ingo Molnar, Thomas Gleixner,
	David Airlie, dri-devel, x86

On Tue, 2014-11-11 at 20:12 +0100, Borislav Petkov wrote:
> On Tue, Nov 11, 2014 at 11:43:13AM -0700, Ross Zwisler wrote:
> > Add support for the new clwb instruction.  This instruction was
> > announced in the document "Intel Architecture Instruction Set Extensions
> > Programming Reference" with reference number 319433-022.
> > 
> > https://software.intel.com/sites/default/files/managed/0d/53/319433-022.pdf
> > 
> > Here are some things of note:
> > 
> >  - As with the clflushopt patches before this, I'm assuming that the addressing
> >    mode generated by the original clflush instruction will match the new
> >    clflush instruction with the 0x66 prefix for clflushopt, and for the
> >    xsaveopt instruction with the 0x66 prefix for clwb.  For all the test cases
> >    that I've come up with and for the new clwb code generated by this patch
> >    series, this has proven to be true on my test machine.
> > 
> >  - According to the SDM, xsaveopt has a form where it has a REX.W prefix.  I
> >    believe that this prefix will not be generated by gcc in x86_64 kernel code.
> >    Based on this, I don't believe I need to account for this extra prefix when
> >    dealing with the assembly language created for clwb.  Please correct me if
> >    I'm wrong.
> > 
> > Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
> > Cc: H Peter Anvin <h.peter.anvin@intel.com>
> > Cc: Ingo Molnar <mingo@kernel.org>
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Cc: David Airlie <airlied@linux.ie>
> > Cc: dri-devel@lists.freedesktop.org
> > Cc: x86@kernel.org
> > ---
> >  arch/x86/include/asm/cpufeature.h    |  1 +
> >  arch/x86/include/asm/special_insns.h | 10 ++++++++++
> >  2 files changed, 11 insertions(+)
> > 
> > diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
> > index b3e6b89..fbbed34 100644
> > --- a/arch/x86/include/asm/cpufeature.h
> > +++ b/arch/x86/include/asm/cpufeature.h
> > @@ -227,6 +227,7 @@
> >  #define X86_FEATURE_SMAP	( 9*32+20) /* Supervisor Mode Access Prevention */
> >  #define X86_FEATURE_PCOMMIT	( 9*32+22) /* PCOMMIT instruction */
> >  #define X86_FEATURE_CLFLUSHOPT	( 9*32+23) /* CLFLUSHOPT instruction */
> > +#define X86_FEATURE_CLWB	( 9*32+24) /* CLWB instruction */
> >  #define X86_FEATURE_AVX512PF	( 9*32+26) /* AVX-512 Prefetch */
> >  #define X86_FEATURE_AVX512ER	( 9*32+27) /* AVX-512 Exponential and Reciprocal */
> >  #define X86_FEATURE_AVX512CD	( 9*32+28) /* AVX-512 Conflict Detection */
> > diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
> > index 1709a2e..a328460 100644
> > --- a/arch/x86/include/asm/special_insns.h
> > +++ b/arch/x86/include/asm/special_insns.h
> > @@ -199,6 +199,16 @@ static inline void clflushopt(volatile void *__p)
> >  		       "+m" (*(volatile char __force *)__p));
> >  }
> >  
> > +static inline void clwb(volatile void *__p)
> > +{
> > +	alternative_io_2(".byte " __stringify(NOP_DS_PREFIX) "; clflush %P0",
> 
> Any particular reason for using 0x3e as a prefix to have the insns be
> the same size or is it simply because CLFLUSH can stomach it?
> 
> :-)

Essentially we need one additional byte at the beginning of the clflush so
that we can flip it into a clflushopt by changing that byte into a 0x66
prefix.  Two options are to either insert a 1 byte ASM_NOP1, or to add a 1
byte NOP_DS_PREFIX.  Both have no functional effect with the plain clflush,
but I've been told that executing a clflush + prefix should be faster than
executing a clflush + NOP.



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 3/6] x86: Add support for the clwb instruction
  2014-11-11 19:48     ` Ross Zwisler
@ 2014-11-11 19:54       ` Borislav Petkov
  0 siblings, 0 replies; 29+ messages in thread
From: Borislav Petkov @ 2014-11-11 19:54 UTC (permalink / raw)
  To: Ross Zwisler
  Cc: linux-kernel, H Peter Anvin, Ingo Molnar, Thomas Gleixner,
	David Airlie, dri-devel, x86

On Tue, Nov 11, 2014 at 12:48:52PM -0700, Ross Zwisler wrote:
> Essentially we need one additional byte at the beginning of the clflush so
> that we can flip it into a clflushopt by changing that byte into a 0x66
> prefix.  Two options are to either insert a 1 byte ASM_NOP1, or to add a 1
> byte NOP_DS_PREFIX.  Both have no functional effect with the plain clflush,
> but I've been told that executing a clflush + prefix should be faster than
> executing a clflush + NOP.

I see. :)

Thanks.

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 3/6] x86: Add support for the clwb instruction
  2014-11-11 19:46         ` Borislav Petkov
@ 2014-11-11 19:54           ` Ross Zwisler
  0 siblings, 0 replies; 29+ messages in thread
From: Ross Zwisler @ 2014-11-11 19:54 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: linux-kernel, H Peter Anvin, Ingo Molnar, Thomas Gleixner,
	David Airlie, dri-devel, x86

On Tue, 2014-11-11 at 20:46 +0100, Borislav Petkov wrote:
> On Tue, Nov 11, 2014 at 12:40:00PM -0700, Ross Zwisler wrote:
> > Yep, it's weird, I know.  :)
> 
> But sure, saving opcode space, makes sense to me.
> 
> Btw, I'd still be interested about this:
> 
> > +static inline void clwb(volatile void *__p)
> > +{
> > +     alternative_io_2(".byte " __stringify(NOP_DS_PREFIX) "; clflush %P0",
> 
> Any particular reason for using 0x3e as a prefix to have the insns be
> the same size or is it simply because CLFLUSH can stomach it?

Ah, sorry, I was still responding to your first mail.  :)  Response
copied here to save searching:

        Essentially we need one additional byte at the beginning of the
        clflush so that we can flip it into a clflushopt by changing that byte
        into a 0x66 prefix.  Two options are to either insert a 1 byte
        ASM_NOP1, or to add a 1 byte NOP_DS_PREFIX.  Both have no functional
        effect with the plain clflush, but I've been told that executing a
        clflush + prefix should be faster than executing a clflush + NOP.

I agree, this is useful info - I'll add it to the patch comments for v2.

Thank you for the feedback.

- Ross


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 3/6] x86: Add support for the clwb instruction
  2014-11-11 18:43 ` [PATCH 3/6] x86: Add support for the clwb instruction Ross Zwisler
  2014-11-11 19:12   ` Borislav Petkov
@ 2014-11-12 12:39   ` Borislav Petkov
  2014-11-12 13:38     ` Anvin, H Peter
  1 sibling, 1 reply; 29+ messages in thread
From: Borislav Petkov @ 2014-11-12 12:39 UTC (permalink / raw)
  To: Ross Zwisler
  Cc: linux-kernel, H Peter Anvin, Ingo Molnar, Thomas Gleixner,
	David Airlie, dri-devel, x86

On Tue, Nov 11, 2014 at 11:43:13AM -0700, Ross Zwisler wrote:
> +static inline void clwb(volatile void *__p)
> +{
> +	alternative_io_2(".byte " __stringify(NOP_DS_PREFIX) "; clflush %P0",
> +			 ".byte 0x66; clflush %P0",
> +			 X86_FEATURE_CLFLUSHOPT,
> +			 ".byte 0x66; xsaveopt %P0",

Btw, I'm afraid you're going to have to spell out those new instruction
mnemonics as they aren't supported by older compilers:

  CC      drivers/gpu/drm/drm_cache.o
{standard input}: Assembler messages:
{standard input}:63: Error: no such instruction: `xsaveopt (%rax)'
{standard input}:249: Error: no such instruction: `xsaveopt (%rdi)'
{standard input}:283: Error: no such instruction: `xsaveopt -1(%rdx)'
make[1]: *** [drivers/gpu/drm/drm_cache.o] Error 1
make: *** [drivers/gpu/drm/drm_cache.o] Error 2
[boris@etch:14:35:40:k:14)->  gcc --version
gcc (GCC) 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)

This is on a very old guest I have which has this gcc in it and the
kernel supports everything gcc >= 3.2.

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

^ permalink raw reply	[flat|nested] 29+ messages in thread

* RE: [PATCH 3/6] x86: Add support for the clwb instruction
  2014-11-12 12:39   ` Borislav Petkov
@ 2014-11-12 13:38     ` Anvin, H Peter
  2014-11-12 14:12       ` Borislav Petkov
  0 siblings, 1 reply; 29+ messages in thread
From: Anvin, H Peter @ 2014-11-12 13:38 UTC (permalink / raw)
  To: Borislav Petkov, Ross Zwisler
  Cc: linux-kernel, Ingo Molnar, Thomas Gleixner, David Airlie, dri-devel, x86

No, it doesn't.  x86 requires 3.4+ at a minimum.

________________________________
From: Borislav Petkov
Sent: Wednesday, November 12, 2014 4:39:09 AM
To: Ross Zwisler
Cc: linux-kernel@vger.kernel.org; Anvin, H Peter; Ingo Molnar; Thomas Gleixner; David Airlie; dri-devel@lists.freedesktop.org; x86@kernel.org
Subject: Re: [PATCH 3/6] x86: Add support for the clwb instruction

On Tue, Nov 11, 2014 at 11:43:13AM -0700, Ross Zwisler wrote:
> +static inline void clwb(volatile void *__p)
> +{
> +     alternative_io_2(".byte " __stringify(NOP_DS_PREFIX) "; clflush %P0",
> +                      ".byte 0x66; clflush %P0",
> +                      X86_FEATURE_CLFLUSHOPT,
> +                      ".byte 0x66; xsaveopt %P0",

Btw, I'm afraid you're going to have to spell out those new instruction
mnemonics as they aren't supported by older compilers:

  CC      drivers/gpu/drm/drm_cache.o
{standard input}: Assembler messages:
{standard input}:63: Error: no such instruction: `xsaveopt (%rax)'
{standard input}:249: Error: no such instruction: `xsaveopt (%rdi)'
{standard input}:283: Error: no such instruction: `xsaveopt -1(%rdx)'
make[1]: *** [drivers/gpu/drm/drm_cache.o] Error 1
make: *** [drivers/gpu/drm/drm_cache.o] Error 2
[boris@etch:14:35:40:k:14)->  gcc --version
gcc (GCC) 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)

This is on a very old guest I have which has this gcc in it and the
kernel supports everything gcc >= 3.2.

--
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 3/6] x86: Add support for the clwb instruction
  2014-11-12 13:38     ` Anvin, H Peter
@ 2014-11-12 14:12       ` Borislav Petkov
  2014-11-13 17:47         ` Ross Zwisler
  0 siblings, 1 reply; 29+ messages in thread
From: Borislav Petkov @ 2014-11-12 14:12 UTC (permalink / raw)
  To: Anvin, H Peter
  Cc: Ross Zwisler, linux-kernel, Ingo Molnar, Thomas Gleixner,
	David Airlie, dri-devel, x86

On Wed, Nov 12, 2014 at 01:38:45PM +0000, Anvin, H Peter wrote:
> No, it doesn't.  x86 requires 3.4+ at a minimum.

The only test I see is:

#if GCC_VERSION < 30200
# error Sorry, your compiler is too old - please upgrade it.
#endif

And even if we do require 3.4, the build fails with 4.1+ so...

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 6/6] x86: Use clwb in drm_clflush_virt_range
  2014-11-11 18:43 ` [PATCH 6/6] x86: Use clwb in drm_clflush_virt_range Ross Zwisler
@ 2014-11-13  3:14   ` Andy Lutomirski
  2014-11-13 11:20     ` Borislav Petkov
  0 siblings, 1 reply; 29+ messages in thread
From: Andy Lutomirski @ 2014-11-13  3:14 UTC (permalink / raw)
  To: Ross Zwisler, linux-kernel
  Cc: H Peter Anvin, Ingo Molnar, Thomas Gleixner, David Airlie,
	dri-devel, x86

On 11/11/2014 10:43 AM, Ross Zwisler wrote:
> If clwb is available on the system, use it in drm_clflush_virt_range.
> If clwb is not available, fall back to clflushopt if you can.
> If clflushopt is not supported, fall all the way back to clflush.

I don't know exactly what drm_clflush_virt_range (and the other
functions you're modifying similarly) are for, but it seems plausible to
me that they're used before reads to make sure that non-coherent memory
sees updated data.  If that's true, then this will break it.

But maybe all the users are write to coherent memory that just need to
ensure that whatever's backing the memory knows about the write.

FWIW, it may make sense to rename this function to drm_clwb_virt_range
if you make this change.

--Andy

> 
> Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
> Cc: H Peter Anvin <h.peter.anvin@intel.com>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: David Airlie <airlied@linux.ie>
> Cc: dri-devel@lists.freedesktop.org
> Cc: x86@kernel.org
> ---
>  drivers/gpu/drm/drm_cache.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_cache.c b/drivers/gpu/drm/drm_cache.c
> index aad9d82..84e9a04 100644
> --- a/drivers/gpu/drm/drm_cache.c
> +++ b/drivers/gpu/drm/drm_cache.c
> @@ -138,8 +138,8 @@ drm_clflush_virt_range(void *addr, unsigned long length)
>  		void *end = addr + length;
>  		mb();
>  		for (; addr < end; addr += boot_cpu_data.x86_clflush_size)
> -			clflushopt(addr);
> -		clflushopt(end - 1);
> +			clwb(addr);
> +		clwb(end - 1);
>  		mb();
>  		return;
>  	}
> 


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/6] x86: Add support for the pcommit instruction
  2014-11-11 18:43 ` [PATCH 1/6] x86: Add support for the pcommit instruction Ross Zwisler
@ 2014-11-13  3:25   ` Andy Lutomirski
  2014-11-14 21:07     ` Ross Zwisler
  0 siblings, 1 reply; 29+ messages in thread
From: Andy Lutomirski @ 2014-11-13  3:25 UTC (permalink / raw)
  To: Ross Zwisler, linux-kernel
  Cc: H Peter Anvin, Ingo Molnar, Thomas Gleixner, David Airlie,
	dri-devel, x86

On 11/11/2014 10:43 AM, Ross Zwisler wrote:
> Add support for the new pcommit instruction.  This instruction was
> announced in the document "Intel Architecture Instruction Set Extensions
> Programming Reference" with reference number 319433-022.
> 
> https://software.intel.com/sites/default/files/managed/0d/53/319433-022.pdf
> 
> Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
> Cc: H Peter Anvin <h.peter.anvin@intel.com>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: David Airlie <airlied@linux.ie>
> Cc: dri-devel@lists.freedesktop.org
> Cc: x86@kernel.org
> ---
>  arch/x86/include/asm/cpufeature.h    | 1 +
>  arch/x86/include/asm/special_insns.h | 6 ++++++
>  2 files changed, 7 insertions(+)
> 
> diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
> index 0bb1335..b3e6b89 100644
> --- a/arch/x86/include/asm/cpufeature.h
> +++ b/arch/x86/include/asm/cpufeature.h
> @@ -225,6 +225,7 @@
>  #define X86_FEATURE_RDSEED	( 9*32+18) /* The RDSEED instruction */
>  #define X86_FEATURE_ADX		( 9*32+19) /* The ADCX and ADOX instructions */
>  #define X86_FEATURE_SMAP	( 9*32+20) /* Supervisor Mode Access Prevention */
> +#define X86_FEATURE_PCOMMIT	( 9*32+22) /* PCOMMIT instruction */
>  #define X86_FEATURE_CLFLUSHOPT	( 9*32+23) /* CLFLUSHOPT instruction */
>  #define X86_FEATURE_AVX512PF	( 9*32+26) /* AVX-512 Prefetch */
>  #define X86_FEATURE_AVX512ER	( 9*32+27) /* AVX-512 Exponential and Reciprocal */
> diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
> index e820c08..1709a2e 100644
> --- a/arch/x86/include/asm/special_insns.h
> +++ b/arch/x86/include/asm/special_insns.h
> @@ -199,6 +199,12 @@ static inline void clflushopt(volatile void *__p)
>  		       "+m" (*(volatile char __force *)__p));
>  }
>  
> +static inline void pcommit(void)
> +{
> +	alternative(ASM_NOP4, ".byte 0x66, 0x0f, 0xae, 0xf8",
> +		    X86_FEATURE_PCOMMIT);
> +}
> +

Should this patch add the feature bit and cpuinfo entry to go with it?

--Andy

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 6/6] x86: Use clwb in drm_clflush_virt_range
  2014-11-13  3:14   ` Andy Lutomirski
@ 2014-11-13 11:20     ` Borislav Petkov
  2014-11-13 16:38       ` Andy Lutomirski
  0 siblings, 1 reply; 29+ messages in thread
From: Borislav Petkov @ 2014-11-13 11:20 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Ross Zwisler, linux-kernel, H Peter Anvin, Ingo Molnar,
	Thomas Gleixner, David Airlie, dri-devel, x86

On Wed, Nov 12, 2014 at 07:14:21PM -0800, Andy Lutomirski wrote:
> On 11/11/2014 10:43 AM, Ross Zwisler wrote:
> > If clwb is available on the system, use it in drm_clflush_virt_range.
> > If clwb is not available, fall back to clflushopt if you can.
> > If clflushopt is not supported, fall all the way back to clflush.
> 
> I don't know exactly what drm_clflush_virt_range (and the other
> functions you're modifying similarly) are for, but it seems plausible to
> me that they're used before reads to make sure that non-coherent memory
> sees updated data.  If that's true, then this will break it.

Why would it break it? The updated cachelines will be in memory and
subsequent reads will be serviced from the cache instead from going to
memory as it is not invalidated as it would be by CLFLUSH.

/me is puzzled.

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 6/6] x86: Use clwb in drm_clflush_virt_range
  2014-11-13 11:20     ` Borislav Petkov
@ 2014-11-13 16:38       ` Andy Lutomirski
  2014-11-13 17:11         ` Borislav Petkov
  0 siblings, 1 reply; 29+ messages in thread
From: Andy Lutomirski @ 2014-11-13 16:38 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: DRI, Ross Zwisler, X86 ML, Thomas Gleixner, David Airlie,
	H Peter Anvin, Ingo Molnar, linux-kernel

On Nov 13, 2014 3:20 AM, "Borislav Petkov" <bp@alien8.de> wrote:
>
> On Wed, Nov 12, 2014 at 07:14:21PM -0800, Andy Lutomirski wrote:
> > On 11/11/2014 10:43 AM, Ross Zwisler wrote:
> > > If clwb is available on the system, use it in drm_clflush_virt_range.
> > > If clwb is not available, fall back to clflushopt if you can.
> > > If clflushopt is not supported, fall all the way back to clflush.
> >
> > I don't know exactly what drm_clflush_virt_range (and the other
> > functions you're modifying similarly) are for, but it seems plausible to
> > me that they're used before reads to make sure that non-coherent memory
> > sees updated data.  If that's true, then this will break it.
>
> Why would it break it? The updated cachelines will be in memory and
> subsequent reads will be serviced from the cache instead from going to
> memory as it is not invalidated as it would be by CLFLUSH.
>
> /me is puzzled.

Suppose you map some device memory WB, and then the device
non-coherently updates.  If you want the CPU to see it, you need
clflush or clflushopt.  Some architectures might do this for
dma_sync_single_for_cpu with DMA_FROM_DEVICE.

I'm not sure that such a thing exists on x86.

--Andy

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 6/6] x86: Use clwb in drm_clflush_virt_range
  2014-11-13 16:38       ` Andy Lutomirski
@ 2014-11-13 17:11         ` Borislav Petkov
  2014-11-13 17:33           ` Ville Syrjälä
  0 siblings, 1 reply; 29+ messages in thread
From: Borislav Petkov @ 2014-11-13 17:11 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: DRI, Ross Zwisler, X86 ML, Thomas Gleixner, David Airlie,
	H Peter Anvin, Ingo Molnar, linux-kernel, intel-gfx

On Thu, Nov 13, 2014 at 08:38:23AM -0800, Andy Lutomirski wrote:
> On Nov 13, 2014 3:20 AM, "Borislav Petkov" <bp@alien8.de> wrote:
> >
> > On Wed, Nov 12, 2014 at 07:14:21PM -0800, Andy Lutomirski wrote:
> > > On 11/11/2014 10:43 AM, Ross Zwisler wrote:
> > > > If clwb is available on the system, use it in drm_clflush_virt_range.
> > > > If clwb is not available, fall back to clflushopt if you can.
> > > > If clflushopt is not supported, fall all the way back to clflush.
> > >
> > > I don't know exactly what drm_clflush_virt_range (and the other
> > > functions you're modifying similarly) are for, but it seems plausible to
> > > me that they're used before reads to make sure that non-coherent memory
> > > sees updated data.  If that's true, then this will break it.
> >
> > Why would it break it? The updated cachelines will be in memory and
> > subsequent reads will be serviced from the cache instead from going to
> > memory as it is not invalidated as it would be by CLFLUSH.
> >
> > /me is puzzled.
> 
> Suppose you map some device memory WB, and then the device
> non-coherently updates.  If you want the CPU to see it, you need
> clflush or clflushopt.  Some architectures might do this for
> dma_sync_single_for_cpu with DMA_FROM_DEVICE.

Ah, you're talking about the other way around - the device does the
writes. Well, the usage sites are all in i915*, maybe we should ask
them - it looks to me like this is only the CPU making stuff visible in
the shared buffer but I don't know that code... intel-gfx CCed although
dri-devel is already on CC.

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 6/6] x86: Use clwb in drm_clflush_virt_range
  2014-11-13 17:11         ` Borislav Petkov
@ 2014-11-13 17:33           ` Ville Syrjälä
  2014-11-13 17:47             ` Borislav Petkov
  2014-11-13 18:43             ` Ville Syrjälä
  0 siblings, 2 replies; 29+ messages in thread
From: Ville Syrjälä @ 2014-11-13 17:33 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Andy Lutomirski, intel-gfx, X86 ML, linux-kernel, DRI,
	Ingo Molnar, Ross Zwisler, H Peter Anvin, Thomas Gleixner

On Thu, Nov 13, 2014 at 06:11:33PM +0100, Borislav Petkov wrote:
> On Thu, Nov 13, 2014 at 08:38:23AM -0800, Andy Lutomirski wrote:
> > On Nov 13, 2014 3:20 AM, "Borislav Petkov" <bp@alien8.de> wrote:
> > >
> > > On Wed, Nov 12, 2014 at 07:14:21PM -0800, Andy Lutomirski wrote:
> > > > On 11/11/2014 10:43 AM, Ross Zwisler wrote:
> > > > > If clwb is available on the system, use it in drm_clflush_virt_range.
> > > > > If clwb is not available, fall back to clflushopt if you can.
> > > > > If clflushopt is not supported, fall all the way back to clflush.
> > > >
> > > > I don't know exactly what drm_clflush_virt_range (and the other
> > > > functions you're modifying similarly) are for, but it seems plausible to
> > > > me that they're used before reads to make sure that non-coherent memory
> > > > sees updated data.  If that's true, then this will break it.
> > >
> > > Why would it break it? The updated cachelines will be in memory and
> > > subsequent reads will be serviced from the cache instead from going to
> > > memory as it is not invalidated as it would be by CLFLUSH.
> > >
> > > /me is puzzled.
> > 
> > Suppose you map some device memory WB, and then the device
> > non-coherently updates.  If you want the CPU to see it, you need
> > clflush or clflushopt.  Some architectures might do this for
> > dma_sync_single_for_cpu with DMA_FROM_DEVICE.
> 
> Ah, you're talking about the other way around - the device does the
> writes. Well, the usage sites are all in i915*, maybe we should ask
> them - it looks to me like this is only the CPU making stuff visible in
> the shared buffer but I don't know that code... intel-gfx CCed although
> dri-devel is already on CC.

We use it both ways in i915. So please don't break it.

-- 
Ville Syrjälä
Intel OTC

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 3/6] x86: Add support for the clwb instruction
  2014-11-12 14:12       ` Borislav Petkov
@ 2014-11-13 17:47         ` Ross Zwisler
  0 siblings, 0 replies; 29+ messages in thread
From: Ross Zwisler @ 2014-11-13 17:47 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Anvin, H Peter, linux-kernel, Ingo Molnar, Thomas Gleixner,
	David Airlie, dri-devel, x86

On Wed, 2014-11-12 at 15:12 +0100, Borislav Petkov wrote:
> On Wed, Nov 12, 2014 at 01:38:45PM +0000, Anvin, H Peter wrote:
> > No, it doesn't.  x86 requires 3.4+ at a minimum.
> 
> The only test I see is:
> 
> #if GCC_VERSION < 30200
> # error Sorry, your compiler is too old - please upgrade it.
> #endif
> 
> And even if we do require 3.4, the build fails with 4.1+ so...

Ah, dang, you're right.  Okay, I'll figure out how to do this without
using xsaveopt.

Thank you for pointing this out.

- Ross


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 6/6] x86: Use clwb in drm_clflush_virt_range
  2014-11-13 17:33           ` Ville Syrjälä
@ 2014-11-13 17:47             ` Borislav Petkov
  2014-11-13 18:14               ` Ville Syrjälä
  2014-11-13 18:43             ` Ville Syrjälä
  1 sibling, 1 reply; 29+ messages in thread
From: Borislav Petkov @ 2014-11-13 17:47 UTC (permalink / raw)
  To: Ville Syrjälä
  Cc: Andy Lutomirski, intel-gfx, X86 ML, linux-kernel, DRI,
	Ingo Molnar, Ross Zwisler, H Peter Anvin, Thomas Gleixner

On Thu, Nov 13, 2014 at 07:33:54PM +0200, Ville Syrjälä wrote:
> We use it both ways in i915. So please don't break it.

Haha, we started from Intel with Ross' patch and made a full circle
back. Maybe you guys should talk about it.

LOL. :-)

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 6/6] x86: Use clwb in drm_clflush_virt_range
  2014-11-13 17:47             ` Borislav Petkov
@ 2014-11-13 18:14               ` Ville Syrjälä
  0 siblings, 0 replies; 29+ messages in thread
From: Ville Syrjälä @ 2014-11-13 18:14 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Andy Lutomirski, intel-gfx, X86 ML, linux-kernel, DRI,
	Ingo Molnar, Ross Zwisler, H Peter Anvin, Thomas Gleixner

On Thu, Nov 13, 2014 at 06:47:34PM +0100, Borislav Petkov wrote:
> On Thu, Nov 13, 2014 at 07:33:54PM +0200, Ville Syrjälä wrote:
> > We use it both ways in i915. So please don't break it.
> 
> Haha, we started from Intel with Ross' patch and made a full circle
> back. Maybe you guys should talk about it.
> 
> LOL. :-)

Indeed. The problem I see with these patches is that they don't actually
tell what the new instruction does, so a casual glance doesn't really
raise any red flags. Another excuse I can use is that I just got used
to the fact that the x86 hasn't historically bothered separating
invalidate and writeback and just does both. In my previous life on the
dark^Warm side I did actually know the difference :)

But there's plenty of blame to go around to the other side of the fence
too. We should have documented what we expect from these functions.
Currently you just have to know/guess, and that's just not good enough.

-- 
Ville Syrjälä
Intel OTC

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 6/6] x86: Use clwb in drm_clflush_virt_range
  2014-11-13 17:33           ` Ville Syrjälä
  2014-11-13 17:47             ` Borislav Petkov
@ 2014-11-13 18:43             ` Ville Syrjälä
  1 sibling, 0 replies; 29+ messages in thread
From: Ville Syrjälä @ 2014-11-13 18:43 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Andy Lutomirski, intel-gfx, X86 ML, linux-kernel, DRI,
	Ingo Molnar, Ross Zwisler, H Peter Anvin, Thomas Gleixner

On Thu, Nov 13, 2014 at 07:33:54PM +0200, Ville Syrjälä wrote:
> On Thu, Nov 13, 2014 at 06:11:33PM +0100, Borislav Petkov wrote:
> > On Thu, Nov 13, 2014 at 08:38:23AM -0800, Andy Lutomirski wrote:
> > > On Nov 13, 2014 3:20 AM, "Borislav Petkov" <bp@alien8.de> wrote:
> > > >
> > > > On Wed, Nov 12, 2014 at 07:14:21PM -0800, Andy Lutomirski wrote:
> > > > > On 11/11/2014 10:43 AM, Ross Zwisler wrote:
> > > > > > If clwb is available on the system, use it in drm_clflush_virt_range.
> > > > > > If clwb is not available, fall back to clflushopt if you can.
> > > > > > If clflushopt is not supported, fall all the way back to clflush.
> > > > >
> > > > > I don't know exactly what drm_clflush_virt_range (and the other
> > > > > functions you're modifying similarly) are for, but it seems plausible to
> > > > > me that they're used before reads to make sure that non-coherent memory
> > > > > sees updated data.  If that's true, then this will break it.
> > > >
> > > > Why would it break it? The updated cachelines will be in memory and
> > > > subsequent reads will be serviced from the cache instead from going to
> > > > memory as it is not invalidated as it would be by CLFLUSH.
> > > >
> > > > /me is puzzled.
> > > 
> > > Suppose you map some device memory WB, and then the device
> > > non-coherently updates.  If you want the CPU to see it, you need
> > > clflush or clflushopt.  Some architectures might do this for
> > > dma_sync_single_for_cpu with DMA_FROM_DEVICE.
> > 
> > Ah, you're talking about the other way around - the device does the
> > writes. Well, the usage sites are all in i915*, maybe we should ask
> > them - it looks to me like this is only the CPU making stuff visible in
> > the shared buffer but I don't know that code... intel-gfx CCed although
> > dri-devel is already on CC.
> 
> We use it both ways in i915. So please don't break it.

Actually to clarify a bit, I think we shouldn't actually need the
invalidate part for any modern platform with shared LLC. Apart from the
display those tend to be fully coherent, so we only have to care about
the display controller not seeing stale data in memory.

I have no idea which platforms support this instruction. If
Baytrail/Braswell/Cherrytrail are on the list, then we have a problem.
Otherwise we should probably be fine with it. But I'm not 100% sure
about any future platforms that are still under wraps.

Also ttm seems to use some of this stuff. Not sure what they expect
from it.

-- 
Ville Syrjälä
Intel OTC

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/6] x86: Add support for the pcommit instruction
  2014-11-13  3:25   ` Andy Lutomirski
@ 2014-11-14 21:07     ` Ross Zwisler
  2014-11-14 21:09       ` Andy Lutomirski
  0 siblings, 1 reply; 29+ messages in thread
From: Ross Zwisler @ 2014-11-14 21:07 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: linux-kernel, H Peter Anvin, Ingo Molnar, Thomas Gleixner,
	David Airlie, dri-devel, x86

On Wed, 2014-11-12 at 19:25 -0800, Andy Lutomirski wrote:
> On 11/11/2014 10:43 AM, Ross Zwisler wrote:
> > Add support for the new pcommit instruction.  This instruction was
> > announced in the document "Intel Architecture Instruction Set Extensions
> > Programming Reference" with reference number 319433-022.
> > 
> > https://software.intel.com/sites/default/files/managed/0d/53/319433-022.pdf
> > 
> > Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
> > Cc: H Peter Anvin <h.peter.anvin@intel.com>
> > Cc: Ingo Molnar <mingo@kernel.org>
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Cc: David Airlie <airlied@linux.ie>
> > Cc: dri-devel@lists.freedesktop.org
> > Cc: x86@kernel.org
> > ---
> >  arch/x86/include/asm/cpufeature.h    | 1 +
> >  arch/x86/include/asm/special_insns.h | 6 ++++++
> >  2 files changed, 7 insertions(+)
> > 
> > diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
> > index 0bb1335..b3e6b89 100644
> > --- a/arch/x86/include/asm/cpufeature.h
> > +++ b/arch/x86/include/asm/cpufeature.h
> > @@ -225,6 +225,7 @@
> >  #define X86_FEATURE_RDSEED	( 9*32+18) /* The RDSEED instruction */
> >  #define X86_FEATURE_ADX		( 9*32+19) /* The ADCX and ADOX instructions */
> >  #define X86_FEATURE_SMAP	( 9*32+20) /* Supervisor Mode Access Prevention */
> > +#define X86_FEATURE_PCOMMIT	( 9*32+22) /* PCOMMIT instruction */
> >  #define X86_FEATURE_CLFLUSHOPT	( 9*32+23) /* CLFLUSHOPT instruction */
> >  #define X86_FEATURE_AVX512PF	( 9*32+26) /* AVX-512 Prefetch */
> >  #define X86_FEATURE_AVX512ER	( 9*32+27) /* AVX-512 Exponential and Reciprocal */
> > diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
> > index e820c08..1709a2e 100644
> > --- a/arch/x86/include/asm/special_insns.h
> > +++ b/arch/x86/include/asm/special_insns.h
> > @@ -199,6 +199,12 @@ static inline void clflushopt(volatile void *__p)
> >  		       "+m" (*(volatile char __force *)__p));
> >  }
> >  
> > +static inline void pcommit(void)
> > +{
> > +	alternative(ASM_NOP4, ".byte 0x66, 0x0f, 0xae, 0xf8",
> > +		    X86_FEATURE_PCOMMIT);
> > +}
> > +
> 
> Should this patch add the feature bit and cpuinfo entry to go with it?
> 
> --Andy

I think this patch does everything we need?  The text for cpuinfo is
auto-generated in arch/x86/kernel/cpu/capflags.c from the flags defined
in arch/x86/include/asm/cpufeature.h, I think.  Here's what I get in
cpuinfo on my system with a faked-out CPUID saying that clwb and pcommit
are present:

$ grep 'flags' /proc/cpuinfo 
flags		: fpu <snip> erms pcommit clflushopt clwb xsaveopt

The X86_FEATURE_CLWB and X86_FEATURE_PCOMMIT flags are being set up
according to what's in CPUID, and the proper alternatives are being
triggered.  I stuck some debug code in the alternatives code to see what
was being patched in the presence and absence of each of the flags.

Is there something else I'm missing?

Thanks,
- Ross


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/6] x86: Add support for the pcommit instruction
  2014-11-14 21:07     ` Ross Zwisler
@ 2014-11-14 21:09       ` Andy Lutomirski
  0 siblings, 0 replies; 29+ messages in thread
From: Andy Lutomirski @ 2014-11-14 21:09 UTC (permalink / raw)
  To: Ross Zwisler
  Cc: linux-kernel, H Peter Anvin, Ingo Molnar, Thomas Gleixner,
	David Airlie, DRI, X86 ML

On Fri, Nov 14, 2014 at 1:07 PM, Ross Zwisler
<ross.zwisler@linux.intel.com> wrote:
> On Wed, 2014-11-12 at 19:25 -0800, Andy Lutomirski wrote:
>> On 11/11/2014 10:43 AM, Ross Zwisler wrote:
>> > Add support for the new pcommit instruction.  This instruction was
>> > announced in the document "Intel Architecture Instruction Set Extensions
>> > Programming Reference" with reference number 319433-022.
>> >
>> > https://software.intel.com/sites/default/files/managed/0d/53/319433-022.pdf
>> >
>> > Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
>> > Cc: H Peter Anvin <h.peter.anvin@intel.com>
>> > Cc: Ingo Molnar <mingo@kernel.org>
>> > Cc: Thomas Gleixner <tglx@linutronix.de>
>> > Cc: David Airlie <airlied@linux.ie>
>> > Cc: dri-devel@lists.freedesktop.org
>> > Cc: x86@kernel.org
>> > ---
>> >  arch/x86/include/asm/cpufeature.h    | 1 +
>> >  arch/x86/include/asm/special_insns.h | 6 ++++++
>> >  2 files changed, 7 insertions(+)
>> >
>> > diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
>> > index 0bb1335..b3e6b89 100644
>> > --- a/arch/x86/include/asm/cpufeature.h
>> > +++ b/arch/x86/include/asm/cpufeature.h
>> > @@ -225,6 +225,7 @@
>> >  #define X86_FEATURE_RDSEED ( 9*32+18) /* The RDSEED instruction */
>> >  #define X86_FEATURE_ADX            ( 9*32+19) /* The ADCX and ADOX instructions */
>> >  #define X86_FEATURE_SMAP   ( 9*32+20) /* Supervisor Mode Access Prevention */
>> > +#define X86_FEATURE_PCOMMIT        ( 9*32+22) /* PCOMMIT instruction */
>> >  #define X86_FEATURE_CLFLUSHOPT     ( 9*32+23) /* CLFLUSHOPT instruction */
>> >  #define X86_FEATURE_AVX512PF       ( 9*32+26) /* AVX-512 Prefetch */
>> >  #define X86_FEATURE_AVX512ER       ( 9*32+27) /* AVX-512 Exponential and Reciprocal */
>> > diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
>> > index e820c08..1709a2e 100644
>> > --- a/arch/x86/include/asm/special_insns.h
>> > +++ b/arch/x86/include/asm/special_insns.h
>> > @@ -199,6 +199,12 @@ static inline void clflushopt(volatile void *__p)
>> >                    "+m" (*(volatile char __force *)__p));
>> >  }
>> >
>> > +static inline void pcommit(void)
>> > +{
>> > +   alternative(ASM_NOP4, ".byte 0x66, 0x0f, 0xae, 0xf8",
>> > +               X86_FEATURE_PCOMMIT);
>> > +}
>> > +
>>
>> Should this patch add the feature bit and cpuinfo entry to go with it?
>>
>> --Andy
>
> I think this patch does everything we need?  The text for cpuinfo is
> auto-generated in arch/x86/kernel/cpu/capflags.c from the flags defined
> in arch/x86/include/asm/cpufeature.h, I think.  Here's what I get in
> cpuinfo on my system with a faked-out CPUID saying that clwb and pcommit
> are present:
>
> $ grep 'flags' /proc/cpuinfo
> flags           : fpu <snip> erms pcommit clflushopt clwb xsaveopt
>
> The X86_FEATURE_CLWB and X86_FEATURE_PCOMMIT flags are being set up
> according to what's in CPUID, and the proper alternatives are being
> triggered.  I stuck some debug code in the alternatives code to see what
> was being patched in the presence and absence of each of the flags.
>
> Is there something else I'm missing?

No.  I just missed the magical auto-generation part.

--Andy

>
> Thanks,
> - Ross
>



-- 
Andy Lutomirski
AMA Capital Management, LLC

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2014-11-14 21:09 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-11-11 18:43 [PATCH 0/6] add support for new persistent memory instructions Ross Zwisler
2014-11-11 18:43 ` [PATCH 1/6] x86: Add support for the pcommit instruction Ross Zwisler
2014-11-13  3:25   ` Andy Lutomirski
2014-11-14 21:07     ` Ross Zwisler
2014-11-14 21:09       ` Andy Lutomirski
2014-11-11 18:43 ` [PATCH 2/6] x86/alternative: Add alternative_io_2 Ross Zwisler
2014-11-11 18:43 ` [PATCH 3/6] x86: Add support for the clwb instruction Ross Zwisler
2014-11-11 19:12   ` Borislav Petkov
2014-11-11 19:19     ` Borislav Petkov
2014-11-11 19:40       ` Ross Zwisler
2014-11-11 19:46         ` Borislav Petkov
2014-11-11 19:54           ` Ross Zwisler
2014-11-11 19:48     ` Ross Zwisler
2014-11-11 19:54       ` Borislav Petkov
2014-11-12 12:39   ` Borislav Petkov
2014-11-12 13:38     ` Anvin, H Peter
2014-11-12 14:12       ` Borislav Petkov
2014-11-13 17:47         ` Ross Zwisler
2014-11-11 18:43 ` [PATCH 4/6] x86: Use clwb in clflush_cache_range Ross Zwisler
2014-11-11 18:43 ` [PATCH 5/6] x86: Use clwb in drm_clflush_page Ross Zwisler
2014-11-11 18:43 ` [PATCH 6/6] x86: Use clwb in drm_clflush_virt_range Ross Zwisler
2014-11-13  3:14   ` Andy Lutomirski
2014-11-13 11:20     ` Borislav Petkov
2014-11-13 16:38       ` Andy Lutomirski
2014-11-13 17:11         ` Borislav Petkov
2014-11-13 17:33           ` Ville Syrjälä
2014-11-13 17:47             ` Borislav Petkov
2014-11-13 18:14               ` Ville Syrjälä
2014-11-13 18:43             ` Ville Syrjälä

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).