From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753229AbbDCFLi (ORCPT ); Fri, 3 Apr 2015 01:11:38 -0400 Received: from terminus.zytor.com ([198.137.202.10]:43128 "EHLO terminus.zytor.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752397AbbDCFLd (ORCPT ); Fri, 3 Apr 2015 01:11:33 -0400 Date: Thu, 2 Apr 2015 22:10:58 -0700 From: tip-bot for Ross Zwisler Message-ID: Cc: luto@amacapital.net, ross.zwisler@linux.intel.com, linux-kernel@vger.kernel.org, bp@alien8.de, brgerst@gmail.com, bp@suse.de, mingo@kernel.org, hpa@linux.intel.com, hpa@zytor.com, torvalds@linux-foundation.org, dvlasenk@redhat.com, tglx@linutronix.de Reply-To: mingo@kernel.org, hpa@zytor.com, hpa@linux.intel.com, tglx@linutronix.de, torvalds@linux-foundation.org, dvlasenk@redhat.com, luto@amacapital.net, linux-kernel@vger.kernel.org, ross.zwisler@linux.intel.com, brgerst@gmail.com, bp@alien8.de, bp@suse.de In-Reply-To: <1422377631-8986-3-git-send-email-ross.zwisler@linux.intel.com> References: <1422377631-8986-3-git-send-email-ross.zwisler@linux.intel.com> To: linux-tip-commits@vger.kernel.org Subject: [tip:x86/asm] x86/asm: Add support for the CLWB instruction Git-Commit-ID: d9dc64f30abe42f71bc7e9eb9d38c41006cf39f9 X-Mailer: tip-git-log-daemon Robot-ID: Robot-Unsubscribe: Contact to get blacklisted from these emails MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Content-Type: text/plain; charset=UTF-8 Content-Disposition: inline Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Commit-ID: d9dc64f30abe42f71bc7e9eb9d38c41006cf39f9 Gitweb: http://git.kernel.org/tip/d9dc64f30abe42f71bc7e9eb9d38c41006cf39f9 Author: Ross Zwisler AuthorDate: Tue, 27 Jan 2015 09:53:51 -0700 Committer: Ingo Molnar CommitDate: Fri, 3 Apr 2015 06:56:38 +0200 x86/asm: Add support for the CLWB instruction Add support for the new CLWB (cache line write back) instruction. This instruction was announced in the document "Intel Architecture Instruction Set Extensions Programming Reference" with reference number 319433-022. https://software.intel.com/sites/default/files/managed/0d/53/319433-022.pdf The CLWB instruction is used to write back the contents of dirtied cache lines to memory without evicting the cache lines from the processor's cache hierarchy. This should be used in favor of clflushopt or clflush in cases where you require the cache line to be written to memory but plan to access the data again in the near future. One of the main use cases for this is with persistent memory where CLWB can be used with PCOMMIT to ensure that data has been accepted to memory and is durable on the DIMM. This function shows how to properly use CLWB/CLFLUSHOPT/CLFLUSH and PCOMMIT with appropriate fencing: void flush_and_commit_buffer(void *vaddr, unsigned int size) { void *vend = vaddr + size - 1; for (; vaddr < vend; vaddr += boot_cpu_data.x86_clflush_size) clwb(vaddr); /* Flush any possible final partial cacheline */ clwb(vend); /* * Use SFENCE to order CLWB/CLFLUSHOPT/CLFLUSH cache flushes. * (MFENCE via mb() also works) */ wmb(); /* PCOMMIT and the required SFENCE for ordering */ pcommit_sfence(); } After this function completes the data pointed to by vaddr is has been accepted to memory and will be durable if the vaddr points to persistent memory. Regarding the details of how the alternatives assembly is set up, we need one additional byte at the beginning of the CLFLUSH so that we can flip it into a CLFLUSHOPT by changing that byte into a 0x66 prefix. Two options are to either insert a 1 byte ASM_NOP1, or to add a 1 byte NOP_DS_PREFIX. Both have no functional effect with the plain CLFLUSH, but I've been told that executing a CLFLUSH + prefix should be faster than executing a CLFLUSH + NOP. We had to hard code the assembly for CLWB because, lacking the ability to assemble the CLWB instruction itself, the next closest thing is to have an xsaveopt instruction with a 0x66 prefix. Unfortunately XSAVEOPT itself is also relatively new, and isn't included by all the GCC versions that the kernel needs to support. Signed-off-by: Ross Zwisler Acked-by: Borislav Petkov Acked-by: H. Peter Anvin Cc: Andy Lutomirski Cc: Borislav Petkov Cc: Brian Gerst Cc: Denys Vlasenko Cc: H. Peter Anvin Cc: Linus Torvalds Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/1422377631-8986-3-git-send-email-ross.zwisler@linux.intel.com Signed-off-by: Ingo Molnar --- arch/x86/include/asm/cpufeature.h | 1 + arch/x86/include/asm/special_insns.h | 14 ++++++++++++++ 2 files changed, 15 insertions(+) diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h index 0f7a5a1..854c04b 100644 --- a/arch/x86/include/asm/cpufeature.h +++ b/arch/x86/include/asm/cpufeature.h @@ -233,6 +233,7 @@ #define X86_FEATURE_SMAP ( 9*32+20) /* Supervisor Mode Access Prevention */ #define X86_FEATURE_PCOMMIT ( 9*32+22) /* PCOMMIT instruction */ #define X86_FEATURE_CLFLUSHOPT ( 9*32+23) /* CLFLUSHOPT instruction */ +#define X86_FEATURE_CLWB ( 9*32+24) /* CLWB instruction */ #define X86_FEATURE_AVX512PF ( 9*32+26) /* AVX-512 Prefetch */ #define X86_FEATURE_AVX512ER ( 9*32+27) /* AVX-512 Exponential and Reciprocal */ #define X86_FEATURE_AVX512CD ( 9*32+28) /* AVX-512 Conflict Detection */ diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h index 2ec1a53..aeb4666e 100644 --- a/arch/x86/include/asm/special_insns.h +++ b/arch/x86/include/asm/special_insns.h @@ -201,6 +201,20 @@ static inline void clflushopt(volatile void *__p) "+m" (*(volatile char __force *)__p)); } +static inline void clwb(volatile void *__p) +{ + volatile struct { char x[64]; } *p = __p; + + asm volatile(ALTERNATIVE_2( + ".byte " __stringify(NOP_DS_PREFIX) "; clflush (%[pax])", + ".byte 0x66; clflush (%[pax])", /* clflushopt (%%rax) */ + X86_FEATURE_CLFLUSHOPT, + ".byte 0x66, 0x0f, 0xae, 0x30", /* clwb (%%rax) */ + X86_FEATURE_CLWB) + : [p] "+m" (*p) + : [pax] "a" (p)); +} + static inline void pcommit_sfence(void) { alternative(ASM_NOP7,