Subject: [PATCH v2 0/2] add support for new persistent memory instructions
From: Ross Zwisler
Date: 2015-01-23 20:40 UTC
To: linux-kernel
Cc: Ross Zwisler, H Peter Anvin, Ingo Molnar, Thomas Gleixner, Borislav Petkov

This patch set adds support for two new persistent memory instructions,
pcommit and clwb.  These instructions were announced in the document
"Intel Architecture Instruction Set Extensions Programming Reference"
with reference number 319433-022.

https://software.intel.com/sites/default/files/managed/0d/53/319433-022.pdf

These patches apply cleanly to v3.19-rc5.

Changes from v1:
 - This series no longer updates the DRM code or clflush_cache_range to
   use clwb instead of clflushopt, as there was concern over whether
   clflushopt was used in some cases to explicitly evict lines from the
   processor cache.  I'll leave it up to the owners of this code to
   decide how they want to use clwb.

 - Reworked the clwb patch so that it doesn't use xsaveopt, as xsaveopt
   wasn't included in all GCC versions that the kernel needs to support.
   The assembly is now sort of complex because we need to hard code the
   clwb instruction to use a known register, but we also need to pull in
   the parameter as a memory constraint so that gcc doesn't reorder this
   instruction around other accesses to the same memory location.  Many
   thanks to hpa and Boris Petkov for their help on getting this right.

Cc: H Peter Anvin <h.peter.anvin@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>

Ross Zwisler (2):
  x86: Add support for the pcommit instruction
  x86: Add support for the clwb instruction

 arch/x86/include/asm/cpufeature.h    |  2 ++
 arch/x86/include/asm/special_insns.h | 20 ++++++++++++++++++++
 2 files changed, 22 insertions(+)

-- 
1.9.3
Subject: [PATCH v2 1/2] x86: Add support for the pcommit instruction
From: Ross Zwisler
Date: 2015-01-23 20:40 UTC
To: linux-kernel
Cc: Ross Zwisler, H Peter Anvin, Ingo Molnar, Thomas Gleixner, Borislav Petkov

Add support for the new pcommit instruction.  This instruction was
announced in the document "Intel Architecture Instruction Set
Extensions Programming Reference" with reference number 319433-022.

https://software.intel.com/sites/default/files/managed/0d/53/319433-022.pdf

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: H Peter Anvin <h.peter.anvin@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
---
 arch/x86/include/asm/cpufeature.h    | 1 +
 arch/x86/include/asm/special_insns.h | 6 ++++++
 2 files changed, 7 insertions(+)

diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index bb9b258..dfdd689 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -220,6 +220,7 @@
 #define X86_FEATURE_RDSEED	( 9*32+18) /* The RDSEED instruction */
 #define X86_FEATURE_ADX		( 9*32+19) /* The ADCX and ADOX instructions */
 #define X86_FEATURE_SMAP	( 9*32+20) /* Supervisor Mode Access Prevention */
+#define X86_FEATURE_PCOMMIT	( 9*32+22) /* PCOMMIT instruction */
 #define X86_FEATURE_CLFLUSHOPT	( 9*32+23) /* CLFLUSHOPT instruction */
 #define X86_FEATURE_AVX512PF	( 9*32+26) /* AVX-512 Prefetch */
 #define X86_FEATURE_AVX512ER	( 9*32+27) /* AVX-512 Exponential and Reciprocal */
diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
index e820c08..1709a2e 100644
--- a/arch/x86/include/asm/special_insns.h
+++ b/arch/x86/include/asm/special_insns.h
@@ -199,6 +199,12 @@ static inline void clflushopt(volatile void *__p)
 		       "+m" (*(volatile char __force *)__p));
 }
 
+static inline void pcommit(void)
+{
+	alternative(ASM_NOP4, ".byte 0x66, 0x0f, 0xae, 0xf8",
+		    X86_FEATURE_PCOMMIT);
+}
+
 #define nop() asm volatile ("nop")

-- 
1.9.3
Subject: [PATCH v2 2/2] x86: Add support for the clwb instruction
From: Ross Zwisler
Date: 2015-01-23 20:40 UTC
To: linux-kernel
Cc: Ross Zwisler, H Peter Anvin, Ingo Molnar, Thomas Gleixner, Borislav Petkov

Add support for the new clwb instruction.  This instruction was
announced in the document "Intel Architecture Instruction Set
Extensions Programming Reference" with reference number 319433-022.

https://software.intel.com/sites/default/files/managed/0d/53/319433-022.pdf

Regarding the details of how the alternatives assembly is set up, we
need one additional byte at the beginning of the clflush so that we can
flip it into a clflushopt by changing that byte into a 0x66 prefix.
Two options are to either insert a 1 byte ASM_NOP1, or to add a 1 byte
NOP_DS_PREFIX.  Both have no functional effect with the plain clflush,
but I've been told that executing a clflush + prefix should be faster
than executing a clflush + NOP.

We had to hard code the assembly for clwb because, lacking the ability
to assemble the clwb instruction itself, the next closest thing is to
have an xsaveopt instruction with a 0x66 prefix.  Unfortunately
xsaveopt itself is also relatively new, and isn't included by all the
GCC versions that the kernel needs to support.

I tested this patch with the following versions of GCC:
gcc (GCC) 4.8.3 20140911 (Red Hat 4.8.3-7)
gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-55)

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: H Peter Anvin <h.peter.anvin@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
---
 arch/x86/include/asm/cpufeature.h    |  1 +
 arch/x86/include/asm/special_insns.h | 14 ++++++++++++++
 2 files changed, 15 insertions(+)

diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index dfdd689..dc91747 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -222,6 +222,7 @@
 #define X86_FEATURE_SMAP	( 9*32+20) /* Supervisor Mode Access Prevention */
 #define X86_FEATURE_PCOMMIT	( 9*32+22) /* PCOMMIT instruction */
 #define X86_FEATURE_CLFLUSHOPT	( 9*32+23) /* CLFLUSHOPT instruction */
+#define X86_FEATURE_CLWB	( 9*32+24) /* CLWB instruction */
 #define X86_FEATURE_AVX512PF	( 9*32+26) /* AVX-512 Prefetch */
 #define X86_FEATURE_AVX512ER	( 9*32+27) /* AVX-512 Exponential and Reciprocal */
 #define X86_FEATURE_AVX512CD	( 9*32+28) /* AVX-512 Conflict Detection */
diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
index 1709a2e..8883cbc 100644
--- a/arch/x86/include/asm/special_insns.h
+++ b/arch/x86/include/asm/special_insns.h
@@ -199,6 +199,20 @@ static inline void clflushopt(volatile void *__p)
 		       "+m" (*(volatile char __force *)__p));
 }
 
+static inline void clwb(volatile void *__p)
+{
+	volatile struct { char x[64]; } *p = __p;
+
+	asm volatile(ALTERNATIVE_2(
+		".byte " __stringify(NOP_DS_PREFIX) "; clflush (%[pax])",
+		".byte 0x66; clflush (%[pax])",	/* clflushopt (%%rax) */
+		X86_FEATURE_CLFLUSHOPT,
+		".byte 0x66, 0x0f, 0xae, 0x30",	/* clwb (%%rax) */
+		X86_FEATURE_CLWB)
+		: [p] "+m" (*p)
+		: [pax] "a" (p));
+}
+
 static inline void pcommit(void)
 {
 	alternative(ASM_NOP4, ".byte 0x66, 0x0f, 0xae, 0xf8",

-- 
1.9.3
Subject: Re: [PATCH v2 0/2] add support for new persistent memory instructions
From: H. Peter Anvin
Date: 2015-01-23 23:03 UTC
To: Ross Zwisler, linux-kernel
Cc: Ingo Molnar, Thomas Gleixner, Borislav Petkov

On 01/23/2015 12:40 PM, Ross Zwisler wrote:
> This patch set adds support for two new persistent memory instructions,
> pcommit and clwb.  These instructions were announced in the document
> "Intel Architecture Instruction Set Extensions Programming Reference"
> with reference number 319433-022.
>
> https://software.intel.com/sites/default/files/managed/0d/53/319433-022.pdf
>

Please explain in these patch descriptions what the instructions
actually do.

+	volatile struct { char x[64]; } *p = __p;
+
+	asm volatile(ALTERNATIVE_2(
+		".byte " __stringify(NOP_DS_PREFIX) "; clflush (%[pax])",
+		".byte 0x66; clflush (%[pax])",	/* clflushopt (%%rax) */
+		X86_FEATURE_CLFLUSHOPT,
+		".byte 0x66, 0x0f, 0xae, 0x30",	/* clwb (%%rax) */
+		X86_FEATURE_CLWB)
+		: [p] "+m" (*p)
+		: [pax] "a" (p));

For the specific case of CLWB, we can use an "m" input rather than a
"+m" output, simply because CLWB (or CLFLUSH* used as a standin for
CLWB) doesn't need to be ordered with respect to loads (whereas
CLFLUSH* do).

Now, one can argue that for performance reasons we should still use
"+m" in case we use the CLFLUSH* standin, to avoid flushing a cache
line to memory just to bring it back in.

+static inline void pcommit(void)
+{
+	alternative(ASM_NOP4, ".byte 0x66, 0x0f, 0xae, 0xf8",
+		    X86_FEATURE_PCOMMIT);
+}
+

Should we use an SFENCE as a standin if pcommit is unavailable, in case
we end up using CLFLUSHOPT?

	-hpa
Subject: Re: [PATCH v2 0/2] add support for new persistent memory instructions
From: Borislav Petkov
Date: 2015-01-24 11:14 UTC
To: H. Peter Anvin
Cc: Ross Zwisler, linux-kernel, Ingo Molnar, Thomas Gleixner

On Fri, Jan 23, 2015 at 03:03:41PM -0800, H. Peter Anvin wrote:
> For the specific case of CLWB, we can use an "m" input rather than a
> "+m" output, simply because CLWB (or CLFLUSH* used as a standin for CLWB
> doesn't need to be ordered with respect to loads (whereas CLFLUSH* do).

Well, we could do something like:

	volatile struct { char x[64]; } *p = __p;

	if (static_cpu_has(X86_FEATURE_CLWB))
		asm volatile(".byte 0x66,0x0f,0xae,0x30" :: "m" (*p), "a" (p));
	else
		asm volatile(ALTERNATIVE(
			".byte 0x3e; clflush (%[pax])",
			".byte 0x66; clflush (%[pax])",	/* clflushopt (%%rax) */
			X86_FEATURE_CLFLUSHOPT)
			: [p] "+m" (*p)
			: [pax] "a" (p));

which would simplify the alternative macro too.

Generated asm looks ok to me (my objdump doesn't know CLWB yet :)):

0000000000000aa0 <myclflush>:
 aa0:	55                   	push   %rbp
 aa1:	48 89 e5             	mov    %rsp,%rbp
 aa4:	eb 0a                	jmp    ab0 <myclflush+0x10>
 aa6:	48 89 f8             	mov    %rdi,%rax
 aa9:	66 0f ae 30          	data16 xsaveopt (%rax)
 aad:	5d                   	pop    %rbp
 aae:	c3                   	retq
 aaf:	90                   	nop
 ab0:	48 89 f8             	mov    %rdi,%rax
 ab3:	3e 0f ae 38          	clflush %ds:(%rax)
 ab7:	5d                   	pop    %rbp
 ab8:	c3                   	retq

> Should we use an SFENCE as a standin if pcommit is unavailable, in case
> we end up using CLFLUSHOPT?

Btw, is PCOMMIT a lightweight SFENCE for this persistent memory aspect
to make sure stuff has become persistent after executing it?  But not
all stuff like SFENCE, so SFENCE is the bigger hammer?

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
Subject: Re: [PATCH v2 0/2] add support for new persistent memory instructions
From: Ross Zwisler
Date: 2015-01-26 19:59 UTC
To: Borislav Petkov
Cc: H. Peter Anvin, linux-kernel, Ingo Molnar, Thomas Gleixner

On Sat, 2015-01-24 at 12:14 +0100, Borislav Petkov wrote:
> On Fri, Jan 23, 2015 at 03:03:41PM -0800, H. Peter Anvin wrote:
> > For the specific case of CLWB, we can use an "m" input rather than a
> > "+m" output, simply because CLWB (or CLFLUSH* used as a standin for CLWB
> > doesn't need to be ordered with respect to loads (whereas CLFLUSH* do).
>
> Well, we could do something like:
>
> 	volatile struct { char x[64]; } *p = __p;
>
> 	if (static_cpu_has(X86_FEATURE_CLWB))
> 		asm volatile(".byte 0x66,0x0f,0xae,0x30" :: "m" (*p), "a" (p));
> 	else
> 		asm volatile(ALTERNATIVE(
> 			".byte 0x3e; clflush (%[pax])",
> 			".byte 0x66; clflush (%[pax])",	/* clflushopt (%%rax) */
> 			X86_FEATURE_CLFLUSHOPT)
> 			: [p] "+m" (*p)
> 			: [pax] "a" (p));
>
> which would simplify the alternative macro too.

This is interesting!  I guess I'm confused as to how this solves the
ordering issue, though.  The "m" input vs "+m" output parameter will
tell gcc whether or not the assembly can be reordered at compile time
with respect to reads at that same location, correct?

So if we have an inline function that could either read or write from
gcc's point of view (input vs output parameter, depending on the
branch), it seems like it would be forced to fall back to the most
restrictive case (assume it will write), and not reorder with respect
to reads.  If so, you'd end up in the same place as using "+m" output,
only now you've got an additional branch instead of a 3-way
alternative.

Am I misunderstanding this?

> > Should we use an SFENCE as a standin if pcommit is unavailable, in case
> > we end up using CLFLUSHOPT?
>
> Btw, is PCOMMIT a lightweight SFENCE for this persistent memory aspect
> to make sure stuff has become persistent after executing it? But not all
> stuff like SFENCE so SFENCE is the bigger hammer?

Ah, yep, I definitely need to include an example flow in my commit
comments. :)  Here's a snip from my reply to hpa, to save searching:

Both the flushes (clwb/clflushopt/clflush) and the pcommit are ordered
by either mfence or sfence.

An example function that flushes and commits a buffer could look like
this (based on clflush_cache_range):

void flush_and_commit_buffer(void *vaddr, unsigned int size)
{
	void *vend = vaddr + size - 1;

	for (; vaddr < vend; vaddr += boot_cpu_data.x86_clflush_size)
		clwb(vaddr);

	/* Flush any possible final partial cacheline */
	clwb(vend);

	/*
	 * sfence to order clwb/clflushopt/clflush cache flushes
	 * mfence via mb() also works
	 */
	wmb();

	pcommit();

	/*
	 * sfence to order pcommit
	 * mfence via mb() also works
	 */
	wmb();
}
Subject: Re: [PATCH v2 0/2] add support for new persistent memory instructions
From: Borislav Petkov
Date: 2015-01-26 21:34 UTC
To: Ross Zwisler
Cc: H. Peter Anvin, linux-kernel, Ingo Molnar, Thomas Gleixner

On Mon, Jan 26, 2015 at 12:59:29PM -0700, Ross Zwisler wrote:
> This is interesting!  I guess I'm confused as to how this solves the
> ordering issue, though.  The "m" input vs "+m" output parameter will
> tell gcc whether or not the assembly can be reordered at compile time
> with respect to reads at that same location, correct?
>
> So if we have an inline function that could either read or write from
> gcc's point of view (input vs output parameter, depending on the
> branch), it seems like it would be forced to fall back to the most
> restrictive case (assume it will write), and not reorder with respect
> to reads.  If so, you'd end up in the same place as using "+m" output,
> only now you've got an additional branch instead of a 3-way
> alternative.
>
> Am I misunderstanding this?

No, you're not, that is the right question.  I was simply hypothesizing
about how we could do what hpa suggests but I don't have any other
ideas about having an "m" and an "+m" in the same inline asm statement.
My hunch is, the moment we have an "+m", the reordering would be
suppressed and that would not give us the CLWB case where we don't have
to suppress reordering wrt reads.

> An example function that flushes and commits a buffer could look like
> this (based on clflush_cache_range):
>
> void flush_and_commit_buffer(void *vaddr, unsigned int size)
> {
> 	void *vend = vaddr + size - 1;
>
> 	for (; vaddr < vend; vaddr += boot_cpu_data.x86_clflush_size)
> 		clwb(vaddr);
>
> 	/* Flush any possible final partial cacheline */
> 	clwb(vend);
>
> 	/*
> 	 * sfence to order clwb/clflushopt/clflush cache flushes
> 	 * mfence via mb() also works
> 	 */
> 	wmb();
>
> 	pcommit();

Oh, so you need an SFENCE to flush out the preceding in-flight writes
*and* PCOMMIT for the persistent memory ranges.  Ok, makes sense,
PCOMMIT deals with the persistent stores.

> 	/*
> 	 * sfence to order pcommit
> 	 * mfence via mb() also works
> 	 */
> 	wmb();

Doc says PCOMMIT is not ordered wrt loads, and SFENCE too.  Don't we
want to be absolutely conservative here and use MFENCE both times?

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
Subject: Re: [PATCH v2 0/2] add support for new persistent memory instructions
From: Ross Zwisler
Date: 2015-01-26 21:50 UTC
To: Borislav Petkov
Cc: H. Peter Anvin, linux-kernel, Ingo Molnar, Thomas Gleixner

On Mon, 2015-01-26 at 22:34 +0100, Borislav Petkov wrote:
> On Mon, Jan 26, 2015 at 12:59:29PM -0700, Ross Zwisler wrote:
> > /*
> >  * sfence to order pcommit
> >  * mfence via mb() also works
> >  */
> > wmb();
>
> Doc says PCOMMIT is not ordered wrt loads and SFENCE too. Don't we want
> to be absolutely conservative here and use MFENCE both times?

The code, for easy viewing:

void write_and_commit_buffer(void *vaddr, unsigned int size)
{
	void *vend = vaddr + size - 1;

	for (; vaddr < vend; vaddr += boot_cpu_data.x86_clflush_size)
		clwb(vaddr);

	/* Flush any possible final partial cacheline */
	clwb(vend);

	/*
	 * sfence to order clwb/clflushopt/clflush cache flushes
	 * mfence via mb() also works
	 */
	wmb();

	pcommit();

	/*
	 * sfence to order pcommit
	 * mfence via mb() also works
	 */
	wmb();
}

We can use MFENCE, but I don't think we need to.  With SFENCE we will
be ordered with respect to stores, and the flushes and pcommit will be
ordered with respect to one another.  I think you can sprinkle in loads
anywhere you want in that flow and everything will work.  The worst
that will happen is that if you've used clflush or clflushopt you'll
have to re-fetch something you just flushed out of the CPU cache
hierarchy, but you'll always get correct data from your load and you'll
always pcommit valid data to the DIMM.
Subject: Re: [PATCH v2 0/2] add support for new persistent memory instructions
From: Borislav Petkov
Date: 2015-01-26 22:39 UTC
To: Ross Zwisler
Cc: H. Peter Anvin, linux-kernel, Ingo Molnar, Thomas Gleixner

On Mon, Jan 26, 2015 at 02:50:07PM -0700, Ross Zwisler wrote:
> We can use MFENCE, but I don't think we need to.  With SFENCE we will
> be ordered with respect to stores, and the flushes and pcommit will be
> ordered with respect to one another.  I think you can sprinkle in loads
> anywhere you want in that flow and everything will work.

Ok, maybe we should hold down that sequence and this somewhere as a
this-is-how-you-should-do-pcommit-properly explanation or so... or do
you have an actual patch adding write_and_commit_buffer()?

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
Subject: Re: [PATCH v2 0/2] add support for new persistent memory instructions
From: Ross Zwisler
Date: 2015-01-26 23:14 UTC
To: Borislav Petkov
Cc: H. Peter Anvin, linux-kernel, Ingo Molnar, Thomas Gleixner

On Mon, 2015-01-26 at 23:39 +0100, Borislav Petkov wrote:
> On Mon, Jan 26, 2015 at 02:50:07PM -0700, Ross Zwisler wrote:
> > We can use MFENCE, but I don't think we need to.  With SFENCE we will
> > be ordered with respect to stores, and the flushes and pcommit will be
> > ordered with respect to one another.  I think you can sprinkle in loads
> > anywhere you want in that flow and everything will work.
>
> Ok, maybe we should hold down that sequence and this somewhere as a
> this-is-how-you-should-do-pcommit-properly explanation or so... or do
> you have an actual patch adding write_and_commit_buffer()?

I don't have a patch to add and use this yet.  I'll add the flow to the
commit messages for both pcommit and clwb.
Subject: Re: [PATCH v2 0/2] add support for new persistent memory instructions
From: Ross Zwisler
Date: 2015-01-26 19:51 UTC
To: H. Peter Anvin
Cc: linux-kernel, Ingo Molnar, Thomas Gleixner, Borislav Petkov

On Fri, 2015-01-23 at 15:03 -0800, H. Peter Anvin wrote:
> On 01/23/2015 12:40 PM, Ross Zwisler wrote:
> > This patch set adds support for two new persistent memory instructions,
> > pcommit and clwb.  These instructions were announced in the document
> > "Intel Architecture Instruction Set Extensions Programming Reference"
> > with reference number 319433-022.
> >
> > https://software.intel.com/sites/default/files/managed/0d/53/319433-022.pdf
> >
>
> Please explain in these patch descriptions what the instructions
> actually do.

Sure, will do.

> For the specific case of CLWB, we can use an "m" input rather than a
> "+m" output, simply because CLWB (or CLFLUSH* used as a standin for CLWB)
> doesn't need to be ordered with respect to loads (whereas CLFLUSH* do).
>
> Now, one can argue that for performance reasons we should still
> use "+m" in case we use the CLFLUSH* standin, to avoid flushing a cache
> line to memory just to bring it back in.

Understood, and an interesting point.  It seems like we can be correct
using either, yea?  I guess I'm happy with "+m" output since it's
consistent with clflush and clflushopt, and since we avoid the clflush*
then read issue.  Please let me know if you have a preference.

> +static inline void pcommit(void)
> +{
> +	alternative(ASM_NOP4, ".byte 0x66, 0x0f, 0xae, 0xf8",
> +		    X86_FEATURE_PCOMMIT);
> +}
> +
>
> Should we use an SFENCE as a standin if pcommit is unavailable, in case
> we end up using CLFLUSHOPT?

Ah, sorry, I really need to include an example flow in my patch
descriptions to make this more clear. :)

Both the flushes (clwb/clflushopt/clflush) and the pcommit are ordered
by either mfence or sfence.  An example function that flushes and
commits a buffer could look like this (based on clflush_cache_range):

void flush_and_commit_buffer(void *vaddr, unsigned int size)
{
	void *vend = vaddr + size - 1;

	for (; vaddr < vend; vaddr += boot_cpu_data.x86_clflush_size)
		clwb(vaddr);

	/* Flush any possible final partial cacheline */
	clwb(vend);

	/*
	 * sfence to order clwb/clflushopt/clflush cache flushes
	 * mfence via mb() also works
	 */
	wmb();

	pcommit();

	/*
	 * sfence to order pcommit
	 * mfence via mb() also works
	 */
	wmb();
}

In this example function I don't begin with a fence because clwb (which
may fall back to clflush/clflushopt) will be ordered with respect to
either writes or reads and writes, depending on whether the argument is
given as an input or output parameter.

If the platform doesn't support PCOMMIT, you end up with this:

void flush_and_commit_buffer(void *vaddr, unsigned int size)
{
	void *vend = vaddr + size - 1;

	for (; vaddr < vend; vaddr += boot_cpu_data.x86_clflush_size)
		clwb(vaddr);

	/* Flush any possible final partial cacheline */
	clwb(vend);

	/*
	 * sfence to order clwb/clflushopt/clflush cache flushes
	 * mfence via mb() also works
	 */
	wmb();

	nop();	/* from pcommit(), via alternatives */

	/*
	 * sfence to order pcommit
	 * mfence via mb() also works
	 */
	wmb();
}

This is fine, but now you've got two fences in a row.  Another slightly
more messy choice would be to include the fence in the pcommit
assembly, so you either get pcommit + sfence or a pair of NOPs.
Subject: Re: [PATCH v2 0/2] add support for new persistent memory instructions
From: H. Peter Anvin
Date: 2015-01-26 20:05 UTC
To: Ross Zwisler
Cc: linux-kernel, Ingo Molnar, Thomas Gleixner, Borislav Petkov

On 01/26/2015 11:51 AM, Ross Zwisler wrote:
> This is fine, but now you've got two fences in a row.  Another slightly
> more messy choice would be to include the fence in the pcommit
> assembly, so you either get pcommit + sfence or a pair of NOPs.

If that is the required usage pattern, it should be part of the clwb()
inline at least.

	-hpa