From: Dave Jiang <>
Subject: [PATCH v5 1/9] x86/asm: add iosubmit_cmds512() based on MOVDIR64B CPU instruction
Date: Tue, 21 Jan 2020 16:43:41 -0700
Message-ID: <>
In-Reply-To: <>

With the introduction of the MOVDIR64B instruction, there is now an
instruction that can write 64 bytes of data atomically.

Quoting from Intel SDM:
"There is no atomicity guarantee provided for the 64-byte load operation
from source address, and processor implementations may use multiple
load operations to read the 64-bytes. The 64-byte direct-store issued
by MOVDIR64B guarantees 64-byte write-completion atomicity. This means
that the data arrives at the destination in a single undivided 64-byte
write transaction."

We have identified at least 3 different use cases for this instruction, in
the form func(dst, src, count):
1) Clear poison / Initialize MKTME memory
   @dst is normal memory.
   @src is normal memory. Does not increment. (The same line is copied to
   all destinations.)
   @count is the number of lines to clear/init.
2) Submit command(s) to new devices
   @dst is a special MMIO region for a device. Does not increment.
   @src is normal memory. Increments.
   @count is usually 1, but can be more.
3) Copy to iomem in big chunks
   @dst is iomem. Increments.
   @src is normal memory. Increments.
   @count is the number of chunks to copy.
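
As a rough userspace illustration (not part of this patch), the three cases
differ only in which pointer advances between 64-byte stores. In the sketch
below, store64(), copy64() and the three wrappers are hypothetical names, and
memcpy() stands in for MOVDIR64B, so it provides none of the atomicity
described above:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LINE 64

/* Stand-in for one MOVDIR64B store; memcpy() is NOT atomic. */
static void store64(void *dst, const void *src)
{
	memcpy(dst, src, LINE);
}

/*
 * Hypothetical generic form: each step is either 0 (the pointer does not
 * increment) or LINE (the pointer increments after every store).
 */
static void copy64(void *dst, const void *src, size_t count,
		   size_t dst_step, size_t src_step)
{
	uint8_t *d = dst;
	const uint8_t *s = src;

	assert(dst_step == 0 || dst_step == LINE);
	assert(src_step == 0 || src_step == LINE);

	while (count--) {
		store64(d, s);
		d += dst_step;
		s += src_step;
	}
}

/* Case 1: the same source line is written to every destination line. */
static void fill_lines(void *dst, const void *src, size_t count)
{
	copy64(dst, src, count, LINE, 0);
}

/* Case 2: successive source lines are written to one fixed destination. */
static void submit_lines(void *dst, const void *src, size_t count)
{
	copy64(dst, src, count, 0, LINE);
}

/* Case 3: both pointers advance, as in a chunked copy. */
static void copy_lines(void *dst, const void *src, size_t count)
{
	copy64(dst, src, count, LINE, LINE);
}
```

Only the second pattern corresponds to what this patch implements.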

Add support for case #2 so that devices which accept commands via this
instruction can be driven. A @count is provided in order to submit a batch
of preprogrammed descriptors held in virtually contiguous memory. This
allows the caller to submit multiple descriptors to a device with a single
call. The special device requires each 64-byte descriptor to be written
atomically and accepts the MOVDIR64B instruction for that purpose.
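
To make the batching concrete, here is a small userspace model, offered only
as a sketch. struct demo_desc, struct demo_portal and demo_submit() are
made-up names, the descriptor layout is illustrative rather than any real
device format, and memcpy() replaces the atomic 64-byte store:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 64-byte descriptor; the fields are illustrative only. */
struct demo_desc {
	uint32_t opcode;
	uint32_t flags;
	uint64_t src_addr;
	uint64_t dst_addr;
	uint32_t len;
	uint8_t  reserved[36];
};

/* Each descriptor must fill exactly one 64-byte write. */
static_assert(sizeof(struct demo_desc) == 64, "descriptor must be 64 bytes");

/* Mock device portal: remembers the last descriptor and counts writes. */
struct demo_portal {
	struct demo_desc last;
	unsigned int writes;
};

/* Write @count contiguous descriptors to the same portal location. */
static void demo_submit(struct demo_portal *portal,
			const struct demo_desc *descs, size_t count)
{
	size_t i;

	for (i = 0; i < count; i++) {
		/* Stand-in for one MOVDIR64B store; not atomic. */
		memcpy(&portal->last, &descs[i], sizeof(descs[i]));
		portal->writes++;
	}
}

/* Build a batch of four descriptors and submit them with one call. */
static unsigned int demo_batch(void)
{
	struct demo_desc batch[4];
	struct demo_portal portal;
	size_t i;

	memset(batch, 0, sizeof(batch));
	memset(&portal, 0, sizeof(portal));
	for (i = 0; i < 4; i++)
		batch[i].opcode = (uint32_t)i + 1;

	demo_submit(&portal, batch, 4);
	return portal.writes;
}
```

The caller makes one call, while the portal still sees one 64-byte write per
descriptor.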

Signed-off-by: Dave Jiang <>
Acked-by: Borislav Petkov <>
---
 arch/x86/include/asm/io.h |   36 ++++++++++++++++++++++++++++++++++++
 1 file changed, 36 insertions(+)

diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h
index 9997521fc5cd..e1aa17a468a8 100644
--- a/arch/x86/include/asm/io.h
+++ b/arch/x86/include/asm/io.h
@@ -399,4 +399,40 @@ extern bool arch_memremap_can_ram_remap(resource_size_t offset,
 extern bool phys_mem_access_encrypted(unsigned long phys_addr,
 				      unsigned long size);
 
+/**
+ * iosubmit_cmds512 - copy data to single MMIO location, in 512-bit units
+ * @__dst: destination, in MMIO space (must be 512-bit aligned)
+ * @src: source
+ * @count: number of 512-bit quantities to submit
+ *
+ * Submit data from kernel space to MMIO space, in units of 512 bits at a
+ * time.  Order of access is not guaranteed, nor is a memory barrier
+ * performed afterwards.
+ *
+ * Warning: Do not use this helper unless your driver has checked that the CPU
+ * instruction is supported on the platform.
+ */
+static inline void iosubmit_cmds512(void __iomem *__dst, const void *src,
+				    size_t count)
+{
+	/*
+	 * Note that this isn't an "on-stack copy", just definition of "dst"
+	 * as a pointer to 64-bytes of stuff that is going to be overwritten.
+	 * In the MOVDIR64B case that may be needed as you can use the
+	 * MOVDIR64B instruction to copy arbitrary memory around. This trick
+	 * lets the compiler know how much gets clobbered.
+	 */
+	volatile struct { char _[64]; } *dst = __dst;
+	const u8 *from = src;
+	const u8 *end = from + count * 64;
+
+	while (from < end) {
+		/* MOVDIR64B [rdx], rax */
+		asm volatile(".byte 0x66, 0x0f, 0x38, 0xf8, 0x02"
+			     : "=m" (*dst)
+			     : "d" (from), "a" (dst));
+		from += 64;
+	}
+}
+
 #endif /* _ASM_X86_IO_H */
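
The kernel-doc warning above asks callers to confirm CPU support first. As a
hedged userspace sketch (not part of this patch), MOVDIR64B is enumerated by
CPUID.(EAX=07H,ECX=0):ECX bit 28. The helper name cpu_has_movdir64b() is made
up, and <cpuid.h> is the GCC/Clang header, so this builds on x86 only. Kernel
code would more likely rely on the kernel's own feature tests, for example
boot_cpu_has(X86_FEATURE_MOVDIR64B), instead of raw CPUID:

```c
#include <cpuid.h>	/* GCC/Clang CPUID helpers, x86 only */
#include <stdbool.h>

/* CPUID.(EAX=07H, ECX=0):ECX bit 28 reports MOVDIR64B support. */
#define CPUID7_ECX_MOVDIR64B (1u << 28)

/* Hypothetical helper: true if the running CPU enumerates MOVDIR64B. */
static bool cpu_has_movdir64b(void)
{
	unsigned int eax, ebx, ecx, edx;

	/* Returns 0 when leaf 7 is not available on this CPU. */
	if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
		return false;

	return (ecx & CPUID7_ECX_MOVDIR64B) != 0;
}
```

A caller would skip the MOVDIR64B submission path whenever this check fails.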

Thread overview: 11+ messages
2020-01-21 23:43 [PATCH v5 0/9] idxd driver for Intel Data Streaming Accelerator Dave Jiang
2020-01-21 23:43 ` Dave Jiang [this message]
2020-01-21 23:43 ` [PATCH v5 2/9] dmaengine: break out channel registration Dave Jiang
2020-01-21 23:43 ` [PATCH v5 3/9] dmaengine: add support to dynamic register/unregister of channels Dave Jiang
2020-01-21 23:43 ` [PATCH v5 4/9] dmaengine: idxd: Init and probe for Intel data accelerators Dave Jiang
2020-01-21 23:44 ` [PATCH v5 5/9] dmaengine: idxd: add configuration component of driver Dave Jiang
2020-01-21 23:44 ` [PATCH v5 6/9] dmaengine: idxd: add sysfs ABI for idxd driver Dave Jiang
2020-01-21 23:44 ` [PATCH v5 7/9] dmaengine: idxd: add descriptor manipulation routines Dave Jiang
2020-01-21 23:44 ` [PATCH v5 8/9] dmaengine: idxd: connect idxd to dmaengine subsystem Dave Jiang
2020-01-21 23:44 ` [PATCH v5 9/9] dmaengine: idxd: add char driver to expose submission portal to userland Dave Jiang
2020-01-24  5:49 ` [PATCH v5 0/9] idxd driver for Intel Data Streaming Accelerator Vinod Koul
