dmaengine.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v4 0/5] Add shared workqueue support for idxd driver
@ 2020-09-17 21:15 Dave Jiang
  2020-09-17 21:15 ` [PATCH v4 1/5] x86/asm: move the raw asm in iosubmit_cmds512() to special_insns.h Dave Jiang
                   ` (4 more replies)
  0 siblings, 5 replies; 13+ messages in thread
From: Dave Jiang @ 2020-09-17 21:15 UTC (permalink / raw)
  To: vkoul, tglx, mingo, bp, dan.j.williams, tony.luck, jing.lin,
	ashok.raj, sanjay.k.kumar, fenghua.yu, kevin.tian
  Cc: dmaengine, linux-kernel

v4:
- Rebased against latest dmaengine/next tree
- Split out enqcmd and pasid dependency.

V3:
- Rebased against latest dmaengine/next tree.
- Updated API doc with new kernel version and dates.
- Changed to allow driver to load without ENQCMD support.
- Break out some patches that can be sent ahead of this series for inclusion.

v2:
- Dropped device feature enabling (GregKH)
- Dropped PCI device feature enabling (Bjorn)
	- https://members.pcisig.com/wg/PCI-SIG/document/14237
- After some internal discussion, we have decided to hold off on the enabling of DMWR due to the
  following reasons. 1. Most first gen hw will not have the feature bits. 2. First gen hw that
  support the feature are all Root Complex integrated endpoints. 3. PCI devices that are not
  RCiEP’s with this capability won’t surface for a few years so we can wait until we can test the
  full code.
- Dropped special ioremap (hch)
- Added proper support for WQ flush (tony, dan)
- Changed descriptor submission to use sbitmap_queue for blocking. (dan)

Driver stage 1 postings for context: [1]

The patch series has compilation and functional dependency on Fenghua's "Tag application
address space for devices" patch series for the ENQCMD CPU command enumeration and the PASID MSR
support. [2] 

== Background ==
A typical DMA device requires the driver to translate application buffers to hardware addresses,
and a kernel-user transition to notify the hardware of new work. Shared Virtual Addressing (SVA)
allows the processor and device to use the same virtual addresses without requiring software to
translate between the address spaces. ENQCMD is a new instruction on Intel Platforms that allows
user applications to directly notify hardware of new work, much like how doorbells are used in
some hardware, but it carries a payload along with it. ENQCMDS is the supervisor version (ring0)
of ENQCMD.

== ENQCMDS ==
Introduce enqcmds(), a helper funciton that copies an input payload to a 64B aligned
destination and confirms whether the payload was accepted by the device or not.
enqcmds() wraps the new ENQCMDS CPU instruction. The ENQCMDS is a ring 0 CPU instruction that
performs similar to the ENQCMD instruction. Descriptor submission must use ENQCMD(S) for shared
workqueues (swq) on an Intel DSA device. 

== Shared WQ support ==
Introduce shared workqueue (swq) support for the idxd driver. The current idxd driver contains
dedicated workqueue (dwq) support only. A dwq accepts descriptors from a MOVDIR64B instruction.
MOVDIR64B is a posted instruction on the PCIe bus, it does not wait for any response from the
device. If the wq is full, submitted descriptors are dropped. A swq utilizes the ENQCMDS in
ring 0, which is a non-posted instruction. The zero flag would be set to 1 if the device rejects
the descriptor or if the wq is full. A swq can be shared between multiple users
(kernel or userspace) due to not having to keep track of the wq full condition for submission.
A swq requires PASID and can only run with SVA support. 

== IDXD SVA support ==
Add utilization of PASID to support Shared Virtual Addressing (SVA). With PASID support,
the descriptors can be programmed with host virtual address (HVA) rather than IOVA.
The hardware will work with the IOMMU in fulfilling page requests. With SVA support,
a user app using the char device interface can now submit descriptors without having to pin the
virtual memory range it wants to DMA in its own address space. 

The series does not add SVA support for the dmaengine subsystem. That support is coming at a
later time.

[1]: https://lore.kernel.org/lkml/157965011794.73301.15960052071729101309.stgit@djiang5-desk3.ch.intel.com/
[2]: https://lore.kernel.org/lkml/20200916080510.GA32552@8bytes.org/
[3]: https://software.intel.com/en-us/articles/intel-sdm
[4]: https://software.intel.com/en-us/download/intel-scalable-io-virtualization-technical-specification
[5]: https://software.intel.com/en-us/download/intel-data-streaming-accelerator-preliminary-architecture-specification
[6]: https://01.org/blogs/2019/introducing-intel-data-streaming-accelerator
[7]: https://intel.github.io/idxd/
[8]: https://github.com/intel/idxd-driver idxd-stage2

---

Dave Jiang (5):
      x86/asm: move the raw asm in iosubmit_cmds512() to special_insns.h
      x86/asm: add enqcmds() to support ENQCMDS instruction
      dmaengine: idxd: add shared workqueue support
      dmaengine: idxd: clean up descriptors with fault error
      dmaengine: idxd: add ABI documentation for shared wq


 Documentation/ABI/stable/sysfs-driver-dma-idxd |   14 ++
 arch/x86/include/asm/io.h                      |   46 +++++---
 arch/x86/include/asm/special_insns.h           |   17 +++
 drivers/dma/Kconfig                            |   10 ++
 drivers/dma/idxd/cdev.c                        |   49 ++++++++
 drivers/dma/idxd/device.c                      |   91 ++++++++++++++-
 drivers/dma/idxd/dma.c                         |    9 --
 drivers/dma/idxd/idxd.h                        |   33 +++++-
 drivers/dma/idxd/init.c                        |   92 ++++++++++++---
 drivers/dma/idxd/irq.c                         |  143 ++++++++++++++++++++++--
 drivers/dma/idxd/registers.h                   |   14 ++
 drivers/dma/idxd/submit.c                      |   33 +++++-
 drivers/dma/idxd/sysfs.c                       |  127 +++++++++++++++++++++
 13 files changed, 608 insertions(+), 70 deletions(-)

--

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v4 1/5] x86/asm: move the raw asm in iosubmit_cmds512() to special_insns.h
  2020-09-17 21:15 [PATCH v4 0/5] Add shared workqueue support for idxd driver Dave Jiang
@ 2020-09-17 21:15 ` Dave Jiang
  2020-09-23 10:41   ` Borislav Petkov
  2020-09-17 21:15 ` [PATCH v4 2/5] x86/asm: add enqcmds() to support ENQCMDS instruction Dave Jiang
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 13+ messages in thread
From: Dave Jiang @ 2020-09-17 21:15 UTC (permalink / raw)
  To: vkoul, tglx, mingo, bp, dan.j.williams, tony.luck, jing.lin,
	ashok.raj, sanjay.k.kumar, fenghua.yu, kevin.tian
  Cc: dmaengine, linux-kernel

The MOVDIR64B instruction can be used by other wrapper instructions. Move
the core asm code to special_insns.h and have iosubmit_cmds512() call the
core asm function.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
 arch/x86/include/asm/io.h            |   17 +++--------------
 arch/x86/include/asm/special_insns.h |   17 +++++++++++++++++
 2 files changed, 20 insertions(+), 14 deletions(-)

diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h
index e1aa17a468a8..d726459d08e5 100644
--- a/arch/x86/include/asm/io.h
+++ b/arch/x86/include/asm/io.h
@@ -401,7 +401,7 @@ extern bool phys_mem_access_encrypted(unsigned long phys_addr,
 
 /**
  * iosubmit_cmds512 - copy data to single MMIO location, in 512-bit units
- * @__dst: destination, in MMIO space (must be 512-bit aligned)
+ * @dst: destination, in MMIO space (must be 512-bit aligned)
  * @src: source
  * @count: number of 512 bits quantities to submit
  *
@@ -412,25 +412,14 @@ extern bool phys_mem_access_encrypted(unsigned long phys_addr,
  * Warning: Do not use this helper unless your driver has checked that the CPU
  * instruction is supported on the platform.
  */
-static inline void iosubmit_cmds512(void __iomem *__dst, const void *src,
+static inline void iosubmit_cmds512(void __iomem *dst, const void *src,
 				    size_t count)
 {
-	/*
-	 * Note that this isn't an "on-stack copy", just definition of "dst"
-	 * as a pointer to 64-bytes of stuff that is going to be overwritten.
-	 * In the MOVDIR64B case that may be needed as you can use the
-	 * MOVDIR64B instruction to copy arbitrary memory around. This trick
-	 * lets the compiler know how much gets clobbered.
-	 */
-	volatile struct { char _[64]; } *dst = __dst;
 	const u8 *from = src;
 	const u8 *end = from + count * 64;
 
 	while (from < end) {
-		/* MOVDIR64B [rdx], rax */
-		asm volatile(".byte 0x66, 0x0f, 0x38, 0xf8, 0x02"
-			     : "=m" (dst)
-			     : "d" (from), "a" (dst));
+		movdir64b(dst, from);
 		from += 64;
 	}
 }
diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
index 59a3e13204c3..7bc8e714f37e 100644
--- a/arch/x86/include/asm/special_insns.h
+++ b/arch/x86/include/asm/special_insns.h
@@ -234,6 +234,23 @@ static inline void clwb(volatile void *__p)
 
 #define nop() asm volatile ("nop")
 
+static inline void movdir64b(void *__dst, const void *src)
+{
+	/*
+	 * Note that this isn't an "on-stack copy", just definition of "dst"
+	 * as a pointer to 64-bytes of stuff that is going to be overwritten.
+	 * In the MOVDIR64B case that may be needed as you can use the
+	 * MOVDIR64B instruction to copy arbitrary memory around. This trick
+	 * lets the compiler know how much gets clobbered.
+	 */
+	volatile struct { char _[64]; } *dst = __dst;
+
+	/* MOVDIR64B [rdx], rax */
+	asm volatile(".byte 0x66, 0x0f, 0x38, 0xf8, 0x02"
+		     : "=m" (dst)
+		     : "d" (src), "a" (dst));
+}
+
 #endif /* __KERNEL__ */
 
 #endif /* _ASM_X86_SPECIAL_INSNS_H */


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v4 2/5] x86/asm: add enqcmds() to support ENQCMDS instruction
  2020-09-17 21:15 [PATCH v4 0/5] Add shared workqueue support for idxd driver Dave Jiang
  2020-09-17 21:15 ` [PATCH v4 1/5] x86/asm: move the raw asm in iosubmit_cmds512() to special_insns.h Dave Jiang
@ 2020-09-17 21:15 ` Dave Jiang
  2020-09-23 11:08   ` Borislav Petkov
  2020-09-17 21:15 ` [PATCH v4 4/5] dmaengine: idxd: clean up descriptors with fault error Dave Jiang
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 13+ messages in thread
From: Dave Jiang @ 2020-09-17 21:15 UTC (permalink / raw)
  To: vkoul, tglx, mingo, bp, dan.j.williams, tony.luck, jing.lin,
	ashok.raj, sanjay.k.kumar, fenghua.yu, kevin.tian
  Cc: dmaengine, linux-kernel

Currently, the MOVDIR64B instruction is used to atomically
submit 64-byte work descriptors to devices. Although it can
encounter errors like faults, MOVDIR64B can not report back on
errors from the device itself. This means that MOVDIR64B users
need to separately interact with a device to see if a descriptor
was successfully queued, which slows down device interactions.

ENQCMD and ENQCMDS also atomically submit 64-byte work
descriptors to devices. But, they *can* report back errors
directly from the device, such as if the device was busy, or
there was an error made in composing the descriptor. This
immediate feedback from the submission instruction itself
reduces the number of interactions with the device and can
greatly increase efficiency.

ENQCMD can be used at any privilege level, but can effectively
only submit work on behalf of the current process. ENQCMDS is a
ring0-only instruction and can explicitly specify a process
context instead of being tied to the current process or needing
to reprogram the IA32_PASID MSR.

Use ENQCMDS for work submission within the kernel because a
Process Address ID (PASID) is setup to translate the kernel
virtual address space. This PASID is provided to ENQCMDS from
the descriptor structure submitted to the device and not retrieved
from IA32_PASID MSR, which is setup for the current user address space.

See Intel Software Developer’s Manual for more information on the
instructions.

Add enqcmds() in x86 io.h instead of special_insns.h. MOVDIR64B
instruction can be used for other purposes. A wrapper was introduced
in io.h for its command submission usage. ENQCMDS has a single
purpose of submit 64-byte commands to supported devices and should
be called directly.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
 arch/x86/include/asm/io.h |   29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)

diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h
index d726459d08e5..b7af0bf8a018 100644
--- a/arch/x86/include/asm/io.h
+++ b/arch/x86/include/asm/io.h
@@ -424,4 +424,33 @@ static inline void iosubmit_cmds512(void __iomem *dst, const void *src,
 	}
 }
 
+/**
+ * enqcmds - copy a 512 bits data unit to single MMIO location
+ * @dst: destination, in MMIO space (must be 512-bit aligned)
+ * @src: source
+ *
+ * Submit data from kernel space to MMIO space, in a unit of 512 bits.
+ * Order of data access is not guaranteed, nor is a memory barrier
+ * performed afterwards. The command returns false (0) on failure, and true (1)
+ * on success.
+ *
+ * Warning: Do not use this helper unless your driver has checked that the CPU
+ * instruction is supported on the platform.
+ */
+static inline bool enqcmds(void __iomem *dst, const void *src)
+{
+	bool retry;
+
+	/* ENQCMDS [rdx], rax */
+	asm volatile(".byte 0xf3, 0x0f, 0x38, 0xf8, 0x02, 0x66, 0x90\t\n"
+		     CC_SET(z)
+		     : CC_OUT(z) (retry)
+		     : "a" (dst), "d" (src));
+	/* Submission failure is indicated via EFLAGS.ZF=1 */
+	if (retry)
+		return false;
+
+	return true;
+}
+
 #endif /* _ASM_X86_IO_H */


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v4 4/5] dmaengine: idxd: clean up descriptors with fault error
  2020-09-17 21:15 [PATCH v4 0/5] Add shared workqueue support for idxd driver Dave Jiang
  2020-09-17 21:15 ` [PATCH v4 1/5] x86/asm: move the raw asm in iosubmit_cmds512() to special_insns.h Dave Jiang
  2020-09-17 21:15 ` [PATCH v4 2/5] x86/asm: add enqcmds() to support ENQCMDS instruction Dave Jiang
@ 2020-09-17 21:15 ` Dave Jiang
  2020-09-17 21:15 ` [PATCH v4 5/5] dmaengine: idxd: add ABI documentation for shared wq Dave Jiang
  2020-09-17 23:43 ` [PATCH v4 0/5] Add shared workqueue support for idxd driver Randy Dunlap
  4 siblings, 0 replies; 13+ messages in thread
From: Dave Jiang @ 2020-09-17 21:15 UTC (permalink / raw)
  To: vkoul, tglx, mingo, bp, dan.j.williams, tony.luck, jing.lin,
	ashok.raj, sanjay.k.kumar, fenghua.yu, kevin.tian
  Cc: dmaengine, linux-kernel

Add code to "complete" a descriptor when the descriptor or its completion
address hit a fault error when SVA mode is being used. This error can be
triggered due to bad programming by the user. A lock is introduced in order
to protect the descriptor completion lists since the fault handler will run
from the system work queue after being scheduled in the interrupt handler.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/dma/idxd/idxd.h |    5 ++
 drivers/dma/idxd/init.c |    1 
 drivers/dma/idxd/irq.c  |  143 +++++++++++++++++++++++++++++++++++++++++++----
 3 files changed, 137 insertions(+), 12 deletions(-)

diff --git a/drivers/dma/idxd/idxd.h b/drivers/dma/idxd/idxd.h
index 43a216c42d25..b64b6266ca97 100644
--- a/drivers/dma/idxd/idxd.h
+++ b/drivers/dma/idxd/idxd.h
@@ -34,6 +34,11 @@ struct idxd_irq_entry {
 	int id;
 	struct llist_head pending_llist;
 	struct list_head work_list;
+	/*
+	 * Lock to protect access between irq thread process descriptor
+	 * and irq thread processing error descriptor.
+	 */
+	spinlock_t list_lock;
 };
 
 struct idxd_group {
diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c
index 626401a71fdd..1bb7637b02eb 100644
--- a/drivers/dma/idxd/init.c
+++ b/drivers/dma/idxd/init.c
@@ -97,6 +97,7 @@ static int idxd_setup_interrupts(struct idxd_device *idxd)
 	for (i = 0; i < msixcnt; i++) {
 		idxd->irq_entries[i].id = i;
 		idxd->irq_entries[i].idxd = idxd;
+		spin_lock_init(&idxd->irq_entries[i].list_lock);
 	}
 
 	msix = &idxd->msix_entries[0];
diff --git a/drivers/dma/idxd/irq.c b/drivers/dma/idxd/irq.c
index 17a65a13fb64..9e6cc55ad22f 100644
--- a/drivers/dma/idxd/irq.c
+++ b/drivers/dma/idxd/irq.c
@@ -11,6 +11,24 @@
 #include "idxd.h"
 #include "registers.h"
 
+enum irq_work_type {
+	IRQ_WORK_NORMAL = 0,
+	IRQ_WORK_PROCESS_FAULT,
+};
+
+struct idxd_fault {
+	struct work_struct work;
+	u64 addr;
+	struct idxd_device *idxd;
+};
+
+static int irq_process_work_list(struct idxd_irq_entry *irq_entry,
+				 enum irq_work_type wtype,
+				 int *processed, u64 data);
+static int irq_process_pending_llist(struct idxd_irq_entry *irq_entry,
+				     enum irq_work_type wtype,
+				     int *processed, u64 data);
+
 static void idxd_device_reinit(struct work_struct *work)
 {
 	struct idxd_device *idxd = container_of(work, struct idxd_device, work);
@@ -44,6 +62,46 @@ static void idxd_device_reinit(struct work_struct *work)
 	idxd_device_wqs_clear_state(idxd);
 }
 
+static void idxd_device_fault_work(struct work_struct *work)
+{
+	struct idxd_fault *fault = container_of(work, struct idxd_fault, work);
+	struct idxd_irq_entry *ie;
+	int i;
+	int processed;
+	int irqcnt = fault->idxd->num_wq_irqs + 1;
+
+	for (i = 1; i < irqcnt; i++) {
+		ie = &fault->idxd->irq_entries[i];
+		irq_process_work_list(ie, IRQ_WORK_PROCESS_FAULT,
+				      &processed, fault->addr);
+		if (processed)
+			break;
+
+		irq_process_pending_llist(ie, IRQ_WORK_PROCESS_FAULT,
+					  &processed, fault->addr);
+		if (processed)
+			break;
+	}
+
+	kfree(fault);
+}
+
+static int idxd_device_schedule_fault_process(struct idxd_device *idxd,
+					      u64 fault_addr)
+{
+	struct idxd_fault *fault;
+
+	fault = kmalloc(sizeof(*fault), GFP_ATOMIC);
+	if (!fault)
+		return -ENOMEM;
+
+	fault->addr = fault_addr;
+	fault->idxd = idxd;
+	INIT_WORK(&fault->work, idxd_device_fault_work);
+	queue_work(idxd->wq, &fault->work);
+	return 0;
+}
+
 irqreturn_t idxd_irq_handler(int vec, void *data)
 {
 	struct idxd_irq_entry *irq_entry = data;
@@ -125,6 +183,16 @@ irqreturn_t idxd_misc_thread(int vec, void *data)
 	if (!err)
 		goto out;
 
+	/*
+	 * This case should rarely happen and typically is due to software
+	 * programming error by the driver.
+	 */
+	if (idxd->sw_err.valid &&
+	    idxd->sw_err.desc_valid &&
+	    idxd->sw_err.fault_addr)
+		idxd_device_schedule_fault_process(idxd,
+						   idxd->sw_err.fault_addr);
+
 	gensts.bits = ioread32(idxd->reg_base + IDXD_GENSTATS_OFFSET);
 	if (gensts.state == IDXD_DEVICE_STATE_HALT) {
 		idxd->state = IDXD_DEV_HALTED;
@@ -152,57 +220,106 @@ irqreturn_t idxd_misc_thread(int vec, void *data)
 	return IRQ_HANDLED;
 }
 
+static bool process_fault(struct idxd_desc *desc, u64 fault_addr)
+{
+	if ((u64)desc->hw == fault_addr ||
+	    (u64)desc->completion == fault_addr) {
+		idxd_dma_complete_txd(desc, IDXD_COMPLETE_DEV_FAIL);
+		return true;
+	}
+
+	return false;
+}
+
+static bool complete_desc(struct idxd_desc *desc)
+{
+	if (desc->completion->status) {
+		idxd_dma_complete_txd(desc, IDXD_COMPLETE_NORMAL);
+		return true;
+	}
+
+	return false;
+}
+
 static int irq_process_pending_llist(struct idxd_irq_entry *irq_entry,
-				     int *processed)
+				     enum irq_work_type wtype,
+				     int *processed, u64 data)
 {
 	struct idxd_desc *desc, *t;
 	struct llist_node *head;
 	int queued = 0;
+	bool completed = false;
+	unsigned long flags;
 
 	*processed = 0;
 	head = llist_del_all(&irq_entry->pending_llist);
 	if (!head)
-		return 0;
+		goto out;
 
 	llist_for_each_entry_safe(desc, t, head, llnode) {
-		if (desc->completion->status) {
-			idxd_dma_complete_txd(desc, IDXD_COMPLETE_NORMAL);
+		if (wtype == IRQ_WORK_NORMAL)
+			completed = complete_desc(desc);
+		else if (wtype == IRQ_WORK_PROCESS_FAULT)
+			completed = process_fault(desc, data);
+
+		if (completed) {
 			idxd_free_desc(desc->wq, desc);
 			(*processed)++;
+			if (wtype == IRQ_WORK_PROCESS_FAULT)
+				break;
 		} else {
-			list_add_tail(&desc->list, &irq_entry->work_list);
+			spin_lock_irqsave(&irq_entry->list_lock, flags);
+			list_add_tail(&desc->list,
+				      &irq_entry->work_list);
+			spin_unlock_irqrestore(&irq_entry->list_lock, flags);
 			queued++;
 		}
 	}
 
+ out:
 	return queued;
 }
 
 static int irq_process_work_list(struct idxd_irq_entry *irq_entry,
-				 int *processed)
+				 enum irq_work_type wtype,
+				 int *processed, u64 data)
 {
 	struct list_head *node, *next;
 	int queued = 0;
+	bool completed = false;
+	unsigned long flags;
 
 	*processed = 0;
+	spin_lock_irqsave(&irq_entry->list_lock, flags);
 	if (list_empty(&irq_entry->work_list))
-		return 0;
+		goto out;
 
 	list_for_each_safe(node, next, &irq_entry->work_list) {
 		struct idxd_desc *desc =
 			container_of(node, struct idxd_desc, list);
 
-		if (desc->completion->status) {
+		spin_unlock_irqrestore(&irq_entry->list_lock, flags);
+		if (wtype == IRQ_WORK_NORMAL)
+			completed = complete_desc(desc);
+		else if (wtype == IRQ_WORK_PROCESS_FAULT)
+			completed = process_fault(desc, data);
+
+		if (completed) {
+			spin_lock_irqsave(&irq_entry->list_lock, flags);
 			list_del(&desc->list);
-			/* process and callback */
-			idxd_dma_complete_txd(desc, IDXD_COMPLETE_NORMAL);
+			spin_unlock_irqrestore(&irq_entry->list_lock, flags);
 			idxd_free_desc(desc->wq, desc);
 			(*processed)++;
+			if (wtype == IRQ_WORK_PROCESS_FAULT)
+				return queued;
 		} else {
 			queued++;
 		}
+		spin_lock_irqsave(&irq_entry->list_lock, flags);
 	}
 
+ out:
+	spin_unlock_irqrestore(&irq_entry->list_lock, flags);
 	return queued;
 }
 
@@ -230,12 +347,14 @@ static int idxd_desc_process(struct idxd_irq_entry *irq_entry)
 	 * 5. Repeat until no more descriptors.
 	 */
 	do {
-		rc = irq_process_work_list(irq_entry, &processed);
+		rc = irq_process_work_list(irq_entry, IRQ_WORK_NORMAL,
+					   &processed, 0);
 		total += processed;
 		if (rc != 0)
 			continue;
 
-		rc = irq_process_pending_llist(irq_entry, &processed);
+		rc = irq_process_pending_llist(irq_entry, IRQ_WORK_NORMAL,
+					       &processed, 0);
 		total += processed;
 	} while (rc != 0);
 


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v4 5/5] dmaengine: idxd: add ABI documentation for shared wq
  2020-09-17 21:15 [PATCH v4 0/5] Add shared workqueue support for idxd driver Dave Jiang
                   ` (2 preceding siblings ...)
  2020-09-17 21:15 ` [PATCH v4 4/5] dmaengine: idxd: clean up descriptors with fault error Dave Jiang
@ 2020-09-17 21:15 ` Dave Jiang
  2020-09-17 23:43 ` [PATCH v4 0/5] Add shared workqueue support for idxd driver Randy Dunlap
  4 siblings, 0 replies; 13+ messages in thread
From: Dave Jiang @ 2020-09-17 21:15 UTC (permalink / raw)
  To: vkoul, tglx, mingo, bp, dan.j.williams, tony.luck, jing.lin,
	ashok.raj, sanjay.k.kumar, fenghua.yu, kevin.tian
  Cc: dmaengine, linux-kernel

Add the sysfs attribute bits in ABI/stable for shared wq support.

Signed-off-by: Jing Lin <jing.lin@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
---
 Documentation/ABI/stable/sysfs-driver-dma-idxd |   14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/Documentation/ABI/stable/sysfs-driver-dma-idxd b/Documentation/ABI/stable/sysfs-driver-dma-idxd
index b44183880935..42d3dc03ffea 100644
--- a/Documentation/ABI/stable/sysfs-driver-dma-idxd
+++ b/Documentation/ABI/stable/sysfs-driver-dma-idxd
@@ -77,6 +77,13 @@ Contact:        dmaengine@vger.kernel.org
 Description:    The operation capability bit mask specify the operation types
 		supported by the this device.
 
+What:		/sys/bus/dsa/devices/dsa<m>/pasid_enabled
+Date:		Sep 17, 2020
+KernelVersion:	5.10.0
+Contact:	dmaengine@vger.kernel.org
+Description:	To indicate if PASID (process address space identifier) is
+		enabled or not for this device.
+
 What:           /sys/bus/dsa/devices/dsa<m>/state
 Date:           Oct 25, 2019
 KernelVersion:  5.6.0
@@ -122,6 +129,13 @@ KernelVersion:	5.10.0
 Contact:	dmaengine@vger.kernel.org
 Description:	The last executed device administrative command's status/error.
 
+What:		/sys/bus/dsa/devices/wq<m>.<n>/block_on_fault
+Date:		Sept 17, 2020
+KernelVersion:	5.10.0
+Contact:	dmaengine@vger.kernel.org
+Description:	To indicate block on fault is allowed or not for the work queue
+		to support on demand paging.
+
 What:           /sys/bus/dsa/devices/wq<m>.<n>/group_id
 Date:           Oct 25, 2019
 KernelVersion:  5.6.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v4 0/5] Add shared workqueue support for idxd driver
  2020-09-17 21:15 [PATCH v4 0/5] Add shared workqueue support for idxd driver Dave Jiang
                   ` (3 preceding siblings ...)
  2020-09-17 21:15 ` [PATCH v4 5/5] dmaengine: idxd: add ABI documentation for shared wq Dave Jiang
@ 2020-09-17 23:43 ` Randy Dunlap
  2020-09-17 23:51   ` Dave Jiang
  4 siblings, 1 reply; 13+ messages in thread
From: Randy Dunlap @ 2020-09-17 23:43 UTC (permalink / raw)
  To: Dave Jiang, vkoul, tglx, mingo, bp, dan.j.williams, tony.luck,
	jing.lin, ashok.raj, sanjay.k.kumar, fenghua.yu, kevin.tian
  Cc: dmaengine, linux-kernel

Hi,

On 9/17/20 2:15 PM, Dave Jiang wrote:
> 
> ---
> 
> Dave Jiang (5):
>       x86/asm: move the raw asm in iosubmit_cmds512() to special_insns.h
>       x86/asm: add enqcmds() to support ENQCMDS instruction
>       dmaengine: idxd: add shared workqueue support
>       dmaengine: idxd: clean up descriptors with fault error
>       dmaengine: idxd: add ABI documentation for shared wq
> 

I don't see patch 3/5 in my inbox nor at https://lore.kernel.org/dmaengine/

Did the email monster eat it?

thanks.

> 
>  Documentation/ABI/stable/sysfs-driver-dma-idxd |   14 ++
>  arch/x86/include/asm/io.h                      |   46 +++++---
>  arch/x86/include/asm/special_insns.h           |   17 +++
>  drivers/dma/Kconfig                            |   10 ++
>  drivers/dma/idxd/cdev.c                        |   49 ++++++++
>  drivers/dma/idxd/device.c                      |   91 ++++++++++++++-
>  drivers/dma/idxd/dma.c                         |    9 --
>  drivers/dma/idxd/idxd.h                        |   33 +++++-
>  drivers/dma/idxd/init.c                        |   92 ++++++++++++---
>  drivers/dma/idxd/irq.c                         |  143 ++++++++++++++++++++++--
>  drivers/dma/idxd/registers.h                   |   14 ++
>  drivers/dma/idxd/submit.c                      |   33 +++++-
>  drivers/dma/idxd/sysfs.c                       |  127 +++++++++++++++++++++
>  13 files changed, 608 insertions(+), 70 deletions(-)
> 
> --
> 

-- 
~Randy


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v4 0/5] Add shared workqueue support for idxd driver
  2020-09-17 23:43 ` [PATCH v4 0/5] Add shared workqueue support for idxd driver Randy Dunlap
@ 2020-09-17 23:51   ` Dave Jiang
  2020-09-17 23:56     ` Randy Dunlap
  0 siblings, 1 reply; 13+ messages in thread
From: Dave Jiang @ 2020-09-17 23:51 UTC (permalink / raw)
  To: Randy Dunlap, vkoul, tglx, mingo, bp, dan.j.williams, tony.luck,
	jing.lin, ashok.raj, sanjay.k.kumar, fenghua.yu, kevin.tian
  Cc: dmaengine, linux-kernel



On 9/17/2020 4:43 PM, Randy Dunlap wrote:
> Hi,
> 
> On 9/17/20 2:15 PM, Dave Jiang wrote:
>>
>> ---
>>
>> Dave Jiang (5):
>>        x86/asm: move the raw asm in iosubmit_cmds512() to special_insns.h
>>        x86/asm: add enqcmds() to support ENQCMDS instruction
>>        dmaengine: idxd: add shared workqueue support
>>        dmaengine: idxd: clean up descriptors with fault error
>>        dmaengine: idxd: add ABI documentation for shared wq
>>
> 
> I don't see patch 3/5 in my inbox nor at https://lore.kernel.org/dmaengine/
> 
> Did the email monster eat it?

Grrrrrr looks like Intel email server ate it. Everyone on cc list got it. But 
does not look like it made it to any of the mailing lists. I'll resend 3/5.

> 
> thanks.
> 
>>
>>   Documentation/ABI/stable/sysfs-driver-dma-idxd |   14 ++
>>   arch/x86/include/asm/io.h                      |   46 +++++---
>>   arch/x86/include/asm/special_insns.h           |   17 +++
>>   drivers/dma/Kconfig                            |   10 ++
>>   drivers/dma/idxd/cdev.c                        |   49 ++++++++
>>   drivers/dma/idxd/device.c                      |   91 ++++++++++++++-
>>   drivers/dma/idxd/dma.c                         |    9 --
>>   drivers/dma/idxd/idxd.h                        |   33 +++++-
>>   drivers/dma/idxd/init.c                        |   92 ++++++++++++---
>>   drivers/dma/idxd/irq.c                         |  143 ++++++++++++++++++++++--
>>   drivers/dma/idxd/registers.h                   |   14 ++
>>   drivers/dma/idxd/submit.c                      |   33 +++++-
>>   drivers/dma/idxd/sysfs.c                       |  127 +++++++++++++++++++++
>>   13 files changed, 608 insertions(+), 70 deletions(-)
>>
>> --
>>
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v4 0/5] Add shared workqueue support for idxd driver
  2020-09-17 23:51   ` Dave Jiang
@ 2020-09-17 23:56     ` Randy Dunlap
  0 siblings, 0 replies; 13+ messages in thread
From: Randy Dunlap @ 2020-09-17 23:56 UTC (permalink / raw)
  To: Dave Jiang, vkoul, tglx, mingo, bp, dan.j.williams, tony.luck,
	jing.lin, ashok.raj, sanjay.k.kumar, fenghua.yu, kevin.tian
  Cc: dmaengine, linux-kernel

On 9/17/20 4:51 PM, Dave Jiang wrote:
> 
> 
> On 9/17/2020 4:43 PM, Randy Dunlap wrote:
>> Hi,
>>
>> On 9/17/20 2:15 PM, Dave Jiang wrote:
>>>
>>> ---
>>>
>>> Dave Jiang (5):
>>>        x86/asm: move the raw asm in iosubmit_cmds512() to special_insns.h
>>>        x86/asm: add enqcmds() to support ENQCMDS instruction
>>>        dmaengine: idxd: add shared workqueue support
>>>        dmaengine: idxd: clean up descriptors with fault error
>>>        dmaengine: idxd: add ABI documentation for shared wq
>>>
>>
>> I don't see patch 3/5 in my inbox nor at https://lore.kernel.org/dmaengine/
>>
>> Did the email monster eat it?
> 
> Grrrrrr looks like Intel email server ate it. Everyone on cc list got it. But does not look like it made it to any of the mailing lists. I'll resend 3/5.

Got it. Thanks.

>>
>> thanks.
>>
>>>
>>>   Documentation/ABI/stable/sysfs-driver-dma-idxd |   14 ++
>>>   arch/x86/include/asm/io.h                      |   46 +++++---
>>>   arch/x86/include/asm/special_insns.h           |   17 +++
>>>   drivers/dma/Kconfig                            |   10 ++
>>>   drivers/dma/idxd/cdev.c                        |   49 ++++++++
>>>   drivers/dma/idxd/device.c                      |   91 ++++++++++++++-
>>>   drivers/dma/idxd/dma.c                         |    9 --
>>>   drivers/dma/idxd/idxd.h                        |   33 +++++-
>>>   drivers/dma/idxd/init.c                        |   92 ++++++++++++---
>>>   drivers/dma/idxd/irq.c                         |  143 ++++++++++++++++++++++--
>>>   drivers/dma/idxd/registers.h                   |   14 ++
>>>   drivers/dma/idxd/submit.c                      |   33 +++++-
>>>   drivers/dma/idxd/sysfs.c                       |  127 +++++++++++++++++++++
>>>   13 files changed, 608 insertions(+), 70 deletions(-)
>>>
>>> -- 
>>>
>>


-- 
~Randy


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v4 1/5] x86/asm: move the raw asm in iosubmit_cmds512() to special_insns.h
  2020-09-17 21:15 ` [PATCH v4 1/5] x86/asm: move the raw asm in iosubmit_cmds512() to special_insns.h Dave Jiang
@ 2020-09-23 10:41   ` Borislav Petkov
  2020-09-23 15:43     ` Dave Jiang
  0 siblings, 1 reply; 13+ messages in thread
From: Borislav Petkov @ 2020-09-23 10:41 UTC (permalink / raw)
  To: Dave Jiang
  Cc: vkoul, tglx, mingo, dan.j.williams, tony.luck, jing.lin,
	ashok.raj, sanjay.k.kumar, fenghua.yu, kevin.tian, dmaengine,
	linux-kernel

> Subject: Re: [PATCH v4 1/5] x86/asm: move the raw asm in iosubmit_cmds512() to special_insns.h

Start patch name with a capital letter: "Move the asm definition.."

Also, calling stuff "raw" and "core" is misleading in the kernel context
- you wanna say simply: "Carve out a generic movdir64b() helper... "

On Thu, Sep 17, 2020 at 02:15:16PM -0700, Dave Jiang wrote:
> diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
> index 59a3e13204c3..7bc8e714f37e 100644
> --- a/arch/x86/include/asm/special_insns.h
> +++ b/arch/x86/include/asm/special_insns.h
> @@ -234,6 +234,23 @@ static inline void clwb(volatile void *__p)
>  
>  #define nop() asm volatile ("nop")
>  
> +static inline void movdir64b(void *__dst, const void *src)

Make __dst be the function local variable name and keep "dst", i.e.,
without the underscores, the function parameter name.

> +	/*
> +	 * Note that this isn't an "on-stack copy", just definition of "dst"
> +	 * as a pointer to 64-bytes of stuff that is going to be overwritten.
> +	 * In the MOVDIR64B case that may be needed as you can use the
> +	 * MOVDIR64B instruction to copy arbitrary memory around. This trick
> +	 * lets the compiler know how much gets clobbered.
> +	 */
> +	volatile struct { char _[64]; } *dst = __dst;
> +
> +	/* MOVDIR64B [rdx], rax */
> +	asm volatile(".byte 0x66, 0x0f, 0x38, 0xf8, 0x02"
> +		     : "=m" (dst)
> +		     : "d" (src), "a" (dst));
> +}
> +
>  #endif /* __KERNEL__ */
>  
>  #endif /* _ASM_X86_SPECIAL_INSNS_H */
> 

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v4 2/5] x86/asm: add enqcmds() to support ENQCMDS instruction
  2020-09-17 21:15 ` [PATCH v4 2/5] x86/asm: add enqcmds() to support ENQCMDS instruction Dave Jiang
@ 2020-09-23 11:08   ` Borislav Petkov
  2020-09-23 15:47     ` Dave Jiang
  0 siblings, 1 reply; 13+ messages in thread
From: Borislav Petkov @ 2020-09-23 11:08 UTC (permalink / raw)
  To: Dave Jiang
  Cc: vkoul, tglx, mingo, dan.j.williams, tony.luck, jing.lin,
	ashok.raj, sanjay.k.kumar, fenghua.yu, kevin.tian, dmaengine,
	linux-kernel

On Thu, Sep 17, 2020 at 02:15:23PM -0700, Dave Jiang wrote:
> Add enqcmds() in x86 io.h instead of special_insns.h.

Why? It is an asm wrapper for a special instruction.

> MOVDIR64B
> instruction can be used for other purposes. A wrapper was introduced
> in io.h for its command submission usage. ENQCMDS has a single
> purpose of submit 64-byte commands to supported devices and should
> be called directly.
> 
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> ---
>  arch/x86/include/asm/io.h |   29 +++++++++++++++++++++++++++++
>  1 file changed, 29 insertions(+)
> 
> diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h
> index d726459d08e5..b7af0bf8a018 100644
> --- a/arch/x86/include/asm/io.h
> +++ b/arch/x86/include/asm/io.h
> @@ -424,4 +424,33 @@ static inline void iosubmit_cmds512(void __iomem *dst, const void *src,
>  	}
>  }
>  
> +/**
> + * enqcmds - copy a 512 bits data unit to single MMIO location

Your #319433 doc says

"ENQCMDS — Enqueue Command Supervisor"

Now *how* that enqueueing is done you can explain in the comment below.

> + * @dst: destination, in MMIO space (must be 512-bit aligned)
> + * @src: source
> + *
> + * Submit data from kernel space to MMIO space, in a unit of 512 bits.
> + * Order of data access is not guaranteed, nor is a memory barrier
> + * performed afterwards. The command returns false (0) on failure, and true (1)
> + * on success.

The command or the function?

From what I see below, the instruction sets ZF=1 to denote that it needs
to be retried and ZF=0 means success, as the doc says. And in good UNIX
tradition, 0 means usually success and !0 failure.

So why are you flipping that?

> + * Warning: Do not use this helper unless your driver has checked that the CPU
> + * instruction is supported on the platform.
> + */
> +static inline bool enqcmds(void __iomem *dst, const void *src)
> +{
> +	bool retry;
> +
> +	/* ENQCMDS [rdx], rax */
> +	asm volatile(".byte 0xf3, 0x0f, 0x38, 0xf8, 0x02, 0x66, 0x90\t\n"
								    ^^^^
No need for those last two chars.

> +		     CC_SET(z)
> +		     : CC_OUT(z) (retry)
> +		     : "a" (dst), "d" (src));

<---- newline here.

> +	/* Submission failure is indicated via EFLAGS.ZF=1 */
> +	if (retry)
> +		return false;
> +
> +	return true;
> +}
> +
>  #endif /* _ASM_X86_IO_H */

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v4 1/5] x86/asm: move the raw asm in iosubmit_cmds512() to special_insns.h
  2020-09-23 10:41   ` Borislav Petkov
@ 2020-09-23 15:43     ` Dave Jiang
  2020-09-23 16:00       ` David Laight
  0 siblings, 1 reply; 13+ messages in thread
From: Dave Jiang @ 2020-09-23 15:43 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: vkoul, tglx, mingo, dan.j.williams, tony.luck, jing.lin,
	ashok.raj, sanjay.k.kumar, fenghua.yu, kevin.tian, dmaengine,
	linux-kernel



On 9/23/2020 3:41 AM, Borislav Petkov wrote:
>> Subject: Re: [PATCH v4 1/5] x86/asm: move the raw asm in iosubmit_cmds512() to special_insns.h
> 
> Start patch name with a capital letter: "Move the asm definition.."
> 
> Also, calling stuff "raw" and "core" is misleading in the kernel context
> - you wanna say simply: "Carve out a generic movdir64b() helper... "

Ok I will update

> 
> On Thu, Sep 17, 2020 at 02:15:16PM -0700, Dave Jiang wrote:
>> diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
>> index 59a3e13204c3..7bc8e714f37e 100644
>> --- a/arch/x86/include/asm/special_insns.h
>> +++ b/arch/x86/include/asm/special_insns.h
>> @@ -234,6 +234,23 @@ static inline void clwb(volatile void *__p)
>>   
>>   #define nop() asm volatile ("nop")
>>   
>> +static inline void movdir64b(void *__dst, const void *src)
> 
> Make __dst be the function local variable name and keep "dst", i.e.,
> without the underscores, the function parameter name.

Ok will fix

> 
>> +	/*
>> +	 * Note that this isn't an "on-stack copy", just definition of "dst"
>> +	 * as a pointer to 64-bytes of stuff that is going to be overwritten.
>> +	 * In the MOVDIR64B case that may be needed as you can use the
>> +	 * MOVDIR64B instruction to copy arbitrary memory around. This trick
>> +	 * lets the compiler know how much gets clobbered.
>> +	 */
>> +	volatile struct { char _[64]; } *dst = __dst;
>> +
>> +	/* MOVDIR64B [rdx], rax */
>> +	asm volatile(".byte 0x66, 0x0f, 0x38, 0xf8, 0x02"
>> +		     : "=m" (dst)
>> +		     : "d" (src), "a" (dst));
>> +}
>> +
>>   #endif /* __KERNEL__ */
>>   
>>   #endif /* _ASM_X86_SPECIAL_INSNS_H */
>>
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v4 2/5] x86/asm: add enqcmds() to support ENQCMDS instruction
  2020-09-23 11:08   ` Borislav Petkov
@ 2020-09-23 15:47     ` Dave Jiang
  0 siblings, 0 replies; 13+ messages in thread
From: Dave Jiang @ 2020-09-23 15:47 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: vkoul, tglx, mingo, dan.j.williams, tony.luck, jing.lin,
	ashok.raj, sanjay.k.kumar, fenghua.yu, kevin.tian, dmaengine,
	linux-kernel



On 9/23/2020 4:08 AM, Borislav Petkov wrote:
> On Thu, Sep 17, 2020 at 02:15:23PM -0700, Dave Jiang wrote:
>> Add enqcmds() in x86 io.h instead of special_insns.h.
> 
> Why? It is an asm wrapper for a special instruction.

Ok will move.

> 
>> MOVDIR64B
>> instruction can be used for other purposes. A wrapper was introduced
>> in io.h for its command submission usage. ENQCMDS has a single
>> purpose of submit 64-byte commands to supported devices and should
>> be called directly.
>>
>> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
>> Reviewed-by: Tony Luck <tony.luck@intel.com>
>> ---
>>   arch/x86/include/asm/io.h |   29 +++++++++++++++++++++++++++++
>>   1 file changed, 29 insertions(+)
>>
>> diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h
>> index d726459d08e5..b7af0bf8a018 100644
>> --- a/arch/x86/include/asm/io.h
>> +++ b/arch/x86/include/asm/io.h
>> @@ -424,4 +424,33 @@ static inline void iosubmit_cmds512(void __iomem *dst, const void *src,
>>   	}
>>   }
>>   
>> +/**
>> + * enqcmds - copy a 512 bits data unit to single MMIO location
> 
> Your #319433 doc says
> 
> "ENQCMDS — Enqueue Command Supervisor"
> 
> Now *how* that enqueueing is done you can explain in the comment below.

Ok will add.

> 
>> + * @dst: destination, in MMIO space (must be 512-bit aligned)
>> + * @src: source
>> + *
>> + * Submit data from kernel space to MMIO space, in a unit of 512 bits.
>> + * Order of data access is not guaranteed, nor is a memory barrier
>> + * performed afterwards. The command returns false (0) on failure, and true (1)
>> + * on success.
> 
> The command or the function?

Function. Will fix.
> 
>  From what I see below, the instruction sets ZF=1 to denote that it needs
> to be retried and ZF=0 means success, as the doc says. And in good UNIX
> tradition, 0 means usually success and !0 failure.
> 
> So why are you flipping that?

Ok will return 0 for success and -ERETRY for failure.

> 
>> + * Warning: Do not use this helper unless your driver has checked that the CPU
>> + * instruction is supported on the platform.
>> + */
>> +static inline bool enqcmds(void __iomem *dst, const void *src)
>> +{
>> +	bool retry;
>> +
>> +	/* ENQCMDS [rdx], rax */
>> +	asm volatile(".byte 0xf3, 0x0f, 0x38, 0xf8, 0x02, 0x66, 0x90\t\n"
> 								    ^^^^
> No need for those last two chars.

Ok will remove.

> 
>> +		     CC_SET(z)
>> +		     : CC_OUT(z) (retry)
>> +		     : "a" (dst), "d" (src));
> 
> <---- newline here.

Will fix.

> 
>> +	/* Submission failure is indicated via EFLAGS.ZF=1 */
>> +	if (retry)
>> +		return false;
>> +
>> +	return true;
>> +}
>> +
>>   #endif /* _ASM_X86_IO_H */
> 
> Thx.
> 

Thank you very much for reviewing Boris. Very much appreciated!

^ permalink raw reply	[flat|nested] 13+ messages in thread

* RE: [PATCH v4 1/5] x86/asm: move the raw asm in iosubmit_cmds512() to special_insns.h
  2020-09-23 15:43     ` Dave Jiang
@ 2020-09-23 16:00       ` David Laight
  0 siblings, 0 replies; 13+ messages in thread
From: David Laight @ 2020-09-23 16:00 UTC (permalink / raw)
  To: 'Dave Jiang', Borislav Petkov
  Cc: vkoul, tglx, mingo, dan.j.williams, tony.luck, jing.lin,
	ashok.raj, sanjay.k.kumar, fenghua.yu, kevin.tian, dmaengine,
	linux-kernel

From: Dave Jiang
> Sent: 23 September 2020 16:43
...
> >
> > On Thu, Sep 17, 2020 at 02:15:16PM -0700, Dave Jiang wrote:
> >> diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
> >> index 59a3e13204c3..7bc8e714f37e 100644
> >> --- a/arch/x86/include/asm/special_insns.h
> >> +++ b/arch/x86/include/asm/special_insns.h
> >> @@ -234,6 +234,23 @@ static inline void clwb(volatile void *__p)
> >>
> >>   #define nop() asm volatile ("nop")
> >>
> >> +static inline void movdir64b(void *__dst, const void *src)
> >
> > Make __dst be the function local variable name and keep "dst", i.e.,
> > without the underscores, the function parameter name.
> 
> Ok will fix
> 
> >
> >> +	/*
> >> +	 * Note that this isn't an "on-stack copy", just definition of "dst"
> >> +	 * as a pointer to 64-bytes of stuff that is going to be overwritten.
> >> +	 * In the MOVDIR64B case that may be needed as you can use the
> >> +	 * MOVDIR64B instruction to copy arbitrary memory around. This trick
> >> +	 * lets the compiler know how much gets clobbered.
> >> +	 */
> >> +	volatile struct { char _[64]; } *dst = __dst;
> >> +
> >> +	/* MOVDIR64B [rdx], rax */
> >> +	asm volatile(".byte 0x66, 0x0f, 0x38, 0xf8, 0x02"
> >> +		     : "=m" (dst)
> >> +		     : "d" (src), "a" (dst));
> >> +}

Since 'dst' needs to be 64byte aligned it isn't clear that 'void *'
is the right type for 'dst'.
At least add a comment.

Your asm constraints are also just wrong.

There is no real point specifying "=m" (dst) as an output.
The write rather bypasses the cache and the caller better
know what they are doing.

OTOH you'd better add "m" ((struct { char _[64];} *)src) as
an input constraint.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2020-09-23 16:07 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-17 21:15 [PATCH v4 0/5] Add shared workqueue support for idxd driver Dave Jiang
2020-09-17 21:15 ` [PATCH v4 1/5] x86/asm: move the raw asm in iosubmit_cmds512() to special_insns.h Dave Jiang
2020-09-23 10:41   ` Borislav Petkov
2020-09-23 15:43     ` Dave Jiang
2020-09-23 16:00       ` David Laight
2020-09-17 21:15 ` [PATCH v4 2/5] x86/asm: add enqcmds() to support ENQCMDS instruction Dave Jiang
2020-09-23 11:08   ` Borislav Petkov
2020-09-23 15:47     ` Dave Jiang
2020-09-17 21:15 ` [PATCH v4 4/5] dmaengine: idxd: clean up descriptors with fault error Dave Jiang
2020-09-17 21:15 ` [PATCH v4 5/5] dmaengine: idxd: add ABI documentation for shared wq Dave Jiang
2020-09-17 23:43 ` [PATCH v4 0/5] Add shared workqueue support for idxd driver Randy Dunlap
2020-09-17 23:51   ` Dave Jiang
2020-09-17 23:56     ` Randy Dunlap

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).