All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/7] x86: tag application address space for devices
@ 2020-03-30 19:33 ` Fenghua Yu
  0 siblings, 0 replies; 74+ messages in thread
From: Fenghua Yu @ 2020-03-30 19:33 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
	David Woodhouse, Lu Baolu, Dave Hansen, Tony Luck, Ashok Raj,
	Jacob Jun Pan, Dave Jiang, Sohil Mehta, Ravi V Shankar
  Cc: linux-kernel, x86, iommu, Fenghua Yu

Typical hardware devices require a driver stack to translate application
buffers to hardware addresses, and a kernel-user transition to notify the
hardware of new work. What if both the translation and transition overhead
could be eliminated? This is what Shared Virtual Address (SVA) and ENQCMD
enabled hardware like Data Streaming Accelerator (DSA) aims to achieve.
Applications map portals in their local-address-space and directly submit
work to them using a new instruction.

This series implements management of a new MSR (MSR_IA32_PASID). This new
MSR allows an application address space to be associated with what the PCIe
spec calls a Process Address Space ID (PASID). This PASID tag is carried
along with all requests between applications and devices and allows devices
to interact with the process address space.

SVA and ENQCMD enabled device drivers will use this series in the future.
For example, it will be used by the phase 2 DSA driver which will be
released with SVA and ENQCMD support as explained in:
https://01.org/blogs/2019/introducing-intel-data-streaming-accelerator

This series only provides simple and basic support for the MSR as follows:
1. Explain different various technical terms used in the series (patch 1).
2. Enumerate support for ENQCMD in the processor (patch 2).
3. Handle FPU PASID state and the MSR during context switch (patches 3-4).
4. Allocate and free PASID for a process (patch 5).
5. Fix up the PASID MSR in #GP handler when one thread in a process
   executes ENQCMD for the first time (patches 6).
6. Clear PASID state for forked and cloned thread (patch 7).

And this patch series needs support from supervisor states patch set:
https://lore.kernel.org/lkml/20200328164307.17497-1-yu-cheng.yu@intel.com/

The v3 supervisor states series, this patch series, and DSA phase 2 series
(to be released shortly in idxd driver) can be cloned from:
https://github.com/intel/idxd-driver.git     idxd-stage2

References:
1. Detailed information on the ENQCMD/ENQCMDS instructions and the
IA32_PASID MSR can be found in Intel Architecture Instruction Set
Extensions and Future Features Programming Reference:
https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf

2. Detailed information on DSA can be found in DSA specification:
https://software.intel.com/en-us/download/intel-data-streaming-accelerator-preliminary-architecture-specification

Ashok Raj (1):
  docs: x86: Add a documentation for ENQCMD

Fenghua Yu (5):
  x86/cpufeatures: Enumerate ENQCMD and ENQCMDS instructions
  x86/msr-index: Define IA32_PASID MSR
  x86/mmu: Allocate/free PASID
  x86/traps: Fix up invalid PASID
  x86/process: Clear PASID state for a newly forked/cloned thread

Yu-cheng Yu (1):
  x86/fpu/xstate: Add supervisor PASID state for ENQCMD feature

 Documentation/x86/enqcmd.rst       | 185 +++++++++++++++++++++++++++++
 arch/x86/include/asm/cpufeatures.h |   1 +
 arch/x86/include/asm/fpu/types.h   |  10 ++
 arch/x86/include/asm/fpu/xstate.h  |   2 +-
 arch/x86/include/asm/iommu.h       |   3 +
 arch/x86/include/asm/mmu.h         |   4 +
 arch/x86/include/asm/mmu_context.h |  14 +++
 arch/x86/include/asm/msr-index.h   |   3 +
 arch/x86/kernel/cpu/cpuid-deps.c   |   1 +
 arch/x86/kernel/fpu/xstate.c       |   4 +
 arch/x86/kernel/process.c          |  13 ++
 arch/x86/kernel/traps.c            |  17 +++
 drivers/iommu/intel-svm.c          | 119 +++++++++++++++++--
 13 files changed, 367 insertions(+), 9 deletions(-)
 create mode 100644 Documentation/x86/enqcmd.rst

-- 
2.19.1


^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH 0/7] x86: tag application address space for devices
@ 2020-03-30 19:33 ` Fenghua Yu
  0 siblings, 0 replies; 74+ messages in thread
From: Fenghua Yu @ 2020-03-30 19:33 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
	David Woodhouse, Lu Baolu, Dave Hansen, Tony Luck, Ashok Raj,
	Jacob Jun Pan, Dave Jiang, Sohil Mehta, Ravi V Shankar
  Cc: Fenghua Yu, iommu, x86, linux-kernel

Typical hardware devices require a driver stack to translate application
buffers to hardware addresses, and a kernel-user transition to notify the
hardware of new work. What if both the translation and transition overhead
could be eliminated? This is what Shared Virtual Address (SVA) and ENQCMD
enabled hardware like Data Streaming Accelerator (DSA) aims to achieve.
Applications map portals in their local-address-space and directly submit
work to them using a new instruction.

This series implements management of a new MSR (MSR_IA32_PASID). This new
MSR allows an application address space to be associated with what the PCIe
spec calls a Process Address Space ID (PASID). This PASID tag is carried
along with all requests between applications and devices and allows devices
to interact with the process address space.

SVA and ENQCMD enabled device drivers will use this series in the future.
For example, it will be used by the phase 2 DSA driver which will be
released with SVA and ENQCMD support as explained in:
https://01.org/blogs/2019/introducing-intel-data-streaming-accelerator

This series only provides simple and basic support for the MSR as follows:
1. Explain different various technical terms used in the series (patch 1).
2. Enumerate support for ENQCMD in the processor (patch 2).
3. Handle FPU PASID state and the MSR during context switch (patches 3-4).
4. Allocate and free PASID for a process (patch 5).
5. Fix up the PASID MSR in #GP handler when one thread in a process
   executes ENQCMD for the first time (patches 6).
6. Clear PASID state for forked and cloned thread (patch 7).

And this patch series needs support from supervisor states patch set:
https://lore.kernel.org/lkml/20200328164307.17497-1-yu-cheng.yu@intel.com/

The v3 supervisor states series, this patch series, and DSA phase 2 series
(to be released shortly in idxd driver) can be cloned from:
https://github.com/intel/idxd-driver.git     idxd-stage2

References:
1. Detailed information on the ENQCMD/ENQCMDS instructions and the
IA32_PASID MSR can be found in Intel Architecture Instruction Set
Extensions and Future Features Programming Reference:
https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf

2. Detailed information on DSA can be found in DSA specification:
https://software.intel.com/en-us/download/intel-data-streaming-accelerator-preliminary-architecture-specification

Ashok Raj (1):
  docs: x86: Add a documentation for ENQCMD

Fenghua Yu (5):
  x86/cpufeatures: Enumerate ENQCMD and ENQCMDS instructions
  x86/msr-index: Define IA32_PASID MSR
  x86/mmu: Allocate/free PASID
  x86/traps: Fix up invalid PASID
  x86/process: Clear PASID state for a newly forked/cloned thread

Yu-cheng Yu (1):
  x86/fpu/xstate: Add supervisor PASID state for ENQCMD feature

 Documentation/x86/enqcmd.rst       | 185 +++++++++++++++++++++++++++++
 arch/x86/include/asm/cpufeatures.h |   1 +
 arch/x86/include/asm/fpu/types.h   |  10 ++
 arch/x86/include/asm/fpu/xstate.h  |   2 +-
 arch/x86/include/asm/iommu.h       |   3 +
 arch/x86/include/asm/mmu.h         |   4 +
 arch/x86/include/asm/mmu_context.h |  14 +++
 arch/x86/include/asm/msr-index.h   |   3 +
 arch/x86/kernel/cpu/cpuid-deps.c   |   1 +
 arch/x86/kernel/fpu/xstate.c       |   4 +
 arch/x86/kernel/process.c          |  13 ++
 arch/x86/kernel/traps.c            |  17 +++
 drivers/iommu/intel-svm.c          | 119 +++++++++++++++++--
 13 files changed, 367 insertions(+), 9 deletions(-)
 create mode 100644 Documentation/x86/enqcmd.rst

-- 
2.19.1

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH 1/7] docs: x86: Add a documentation for ENQCMD
  2020-03-30 19:33 ` Fenghua Yu
@ 2020-03-30 19:33   ` Fenghua Yu
  -1 siblings, 0 replies; 74+ messages in thread
From: Fenghua Yu @ 2020-03-30 19:33 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
	David Woodhouse, Lu Baolu, Dave Hansen, Tony Luck, Ashok Raj,
	Jacob Jun Pan, Dave Jiang, Sohil Mehta, Ravi V Shankar
  Cc: linux-kernel, x86, iommu, Fenghua Yu

From: Ashok Raj <ashok.raj@intel.com>

ENQCMD and Data Streaming Accelerator (DSA) and all of their associated
features are a complicated stack with lots of interconnected pieces.
This documentation provides a big picture overview for all of the
features.

Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
 Documentation/x86/enqcmd.rst | 185 +++++++++++++++++++++++++++++++++++
 1 file changed, 185 insertions(+)
 create mode 100644 Documentation/x86/enqcmd.rst

diff --git a/Documentation/x86/enqcmd.rst b/Documentation/x86/enqcmd.rst
new file mode 100644
index 000000000000..414ef7d24028
--- /dev/null
+++ b/Documentation/x86/enqcmd.rst
@@ -0,0 +1,185 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+Improved Device Interaction Overview
+
+== Background ==
+
+Shared Virtual Addressing (SVA) allows the processor and device to use the
+same virtual addresses avoiding the need for software to translate virtual
+addresses to physical addresses. ENQCMD is a new instruction on Intel
+platforms that allows user applications to directly notify hardware of new
+work, much like doorbells are used in some hardware, but carries a payload
+that carries the PASID and some additional device specific commands
+along with it.
+
+== Address Space Tagging ==
+
+A new MSR (MSR_IA32_PASID) allows an application address space to be
+associated with what the PCIe spec calls a Process Address Space ID
+(PASID). This PASID tag is carried along with all requests between
+applications and devices and allows devices to interact with the process
+address space.
+
+This MSR is managed with the XSAVE feature set as "supervisor state".
+
+== PASID Management ==
+
+The kernel must allocate a PASID on behalf of each process and program it
+into the new MSR to communicate the process identity to platform hardware.
+ENQCMD uses the PASID stored in this MSR to tag requests from this process.
+Requests for DMA from the device are also tagged with the same PASID. The
+platform IOMMU uses the PASID in the transaction to perform address
+translation. The IOMMU api's setup the corresponding PASID entry in IOMMU
+with the process address used by the CPU (for e.g cr3 in x86).
+
+The MSR must be configured on each logical CPU before any application
+thread can interact with a device.  Threads that belong to the same
+process share the same page tables, thus the same MSR value.
+
+The PASID allocation and MSR programming may occur long after a process and
+its threads have been created. If a thread uses ENQCMD without the MSR
+first being populated, it will #GP.  The kernel will fix up the #GP by
+writing the process-wide PASID into the thread that took the #GP. A single
+process PASID can be used simultaneously with multiple devices since they
+all share the same address space.
+
+New threads could inherit the MSR value from the parent. But this would
+involve additional state management for those threads which may never use
+ENQCMD. Clearing the MSR at thread creation permits all threads to have a
+consistent behavior; the PASID is only programmed when the thread calls
+ENQCMD for the first time.
+
+Although ENQCMD can be executed in the kernel, there isn't any usage yet.
+Currently #GP handler doesn't fix up #GP triggered from ENQCMD executed
+in the kernel
+
+== Relationships ==
+
+ * Each process has many threads, but only one PASID
+ * Devices have a limited number (~10's to 1000's) of hardware
+   workqueues and each portal maps down to a single workqueue.
+   The device driver manages allocating hardware workqueues.
+ * A single mmap() maps a single hardware workqueue as a "portal"
+ * For each device with which a process interacts, there must be
+   one or more mmap()'d portals.
+ * Many threads within a process can share a single portal to access
+   a single device.
+ * Multiple processes can separately mmap() the same portal, in
+   which case they still share one device hardware workqueue.
+ * The single process-wide PASID is used by all threads to interact
+   with all devices.  There is not, for instance, a PASID for each
+   thread or each thread<->device pair.
+
+== FAQ ==
+
+* What is SVA/SVM?
+
+Shared Virtual Addressing (SVA) permits I/O hardware and the processor to
+work in the same address space. In short, sharing the address space. Some
+call it Shared Virtual Memory (SVM), but Linux community wanted to avoid
+it with Posix Shared Memory and Secure Virtual Machines which were terms
+already in circulation.
+
+* What is a PASID?
+
+A Process Address Space ID (PASID) is a PCIe-defined TLP Prefix. A PASID is
+a 20 bit number allocated and managed by the OS. PASID is included in all
+transactions between the platform and the device.
+
+* How are shared work queues different?
+
+Traditionally to allow user space applications interact with hardware,
+there is a separate instance required per process. For e.g. consider
+doorbells as a mechanism of informing hardware about work to process. Each
+doorbell is required to be spaced 4k (or page-size) apart for process
+isolation. This requires hardware to provision that space and reserve in
+MMIO. This doesn't scale as the number of threads becomes quite large. The
+hardware also manages the queue depth for Shared Work Queues (SWQ), and
+consumers don't need to track queue depth. If there is no space to accept
+a command, the device will return an error indicating retry. Also
+submitting a command to an MMIO address that can't accept ENQCMD will
+return retry in response.
+
+SWQ allows hardware to provision just a single address in the device. When
+used with ENQCMD to submit work, the device can distinguish the process
+submitting the work since it will include the PASID assigned to that
+process. This decreases the pressure of hardware requiring to support
+hardware to scale to a large number of processes.
+
+* Is this the same as a user space device driver?
+
+Communicating with the device via the shared work queue is much simpler
+than a full blown user space driver. The kernel driver does all the
+initialization of the hardware. User space only needs to worry about
+submitting work and processing completions.
+
+* Is this the same as SR-IOV?
+
+Single Root I/O Virtualization (SR-IOV) focuses on providing independent
+hardware interfaces for virtualizing hardware. Hence its required to be
+almost fully functional interface to software supporting the traditional
+BAR's, space for interrupts via MSI-x, its own register layout. Creating
+of this Virtual Functions (VFs) is assisted by the Physical Function (PF)
+driver.
+
+Scalable I/O Virtualization builds on the PASID concept to create device
+instances for virtualization. SIOV requires host software to assist in
+creating virtual devices, each virtual device is represented by a PASID.
+This allows device hardware to optimize device resource creation and can
+grow dynamically on demand. SR-IOV creation and management is very static
+in nature. Consult references below for more details.
+
+* Why not just create a virtual function for each app?
+
+Creating PCIe SRIOV type virtual functions (VF) are expensive. They create
+duplicated hardware for PCI config space requirements, Interrupts such as
+MSIx for instance. Resources such as interrupts have to be hard partitioned
+between VF's at creation time, and cannot scale dynamically on demand. The
+VF's are not completely independent from the Physical function (PF). Most
+VF's require some communication and assistance from the PF driver. SIOV
+creates a software defined device. Where all the configuration and control
+aspects are mediated via the slow path. The work submission and completion
+happen without any mediation.
+
+* Does this support virtualization?
+
+ENQCMD can be used from within a guest VM. In these cases the VMM helps
+with setting up a translation table to translate from Guest PASID to Host
+PASID. Please consult the ENQCMD instruction set reference for more
+details.
+
+* Does memory need to be pinned?
+
+When devices support SVA, along with platform hardware such as IOMMU
+supporting such devices, there is no need to pin memory for DMA purposes.
+Devices that support SVA also support other PCIe features that remove the
+pinning requirement for memory.
+
+Device TLB support - Device requests the IOMMU to lookup an address before
+use via Address Translation Service (ATS) requests.  If the mapping exists
+but there is no page allocated by the OS, IOMMU hardware returns that no
+mapping exists.
+
+Device requests that virtual address to be mapped via Page Request
+Interface (PRI). Once the OS has successfully completed  the mapping, it
+returns the response back to the device. The device continues again to
+request for a translation and continues.
+
+IOMMU works with the OS in managing consistency of page-tables with the
+device. When removing pages, it interacts with the device to remove any
+device-tlb that might have been cached before removing the mappings from
+the OS.
+
+== References ==
+
+VT-D:
+https://01.org/blogs/ashokraj/2018/recent-enhancements-intel-virtualization-technology-directed-i/o-intel-vt-d
+
+SIOV:
+https://01.org/blogs/2019/assignable-interfaces-intel-scalable-i/o-virtualization-linux
+
+ENQCMD in ISE:
+https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf
+
+DSA spec:
+https://software.intel.com/sites/default/files/341204-intel-data-streaming-accelerator-spec.pdf
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH 1/7] docs: x86: Add a documentation for ENQCMD
@ 2020-03-30 19:33   ` Fenghua Yu
  0 siblings, 0 replies; 74+ messages in thread
From: Fenghua Yu @ 2020-03-30 19:33 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
	David Woodhouse, Lu Baolu, Dave Hansen, Tony Luck, Ashok Raj,
	Jacob Jun Pan, Dave Jiang, Sohil Mehta, Ravi V Shankar
  Cc: Fenghua Yu, iommu, x86, linux-kernel

From: Ashok Raj <ashok.raj@intel.com>

ENQCMD and Data Streaming Accelerator (DSA) and all of their associated
features are a complicated stack with lots of interconnected pieces.
This documentation provides a big picture overview for all of the
features.

Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
 Documentation/x86/enqcmd.rst | 185 +++++++++++++++++++++++++++++++++++
 1 file changed, 185 insertions(+)
 create mode 100644 Documentation/x86/enqcmd.rst

diff --git a/Documentation/x86/enqcmd.rst b/Documentation/x86/enqcmd.rst
new file mode 100644
index 000000000000..414ef7d24028
--- /dev/null
+++ b/Documentation/x86/enqcmd.rst
@@ -0,0 +1,185 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+Improved Device Interaction Overview
+
+== Background ==
+
+Shared Virtual Addressing (SVA) allows the processor and device to use the
+same virtual addresses avoiding the need for software to translate virtual
+addresses to physical addresses. ENQCMD is a new instruction on Intel
+platforms that allows user applications to directly notify hardware of new
+work, much like doorbells are used in some hardware, but carries a payload
+that carries the PASID and some additional device specific commands
+along with it.
+
+== Address Space Tagging ==
+
+A new MSR (MSR_IA32_PASID) allows an application address space to be
+associated with what the PCIe spec calls a Process Address Space ID
+(PASID). This PASID tag is carried along with all requests between
+applications and devices and allows devices to interact with the process
+address space.
+
+This MSR is managed with the XSAVE feature set as "supervisor state".
+
+== PASID Management ==
+
+The kernel must allocate a PASID on behalf of each process and program it
+into the new MSR to communicate the process identity to platform hardware.
+ENQCMD uses the PASID stored in this MSR to tag requests from this process.
+Requests for DMA from the device are also tagged with the same PASID. The
+platform IOMMU uses the PASID in the transaction to perform address
+translation. The IOMMU api's setup the corresponding PASID entry in IOMMU
+with the process address used by the CPU (for e.g cr3 in x86).
+
+The MSR must be configured on each logical CPU before any application
+thread can interact with a device.  Threads that belong to the same
+process share the same page tables, thus the same MSR value.
+
+The PASID allocation and MSR programming may occur long after a process and
+its threads have been created. If a thread uses ENQCMD without the MSR
+first being populated, it will #GP.  The kernel will fix up the #GP by
+writing the process-wide PASID into the thread that took the #GP. A single
+process PASID can be used simultaneously with multiple devices since they
+all share the same address space.
+
+New threads could inherit the MSR value from the parent. But this would
+involve additional state management for those threads which may never use
+ENQCMD. Clearing the MSR at thread creation permits all threads to have a
+consistent behavior; the PASID is only programmed when the thread calls
+ENQCMD for the first time.
+
+Although ENQCMD can be executed in the kernel, there isn't any usage yet.
+Currently #GP handler doesn't fix up #GP triggered from ENQCMD executed
+in the kernel
+
+== Relationships ==
+
+ * Each process has many threads, but only one PASID
+ * Devices have a limited number (~10's to 1000's) of hardware
+   workqueues and each portal maps down to a single workqueue.
+   The device driver manages allocating hardware workqueues.
+ * A single mmap() maps a single hardware workqueue as a "portal"
+ * For each device with which a process interacts, there must be
+   one or more mmap()'d portals.
+ * Many threads within a process can share a single portal to access
+   a single device.
+ * Multiple processes can separately mmap() the same portal, in
+   which case they still share one device hardware workqueue.
+ * The single process-wide PASID is used by all threads to interact
+   with all devices.  There is not, for instance, a PASID for each
+   thread or each thread<->device pair.
+
+== FAQ ==
+
+* What is SVA/SVM?
+
+Shared Virtual Addressing (SVA) permits I/O hardware and the processor to
+work in the same address space. In short, sharing the address space. Some
+call it Shared Virtual Memory (SVM), but Linux community wanted to avoid
+it with Posix Shared Memory and Secure Virtual Machines which were terms
+already in circulation.
+
+* What is a PASID?
+
+A Process Address Space ID (PASID) is a PCIe-defined TLP Prefix. A PASID is
+a 20 bit number allocated and managed by the OS. PASID is included in all
+transactions between the platform and the device.
+
+* How are shared work queues different?
+
+Traditionally to allow user space applications interact with hardware,
+there is a separate instance required per process. For e.g. consider
+doorbells as a mechanism of informing hardware about work to process. Each
+doorbell is required to be spaced 4k (or page-size) apart for process
+isolation. This requires hardware to provision that space and reserve in
+MMIO. This doesn't scale as the number of threads becomes quite large. The
+hardware also manages the queue depth for Shared Work Queues (SWQ), and
+consumers don't need to track queue depth. If there is no space to accept
+a command, the device will return an error indicating retry. Also
+submitting a command to an MMIO address that can't accept ENQCMD will
+return retry in response.
+
+SWQ allows hardware to provision just a single address in the device. When
+used with ENQCMD to submit work, the device can distinguish the process
+submitting the work since it will include the PASID assigned to that
+process. This decreases the pressure of hardware requiring to support
+hardware to scale to a large number of processes.
+
+* Is this the same as a user space device driver?
+
+Communicating with the device via the shared work queue is much simpler
+than a full blown user space driver. The kernel driver does all the
+initialization of the hardware. User space only needs to worry about
+submitting work and processing completions.
+
+* Is this the same as SR-IOV?
+
+Single Root I/O Virtualization (SR-IOV) focuses on providing independent
+hardware interfaces for virtualizing hardware. Hence its required to be
+almost fully functional interface to software supporting the traditional
+BAR's, space for interrupts via MSI-x, its own register layout. Creating
+of this Virtual Functions (VFs) is assisted by the Physical Function (PF)
+driver.
+
+Scalable I/O Virtualization builds on the PASID concept to create device
+instances for virtualization. SIOV requires host software to assist in
+creating virtual devices, each virtual device is represented by a PASID.
+This allows device hardware to optimize device resource creation and can
+grow dynamically on demand. SR-IOV creation and management is very static
+in nature. Consult references below for more details.
+
+* Why not just create a virtual function for each app?
+
+Creating PCIe SRIOV type virtual functions (VF) are expensive. They create
+duplicated hardware for PCI config space requirements, Interrupts such as
+MSIx for instance. Resources such as interrupts have to be hard partitioned
+between VF's at creation time, and cannot scale dynamically on demand. The
+VF's are not completely independent from the Physical function (PF). Most
+VF's require some communication and assistance from the PF driver. SIOV
+creates a software defined device. Where all the configuration and control
+aspects are mediated via the slow path. The work submission and completion
+happen without any mediation.
+
+* Does this support virtualization?
+
+ENQCMD can be used from within a guest VM. In these cases the VMM helps
+with setting up a translation table to translate from Guest PASID to Host
+PASID. Please consult the ENQCMD instruction set reference for more
+details.
+
+* Does memory need to be pinned?
+
+When devices support SVA, along with platform hardware such as IOMMU
+supporting such devices, there is no need to pin memory for DMA purposes.
+Devices that support SVA also support other PCIe features that remove the
+pinning requirement for memory.
+
+Device TLB support - Device requests the IOMMU to lookup an address before
+use via Address Translation Service (ATS) requests.  If the mapping exists
+but there is no page allocated by the OS, IOMMU hardware returns that no
+mapping exists.
+
+Device requests that virtual address to be mapped via Page Request
+Interface (PRI). Once the OS has successfully completed  the mapping, it
+returns the response back to the device. The device continues again to
+request for a translation and continues.
+
+IOMMU works with the OS in managing consistency of page-tables with the
+device. When removing pages, it interacts with the device to remove any
+device-tlb that might have been cached before removing the mappings from
+the OS.
+
+== References ==
+
+VT-D:
+https://01.org/blogs/ashokraj/2018/recent-enhancements-intel-virtualization-technology-directed-i/o-intel-vt-d
+
+SIOV:
+https://01.org/blogs/2019/assignable-interfaces-intel-scalable-i/o-virtualization-linux
+
+ENQCMD in ISE:
+https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf
+
+DSA spec:
+https://software.intel.com/sites/default/files/341204-intel-data-streaming-accelerator-spec.pdf
-- 
2.19.1

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH 2/7] x86/cpufeatures: Enumerate ENQCMD and ENQCMDS instructions
  2020-03-30 19:33 ` Fenghua Yu
@ 2020-03-30 19:33   ` Fenghua Yu
  -1 siblings, 0 replies; 74+ messages in thread
From: Fenghua Yu @ 2020-03-30 19:33 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
	David Woodhouse, Lu Baolu, Dave Hansen, Tony Luck, Ashok Raj,
	Jacob Jun Pan, Dave Jiang, Sohil Mehta, Ravi V Shankar
  Cc: linux-kernel, x86, iommu, Fenghua Yu

A user space application can execute ENQCMD instruction to submit work
to device. The kernel executes ENQCMDS instruction to submit work to
device.

There is a lot of other enabling needed for the instructions to actually
be usable in user space and the kernel, and that enabling is coming later
in the series and in device drivers.

The CPU feature flag is shown as "enqcmd" in /proc/cpuinfo.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
 arch/x86/include/asm/cpufeatures.h | 1 +
 arch/x86/kernel/cpu/cpuid-deps.c   | 1 +
 2 files changed, 2 insertions(+)

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index f3327cb56edf..d12ee3be1b93 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -349,6 +349,7 @@
 #define X86_FEATURE_CLDEMOTE		(16*32+25) /* CLDEMOTE instruction */
 #define X86_FEATURE_MOVDIRI		(16*32+27) /* MOVDIRI instruction */
 #define X86_FEATURE_MOVDIR64B		(16*32+28) /* MOVDIR64B instruction */
+#define X86_FEATURE_ENQCMD		(16*32+29) /* ENQCMD and ENQCMDS instructions */
 
 /* AMD-defined CPU features, CPUID level 0x80000007 (EBX), word 17 */
 #define X86_FEATURE_OVERFLOW_RECOV	(17*32+ 0) /* MCA overflow recovery support */
diff --git a/arch/x86/kernel/cpu/cpuid-deps.c b/arch/x86/kernel/cpu/cpuid-deps.c
index 3cbe24ca80ab..3a02707c1f4d 100644
--- a/arch/x86/kernel/cpu/cpuid-deps.c
+++ b/arch/x86/kernel/cpu/cpuid-deps.c
@@ -69,6 +69,7 @@ static const struct cpuid_dep cpuid_deps[] = {
 	{ X86_FEATURE_CQM_MBM_TOTAL,		X86_FEATURE_CQM_LLC   },
 	{ X86_FEATURE_CQM_MBM_LOCAL,		X86_FEATURE_CQM_LLC   },
 	{ X86_FEATURE_AVX512_BF16,		X86_FEATURE_AVX512VL  },
+	{ X86_FEATURE_ENQCMD,			X86_FEATURE_XSAVES    },
 	{}
 };
 
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH 2/7] x86/cpufeatures: Enumerate ENQCMD and ENQCMDS instructions
@ 2020-03-30 19:33   ` Fenghua Yu
  0 siblings, 0 replies; 74+ messages in thread
From: Fenghua Yu @ 2020-03-30 19:33 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
	David Woodhouse, Lu Baolu, Dave Hansen, Tony Luck, Ashok Raj,
	Jacob Jun Pan, Dave Jiang, Sohil Mehta, Ravi V Shankar
  Cc: Fenghua Yu, iommu, x86, linux-kernel

A user space application can execute ENQCMD instruction to submit work
to device. The kernel executes ENQCMDS instruction to submit work to
device.

There is a lot of other enabling needed for the instructions to actually
be usable in user space and the kernel, and that enabling is coming later
in the series and in device drivers.

The CPU feature flag is shown as "enqcmd" in /proc/cpuinfo.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
 arch/x86/include/asm/cpufeatures.h | 1 +
 arch/x86/kernel/cpu/cpuid-deps.c   | 1 +
 2 files changed, 2 insertions(+)

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index f3327cb56edf..d12ee3be1b93 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -349,6 +349,7 @@
 #define X86_FEATURE_CLDEMOTE		(16*32+25) /* CLDEMOTE instruction */
 #define X86_FEATURE_MOVDIRI		(16*32+27) /* MOVDIRI instruction */
 #define X86_FEATURE_MOVDIR64B		(16*32+28) /* MOVDIR64B instruction */
+#define X86_FEATURE_ENQCMD		(16*32+29) /* ENQCMD and ENQCMDS instructions */
 
 /* AMD-defined CPU features, CPUID level 0x80000007 (EBX), word 17 */
 #define X86_FEATURE_OVERFLOW_RECOV	(17*32+ 0) /* MCA overflow recovery support */
diff --git a/arch/x86/kernel/cpu/cpuid-deps.c b/arch/x86/kernel/cpu/cpuid-deps.c
index 3cbe24ca80ab..3a02707c1f4d 100644
--- a/arch/x86/kernel/cpu/cpuid-deps.c
+++ b/arch/x86/kernel/cpu/cpuid-deps.c
@@ -69,6 +69,7 @@ static const struct cpuid_dep cpuid_deps[] = {
 	{ X86_FEATURE_CQM_MBM_TOTAL,		X86_FEATURE_CQM_LLC   },
 	{ X86_FEATURE_CQM_MBM_LOCAL,		X86_FEATURE_CQM_LLC   },
 	{ X86_FEATURE_AVX512_BF16,		X86_FEATURE_AVX512VL  },
+	{ X86_FEATURE_ENQCMD,			X86_FEATURE_XSAVES    },
 	{}
 };
 
-- 
2.19.1

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH 3/7] x86/fpu/xstate: Add supervisor PASID state for ENQCMD feature
  2020-03-30 19:33 ` Fenghua Yu
@ 2020-03-30 19:33   ` Fenghua Yu
  -1 siblings, 0 replies; 74+ messages in thread
From: Fenghua Yu @ 2020-03-30 19:33 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
	David Woodhouse, Lu Baolu, Dave Hansen, Tony Luck, Ashok Raj,
	Jacob Jun Pan, Dave Jiang, Sohil Mehta, Ravi V Shankar
  Cc: linux-kernel, x86, iommu, Yu-cheng Yu, Fenghua Yu

From: Yu-cheng Yu <yu-cheng.yu@intel.com>

The IA32_PASID MSR is used when a task submits work via the ENQCMD
instruction. The per task MSR is stored in the task's supervisor FPU
PASID state and is context switched by XSAVES/XRSTORS.

Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
 arch/x86/include/asm/fpu/types.h  | 10 ++++++++++
 arch/x86/include/asm/fpu/xstate.h |  2 +-
 arch/x86/kernel/fpu/xstate.c      |  4 ++++
 3 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h
index f098f6cab94b..00f8efd4c07d 100644
--- a/arch/x86/include/asm/fpu/types.h
+++ b/arch/x86/include/asm/fpu/types.h
@@ -114,6 +114,7 @@ enum xfeature {
 	XFEATURE_Hi16_ZMM,
 	XFEATURE_PT_UNIMPLEMENTED_SO_FAR,
 	XFEATURE_PKRU,
+	XFEATURE_PASID,
 
 	XFEATURE_MAX,
 };
@@ -128,6 +129,7 @@ enum xfeature {
 #define XFEATURE_MASK_Hi16_ZMM		(1 << XFEATURE_Hi16_ZMM)
 #define XFEATURE_MASK_PT		(1 << XFEATURE_PT_UNIMPLEMENTED_SO_FAR)
 #define XFEATURE_MASK_PKRU		(1 << XFEATURE_PKRU)
+#define XFEATURE_MASK_PASID		(1 << XFEATURE_PASID)
 
 #define XFEATURE_MASK_FPSSE		(XFEATURE_MASK_FP | XFEATURE_MASK_SSE)
 #define XFEATURE_MASK_AVX512		(XFEATURE_MASK_OPMASK \
@@ -229,6 +231,14 @@ struct pkru_state {
 	u32				pad;
 } __packed;
 
+/*
+ * State component 10 is supervisor state used for context-switching the
+ * PASID state.
+ */
+struct ia32_pasid_state {
+	u64 pasid;
+} __packed;
+
 struct xstate_header {
 	u64				xfeatures;
 	u64				xcomp_bv;
diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
index 422d8369012a..ab9833c57aaa 100644
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -33,7 +33,7 @@
 				      XFEATURE_MASK_BNDCSR)
 
 /* All currently supported supervisor features */
-#define XFEATURE_MASK_SUPERVISOR_SUPPORTED (0)
+#define XFEATURE_MASK_SUPERVISOR_SUPPORTED (XFEATURE_MASK_PASID)
 
 /*
  * Unsupported supervisor features. When a supervisor feature in this mask is
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 7d0a9f878b26..8724675532de 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -37,6 +37,7 @@ static const char *xfeature_names[] =
 	"AVX-512 ZMM_Hi256"		,
 	"Processor Trace (unused)"	,
 	"Protection Keys User registers",
+	"PASID state",
 	"unknown xstate feature"	,
 };
 
@@ -51,6 +52,7 @@ static short xsave_cpuid_features[] __initdata = {
 	X86_FEATURE_AVX512F,
 	X86_FEATURE_INTEL_PT,
 	X86_FEATURE_PKU,
+	X86_FEATURE_ENQCMD,
 };
 
 /*
@@ -316,6 +318,7 @@ static void __init print_xstate_features(void)
 	print_xstate_feature(XFEATURE_MASK_ZMM_Hi256);
 	print_xstate_feature(XFEATURE_MASK_Hi16_ZMM);
 	print_xstate_feature(XFEATURE_MASK_PKRU);
+	print_xstate_feature(XFEATURE_MASK_PASID);
 }
 
 /*
@@ -590,6 +593,7 @@ static void check_xstate_against_struct(int nr)
 	XCHECK_SZ(sz, nr, XFEATURE_ZMM_Hi256, struct avx_512_zmm_uppers_state);
 	XCHECK_SZ(sz, nr, XFEATURE_Hi16_ZMM,  struct avx_512_hi16_state);
 	XCHECK_SZ(sz, nr, XFEATURE_PKRU,      struct pkru_state);
+	XCHECK_SZ(sz, nr, XFEATURE_PASID,     struct ia32_pasid_state);
 
 	/*
 	 * Make *SURE* to add any feature numbers in below if
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH 3/7] x86/fpu/xstate: Add supervisor PASID state for ENQCMD feature
@ 2020-03-30 19:33   ` Fenghua Yu
  0 siblings, 0 replies; 74+ messages in thread
From: Fenghua Yu @ 2020-03-30 19:33 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
	David Woodhouse, Lu Baolu, Dave Hansen, Tony Luck, Ashok Raj,
	Jacob Jun Pan, Dave Jiang, Sohil Mehta, Ravi V Shankar
  Cc: Fenghua Yu, iommu, x86, Yu-cheng Yu, linux-kernel

From: Yu-cheng Yu <yu-cheng.yu@intel.com>

The IA32_PASID MSR is used when a task submits work via the ENQCMD
instruction. The per task MSR is stored in the task's supervisor FPU
PASID state and is context switched by XSAVES/XRSTORS.

Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
 arch/x86/include/asm/fpu/types.h  | 10 ++++++++++
 arch/x86/include/asm/fpu/xstate.h |  2 +-
 arch/x86/kernel/fpu/xstate.c      |  4 ++++
 3 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h
index f098f6cab94b..00f8efd4c07d 100644
--- a/arch/x86/include/asm/fpu/types.h
+++ b/arch/x86/include/asm/fpu/types.h
@@ -114,6 +114,7 @@ enum xfeature {
 	XFEATURE_Hi16_ZMM,
 	XFEATURE_PT_UNIMPLEMENTED_SO_FAR,
 	XFEATURE_PKRU,
+	XFEATURE_PASID,
 
 	XFEATURE_MAX,
 };
@@ -128,6 +129,7 @@ enum xfeature {
 #define XFEATURE_MASK_Hi16_ZMM		(1 << XFEATURE_Hi16_ZMM)
 #define XFEATURE_MASK_PT		(1 << XFEATURE_PT_UNIMPLEMENTED_SO_FAR)
 #define XFEATURE_MASK_PKRU		(1 << XFEATURE_PKRU)
+#define XFEATURE_MASK_PASID		(1 << XFEATURE_PASID)
 
 #define XFEATURE_MASK_FPSSE		(XFEATURE_MASK_FP | XFEATURE_MASK_SSE)
 #define XFEATURE_MASK_AVX512		(XFEATURE_MASK_OPMASK \
@@ -229,6 +231,14 @@ struct pkru_state {
 	u32				pad;
 } __packed;
 
+/*
+ * State component 10 is supervisor state used for context-switching the
+ * PASID state.
+ */
+struct ia32_pasid_state {
+	u64 pasid;
+} __packed;
+
 struct xstate_header {
 	u64				xfeatures;
 	u64				xcomp_bv;
diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
index 422d8369012a..ab9833c57aaa 100644
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -33,7 +33,7 @@
 				      XFEATURE_MASK_BNDCSR)
 
 /* All currently supported supervisor features */
-#define XFEATURE_MASK_SUPERVISOR_SUPPORTED (0)
+#define XFEATURE_MASK_SUPERVISOR_SUPPORTED (XFEATURE_MASK_PASID)
 
 /*
  * Unsupported supervisor features. When a supervisor feature in this mask is
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 7d0a9f878b26..8724675532de 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -37,6 +37,7 @@ static const char *xfeature_names[] =
 	"AVX-512 ZMM_Hi256"		,
 	"Processor Trace (unused)"	,
 	"Protection Keys User registers",
+	"PASID state",
 	"unknown xstate feature"	,
 };
 
@@ -51,6 +52,7 @@ static short xsave_cpuid_features[] __initdata = {
 	X86_FEATURE_AVX512F,
 	X86_FEATURE_INTEL_PT,
 	X86_FEATURE_PKU,
+	X86_FEATURE_ENQCMD,
 };
 
 /*
@@ -316,6 +318,7 @@ static void __init print_xstate_features(void)
 	print_xstate_feature(XFEATURE_MASK_ZMM_Hi256);
 	print_xstate_feature(XFEATURE_MASK_Hi16_ZMM);
 	print_xstate_feature(XFEATURE_MASK_PKRU);
+	print_xstate_feature(XFEATURE_MASK_PASID);
 }
 
 /*
@@ -590,6 +593,7 @@ static void check_xstate_against_struct(int nr)
 	XCHECK_SZ(sz, nr, XFEATURE_ZMM_Hi256, struct avx_512_zmm_uppers_state);
 	XCHECK_SZ(sz, nr, XFEATURE_Hi16_ZMM,  struct avx_512_hi16_state);
 	XCHECK_SZ(sz, nr, XFEATURE_PKRU,      struct pkru_state);
+	XCHECK_SZ(sz, nr, XFEATURE_PASID,     struct ia32_pasid_state);
 
 	/*
 	 * Make *SURE* to add any feature numbers in below if
-- 
2.19.1

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH 4/7] x86/msr-index: Define IA32_PASID MSR
  2020-03-30 19:33 ` Fenghua Yu
@ 2020-03-30 19:33   ` Fenghua Yu
  -1 siblings, 0 replies; 74+ messages in thread
From: Fenghua Yu @ 2020-03-30 19:33 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
	David Woodhouse, Lu Baolu, Dave Hansen, Tony Luck, Ashok Raj,
	Jacob Jun Pan, Dave Jiang, Sohil Mehta, Ravi V Shankar
  Cc: linux-kernel, x86, iommu, Fenghua Yu

The IA32_PASID MSR (0xd93) contains the Process Address Space Identifier
(PASID), a 20-bit value. Bit 31 must be set to indicate the value
programmed in the MSR is valid. Hardware uses PASID to identify which
process submits the work and direct responses to the right process.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
 arch/x86/include/asm/msr-index.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index d5e517d1c3dd..ebda24839dc5 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -224,6 +224,9 @@
 #define MSR_IA32_LASTINTFROMIP		0x000001dd
 #define MSR_IA32_LASTINTTOIP		0x000001de
 
+#define MSR_IA32_PASID			0x00000d93
+#define MSR_IA32_PASID_VALID		BIT_ULL(31)
+
 /* DEBUGCTLMSR bits (others vary by model): */
 #define DEBUGCTLMSR_LBR			(1UL <<  0) /* last branch recording */
 #define DEBUGCTLMSR_BTF_SHIFT		1
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH 4/7] x86/msr-index: Define IA32_PASID MSR
@ 2020-03-30 19:33   ` Fenghua Yu
  0 siblings, 0 replies; 74+ messages in thread
From: Fenghua Yu @ 2020-03-30 19:33 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
	David Woodhouse, Lu Baolu, Dave Hansen, Tony Luck, Ashok Raj,
	Jacob Jun Pan, Dave Jiang, Sohil Mehta, Ravi V Shankar
  Cc: Fenghua Yu, iommu, x86, linux-kernel

The IA32_PASID MSR (0xd93) contains the Process Address Space Identifier
(PASID), a 20-bit value. Bit 31 must be set to indicate the value
programmed in the MSR is valid. Hardware uses PASID to identify which
process submits the work and direct responses to the right process.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
 arch/x86/include/asm/msr-index.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index d5e517d1c3dd..ebda24839dc5 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -224,6 +224,9 @@
 #define MSR_IA32_LASTINTFROMIP		0x000001dd
 #define MSR_IA32_LASTINTTOIP		0x000001de
 
+#define MSR_IA32_PASID			0x00000d93
+#define MSR_IA32_PASID_VALID		BIT_ULL(31)
+
 /* DEBUGCTLMSR bits (others vary by model): */
 #define DEBUGCTLMSR_LBR			(1UL <<  0) /* last branch recording */
 #define DEBUGCTLMSR_BTF_SHIFT		1
-- 
2.19.1

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH 5/7] x86/mmu: Allocate/free PASID
  2020-03-30 19:33 ` Fenghua Yu
@ 2020-03-30 19:33   ` Fenghua Yu
  -1 siblings, 0 replies; 74+ messages in thread
From: Fenghua Yu @ 2020-03-30 19:33 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
	David Woodhouse, Lu Baolu, Dave Hansen, Tony Luck, Ashok Raj,
	Jacob Jun Pan, Dave Jiang, Sohil Mehta, Ravi V Shankar
  Cc: linux-kernel, x86, iommu, Fenghua Yu

PASID is shared by all threads in a process. So the logical place to keep
track of it is in the "mm". Add the field to the architecture specific
mm_context_t structure.

A PASID is allocated for an "mm" the first time any thread attaches
to an SVM capable device. Later device atatches (whether to the same
device or another SVM device) will re-use the same PASID.

The PASID is freed when the process exits (so no need to keep
reference counts on how many SVM devices are sharing the PASID).

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
 arch/x86/include/asm/iommu.h       |  2 +
 arch/x86/include/asm/mmu.h         |  4 ++
 arch/x86/include/asm/mmu_context.h | 14 +++++
 drivers/iommu/intel-svm.c          | 82 +++++++++++++++++++++++++++---
 4 files changed, 94 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/iommu.h b/arch/x86/include/asm/iommu.h
index bf1ed2ddc74b..ed41259fe7ac 100644
--- a/arch/x86/include/asm/iommu.h
+++ b/arch/x86/include/asm/iommu.h
@@ -26,4 +26,6 @@ arch_rmrr_sanity_check(struct acpi_dmar_reserved_memory *rmrr)
 	return -EINVAL;
 }
 
+void __free_pasid(struct mm_struct *mm);
+
 #endif /* _ASM_X86_IOMMU_H */
diff --git a/arch/x86/include/asm/mmu.h b/arch/x86/include/asm/mmu.h
index bdeae9291e5c..137bf51f19e6 100644
--- a/arch/x86/include/asm/mmu.h
+++ b/arch/x86/include/asm/mmu.h
@@ -50,6 +50,10 @@ typedef struct {
 	u16 pkey_allocation_map;
 	s16 execute_only_pkey;
 #endif
+
+#ifdef CONFIG_INTEL_IOMMU_SVM
+	int pasid;
+#endif
 } mm_context_t;
 
 #define INIT_MM_CONTEXT(mm)						\
diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h
index b538d9ddee9c..1c020c7955e6 100644
--- a/arch/x86/include/asm/mmu_context.h
+++ b/arch/x86/include/asm/mmu_context.h
@@ -13,6 +13,7 @@
 #include <asm/tlbflush.h>
 #include <asm/paravirt.h>
 #include <asm/debugreg.h>
+#include <asm/iommu.h>
 
 extern atomic64_t last_mm_ctx_id;
 
@@ -129,9 +130,22 @@ static inline int init_new_context(struct task_struct *tsk,
 	init_new_context_ldt(mm);
 	return 0;
 }
+
+static inline void free_pasid(struct mm_struct *mm)
+{
+	if (!IS_ENABLED(CONFIG_INTEL_IOMMU_SVM))
+		return;
+
+	if (!cpu_feature_enabled(X86_FEATURE_ENQCMD))
+		return;
+
+	__free_pasid(mm);
+}
+
 static inline void destroy_context(struct mm_struct *mm)
 {
 	destroy_context_ldt(mm);
+	free_pasid(mm);
 }
 
 extern void switch_mm(struct mm_struct *prev, struct mm_struct *next,
diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
index d7f2a5358900..da718a49e91e 100644
--- a/drivers/iommu/intel-svm.c
+++ b/drivers/iommu/intel-svm.c
@@ -226,6 +226,45 @@ static LIST_HEAD(global_svm_list);
 	list_for_each_entry((sdev), &(svm)->devs, list)	\
 		if ((d) != (sdev)->dev) {} else
 
+/*
+ * If this mm already has a PASID we can use it. Otherwise allocate a new one.
+ * Let the caller know if we did an allocation via 'new_pasid'.
+ */
+static int alloc_pasid(struct intel_svm *svm, struct mm_struct *mm,
+		       int pasid_max,  bool *new_pasid, int flags)
+{
+	int pasid;
+
+	/*
+	 * Reuse the PASID if the mm already has a PASID and not a private
+	 * PASID is requested.
+	 */
+	if (mm && mm->context.pasid && !(flags & SVM_FLAG_PRIVATE_PASID)) {
+		/*
+		 * Once a PASID is allocated for this mm, the PASID
+		 * stays with the mm until the mm is dropped. Reuse
+		 * the PASID which has been already allocated for the
+		 * mm instead of allocating a new one.
+		 */
+		ioasid_set_data(mm->context.pasid, svm);
+		*new_pasid = false;
+
+		return mm->context.pasid;
+	}
+
+	/*
+	 * Allocate a new pasid. Do not use PASID 0, reserved for RID to
+	 * PASID.
+	 */
+	pasid = ioasid_alloc(NULL, PASID_MIN, pasid_max - 1, svm);
+	if (pasid == INVALID_IOASID)
+		return -ENOSPC;
+
+	*new_pasid = true;
+
+	return pasid;
+}
+
 int intel_svm_bind_mm(struct device *dev, int *pasid, int flags, struct svm_dev_ops *ops)
 {
 	struct intel_iommu *iommu = intel_svm_device_to_iommu(dev);
@@ -324,6 +363,8 @@ int intel_svm_bind_mm(struct device *dev, int *pasid, int flags, struct svm_dev_
 	init_rcu_head(&sdev->rcu);
 
 	if (!svm) {
+		bool new_pasid;
+
 		svm = kzalloc(sizeof(*svm), GFP_KERNEL);
 		if (!svm) {
 			ret = -ENOMEM;
@@ -335,15 +376,13 @@ int intel_svm_bind_mm(struct device *dev, int *pasid, int flags, struct svm_dev_
 		if (pasid_max > intel_pasid_max_id)
 			pasid_max = intel_pasid_max_id;
 
-		/* Do not use PASID 0, reserved for RID to PASID */
-		svm->pasid = ioasid_alloc(NULL, PASID_MIN,
-					  pasid_max - 1, svm);
-		if (svm->pasid == INVALID_IOASID) {
+		svm->pasid = alloc_pasid(svm, mm, pasid_max, &new_pasid, flags);
+		if (svm->pasid < 0) {
 			kfree(svm);
 			kfree(sdev);
-			ret = -ENOSPC;
 			goto out;
 		}
+
 		svm->notifier.ops = &intel_mmuops;
 		svm->mm = mm;
 		svm->flags = flags;
@@ -353,7 +392,8 @@ int intel_svm_bind_mm(struct device *dev, int *pasid, int flags, struct svm_dev_
 		if (mm) {
 			ret = mmu_notifier_register(&svm->notifier, mm);
 			if (ret) {
-				ioasid_free(svm->pasid);
+				if (new_pasid)
+					ioasid_free(svm->pasid);
 				kfree(svm);
 				kfree(sdev);
 				goto out;
@@ -371,12 +411,21 @@ int intel_svm_bind_mm(struct device *dev, int *pasid, int flags, struct svm_dev_
 		if (ret) {
 			if (mm)
 				mmu_notifier_unregister(&svm->notifier, mm);
-			ioasid_free(svm->pasid);
+			if (new_pasid)
+				ioasid_free(svm->pasid);
 			kfree(svm);
 			kfree(sdev);
 			goto out;
 		}
 
+		if (mm && new_pasid && !(flags & SVM_FLAG_PRIVATE_PASID)) {
+			/*
+			 * Track the new pasid in the mm. The pasid will be
+			 * freed at process exit. Don't track requested
+			 * private PASID in the mm.
+			 */
+			mm->context.pasid = svm->pasid;
+		}
 		list_add_tail(&svm->list, &global_svm_list);
 	} else {
 		/*
@@ -447,7 +496,8 @@ int intel_svm_unbind_mm(struct device *dev, int pasid)
 			kfree_rcu(sdev, rcu);
 
 			if (list_empty(&svm->devs)) {
-				ioasid_free(svm->pasid);
+				/* Clear data in the pasid. */
+				ioasid_set_data(pasid, NULL);
 				if (svm->mm)
 					mmu_notifier_unregister(&svm->notifier, svm->mm);
 				list_del(&svm->list);
@@ -693,3 +743,19 @@ static irqreturn_t prq_event_thread(int irq, void *d)
 
 	return IRQ_RETVAL(handled);
 }
+
+/* On process exit free the PASID (if one was allocated). */
+void __free_pasid(struct mm_struct *mm)
+{
+	int pasid = mm->context.pasid;
+
+	if (!pasid)
+		return;
+
+	/*
+	 * Since the pasid is not bound to any svm by now, there is no race
+	 * here with binding/unbinding and no need to protect the free
+	 * operation by pasid_mutex.
+	 */
+	ioasid_free(pasid);
+}
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH 5/7] x86/mmu: Allocate/free PASID
@ 2020-03-30 19:33   ` Fenghua Yu
  0 siblings, 0 replies; 74+ messages in thread
From: Fenghua Yu @ 2020-03-30 19:33 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
	David Woodhouse, Lu Baolu, Dave Hansen, Tony Luck, Ashok Raj,
	Jacob Jun Pan, Dave Jiang, Sohil Mehta, Ravi V Shankar
  Cc: Fenghua Yu, iommu, x86, linux-kernel

PASID is shared by all threads in a process. So the logical place to keep
track of it is in the "mm". Add the field to the architecture specific
mm_context_t structure.

A PASID is allocated for an "mm" the first time any thread attaches
to an SVM capable device. Later device atatches (whether to the same
device or another SVM device) will re-use the same PASID.

The PASID is freed when the process exits (so no need to keep
reference counts on how many SVM devices are sharing the PASID).

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
 arch/x86/include/asm/iommu.h       |  2 +
 arch/x86/include/asm/mmu.h         |  4 ++
 arch/x86/include/asm/mmu_context.h | 14 +++++
 drivers/iommu/intel-svm.c          | 82 +++++++++++++++++++++++++++---
 4 files changed, 94 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/iommu.h b/arch/x86/include/asm/iommu.h
index bf1ed2ddc74b..ed41259fe7ac 100644
--- a/arch/x86/include/asm/iommu.h
+++ b/arch/x86/include/asm/iommu.h
@@ -26,4 +26,6 @@ arch_rmrr_sanity_check(struct acpi_dmar_reserved_memory *rmrr)
 	return -EINVAL;
 }
 
+void __free_pasid(struct mm_struct *mm);
+
 #endif /* _ASM_X86_IOMMU_H */
diff --git a/arch/x86/include/asm/mmu.h b/arch/x86/include/asm/mmu.h
index bdeae9291e5c..137bf51f19e6 100644
--- a/arch/x86/include/asm/mmu.h
+++ b/arch/x86/include/asm/mmu.h
@@ -50,6 +50,10 @@ typedef struct {
 	u16 pkey_allocation_map;
 	s16 execute_only_pkey;
 #endif
+
+#ifdef CONFIG_INTEL_IOMMU_SVM
+	int pasid;
+#endif
 } mm_context_t;
 
 #define INIT_MM_CONTEXT(mm)						\
diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h
index b538d9ddee9c..1c020c7955e6 100644
--- a/arch/x86/include/asm/mmu_context.h
+++ b/arch/x86/include/asm/mmu_context.h
@@ -13,6 +13,7 @@
 #include <asm/tlbflush.h>
 #include <asm/paravirt.h>
 #include <asm/debugreg.h>
+#include <asm/iommu.h>
 
 extern atomic64_t last_mm_ctx_id;
 
@@ -129,9 +130,22 @@ static inline int init_new_context(struct task_struct *tsk,
 	init_new_context_ldt(mm);
 	return 0;
 }
+
+static inline void free_pasid(struct mm_struct *mm)
+{
+	if (!IS_ENABLED(CONFIG_INTEL_IOMMU_SVM))
+		return;
+
+	if (!cpu_feature_enabled(X86_FEATURE_ENQCMD))
+		return;
+
+	__free_pasid(mm);
+}
+
 static inline void destroy_context(struct mm_struct *mm)
 {
 	destroy_context_ldt(mm);
+	free_pasid(mm);
 }
 
 extern void switch_mm(struct mm_struct *prev, struct mm_struct *next,
diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
index d7f2a5358900..da718a49e91e 100644
--- a/drivers/iommu/intel-svm.c
+++ b/drivers/iommu/intel-svm.c
@@ -226,6 +226,45 @@ static LIST_HEAD(global_svm_list);
 	list_for_each_entry((sdev), &(svm)->devs, list)	\
 		if ((d) != (sdev)->dev) {} else
 
+/*
+ * If this mm already has a PASID we can use it. Otherwise allocate a new one.
+ * Let the caller know if we did an allocation via 'new_pasid'.
+ */
+static int alloc_pasid(struct intel_svm *svm, struct mm_struct *mm,
+		       int pasid_max,  bool *new_pasid, int flags)
+{
+	int pasid;
+
+	/*
+	 * Reuse the PASID if the mm already has a PASID and not a private
+	 * PASID is requested.
+	 */
+	if (mm && mm->context.pasid && !(flags & SVM_FLAG_PRIVATE_PASID)) {
+		/*
+		 * Once a PASID is allocated for this mm, the PASID
+		 * stays with the mm until the mm is dropped. Reuse
+		 * the PASID which has been already allocated for the
+		 * mm instead of allocating a new one.
+		 */
+		ioasid_set_data(mm->context.pasid, svm);
+		*new_pasid = false;
+
+		return mm->context.pasid;
+	}
+
+	/*
+	 * Allocate a new pasid. Do not use PASID 0, reserved for RID to
+	 * PASID.
+	 */
+	pasid = ioasid_alloc(NULL, PASID_MIN, pasid_max - 1, svm);
+	if (pasid == INVALID_IOASID)
+		return -ENOSPC;
+
+	*new_pasid = true;
+
+	return pasid;
+}
+
 int intel_svm_bind_mm(struct device *dev, int *pasid, int flags, struct svm_dev_ops *ops)
 {
 	struct intel_iommu *iommu = intel_svm_device_to_iommu(dev);
@@ -324,6 +363,8 @@ int intel_svm_bind_mm(struct device *dev, int *pasid, int flags, struct svm_dev_
 	init_rcu_head(&sdev->rcu);
 
 	if (!svm) {
+		bool new_pasid;
+
 		svm = kzalloc(sizeof(*svm), GFP_KERNEL);
 		if (!svm) {
 			ret = -ENOMEM;
@@ -335,15 +376,13 @@ int intel_svm_bind_mm(struct device *dev, int *pasid, int flags, struct svm_dev_
 		if (pasid_max > intel_pasid_max_id)
 			pasid_max = intel_pasid_max_id;
 
-		/* Do not use PASID 0, reserved for RID to PASID */
-		svm->pasid = ioasid_alloc(NULL, PASID_MIN,
-					  pasid_max - 1, svm);
-		if (svm->pasid == INVALID_IOASID) {
+		svm->pasid = alloc_pasid(svm, mm, pasid_max, &new_pasid, flags);
+		if (svm->pasid < 0) {
 			kfree(svm);
 			kfree(sdev);
-			ret = -ENOSPC;
 			goto out;
 		}
+
 		svm->notifier.ops = &intel_mmuops;
 		svm->mm = mm;
 		svm->flags = flags;
@@ -353,7 +392,8 @@ int intel_svm_bind_mm(struct device *dev, int *pasid, int flags, struct svm_dev_
 		if (mm) {
 			ret = mmu_notifier_register(&svm->notifier, mm);
 			if (ret) {
-				ioasid_free(svm->pasid);
+				if (new_pasid)
+					ioasid_free(svm->pasid);
 				kfree(svm);
 				kfree(sdev);
 				goto out;
@@ -371,12 +411,21 @@ int intel_svm_bind_mm(struct device *dev, int *pasid, int flags, struct svm_dev_
 		if (ret) {
 			if (mm)
 				mmu_notifier_unregister(&svm->notifier, mm);
-			ioasid_free(svm->pasid);
+			if (new_pasid)
+				ioasid_free(svm->pasid);
 			kfree(svm);
 			kfree(sdev);
 			goto out;
 		}
 
+		if (mm && new_pasid && !(flags & SVM_FLAG_PRIVATE_PASID)) {
+			/*
+			 * Track the new pasid in the mm. The pasid will be
+			 * freed at process exit. Don't track requested
+			 * private PASID in the mm.
+			 */
+			mm->context.pasid = svm->pasid;
+		}
 		list_add_tail(&svm->list, &global_svm_list);
 	} else {
 		/*
@@ -447,7 +496,8 @@ int intel_svm_unbind_mm(struct device *dev, int pasid)
 			kfree_rcu(sdev, rcu);
 
 			if (list_empty(&svm->devs)) {
-				ioasid_free(svm->pasid);
+				/* Clear data in the pasid. */
+				ioasid_set_data(pasid, NULL);
 				if (svm->mm)
 					mmu_notifier_unregister(&svm->notifier, svm->mm);
 				list_del(&svm->list);
@@ -693,3 +743,19 @@ static irqreturn_t prq_event_thread(int irq, void *d)
 
 	return IRQ_RETVAL(handled);
 }
+
+/* On process exit free the PASID (if one was allocated). */
+void __free_pasid(struct mm_struct *mm)
+{
+	int pasid = mm->context.pasid;
+
+	if (!pasid)
+		return;
+
+	/*
+	 * Since the pasid is not bound to any svm by now, there is no race
+	 * here with binding/unbinding and no need to protect the free
+	 * operation by pasid_mutex.
+	 */
+	ioasid_free(pasid);
+}
-- 
2.19.1

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH 6/7] x86/traps: Fix up invalid PASID
  2020-03-30 19:33 ` Fenghua Yu
@ 2020-03-30 19:33   ` Fenghua Yu
  -1 siblings, 0 replies; 74+ messages in thread
From: Fenghua Yu @ 2020-03-30 19:33 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
	David Woodhouse, Lu Baolu, Dave Hansen, Tony Luck, Ashok Raj,
	Jacob Jun Pan, Dave Jiang, Sohil Mehta, Ravi V Shankar
  Cc: linux-kernel, x86, iommu, Fenghua Yu

A #GP fault is generated when ENQCMD instruction is executed without
a valid PASID value programmed in. The #GP fault handler will initialize
the current thread's PASID MSR.

The following heuristic is used to avoid decoding the user instructions
to determine the precise reason for the #GP fault:
1) If the mm for the process has not been allocated a PASID, this #GP
   cannot be fixed.
2) If the PASID MSR is already initialized, then the #GP was for some
   other reason
3) Try initializing the PASID MSR and returning. If the #GP was from
   an ENQCMD this will fix it. If not, the #GP fault will be repeated
   and we will hit case "2".

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
 arch/x86/include/asm/iommu.h |  1 +
 arch/x86/kernel/traps.c      | 17 +++++++++++++++++
 drivers/iommu/intel-svm.c    | 37 ++++++++++++++++++++++++++++++++++++
 3 files changed, 55 insertions(+)

diff --git a/arch/x86/include/asm/iommu.h b/arch/x86/include/asm/iommu.h
index ed41259fe7ac..e9365a5d6f7d 100644
--- a/arch/x86/include/asm/iommu.h
+++ b/arch/x86/include/asm/iommu.h
@@ -27,5 +27,6 @@ arch_rmrr_sanity_check(struct acpi_dmar_reserved_memory *rmrr)
 }
 
 void __free_pasid(struct mm_struct *mm);
+bool __fixup_pasid_exception(void);
 
 #endif /* _ASM_X86_IOMMU_H */
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 6ef00eb6fbb9..369b5ba94635 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -56,6 +56,7 @@
 #include <asm/umip.h>
 #include <asm/insn.h>
 #include <asm/insn-eval.h>
+#include <asm/iommu.h>
 
 #ifdef CONFIG_X86_64
 #include <asm/x86_init.h>
@@ -488,6 +489,16 @@ static enum kernel_gp_hint get_kernel_gp_address(struct pt_regs *regs,
 	return GP_CANONICAL;
 }
 
+static bool fixup_pasid_exception(void)
+{
+	if (!IS_ENABLED(CONFIG_INTEL_IOMMU_SVM))
+		return false;
+	if (!static_cpu_has(X86_FEATURE_ENQCMD))
+		return false;
+
+	return __fixup_pasid_exception();
+}
+
 #define GPFSTR "general protection fault"
 
 dotraplinkage void do_general_protection(struct pt_regs *regs, long error_code)
@@ -499,6 +510,12 @@ dotraplinkage void do_general_protection(struct pt_regs *regs, long error_code)
 	int ret;
 
 	RCU_LOCKDEP_WARN(!rcu_is_watching(), "entry code didn't wake RCU");
+
+	if (user_mode(regs) && fixup_pasid_exception()) {
+		cond_local_irq_enable(regs);
+		return;
+	}
+
 	cond_local_irq_enable(regs);
 
 	if (static_cpu_has(X86_FEATURE_UMIP)) {
diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
index da718a49e91e..5ed39a022adb 100644
--- a/drivers/iommu/intel-svm.c
+++ b/drivers/iommu/intel-svm.c
@@ -759,3 +759,40 @@ void __free_pasid(struct mm_struct *mm)
 	 */
 	ioasid_free(pasid);
 }
+
+/*
+ * Fix up the PASID MSR if possible.
+ *
+ * But if the #GP was due to another reason, a second #GP might be triggered
+ * to force proper behavior.
+ */
+bool __fixup_pasid_exception(void)
+{
+	struct mm_struct *mm;
+	bool ret = true;
+	u64 pasid_msr;
+	int pasid;
+
+	mm = get_task_mm(current);
+	/* This #GP was triggered from user mode. So mm cannot be NULL. */
+	pasid = mm->context.pasid;
+	/* Ensure this process has been bound to a PASID. */
+	if (!pasid) {
+		ret = false;
+		goto out;
+	}
+
+	/* Check to see if the PASID MSR has already been set for this task. */
+	rdmsrl(MSR_IA32_PASID, pasid_msr);
+	if (pasid_msr & MSR_IA32_PASID_VALID) {
+		ret = false;
+		goto out;
+	}
+
+	/* Fix up the MSR. */
+	wrmsrl(MSR_IA32_PASID, pasid | MSR_IA32_PASID_VALID);
+out:
+	mmput(mm);
+
+	return ret;
+}
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH 6/7] x86/traps: Fix up invalid PASID
@ 2020-03-30 19:33   ` Fenghua Yu
  0 siblings, 0 replies; 74+ messages in thread
From: Fenghua Yu @ 2020-03-30 19:33 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
	David Woodhouse, Lu Baolu, Dave Hansen, Tony Luck, Ashok Raj,
	Jacob Jun Pan, Dave Jiang, Sohil Mehta, Ravi V Shankar
  Cc: Fenghua Yu, iommu, x86, linux-kernel

A #GP fault is generated when ENQCMD instruction is executed without
a valid PASID value programmed in. The #GP fault handler will initialize
the current thread's PASID MSR.

The following heuristic is used to avoid decoding the user instructions
to determine the precise reason for the #GP fault:
1) If the mm for the process has not been allocated a PASID, this #GP
   cannot be fixed.
2) If the PASID MSR is already initialized, then the #GP was for some
   other reason
3) Try initializing the PASID MSR and returning. If the #GP was from
   an ENQCMD this will fix it. If not, the #GP fault will be repeated
   and we will hit case "2".

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
 arch/x86/include/asm/iommu.h |  1 +
 arch/x86/kernel/traps.c      | 17 +++++++++++++++++
 drivers/iommu/intel-svm.c    | 37 ++++++++++++++++++++++++++++++++++++
 3 files changed, 55 insertions(+)

diff --git a/arch/x86/include/asm/iommu.h b/arch/x86/include/asm/iommu.h
index ed41259fe7ac..e9365a5d6f7d 100644
--- a/arch/x86/include/asm/iommu.h
+++ b/arch/x86/include/asm/iommu.h
@@ -27,5 +27,6 @@ arch_rmrr_sanity_check(struct acpi_dmar_reserved_memory *rmrr)
 }
 
 void __free_pasid(struct mm_struct *mm);
+bool __fixup_pasid_exception(void);
 
 #endif /* _ASM_X86_IOMMU_H */
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 6ef00eb6fbb9..369b5ba94635 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -56,6 +56,7 @@
 #include <asm/umip.h>
 #include <asm/insn.h>
 #include <asm/insn-eval.h>
+#include <asm/iommu.h>
 
 #ifdef CONFIG_X86_64
 #include <asm/x86_init.h>
@@ -488,6 +489,16 @@ static enum kernel_gp_hint get_kernel_gp_address(struct pt_regs *regs,
 	return GP_CANONICAL;
 }
 
+static bool fixup_pasid_exception(void)
+{
+	if (!IS_ENABLED(CONFIG_INTEL_IOMMU_SVM))
+		return false;
+	if (!static_cpu_has(X86_FEATURE_ENQCMD))
+		return false;
+
+	return __fixup_pasid_exception();
+}
+
 #define GPFSTR "general protection fault"
 
 dotraplinkage void do_general_protection(struct pt_regs *regs, long error_code)
@@ -499,6 +510,12 @@ dotraplinkage void do_general_protection(struct pt_regs *regs, long error_code)
 	int ret;
 
 	RCU_LOCKDEP_WARN(!rcu_is_watching(), "entry code didn't wake RCU");
+
+	if (user_mode(regs) && fixup_pasid_exception()) {
+		cond_local_irq_enable(regs);
+		return;
+	}
+
 	cond_local_irq_enable(regs);
 
 	if (static_cpu_has(X86_FEATURE_UMIP)) {
diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
index da718a49e91e..5ed39a022adb 100644
--- a/drivers/iommu/intel-svm.c
+++ b/drivers/iommu/intel-svm.c
@@ -759,3 +759,40 @@ void __free_pasid(struct mm_struct *mm)
 	 */
 	ioasid_free(pasid);
 }
+
+/*
+ * Fix up the PASID MSR if possible.
+ *
+ * But if the #GP was due to another reason, a second #GP might be triggered
+ * to force proper behavior.
+ */
+bool __fixup_pasid_exception(void)
+{
+	struct mm_struct *mm;
+	bool ret = true;
+	u64 pasid_msr;
+	int pasid;
+
+	mm = get_task_mm(current);
+	/* This #GP was triggered from user mode. So mm cannot be NULL. */
+	pasid = mm->context.pasid;
+	/* Ensure this process has been bound to a PASID. */
+	if (!pasid) {
+		ret = false;
+		goto out;
+	}
+
+	/* Check to see if the PASID MSR has already been set for this task. */
+	rdmsrl(MSR_IA32_PASID, pasid_msr);
+	if (pasid_msr & MSR_IA32_PASID_VALID) {
+		ret = false;
+		goto out;
+	}
+
+	/* Fix up the MSR. */
+	wrmsrl(MSR_IA32_PASID, pasid | MSR_IA32_PASID_VALID);
+out:
+	mmput(mm);
+
+	return ret;
+}
-- 
2.19.1

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH 7/7] x86/process: Clear PASID state for a newly forked/cloned thread
  2020-03-30 19:33 ` Fenghua Yu
@ 2020-03-30 19:33   ` Fenghua Yu
  -1 siblings, 0 replies; 74+ messages in thread
From: Fenghua Yu @ 2020-03-30 19:33 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
	David Woodhouse, Lu Baolu, Dave Hansen, Tony Luck, Ashok Raj,
	Jacob Jun Pan, Dave Jiang, Sohil Mehta, Ravi V Shankar
  Cc: linux-kernel, x86, iommu, Fenghua Yu

The PASID state has to be cleared on forks, since the child has a
different address space. The PASID is also cleared for thread clone. While
it would be correct to inherit the PASID in this case, it is unknown
whether the new task will use ENQCMD. Giving it the PASID "just in case"
would have the downside of increased context switch overhead to setting
the PASID MSR.

Since #GP faults have to be handled on any threads that were created before
the PASID was assigned to the mm of the process, newly created threads
might as well be treated in a consistent way.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
 arch/x86/kernel/process.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 87de18c64cf5..cefdc8f7fc13 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -122,6 +122,16 @@ static int set_new_tls(struct task_struct *p, unsigned long tls)
 		return do_set_thread_area_64(p, ARCH_SET_FS, tls);
 }
 
+/* Clear PASID MSR/state for the forked/cloned thread. */
+static void clear_task_pasid(struct task_struct *task)
+{
+	/*
+	 * Clear the xfeatures bit in the PASID state so that the MSR will be
+	 * initialized to its init state (0) by XRSTORS.
+	 */
+	task->thread.fpu.state.xsave.header.xfeatures &= ~XFEATURE_MASK_PASID;
+}
+
 int copy_thread_tls(unsigned long clone_flags, unsigned long sp,
 		    unsigned long arg, struct task_struct *p, unsigned long tls)
 {
@@ -175,6 +185,9 @@ int copy_thread_tls(unsigned long clone_flags, unsigned long sp,
 	task_user_gs(p) = get_user_gs(current_pt_regs());
 #endif
 
+	if (static_cpu_has(X86_FEATURE_ENQCMD))
+		clear_task_pasid(p);
+
 	/* Set a new TLS for the child thread? */
 	if (clone_flags & CLONE_SETTLS)
 		ret = set_new_tls(p, tls);
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH 7/7] x86/process: Clear PASID state for a newly forked/cloned thread
@ 2020-03-30 19:33   ` Fenghua Yu
  0 siblings, 0 replies; 74+ messages in thread
From: Fenghua Yu @ 2020-03-30 19:33 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
	David Woodhouse, Lu Baolu, Dave Hansen, Tony Luck, Ashok Raj,
	Jacob Jun Pan, Dave Jiang, Sohil Mehta, Ravi V Shankar
  Cc: Fenghua Yu, iommu, x86, linux-kernel

The PASID state has to be cleared on forks, since the child has a
different address space. The PASID is also cleared for thread clone. While
it would be correct to inherit the PASID in this case, it is unknown
whether the new task will use ENQCMD. Giving it the PASID "just in case"
would have the downside of increased context switch overhead to setting
the PASID MSR.

Since #GP faults have to be handled on any threads that were created before
the PASID was assigned to the mm of the process, newly created threads
might as well be treated in a consistent way.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
 arch/x86/kernel/process.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 87de18c64cf5..cefdc8f7fc13 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -122,6 +122,16 @@ static int set_new_tls(struct task_struct *p, unsigned long tls)
 		return do_set_thread_area_64(p, ARCH_SET_FS, tls);
 }
 
+/* Clear PASID MSR/state for the forked/cloned thread. */
+static void clear_task_pasid(struct task_struct *task)
+{
+	/*
+	 * Clear the xfeatures bit in the PASID state so that the MSR will be
+	 * initialized to its init state (0) by XRSTORS.
+	 */
+	task->thread.fpu.state.xsave.header.xfeatures &= ~XFEATURE_MASK_PASID;
+}
+
 int copy_thread_tls(unsigned long clone_flags, unsigned long sp,
 		    unsigned long arg, struct task_struct *p, unsigned long tls)
 {
@@ -175,6 +185,9 @@ int copy_thread_tls(unsigned long clone_flags, unsigned long sp,
 	task_user_gs(p) = get_user_gs(current_pt_regs());
 #endif
 
+	if (static_cpu_has(X86_FEATURE_ENQCMD))
+		clear_task_pasid(p);
+
 	/* Set a new TLS for the child thread? */
 	if (clone_flags & CLONE_SETTLS)
 		ret = set_new_tls(p, tls);
-- 
2.19.1

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* Re: [PATCH 0/7] x86: tag application address space for devices
  2020-03-30 19:33 ` Fenghua Yu
@ 2020-04-22 20:41   ` Fenghua Yu
  -1 siblings, 0 replies; 74+ messages in thread
From: Fenghua Yu @ 2020-04-22 20:41 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
	David Woodhouse, Lu Baolu, Dave Hansen, Tony Luck, Ashok Raj,
	Jacob Jun Pan, Dave Jiang, Sohil Mehta, Ravi V Shankar
  Cc: linux-kernel, x86, iommu

On Mon, Mar 30, 2020 at 12:33:01PM -0700, Fenghua Yu wrote:
> Typical hardware devices require a driver stack to translate application
> buffers to hardware addresses, and a kernel-user transition to notify the
> hardware of new work. What if both the translation and transition overhead
> could be eliminated? This is what Shared Virtual Address (SVA) and ENQCMD
> enabled hardware like Data Streaming Accelerator (DSA) aims to achieve.
> Applications map portals in their local-address-space and directly submit
> work to them using a new instruction.
> 
Hi, maintainers,

Any comment on this series?

Thanks.

-Fenghua

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 0/7] x86: tag application address space for devices
@ 2020-04-22 20:41   ` Fenghua Yu
  0 siblings, 0 replies; 74+ messages in thread
From: Fenghua Yu @ 2020-04-22 20:41 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, H Peter Anvin,
	David Woodhouse, Lu Baolu, Dave Hansen, Tony Luck, Ashok Raj,
	Jacob Jun Pan, Dave Jiang, Sohil Mehta, Ravi V Shankar
  Cc: iommu, x86, linux-kernel

On Mon, Mar 30, 2020 at 12:33:01PM -0700, Fenghua Yu wrote:
> Typical hardware devices require a driver stack to translate application
> buffers to hardware addresses, and a kernel-user transition to notify the
> hardware of new work. What if both the translation and transition overhead
> could be eliminated? This is what Shared Virtual Address (SVA) and ENQCMD
> enabled hardware like Data Streaming Accelerator (DSA) aims to achieve.
> Applications map portals in their local-address-space and directly submit
> work to them using a new instruction.
> 
Hi, maintainers,

Any comment on this series?

Thanks.

-Fenghua
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 1/7] docs: x86: Add a documentation for ENQCMD
  2020-03-30 19:33   ` Fenghua Yu
@ 2020-04-26 11:02     ` Thomas Gleixner
  -1 siblings, 0 replies; 74+ messages in thread
From: Thomas Gleixner @ 2020-04-26 11:02 UTC (permalink / raw)
  To: Fenghua Yu, Ingo Molnar, Borislav Petkov, H Peter Anvin,
	David Woodhouse, Lu Baolu, Dave Hansen, Tony Luck, Ashok Raj,
	Jacob Jun Pan, Dave Jiang, Sohil Mehta, Ravi V Shankar
  Cc: linux-kernel, x86, iommu, Fenghua Yu

Fenghua Yu <fenghua.yu@intel.com> writes:

s/Add a documentation/Add documentation/

> From: Ashok Raj <ashok.raj@intel.com>
>
> ENQCMD and Data Streaming Accelerator (DSA) and all of their associated
> features are a complicated stack with lots of interconnected pieces.
> This documentation provides a big picture overview for all of the
> features.
>
> Signed-off-by: Ashok Raj <ashok.raj@intel.com>
> Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> ---
>  Documentation/x86/enqcmd.rst | 185 +++++++++++++++++++++++++++++++++++

How is that hooked up into the Documentation index?

 Documentation/x86/enqcmd.rst: WARNING: document isn't included in any toctree

> +++ b/Documentation/x86/enqcmd.rst
> @@ -0,0 +1,185 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +Improved Device Interaction Overview

So the document is about ENQCMD, right? Can you please make that in some
way consistently named?

> +
> +== Background ==

This lacks any docbook formatting.... The resulting HTML looks like ...

> +
> +Shared Virtual Addressing (SVA) allows the processor and device to use the
> +same virtual addresses avoiding the need for software to translate virtual
> +addresses to physical addresses. ENQCMD is a new instruction on Intel
> +platforms that allows user applications to directly notify hardware of new
> +work, much like doorbells are used in some hardware, but carries a payload
> +that carries the PASID and some additional device specific commands
> +along with it.

Sorry that's not background information, that's an agglomeration of
words.

Can you please explain properly what's the background of SVA, how it
differs from regular device addressing and what kind of requirements it
has?

ENQCMD is not related to background. It's part of the new technology.

> +== Address Space Tagging ==
> +
> +A new MSR (MSR_IA32_PASID) allows an application address space to be
> +associated with what the PCIe spec calls a Process Address Space ID
> +(PASID). This PASID tag is carried along with all requests between
> +applications and devices and allows devices to interact with the process
> +address space.

Sigh. The important part here is not the MSR. The important part is to
explain what PASID is and where it comes from. Documentation has similar
rules as changelogs:

      1) Provide context

      2) Explain requirements
      
      3) Explain implementation

The pile you provided is completely backwards and unstructured.

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 1/7] docs: x86: Add a documentation for ENQCMD
@ 2020-04-26 11:02     ` Thomas Gleixner
  0 siblings, 0 replies; 74+ messages in thread
From: Thomas Gleixner @ 2020-04-26 11:02 UTC (permalink / raw)
  To: Fenghua Yu, Ingo Molnar, Borislav Petkov, H Peter Anvin,
	David Woodhouse, Lu Baolu, Dave Hansen, Tony Luck, Ashok Raj,
	Jacob Jun Pan, Dave Jiang, Sohil Mehta, Ravi V Shankar
  Cc: Fenghua Yu, iommu, x86, linux-kernel

Fenghua Yu <fenghua.yu@intel.com> writes:

s/Add a documentation/Add documentation/

> From: Ashok Raj <ashok.raj@intel.com>
>
> ENQCMD and Data Streaming Accelerator (DSA) and all of their associated
> features are a complicated stack with lots of interconnected pieces.
> This documentation provides a big picture overview for all of the
> features.
>
> Signed-off-by: Ashok Raj <ashok.raj@intel.com>
> Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> ---
>  Documentation/x86/enqcmd.rst | 185 +++++++++++++++++++++++++++++++++++

How is that hooked up into the Documentation index?

 Documentation/x86/enqcmd.rst: WARNING: document isn't included in any toctree

> +++ b/Documentation/x86/enqcmd.rst
> @@ -0,0 +1,185 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +Improved Device Interaction Overview

So the document is about ENQCMD, right? Can you please make that in some
way consistently named?

> +
> +== Background ==

This lacks any docbook formatting.... The resulting HTML looks like ...

> +
> +Shared Virtual Addressing (SVA) allows the processor and device to use the
> +same virtual addresses avoiding the need for software to translate virtual
> +addresses to physical addresses. ENQCMD is a new instruction on Intel
> +platforms that allows user applications to directly notify hardware of new
> +work, much like doorbells are used in some hardware, but carries a payload
> +that carries the PASID and some additional device specific commands
> +along with it.

Sorry that's not background information, that's an agglomeration of
words.

Can you please explain properly what's the background of SVA, how it
differs from regular device addressing and what kind of requirements it
has?

ENQCMD is not related to background. It's part of the new technology.

> +== Address Space Tagging ==
> +
> +A new MSR (MSR_IA32_PASID) allows an application address space to be
> +associated with what the PCIe spec calls a Process Address Space ID
> +(PASID). This PASID tag is carried along with all requests between
> +applications and devices and allows devices to interact with the process
> +address space.

Sigh. The important part here is not the MSR. The important part is to
explain what PASID is and where it comes from. Documentation has similar
rules as changelogs:

      1) Provide context

      2) Explain requirements
      
      3) Explain implementation

The pile you provided is completely backwards and unstructured.

Thanks,

        tglx
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 2/7] x86/cpufeatures: Enumerate ENQCMD and ENQCMDS instructions
  2020-03-30 19:33   ` Fenghua Yu
@ 2020-04-26 11:06     ` Thomas Gleixner
  -1 siblings, 0 replies; 74+ messages in thread
From: Thomas Gleixner @ 2020-04-26 11:06 UTC (permalink / raw)
  To: Fenghua Yu, Ingo Molnar, Borislav Petkov, H Peter Anvin,
	David Woodhouse, Lu Baolu, Dave Hansen, Tony Luck, Ashok Raj,
	Jacob Jun Pan, Dave Jiang, Sohil Mehta, Ravi V Shankar
  Cc: linux-kernel, x86, iommu, Fenghua Yu

Fenghua Yu <fenghua.yu@intel.com> writes:
> A user space application can execute ENQCMD instruction to submit work
> to device. The kernel executes ENQCMDS instruction to submit work to
> device.

So a user space application _can_ execute ENQCMD and the kernel
executes ENQCMDS. And both submit work to device.

> There is a lot of other enabling needed for the instructions to actually
> be usable in user space and the kernel, and that enabling is coming later
> in the series and in device drivers.

That's important information to the enumeration of the instructions in
which way?

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 2/7] x86/cpufeatures: Enumerate ENQCMD and ENQCMDS instructions
@ 2020-04-26 11:06     ` Thomas Gleixner
  0 siblings, 0 replies; 74+ messages in thread
From: Thomas Gleixner @ 2020-04-26 11:06 UTC (permalink / raw)
  To: Fenghua Yu, Ingo Molnar, Borislav Petkov, H Peter Anvin,
	David Woodhouse, Lu Baolu, Dave Hansen, Tony Luck, Ashok Raj,
	Jacob Jun Pan, Dave Jiang, Sohil Mehta, Ravi V Shankar
  Cc: Fenghua Yu, iommu, x86, linux-kernel

Fenghua Yu <fenghua.yu@intel.com> writes:
> A user space application can execute ENQCMD instruction to submit work
> to device. The kernel executes ENQCMDS instruction to submit work to
> device.

So a user space application _can_ execute ENQCMD and the kernel
executes ENQCMDS. And both submit work to device.

> There is a lot of other enabling needed for the instructions to actually
> be usable in user space and the kernel, and that enabling is coming later
> in the series and in device drivers.

That's important information to the enumeration of the instructions in
which way?

Thanks,

        tglx
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 3/7] x86/fpu/xstate: Add supervisor PASID state for ENQCMD feature
  2020-03-30 19:33   ` Fenghua Yu
@ 2020-04-26 11:17     ` Thomas Gleixner
  -1 siblings, 0 replies; 74+ messages in thread
From: Thomas Gleixner @ 2020-04-26 11:17 UTC (permalink / raw)
  To: Fenghua Yu, Ingo Molnar, Borislav Petkov, H Peter Anvin,
	David Woodhouse, Lu Baolu, Dave Hansen, Tony Luck, Ashok Raj,
	Jacob Jun Pan, Dave Jiang, Sohil Mehta, Ravi V Shankar
  Cc: linux-kernel, x86, iommu, Yu-cheng Yu, Fenghua Yu

Fenghua Yu <fenghua.yu@intel.com> writes:
> From: Yu-cheng Yu <yu-cheng.yu@intel.com>
>
> The IA32_PASID MSR is used when a task submits work via the ENQCMD
> instruction.

Is used?

> The per task MSR is stored in the task's supervisor FPU

per task MSR? Lot's of MSRs ....

> PASID state and is context switched by XSAVES/XRSTORS.
>

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 3/7] x86/fpu/xstate: Add supervisor PASID state for ENQCMD feature
@ 2020-04-26 11:17     ` Thomas Gleixner
  0 siblings, 0 replies; 74+ messages in thread
From: Thomas Gleixner @ 2020-04-26 11:17 UTC (permalink / raw)
  To: Fenghua Yu, Ingo Molnar, Borislav Petkov, H Peter Anvin,
	David Woodhouse, Lu Baolu, Dave Hansen, Tony Luck, Ashok Raj,
	Jacob Jun Pan, Dave Jiang, Sohil Mehta, Ravi V Shankar
  Cc: Fenghua Yu, iommu, x86, Yu-cheng Yu, linux-kernel

Fenghua Yu <fenghua.yu@intel.com> writes:
> From: Yu-cheng Yu <yu-cheng.yu@intel.com>
>
> The IA32_PASID MSR is used when a task submits work via the ENQCMD
> instruction.

Is used?

> The per task MSR is stored in the task's supervisor FPU

per task MSR? Lot's of MSRs ....

> PASID state and is context switched by XSAVES/XRSTORS.
>
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 4/7] x86/msr-index: Define IA32_PASID MSR
  2020-03-30 19:33   ` Fenghua Yu
@ 2020-04-26 11:22     ` Thomas Gleixner
  -1 siblings, 0 replies; 74+ messages in thread
From: Thomas Gleixner @ 2020-04-26 11:22 UTC (permalink / raw)
  To: Fenghua Yu, Ingo Molnar, Borislav Petkov, H Peter Anvin,
	David Woodhouse, Lu Baolu, Dave Hansen, Tony Luck, Ashok Raj,
	Jacob Jun Pan, Dave Jiang, Sohil Mehta, Ravi V Shankar
  Cc: linux-kernel, x86, iommu, Fenghua Yu

Fenghua Yu <fenghua.yu@intel.com> writes:

> The IA32_PASID MSR (0xd93) contains the Process Address Space Identifier
> (PASID), a 20-bit value. Bit 31 must be set to indicate the value
> programmed in the MSR is valid. Hardware uses PASID to identify which
> process submits the work and direct responses to the right process.

No. It does not identify the process. It identifies the process' address
space as the name says.

Please provide coherent and precise information.

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 4/7] x86/msr-index: Define IA32_PASID MSR
@ 2020-04-26 11:22     ` Thomas Gleixner
  0 siblings, 0 replies; 74+ messages in thread
From: Thomas Gleixner @ 2020-04-26 11:22 UTC (permalink / raw)
  To: Fenghua Yu, Ingo Molnar, Borislav Petkov, H Peter Anvin,
	David Woodhouse, Lu Baolu, Dave Hansen, Tony Luck, Ashok Raj,
	Jacob Jun Pan, Dave Jiang, Sohil Mehta, Ravi V Shankar
  Cc: Fenghua Yu, iommu, x86, linux-kernel

Fenghua Yu <fenghua.yu@intel.com> writes:

> The IA32_PASID MSR (0xd93) contains the Process Address Space Identifier
> (PASID), a 20-bit value. Bit 31 must be set to indicate the value
> programmed in the MSR is valid. Hardware uses PASID to identify which
> process submits the work and direct responses to the right process.

No. It does not identify the process. It identifies the process' address
space as the name says.

Please provide coherent and precise information.

Thanks,

        tglx
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 5/7] x86/mmu: Allocate/free PASID
  2020-03-30 19:33   ` Fenghua Yu
@ 2020-04-26 14:55     ` Thomas Gleixner
  -1 siblings, 0 replies; 74+ messages in thread
From: Thomas Gleixner @ 2020-04-26 14:55 UTC (permalink / raw)
  To: Fenghua Yu, Ingo Molnar, Borislav Petkov, H Peter Anvin,
	David Woodhouse, Lu Baolu, Dave Hansen, Tony Luck, Ashok Raj,
	Jacob Jun Pan, Dave Jiang, Sohil Mehta, Ravi V Shankar
  Cc: linux-kernel, x86, iommu, Fenghua Yu

Fenghua Yu <fenghua.yu@intel.com> writes:

> PASID is shared by all threads in a process. So the logical place to keep
> track of it is in the "mm". Add the field to the architecture specific
> mm_context_t structure.
>
> A PASID is allocated for an "mm" the first time any thread attaches
> to an SVM capable device. Later device atatches (whether to the same

atatches?

> device or another SVM device) will re-use the same PASID.
>
> The PASID is freed when the process exits (so no need to keep
> reference counts on how many SVM devices are sharing the PASID).

I'm not buying that. If there is an outstanding request with the PASID
of a process then tearing down the process address space and freeing the
PASID (which might be reused) is fundamentally broken.

> +void __free_pasid(struct mm_struct *mm);
> +
>  #endif /* _ASM_X86_IOMMU_H */
> diff --git a/arch/x86/include/asm/mmu.h b/arch/x86/include/asm/mmu.h
> index bdeae9291e5c..137bf51f19e6 100644
> --- a/arch/x86/include/asm/mmu.h
> +++ b/arch/x86/include/asm/mmu.h
> @@ -50,6 +50,10 @@ typedef struct {
>  	u16 pkey_allocation_map;
>  	s16 execute_only_pkey;
>  #endif
> +
> +#ifdef CONFIG_INTEL_IOMMU_SVM
> +	int pasid;

int? It's a value which gets programmed into the MSR along with the
valid bit (bit 31) set. 

>  extern void switch_mm(struct mm_struct *prev, struct mm_struct *next,
> diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
> index d7f2a5358900..da718a49e91e 100644
> --- a/drivers/iommu/intel-svm.c
> +++ b/drivers/iommu/intel-svm.c
> @@ -226,6 +226,45 @@ static LIST_HEAD(global_svm_list);
>  	list_for_each_entry((sdev), &(svm)->devs, list)	\
>  		if ((d) != (sdev)->dev) {} else
>  
> +/*
> + * If this mm already has a PASID we can use it. Otherwise allocate a new one.
> + * Let the caller know if we did an allocation via 'new_pasid'.
> + */
> +static int alloc_pasid(struct intel_svm *svm, struct mm_struct *mm,
> +		       int pasid_max,  bool *new_pasid, int flags)

Again, data types please. flags are generally unsigned and not plain
int. Also pasid_max is certainly not plain int either.

> +{
> +	int pasid;
> +
> +	/*
> +	 * Reuse the PASID if the mm already has a PASID and not a private
> +	 * PASID is requested.
> +	 */
> +	if (mm && mm->context.pasid && !(flags & SVM_FLAG_PRIVATE_PASID)) {
> +		/*
> +		 * Once a PASID is allocated for this mm, the PASID
> +		 * stays with the mm until the mm is dropped. Reuse
> +		 * the PASID which has been already allocated for the
> +		 * mm instead of allocating a new one.
> +		 */
> +		ioasid_set_data(mm->context.pasid, svm);

So if the PASID is reused several times for different SVMs then every
time ioasid_data->private is set to a different SVM. How is that
supposed to work?

> +		*new_pasid = false;
> +
> +		return mm->context.pasid;
> +	}
> +
> +	/*
> +	 * Allocate a new pasid. Do not use PASID 0, reserved for RID to
> +	 * PASID.
> +	 */
> +	pasid = ioasid_alloc(NULL, PASID_MIN, pasid_max - 1, svm);

ioasid_alloc() uses ioasid_t which is

typedef unsigned int ioasid_t;

Can we please have consistent types and behaviour all over the place?

> +	if (pasid == INVALID_IOASID)
> +		return -ENOSPC;
> +
> +	*new_pasid = true;
> +
> +	return pasid;
> +}
> +
>  int intel_svm_bind_mm(struct device *dev, int *pasid, int flags, struct svm_dev_ops *ops)
>  {
>  	struct intel_iommu *iommu = intel_svm_device_to_iommu(dev);
> @@ -324,6 +363,8 @@ int intel_svm_bind_mm(struct device *dev, int *pasid, int flags, struct svm_dev_
>  	init_rcu_head(&sdev->rcu);
>  
>  	if (!svm) {
> +		bool new_pasid;
> +
>  		svm = kzalloc(sizeof(*svm), GFP_KERNEL);
>  		if (!svm) {
>  			ret = -ENOMEM;
> @@ -335,15 +376,13 @@ int intel_svm_bind_mm(struct device *dev, int *pasid, int flags, struct svm_dev_
>  		if (pasid_max > intel_pasid_max_id)
>  			pasid_max = intel_pasid_max_id;
>  
> -		/* Do not use PASID 0, reserved for RID to PASID */
> -		svm->pasid = ioasid_alloc(NULL, PASID_MIN,
> -					  pasid_max - 1, svm);
> -		if (svm->pasid == INVALID_IOASID) {
> +		svm->pasid = alloc_pasid(svm, mm, pasid_max, &new_pasid, flags);
> +		if (svm->pasid < 0) {
>  			kfree(svm);
>  			kfree(sdev);
> -			ret = -ENOSPC;

ret gets magically initialized to an error return value, right?

>  			goto out;
>  		}
> +
>  		svm->notifier.ops = &intel_mmuops;
>  		svm->mm = mm;
>  		svm->flags = flags;
> @@ -353,7 +392,8 @@ int intel_svm_bind_mm(struct device *dev, int *pasid, int flags, struct svm_dev_
>  		if (mm) {
>  			ret = mmu_notifier_register(&svm->notifier, mm);
>  			if (ret) {
> -				ioasid_free(svm->pasid);
> +				if (new_pasid)
> +					ioasid_free(svm->pasid);
>  				kfree(svm);
>  				kfree(sdev);
>  				goto out;
> @@ -371,12 +411,21 @@ int intel_svm_bind_mm(struct device *dev, int *pasid, int flags, struct svm_dev_
>  		if (ret) {
>  			if (mm)
>  				mmu_notifier_unregister(&svm->notifier, mm);
> -			ioasid_free(svm->pasid);
> +			if (new_pasid)
> +				ioasid_free(svm->pasid);
>  			kfree(svm);
>  			kfree(sdev);

So there are 3 places now freeing svm ad sdev and 2 of them
conditionally free svm->pasid. Can you please rewrite that to have a
proper error exit path instead of glueing that stuff into the existing
mess?

>  			goto out;
>  		}
>  
> +		if (mm && new_pasid && !(flags & SVM_FLAG_PRIVATE_PASID)) {
> +			/*
> +			 * Track the new pasid in the mm. The pasid will be
> +			 * freed at process exit. Don't track requested
> +			 * private PASID in the mm.

What happens to private PASIDs?

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 5/7] x86/mmu: Allocate/free PASID
@ 2020-04-26 14:55     ` Thomas Gleixner
  0 siblings, 0 replies; 74+ messages in thread
From: Thomas Gleixner @ 2020-04-26 14:55 UTC (permalink / raw)
  To: Fenghua Yu, Ingo Molnar, Borislav Petkov, H Peter Anvin,
	David Woodhouse, Lu Baolu, Dave Hansen, Tony Luck, Ashok Raj,
	Jacob Jun Pan, Dave Jiang, Sohil Mehta, Ravi V Shankar
  Cc: Fenghua Yu, iommu, x86, linux-kernel

Fenghua Yu <fenghua.yu@intel.com> writes:

> PASID is shared by all threads in a process. So the logical place to keep
> track of it is in the "mm". Add the field to the architecture specific
> mm_context_t structure.
>
> A PASID is allocated for an "mm" the first time any thread attaches
> to an SVM capable device. Later device atatches (whether to the same

atatches?

> device or another SVM device) will re-use the same PASID.
>
> The PASID is freed when the process exits (so no need to keep
> reference counts on how many SVM devices are sharing the PASID).

I'm not buying that. If there is an outstanding request with the PASID
of a process then tearing down the process address space and freeing the
PASID (which might be reused) is fundamentally broken.

> +void __free_pasid(struct mm_struct *mm);
> +
>  #endif /* _ASM_X86_IOMMU_H */
> diff --git a/arch/x86/include/asm/mmu.h b/arch/x86/include/asm/mmu.h
> index bdeae9291e5c..137bf51f19e6 100644
> --- a/arch/x86/include/asm/mmu.h
> +++ b/arch/x86/include/asm/mmu.h
> @@ -50,6 +50,10 @@ typedef struct {
>  	u16 pkey_allocation_map;
>  	s16 execute_only_pkey;
>  #endif
> +
> +#ifdef CONFIG_INTEL_IOMMU_SVM
> +	int pasid;

int? It's a value which gets programmed into the MSR along with the
valid bit (bit 31) set. 

>  extern void switch_mm(struct mm_struct *prev, struct mm_struct *next,
> diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
> index d7f2a5358900..da718a49e91e 100644
> --- a/drivers/iommu/intel-svm.c
> +++ b/drivers/iommu/intel-svm.c
> @@ -226,6 +226,45 @@ static LIST_HEAD(global_svm_list);
>  	list_for_each_entry((sdev), &(svm)->devs, list)	\
>  		if ((d) != (sdev)->dev) {} else
>  
> +/*
> + * If this mm already has a PASID we can use it. Otherwise allocate a new one.
> + * Let the caller know if we did an allocation via 'new_pasid'.
> + */
> +static int alloc_pasid(struct intel_svm *svm, struct mm_struct *mm,
> +		       int pasid_max,  bool *new_pasid, int flags)

Again, data types please. flags are generally unsigned and not plain
int. Also pasid_max is certainly not plain int either.

> +{
> +	int pasid;
> +
> +	/*
> +	 * Reuse the PASID if the mm already has a PASID and not a private
> +	 * PASID is requested.
> +	 */
> +	if (mm && mm->context.pasid && !(flags & SVM_FLAG_PRIVATE_PASID)) {
> +		/*
> +		 * Once a PASID is allocated for this mm, the PASID
> +		 * stays with the mm until the mm is dropped. Reuse
> +		 * the PASID which has been already allocated for the
> +		 * mm instead of allocating a new one.
> +		 */
> +		ioasid_set_data(mm->context.pasid, svm);

So if the PASID is reused several times for different SVMs then every
time ioasid_data->private is set to a different SVM. How is that
supposed to work?

> +		*new_pasid = false;
> +
> +		return mm->context.pasid;
> +	}
> +
> +	/*
> +	 * Allocate a new pasid. Do not use PASID 0, reserved for RID to
> +	 * PASID.
> +	 */
> +	pasid = ioasid_alloc(NULL, PASID_MIN, pasid_max - 1, svm);

ioasid_alloc() uses ioasid_t which is

typedef unsigned int ioasid_t;

Can we please have consistent types and behaviour all over the place?

> +	if (pasid == INVALID_IOASID)
> +		return -ENOSPC;
> +
> +	*new_pasid = true;
> +
> +	return pasid;
> +}
> +
>  int intel_svm_bind_mm(struct device *dev, int *pasid, int flags, struct svm_dev_ops *ops)
>  {
>  	struct intel_iommu *iommu = intel_svm_device_to_iommu(dev);
> @@ -324,6 +363,8 @@ int intel_svm_bind_mm(struct device *dev, int *pasid, int flags, struct svm_dev_
>  	init_rcu_head(&sdev->rcu);
>  
>  	if (!svm) {
> +		bool new_pasid;
> +
>  		svm = kzalloc(sizeof(*svm), GFP_KERNEL);
>  		if (!svm) {
>  			ret = -ENOMEM;
> @@ -335,15 +376,13 @@ int intel_svm_bind_mm(struct device *dev, int *pasid, int flags, struct svm_dev_
>  		if (pasid_max > intel_pasid_max_id)
>  			pasid_max = intel_pasid_max_id;
>  
> -		/* Do not use PASID 0, reserved for RID to PASID */
> -		svm->pasid = ioasid_alloc(NULL, PASID_MIN,
> -					  pasid_max - 1, svm);
> -		if (svm->pasid == INVALID_IOASID) {
> +		svm->pasid = alloc_pasid(svm, mm, pasid_max, &new_pasid, flags);
> +		if (svm->pasid < 0) {
>  			kfree(svm);
>  			kfree(sdev);
> -			ret = -ENOSPC;

ret gets magically initialized to an error return value, right?

>  			goto out;
>  		}
> +
>  		svm->notifier.ops = &intel_mmuops;
>  		svm->mm = mm;
>  		svm->flags = flags;
> @@ -353,7 +392,8 @@ int intel_svm_bind_mm(struct device *dev, int *pasid, int flags, struct svm_dev_
>  		if (mm) {
>  			ret = mmu_notifier_register(&svm->notifier, mm);
>  			if (ret) {
> -				ioasid_free(svm->pasid);
> +				if (new_pasid)
> +					ioasid_free(svm->pasid);
>  				kfree(svm);
>  				kfree(sdev);
>  				goto out;
> @@ -371,12 +411,21 @@ int intel_svm_bind_mm(struct device *dev, int *pasid, int flags, struct svm_dev_
>  		if (ret) {
>  			if (mm)
>  				mmu_notifier_unregister(&svm->notifier, mm);
> -			ioasid_free(svm->pasid);
> +			if (new_pasid)
> +				ioasid_free(svm->pasid);
>  			kfree(svm);
>  			kfree(sdev);

So there are 3 places now freeing svm ad sdev and 2 of them
conditionally free svm->pasid. Can you please rewrite that to have a
proper error exit path instead of glueing that stuff into the existing
mess?

>  			goto out;
>  		}
>  
> +		if (mm && new_pasid && !(flags & SVM_FLAG_PRIVATE_PASID)) {
> +			/*
> +			 * Track the new pasid in the mm. The pasid will be
> +			 * freed at process exit. Don't track requested
> +			 * private PASID in the mm.

What happens to private PASIDs?

Thanks,

        tglx
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 6/7] x86/traps: Fix up invalid PASID
  2020-03-30 19:33   ` Fenghua Yu
@ 2020-04-26 15:25     ` Thomas Gleixner
  -1 siblings, 0 replies; 74+ messages in thread
From: Thomas Gleixner @ 2020-04-26 15:25 UTC (permalink / raw)
  To: Fenghua Yu, Ingo Molnar, Borislav Petkov, H Peter Anvin,
	David Woodhouse, Lu Baolu, Dave Hansen, Tony Luck, Ashok Raj,
	Jacob Jun Pan, Dave Jiang, Sohil Mehta, Ravi V Shankar
  Cc: linux-kernel, x86, iommu, Fenghua Yu

Fenghua Yu <fenghua.yu@intel.com> writes:
> A #GP fault is generated when ENQCMD instruction is executed without
> a valid PASID value programmed in.

Programmed in what?

> The #GP fault handler will initialize the current thread's PASID MSR.
>
> The following heuristic is used to avoid decoding the user instructions
> to determine the precise reason for the #GP fault:
> 1) If the mm for the process has not been allocated a PASID, this #GP
>    cannot be fixed.
> 2) If the PASID MSR is already initialized, then the #GP was for some
>    other reason
> 3) Try initializing the PASID MSR and returning. If the #GP was from
>    an ENQCMD this will fix it. If not, the #GP fault will be repeated
>    and we will hit case "2".
>
> Suggested-by: Thomas Gleixner <tglx@linutronix.de>

Just for the record I also suggested to have a proper errorcode in the
#GP for ENQCMD and I surely did not suggest to avoid decoding the user
instructions.

>  void __free_pasid(struct mm_struct *mm);
> +bool __fixup_pasid_exception(void);
>  
>  #endif /* _ASM_X86_IOMMU_H */
> diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
> index 6ef00eb6fbb9..369b5ba94635 100644
> --- a/arch/x86/kernel/traps.c
> +++ b/arch/x86/kernel/traps.c
> @@ -56,6 +56,7 @@
>  #include <asm/umip.h>
>  #include <asm/insn.h>
>  #include <asm/insn-eval.h>
> +#include <asm/iommu.h>
>  
>  #ifdef CONFIG_X86_64
>  #include <asm/x86_init.h>
> @@ -488,6 +489,16 @@ static enum kernel_gp_hint get_kernel_gp_address(struct pt_regs *regs,
>  	return GP_CANONICAL;
>  }
>  
> +static bool fixup_pasid_exception(void)
> +{
> +	if (!IS_ENABLED(CONFIG_INTEL_IOMMU_SVM))
> +		return false;
> +	if (!static_cpu_has(X86_FEATURE_ENQCMD))
> +		return false;
> +
> +	return __fixup_pasid_exception();
> +}
> +
>  #define GPFSTR "general protection fault"
>  
>  dotraplinkage void do_general_protection(struct pt_regs *regs, long error_code)
> @@ -499,6 +510,12 @@ dotraplinkage void do_general_protection(struct pt_regs *regs, long error_code)
>  	int ret;
>  
>  	RCU_LOCKDEP_WARN(!rcu_is_watching(), "entry code didn't wake RCU");
> +
> +	if (user_mode(regs) && fixup_pasid_exception()) {
> +		cond_local_irq_enable(regs);

The point of this conditional irq enable _AFTER_ calling into the fixup
function is? Also what's the reason for keeping interrupts disabled
while calling into that function? Comments exist for a reason.

> +		return;
> +	}
> +
>  	cond_local_irq_enable(regs);
>  
>  	if (static_cpu_has(X86_FEATURE_UMIP)) {
> diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
> index da718a49e91e..5ed39a022adb 100644
> --- a/drivers/iommu/intel-svm.c
> +++ b/drivers/iommu/intel-svm.c
> @@ -759,3 +759,40 @@ void __free_pasid(struct mm_struct *mm)
>  	 */
>  	ioasid_free(pasid);
>  }
> +
> +/*
> + * Fix up the PASID MSR if possible.
> + *
> + * But if the #GP was due to another reason, a second #GP might be triggered
> + * to force proper behavior.
> + */
> +bool __fixup_pasid_exception(void)
> +{
> +	struct mm_struct *mm;
> +	bool ret = true;
> +	u64 pasid_msr;
> +	int pasid;
> +
> +	mm = get_task_mm(current);

Why do you need a reference to current->mm ?

> +	/* This #GP was triggered from user mode. So mm cannot be NULL. */
> +	pasid = mm->context.pasid;
> +	/* Ensure this process has been bound to a PASID. */
> +	if (!pasid) {
> +		ret = false;
> +		goto out;
> +	}
> +
> +	/* Check to see if the PASID MSR has already been set for this task. */
> +	rdmsrl(MSR_IA32_PASID, pasid_msr);
> +	if (pasid_msr & MSR_IA32_PASID_VALID) {
> +		ret = false;
> +		goto out;
> +	}
> +
> +	/* Fix up the MSR. */
> +	wrmsrl(MSR_IA32_PASID, pasid | MSR_IA32_PASID_VALID);
> +out:
> +	mmput(mm);

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 6/7] x86/traps: Fix up invalid PASID
@ 2020-04-26 15:25     ` Thomas Gleixner
  0 siblings, 0 replies; 74+ messages in thread
From: Thomas Gleixner @ 2020-04-26 15:25 UTC (permalink / raw)
  To: Fenghua Yu, Ingo Molnar, Borislav Petkov, H Peter Anvin,
	David Woodhouse, Lu Baolu, Dave Hansen, Tony Luck, Ashok Raj,
	Jacob Jun Pan, Dave Jiang, Sohil Mehta, Ravi V Shankar
  Cc: Fenghua Yu, iommu, x86, linux-kernel

Fenghua Yu <fenghua.yu@intel.com> writes:
> A #GP fault is generated when ENQCMD instruction is executed without
> a valid PASID value programmed in.

Programmed in what?

> The #GP fault handler will initialize the current thread's PASID MSR.
>
> The following heuristic is used to avoid decoding the user instructions
> to determine the precise reason for the #GP fault:
> 1) If the mm for the process has not been allocated a PASID, this #GP
>    cannot be fixed.
> 2) If the PASID MSR is already initialized, then the #GP was for some
>    other reason
> 3) Try initializing the PASID MSR and returning. If the #GP was from
>    an ENQCMD this will fix it. If not, the #GP fault will be repeated
>    and we will hit case "2".
>
> Suggested-by: Thomas Gleixner <tglx@linutronix.de>

Just for the record I also suggested to have a proper errorcode in the
#GP for ENQCMD and I surely did not suggest to avoid decoding the user
instructions.

>  void __free_pasid(struct mm_struct *mm);
> +bool __fixup_pasid_exception(void);
>  
>  #endif /* _ASM_X86_IOMMU_H */
> diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
> index 6ef00eb6fbb9..369b5ba94635 100644
> --- a/arch/x86/kernel/traps.c
> +++ b/arch/x86/kernel/traps.c
> @@ -56,6 +56,7 @@
>  #include <asm/umip.h>
>  #include <asm/insn.h>
>  #include <asm/insn-eval.h>
> +#include <asm/iommu.h>
>  
>  #ifdef CONFIG_X86_64
>  #include <asm/x86_init.h>
> @@ -488,6 +489,16 @@ static enum kernel_gp_hint get_kernel_gp_address(struct pt_regs *regs,
>  	return GP_CANONICAL;
>  }
>  
> +static bool fixup_pasid_exception(void)
> +{
> +	if (!IS_ENABLED(CONFIG_INTEL_IOMMU_SVM))
> +		return false;
> +	if (!static_cpu_has(X86_FEATURE_ENQCMD))
> +		return false;
> +
> +	return __fixup_pasid_exception();
> +}
> +
>  #define GPFSTR "general protection fault"
>  
>  dotraplinkage void do_general_protection(struct pt_regs *regs, long error_code)
> @@ -499,6 +510,12 @@ dotraplinkage void do_general_protection(struct pt_regs *regs, long error_code)
>  	int ret;
>  
>  	RCU_LOCKDEP_WARN(!rcu_is_watching(), "entry code didn't wake RCU");
> +
> +	if (user_mode(regs) && fixup_pasid_exception()) {
> +		cond_local_irq_enable(regs);

The point of this conditional irq enable _AFTER_ calling into the fixup
function is? Also what's the reason for keeping interrupts disabled
while calling into that function? Comments exist for a reason.

> +		return;
> +	}
> +
>  	cond_local_irq_enable(regs);
>  
>  	if (static_cpu_has(X86_FEATURE_UMIP)) {
> diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
> index da718a49e91e..5ed39a022adb 100644
> --- a/drivers/iommu/intel-svm.c
> +++ b/drivers/iommu/intel-svm.c
> @@ -759,3 +759,40 @@ void __free_pasid(struct mm_struct *mm)
>  	 */
>  	ioasid_free(pasid);
>  }
> +
> +/*
> + * Fix up the PASID MSR if possible.
> + *
> + * But if the #GP was due to another reason, a second #GP might be triggered
> + * to force proper behavior.
> + */
> +bool __fixup_pasid_exception(void)
> +{
> +	struct mm_struct *mm;
> +	bool ret = true;
> +	u64 pasid_msr;
> +	int pasid;
> +
> +	mm = get_task_mm(current);

Why do you need a reference to current->mm ?

> +	/* This #GP was triggered from user mode. So mm cannot be NULL. */
> +	pasid = mm->context.pasid;
> +	/* Ensure this process has been bound to a PASID. */
> +	if (!pasid) {
> +		ret = false;
> +		goto out;
> +	}
> +
> +	/* Check to see if the PASID MSR has already been set for this task. */
> +	rdmsrl(MSR_IA32_PASID, pasid_msr);
> +	if (pasid_msr & MSR_IA32_PASID_VALID) {
> +		ret = false;
> +		goto out;
> +	}
> +
> +	/* Fix up the MSR. */
> +	wrmsrl(MSR_IA32_PASID, pasid | MSR_IA32_PASID_VALID);
> +out:
> +	mmput(mm);

Thanks,

        tglx
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 6/7] x86/traps: Fix up invalid PASID
  2020-04-26 15:25     ` Thomas Gleixner
@ 2020-04-27 20:11       ` Fenghua Yu
  -1 siblings, 0 replies; 74+ messages in thread
From: Fenghua Yu @ 2020-04-27 20:11 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Ingo Molnar, Borislav Petkov, H Peter Anvin, David Woodhouse,
	Lu Baolu, Dave Hansen, Tony Luck, Ashok Raj, Jacob Jun Pan,
	Dave Jiang, Sohil Mehta, Ravi V Shankar, linux-kernel, x86,
	iommu

On Sun, Apr 26, 2020 at 05:25:06PM +0200, Thomas Gleixner wrote:
> Fenghua Yu <fenghua.yu@intel.com> writes:
> > A #GP fault is generated when ENQCMD instruction is executed without
> > a valid PASID value programmed in.
> 
> Programmed in what?

Will change to "...programmed in the PASID MSR".

> 
> > The #GP fault handler will initialize the current thread's PASID MSR.
> >
> > The following heuristic is used to avoid decoding the user instructions
> > to determine the precise reason for the #GP fault:
> > 1) If the mm for the process has not been allocated a PASID, this #GP
> >    cannot be fixed.
> > 2) If the PASID MSR is already initialized, then the #GP was for some
> >    other reason
> > 3) Try initializing the PASID MSR and returning. If the #GP was from
> >    an ENQCMD this will fix it. If not, the #GP fault will be repeated
> >    and we will hit case "2".
> >
> > Suggested-by: Thomas Gleixner <tglx@linutronix.de>
> 
> Just for the record I also suggested to have a proper errorcode in the
> #GP for ENQCMD and I surely did not suggest to avoid decoding the user
> instructions.
> 
> >  void __free_pasid(struct mm_struct *mm);
> > +bool __fixup_pasid_exception(void);
> >  
> >  #endif /* _ASM_X86_IOMMU_H */
> > diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
> > index 6ef00eb6fbb9..369b5ba94635 100644
> > --- a/arch/x86/kernel/traps.c
> > +++ b/arch/x86/kernel/traps.c
> > @@ -56,6 +56,7 @@
> >  #include <asm/umip.h>
> >  #include <asm/insn.h>
> >  #include <asm/insn-eval.h>
> > +#include <asm/iommu.h>
> >  
> >  #ifdef CONFIG_X86_64
> >  #include <asm/x86_init.h>
> > @@ -488,6 +489,16 @@ static enum kernel_gp_hint get_kernel_gp_address(struct pt_regs *regs,
> >  	return GP_CANONICAL;
> >  }
> >  
> > +static bool fixup_pasid_exception(void)
> > +{
> > +	if (!IS_ENABLED(CONFIG_INTEL_IOMMU_SVM))
> > +		return false;
> > +	if (!static_cpu_has(X86_FEATURE_ENQCMD))
> > +		return false;
> > +
> > +	return __fixup_pasid_exception();
> > +}
> > +
> >  #define GPFSTR "general protection fault"
> >  
> >  dotraplinkage void do_general_protection(struct pt_regs *regs, long error_code)
> > @@ -499,6 +510,12 @@ dotraplinkage void do_general_protection(struct pt_regs *regs, long error_code)
> >  	int ret;
> >  
> >  	RCU_LOCKDEP_WARN(!rcu_is_watching(), "entry code didn't wake RCU");
> > +
> > +	if (user_mode(regs) && fixup_pasid_exception()) {
> > +		cond_local_irq_enable(regs);
> 
> The point of this conditional irq enable _AFTER_ calling into the fixup
> function is? Also what's the reason for keeping interrupts disabled
> while calling into that function? Comments exist for a reason.

irq needs to be disabled because the fixup function requires to disable
preempt in order to update the PASID MSR on the faulting CPU.

Will add comments here.

> 
> > +		return;
> > +	}
> > +
> >  	cond_local_irq_enable(regs);
> >  
> >  	if (static_cpu_has(X86_FEATURE_UMIP)) {
> > diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
> > index da718a49e91e..5ed39a022adb 100644
> > --- a/drivers/iommu/intel-svm.c
> > +++ b/drivers/iommu/intel-svm.c
> > @@ -759,3 +759,40 @@ void __free_pasid(struct mm_struct *mm)
> >  	 */
> >  	ioasid_free(pasid);
> >  }
> > +
> > +/*
> > + * Fix up the PASID MSR if possible.
> > + *
> > + * But if the #GP was due to another reason, a second #GP might be triggered
> > + * to force proper behavior.
> > + */
> > +bool __fixup_pasid_exception(void)
> > +{
> > +	struct mm_struct *mm;
> > +	bool ret = true;
> > +	u64 pasid_msr;
> > +	int pasid;
> > +
> > +	mm = get_task_mm(current);
> 
> Why do you need a reference to current->mm ?

The PASID for the address space is per mm and is stored in mm.
To get the PASID, we need to get the mm and the pasid=mm->context.pasid.


> 
> > +	/* This #GP was triggered from user mode. So mm cannot be NULL. */
> > +	pasid = mm->context.pasid;
> > +	/* Ensure this process has been bound to a PASID. */
> > +	if (!pasid) {
> > +		ret = false;
> > +		goto out;
> > +	}
> > +
> > +	/* Check to see if the PASID MSR has already been set for this task. */
> > +	rdmsrl(MSR_IA32_PASID, pasid_msr);
> > +	if (pasid_msr & MSR_IA32_PASID_VALID) {
> > +		ret = false;
> > +		goto out;
> > +	}
> > +
> > +	/* Fix up the MSR. */
> > +	wrmsrl(MSR_IA32_PASID, pasid | MSR_IA32_PASID_VALID);
> > +out:
> > +	mmput(mm);

Thanks,

-Fenghua


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 6/7] x86/traps: Fix up invalid PASID
@ 2020-04-27 20:11       ` Fenghua Yu
  0 siblings, 0 replies; 74+ messages in thread
From: Fenghua Yu @ 2020-04-27 20:11 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Ravi V Shankar, Tony Luck, Dave Jiang, Ashok Raj, x86,
	linux-kernel, Dave Hansen, iommu, Ingo Molnar, Borislav Petkov,
	Jacob Jun Pan, H Peter Anvin, David Woodhouse

On Sun, Apr 26, 2020 at 05:25:06PM +0200, Thomas Gleixner wrote:
> Fenghua Yu <fenghua.yu@intel.com> writes:
> > A #GP fault is generated when ENQCMD instruction is executed without
> > a valid PASID value programmed in.
> 
> Programmed in what?

Will change to "...programmed in the PASID MSR".

> 
> > The #GP fault handler will initialize the current thread's PASID MSR.
> >
> > The following heuristic is used to avoid decoding the user instructions
> > to determine the precise reason for the #GP fault:
> > 1) If the mm for the process has not been allocated a PASID, this #GP
> >    cannot be fixed.
> > 2) If the PASID MSR is already initialized, then the #GP was for some
> >    other reason
> > 3) Try initializing the PASID MSR and returning. If the #GP was from
> >    an ENQCMD this will fix it. If not, the #GP fault will be repeated
> >    and we will hit case "2".
> >
> > Suggested-by: Thomas Gleixner <tglx@linutronix.de>
> 
> Just for the record I also suggested to have a proper errorcode in the
> #GP for ENQCMD and I surely did not suggest to avoid decoding the user
> instructions.
> 
> >  void __free_pasid(struct mm_struct *mm);
> > +bool __fixup_pasid_exception(void);
> >  
> >  #endif /* _ASM_X86_IOMMU_H */
> > diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
> > index 6ef00eb6fbb9..369b5ba94635 100644
> > --- a/arch/x86/kernel/traps.c
> > +++ b/arch/x86/kernel/traps.c
> > @@ -56,6 +56,7 @@
> >  #include <asm/umip.h>
> >  #include <asm/insn.h>
> >  #include <asm/insn-eval.h>
> > +#include <asm/iommu.h>
> >  
> >  #ifdef CONFIG_X86_64
> >  #include <asm/x86_init.h>
> > @@ -488,6 +489,16 @@ static enum kernel_gp_hint get_kernel_gp_address(struct pt_regs *regs,
> >  	return GP_CANONICAL;
> >  }
> >  
> > +static bool fixup_pasid_exception(void)
> > +{
> > +	if (!IS_ENABLED(CONFIG_INTEL_IOMMU_SVM))
> > +		return false;
> > +	if (!static_cpu_has(X86_FEATURE_ENQCMD))
> > +		return false;
> > +
> > +	return __fixup_pasid_exception();
> > +}
> > +
> >  #define GPFSTR "general protection fault"
> >  
> >  dotraplinkage void do_general_protection(struct pt_regs *regs, long error_code)
> > @@ -499,6 +510,12 @@ dotraplinkage void do_general_protection(struct pt_regs *regs, long error_code)
> >  	int ret;
> >  
> >  	RCU_LOCKDEP_WARN(!rcu_is_watching(), "entry code didn't wake RCU");
> > +
> > +	if (user_mode(regs) && fixup_pasid_exception()) {
> > +		cond_local_irq_enable(regs);
> 
> The point of this conditional irq enable _AFTER_ calling into the fixup
> function is? Also what's the reason for keeping interrupts disabled
> while calling into that function? Comments exist for a reason.

irq needs to be disabled because the fixup function requires to disable
preempt in order to update the PASID MSR on the faulting CPU.

Will add comments here.

> 
> > +		return;
> > +	}
> > +
> >  	cond_local_irq_enable(regs);
> >  
> >  	if (static_cpu_has(X86_FEATURE_UMIP)) {
> > diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
> > index da718a49e91e..5ed39a022adb 100644
> > --- a/drivers/iommu/intel-svm.c
> > +++ b/drivers/iommu/intel-svm.c
> > @@ -759,3 +759,40 @@ void __free_pasid(struct mm_struct *mm)
> >  	 */
> >  	ioasid_free(pasid);
> >  }
> > +
> > +/*
> > + * Fix up the PASID MSR if possible.
> > + *
> > + * But if the #GP was due to another reason, a second #GP might be triggered
> > + * to force proper behavior.
> > + */
> > +bool __fixup_pasid_exception(void)
> > +{
> > +	struct mm_struct *mm;
> > +	bool ret = true;
> > +	u64 pasid_msr;
> > +	int pasid;
> > +
> > +	mm = get_task_mm(current);
> 
> Why do you need a reference to current->mm ?

The PASID for the address space is per mm and is stored in mm.
To get the PASID, we need to get the mm and the pasid=mm->context.pasid.


> 
> > +	/* This #GP was triggered from user mode. So mm cannot be NULL. */
> > +	pasid = mm->context.pasid;
> > +	/* Ensure this process has been bound to a PASID. */
> > +	if (!pasid) {
> > +		ret = false;
> > +		goto out;
> > +	}
> > +
> > +	/* Check to see if the PASID MSR has already been set for this task. */
> > +	rdmsrl(MSR_IA32_PASID, pasid_msr);
> > +	if (pasid_msr & MSR_IA32_PASID_VALID) {
> > +		ret = false;
> > +		goto out;
> > +	}
> > +
> > +	/* Fix up the MSR. */
> > +	wrmsrl(MSR_IA32_PASID, pasid | MSR_IA32_PASID_VALID);
> > +out:
> > +	mmput(mm);

Thanks,

-Fenghua

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 1/7] docs: x86: Add a documentation for ENQCMD
  2020-04-26 11:02     ` Thomas Gleixner
@ 2020-04-27 20:13       ` Fenghua Yu
  -1 siblings, 0 replies; 74+ messages in thread
From: Fenghua Yu @ 2020-04-27 20:13 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Ingo Molnar, Borislav Petkov, H Peter Anvin, David Woodhouse,
	Lu Baolu, Dave Hansen, Tony Luck, Ashok Raj, Jacob Jun Pan,
	Dave Jiang, Sohil Mehta, Ravi V Shankar, linux-kernel, x86,
	iommu

On Sun, Apr 26, 2020 at 01:02:12PM +0200, Thomas Gleixner wrote:
> Fenghua Yu <fenghua.yu@intel.com> writes:
> 
> s/Add a documentation/Add documentation/
> 
> > From: Ashok Raj <ashok.raj@intel.com>
> >
> > ENQCMD and Data Streaming Accelerator (DSA) and all of their associated
> > features are a complicated stack with lots of interconnected pieces.
> > This documentation provides a big picture overview for all of the
> > features.
> >
> > Signed-off-by: Ashok Raj <ashok.raj@intel.com>
> > Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
> > Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
> > Reviewed-by: Tony Luck <tony.luck@intel.com>
> > ---
> >  Documentation/x86/enqcmd.rst | 185 +++++++++++++++++++++++++++++++++++
> 
> How is that hooked up into the Documentation index?
> 
>  Documentation/x86/enqcmd.rst: WARNING: document isn't included in any toctree
> 
> > +++ b/Documentation/x86/enqcmd.rst
> > @@ -0,0 +1,185 @@
> > +.. SPDX-License-Identifier: GPL-2.0
> > +
> > +Improved Device Interaction Overview
> 
> So the document is about ENQCMD, right? Can you please make that in some
> way consistently named?
> 
> > +
> > +== Background ==
> 
> This lacks any docbook formatting.... The resulting HTML looks like ...
> 
> > +
> > +Shared Virtual Addressing (SVA) allows the processor and device to use the
> > +same virtual addresses avoiding the need for software to translate virtual
> > +addresses to physical addresses. ENQCMD is a new instruction on Intel
> > +platforms that allows user applications to directly notify hardware of new
> > +work, much like doorbells are used in some hardware, but carries a payload
> > +that carries the PASID and some additional device specific commands
> > +along with it.
> 
> Sorry that's not background information, that's an agglomeration of
> words.
> 
> Can you please explain properly what's the background of SVA, how it
> differs from regular device addressing and what kind of requirements it
> has?
> 
> ENQCMD is not related to background. It's part of the new technology.
> 
> > +== Address Space Tagging ==
> > +
> > +A new MSR (MSR_IA32_PASID) allows an application address space to be
> > +associated with what the PCIe spec calls a Process Address Space ID
> > +(PASID). This PASID tag is carried along with all requests between
> > +applications and devices and allows devices to interact with the process
> > +address space.
> 
> Sigh. The important part here is not the MSR. The important part is to
> explain what PASID is and where it comes from. Documentation has similar
> rules as changelogs:
> 
>       1) Provide context
> 
>       2) Explain requirements
>       
>       3) Explain implementation
> 
> The pile you provided is completely backwards and unstructured.

Ok. Will address all of the comments.

Thanks.

-Fenghua

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 1/7] docs: x86: Add a documentation for ENQCMD
@ 2020-04-27 20:13       ` Fenghua Yu
  0 siblings, 0 replies; 74+ messages in thread
From: Fenghua Yu @ 2020-04-27 20:13 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Ravi V Shankar, Tony Luck, Dave Jiang, Ashok Raj, x86,
	linux-kernel, Dave Hansen, iommu, Ingo Molnar, Borislav Petkov,
	Jacob Jun Pan, H Peter Anvin, David Woodhouse

On Sun, Apr 26, 2020 at 01:02:12PM +0200, Thomas Gleixner wrote:
> Fenghua Yu <fenghua.yu@intel.com> writes:
> 
> s/Add a documentation/Add documentation/
> 
> > From: Ashok Raj <ashok.raj@intel.com>
> >
> > ENQCMD and Data Streaming Accelerator (DSA) and all of their associated
> > features are a complicated stack with lots of interconnected pieces.
> > This documentation provides a big picture overview for all of the
> > features.
> >
> > Signed-off-by: Ashok Raj <ashok.raj@intel.com>
> > Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
> > Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
> > Reviewed-by: Tony Luck <tony.luck@intel.com>
> > ---
> >  Documentation/x86/enqcmd.rst | 185 +++++++++++++++++++++++++++++++++++
> 
> How is that hooked up into the Documentation index?
> 
>  Documentation/x86/enqcmd.rst: WARNING: document isn't included in any toctree
> 
> > +++ b/Documentation/x86/enqcmd.rst
> > @@ -0,0 +1,185 @@
> > +.. SPDX-License-Identifier: GPL-2.0
> > +
> > +Improved Device Interaction Overview
> 
> So the document is about ENQCMD, right? Can you please make that in some
> way consistently named?
> 
> > +
> > +== Background ==
> 
> This lacks any docbook formatting.... The resulting HTML looks like ...
> 
> > +
> > +Shared Virtual Addressing (SVA) allows the processor and device to use the
> > +same virtual addresses avoiding the need for software to translate virtual
> > +addresses to physical addresses. ENQCMD is a new instruction on Intel
> > +platforms that allows user applications to directly notify hardware of new
> > +work, much like doorbells are used in some hardware, but carries a payload
> > +that carries the PASID and some additional device specific commands
> > +along with it.
> 
> Sorry that's not background information, that's an agglomeration of
> words.
> 
> Can you please explain properly what's the background of SVA, how it
> differs from regular device addressing and what kind of requirements it
> has?
> 
> ENQCMD is not related to background. It's part of the new technology.
> 
> > +== Address Space Tagging ==
> > +
> > +A new MSR (MSR_IA32_PASID) allows an application address space to be
> > +associated with what the PCIe spec calls a Process Address Space ID
> > +(PASID). This PASID tag is carried along with all requests between
> > +applications and devices and allows devices to interact with the process
> > +address space.
> 
> Sigh. The important part here is not the MSR. The important part is to
> explain what PASID is and where it comes from. Documentation has similar
> rules as changelogs:
> 
>       1) Provide context
> 
>       2) Explain requirements
>       
>       3) Explain implementation
> 
> The pile you provided is completely backwards and unstructured.

Ok. Will address all of the comments.

Thanks.

-Fenghua
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 2/7] x86/cpufeatures: Enumerate ENQCMD and ENQCMDS instructions
  2020-04-26 11:06     ` Thomas Gleixner
@ 2020-04-27 20:17       ` Fenghua Yu
  -1 siblings, 0 replies; 74+ messages in thread
From: Fenghua Yu @ 2020-04-27 20:17 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Ingo Molnar, Borislav Petkov, H Peter Anvin, David Woodhouse,
	Lu Baolu, Dave Hansen, Tony Luck, Ashok Raj, Jacob Jun Pan,
	Dave Jiang, Sohil Mehta, Ravi V Shankar, linux-kernel, x86,
	iommu

On Sun, Apr 26, 2020 at 01:06:33PM +0200, Thomas Gleixner wrote:
> Fenghua Yu <fenghua.yu@intel.com> writes:
> > A user space application can execute ENQCMD instruction to submit work
> > to device. The kernel executes ENQCMDS instruction to submit work to
> > device.
> 
> So a user space application _can_ execute ENQCMD and the kernel
> executes ENQCMDS. And both submit work to device.
> 
> > There is a lot of other enabling needed for the instructions to actually
> > be usable in user space and the kernel, and that enabling is coming later
> > in the series and in device drivers.
> 
> That's important information to the enumeration of the instructions in
> which way?

I just want to notify people this enumeration is only part of enabling
ENQCMD. But seems this paragraph is not so useful to be here. Maybe I can
remove it.

Thanks.

-Fenghua

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 2/7] x86/cpufeatures: Enumerate ENQCMD and ENQCMDS instructions
@ 2020-04-27 20:17       ` Fenghua Yu
  0 siblings, 0 replies; 74+ messages in thread
From: Fenghua Yu @ 2020-04-27 20:17 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Ravi V Shankar, Tony Luck, Dave Jiang, Ashok Raj, x86,
	linux-kernel, Dave Hansen, iommu, Ingo Molnar, Borislav Petkov,
	Jacob Jun Pan, H Peter Anvin, David Woodhouse

On Sun, Apr 26, 2020 at 01:06:33PM +0200, Thomas Gleixner wrote:
> Fenghua Yu <fenghua.yu@intel.com> writes:
> > A user space application can execute ENQCMD instruction to submit work
> > to device. The kernel executes ENQCMDS instruction to submit work to
> > device.
> 
> So a user space application _can_ execute ENQCMD and the kernel
> executes ENQCMDS. And both submit work to device.
> 
> > There is a lot of other enabling needed for the instructions to actually
> > be usable in user space and the kernel, and that enabling is coming later
> > in the series and in device drivers.
> 
> That's important information to the enumeration of the instructions in
> which way?

I just want to notify people this enumeration is only part of enabling
ENQCMD. But seems this paragraph is not so useful to be here. Maybe I can
remove it.

Thanks.

-Fenghua
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 3/7] x86/fpu/xstate: Add supervisor PASID state for ENQCMD feature
  2020-04-26 11:17     ` Thomas Gleixner
@ 2020-04-27 20:33       ` Fenghua Yu
  -1 siblings, 0 replies; 74+ messages in thread
From: Fenghua Yu @ 2020-04-27 20:33 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Ingo Molnar, Borislav Petkov, H Peter Anvin, David Woodhouse,
	Lu Baolu, Dave Hansen, Tony Luck, Ashok Raj, Jacob Jun Pan,
	Dave Jiang, Sohil Mehta, Ravi V Shankar, linux-kernel, x86,
	iommu, Yu-cheng Yu

On Sun, Apr 26, 2020 at 01:17:11PM +0200, Thomas Gleixner wrote:
> Fenghua Yu <fenghua.yu@intel.com> writes:
> > From: Yu-cheng Yu <yu-cheng.yu@intel.com>
> >
> > The IA32_PASID MSR is used when a task submits work via the ENQCMD
> > instruction.
> 
> Is used?
> 
> > The per task MSR is stored in the task's supervisor FPU
> 
> per task MSR? Lot's of MSRs ....
> 
> > PASID state and is context switched by XSAVES/XRSTORS.
> >

Maybe change the commit messge to the following?

ENQCMD instruction reads PASID from IA32_PASID MSR. The MSR is stored
in the task's supervisor FPU PASID state and is context switched by
XSAVES/XRSTORS.

Thanks.

-Fenghua

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 3/7] x86/fpu/xstate: Add supervisor PASID state for ENQCMD feature
@ 2020-04-27 20:33       ` Fenghua Yu
  0 siblings, 0 replies; 74+ messages in thread
From: Fenghua Yu @ 2020-04-27 20:33 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Ravi V Shankar, Yu-cheng Yu, Tony Luck, Dave Jiang, Ashok Raj,
	x86, linux-kernel, Dave Hansen, iommu, Ingo Molnar,
	Borislav Petkov, Jacob Jun Pan, H Peter Anvin, David Woodhouse

On Sun, Apr 26, 2020 at 01:17:11PM +0200, Thomas Gleixner wrote:
> Fenghua Yu <fenghua.yu@intel.com> writes:
> > From: Yu-cheng Yu <yu-cheng.yu@intel.com>
> >
> > The IA32_PASID MSR is used when a task submits work via the ENQCMD
> > instruction.
> 
> Is used?
> 
> > The per task MSR is stored in the task's supervisor FPU
> 
> per task MSR? Lot's of MSRs ....
> 
> > PASID state and is context switched by XSAVES/XRSTORS.
> >

Maybe change the commit messge to the following?

ENQCMD instruction reads PASID from IA32_PASID MSR. The MSR is stored
in the task's supervisor FPU PASID state and is context switched by
XSAVES/XRSTORS.

Thanks.

-Fenghua
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 4/7] x86/msr-index: Define IA32_PASID MSR
  2020-04-26 11:22     ` Thomas Gleixner
@ 2020-04-27 20:50       ` Fenghua Yu
  -1 siblings, 0 replies; 74+ messages in thread
From: Fenghua Yu @ 2020-04-27 20:50 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Ingo Molnar, Borislav Petkov, H Peter Anvin, David Woodhouse,
	Lu Baolu, Dave Hansen, Tony Luck, Ashok Raj, Jacob Jun Pan,
	Dave Jiang, Sohil Mehta, Ravi V Shankar, linux-kernel, x86,
	iommu

On Sun, Apr 26, 2020 at 01:22:00PM +0200, Thomas Gleixner wrote:
> Fenghua Yu <fenghua.yu@intel.com> writes:
> 
> > The IA32_PASID MSR (0xd93) contains the Process Address Space Identifier
> > (PASID), a 20-bit value. Bit 31 must be set to indicate the value
> > programmed in the MSR is valid. Hardware uses PASID to identify which
> > process submits the work and direct responses to the right process.
> 
> No. It does not identify the process. It identifies the process' address
> space as the name says.
> 
> Please provide coherent and precise information.

Ok. Will change to address space identification.

Thanks.

-Fenghua

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 4/7] x86/msr-index: Define IA32_PASID MSR
@ 2020-04-27 20:50       ` Fenghua Yu
  0 siblings, 0 replies; 74+ messages in thread
From: Fenghua Yu @ 2020-04-27 20:50 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Ravi V Shankar, Tony Luck, Dave Jiang, Ashok Raj, x86,
	linux-kernel, Dave Hansen, iommu, Ingo Molnar, Borislav Petkov,
	Jacob Jun Pan, H Peter Anvin, David Woodhouse

On Sun, Apr 26, 2020 at 01:22:00PM +0200, Thomas Gleixner wrote:
> Fenghua Yu <fenghua.yu@intel.com> writes:
> 
> > The IA32_PASID MSR (0xd93) contains the Process Address Space Identifier
> > (PASID), a 20-bit value. Bit 31 must be set to indicate the value
> > programmed in the MSR is valid. Hardware uses PASID to identify which
> > process submits the work and direct responses to the right process.
> 
> No. It does not identify the process. It identifies the process' address
> space as the name says.
> 
> Please provide coherent and precise information.

Ok. Will change to address space identification.

Thanks.

-Fenghua
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 5/7] x86/mmu: Allocate/free PASID
  2020-04-26 14:55     ` Thomas Gleixner
@ 2020-04-27 22:18       ` Fenghua Yu
  -1 siblings, 0 replies; 74+ messages in thread
From: Fenghua Yu @ 2020-04-27 22:18 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Ingo Molnar, Borislav Petkov, H Peter Anvin, David Woodhouse,
	Lu Baolu, Dave Hansen, Tony Luck, Ashok Raj, Jacob Jun Pan,
	Dave Jiang, Sohil Mehta, Ravi V Shankar, linux-kernel, x86,
	iommu

On Sun, Apr 26, 2020 at 04:55:25PM +0200, Thomas Gleixner wrote:
> Fenghua Yu <fenghua.yu@intel.com> writes:
> > +++ b/arch/x86/include/asm/mmu.h @@ -50,6 +50,10 @@ typedef struct {
> >  	u16 pkey_allocation_map; s16 execute_only_pkey;
> >  #endif
> > + +#ifdef CONFIG_INTEL_IOMMU_SVM +	int pasid;
> 
> int? It's a value which gets programmed into the MSR along with the valid 
> bit (bit 31) set.

The pasid is defined as "int" in struct intel_svm and in 
intel_svm_bind_mm() and intel_svm_unbind_mm(). So the pasid defined in this 
patch follows the same type defined in those places.

But as you pointed out below, ioasid_t is defined as "unsigned int".

> 
> >  extern void switch_mm(struct mm_struct *prev, struct mm_struct *next, 
> > diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c 
> > index d7f2a5358900..da718a49e91e 100644 --- a/drivers/iommu/intel-svm.c 
> > +++ b/drivers/iommu/intel-svm.c @@ -226,6 +226,45 @@ static 
> > LIST_HEAD(global_svm_list);
> >  	list_for_each_entry((sdev), &(svm)->devs, list)	\
> >  		if ((d) != (sdev)->dev) {} else
> >  
> > +/* + * If this mm already has a PASID we can use it. Otherwise 
> > allocate a new one. + * Let the caller know if we did an allocation via 
> > 'new_pasid'. + */ +static int alloc_pasid(struct intel_svm *svm, struct 
> > mm_struct *mm, + int pasid_max, bool *new_pasid, int flags)
> 
> Again, data types please. flags are generally unsigned and not plain int. 
> Also pasid_max is certainly not plain int either.

The caller defines pasid_max and flags as "int". This function just follows
the caller's definitions.

But I will change their definitions to "unsigned int" here.

> 
> > + *new_pasid = false; + + return mm->context.pasid; + } + + /* + * 
> > Allocate a new pasid. Do not use PASID 0, reserved for RID to + * 
> > PASID. + */ + pasid = ioasid_alloc(NULL, PASID_MIN, pasid_max - 1, 
> > svm);
> 
> ioasid_alloc() uses ioasid_t which is
> 
> typedef unsigned int ioasid_t;
> 
> Can we please have consistent types and behaviour all over the place?

Should I just define "pasid", "pasid_max", "flags" as "unsigned int" for
the new functions/code?

Or should I also change their types to "unsigned int" in the original
svm code (struct intel_svm, ...bind_mm(), etc)? I'm afraid that will be
a lot of changes and should be in a separate preparation patch.

Thanks.

-Fenghua

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 5/7] x86/mmu: Allocate/free PASID
@ 2020-04-27 22:18       ` Fenghua Yu
  0 siblings, 0 replies; 74+ messages in thread
From: Fenghua Yu @ 2020-04-27 22:18 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Ravi V Shankar, Tony Luck, Dave Jiang, Ashok Raj, x86,
	linux-kernel, Dave Hansen, iommu, Ingo Molnar, Borislav Petkov,
	Jacob Jun Pan, H Peter Anvin, David Woodhouse

On Sun, Apr 26, 2020 at 04:55:25PM +0200, Thomas Gleixner wrote:
> Fenghua Yu <fenghua.yu@intel.com> writes:
> > +++ b/arch/x86/include/asm/mmu.h @@ -50,6 +50,10 @@ typedef struct {
> >  	u16 pkey_allocation_map; s16 execute_only_pkey;
> >  #endif
> > + +#ifdef CONFIG_INTEL_IOMMU_SVM +	int pasid;
> 
> int? It's a value which gets programmed into the MSR along with the valid 
> bit (bit 31) set.

The pasid is defined as "int" in struct intel_svm and in 
intel_svm_bind_mm() and intel_svm_unbind_mm(). So the pasid defined in this 
patch follows the same type defined in those places.

But as you pointed out below, ioasid_t is defined as "unsigned int".

> 
> >  extern void switch_mm(struct mm_struct *prev, struct mm_struct *next, 
> > diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c 
> > index d7f2a5358900..da718a49e91e 100644 --- a/drivers/iommu/intel-svm.c 
> > +++ b/drivers/iommu/intel-svm.c @@ -226,6 +226,45 @@ static 
> > LIST_HEAD(global_svm_list);
> >  	list_for_each_entry((sdev), &(svm)->devs, list)	\
> >  		if ((d) != (sdev)->dev) {} else
> >  
> > +/* + * If this mm already has a PASID we can use it. Otherwise 
> > allocate a new one. + * Let the caller know if we did an allocation via 
> > 'new_pasid'. + */ +static int alloc_pasid(struct intel_svm *svm, struct 
> > mm_struct *mm, + int pasid_max, bool *new_pasid, int flags)
> 
> Again, data types please. flags are generally unsigned and not plain int. 
> Also pasid_max is certainly not plain int either.

The caller defines pasid_max and flags as "int". This function just follows
the caller's definitions.

But I will change their definitions to "unsigned int" here.

> 
> > + *new_pasid = false; + + return mm->context.pasid; + } + + /* + * 
> > Allocate a new pasid. Do not use PASID 0, reserved for RID to + * 
> > PASID. + */ + pasid = ioasid_alloc(NULL, PASID_MIN, pasid_max - 1, 
> > svm);
> 
> ioasid_alloc() uses ioasid_t which is
> 
> typedef unsigned int ioasid_t;
> 
> Can we please have consistent types and behaviour all over the place?

Should I just define "pasid", "pasid_max", "flags" as "unsigned int" for
the new functions/code?

Or should I also change their types to "unsigned int" in the original
svm code (struct intel_svm, ...bind_mm(), etc)? I'm afraid that will be
a lot of changes and should be in a separate preparation patch.

Thanks.

-Fenghua
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 6/7] x86/traps: Fix up invalid PASID
  2020-04-26 15:25     ` Thomas Gleixner
@ 2020-04-27 22:46       ` Raj, Ashok
  -1 siblings, 0 replies; 74+ messages in thread
From: Raj, Ashok @ 2020-04-27 22:46 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Fenghua Yu, Ingo Molnar, Borislav Petkov, H Peter Anvin,
	David Woodhouse, Lu Baolu, Dave Hansen, Tony Luck, Jacob Jun Pan,
	Dave Jiang, Sohil Mehta, Ravi V Shankar, linux-kernel, x86,
	iommu, Ashok Raj

Hi Thomas

On Sun, Apr 26, 2020 at 05:25:06PM +0200, Thomas Gleixner wrote:
> Fenghua Yu <fenghua.yu@intel.com> writes:
> > A #GP fault is generated when ENQCMD instruction is executed without
> > a valid PASID value programmed in.
> 
> Programmed in what?
> 
> > The #GP fault handler will initialize the current thread's PASID MSR.
> >
> > The following heuristic is used to avoid decoding the user instructions
> > to determine the precise reason for the #GP fault:
> > 1) If the mm for the process has not been allocated a PASID, this #GP
> >    cannot be fixed.
> > 2) If the PASID MSR is already initialized, then the #GP was for some
> >    other reason
> > 3) Try initializing the PASID MSR and returning. If the #GP was from
> >    an ENQCMD this will fix it. If not, the #GP fault will be repeated
> >    and we will hit case "2".
> >
> > Suggested-by: Thomas Gleixner <tglx@linutronix.de>
> 
> Just for the record I also suggested to have a proper errorcode in the
> #GP for ENQCMD and I surely did not suggest to avoid decoding the user
> instructions.

We certainly discussed the possiblity of adding an error code to 
identiy #GP due to ENQCMD with our HW architects. 

There are only a few cases that have an error code, like move to segment
with an invalid value for instance. There were a few but i don't
recall that entire list. 

Since the error code is 0 in most places, there isn't plumbing in hw to return
this value in all cases. It appeared that due to some uarch reasons it
wasn't as simple as it appears to /me sw kinds :-)

So after some internal discussion we decided to take the current
approach. Its possible that if the #GP was due to some other reason
we might #GP another time. Since this wasn't perf or speed path we took
this lazy approach. 

We will keep tabs with HW folks for future consideration. 

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 6/7] x86/traps: Fix up invalid PASID
@ 2020-04-27 22:46       ` Raj, Ashok
  0 siblings, 0 replies; 74+ messages in thread
From: Raj, Ashok @ 2020-04-27 22:46 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Fenghua Yu, Tony Luck, Dave Jiang, Ashok Raj, Ravi V Shankar,
	x86, linux-kernel, Dave Hansen, iommu, Ingo Molnar,
	Borislav Petkov, Jacob Jun Pan, H Peter Anvin, David Woodhouse

Hi Thomas

On Sun, Apr 26, 2020 at 05:25:06PM +0200, Thomas Gleixner wrote:
> Fenghua Yu <fenghua.yu@intel.com> writes:
> > A #GP fault is generated when ENQCMD instruction is executed without
> > a valid PASID value programmed in.
> 
> Programmed in what?
> 
> > The #GP fault handler will initialize the current thread's PASID MSR.
> >
> > The following heuristic is used to avoid decoding the user instructions
> > to determine the precise reason for the #GP fault:
> > 1) If the mm for the process has not been allocated a PASID, this #GP
> >    cannot be fixed.
> > 2) If the PASID MSR is already initialized, then the #GP was for some
> >    other reason
> > 3) Try initializing the PASID MSR and returning. If the #GP was from
> >    an ENQCMD this will fix it. If not, the #GP fault will be repeated
> >    and we will hit case "2".
> >
> > Suggested-by: Thomas Gleixner <tglx@linutronix.de>
> 
> Just for the record I also suggested to have a proper errorcode in the
> #GP for ENQCMD and I surely did not suggest to avoid decoding the user
> instructions.

We certainly discussed the possiblity of adding an error code to 
identiy #GP due to ENQCMD with our HW architects. 

There are only a few cases that have an error code, like move to segment
with an invalid value for instance. There were a few but i don't
recall that entire list. 

Since the error code is 0 in most places, there isn't plumbing in hw to return
this value in all cases. It appeared that due to some uarch reasons it
wasn't as simple as it appears to /me sw kinds :-)

So after some internal discussion we decided to take the current
approach. Its possible that if the #GP was due to some other reason
we might #GP another time. Since this wasn't perf or speed path we took
this lazy approach. 

We will keep tabs with HW folks for future consideration. 
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 74+ messages in thread

* RE: [PATCH 6/7] x86/traps: Fix up invalid PASID
  2020-04-27 22:46       ` Raj, Ashok
@ 2020-04-27 23:08         ` Luck, Tony
  -1 siblings, 0 replies; 74+ messages in thread
From: Luck, Tony @ 2020-04-27 23:08 UTC (permalink / raw)
  To: Raj, Ashok, Thomas Gleixner
  Cc: Yu, Fenghua, Ingo Molnar, Borislav Petkov, H Peter Anvin,
	David Woodhouse, Lu Baolu, Hansen, Dave, Pan, Jacob jun, Jiang,
	Dave, Mehta, Sohil, Shankar, Ravi V, linux-kernel, x86, iommu

> Just for the record I also suggested to have a proper errorcode in the
> #GP for ENQCMD and I surely did not suggest to avoid decoding the user
> instructions.

Thomas,

Is the heuristic to avoid decoding the user instructions OK (you are just pointing
out that you should not be given credit for this part of the idea)?

Or are you saying that you'd like to see the instruction checked to see that it
was an ENQCMD?

-Tony

 

^ permalink raw reply	[flat|nested] 74+ messages in thread

* RE: [PATCH 6/7] x86/traps: Fix up invalid PASID
@ 2020-04-27 23:08         ` Luck, Tony
  0 siblings, 0 replies; 74+ messages in thread
From: Luck, Tony @ 2020-04-27 23:08 UTC (permalink / raw)
  To: Raj, Ashok, Thomas Gleixner
  Cc: Yu, Fenghua, Jiang, Dave, Shankar, Ravi V, x86, linux-kernel,
	Hansen, Dave, iommu, Ingo Molnar, Borislav Petkov, Pan,
	Jacob jun, H Peter Anvin, David Woodhouse

> Just for the record I also suggested to have a proper errorcode in the
> #GP for ENQCMD and I surely did not suggest to avoid decoding the user
> instructions.

Thomas,

Is the heuristic to avoid decoding the user instructions OK (you are just pointing
out that you should not be given credit for this part of the idea)?

Or are you saying that you'd like to see the instruction checked to see that it
was an ENQCMD?

-Tony

 
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 5/7] x86/mmu: Allocate/free PASID
  2020-04-27 22:18       ` Fenghua Yu
@ 2020-04-27 23:44         ` Thomas Gleixner
  -1 siblings, 0 replies; 74+ messages in thread
From: Thomas Gleixner @ 2020-04-27 23:44 UTC (permalink / raw)
  To: Fenghua Yu
  Cc: Ingo Molnar, Borislav Petkov, H Peter Anvin, David Woodhouse,
	Lu Baolu, Dave Hansen, Tony Luck, Ashok Raj, Jacob Jun Pan,
	Dave Jiang, Sohil Mehta, Ravi V Shankar, linux-kernel, x86,
	iommu

Fenghua Yu <fenghua.yu@intel.com> writes:
> On Sun, Apr 26, 2020 at 04:55:25PM +0200, Thomas Gleixner wrote:
>> Fenghua Yu <fenghua.yu@intel.com> writes:
>> > + +#ifdef CONFIG_INTEL_IOMMU_SVM +	int pasid;
>> 
>> int? It's a value which gets programmed into the MSR along with the valid 
>> bit (bit 31) set.
>
> The pasid is defined as "int" in struct intel_svm and in 
> intel_svm_bind_mm() and intel_svm_unbind_mm(). So the pasid defined in this 
> patch follows the same type defined in those places.

Which are wrong to begin with.

>> ioasid_alloc() uses ioasid_t which is
>> 
>> typedef unsigned int ioasid_t;
>> 
>> Can we please have consistent types and behaviour all over the place?
>
> Should I just define "pasid", "pasid_max", "flags" as "unsigned int" for
> the new functions/code?
>
> Or should I also change their types to "unsigned int" in the original
> svm code (struct intel_svm, ...bind_mm(), etc)? I'm afraid that will be
> a lot of changes and should be in a separate preparation patch.

Yes, please. The existance of non-sensical code is not an excuse to
proliferate it.

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 5/7] x86/mmu: Allocate/free PASID
@ 2020-04-27 23:44         ` Thomas Gleixner
  0 siblings, 0 replies; 74+ messages in thread
From: Thomas Gleixner @ 2020-04-27 23:44 UTC (permalink / raw)
  To: Fenghua Yu
  Cc: Ravi V Shankar, Tony Luck, Dave Jiang, Ashok Raj, x86,
	linux-kernel, Dave Hansen, iommu, Ingo Molnar, Borislav Petkov,
	Jacob Jun Pan, H Peter Anvin, David Woodhouse

Fenghua Yu <fenghua.yu@intel.com> writes:
> On Sun, Apr 26, 2020 at 04:55:25PM +0200, Thomas Gleixner wrote:
>> Fenghua Yu <fenghua.yu@intel.com> writes:
>> > + +#ifdef CONFIG_INTEL_IOMMU_SVM +	int pasid;
>> 
>> int? It's a value which gets programmed into the MSR along with the valid 
>> bit (bit 31) set.
>
> The pasid is defined as "int" in struct intel_svm and in 
> intel_svm_bind_mm() and intel_svm_unbind_mm(). So the pasid defined in this 
> patch follows the same type defined in those places.

Which are wrong to begin with.

>> ioasid_alloc() uses ioasid_t which is
>> 
>> typedef unsigned int ioasid_t;
>> 
>> Can we please have consistent types and behaviour all over the place?
>
> Should I just define "pasid", "pasid_max", "flags" as "unsigned int" for
> the new functions/code?
>
> Or should I also change their types to "unsigned int" in the original
> svm code (struct intel_svm, ...bind_mm(), etc)? I'm afraid that will be
> a lot of changes and should be in a separate preparation patch.

Yes, please. The existance of non-sensical code is not an excuse to
proliferate it.

Thanks,

        tglx
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 6/7] x86/traps: Fix up invalid PASID
  2020-04-27 20:11       ` Fenghua Yu
@ 2020-04-28  0:13         ` Thomas Gleixner
  -1 siblings, 0 replies; 74+ messages in thread
From: Thomas Gleixner @ 2020-04-28  0:13 UTC (permalink / raw)
  To: Fenghua Yu
  Cc: Ingo Molnar, Borislav Petkov, H Peter Anvin, David Woodhouse,
	Lu Baolu, Dave Hansen, Tony Luck, Ashok Raj, Jacob Jun Pan,
	Dave Jiang, Sohil Mehta, Ravi V Shankar, linux-kernel, x86,
	iommu

Fenghua Yu <fenghua.yu@intel.com> writes:
> On Sun, Apr 26, 2020 at 05:25:06PM +0200, Thomas Gleixner wrote:
>> > @@ -499,6 +510,12 @@ dotraplinkage void do_general_protection(struct pt_regs *regs, long error_code)
>> >  	int ret;
>> >  
>> >  	RCU_LOCKDEP_WARN(!rcu_is_watching(), "entry code didn't wake RCU");
>> > +
>> > +	if (user_mode(regs) && fixup_pasid_exception()) {
>> > +		cond_local_irq_enable(regs);
>> 
>> The point of this conditional irq enable _AFTER_ calling into the fixup
>> function is? Also what's the reason for keeping interrupts disabled
>> while calling into that function? Comments exist for a reason.
>
> irq needs to be disabled because the fixup function requires to disable
> preempt in order to update the PASID MSR on the faulting CPU.

No, that's just wrong. It's not about the update itself.

> Will add comments here.

Factual ones and not some fairy tales please.

>> > +bool __fixup_pasid_exception(void)
>> > +{
>> > +	struct mm_struct *mm;
>> > +	bool ret = true;
>> > +	u64 pasid_msr;
>> > +	int pasid;
>> > +
>> > +	mm = get_task_mm(current);
>> 
>> Why do you need a reference to current->mm ?
>
> The PASID for the address space is per mm and is stored in mm.
> To get the PASID, we need to get the mm and the
> pasid=mm->context.pasid.

It's obvious that you need to access current-mm in order to check
current->mm->context.pasid. Let me rephrase the question:

   Why do you need to take a reference on current->mm ?

Thanks,

        tglx


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 6/7] x86/traps: Fix up invalid PASID
@ 2020-04-28  0:13         ` Thomas Gleixner
  0 siblings, 0 replies; 74+ messages in thread
From: Thomas Gleixner @ 2020-04-28  0:13 UTC (permalink / raw)
  To: Fenghua Yu
  Cc: Ravi V Shankar, Tony Luck, Dave Jiang, Ashok Raj, x86,
	linux-kernel, Dave Hansen, iommu, Ingo Molnar, Borislav Petkov,
	Jacob Jun Pan, H Peter Anvin, David Woodhouse

Fenghua Yu <fenghua.yu@intel.com> writes:
> On Sun, Apr 26, 2020 at 05:25:06PM +0200, Thomas Gleixner wrote:
>> > @@ -499,6 +510,12 @@ dotraplinkage void do_general_protection(struct pt_regs *regs, long error_code)
>> >  	int ret;
>> >  
>> >  	RCU_LOCKDEP_WARN(!rcu_is_watching(), "entry code didn't wake RCU");
>> > +
>> > +	if (user_mode(regs) && fixup_pasid_exception()) {
>> > +		cond_local_irq_enable(regs);
>> 
>> The point of this conditional irq enable _AFTER_ calling into the fixup
>> function is? Also what's the reason for keeping interrupts disabled
>> while calling into that function? Comments exist for a reason.
>
> irq needs to be disabled because the fixup function requires to disable
> preempt in order to update the PASID MSR on the faulting CPU.

No, that's just wrong. It's not about the update itself.

> Will add comments here.

Factual ones and not some fairy tales please.

>> > +bool __fixup_pasid_exception(void)
>> > +{
>> > +	struct mm_struct *mm;
>> > +	bool ret = true;
>> > +	u64 pasid_msr;
>> > +	int pasid;
>> > +
>> > +	mm = get_task_mm(current);
>> 
>> Why do you need a reference to current->mm ?
>
> The PASID for the address space is per mm and is stored in mm.
> To get the PASID, we need to get the mm and the
> pasid=mm->context.pasid.

It's obvious that you need to access current-mm in order to check
current->mm->context.pasid. Let me rephrase the question:

   Why do you need to take a reference on current->mm ?

Thanks,

        tglx

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 74+ messages in thread

* RE: [PATCH 6/7] x86/traps: Fix up invalid PASID
  2020-04-27 23:08         ` Luck, Tony
@ 2020-04-28  0:20           ` Thomas Gleixner
  -1 siblings, 0 replies; 74+ messages in thread
From: Thomas Gleixner @ 2020-04-28  0:20 UTC (permalink / raw)
  To: Luck, Tony, Raj, Ashok
  Cc: Yu, Fenghua, Ingo Molnar, Borislav Petkov, H Peter Anvin,
	David Woodhouse, Lu Baolu, Hansen, Dave, Pan, Jacob jun, Jiang,
	Dave, Mehta, Sohil, Shankar, Ravi V, linux-kernel, x86, iommu

"Luck, Tony" <tony.luck@intel.com> writes:
>> Just for the record I also suggested to have a proper errorcode in the
>> #GP for ENQCMD and I surely did not suggest to avoid decoding the user
>> instructions.
>
> Is the heuristic to avoid decoding the user instructions OK (you are just pointing
> out that you should not be given credit for this part of the idea)?

I surely suggested the approach, but at the same time I asked for the
error code and did not say that instruction checking needs to be
avoided.

This comment was just to make it clear that there were other options
discussed. IOW, the changelog should have some explicit explanations
why:

 - the error code idea does not work (according to HW folks)

 - the instruction decoding has no real benefit because $REASONS

Thanks,

        tglx



^ permalink raw reply	[flat|nested] 74+ messages in thread

* RE: [PATCH 6/7] x86/traps: Fix up invalid PASID
@ 2020-04-28  0:20           ` Thomas Gleixner
  0 siblings, 0 replies; 74+ messages in thread
From: Thomas Gleixner @ 2020-04-28  0:20 UTC (permalink / raw)
  To: Luck, Tony, Raj, Ashok
  Cc: Yu, Fenghua, Jiang, Dave, Shankar, Ravi V, x86, linux-kernel,
	Hansen, Dave, iommu, Ingo Molnar, Borislav Petkov, Pan,
	Jacob jun, H Peter Anvin, David Woodhouse

"Luck, Tony" <tony.luck@intel.com> writes:
>> Just for the record I also suggested to have a proper errorcode in the
>> #GP for ENQCMD and I surely did not suggest to avoid decoding the user
>> instructions.
>
> Is the heuristic to avoid decoding the user instructions OK (you are just pointing
> out that you should not be given credit for this part of the idea)?

I surely suggested the approach, but at the same time I asked for the
error code and did not say that instruction checking needs to be
avoided.

This comment was just to make it clear that there were other options
discussed. IOW, the changelog should have some explicit explanations
why:

 - the error code idea does not work (according to HW folks)

 - the instruction decoding has no real benefit because $REASONS

Thanks,

        tglx


_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 6/7] x86/traps: Fix up invalid PASID
  2020-04-27 22:46       ` Raj, Ashok
@ 2020-04-28  0:54         ` Thomas Gleixner
  -1 siblings, 0 replies; 74+ messages in thread
From: Thomas Gleixner @ 2020-04-28  0:54 UTC (permalink / raw)
  To: Raj, Ashok
  Cc: Fenghua Yu, Ingo Molnar, Borislav Petkov, H Peter Anvin,
	David Woodhouse, Lu Baolu, Dave Hansen, Tony Luck, Jacob Jun Pan,
	Dave Jiang, Sohil Mehta, Ravi V Shankar, linux-kernel, x86,
	iommu, Ashok Raj

Ashok,

"Raj, Ashok" <ashok.raj@intel.com> writes:
> On Sun, Apr 26, 2020 at 05:25:06PM +0200, Thomas Gleixner wrote:
>> Just for the record I also suggested to have a proper errorcode in the
>> #GP for ENQCMD and I surely did not suggest to avoid decoding the user
>> instructions.
>
> We certainly discussed the possiblity of adding an error code to 
> identiy #GP due to ENQCMD with our HW architects. 
>
> There are only a few cases that have an error code, like move to segment
> with an invalid value for instance. There were a few but i don't
> recall that entire list. 
>
> Since the error code is 0 in most places, there isn't plumbing in hw to return
> this value in all cases. It appeared that due to some uarch reasons it
> wasn't as simple as it appears to /me sw kinds :-)

Sigh.

> So after some internal discussion we decided to take the current
> approach. Its possible that if the #GP was due to some other reason
> we might #GP another time. Since this wasn't perf or speed path we took
> this lazy approach.

I know that the HW people's mantra is that everything can be fixed in
software and therefore slapping new features into the CPUs can be done
without thinking about the consequeses.

But we all know from painful experience that this is fundamentally wrong
unless there is a really compelling reason.

For new features there is absolutely no reason at all.

Can HW people pretty please understand that hardware and software have
to be co-designed and not dictated by 'some uarch reasons'. This is
nothing fundamentally new. This problem existed 30+ years ago, is well
documented and has been ignored forever. I'm tired of that, really.

But as this seems to be unsolvable for the problem at hand can you
please document the inability, unwillingness or whatever in the
changelog?

The question why this brand new_ ENQCMD + invalid PASID induced #GP does
not generate an useful error code and needs heuristics to be dealt with
is pretty obvious.

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 6/7] x86/traps: Fix up invalid PASID
@ 2020-04-28  0:54         ` Thomas Gleixner
  0 siblings, 0 replies; 74+ messages in thread
From: Thomas Gleixner @ 2020-04-28  0:54 UTC (permalink / raw)
  To: Raj, Ashok
  Cc: Fenghua Yu, Tony Luck, Dave Jiang, Ashok Raj, Ravi V Shankar,
	x86, linux-kernel, Dave Hansen, iommu, Ingo Molnar,
	Borislav Petkov, Jacob Jun Pan, H Peter Anvin, David Woodhouse

Ashok,

"Raj, Ashok" <ashok.raj@intel.com> writes:
> On Sun, Apr 26, 2020 at 05:25:06PM +0200, Thomas Gleixner wrote:
>> Just for the record I also suggested to have a proper errorcode in the
>> #GP for ENQCMD and I surely did not suggest to avoid decoding the user
>> instructions.
>
> We certainly discussed the possiblity of adding an error code to 
> identiy #GP due to ENQCMD with our HW architects. 
>
> There are only a few cases that have an error code, like move to segment
> with an invalid value for instance. There were a few but i don't
> recall that entire list. 
>
> Since the error code is 0 in most places, there isn't plumbing in hw to return
> this value in all cases. It appeared that due to some uarch reasons it
> wasn't as simple as it appears to /me sw kinds :-)

Sigh.

> So after some internal discussion we decided to take the current
> approach. Its possible that if the #GP was due to some other reason
> we might #GP another time. Since this wasn't perf or speed path we took
> this lazy approach.

I know that the HW people's mantra is that everything can be fixed in
software and therefore slapping new features into the CPUs can be done
without thinking about the consequeses.

But we all know from painful experience that this is fundamentally wrong
unless there is a really compelling reason.

For new features there is absolutely no reason at all.

Can HW people pretty please understand that hardware and software have
to be co-designed and not dictated by 'some uarch reasons'. This is
nothing fundamentally new. This problem existed 30+ years ago, is well
documented and has been ignored forever. I'm tired of that, really.

But as this seems to be unsolvable for the problem at hand can you
please document the inability, unwillingness or whatever in the
changelog?

The question why this brand new_ ENQCMD + invalid PASID induced #GP does
not generate an useful error code and needs heuristics to be dealt with
is pretty obvious.

Thanks,

        tglx
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 6/7] x86/traps: Fix up invalid PASID
  2020-04-28  0:54         ` Thomas Gleixner
@ 2020-04-28  1:08           ` Raj, Ashok
  -1 siblings, 0 replies; 74+ messages in thread
From: Raj, Ashok @ 2020-04-28  1:08 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Fenghua Yu, Ingo Molnar, Borislav Petkov, H Peter Anvin,
	David Woodhouse, Lu Baolu, Dave Hansen, Tony Luck, Jacob Jun Pan,
	Dave Jiang, Sohil Mehta, Ravi V Shankar, linux-kernel, x86,
	iommu, Ashok Raj

Hi Thomas,

On Tue, Apr 28, 2020 at 02:54:59AM +0200, Thomas Gleixner wrote:
> Ashok,
> 
> "Raj, Ashok" <ashok.raj@intel.com> writes:
> > On Sun, Apr 26, 2020 at 05:25:06PM +0200, Thomas Gleixner wrote:
> >> Just for the record I also suggested to have a proper errorcode in the
> >> #GP for ENQCMD and I surely did not suggest to avoid decoding the user
> >> instructions.
> >
> > We certainly discussed the possiblity of adding an error code to 
> > identiy #GP due to ENQCMD with our HW architects. 
> >
> > There are only a few cases that have an error code, like move to segment
> > with an invalid value for instance. There were a few but i don't
> > recall that entire list. 
> >
> > Since the error code is 0 in most places, there isn't plumbing in hw to return
> > this value in all cases. It appeared that due to some uarch reasons it
> > wasn't as simple as it appears to /me sw kinds :-)
> 
> Sigh.
> 
> > So after some internal discussion we decided to take the current
> > approach. Its possible that if the #GP was due to some other reason
> > we might #GP another time. Since this wasn't perf or speed path we took
> > this lazy approach.
> 
> I know that the HW people's mantra is that everything can be fixed in
> software and therefore slapping new features into the CPUs can be done
> without thinking about the consequeses.
> 
> But we all know from painful experience that this is fundamentally wrong
> unless there is a really compelling reason.

:-)... I'm still looking for the quote from Linus about RAS before
he went to behavior school.


> 
> For new features there is absolutely no reason at all.
> 
> Can HW people pretty please understand that hardware and software have
> to be co-designed and not dictated by 'some uarch reasons'. This is
> nothing fundamentally new. This problem existed 30+ years ago, is well
> documented and has been ignored forever. I'm tired of that, really.
> 
> But as this seems to be unsolvable for the problem at hand can you
> please document the inability, unwillingness or whatever in the
> changelog?

Most certainly!

> 
> The question why this brand new_ ENQCMD + invalid PASID induced #GP does
> not generate an useful error code and needs heuristics to be dealt with
> is pretty obvious.
> 

Cheers,
Ashok

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 6/7] x86/traps: Fix up invalid PASID
@ 2020-04-28  1:08           ` Raj, Ashok
  0 siblings, 0 replies; 74+ messages in thread
From: Raj, Ashok @ 2020-04-28  1:08 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Fenghua Yu, Tony Luck, Dave Jiang, Ashok Raj, Ravi V Shankar,
	x86, linux-kernel, Dave Hansen, iommu, Ingo Molnar,
	Borislav Petkov, Jacob Jun Pan, H Peter Anvin, David Woodhouse

Hi Thomas,

On Tue, Apr 28, 2020 at 02:54:59AM +0200, Thomas Gleixner wrote:
> Ashok,
> 
> "Raj, Ashok" <ashok.raj@intel.com> writes:
> > On Sun, Apr 26, 2020 at 05:25:06PM +0200, Thomas Gleixner wrote:
> >> Just for the record I also suggested to have a proper errorcode in the
> >> #GP for ENQCMD and I surely did not suggest to avoid decoding the user
> >> instructions.
> >
> > We certainly discussed the possiblity of adding an error code to 
> > identiy #GP due to ENQCMD with our HW architects. 
> >
> > There are only a few cases that have an error code, like move to segment
> > with an invalid value for instance. There were a few but i don't
> > recall that entire list. 
> >
> > Since the error code is 0 in most places, there isn't plumbing in hw to return
> > this value in all cases. It appeared that due to some uarch reasons it
> > wasn't as simple as it appears to /me sw kinds :-)
> 
> Sigh.
> 
> > So after some internal discussion we decided to take the current
> > approach. Its possible that if the #GP was due to some other reason
> > we might #GP another time. Since this wasn't perf or speed path we took
> > this lazy approach.
> 
> I know that the HW people's mantra is that everything can be fixed in
> software and therefore slapping new features into the CPUs can be done
> without thinking about the consequeses.
> 
> But we all know from painful experience that this is fundamentally wrong
> unless there is a really compelling reason.

:-)... I'm still looking for the quote from Linus about RAS before
he went to behavior school.


> 
> For new features there is absolutely no reason at all.
> 
> Can HW people pretty please understand that hardware and software have
> to be co-designed and not dictated by 'some uarch reasons'. This is
> nothing fundamentally new. This problem existed 30+ years ago, is well
> documented and has been ignored forever. I'm tired of that, really.
> 
> But as this seems to be unsolvable for the problem at hand can you
> please document the inability, unwillingness or whatever in the
> changelog?

Most certainly!

> 
> The question why this brand new_ ENQCMD + invalid PASID induced #GP does
> not generate an useful error code and needs heuristics to be dealt with
> is pretty obvious.
> 

Cheers,
Ashok
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 5/7] x86/mmu: Allocate/free PASID
  2020-04-26 14:55     ` Thomas Gleixner
@ 2020-04-28 18:21       ` Jacob Pan (Jun)
  -1 siblings, 0 replies; 74+ messages in thread
From: Jacob Pan (Jun) @ 2020-04-28 18:21 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Fenghua Yu, Ingo Molnar, Borislav Petkov, H Peter Anvin,
	David Woodhouse, Lu Baolu, Dave Hansen, Tony Luck, Ashok Raj,
	Dave Jiang, Sohil Mehta, Ravi V Shankar, linux-kernel, x86,
	iommu, jacob.jun.pan

On Sun, 26 Apr 2020 16:55:25 +0200
Thomas Gleixner <tglx@linutronix.de> wrote:

> Fenghua Yu <fenghua.yu@intel.com> writes:
> 
> > PASID is shared by all threads in a process. So the logical place
> > to keep track of it is in the "mm". Add the field to the
> > architecture specific mm_context_t structure.
> >
> > A PASID is allocated for an "mm" the first time any thread attaches
> > to an SVM capable device. Later device atatches (whether to the
> > same  
> 
> atatches?
> 
> > device or another SVM device) will re-use the same PASID.
> >
> > The PASID is freed when the process exits (so no need to keep
> > reference counts on how many SVM devices are sharing the PASID).  
> 
> I'm not buying that. If there is an outstanding request with the PASID
> of a process then tearing down the process address space and freeing
> the PASID (which might be reused) is fundamentally broken.
> 
Device driver unbind PASID is tied to FD release. So when a process
exits, FD close causes driver to do the following:
1. stops DMA
2. unbind PASID (clears the PASID entry in IOMMU, flush all TLBs, drain
in flight page requests)

For bare metal SVM, if the last mmdrop always happens after FD release,
we can ensure no outstanding requests at the point of ioasid_free().
Perhaps this is a wrong assumption?

For guest SVM, there will be more users of a PASID. I am also
working on adding refcounting to ioasid. ioasid_free() will not release
the PASID back to the pool until all references are dropped.

> > +void __free_pasid(struct mm_struct *mm);
> > +
> >  #endif /* _ASM_X86_IOMMU_H */
> > diff --git a/arch/x86/include/asm/mmu.h b/arch/x86/include/asm/mmu.h
> > index bdeae9291e5c..137bf51f19e6 100644
> > --- a/arch/x86/include/asm/mmu.h
> > +++ b/arch/x86/include/asm/mmu.h
> > @@ -50,6 +50,10 @@ typedef struct {
> >  	u16 pkey_allocation_map;
> >  	s16 execute_only_pkey;
> >  #endif
> > +
> > +#ifdef CONFIG_INTEL_IOMMU_SVM
> > +	int pasid;  
> 
> int? It's a value which gets programmed into the MSR along with the
> valid bit (bit 31) set. 
> 
> >  extern void switch_mm(struct mm_struct *prev, struct mm_struct
> > *next, diff --git a/drivers/iommu/intel-svm.c
> > b/drivers/iommu/intel-svm.c index d7f2a5358900..da718a49e91e 100644
> > --- a/drivers/iommu/intel-svm.c
> > +++ b/drivers/iommu/intel-svm.c
> > @@ -226,6 +226,45 @@ static LIST_HEAD(global_svm_list);
> >  	list_for_each_entry((sdev), &(svm)->devs, list)	\
> >  		if ((d) != (sdev)->dev) {} else
> >  
> > +/*
> > + * If this mm already has a PASID we can use it. Otherwise
> > allocate a new one.
> > + * Let the caller know if we did an allocation via 'new_pasid'.
> > + */
> > +static int alloc_pasid(struct intel_svm *svm, struct mm_struct *mm,
> > +		       int pasid_max,  bool *new_pasid, int
> > flags)  
> 
> Again, data types please. flags are generally unsigned and not plain
> int. Also pasid_max is certainly not plain int either.
> 
> > +{
> > +	int pasid;
> > +
> > +	/*
> > +	 * Reuse the PASID if the mm already has a PASID and not a
> > private
> > +	 * PASID is requested.
> > +	 */
> > +	if (mm && mm->context.pasid && !(flags &
> > SVM_FLAG_PRIVATE_PASID)) {
> > +		/*
> > +		 * Once a PASID is allocated for this mm, the PASID
> > +		 * stays with the mm until the mm is dropped. Reuse
> > +		 * the PASID which has been already allocated for
> > the
> > +		 * mm instead of allocating a new one.
> > +		 */
> > +		ioasid_set_data(mm->context.pasid, svm);  
> 
> So if the PASID is reused several times for different SVMs then every
> time ioasid_data->private is set to a different SVM. How is that
> supposed to work?
> 
For the lifetime of the mm, there is only one PASID. svm_bind/unbind_mm
could happen many times with different private data: intel_svm.
Multiple devices can bind to the same PASID as well. But private data
don't change within the first bind and last unbind.

> > +		*new_pasid = false;
> > +
> > +		return mm->context.pasid;
> > +	}
> > +
> > +	/*
> > +	 * Allocate a new pasid. Do not use PASID 0, reserved for
> > RID to
> > +	 * PASID.
> > +	 */
> > +	pasid = ioasid_alloc(NULL, PASID_MIN, pasid_max - 1,
> > svm);  
> 
> ioasid_alloc() uses ioasid_t which is
> 
> typedef unsigned int ioasid_t;
> 
> Can we please have consistent types and behaviour all over the place?
> 
> > +	if (pasid == INVALID_IOASID)
> > +		return -ENOSPC;
> > +
> > +	*new_pasid = true;
> > +
> > +	return pasid;
> > +}
> > +
> >  int intel_svm_bind_mm(struct device *dev, int *pasid, int flags,
> > struct svm_dev_ops *ops) {
> >  	struct intel_iommu *iommu = intel_svm_device_to_iommu(dev);
> > @@ -324,6 +363,8 @@ int intel_svm_bind_mm(struct device *dev, int
> > *pasid, int flags, struct svm_dev_ init_rcu_head(&sdev->rcu);
> >  
> >  	if (!svm) {
> > +		bool new_pasid;
> > +
> >  		svm = kzalloc(sizeof(*svm), GFP_KERNEL);
> >  		if (!svm) {
> >  			ret = -ENOMEM;
> > @@ -335,15 +376,13 @@ int intel_svm_bind_mm(struct device *dev, int
> > *pasid, int flags, struct svm_dev_ if (pasid_max >
> > intel_pasid_max_id) pasid_max = intel_pasid_max_id;
> >  
> > -		/* Do not use PASID 0, reserved for RID to PASID */
> > -		svm->pasid = ioasid_alloc(NULL, PASID_MIN,
> > -					  pasid_max - 1, svm);
> > -		if (svm->pasid == INVALID_IOASID) {
> > +		svm->pasid = alloc_pasid(svm, mm, pasid_max,
> > &new_pasid, flags);
> > +		if (svm->pasid < 0) {
> >  			kfree(svm);
> >  			kfree(sdev);
> > -			ret = -ENOSPC;  
> 
> ret gets magically initialized to an error return value, right?
> 
> >  			goto out;
> >  		}
> > +
> >  		svm->notifier.ops = &intel_mmuops;
> >  		svm->mm = mm;
> >  		svm->flags = flags;
> > @@ -353,7 +392,8 @@ int intel_svm_bind_mm(struct device *dev, int
> > *pasid, int flags, struct svm_dev_ if (mm) {
> >  			ret =
> > mmu_notifier_register(&svm->notifier, mm); if (ret) {
> > -				ioasid_free(svm->pasid);
> > +				if (new_pasid)
> > +					ioasid_free(svm->pasid);
> >  				kfree(svm);
> >  				kfree(sdev);
> >  				goto out;
> > @@ -371,12 +411,21 @@ int intel_svm_bind_mm(struct device *dev, int
> > *pasid, int flags, struct svm_dev_ if (ret) {
> >  			if (mm)
> >  				mmu_notifier_unregister(&svm->notifier,
> > mm);
> > -			ioasid_free(svm->pasid);
> > +			if (new_pasid)
> > +				ioasid_free(svm->pasid);
> >  			kfree(svm);
> >  			kfree(sdev);  
> 
> So there are 3 places now freeing svm ad sdev and 2 of them
> conditionally free svm->pasid. Can you please rewrite that to have a
> proper error exit path instead of glueing that stuff into the existing
> mess?
> 
> >  			goto out;
> >  		}
> >  
> > +		if (mm && new_pasid && !(flags &
> > SVM_FLAG_PRIVATE_PASID)) {
> > +			/*
> > +			 * Track the new pasid in the mm. The
> > pasid will be
> > +			 * freed at process exit. Don't track
> > requested
> > +			 * private PASID in the mm.  
> 
> What happens to private PASIDs?
> 
Private PASID feature will be removed. We are in the process of
converting from intel_svm_bind_mm to generic sva_bind_device API.
https://lkml.org/lkml/2020/3/23/1022

Thanks,

Jacob

> Thanks,
> 
>         tglx


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 5/7] x86/mmu: Allocate/free PASID
@ 2020-04-28 18:21       ` Jacob Pan (Jun)
  0 siblings, 0 replies; 74+ messages in thread
From: Jacob Pan (Jun) @ 2020-04-28 18:21 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Fenghua Yu, Tony Luck, Dave Jiang, Ashok Raj, Ravi V Shankar,
	x86, linux-kernel, Dave Hansen, iommu, Ingo Molnar,
	Borislav Petkov, jacob.jun.pan, H Peter Anvin, David Woodhouse

On Sun, 26 Apr 2020 16:55:25 +0200
Thomas Gleixner <tglx@linutronix.de> wrote:

> Fenghua Yu <fenghua.yu@intel.com> writes:
> 
> > PASID is shared by all threads in a process. So the logical place
> > to keep track of it is in the "mm". Add the field to the
> > architecture specific mm_context_t structure.
> >
> > A PASID is allocated for an "mm" the first time any thread attaches
> > to an SVM capable device. Later device atatches (whether to the
> > same  
> 
> atatches?
> 
> > device or another SVM device) will re-use the same PASID.
> >
> > The PASID is freed when the process exits (so no need to keep
> > reference counts on how many SVM devices are sharing the PASID).  
> 
> I'm not buying that. If there is an outstanding request with the PASID
> of a process then tearing down the process address space and freeing
> the PASID (which might be reused) is fundamentally broken.
> 
Device driver unbind PASID is tied to FD release. So when a process
exits, FD close causes driver to do the following:
1. stops DMA
2. unbind PASID (clears the PASID entry in IOMMU, flush all TLBs, drain
in flight page requests)

For bare metal SVM, if the last mmdrop always happens after FD release,
we can ensure no outstanding requests at the point of ioasid_free().
Perhaps this is a wrong assumption?

For guest SVM, there will be more users of a PASID. I am also
working on adding refcounting to ioasid. ioasid_free() will not release
the PASID back to the pool until all references are dropped.

> > +void __free_pasid(struct mm_struct *mm);
> > +
> >  #endif /* _ASM_X86_IOMMU_H */
> > diff --git a/arch/x86/include/asm/mmu.h b/arch/x86/include/asm/mmu.h
> > index bdeae9291e5c..137bf51f19e6 100644
> > --- a/arch/x86/include/asm/mmu.h
> > +++ b/arch/x86/include/asm/mmu.h
> > @@ -50,6 +50,10 @@ typedef struct {
> >  	u16 pkey_allocation_map;
> >  	s16 execute_only_pkey;
> >  #endif
> > +
> > +#ifdef CONFIG_INTEL_IOMMU_SVM
> > +	int pasid;  
> 
> int? It's a value which gets programmed into the MSR along with the
> valid bit (bit 31) set. 
> 
> >  extern void switch_mm(struct mm_struct *prev, struct mm_struct
> > *next, diff --git a/drivers/iommu/intel-svm.c
> > b/drivers/iommu/intel-svm.c index d7f2a5358900..da718a49e91e 100644
> > --- a/drivers/iommu/intel-svm.c
> > +++ b/drivers/iommu/intel-svm.c
> > @@ -226,6 +226,45 @@ static LIST_HEAD(global_svm_list);
> >  	list_for_each_entry((sdev), &(svm)->devs, list)	\
> >  		if ((d) != (sdev)->dev) {} else
> >  
> > +/*
> > + * If this mm already has a PASID we can use it. Otherwise
> > allocate a new one.
> > + * Let the caller know if we did an allocation via 'new_pasid'.
> > + */
> > +static int alloc_pasid(struct intel_svm *svm, struct mm_struct *mm,
> > +		       int pasid_max,  bool *new_pasid, int
> > flags)  
> 
> Again, data types please. flags are generally unsigned and not plain
> int. Also pasid_max is certainly not plain int either.
> 
> > +{
> > +	int pasid;
> > +
> > +	/*
> > +	 * Reuse the PASID if the mm already has a PASID and not a
> > private
> > +	 * PASID is requested.
> > +	 */
> > +	if (mm && mm->context.pasid && !(flags &
> > SVM_FLAG_PRIVATE_PASID)) {
> > +		/*
> > +		 * Once a PASID is allocated for this mm, the PASID
> > +		 * stays with the mm until the mm is dropped. Reuse
> > +		 * the PASID which has been already allocated for
> > the
> > +		 * mm instead of allocating a new one.
> > +		 */
> > +		ioasid_set_data(mm->context.pasid, svm);  
> 
> So if the PASID is reused several times for different SVMs then every
> time ioasid_data->private is set to a different SVM. How is that
> supposed to work?
> 
For the lifetime of the mm, there is only one PASID. svm_bind/unbind_mm
could happen many times with different private data: intel_svm.
Multiple devices can bind to the same PASID as well. But private data
don't change within the first bind and last unbind.

> > +		*new_pasid = false;
> > +
> > +		return mm->context.pasid;
> > +	}
> > +
> > +	/*
> > +	 * Allocate a new pasid. Do not use PASID 0, reserved for
> > RID to
> > +	 * PASID.
> > +	 */
> > +	pasid = ioasid_alloc(NULL, PASID_MIN, pasid_max - 1,
> > svm);  
> 
> ioasid_alloc() uses ioasid_t which is
> 
> typedef unsigned int ioasid_t;
> 
> Can we please have consistent types and behaviour all over the place?
> 
> > +	if (pasid == INVALID_IOASID)
> > +		return -ENOSPC;
> > +
> > +	*new_pasid = true;
> > +
> > +	return pasid;
> > +}
> > +
> >  int intel_svm_bind_mm(struct device *dev, int *pasid, int flags,
> > struct svm_dev_ops *ops) {
> >  	struct intel_iommu *iommu = intel_svm_device_to_iommu(dev);
> > @@ -324,6 +363,8 @@ int intel_svm_bind_mm(struct device *dev, int
> > *pasid, int flags, struct svm_dev_ init_rcu_head(&sdev->rcu);
> >  
> >  	if (!svm) {
> > +		bool new_pasid;
> > +
> >  		svm = kzalloc(sizeof(*svm), GFP_KERNEL);
> >  		if (!svm) {
> >  			ret = -ENOMEM;
> > @@ -335,15 +376,13 @@ int intel_svm_bind_mm(struct device *dev, int
> > *pasid, int flags, struct svm_dev_ if (pasid_max >
> > intel_pasid_max_id) pasid_max = intel_pasid_max_id;
> >  
> > -		/* Do not use PASID 0, reserved for RID to PASID */
> > -		svm->pasid = ioasid_alloc(NULL, PASID_MIN,
> > -					  pasid_max - 1, svm);
> > -		if (svm->pasid == INVALID_IOASID) {
> > +		svm->pasid = alloc_pasid(svm, mm, pasid_max,
> > &new_pasid, flags);
> > +		if (svm->pasid < 0) {
> >  			kfree(svm);
> >  			kfree(sdev);
> > -			ret = -ENOSPC;  
> 
> ret gets magically initialized to an error return value, right?
> 
> >  			goto out;
> >  		}
> > +
> >  		svm->notifier.ops = &intel_mmuops;
> >  		svm->mm = mm;
> >  		svm->flags = flags;
> > @@ -353,7 +392,8 @@ int intel_svm_bind_mm(struct device *dev, int
> > *pasid, int flags, struct svm_dev_ if (mm) {
> >  			ret =
> > mmu_notifier_register(&svm->notifier, mm); if (ret) {
> > -				ioasid_free(svm->pasid);
> > +				if (new_pasid)
> > +					ioasid_free(svm->pasid);
> >  				kfree(svm);
> >  				kfree(sdev);
> >  				goto out;
> > @@ -371,12 +411,21 @@ int intel_svm_bind_mm(struct device *dev, int
> > *pasid, int flags, struct svm_dev_ if (ret) {
> >  			if (mm)
> >  				mmu_notifier_unregister(&svm->notifier,
> > mm);
> > -			ioasid_free(svm->pasid);
> > +			if (new_pasid)
> > +				ioasid_free(svm->pasid);
> >  			kfree(svm);
> >  			kfree(sdev);  
> 
> So there are 3 places now freeing svm ad sdev and 2 of them
> conditionally free svm->pasid. Can you please rewrite that to have a
> proper error exit path instead of glueing that stuff into the existing
> mess?
> 
> >  			goto out;
> >  		}
> >  
> > +		if (mm && new_pasid && !(flags &
> > SVM_FLAG_PRIVATE_PASID)) {
> > +			/*
> > +			 * Track the new pasid in the mm. The
> > pasid will be
> > +			 * freed at process exit. Don't track
> > requested
> > +			 * private PASID in the mm.  
> 
> What happens to private PASIDs?
> 
Private PASID feature will be removed. We are in the process of
converting from intel_svm_bind_mm to generic sva_bind_device API.
https://lkml.org/lkml/2020/3/23/1022

Thanks,

Jacob

> Thanks,
> 
>         tglx

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 5/7] x86/mmu: Allocate/free PASID
  2020-04-28 18:21       ` Jacob Pan (Jun)
@ 2020-04-28 18:54         ` Thomas Gleixner
  -1 siblings, 0 replies; 74+ messages in thread
From: Thomas Gleixner @ 2020-04-28 18:54 UTC (permalink / raw)
  To: Jacob Pan (Jun)
  Cc: Fenghua Yu, Ingo Molnar, Borislav Petkov, H Peter Anvin,
	David Woodhouse, Lu Baolu, Dave Hansen, Tony Luck, Ashok Raj,
	Dave Jiang, Sohil Mehta, Ravi V Shankar, linux-kernel, x86,
	iommu, jacob.jun.pan

"Jacob Pan (Jun)" <jacob.jun.pan@intel.com> writes:
> On Sun, 26 Apr 2020 16:55:25 +0200
> Thomas Gleixner <tglx@linutronix.de> wrote:
>> Fenghua Yu <fenghua.yu@intel.com> writes:
>> > The PASID is freed when the process exits (so no need to keep
>> > reference counts on how many SVM devices are sharing the PASID).  
>> 
>> I'm not buying that. If there is an outstanding request with the PASID
>> of a process then tearing down the process address space and freeing
>> the PASID (which might be reused) is fundamentally broken.
>> 
> Device driver unbind PASID is tied to FD release. So when a process
> exits, FD close causes driver to do the following:
>
> 1. stops DMA
> 2. unbind PASID (clears the PASID entry in IOMMU, flush all TLBs, drain
> in flight page requests)

Fair enough. Explaining that somewhere might be helpful.

> For bare metal SVM, if the last mmdrop always happens after FD release,
> we can ensure no outstanding requests at the point of ioasid_free().
> Perhaps this is a wrong assumption?

If fd release cleans up then how should there be something in flight at
the final mmdrop?

> For guest SVM, there will be more users of a PASID. I am also
> working on adding refcounting to ioasid. ioasid_free() will not release
> the PASID back to the pool until all references are dropped.

What does more users mean?

>> > +	if (mm && mm->context.pasid && !(flags &
>> > SVM_FLAG_PRIVATE_PASID)) {
>> > +		/*
>> > +		 * Once a PASID is allocated for this mm, the PASID
>> > +		 * stays with the mm until the mm is dropped. Reuse
>> > +		 * the PASID which has been already allocated for
>> > the
>> > +		 * mm instead of allocating a new one.
>> > +		 */
>> > +		ioasid_set_data(mm->context.pasid, svm);  
>> 
>> So if the PASID is reused several times for different SVMs then every
>> time ioasid_data->private is set to a different SVM. How is that
>> supposed to work?
>> 
> For the lifetime of the mm, there is only one PASID. svm_bind/unbind_mm
> could happen many times with different private data: intel_svm.
> Multiple devices can bind to the same PASID as well. But private data
> don't change within the first bind and last unbind.

Ok. I read through that spaghetti of intel_svm_bind_mm() again and now I
start to get an idea how that is supposed to work. What a mess.

That function really wants to be restructured in a way so it is
understandable to mere mortals. 

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 5/7] x86/mmu: Allocate/free PASID
@ 2020-04-28 18:54         ` Thomas Gleixner
  0 siblings, 0 replies; 74+ messages in thread
From: Thomas Gleixner @ 2020-04-28 18:54 UTC (permalink / raw)
  To: Jacob Pan (Jun)
  Cc: Fenghua Yu, Tony Luck, Dave Jiang, Ashok Raj, Ravi V Shankar,
	x86, linux-kernel, Dave Hansen, iommu, Ingo Molnar,
	Borislav Petkov, jacob.jun.pan, H Peter Anvin, David Woodhouse

"Jacob Pan (Jun)" <jacob.jun.pan@intel.com> writes:
> On Sun, 26 Apr 2020 16:55:25 +0200
> Thomas Gleixner <tglx@linutronix.de> wrote:
>> Fenghua Yu <fenghua.yu@intel.com> writes:
>> > The PASID is freed when the process exits (so no need to keep
>> > reference counts on how many SVM devices are sharing the PASID).  
>> 
>> I'm not buying that. If there is an outstanding request with the PASID
>> of a process then tearing down the process address space and freeing
>> the PASID (which might be reused) is fundamentally broken.
>> 
> Device driver unbind PASID is tied to FD release. So when a process
> exits, FD close causes driver to do the following:
>
> 1. stops DMA
> 2. unbind PASID (clears the PASID entry in IOMMU, flush all TLBs, drain
> in flight page requests)

Fair enough. Explaining that somewhere might be helpful.

> For bare metal SVM, if the last mmdrop always happens after FD release,
> we can ensure no outstanding requests at the point of ioasid_free().
> Perhaps this is a wrong assumption?

If fd release cleans up then how should there be something in flight at
the final mmdrop?

> For guest SVM, there will be more users of a PASID. I am also
> working on adding refcounting to ioasid. ioasid_free() will not release
> the PASID back to the pool until all references are dropped.

What does more users mean?

>> > +	if (mm && mm->context.pasid && !(flags &
>> > SVM_FLAG_PRIVATE_PASID)) {
>> > +		/*
>> > +		 * Once a PASID is allocated for this mm, the PASID
>> > +		 * stays with the mm until the mm is dropped. Reuse
>> > +		 * the PASID which has been already allocated for
>> > the
>> > +		 * mm instead of allocating a new one.
>> > +		 */
>> > +		ioasid_set_data(mm->context.pasid, svm);  
>> 
>> So if the PASID is reused several times for different SVMs then every
>> time ioasid_data->private is set to a different SVM. How is that
>> supposed to work?
>> 
> For the lifetime of the mm, there is only one PASID. svm_bind/unbind_mm
> could happen many times with different private data: intel_svm.
> Multiple devices can bind to the same PASID as well. But private data
> don't change within the first bind and last unbind.

Ok. I read through that spaghetti of intel_svm_bind_mm() again and now I
start to get an idea how that is supposed to work. What a mess.

That function really wants to be restructured in a way so it is
understandable to mere mortals. 

Thanks,

        tglx
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 74+ messages in thread

* RE: [PATCH 5/7] x86/mmu: Allocate/free PASID
  2020-04-28 18:54         ` Thomas Gleixner
@ 2020-04-28 19:07           ` Luck, Tony
  -1 siblings, 0 replies; 74+ messages in thread
From: Luck, Tony @ 2020-04-28 19:07 UTC (permalink / raw)
  To: Thomas Gleixner, Pan, Jacob jun
  Cc: Yu, Fenghua, Ingo Molnar, Borislav Petkov, H Peter Anvin,
	David Woodhouse, Lu Baolu, Hansen, Dave, Raj, Ashok, Jiang, Dave,
	Mehta, Sohil, Shankar, Ravi V, linux-kernel, x86, iommu, Pan,
	Jacob jun

> If fd release cleans up then how should there be something in flight at
> the final mmdrop?

ENQCMD from the user is only synchronous in that it lets the user know their
request has been added to a queue (or not).  Execution of the request may happen
later (if the device is busy working on requests for other users).  The request will
take some time to complete. Someone told me the theoretical worst case once,
which I've since forgotten, but it can be a long time.

So the driver needs to use flush/drain operations to make sure all the in-flight
work has completed before releasing/re-using the PASID.

-Tony

^ permalink raw reply	[flat|nested] 74+ messages in thread

* RE: [PATCH 5/7] x86/mmu: Allocate/free PASID
@ 2020-04-28 19:07           ` Luck, Tony
  0 siblings, 0 replies; 74+ messages in thread
From: Luck, Tony @ 2020-04-28 19:07 UTC (permalink / raw)
  To: Thomas Gleixner, Pan, Jacob jun
  Cc: Yu, Fenghua, Jiang, Dave, Raj, Ashok, Shankar, Ravi V, x86,
	linux-kernel, Hansen, Dave, iommu, Ingo Molnar, Borislav Petkov,
	Pan,  Jacob jun, H Peter Anvin, David Woodhouse

> If fd release cleans up then how should there be something in flight at
> the final mmdrop?

ENQCMD from the user is only synchronous in that it lets the user know their
request has been added to a queue (or not).  Execution of the request may happen
later (if the device is busy working on requests for other users).  The request will
take some time to complete. Someone told me the theoretical worst case once,
which I've since forgotten, but it can be a long time.

So the driver needs to use flush/drain operations to make sure all the in-flight
work has completed before releasing/re-using the PASID.

-Tony
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 5/7] x86/mmu: Allocate/free PASID
  2020-04-28 18:54         ` Thomas Gleixner
@ 2020-04-28 20:40           ` Jacob Pan (Jun)
  -1 siblings, 0 replies; 74+ messages in thread
From: Jacob Pan (Jun) @ 2020-04-28 20:40 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Fenghua Yu, Ingo Molnar, Borislav Petkov, H Peter Anvin,
	David Woodhouse, Lu Baolu, Dave Hansen, Tony Luck, Ashok Raj,
	Dave Jiang, Sohil Mehta, Ravi V Shankar, linux-kernel, x86,
	iommu, jacob.jun.pan, jacob.jun.pan

On Tue, 28 Apr 2020 20:54:01 +0200
Thomas Gleixner <tglx@linutronix.de> wrote:

> "Jacob Pan (Jun)" <jacob.jun.pan@intel.com> writes:
> > On Sun, 26 Apr 2020 16:55:25 +0200
> > Thomas Gleixner <tglx@linutronix.de> wrote:  
> >> Fenghua Yu <fenghua.yu@intel.com> writes:  
> >> > The PASID is freed when the process exits (so no need to keep
> >> > reference counts on how many SVM devices are sharing the
> >> > PASID).    
> >> 
> >> I'm not buying that. If there is an outstanding request with the
> >> PASID of a process then tearing down the process address space and
> >> freeing the PASID (which might be reused) is fundamentally broken.
> >>   
> > Device driver unbind PASID is tied to FD release. So when a process
> > exits, FD close causes driver to do the following:
> >
> > 1. stops DMA
> > 2. unbind PASID (clears the PASID entry in IOMMU, flush all TLBs,
> > drain in flight page requests)  
> 
> Fair enough. Explaining that somewhere might be helpful.
> 
Will do. I plan to document this in a kernel doc for IOASID/PASID
lifecycle management.

> > For bare metal SVM, if the last mmdrop always happens after FD
> > release, we can ensure no outstanding requests at the point of
> > ioasid_free(). Perhaps this is a wrong assumption?  
> 
> If fd release cleans up then how should there be something in flight
> at the final mmdrop?
> 
> > For guest SVM, there will be more users of a PASID. I am also
> > working on adding refcounting to ioasid. ioasid_free() will not
> > release the PASID back to the pool until all references are
> > dropped.  
> 
> What does more users mean?
For VT-d, a PASID can be used by VFIO, IOMMU driver, KVM, and Virtual
Device Composition Module (VDCM*) at the same time.

*https://software.intel.com/en-us/download/intel-data-streaming-accelerator-preliminary-architecture-specification

There are HW context associated with the PASID in IOMMU, KVM, and VDCM.
So before the lifetime of the PASID is over, clean up must be done in
all of the above. PASID cannot be reclaimed until the last user drops
its reference. Our plan is to do notification and refcouting.


> 
> >> > +	if (mm && mm->context.pasid && !(flags &
> >> > SVM_FLAG_PRIVATE_PASID)) {
> >> > +		/*
> >> > +		 * Once a PASID is allocated for this mm, the
> >> > PASID
> >> > +		 * stays with the mm until the mm is dropped.
> >> > Reuse
> >> > +		 * the PASID which has been already allocated
> >> > for the
> >> > +		 * mm instead of allocating a new one.
> >> > +		 */
> >> > +		ioasid_set_data(mm->context.pasid, svm);    
> >> 
> >> So if the PASID is reused several times for different SVMs then
> >> every time ioasid_data->private is set to a different SVM. How is
> >> that supposed to work?
> >>   
> > For the lifetime of the mm, there is only one PASID.
> > svm_bind/unbind_mm could happen many times with different private
> > data: intel_svm. Multiple devices can bind to the same PASID as
> > well. But private data don't change within the first bind and last
> > unbind.  
> 
> Ok. I read through that spaghetti of intel_svm_bind_mm() again and
> now I start to get an idea how that is supposed to work. What a mess.
> 
> That function really wants to be restructured in a way so it is
> understandable to mere mortals. 
> 

Agreed. We are adding many new features and converging with generic
sva_bind_device. Things will get more clear after we have fewer moving
pieces.


> Thanks,
> 
>         tglx


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 5/7] x86/mmu: Allocate/free PASID
@ 2020-04-28 20:40           ` Jacob Pan (Jun)
  0 siblings, 0 replies; 74+ messages in thread
From: Jacob Pan (Jun) @ 2020-04-28 20:40 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Fenghua Yu, Tony Luck, Dave Jiang, Ashok Raj, Ravi V Shankar,
	x86, linux-kernel, Dave Hansen, iommu, Ingo Molnar,
	Borislav Petkov, jacob.jun.pan, H Peter Anvin, David Woodhouse

On Tue, 28 Apr 2020 20:54:01 +0200
Thomas Gleixner <tglx@linutronix.de> wrote:

> "Jacob Pan (Jun)" <jacob.jun.pan@intel.com> writes:
> > On Sun, 26 Apr 2020 16:55:25 +0200
> > Thomas Gleixner <tglx@linutronix.de> wrote:  
> >> Fenghua Yu <fenghua.yu@intel.com> writes:  
> >> > The PASID is freed when the process exits (so no need to keep
> >> > reference counts on how many SVM devices are sharing the
> >> > PASID).    
> >> 
> >> I'm not buying that. If there is an outstanding request with the
> >> PASID of a process then tearing down the process address space and
> >> freeing the PASID (which might be reused) is fundamentally broken.
> >>   
> > Device driver unbind PASID is tied to FD release. So when a process
> > exits, FD close causes driver to do the following:
> >
> > 1. stops DMA
> > 2. unbind PASID (clears the PASID entry in IOMMU, flush all TLBs,
> > drain in flight page requests)  
> 
> Fair enough. Explaining that somewhere might be helpful.
> 
Will do. I plan to document this in a kernel doc for IOASID/PASID
lifecycle management.

> > For bare metal SVM, if the last mmdrop always happens after FD
> > release, we can ensure no outstanding requests at the point of
> > ioasid_free(). Perhaps this is a wrong assumption?  
> 
> If fd release cleans up then how should there be something in flight
> at the final mmdrop?
> 
> > For guest SVM, there will be more users of a PASID. I am also
> > working on adding refcounting to ioasid. ioasid_free() will not
> > release the PASID back to the pool until all references are
> > dropped.  
> 
> What does more users mean?
For VT-d, a PASID can be used by VFIO, IOMMU driver, KVM, and Virtual
Device Composition Module (VDCM*) at the same time.

*https://software.intel.com/en-us/download/intel-data-streaming-accelerator-preliminary-architecture-specification

There are HW context associated with the PASID in IOMMU, KVM, and VDCM.
So before the lifetime of the PASID is over, clean up must be done in
all of the above. PASID cannot be reclaimed until the last user drops
its reference. Our plan is to do notification and refcouting.


> 
> >> > +	if (mm && mm->context.pasid && !(flags &
> >> > SVM_FLAG_PRIVATE_PASID)) {
> >> > +		/*
> >> > +		 * Once a PASID is allocated for this mm, the
> >> > PASID
> >> > +		 * stays with the mm until the mm is dropped.
> >> > Reuse
> >> > +		 * the PASID which has been already allocated
> >> > for the
> >> > +		 * mm instead of allocating a new one.
> >> > +		 */
> >> > +		ioasid_set_data(mm->context.pasid, svm);    
> >> 
> >> So if the PASID is reused several times for different SVMs then
> >> every time ioasid_data->private is set to a different SVM. How is
> >> that supposed to work?
> >>   
> > For the lifetime of the mm, there is only one PASID.
> > svm_bind/unbind_mm could happen many times with different private
> > data: intel_svm. Multiple devices can bind to the same PASID as
> > well. But private data don't change within the first bind and last
> > unbind.  
> 
> Ok. I read through that spaghetti of intel_svm_bind_mm() again and
> now I start to get an idea how that is supposed to work. What a mess.
> 
> That function really wants to be restructured in a way so it is
> understandable to mere mortals. 
> 

Agreed. We are adding many new features and converging with generic
sva_bind_device. Things will get more clear after we have fewer moving
pieces.


> Thanks,
> 
>         tglx

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 5/7] x86/mmu: Allocate/free PASID
  2020-04-28 19:07           ` Luck, Tony
@ 2020-04-28 20:42             ` Jacob Pan (Jun)
  -1 siblings, 0 replies; 74+ messages in thread
From: Jacob Pan (Jun) @ 2020-04-28 20:42 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Thomas Gleixner, Yu, Fenghua, Ingo Molnar, Borislav Petkov,
	H Peter Anvin, David Woodhouse, Lu Baolu, Hansen, Dave, Raj,
	Ashok, Jiang, Dave, Mehta, Sohil, Shankar, Ravi V, linux-kernel,
	x86, iommu, jacob.jun.pan

On Tue, 28 Apr 2020 12:07:25 -0700
"Luck, Tony" <tony.luck@intel.com> wrote:

> > If fd release cleans up then how should there be something in
> > flight at the final mmdrop?  
> 
> ENQCMD from the user is only synchronous in that it lets the user
> know their request has been added to a queue (or not).  Execution of
> the request may happen later (if the device is busy working on
> requests for other users).  The request will take some time to
> complete. Someone told me the theoretical worst case once, which I've
> since forgotten, but it can be a long time.
> 
> So the driver needs to use flush/drain operations to make sure all
> the in-flight work has completed before releasing/re-using the PASID.
> 
Are you suggesting we should let driver also hold a reference of the
PASID?

> -Tony


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 5/7] x86/mmu: Allocate/free PASID
@ 2020-04-28 20:42             ` Jacob Pan (Jun)
  0 siblings, 0 replies; 74+ messages in thread
From: Jacob Pan (Jun) @ 2020-04-28 20:42 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Yu, Fenghua, Jiang, Dave, Raj, Ashok, Shankar,  Ravi V, x86,
	linux-kernel, Hansen, Dave, iommu, Ingo Molnar, Borislav Petkov,
	jacob.jun.pan, H Peter Anvin, Thomas Gleixner, David Woodhouse

On Tue, 28 Apr 2020 12:07:25 -0700
"Luck, Tony" <tony.luck@intel.com> wrote:

> > If fd release cleans up then how should there be something in
> > flight at the final mmdrop?  
> 
> ENQCMD from the user is only synchronous in that it lets the user
> know their request has been added to a queue (or not).  Execution of
> the request may happen later (if the device is busy working on
> requests for other users).  The request will take some time to
> complete. Someone told me the theoretical worst case once, which I've
> since forgotten, but it can be a long time.
> 
> So the driver needs to use flush/drain operations to make sure all
> the in-flight work has completed before releasing/re-using the PASID.
> 
Are you suggesting we should let driver also hold a reference of the
PASID?

> -Tony

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 5/7] x86/mmu: Allocate/free PASID
  2020-04-26 14:55     ` Thomas Gleixner
@ 2020-04-28 20:57       ` Fenghua Yu
  -1 siblings, 0 replies; 74+ messages in thread
From: Fenghua Yu @ 2020-04-28 20:57 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Ingo Molnar, Borislav Petkov, H Peter Anvin, David Woodhouse,
	Lu Baolu, Dave Hansen, Tony Luck, Ashok Raj, Jacob Jun Pan,
	Dave Jiang, Sohil Mehta, Ravi V Shankar, linux-kernel, x86,
	iommu

On Sun, Apr 26, 2020 at 04:55:25PM +0200, Thomas Gleixner wrote:
> Fenghua Yu <fenghua.yu@intel.com> writes:
> > diff --git a/arch/x86/include/asm/mmu.h b/arch/x86/include/asm/mmu.h
> > index bdeae9291e5c..137bf51f19e6 100644
> > --- a/arch/x86/include/asm/mmu.h
> > +++ b/arch/x86/include/asm/mmu.h
> > @@ -50,6 +50,10 @@ typedef struct {
> >  	u16 pkey_allocation_map;
> >  	s16 execute_only_pkey;
> >  #endif
> > +
> > +#ifdef CONFIG_INTEL_IOMMU_SVM
> > +	int pasid;
> 
> int? It's a value which gets programmed into the MSR along with the
> valid bit (bit 31) set. 

BTW, ARM is working on PASID as well. Christoph suggested that the PASID
should be defined in mm_struct instead of mm->context so that both ARM and X86
can access it:
https://lore.kernel.org/linux-iommu/20200414170252.714402-1-jean-philippe@linaro.org/T/#mb57110ffe1aaa24750eeea4f93b611f0d1913911

So I will define "pasid" to mm_struct in a separate patch in the next version.

Thanks.

-Fenghua


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 5/7] x86/mmu: Allocate/free PASID
@ 2020-04-28 20:57       ` Fenghua Yu
  0 siblings, 0 replies; 74+ messages in thread
From: Fenghua Yu @ 2020-04-28 20:57 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Ravi V Shankar, Tony Luck, Dave Jiang, Ashok Raj, x86,
	linux-kernel, Dave Hansen, iommu, Ingo Molnar, Borislav Petkov,
	Jacob Jun Pan, H Peter Anvin, David Woodhouse

On Sun, Apr 26, 2020 at 04:55:25PM +0200, Thomas Gleixner wrote:
> Fenghua Yu <fenghua.yu@intel.com> writes:
> > diff --git a/arch/x86/include/asm/mmu.h b/arch/x86/include/asm/mmu.h
> > index bdeae9291e5c..137bf51f19e6 100644
> > --- a/arch/x86/include/asm/mmu.h
> > +++ b/arch/x86/include/asm/mmu.h
> > @@ -50,6 +50,10 @@ typedef struct {
> >  	u16 pkey_allocation_map;
> >  	s16 execute_only_pkey;
> >  #endif
> > +
> > +#ifdef CONFIG_INTEL_IOMMU_SVM
> > +	int pasid;
> 
> int? It's a value which gets programmed into the MSR along with the
> valid bit (bit 31) set. 

BTW, ARM is working on PASID as well. Christoph suggested that the PASID
should be defined in mm_struct instead of mm->context so that both ARM and X86
can access it:
https://lore.kernel.org/linux-iommu/20200414170252.714402-1-jean-philippe@linaro.org/T/#mb57110ffe1aaa24750eeea4f93b611f0d1913911

So I will define "pasid" to mm_struct in a separate patch in the next version.

Thanks.

-Fenghua

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 74+ messages in thread

* RE: [PATCH 5/7] x86/mmu: Allocate/free PASID
  2020-04-28 20:42             ` Jacob Pan (Jun)
@ 2020-04-28 20:59               ` Luck, Tony
  -1 siblings, 0 replies; 74+ messages in thread
From: Luck, Tony @ 2020-04-28 20:59 UTC (permalink / raw)
  To: Pan, Jacob jun
  Cc: Thomas Gleixner, Yu, Fenghua, Ingo Molnar, Borislav Petkov,
	H Peter Anvin, David Woodhouse, Lu Baolu, Hansen, Dave, Raj,
	Ashok, Jiang, Dave, Mehta, Sohil, Shankar, Ravi V, linux-kernel,
	x86, iommu

>> So the driver needs to use flush/drain operations to make sure all
>> the in-flight work has completed before releasing/re-using the PASID.
>> 
> Are you suggesting we should let driver also hold a reference of the
> PASID?

The sequence for bare metal is:

	process is queuing requests to DSA
	process exits (either deliberately, or crashes, or is killed)
	kernel does exit processing
	DSA driver is called as part of tear down of "mm"
		issues drain/flush commands to ensure that all
		queued operations on the PASID for this mm have
		completed
	PASID can be freed

There's a 1:1 map from "mm" to PASID ... so reference counting seems
like overkill. Once the kernel is in the "exit" path, we know that no more
work can be queued using this PASID.

-Tony

^ permalink raw reply	[flat|nested] 74+ messages in thread

* RE: [PATCH 5/7] x86/mmu: Allocate/free PASID
@ 2020-04-28 20:59               ` Luck, Tony
  0 siblings, 0 replies; 74+ messages in thread
From: Luck, Tony @ 2020-04-28 20:59 UTC (permalink / raw)
  To: Pan, Jacob jun
  Cc: Yu, Fenghua, Jiang, Dave, Raj, Ashok, Shankar,  Ravi V, x86,
	linux-kernel, Hansen, Dave, iommu, Ingo Molnar, Borislav Petkov,
	H Peter Anvin, Thomas Gleixner, David Woodhouse

>> So the driver needs to use flush/drain operations to make sure all
>> the in-flight work has completed before releasing/re-using the PASID.
>> 
> Are you suggesting we should let driver also hold a reference of the
> PASID?

The sequence for bare metal is:

	process is queuing requests to DSA
	process exits (either deliberately, or crashes, or is killed)
	kernel does exit processing
	DSA driver is called as part of tear down of "mm"
		issues drain/flush commands to ensure that all
		queued operations on the PASID for this mm have
		completed
	PASID can be freed

There's a 1:1 map from "mm" to PASID ... so reference counting seems
like overkill. Once the kernel is in the "exit" path, we know that no more
work can be queued using this PASID.

-Tony
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 5/7] x86/mmu: Allocate/free PASID
  2020-04-28 20:59               ` Luck, Tony
@ 2020-04-28 22:13                 ` Jacob Pan (Jun)
  -1 siblings, 0 replies; 74+ messages in thread
From: Jacob Pan (Jun) @ 2020-04-28 22:13 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Thomas Gleixner, Yu, Fenghua, Ingo Molnar, Borislav Petkov,
	H Peter Anvin, David Woodhouse, Lu Baolu, Hansen, Dave, Raj,
	Ashok, Jiang, Dave, Mehta, Sohil, Shankar, Ravi V, linux-kernel,
	x86, iommu, jacob.jun.pan

On Tue, 28 Apr 2020 13:59:43 -0700
"Luck, Tony" <tony.luck@intel.com> wrote:

> >> So the driver needs to use flush/drain operations to make sure all
> >> the in-flight work has completed before releasing/re-using the
> >> PASID. 
> > Are you suggesting we should let driver also hold a reference of the
> > PASID?  
> 
> The sequence for bare metal is:
> 
> 	process is queuing requests to DSA
> 	process exits (either deliberately, or crashes, or is killed)
> 	kernel does exit processing
> 	DSA driver is called as part of tear down of "mm"
> 		issues drain/flush commands to ensure that all
> 		queued operations on the PASID for this mm have
> 		completed
> 	PASID can be freed
> 
> There's a 1:1 map from "mm" to PASID ... so reference counting seems
> like overkill. Once the kernel is in the "exit" path, we know that no
> more work can be queued using this PASID.
> 
There are two users of a PASID, mm and device driver(FD). If
either one is not done with the PASID, it cannot be reclaimed. As you
mentioned, it could take a long time for the driver to abort. If the
abort ends *after* mmdrop, we are in trouble.
If driver drops reference after abort/drain PASID is done, then we are
safe.


> -Tony


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 5/7] x86/mmu: Allocate/free PASID
@ 2020-04-28 22:13                 ` Jacob Pan (Jun)
  0 siblings, 0 replies; 74+ messages in thread
From: Jacob Pan (Jun) @ 2020-04-28 22:13 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Yu, Fenghua, Jiang, Dave, Raj, Ashok, Shankar,  Ravi V, x86,
	linux-kernel, Hansen, Dave, iommu, Ingo Molnar, Borislav Petkov,
	jacob.jun.pan, H Peter Anvin, Thomas Gleixner, David Woodhouse

On Tue, 28 Apr 2020 13:59:43 -0700
"Luck, Tony" <tony.luck@intel.com> wrote:

> >> So the driver needs to use flush/drain operations to make sure all
> >> the in-flight work has completed before releasing/re-using the
> >> PASID. 
> > Are you suggesting we should let driver also hold a reference of the
> > PASID?  
> 
> The sequence for bare metal is:
> 
> 	process is queuing requests to DSA
> 	process exits (either deliberately, or crashes, or is killed)
> 	kernel does exit processing
> 	DSA driver is called as part of tear down of "mm"
> 		issues drain/flush commands to ensure that all
> 		queued operations on the PASID for this mm have
> 		completed
> 	PASID can be freed
> 
> There's a 1:1 map from "mm" to PASID ... so reference counting seems
> like overkill. Once the kernel is in the "exit" path, we know that no
> more work can be queued using this PASID.
> 
There are two users of a PASID, mm and device driver(FD). If
either one is not done with the PASID, it cannot be reclaimed. As you
mentioned, it could take a long time for the driver to abort. If the
abort ends *after* mmdrop, we are in trouble.
If driver drops reference after abort/drain PASID is done, then we are
safe.


> -Tony

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 74+ messages in thread

* RE: [PATCH 5/7] x86/mmu: Allocate/free PASID
  2020-04-28 22:13                 ` Jacob Pan (Jun)
@ 2020-04-28 22:32                   ` Luck, Tony
  -1 siblings, 0 replies; 74+ messages in thread
From: Luck, Tony @ 2020-04-28 22:32 UTC (permalink / raw)
  To: Pan, Jacob jun
  Cc: Thomas Gleixner, Yu, Fenghua, Ingo Molnar, Borislav Petkov,
	H Peter Anvin, David Woodhouse, Lu Baolu, Hansen, Dave, Raj,
	Ashok, Jiang, Dave, Mehta, Sohil, Shankar, Ravi V, linux-kernel,
	x86, iommu

> There are two users of a PASID, mm and device driver(FD). If
> either one is not done with the PASID, it cannot be reclaimed. As you
> mentioned, it could take a long time for the driver to abort. If the
> abort ends *after* mmdrop, we are in trouble.
> If driver drops reference after abort/drain PASID is done, then we are
> safe.

I don't think there should be an abort ... suppose the application requested
the DSA to copy some large block of important results from DDR4 to
persistent memory.  Driver should wait for that copy operation to complete.

Note that for the operation to succeed, the kernel should still be processing
and fixing page faults for the "mm" (some parts of the data that the user wanted
to save to persistent memory may have been paged out).

The wait by the DSA diver needs to by synchronous ... the "mm" cannot be
freed until DSA says all the pending operations have completed.

Even without persistent memory, there are cases where you want the operations
to complete (mmap'd files, shared memory with other processes).

-Tony

^ permalink raw reply	[flat|nested] 74+ messages in thread

* RE: [PATCH 5/7] x86/mmu: Allocate/free PASID
@ 2020-04-28 22:32                   ` Luck, Tony
  0 siblings, 0 replies; 74+ messages in thread
From: Luck, Tony @ 2020-04-28 22:32 UTC (permalink / raw)
  To: Pan, Jacob jun
  Cc: Yu, Fenghua, Jiang, Dave, Raj, Ashok, Shankar,  Ravi V, x86,
	linux-kernel, Hansen, Dave, iommu, Ingo Molnar, Borislav Petkov,
	H Peter Anvin, Thomas Gleixner, David Woodhouse

> There are two users of a PASID, mm and device driver(FD). If
> either one is not done with the PASID, it cannot be reclaimed. As you
> mentioned, it could take a long time for the driver to abort. If the
> abort ends *after* mmdrop, we are in trouble.
> If driver drops reference after abort/drain PASID is done, then we are
> safe.

I don't think there should be an abort ... suppose the application requested
the DSA to copy some large block of important results from DDR4 to
persistent memory.  Driver should wait for that copy operation to complete.

Note that for the operation to succeed, the kernel should still be processing
and fixing page faults for the "mm" (some parts of the data that the user wanted
to save to persistent memory may have been paged out).

The wait by the DSA diver needs to by synchronous ... the "mm" cannot be
freed until DSA says all the pending operations have completed.

Even without persistent memory, there are cases where you want the operations
to complete (mmap'd files, shared memory with other processes).

-Tony
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 74+ messages in thread

end of thread, other threads:[~2020-04-28 22:32 UTC | newest]

Thread overview: 74+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-30 19:33 [PATCH 0/7] x86: tag application address space for devices Fenghua Yu
2020-03-30 19:33 ` Fenghua Yu
2020-03-30 19:33 ` [PATCH 1/7] docs: x86: Add a documentation for ENQCMD Fenghua Yu
2020-03-30 19:33   ` Fenghua Yu
2020-04-26 11:02   ` Thomas Gleixner
2020-04-26 11:02     ` Thomas Gleixner
2020-04-27 20:13     ` Fenghua Yu
2020-04-27 20:13       ` Fenghua Yu
2020-03-30 19:33 ` [PATCH 2/7] x86/cpufeatures: Enumerate ENQCMD and ENQCMDS instructions Fenghua Yu
2020-03-30 19:33   ` Fenghua Yu
2020-04-26 11:06   ` Thomas Gleixner
2020-04-26 11:06     ` Thomas Gleixner
2020-04-27 20:17     ` Fenghua Yu
2020-04-27 20:17       ` Fenghua Yu
2020-03-30 19:33 ` [PATCH 3/7] x86/fpu/xstate: Add supervisor PASID state for ENQCMD feature Fenghua Yu
2020-03-30 19:33   ` Fenghua Yu
2020-04-26 11:17   ` Thomas Gleixner
2020-04-26 11:17     ` Thomas Gleixner
2020-04-27 20:33     ` Fenghua Yu
2020-04-27 20:33       ` Fenghua Yu
2020-03-30 19:33 ` [PATCH 4/7] x86/msr-index: Define IA32_PASID MSR Fenghua Yu
2020-03-30 19:33   ` Fenghua Yu
2020-04-26 11:22   ` Thomas Gleixner
2020-04-26 11:22     ` Thomas Gleixner
2020-04-27 20:50     ` Fenghua Yu
2020-04-27 20:50       ` Fenghua Yu
2020-03-30 19:33 ` [PATCH 5/7] x86/mmu: Allocate/free PASID Fenghua Yu
2020-03-30 19:33   ` Fenghua Yu
2020-04-26 14:55   ` Thomas Gleixner
2020-04-26 14:55     ` Thomas Gleixner
2020-04-27 22:18     ` Fenghua Yu
2020-04-27 22:18       ` Fenghua Yu
2020-04-27 23:44       ` Thomas Gleixner
2020-04-27 23:44         ` Thomas Gleixner
2020-04-28 18:21     ` Jacob Pan (Jun)
2020-04-28 18:21       ` Jacob Pan (Jun)
2020-04-28 18:54       ` Thomas Gleixner
2020-04-28 18:54         ` Thomas Gleixner
2020-04-28 19:07         ` Luck, Tony
2020-04-28 19:07           ` Luck, Tony
2020-04-28 20:42           ` Jacob Pan (Jun)
2020-04-28 20:42             ` Jacob Pan (Jun)
2020-04-28 20:59             ` Luck, Tony
2020-04-28 20:59               ` Luck, Tony
2020-04-28 22:13               ` Jacob Pan (Jun)
2020-04-28 22:13                 ` Jacob Pan (Jun)
2020-04-28 22:32                 ` Luck, Tony
2020-04-28 22:32                   ` Luck, Tony
2020-04-28 20:40         ` Jacob Pan (Jun)
2020-04-28 20:40           ` Jacob Pan (Jun)
2020-04-28 20:57     ` Fenghua Yu
2020-04-28 20:57       ` Fenghua Yu
2020-03-30 19:33 ` [PATCH 6/7] x86/traps: Fix up invalid PASID Fenghua Yu
2020-03-30 19:33   ` Fenghua Yu
2020-04-26 15:25   ` Thomas Gleixner
2020-04-26 15:25     ` Thomas Gleixner
2020-04-27 20:11     ` Fenghua Yu
2020-04-27 20:11       ` Fenghua Yu
2020-04-28  0:13       ` Thomas Gleixner
2020-04-28  0:13         ` Thomas Gleixner
2020-04-27 22:46     ` Raj, Ashok
2020-04-27 22:46       ` Raj, Ashok
2020-04-27 23:08       ` Luck, Tony
2020-04-27 23:08         ` Luck, Tony
2020-04-28  0:20         ` Thomas Gleixner
2020-04-28  0:20           ` Thomas Gleixner
2020-04-28  0:54       ` Thomas Gleixner
2020-04-28  0:54         ` Thomas Gleixner
2020-04-28  1:08         ` Raj, Ashok
2020-04-28  1:08           ` Raj, Ashok
2020-03-30 19:33 ` [PATCH 7/7] x86/process: Clear PASID state for a newly forked/cloned thread Fenghua Yu
2020-03-30 19:33   ` Fenghua Yu
2020-04-22 20:41 ` [PATCH 0/7] x86: tag application address space for devices Fenghua Yu
2020-04-22 20:41   ` Fenghua Yu

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.