dmaengine Archive on lore.kernel.org
 help / color / Atom feed
From: Dave Jiang <dave.jiang@intel.com>
To: vkoul@kernel.org, megha.dey@intel.com, maz@kernel.org,
	bhelgaas@google.com, rafael@kernel.org,
	gregkh@linuxfoundation.org, tglx@linutronix.de, hpa@zytor.com,
	alex.williamson@redhat.com, jacob.jun.pan@intel.com,
	ashok.raj@intel.com, jgg@mellanox.com, yi.l.liu@intel.com,
	baolu.lu@intel.com, kevin.tian@intel.com,
	sanjay.k.kumar@intel.com, tony.luck@intel.com,
	jing.lin@intel.com, dan.j.williams@intel.com,
	kwankhede@nvidia.com, eric.auger@redhat.com, parav@mellanox.com,
	jgg@mellanox.com, rafael@kernel.org, dave.hansen@intel.com,
	netanelg@mellanox.com, shahafs@mellanox.com,
	yan.y.zhao@linux.intel.com, pbonzini@redhat.com,
	samuel.ortiz@intel.com, mona.hossain@intel.com
Cc: dmaengine@vger.kernel.org, linux-kernel@vger.kernel.org,
	x86@kernel.org, linux-pci@vger.kernel.org, kvm@vger.kernel.org
Subject: [PATCH RFC v2 00/18] Add VFIO mediated device support and DEV-MSI support for the idxd driver
Date: Tue, 21 Jul 2020 09:02:15 -0700
Message-ID: <159534667974.28840.2045034360240786644.stgit@djiang5-desk3.ch.intel.com> (raw)

v2:
IMS (now dev-msi):
With recommendations from Jason/Thomas/Dan on making IMS more generic:
Pass a non-pci generic device(struct device) for IMS management instead of mdev
Remove all references to mdev and symbol_get/put
Remove all references to IMS in common code and replace with dev-msi
remove dynamic allocation of platform-msi interrupts: no groups,no new msi list or list helpers
Create a generic dev-msi domain with and without interrupt remapping enabled.
Introduce dev_msi_domain_alloc_irqs and dev_msi_domain_free_irqs apis

mdev: 
Removing unrelated bits from SVA enabling that’s not necessary for the submission. (Kevin)
Restructured entire mdev driver series to make reviewing easier (Kevin)
Made rw emulation more robust (Kevin)
Removed uuid wq type and added single dedicated wq type (Kevin)
Locking fixes for vdev (Yan Zhao)
VFIO MSIX trigger fixes (Yan Zhao)

Link to previous discussions with Jason:
https://lore.kernel.org/lkml/57296ad1-20fe-caf2-b83f-46d823ca0b5f@intel.com/
The emulation part that can be moved to user space is very small due to the majority of the
emulations being control bits and need to reside in the kernel. We can revisit the necessity of
moving the small emulation part to userspace and required architectural changes at a later time.

This RFC series has been reviewed by Dan Williams <dan.j.williams@intel.com>

The actual code can be independent of the stage 2 driver code submission that adds support for SVM,
ENQCMD(S), PASID, and shared workqueues. This code series will match the support of the 5.6 kernel
(stage 1) driver but on guest. The code is dependent on Baolu’s iommu aux-domain API extensions
patches that’s still in process of being reviewed:
https://lkml.org/lkml/2020/7/14/48

Stage 1 of the driver has been accepted in v5.6 kernel. It supports dedicated workqueue (wq)
without Shared Virtual Memory (SVM) support. Stage 2 supports shared wq and SVM. It is pending
upstream review and targeting kernel v5.9.

VFIO mediated device framework allows vendor drivers to wrap a portion of device resources into
virtual devices (mdev). Each mdev can be assigned to different guest using the same set of VFIO
uAPIs as assigning a physical device. Accessing to the mdev resource is served with mixed policies.
For example, vendor drivers typically mark data-path interface as pass-through for fast guest
operations, and then trap-and-mediate the control-path interface to avoid undesired interference
between mdevs. Some level of emulation is necessary behind vfio mdev to compose the virtual device
interface. 

This series brings mdev to idxd driver to enable Intel Scalable IOV (SIOV), a hardware-assisted
mediated pass-through technology. SIOV makes each DSA wq independently assignable through
PASID-granular resource/DMA isolation. It helps improve scalability and reduces mediation
complexity against purely software-based mdev implementations. Each assigned wq is configured by
host and exposed to the guest in a read-only configuration mode, which allows the guest to use the
wq w/o additional setup. This design greatly reduces the emulation bits to focus on handling
commands from guests.

Introducing mdev types “1dwq” type. This mdev type allows allocation of a single dedicated wq from
available dedicated wqs. After a workqueue (wq) is enabled, the user will generate an uuid. On mdev
creation, the mdev driver code will find a dwq depending on the mdev type. When the create operation
is successful, the user generated uuid can be passed to qemu. When the guest boots up, it should
discover a DSA device when doing PCI discovery.

For example of “1dwq” type:
1. Enable wq with “mdev” wq type
2. A user generated uuid.
3. The uuid is written to the mdev class sysfs path:
echo $UUID > /sys/class/mdev_bus/0000\:00\:0a.0/mdev_supported_types/idxd-wq/create
4. Pass the following parameter to qemu:
"-device vfio-pci,sysfsdev=/sys/bus/pci/devices/0000:00:0a.0/$UUID"
 
The wq exported through mdev will have the read only config bit set for configuration. This means
that the device does not require the typical configuration. After enabling the device, the user
must set the WQ type and name. That is all is necessary to enable the WQ and start using it. The
single wq configuration is not the only way to create the mdev. Multi wqs support for mdev will be
in the future works.
 
The mdev utilizes Interrupt Message Store or IMS[3], a device-specific MSI implementation, instead
of MSIX for interrupts for the guest. This preserves MSIX for host usages and also allows a
significantly larger number of interrupt vectors for guest usage.

The idxd driver implements IMS as on-device memory mapped unified storage. Each interrupt message
is stored as a DWORD size data payload and a 64-bit address (same as MSI-X). Access to the IMS is
through the host idxd driver.

This patchset extends the existing platform-msi framework (which provides a generic mechanism to
support non-PCI compliant MSI interrupts) to benefit any driver which wants to allocate
msi-like(dev-msi) interrupts and provide its own ops functions (mask/unmask etc.)

Call-back functions defined by the kernel and implemented by the driver are used to
1. program the interrupt addr/data values instead of the kernel directly programming them.
2. mask/unmask the interrupt source

The kernel can specify the requirements for these callback functions (e.g., the driver is not
expected to block, or not expected to take a lock in the callback function).

Support for 2 new IRQ chip/domain is added(with and without IRQ_REMAP support- DEV-MSI/IR-DEV-MSI).

[1]: https://lore.kernel.org/lkml/157965011794.73301.15960052071729101309.stgit@djiang5-desk3.ch.intel.com/
[2]: https://software.intel.com/en-us/articles/intel-sdm
[3]: https://software.intel.com/en-us/download/intel-scalable-io-virtualization-technical-specification
[4]: https://software.intel.com/en-us/download/intel-data-streaming-accelerator-preliminary-architecture-specification
[5]: https://01.org/blogs/2019/introducing-intel-data-streaming-accelerator
[6]: https://intel.github.io/idxd/
[7]: https://github.com/intel/idxd-driver idxd-stage2.5

---

Dave Jiang (13):
      dmaengine: idxd: add support for readonly config devices
      dmaengine: idxd: add interrupt handle request support
      dmaengine: idxd: add DEV-MSI support in base driver
      dmaengine: idxd: add device support functions in prep for mdev
      dmaengine: idxd: add basic mdev registration and helper functions
      dmaengine: idxd: add emulation rw routines
      dmaengine: idxd: prep for virtual device commands
      dmaengine: idxd: virtual device commands emulation
      dmaengine: idxd: ims setup for the vdcm
      dmaengine: idxd: add mdev type as a new wq type
      dmaengine: idxd: add dedicated wq mdev type
      dmaengine: idxd: add new wq state for mdev
      dmaengine: idxd: add error notification from host driver to mediated device

Jing Lin (1):
      dmaengine: idxd: add ABI documentation for mediated device support

Megha Dey (4):
      platform-msi: Introduce platform_msi_ops
      irq/dev-msi: Add support for a new DEV_MSI irq domain
      irq/dev-msi: Create IR-DEV-MSI irq domain
      irq/dev-msi: Introduce APIs to allocate/free dev-msi interrupts


 Documentation/ABI/stable/sysfs-driver-dma-idxd |   15 
 arch/x86/include/asm/hw_irq.h                  |    6 
 arch/x86/kernel/apic/msi.c                     |   12 
 drivers/base/Kconfig                           |    7 
 drivers/base/Makefile                          |    1 
 drivers/base/dev-msi.c                         |  170 ++++
 drivers/base/platform-msi.c                    |   62 +
 drivers/base/platform-msi.h                    |   23 
 drivers/dma/Kconfig                            |    7 
 drivers/dma/idxd/Makefile                      |    2 
 drivers/dma/idxd/cdev.c                        |    6 
 drivers/dma/idxd/device.c                      |  266 +++++-
 drivers/dma/idxd/idxd.h                        |   62 +
 drivers/dma/idxd/ims.c                         |  174 ++++
 drivers/dma/idxd/ims.h                         |   17 
 drivers/dma/idxd/init.c                        |  100 ++
 drivers/dma/idxd/irq.c                         |    6 
 drivers/dma/idxd/mdev.c                        | 1106 ++++++++++++++++++++++++
 drivers/dma/idxd/mdev.h                        |  118 +++
 drivers/dma/idxd/registers.h                   |   24 -
 drivers/dma/idxd/submit.c                      |   37 +
 drivers/dma/idxd/sysfs.c                       |   55 +
 drivers/dma/idxd/vdev.c                        |  962 +++++++++++++++++++++
 drivers/dma/idxd/vdev.h                        |   28 +
 drivers/dma/mv_xor_v2.c                        |    6 
 drivers/dma/qcom/hidma.c                       |    6 
 drivers/iommu/arm-smmu-v3.c                    |    6 
 drivers/iommu/intel/irq_remapping.c            |   11 
 drivers/irqchip/irq-mbigen.c                   |    8 
 drivers/irqchip/irq-mvebu-icu.c                |    6 
 drivers/mailbox/bcm-flexrm-mailbox.c           |    6 
 drivers/perf/arm_smmuv3_pmu.c                  |    6 
 include/linux/intel-iommu.h                    |    1 
 include/linux/irqdomain.h                      |   11 
 include/linux/msi.h                            |   35 +
 include/uapi/linux/idxd.h                      |    2 
 36 files changed, 3270 insertions(+), 100 deletions(-)
 create mode 100644 drivers/base/dev-msi.c
 create mode 100644 drivers/base/platform-msi.h
 create mode 100644 drivers/dma/idxd/ims.c
 create mode 100644 drivers/dma/idxd/ims.h
 create mode 100644 drivers/dma/idxd/mdev.c
 create mode 100644 drivers/dma/idxd/mdev.h
 create mode 100644 drivers/dma/idxd/vdev.c
 create mode 100644 drivers/dma/idxd/vdev.h

--

             reply index

Thread overview: 97+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-21 16:02 Dave Jiang [this message]
2020-07-21 16:02 ` [PATCH RFC v2 01/18] platform-msi: Introduce platform_msi_ops Dave Jiang
2020-07-21 16:02 ` [PATCH RFC v2 02/18] irq/dev-msi: Add support for a new DEV_MSI irq domain Dave Jiang
2020-07-21 16:13   ` Jason Gunthorpe
2020-07-22 16:50     ` Dey, Megha
2020-07-22 18:52   ` Marc Zyngier
2020-07-22 19:59     ` Jason Gunthorpe
2020-07-23  8:51       ` Marc Zyngier
2020-07-24  0:16         ` Jason Gunthorpe
2020-07-24  0:36           ` Thomas Gleixner
2020-08-05 19:18       ` Dey, Megha
2020-08-05 22:15         ` Jason Gunthorpe
2020-08-05 22:36           ` Dey, Megha
2020-08-05 22:53             ` Jason Gunthorpe
2020-08-06  0:13               ` Dey, Megha
2020-08-06  0:19                 ` Jason Gunthorpe
2020-08-06  0:32                   ` Dey, Megha
2020-08-06  0:46                     ` Jason Gunthorpe
2020-08-06 17:10                     ` Thomas Gleixner
2020-08-06 17:58                       ` Dey, Megha
2020-08-06 20:21                         ` Thomas Gleixner
2020-08-06 22:27                           ` Dey, Megha
2020-08-07  8:48                             ` Thomas Gleixner
2020-08-07 12:06                           ` Jason Gunthorpe
2020-08-07 12:38                             ` gregkh
2020-08-07 13:34                               ` Jason Gunthorpe
2020-08-07 16:47                                 ` Thomas Gleixner
2020-08-07 17:54                                   ` Dey, Megha
2020-08-07 18:39                                     ` Jason Gunthorpe
2020-08-07 20:31                                       ` Dey, Megha
2020-08-08 19:47                                     ` Thomas Gleixner
2020-08-10 21:46                                       ` Thomas Gleixner
2020-08-11  9:53                                         ` Thomas Gleixner
2020-08-11 18:46                                           ` Dey, Megha
2020-08-11 21:25                                             ` Thomas Gleixner
2020-08-11 18:39                                       ` Dey, Megha
2020-08-11 22:39                                         ` Thomas Gleixner
2020-08-07 15:22                             ` Thomas Gleixner
2020-08-05 18:55     ` Dey, Megha
2020-07-21 16:02 ` [PATCH RFC v2 03/18] irq/dev-msi: Create IR-DEV-MSI " Dave Jiang
2020-07-21 16:21   ` Jason Gunthorpe
2020-07-22 17:03     ` Dey, Megha
2020-07-22 17:33       ` Jason Gunthorpe
2020-07-22 20:44   ` Thomas Gleixner
2020-08-05 19:02     ` Dey, Megha
2020-07-21 16:02 ` [PATCH RFC v2 04/18] irq/dev-msi: Introduce APIs to allocate/free dev-msi interrupts Dave Jiang
2020-07-21 16:25   ` Jason Gunthorpe
2020-07-22 17:05     ` Dey, Megha
2020-07-22 17:35       ` Jason Gunthorpe
2020-08-05 20:19         ` Dey, Megha
2020-07-21 16:02 ` [PATCH RFC v2 05/18] dmaengine: idxd: add support for readonly config devices Dave Jiang
2020-07-21 16:02 ` [PATCH RFC v2 06/18] dmaengine: idxd: add interrupt handle request support Dave Jiang
2020-07-21 16:03 ` [PATCH RFC v2 07/18] dmaengine: idxd: add DEV-MSI support in base driver Dave Jiang
2020-07-21 16:03 ` [PATCH RFC v2 08/18] dmaengine: idxd: add device support functions in prep for mdev Dave Jiang
2020-07-21 16:03 ` [PATCH RFC v2 09/18] dmaengine: idxd: add basic mdev registration and helper functions Dave Jiang
2020-07-21 16:03 ` [PATCH RFC v2 10/18] dmaengine: idxd: add emulation rw routines Dave Jiang
2020-07-21 16:03 ` [PATCH RFC v2 11/18] dmaengine: idxd: prep for virtual device commands Dave Jiang
2020-07-21 16:03 ` [PATCH RFC v2 12/18] dmaengine: idxd: virtual device commands emulation Dave Jiang
2020-07-21 16:03 ` [PATCH RFC v2 13/18] dmaengine: idxd: ims setup for the vdcm Dave Jiang
2020-07-21 16:03 ` [PATCH RFC v2 14/18] dmaengine: idxd: add mdev type as a new wq type Dave Jiang
2020-07-21 16:03 ` [PATCH RFC v2 15/18] dmaengine: idxd: add dedicated wq mdev type Dave Jiang
2020-07-21 16:04 ` [PATCH RFC v2 16/18] dmaengine: idxd: add new wq state for mdev Dave Jiang
2020-07-21 16:04 ` [PATCH RFC v2 17/18] dmaengine: idxd: add error notification from host driver to mediated device Dave Jiang
2020-07-21 16:04 ` [PATCH RFC v2 18/18] dmaengine: idxd: add ABI documentation for mediated device support Dave Jiang
2020-07-21 16:28 ` [PATCH RFC v2 00/18] Add VFIO mediated device support and DEV-MSI support for the idxd driver Greg KH
2020-07-21 17:17   ` Dave Jiang
2020-07-21 21:35   ` Dan Williams
2020-07-21 16:45 ` Jason Gunthorpe
2020-07-21 18:00   ` Dave Jiang
2020-07-22 17:31     ` Dey, Megha
2020-07-22 18:16       ` Jason Gunthorpe
2020-07-21 23:54   ` Tian, Kevin
2020-07-24  0:19     ` Jason Gunthorpe
2020-08-06  1:22       ` Alex Williamson
2020-08-07 12:19         ` Jason Gunthorpe
2020-08-10  7:32           ` Tian, Kevin
2020-08-11 17:00             ` Alex Williamson
2020-08-12  1:58               ` Tian, Kevin
2020-08-12  2:36                 ` Alex Williamson
2020-08-12  3:35                   ` Tian, Kevin
2020-08-12  3:28             ` Jason Wang
2020-08-12  4:05               ` Tian, Kevin
2020-08-13  4:33                 ` Jason Wang
2020-08-13  5:26                   ` Tian, Kevin
2020-08-13  6:01                     ` Jason Wang
2020-08-14 13:23                       ` Jason Gunthorpe
2020-08-17  2:24                         ` Tian, Kevin
2020-08-14 13:35             ` Jason Gunthorpe
2020-08-17  2:12               ` Tian, Kevin
2020-08-18  0:43                 ` Jason Gunthorpe
2020-08-18  1:09                   ` Tian, Kevin
2020-08-18 11:50                     ` Jason Gunthorpe
2020-08-18 16:27                       ` Paolo Bonzini
2020-08-18 16:49                         ` Jason Gunthorpe
2020-08-18 17:05                           ` Paolo Bonzini
2020-08-18 17:18                             ` Jason Gunthorpe
2020-08-19  7:29                       ` Tian, Kevin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=159534667974.28840.2045034360240786644.stgit@djiang5-desk3.ch.intel.com \
    --to=dave.jiang@intel.com \
    --cc=alex.williamson@redhat.com \
    --cc=ashok.raj@intel.com \
    --cc=baolu.lu@intel.com \
    --cc=bhelgaas@google.com \
    --cc=dan.j.williams@intel.com \
    --cc=dave.hansen@intel.com \
    --cc=dmaengine@vger.kernel.org \
    --cc=eric.auger@redhat.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=hpa@zytor.com \
    --cc=jacob.jun.pan@intel.com \
    --cc=jgg@mellanox.com \
    --cc=jing.lin@intel.com \
    --cc=kevin.tian@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=kwankhede@nvidia.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=maz@kernel.org \
    --cc=megha.dey@intel.com \
    --cc=mona.hossain@intel.com \
    --cc=netanelg@mellanox.com \
    --cc=parav@mellanox.com \
    --cc=pbonzini@redhat.com \
    --cc=rafael@kernel.org \
    --cc=samuel.ortiz@intel.com \
    --cc=sanjay.k.kumar@intel.com \
    --cc=shahafs@mellanox.com \
    --cc=tglx@linutronix.de \
    --cc=tony.luck@intel.com \
    --cc=vkoul@kernel.org \
    --cc=x86@kernel.org \
    --cc=yan.y.zhao@linux.intel.com \
    --cc=yi.l.liu@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

dmaengine Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/dmaengine/0 dmaengine/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 dmaengine dmaengine/ https://lore.kernel.org/dmaengine \
		dmaengine@vger.kernel.org
	public-inbox-index dmaengine

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.dmaengine


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git