* [RFC v2 00/15] KVM PCIe/MSI passthrough on ARM/ARM64
From: Eric Auger @ 2016-02-11 14:34 UTC
  To: eric.auger, eric.auger, alex.williamson, will.deacon, joro, tglx,
	jason, marc.zyngier, christoffer.dall, linux-arm-kernel, kvmarm,
	kvm
  Cc: suravee.suthikulpanit, patches, linux-kernel, Manish.Jaggi,
	Bharat.Bhushan, pranav.sawargaonkar, p.fedin, iommu,
	sherry.hurwitz, brijesh.singh, leo.duran, Thomas.Lendacky

This series addresses KVM PCIe passthrough with MSI enabled on ARM/ARM64.
It pursues the efforts made in [1], [2], [3]. It also aims at covering the
same need on PowerPC platforms, where the same kind of integration should
be carried out.

On x86, all accesses to the 1MB PA region [FEE0_0000h - FEF0_0000h] are
interpreted as interrupt messages: accesses to this special PA window directly
target the APIC configuration space and not DRAM, meaning the downstream IOMMU
is bypassed.

This is not the case on the above-mentioned platforms, where MSI messages
emitted by devices are conveyed through the IOMMU. This means an IOVA/host PA
mapping must exist for the MSI to reach the MSI controller. The normal way to
create IOVA bindings is to use the VFIO DMA MAP API. However, in this case
the MSI IOVA is not mapped onto guest RAM but onto a host physical page (the
MSI controller frame).

In a nutshell, this series does:
- introduce an IOMMU API to register an IOVA window usable for reserved mappings
- reuse the VFIO DMA MAP ioctl with a new flag to plug onto that new API
  (see the user-space sketch below)
- check whether the device's MSI-parent controllers allow IRQ remapping
  (allow unsafe interrupt modality) for a given group
- introduce a new IOMMU API to allocate reserved IOVAs and bind them onto
  a physical address
- allow the GICv2M and GICv3-ITS PCI irqchips to map/unmap the MSI frame
  on irq_write_msi_msg
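
A hedged user-space sketch of the resulting flow (the flag name and the
window base/size are illustrative assumptions, not quoted from the series):

    /* assumes <sys/ioctl.h> and <linux/vfio.h>, and an open VFIO
     * container fd with the type1 IOMMU already set */
    struct vfio_iommu_type1_dma_map map = {
        .argsz = sizeof(map),
        /* hypothetical flag: mark this range as reserved for MSI */
        .flags = VFIO_DMA_MAP_FLAG_MSI_RESERVED_IOVA,
        .iova  = 0x8000000,   /* example base of the reserved window */
        .size  = 0x100000,    /* example 1MB window */
    };

    if (ioctl(container, VFIO_IOMMU_MAP_DMA, &map))
        perror("MSI reserved IOVA registration");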

Best Regards

Eric

Testing:
- functional on ARM64 AMD Overdrive HW (single GICv2m frame) with an e1000e
  PCIe card.
- tested that there is no regression on
  x non-assigned PCIe drivers
  x platform device passthrough
- Not tested: ARM with SR-IOV, ARM GICv3 ITS

References:
[1] [RFC 0/2] VFIO: Add virtual MSI doorbell support
    (https://lkml.org/lkml/2015/7/24/135)
[2] [RFC PATCH 0/6] vfio: Add interface to map MSI pages
    (https://lists.cs.columbia.edu/pipermail/kvmarm/2015-September/016607.html)
[3] [PATCH v2 0/3] Introduce MSI hardware mapping for VFIO
    (http://permalink.gmane.org/gmane.comp.emulators.kvm.arm.devel/3858)

Git:
https://git.linaro.org/people/eric.auger/linux.git/shortlog/refs/heads/v4.5-rc3-pcie-passthrough-rfcv2

previous version at
https://git.linaro.org/people/eric.auger/linux.git/shortlog/refs/heads/v4.5-rc1-pcie-passthrough-v1

QEMU Integration:
[RFC v2 0/8] KVM PCI/MSI passthrough with mach-virt
(http://lists.gnu.org/archive/html/qemu-arm/2016-01/msg00444.html)
https://git.linaro.org/people/eric.auger/qemu.git/shortlog/refs/heads/v2.5.0-pci-passthrough-rfc-v2

User Hints:
To allow PCI/MSI passthrough with GICv2M, compile VFIO as a module and
load the vfio_iommu_type1 module with the allow_unsafe_interrupts parameter:
sudo modprobe -v vfio_iommu_type1 allow_unsafe_interrupts=1
sudo modprobe -v vfio-pci

History:

PATCH v1 -> RFC v2:
- reverted to RFC since it looks more reasonable ;-) the code is split
  between VFIO, IOMMU and the MSI controller, and I am not sure I made the
  right choices. Also, the APIs need to be discussed further.
- iova API usage in arm-smmu.c.
- the MSI controller natively programs the MSI address with either the PA or
  the IOVA. This is no longer done in the vfio-pci driver, as suggested by Alex.
- check irq remapping capability of the group

RFC v1 [2] -> PATCH v1:
- use the existing dma map/unmap ioctl interface with a flag to register a
  reserved IOVA range. Use the legacy RB tree to store this special vfio_dma.
- a single contiguous reserved IOVA region is now allowed
- use of an RB tree indexed by PA to store allocated reserved slots
- use of a vfio_domain iova_domain to manage iova allocation within the
  window provided by userspace
- vfio alloc_map/unmap_free take a vfio_group handle
- the vfio_group handle is cached in vfio_pci_device
- add ref counting to bindings
- user modality enabled at the end of the series


Eric Auger (15):
  iommu: Add DOMAIN_ATTR_MSI_MAPPING attribute
  vfio: expose MSI mapping requirement through VFIO_IOMMU_GET_INFO
  vfio: introduce VFIO_IOVA_RESERVED vfio_dma type
  iommu: add alloc/free_reserved_iova_domain
  iommu/arm-smmu: implement alloc/free_reserved_iova_domain
  iommu/arm-smmu: add a reserved binding RB tree
  iommu: iommu_get/put_single_reserved
  iommu/arm-smmu: implement iommu_get/put_single_reserved
  iommu/arm-smmu: relinquish reserved resources on domain deletion
  vfio: allow the user to register reserved iova range for MSI mapping
  msi: Add a new MSI_FLAG_IRQ_REMAPPING flag
  msi: export msi_get_domain_info
  vfio/type1: also check IRQ remapping capability at msi domain
  iommu/arm-smmu: do not advertise IOMMU_CAP_INTR_REMAP
  irqchip/gicv2m/v3-its-pci-msi: IOMMU map the MSI frame when needed

 drivers/iommu/arm-smmu.c                 | 292 +++++++++++++++++++++++++++++--
 drivers/iommu/fsl_pamu_domain.c          |   2 +
 drivers/iommu/iommu.c                    |  43 +++++
 drivers/irqchip/irq-gic-common.c         |  60 +++++++
 drivers/irqchip/irq-gic-common.h         |   5 +
 drivers/irqchip/irq-gic-v2m.c            |   3 +-
 drivers/irqchip/irq-gic-v3-its-pci-msi.c |   3 +-
 drivers/irqchip/irq-gic-v3-its.c         |   1 +
 drivers/vfio/vfio_iommu_type1.c          | 156 ++++++++++++++++-
 include/linux/iommu.h                    |  53 ++++++
 include/linux/msi.h                      |   6 +
 include/uapi/linux/vfio.h                |  10 ++
 kernel/irq/msi.c                         |   1 +
 13 files changed, 613 insertions(+), 22 deletions(-)

-- 
1.9.1

* [RFC v2 01/15] iommu: Add DOMAIN_ATTR_MSI_MAPPING attribute
From: Eric Auger @ 2016-02-11 14:34 UTC
  To: eric.auger, eric.auger, alex.williamson, will.deacon, joro, tglx,
	jason, marc.zyngier, christoffer.dall, linux-arm-kernel, kvmarm,
	kvm
  Cc: suravee.suthikulpanit, patches, linux-kernel, Manish.Jaggi,
	Bharat.Bhushan, pranav.sawargaonkar, p.fedin, iommu,
	sherry.hurwitz, brijesh.singh, leo.duran, Thomas.Lendacky

Introduce a new DOMAIN_ATTR_MSI_MAPPING domain attribute. If supported,
it means that MSI addresses need to be mapped in the IOMMU. ARM SMMUs
and FSL PAMU, at least, expose this attribute.

x86 IOMMUs typically don't expose the attribute since, on x86, MSI write
transaction addresses are always within the 1MB PA region [FEE0_0000h -
FEF0_0000h], a window which directly targets the APIC configuration space
and hence bypasses the IOMMU.
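
A minimal sketch of how a caller is expected to probe the attribute, given
the return convention described in the changelog below (the helper name is
made up for illustration):

    /* true if MSI write addresses must be mapped by this domain */
    static bool msi_mapping_required(struct iommu_domain *domain)
    {
        return iommu_domain_get_attr(domain,
                                     DOMAIN_ATTR_MSI_MAPPING, NULL) == 0;
    }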

Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com>
Signed-off-by: Eric Auger <eric.auger@linaro.org>

---

RFC v1 -> v1:
- the data field is not used
- for this attribute, domain_get_attr simply returns 0 if the MSI_MAPPING
  capability is needed or <0 if not.
- removed struct iommu_domain_msi_maps
---
 drivers/iommu/arm-smmu.c        | 2 ++
 drivers/iommu/fsl_pamu_domain.c | 2 ++
 include/linux/iommu.h           | 1 +
 3 files changed, 5 insertions(+)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 59ee4b8..c8b7e71 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -1409,6 +1409,8 @@ static int arm_smmu_domain_get_attr(struct iommu_domain *domain,
 	case DOMAIN_ATTR_NESTING:
 		*(int *)data = (smmu_domain->stage == ARM_SMMU_DOMAIN_NESTED);
 		return 0;
+	case DOMAIN_ATTR_MSI_MAPPING:
+		return 0;
 	default:
 		return -ENODEV;
 	}
diff --git a/drivers/iommu/fsl_pamu_domain.c b/drivers/iommu/fsl_pamu_domain.c
index da0e1e3..46d5c6a 100644
--- a/drivers/iommu/fsl_pamu_domain.c
+++ b/drivers/iommu/fsl_pamu_domain.c
@@ -856,6 +856,8 @@ static int fsl_pamu_get_domain_attr(struct iommu_domain *domain,
 	case DOMAIN_ATTR_FSL_PAMUV1:
 		*(int *)data = DOMAIN_ATTR_FSL_PAMUV1;
 		break;
+	case DOMAIN_ATTR_MSI_MAPPING:
+		break;
 	default:
 		pr_debug("Unsupported attribute type\n");
 		ret = -EINVAL;
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index a5c539f..a4fe04a 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -112,6 +112,7 @@ enum iommu_attr {
 	DOMAIN_ATTR_FSL_PAMU_ENABLE,
 	DOMAIN_ATTR_FSL_PAMUV1,
 	DOMAIN_ATTR_NESTING,	/* two stages of translation */
+	DOMAIN_ATTR_MSI_MAPPING, /* Require MSIs mapping in iommu */
 	DOMAIN_ATTR_MAX,
 };
 
-- 
1.9.1

* [RFC v2 02/15] vfio: expose MSI mapping requirement through VFIO_IOMMU_GET_INFO
From: Eric Auger @ 2016-02-11 14:34 UTC
  To: eric.auger, eric.auger, alex.williamson, will.deacon, joro, tglx,
	jason, marc.zyngier, christoffer.dall, linux-arm-kernel, kvmarm,
	kvm
  Cc: suravee.suthikulpanit, patches, linux-kernel, Manish.Jaggi,
	Bharat.Bhushan, pranav.sawargaonkar, p.fedin, iommu,
	sherry.hurwitz, brijesh.singh, leo.duran, Thomas.Lendacky

This patch allows user space to retrieve whether MSI write transaction
addresses must be mapped. This is returned through the VFIO_IOMMU_GET_INFO
API and its new flag, VFIO_IOMMU_INFO_REQUIRE_MSI_MAP.
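
A hedged user-space sketch (assuming `container` is an open VFIO container
fd with the type1 IOMMU already set):

    struct vfio_iommu_type1_info info = { .argsz = sizeof(info) };

    if (ioctl(container, VFIO_IOMMU_GET_INFO, &info))
        perror("VFIO_IOMMU_GET_INFO");
    else if (info.flags & VFIO_IOMMU_INFO_REQUIRE_MSI_MAP)
        printf("MSI addresses must be mapped by the IOMMU\n");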

Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com>
Signed-off-by: Eric Auger <eric.auger@linaro.org>

---

RFC v1 -> v1:
- derived from
  [RFC PATCH 3/6] vfio: Extend iommu-info to return MSIs automap state
- renamed allow_msi_reconfig into require_msi_mapping
- fixed VFIO_IOMMU_GET_INFO
---
 drivers/vfio/vfio_iommu_type1.c | 26 ++++++++++++++++++++++++++
 include/uapi/linux/vfio.h       |  1 +
 2 files changed, 27 insertions(+)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 6f1ea3d..c5b57e1 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -255,6 +255,29 @@ static int vaddr_get_pfn(unsigned long vaddr, int prot, unsigned long *pfn)
 }
 
 /*
+ * vfio_domains_require_msi_mapping: indicates whether MSI write transaction
+ * addresses must be mapped
+ *
+ * returns true if it does
+ */
+static bool vfio_domains_require_msi_mapping(struct vfio_iommu *iommu)
+{
+	struct vfio_domain *d;
+	bool ret;
+
+	mutex_lock(&iommu->lock);
+	/* All domains have same require_msi_map property, pick first */
+	d = list_first_entry(&iommu->domain_list, struct vfio_domain, next);
+	if (iommu_domain_get_attr(d->domain, DOMAIN_ATTR_MSI_MAPPING, NULL) < 0)
+		ret = false;
+	else
+		ret = true;
+	mutex_unlock(&iommu->lock);
+
+	return ret;
+}
+
+/*
  * Attempt to pin pages.  We really don't want to track all the pfns and
  * the iommu can only map chunks of consecutive pfns anyway, so get the
  * first page and all consecutive pages with the same locking.
@@ -997,6 +1020,9 @@ static long vfio_iommu_type1_ioctl(void *iommu_data,
 
 		info.flags = VFIO_IOMMU_INFO_PGSIZES;
 
+		if (vfio_domains_require_msi_mapping(iommu))
+			info.flags |= VFIO_IOMMU_INFO_REQUIRE_MSI_MAP;
+
 		info.iova_pgsizes = vfio_pgsize_bitmap(iommu);
 
 		return copy_to_user((void __user *)arg, &info, minsz);
diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index 7d7a4c6..43e183b 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -400,6 +400,7 @@ struct vfio_iommu_type1_info {
 	__u32	argsz;
 	__u32	flags;
 #define VFIO_IOMMU_INFO_PGSIZES (1 << 0)	/* supported page sizes info */
+#define VFIO_IOMMU_INFO_REQUIRE_MSI_MAP (1 << 1)/* MSI must be mapped */
 	__u64	iova_pgsizes;		/* Bitmap of supported page sizes */
 };
 
-- 
1.9.1

* [RFC v2 03/15] vfio: introduce VFIO_IOVA_RESERVED vfio_dma type
From: Eric Auger @ 2016-02-11 14:34 UTC
  To: eric.auger, eric.auger, alex.williamson, will.deacon, joro, tglx,
	jason, marc.zyngier, christoffer.dall, linux-arm-kernel, kvmarm,
	kvm
  Cc: suravee.suthikulpanit, patches, linux-kernel, Manish.Jaggi,
	Bharat.Bhushan, pranav.sawargaonkar, p.fedin, iommu,
	sherry.hurwitz, brijesh.singh, leo.duran, Thomas.Lendacky

We introduce a vfio_dma type since we will need to discriminate legacy
vfio_dma's from the new reserved ones. Since the latter are not mapped at
registration time, some code paths need to be reworked: removal and replay.
For now those paths simply skip reserved entries; subsequent patches will
rework them.
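
A hedged illustration of how later patches in the series are expected to
instantiate such a reserved entry (registration logic omitted; only the
use of the new type field is shown):

    struct vfio_dma *dma = kzalloc(sizeof(*dma), GFP_KERNEL);

    if (!dma)
        return -ENOMEM;
    dma->iova = iova;               /* base of the reserved IOVA window */
    dma->size = size;
    dma->type = VFIO_IOVA_RESERVED; /* skipped by unmap/unpin and replay */
    vfio_link_dma(iommu, dma);      /* insert into the existing RB tree */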

Signed-off-by: Eric Auger <eric.auger@linaro.org>
---
 drivers/vfio/vfio_iommu_type1.c | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index c5b57e1..b9326c9 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -53,6 +53,15 @@ module_param_named(disable_hugepages,
 MODULE_PARM_DESC(disable_hugepages,
 		 "Disable VFIO IOMMU support for IOMMU hugepages.");
 
+enum vfio_iova_type {
+	VFIO_IOVA_USER = 0, /* standard IOVA used to map user vaddr */
+	/*
+	 * IOVA reserved to map special host physical addresses,
+	 * MSI frames for instance
+	 */
+	VFIO_IOVA_RESERVED,
+};
+
 struct vfio_iommu {
 	struct list_head	domain_list;
 	struct mutex		lock;
@@ -75,6 +84,7 @@ struct vfio_dma {
 	unsigned long		vaddr;		/* Process virtual addr */
 	size_t			size;		/* Map size (bytes) */
 	int			prot;		/* IOMMU_READ/WRITE */
+	enum vfio_iova_type	type;		/* type of IOVA */
 };
 
 struct vfio_group {
@@ -418,7 +428,8 @@ static void vfio_unmap_unpin(struct vfio_iommu *iommu, struct vfio_dma *dma)
 
 static void vfio_remove_dma(struct vfio_iommu *iommu, struct vfio_dma *dma)
 {
-	vfio_unmap_unpin(iommu, dma);
+	if (likely(dma->type != VFIO_IOVA_RESERVED))
+		vfio_unmap_unpin(iommu, dma);
 	vfio_unlink_dma(iommu, dma);
 	kfree(dma);
 }
@@ -694,6 +705,10 @@ static int vfio_iommu_replay(struct vfio_iommu *iommu,
 		dma_addr_t iova;
 
 		dma = rb_entry(n, struct vfio_dma, node);
+
+		if (unlikely(dma->type == VFIO_IOVA_RESERVED))
+			continue;
+
 		iova = dma->iova;
 
 		while (iova < dma->iova + dma->size) {
-- 
1.9.1

* [RFC v2 04/15] iommu: add alloc/free_reserved_iova_domain
From: Eric Auger @ 2016-02-11 14:34 UTC
  To: eric.auger, eric.auger, alex.williamson, will.deacon, joro, tglx,
	jason, marc.zyngier, christoffer.dall, linux-arm-kernel, kvmarm,
	kvm
  Cc: suravee.suthikulpanit, patches, linux-kernel, Manish.Jaggi,
	Bharat.Bhushan, pranav.sawargaonkar, p.fedin, iommu,
	sherry.hurwitz, brijesh.singh, leo.duran, Thomas.Lendacky

Introduce alloc/free_reserved_iova_domain in the IOMMU API.
alloc_reserved_iova_domain initializes an iova domain at a given
iova base address and with a given size. This iova domain will be used
to allocate IOVAs within that window. Those IOVAs will be reserved for
special purposes, typically MSI frame binding. Allocation functions within
the reserved iova domain will be introduced in subsequent patches; a
caller sketch follows below.

It is the responsibility of the API user to make sure any IOVA belonging
to that domain is allocated with those allocation functions.
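
A hedged caller sketch (the base address, size and order are made-up
example values):

    /* reserve a 1MB IOVA window at 0x8000000, 4kB (order 12) granule */
    ret = iommu_alloc_reserved_iova_domain(domain, 0x8000000, 0x100000, 12);
    if (ret)
        return ret;     /* e.g. -EINVAL if the driver lacks support */

    /* ... allocate/bind reserved IOVAs (subsequent patches) ... */

    iommu_free_reserved_iova_domain(domain);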

Signed-off-by: Eric Auger <eric.auger@linaro.org>

---

v1 -> v2:
- moved from vfio API to IOMMU API
---
 drivers/iommu/iommu.c | 22 ++++++++++++++++++++++
 include/linux/iommu.h | 21 +++++++++++++++++++++
 2 files changed, 43 insertions(+)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 0e3b009..a994f34 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1557,6 +1557,28 @@ int iommu_domain_set_attr(struct iommu_domain *domain,
 }
 EXPORT_SYMBOL_GPL(iommu_domain_set_attr);
 
+int iommu_alloc_reserved_iova_domain(struct iommu_domain *domain,
+				     dma_addr_t iova, size_t size,
+				     unsigned long order)
+{
+	int ret;
+
+	if (!domain->ops->alloc_reserved_iova_domain)
+		return -EINVAL;
+	ret = domain->ops->alloc_reserved_iova_domain(domain, iova,
+						      size, order);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(iommu_alloc_reserved_iova_domain);
+
+void iommu_free_reserved_iova_domain(struct iommu_domain *domain)
+{
+	if (!domain->ops->free_reserved_iova_domain)
+		return;
+	domain->ops->free_reserved_iova_domain(domain);
+}
+EXPORT_SYMBOL_GPL(iommu_free_reserved_iova_domain);
+
 void iommu_get_dm_regions(struct device *dev, struct list_head *list)
 {
 	const struct iommu_ops *ops = dev->bus->iommu_ops;
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index a4fe04a..32c1a4e 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -195,6 +195,12 @@ struct iommu_ops {
 	int (*domain_set_windows)(struct iommu_domain *domain, u32 w_count);
 	/* Get the number of windows per domain */
 	u32 (*domain_get_windows)(struct iommu_domain *domain);
+	/* allocates the reserved iova domain */
+	int (*alloc_reserved_iova_domain)(struct iommu_domain *domain,
+					  dma_addr_t iova, size_t size,
+					  unsigned long order);
+	/* frees the reserved iova domain */
+	void (*free_reserved_iova_domain)(struct iommu_domain *domain);
 
 #ifdef CONFIG_OF_IOMMU
 	int (*of_xlate)(struct device *dev, struct of_phandle_args *args);
@@ -266,6 +272,10 @@ extern int iommu_domain_get_attr(struct iommu_domain *domain, enum iommu_attr,
 				 void *data);
 extern int iommu_domain_set_attr(struct iommu_domain *domain, enum iommu_attr,
 				 void *data);
+extern int iommu_alloc_reserved_iova_domain(struct iommu_domain *domain,
+					    dma_addr_t iova, size_t size,
+					    unsigned long order);
+extern void iommu_free_reserved_iova_domain(struct iommu_domain *domain);
 struct device *iommu_device_create(struct device *parent, void *drvdata,
 				   const struct attribute_group **groups,
 				   const char *fmt, ...) __printf(4, 5);
@@ -541,6 +551,17 @@ static inline void iommu_device_unlink(struct device *dev, struct device *link)
 {
 }
 
+static inline int iommu_alloc_reserved_iova_domain(struct iommu_domain *domain,
+						    dma_addr_t iova, size_t size,
+						    unsigned long order)
+{
+	return -EINVAL;
+}
+
+static inline void iommu_free_reserved_iova_domain(struct iommu_domain *domain)
+{
+}
+
 #endif /* CONFIG_IOMMU_API */
 
 #endif /* __LINUX_IOMMU_H */
-- 
1.9.1

* [RFC v2 05/15] iommu/arm-smmu: implement alloc/free_reserved_iova_domain
From: Eric Auger @ 2016-02-11 14:34 UTC
  To: eric.auger, eric.auger, alex.williamson, will.deacon, joro, tglx,
	jason, marc.zyngier, christoffer.dall, linux-arm-kernel, kvmarm,
	kvm
  Cc: suravee.suthikulpanit, patches, linux-kernel, Manish.Jaggi,
	Bharat.Bhushan, pranav.sawargaonkar, p.fedin, iommu,
	sherry.hurwitz, brijesh.singh, leo.duran, Thomas.Lendacky

Implement alloc/free_reserved_iova_domain for arm-smmu. We use the iova
allocator (iova.c). The iova_domain is attached to the arm_smmu_domain
struct, and a mutex is introduced to protect it.
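
A worked illustration of the granule arithmetic in
arm_smmu_alloc_reserved_iova_domain() below (values are made up): with
iova = 0x8000000, size = 0x100000 and order = 12, granule = 1UL << 12 =
0x1000, the alignment checks pass and init_iova_domain() is set up to
manage pfns 0x8000000 >> 12 = 0x8000 through (0x8000000 + 0x100000 - 1)
>> 12 = 0x80ff, i.e. 256 allocatable 4kB IOVA pages.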

Signed-off-by: Eric Auger <eric.auger@linaro.org>

---

v1 -> v2:
- formerly implemented in vfio_iommu_type1
---
 drivers/iommu/arm-smmu.c | 87 +++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 72 insertions(+), 15 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index c8b7e71..f42341d 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -42,6 +42,7 @@
 #include <linux/platform_device.h>
 #include <linux/slab.h>
 #include <linux/spinlock.h>
+#include <linux/iova.h>
 
 #include <linux/amba/bus.h>
 
@@ -347,6 +348,9 @@ struct arm_smmu_domain {
 	enum arm_smmu_domain_stage	stage;
 	struct mutex			init_mutex; /* Protects smmu pointer */
 	struct iommu_domain		domain;
+	struct iova_domain		*reserved_iova_domain;
+	/* protects reserved domain manipulation */
+	struct mutex			reserved_mutex;
 };
 
 static struct iommu_ops arm_smmu_ops;
@@ -975,6 +979,7 @@ static struct iommu_domain *arm_smmu_domain_alloc(unsigned type)
 		return NULL;
 
 	mutex_init(&smmu_domain->init_mutex);
+	mutex_init(&smmu_domain->reserved_mutex);
 	spin_lock_init(&smmu_domain->pgtbl_lock);
 
 	return &smmu_domain->domain;
@@ -1446,22 +1451,74 @@ out_unlock:
 	return ret;
 }
 
+static int arm_smmu_alloc_reserved_iova_domain(struct iommu_domain *domain,
+					       dma_addr_t iova, size_t size,
+					       unsigned long order)
+{
+	unsigned long granule, mask;
+	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+	int ret = 0;
+
+	granule = 1UL << order;
+	mask = granule - 1;
+	if (iova & mask || (!size) || (size & mask))
+		return -EINVAL;
+
+	mutex_lock(&smmu_domain->reserved_mutex);
+
+	/* check under the lock so concurrent callers cannot race */
+	if (smmu_domain->reserved_iova_domain) {
+		ret = -EEXIST;
+		goto unlock;
+	}
+
+	smmu_domain->reserved_iova_domain =
+		kzalloc(sizeof(struct iova_domain), GFP_KERNEL);
+	if (!smmu_domain->reserved_iova_domain) {
+		ret = -ENOMEM;
+		goto unlock;
+	}
+
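+	/*
+	 * the iova allocator works in granule-sized units, so the
+	 * window bounds are passed as page frame numbers (iova >> order)
+	 */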
+	init_iova_domain(smmu_domain->reserved_iova_domain,
+			 granule, iova >> order, (iova + size - 1) >> order);
+
+unlock:
+	mutex_unlock(&smmu_domain->reserved_mutex);
+	return ret;
+}
+
+static void arm_smmu_free_reserved_iova_domain(struct iommu_domain *domain)
+{
+	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+	struct iova_domain *iovad = smmu_domain->reserved_iova_domain;
+
+	if (!iovad)
+		return;
+
+	mutex_lock(&smmu_domain->reserved_mutex);
+
+	put_iova_domain(iovad);
+	kfree(iovad);
+
+	mutex_unlock(&smmu_domain->reserved_mutex);
+}
+
 static struct iommu_ops arm_smmu_ops = {
-	.capable		= arm_smmu_capable,
-	.domain_alloc		= arm_smmu_domain_alloc,
-	.domain_free		= arm_smmu_domain_free,
-	.attach_dev		= arm_smmu_attach_dev,
-	.detach_dev		= arm_smmu_detach_dev,
-	.map			= arm_smmu_map,
-	.unmap			= arm_smmu_unmap,
-	.map_sg			= default_iommu_map_sg,
-	.iova_to_phys		= arm_smmu_iova_to_phys,
-	.add_device		= arm_smmu_add_device,
-	.remove_device		= arm_smmu_remove_device,
-	.device_group		= arm_smmu_device_group,
-	.domain_get_attr	= arm_smmu_domain_get_attr,
-	.domain_set_attr	= arm_smmu_domain_set_attr,
-	.pgsize_bitmap		= -1UL, /* Restricted during device attach */
+	.capable			= arm_smmu_capable,
+	.domain_alloc			= arm_smmu_domain_alloc,
+	.domain_free			= arm_smmu_domain_free,
+	.attach_dev			= arm_smmu_attach_dev,
+	.detach_dev			= arm_smmu_detach_dev,
+	.map				= arm_smmu_map,
+	.unmap				= arm_smmu_unmap,
+	.map_sg				= default_iommu_map_sg,
+	.iova_to_phys			= arm_smmu_iova_to_phys,
+	.add_device			= arm_smmu_add_device,
+	.remove_device			= arm_smmu_remove_device,
+	.device_group			= arm_smmu_device_group,
+	.domain_get_attr		= arm_smmu_domain_get_attr,
+	.domain_set_attr		= arm_smmu_domain_set_attr,
+	.alloc_reserved_iova_domain	= arm_smmu_alloc_reserved_iova_domain,
+	.free_reserved_iova_domain	= arm_smmu_free_reserved_iova_domain,
+	/* Page size bitmap, restricted during device attach */
+	.pgsize_bitmap			= -1UL,
 };
 
 static void arm_smmu_device_reset(struct arm_smmu_device *smmu)
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [RFC v2 06/15] iommu/arm-smmu: add a reserved binding RB tree
@ 2016-02-11 14:34   ` Eric Auger
  0 siblings, 0 replies; 48+ messages in thread
From: Eric Auger @ 2016-02-11 14:34 UTC (permalink / raw)
  To: eric.auger, eric.auger, alex.williamson, will.deacon, joro, tglx,
	jason, marc.zyngier, christoffer.dall, linux-arm-kernel, kvmarm,
	kvm
  Cc: suravee.suthikulpanit, patches, linux-kernel, Manish.Jaggi,
	Bharat.Bhushan, pranav.sawargaonkar, p.fedin, iommu,
	sherry.hurwitz, brijesh.singh, leo.duran, Thomas.Lendacky

We will need to track which host physical addresses are mapped to
reserved IOVAs. To that end we introduce a new RB tree indexed by
physical address. This RB tree is only used for reserved IOVA
bindings.

It is expected this RB tree will contain very few bindings; each
typically corresponds to a single page mapping one MSI frame (a
GICv2m frame or the ITS GITS_TRANSLATER frame).

Signed-off-by: Eric Auger <eric.auger@linaro.org>
---
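
For illustration (not part of the patch), each node is treated as the
interval [addr, addr + size): the search only descends when the query
interval lies entirely to one side, so any overlap is a hit. With a
single binding covering a hypothetical 4kB frame at 0x8004000:

	b = find_reserved_binding(d, 0x8004000, 0x1000); /* exact match */
	b = find_reserved_binding(d, 0x8004ff0, 0x20);   /* tail overlap */
	b = find_reserved_binding(d, 0x8003f00, 0x200);  /* head overlap */
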
 drivers/iommu/arm-smmu.c | 65 +++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 64 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index f42341d..729a4c6 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -349,10 +349,21 @@ struct arm_smmu_domain {
 	struct mutex			init_mutex; /* Protects smmu pointer */
 	struct iommu_domain		domain;
 	struct iova_domain		*reserved_iova_domain;
-	/* protects reserved domain manipulation */
+	/* rb tree indexed by PA, for reserved bindings only */
+	struct rb_root			reserved_binding_list;
+	/* protects reserved domain and rbtree manipulation */
 	struct mutex			reserved_mutex;
 };
 
+struct arm_smmu_reserved_binding {
+	struct kref		kref;
+	struct rb_node		node;
+	struct arm_smmu_domain	*domain;
+	phys_addr_t		addr;
+	dma_addr_t		iova;
+	size_t			size;
+};
+
 static struct iommu_ops arm_smmu_ops;
 
 static DEFINE_SPINLOCK(arm_smmu_devices_lock);
@@ -400,6 +411,57 @@ static struct device_node *dev_get_dev_node(struct device *dev)
 	return dev->of_node;
 }
 
+/* Reserved binding RB-tree manipulation */
+
+static struct arm_smmu_reserved_binding *find_reserved_binding(
+				    struct arm_smmu_domain *d,
+				    phys_addr_t start, size_t size)
+{
+	struct rb_node *node = d->reserved_binding_list.rb_node;
+
+	while (node) {
+		struct arm_smmu_reserved_binding *binding =
+			rb_entry(node, struct arm_smmu_reserved_binding, node);
+
+		if (start + size <= binding->addr)
+			node = node->rb_left;
+		else if (start >= binding->addr + binding->size)
+			node = node->rb_right;
+		else
+			return binding;
+	}
+
+	return NULL;
+}
+
+static void link_reserved_binding(struct arm_smmu_domain *d,
+				  struct arm_smmu_reserved_binding *new)
+{
+	struct rb_node **link = &d->reserved_binding_list.rb_node;
+	struct rb_node *parent = NULL;
+	struct arm_smmu_reserved_binding *binding;
+
+	while (*link) {
+		parent = *link;
+		binding = rb_entry(parent, struct arm_smmu_reserved_binding,
+				   node);
+
+		if (new->addr + new->size <= binding->addr)
+			link = &(*link)->rb_left;
+		else
+			link = &(*link)->rb_right;
+	}
+
+	rb_link_node(&new->node, parent, link);
+	rb_insert_color(&new->node, &d->reserved_binding_list);
+}
+
+static void unlink_reserved_binding(struct arm_smmu_domain *d,
+				    struct arm_smmu_reserved_binding *old)
+{
+	rb_erase(&old->node, &d->reserved_binding_list);
+}
+
 static struct arm_smmu_master *find_smmu_master(struct arm_smmu_device *smmu,
 						struct device_node *dev_node)
 {
@@ -981,6 +1043,7 @@ static struct iommu_domain *arm_smmu_domain_alloc(unsigned type)
 	mutex_init(&smmu_domain->init_mutex);
 	mutex_init(&smmu_domain->reserved_mutex);
 	spin_lock_init(&smmu_domain->pgtbl_lock);
+	smmu_domain->reserved_binding_list = RB_ROOT;
 
 	return &smmu_domain->domain;
 }
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [RFC v2 07/15] iommu: iommu_get/put_single_reserved
@ 2016-02-11 14:34   ` Eric Auger
  0 siblings, 0 replies; 48+ messages in thread
From: Eric Auger @ 2016-02-11 14:34 UTC (permalink / raw)
  To: eric.auger, eric.auger, alex.williamson, will.deacon, joro, tglx,
	jason, marc.zyngier, christoffer.dall, linux-arm-kernel, kvmarm,
	kvm
  Cc: suravee.suthikulpanit, patches, linux-kernel, Manish.Jaggi,
	Bharat.Bhushan, pranav.sawargaonkar, p.fedin, iommu,
	sherry.hurwitz, brijesh.singh, leo.duran, Thomas.Lendacky

This patch introduces iommu_get/put_single_reserved.

iommu_get_single_reserved allocates a new reserved iova page and maps
it onto the physical page that contains a given physical address. It
returns the iova mapped onto the provided physical address, so the
physical address passed as an argument does not need to be aligned.

In case a mapping already exists between the two pages, the IOVA
bound to the PA is returned directly.

Each time an iova is successfully returned, the binding's reference
count is incremented.

iommu_put_single_reserved decrements the reference count; when it
reaches zero, the mapping is destroyed and the iova is released.

Signed-off-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Ankit Jindal <ajindal@apm.com>
Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com>

---

Currently the ref counting is not really used: all bindings are
destroyed when the domain is killed.

v1 -> v2:
- previously a VFIO API, named vfio_alloc_map/unmap_free_reserved_iova
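
Illustrative usage (not part of this patch), e.g. for a hypothetical
MSI doorbell located at host physical address msi_addr:

	dma_addr_t iova;
	int ret;

	ret = iommu_get_single_reserved(domain, msi_addr,
					IOMMU_WRITE, &iova);
	if (ret)
		return ret;
	/* program the device/irqchip with 'iova' instead of 'msi_addr' */
	iommu_put_single_reserved(domain, iova);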
---
 drivers/iommu/iommu.c | 21 +++++++++++++++++++++
 include/linux/iommu.h | 31 +++++++++++++++++++++++++++++++
 2 files changed, 52 insertions(+)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index a994f34..14ebde1 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1415,6 +1415,27 @@ size_t iommu_unmap(struct iommu_domain *domain, unsigned long iova, size_t size)
 	return unmapped;
 }
 EXPORT_SYMBOL_GPL(iommu_unmap);
+
+int iommu_get_single_reserved(struct iommu_domain *domain,
+			      phys_addr_t addr, int prot,
+			      dma_addr_t *iova)
+{
+	if (!domain->ops->get_single_reserved)
+		return -ENODEV;
+
+	return domain->ops->get_single_reserved(domain, addr, prot, iova);
+}
+EXPORT_SYMBOL_GPL(iommu_get_single_reserved);
+
+void iommu_put_single_reserved(struct iommu_domain *domain,
+			       dma_addr_t iova)
+{
+	if (!domain->ops->put_single_reserved)
+		return;
+
+	domain->ops->put_single_reserved(domain, iova);
+}
+EXPORT_SYMBOL_GPL(iommu_put_single_reserved);
 
 size_t default_iommu_map_sg(struct iommu_domain *domain, unsigned long iova,
 			 struct scatterlist *sg, unsigned int nents, int prot)
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 32c1a4e..148465b8 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -201,6 +201,21 @@ struct iommu_ops {
 					  unsigned long order);
 	/* frees the reserved iova domain */
 	void (*free_reserved_iova_domain)(struct iommu_domain *domain);
+	/*
+	 * allocate a reserved iova page and bind it onto the page that
+	 * contains the physical address @addr; returns in @iova the iova
+	 * bound to @addr. If the two pages are already bound, simply
+	 * return the existing @iova and increment its ref count.
+	 */
+	int (*get_single_reserved)(struct iommu_domain *domain,
+					 phys_addr_t addr, int prot,
+					 dma_addr_t *iova);
+	/*
+	 * decrement the ref count of the iova page; when it reaches zero,
+	 * unmap the iova page and release the iova
+	 */
+	void (*put_single_reserved)(struct iommu_domain *domain,
+					   dma_addr_t iova);
 
 #ifdef CONFIG_OF_IOMMU
 	int (*of_xlate)(struct device *dev, struct of_phandle_args *args);
@@ -276,6 +291,11 @@ extern int iommu_alloc_reserved_iova_domain(struct iommu_domain *domain,
 					    dma_addr_t iova, size_t size,
 					    unsigned long order);
 extern void iommu_free_reserved_iova_domain(struct iommu_domain *domain);
+extern int iommu_get_single_reserved(struct iommu_domain *domain,
+				     phys_addr_t paddr, int prot,
+				     dma_addr_t *iova);
+extern void iommu_put_single_reserved(struct iommu_domain *domain,
+				      dma_addr_t iova);
 struct device *iommu_device_create(struct device *parent, void *drvdata,
 				   const struct attribute_group **groups,
 				   const char *fmt, ...) __printf(4, 5);
@@ -562,6 +582,17 @@ static void iommu_free_reserved_iova_domain(struct iommu_domain *domain)
 {
 }
 
+static inline int iommu_get_single_reserved(struct iommu_domain *domain,
+					    phys_addr_t paddr, int prot,
+					    dma_addr_t *iova)
+{
+	return -EINVAL;
+}
+
+static inline void iommu_put_single_reserved(struct iommu_domain *domain,
+					     dma_addr_t iova)
+{
+}
+
 #endif /* CONFIG_IOMMU_API */
 
 #endif /* __LINUX_IOMMU_H */
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [RFC v2 08/15] iommu/arm-smmu: implement iommu_get/put_single_reserved
@ 2016-02-11 14:34   ` Eric Auger
  0 siblings, 0 replies; 48+ messages in thread
From: Eric Auger @ 2016-02-11 14:34 UTC (permalink / raw)
  To: eric.auger, eric.auger, alex.williamson, will.deacon, joro, tglx,
	jason, marc.zyngier, christoffer.dall, linux-arm-kernel, kvmarm,
	kvm
  Cc: suravee.suthikulpanit, patches, linux-kernel, Manish.Jaggi,
	Bharat.Bhushan, pranav.sawargaonkar, p.fedin, iommu,
	sherry.hurwitz, brijesh.singh, leo.duran, Thomas.Lendacky

Implement the iommu_get/put_single_reserved API in arm-smmu.

In order to track which physical addresses are already mapped, we
use the RB tree indexed by PA.

Signed-off-by: Eric Auger <eric.auger@linaro.org>

---

v1 -> v2:
- previously in vfio_iommu_type1.c
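
A worked example of the alignment handling (hypothetical numbers,
4kB pages): for a doorbell at PA 0x80041a4,

	aligned_addr = 0x80041a4 & ~0xfff;		/* 0x8004000 */
	offset       = 0x80041a4 - aligned_addr;	/* 0x1a4     */

a full page IOVA is allocated and mapped onto 0x8004000, and the
value returned through *iova is that page's IOVA plus 0x1a4.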
---
 drivers/iommu/arm-smmu.c | 114 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 114 insertions(+)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 729a4c6..9961bfd 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -1563,6 +1563,118 @@ static void arm_smmu_free_reserved_iova_domain(struct iommu_domain *domain)
 	mutex_unlock(&smmu_domain->reserved_mutex);
 }
 
+static int arm_smmu_get_single_reserved(struct iommu_domain *domain,
+					phys_addr_t addr, int prot,
+					dma_addr_t *iova)
+{
+	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+	unsigned long order = __ffs(domain->ops->pgsize_bitmap);
+	size_t page_size = 1 << order;
+	phys_addr_t mask = page_size - 1;
+	phys_addr_t aligned_addr = addr & ~mask;
+	phys_addr_t offset = addr - aligned_addr;
+	struct arm_smmu_reserved_binding *b;
+	struct iova *p_iova;
+	struct iova_domain *iovad = smmu_domain->reserved_iova_domain;
+	int ret;
+
+	if (!iovad)
+		return -EINVAL;
+
+	mutex_lock(&smmu_domain->reserved_mutex);
+
+	b = find_reserved_binding(smmu_domain, aligned_addr, page_size);
+	if (b) {
+		*iova = b->iova + offset;
+		kref_get(&b->kref);
+		ret = 0;
+		goto unlock;
+	}
+
+	/* there is no existing reserved iova for this pa */
+	p_iova = alloc_iova(iovad, 1, iovad->dma_32bit_pfn, true);
+	if (!p_iova) {
+		ret = -ENOMEM;
+		goto unlock;
+	}
+	*iova = p_iova->pfn_lo << order;
+
+	b = kzalloc(sizeof(*b), GFP_KERNEL);
+	if (!b) {
+		ret = -ENOMEM;
+		goto free_iova_unlock;
+	}
+
+	ret = arm_smmu_map(domain, *iova, aligned_addr, page_size, prot);
+	if (ret)
+		goto free_binding_iova_unlock;
+
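+	/*
+	 * kref_init() sets the count to 1 and kref_get() bumps it to 2,
+	 * so the binding survives its first put and is finally dropped
+	 * when the domain is destroyed
+	 */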
+	kref_init(&b->kref);
+	kref_get(&b->kref);
+	b->domain = smmu_domain;
+	b->addr = aligned_addr;
+	b->iova = *iova;
+	b->size = page_size;
+
+	link_reserved_binding(smmu_domain, b);
+
+	*iova += offset;
+	goto unlock;
+
+free_binding_iova_unlock:
+	kfree(b);
+free_iova_unlock:
+	free_iova(smmu_domain->reserved_iova_domain, *iova >> order);
+unlock:
+	mutex_unlock(&smmu_domain->reserved_mutex);
+	return ret;
+}
+
+/* called with reserved_mutex locked */
+static void reserved_binding_release(struct kref *kref)
+{
+	struct arm_smmu_reserved_binding *b =
+		container_of(kref, struct arm_smmu_reserved_binding, kref);
+	struct arm_smmu_domain *smmu_domain = b->domain;
+	struct iommu_domain *d = &smmu_domain->domain;
+	unsigned long order = __ffs(b->size);
+
+	arm_smmu_unmap(d, b->iova, b->size);
+	free_iova(smmu_domain->reserved_iova_domain, b->iova >> order);
+	unlink_reserved_binding(smmu_domain, b);
+	kfree(b);
+}
+
+static void arm_smmu_put_single_reserved(struct iommu_domain *domain,
+					 dma_addr_t iova)
+{
+	unsigned long order;
+	phys_addr_t aligned_addr;
+	dma_addr_t aligned_iova, page_size, mask, offset;
+	struct arm_smmu_reserved_binding *b;
+	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+
+	order = __ffs(domain->ops->pgsize_bitmap);
+	page_size = (uint64_t)1 << order;
+	mask = page_size - 1;
+
+	aligned_iova = iova & ~mask;
+	offset = iova - aligned_iova;
+
+	aligned_addr = iommu_iova_to_phys(domain, aligned_iova);
+
+	mutex_lock(&smmu_domain->reserved_mutex);
+
+	b = find_reserved_binding(smmu_domain, aligned_addr, page_size);
+	if (!b)
+		goto unlock;
+	kref_put(&b->kref, reserved_binding_release);
+
+unlock:
+	mutex_unlock(&smmu_domain->reserved_mutex);
+}
+
 static struct iommu_ops arm_smmu_ops = {
 	.capable			= arm_smmu_capable,
 	.domain_alloc			= arm_smmu_domain_alloc,
@@ -1580,6 +1692,8 @@ static struct iommu_ops arm_smmu_ops = {
 	.domain_set_attr		= arm_smmu_domain_set_attr,
 	.alloc_reserved_iova_domain	= arm_smmu_alloc_reserved_iova_domain,
 	.free_reserved_iova_domain	= arm_smmu_free_reserved_iova_domain,
+	.get_single_reserved		= arm_smmu_get_single_reserved,
+	.put_single_reserved		= arm_smmu_put_single_reserved,
 	/* Page size bitmap, restricted during device attach */
 	.pgsize_bitmap			= -1UL,
 };
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [RFC v2 09/15] iommu/arm-smmu: relinquish reserved resources on domain deletion
@ 2016-02-11 14:34   ` Eric Auger
  0 siblings, 0 replies; 48+ messages in thread
From: Eric Auger @ 2016-02-11 14:34 UTC (permalink / raw)
  To: eric.auger, eric.auger, alex.williamson, will.deacon, joro, tglx,
	jason, marc.zyngier, christoffer.dall, linux-arm-kernel, kvmarm,
	kvm
  Cc: suravee.suthikulpanit, patches, linux-kernel, Manish.Jaggi,
	Bharat.Bhushan, pranav.sawargaonkar, p.fedin, iommu,
	sherry.hurwitz, brijesh.singh, leo.duran, Thomas.Lendacky

arm_smmu_unmap_reserved releases all reserved binding resources: it
destroys all bindings, frees their iovas and frees the iova_domain.
This happens on domain deletion.

Signed-off-by: Eric Auger <eric.auger@linaro.org>
---
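
For reference, the teardown path after this patch is:

	arm_smmu_domain_free()
	    arm_smmu_destroy_domain_context()
	    arm_smmu_unmap_reserved()
	        reserved_binding_release()	/* for each binding */
	        __arm_smmu_free_reserved_iova_domain()
	    kfree(smmu_domain)
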
 drivers/iommu/arm-smmu.c | 34 +++++++++++++++++++++++++++++-----
 1 file changed, 29 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 9961bfd..ae8a97d 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -363,6 +363,7 @@ struct arm_smmu_reserved_binding {
 	dma_addr_t		iova;
 	size_t			size;
 };
+static void arm_smmu_unmap_reserved(struct iommu_domain *domain);
 
 static struct iommu_ops arm_smmu_ops;
 
@@ -1057,6 +1058,7 @@ static void arm_smmu_domain_free(struct iommu_domain *domain)
 	 * already been detached.
 	 */
 	arm_smmu_destroy_domain_context(domain);
+	arm_smmu_unmap_reserved(domain);
 	kfree(smmu_domain);
 }
 
@@ -1547,19 +1549,23 @@ unlock:
 	return ret;
 }
 
-static void arm_smmu_free_reserved_iova_domain(struct iommu_domain *domain)
+static void __arm_smmu_free_reserved_iova_domain(struct arm_smmu_domain *sd)
 {
-	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
-	struct iova_domain *iovad = smmu_domain->reserved_iova_domain;
+	struct iova_domain *iovad = sd->reserved_iova_domain;
 
 	if (!iovad)
 		return;
 
-	mutex_lock(&smmu_domain->reserved_mutex);
-
 	put_iova_domain(iovad);
 	kfree(iovad);
+	/* avoid a dangling pointer / double free at domain destruction */
+	sd->reserved_iova_domain = NULL;
+}
 
+static void arm_smmu_free_reserved_iova_domain(struct iommu_domain *domain)
+{
+	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+
+	mutex_lock(&smmu_domain->reserved_mutex);
+	__arm_smmu_free_reserved_iova_domain(smmu_domain);
 	mutex_unlock(&smmu_domain->reserved_mutex);
 }
 
@@ -1675,6 +1681,24 @@ unlock:
 	mutex_unlock(&smmu_domain->reserved_mutex);
 }
 
+static void arm_smmu_unmap_reserved(struct iommu_domain *domain)
+{
+	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+	struct rb_node *node;
+
+	mutex_lock(&smmu_domain->reserved_mutex);
+	while ((node = rb_first(&smmu_domain->reserved_binding_list))) {
+		struct arm_smmu_reserved_binding *b =
+			rb_entry(node, struct arm_smmu_reserved_binding, node);
+
+		while (!kref_put(&b->kref, reserved_binding_release))
+			;
+	}
+	smmu_domain->reserved_binding_list = RB_ROOT;
+	__arm_smmu_free_reserved_iova_domain(smmu_domain);
+	mutex_unlock(&smmu_domain->reserved_mutex);
+}
+
 static struct iommu_ops arm_smmu_ops = {
 	.capable			= arm_smmu_capable,
 	.domain_alloc			= arm_smmu_domain_alloc,
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [RFC v2 10/15] vfio: allow the user to register reserved iova range for MSI mapping
@ 2016-02-11 14:34   ` Eric Auger
  0 siblings, 0 replies; 48+ messages in thread
From: Eric Auger @ 2016-02-11 14:34 UTC (permalink / raw)
  To: eric.auger, eric.auger, alex.williamson, will.deacon, joro, tglx,
	jason, marc.zyngier, christoffer.dall, linux-arm-kernel, kvmarm,
	kvm
  Cc: suravee.suthikulpanit, patches, linux-kernel, Manish.Jaggi,
	Bharat.Bhushan, pranav.sawargaonkar, p.fedin, iommu,
	sherry.hurwitz, brijesh.singh, leo.duran, Thomas.Lendacky

The user is allowed to register a reserved IOVA range by using the
DMA MAP API and setting the new flag VFIO_DMA_MAP_FLAG_MSI_RESERVED_IOVA,
providing the base address and the size. This region is stored in the
vfio_dma rb tree. At that point the IOVA range is not mapped to any target
address yet. The host kernel will use those IOVAs when needed, typically
when the VFIO-PCI device allocates its MSIs.

This patch also handles the destruction of the reserved binding RB-tree
and of the domains' iova_domains.
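
As an illustration, a minimal (hypothetical) userspace sketch of the
registration, assuming container_fd is an already configured VFIO
container fd and using the flag introduced below:

	struct vfio_iommu_type1_dma_map map = {
		.argsz = sizeof(map),
		.flags = VFIO_DMA_MAP_FLAG_MSI_RESERVED_IOVA,
		.iova  = 0x8000000,	/* example base, page aligned */
		.size  = 0x10000,	/* example size, page aligned */
		/* vaddr and the READ/WRITE flags are ignored here */
	};

	if (ioctl(container_fd, VFIO_IOMMU_MAP_DMA, &map))
		perror("MSI reserved iova registration");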

Signed-off-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com>

---

v1 -> v2:
- set returned value according to alloc_reserved_iova_domain result
- free the iova domains in case any error occurs

RFC v1 -> v1:
- takes into account Alex's comments, based on
  [RFC PATCH 1/6] vfio: Add interface for add/del reserved iova region:
- use the existing dma map/unmap ioctl interface with a flag to register
  a reserved IOVA range. A single reserved iova region is allowed.
---
 drivers/vfio/vfio_iommu_type1.c | 75 ++++++++++++++++++++++++++++++++++++++++-
 include/uapi/linux/vfio.h       |  9 +++++
 2 files changed, 83 insertions(+), 1 deletion(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index b9326c9..c5d3b48 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -673,6 +673,75 @@ static int vfio_dma_do_map(struct vfio_iommu *iommu,
 	return ret;
 }
 
+static int vfio_register_reserved_iova_range(struct vfio_iommu *iommu,
+			   struct vfio_iommu_type1_dma_map *map)
+{
+	dma_addr_t iova = map->iova;
+	size_t size = map->size;
+	uint64_t mask;
+	struct vfio_dma *dma;
+	int ret = 0;
+	struct vfio_domain *d;
+	unsigned long order;
+
+	/* Verify that none of our __u64 fields overflow */
+	if (map->size != size || map->iova != iova)
+		return -EINVAL;
+
+	order = __ffs(vfio_pgsize_bitmap(iommu));
+	mask = ((uint64_t)1 << order) - 1;
+
+	WARN_ON(mask & PAGE_MASK);
+
+	/* we currently only support MSI_RESERVED_IOVA */
+	if (!(map->flags & VFIO_DMA_MAP_FLAG_MSI_RESERVED_IOVA))
+		return -EINVAL;
+
+	if (!size || (size | iova) & mask)
+		return -EINVAL;
+
+	/* Don't allow IOVA address wrap */
+	if (iova + size - 1 < iova)
+		return -EINVAL;
+
+	mutex_lock(&iommu->lock);
+
+	/* check if the iova domain has not been instantiated already */
+	d = list_first_entry(&iommu->domain_list,
+				  struct vfio_domain, next);
+
+	if (vfio_find_dma(iommu, iova, size)) {
+		ret = -EEXIST;
+		goto out;
+	}
+
+	dma = kzalloc(sizeof(*dma), GFP_KERNEL);
+	if (!dma) {
+		ret = -ENOMEM;
+		goto out;
+	}
+
+	dma->iova = iova;
+	dma->size = size;
+	dma->type = VFIO_IOVA_RESERVED;
+
+	list_for_each_entry(d, &iommu->domain_list, next)
+		ret |= iommu_alloc_reserved_iova_domain(d->domain, iova,
+							size, order);
+
+	if (ret) {
+		list_for_each_entry(d, &iommu->domain_list, next)
+			iommu_free_reserved_iova_domain(d->domain);
+		goto out;
+	}
+
+	vfio_link_dma(iommu, dma);
+
+out:
+	mutex_unlock(&iommu->lock);
+	return ret;
+}
+
 static int vfio_bus_type(struct device *dev, void *data)
 {
 	struct bus_type **bus = data;
@@ -1045,7 +1114,8 @@ static long vfio_iommu_type1_ioctl(void *iommu_data,
 	} else if (cmd == VFIO_IOMMU_MAP_DMA) {
 		struct vfio_iommu_type1_dma_map map;
 		uint32_t mask = VFIO_DMA_MAP_FLAG_READ |
-				VFIO_DMA_MAP_FLAG_WRITE;
+				VFIO_DMA_MAP_FLAG_WRITE |
+				VFIO_DMA_MAP_FLAG_MSI_RESERVED_IOVA;
 
 		minsz = offsetofend(struct vfio_iommu_type1_dma_map, size);
 
@@ -1055,6 +1125,9 @@ static long vfio_iommu_type1_ioctl(void *iommu_data,
 		if (map.argsz < minsz || map.flags & ~mask)
 			return -EINVAL;
 
+		if (map.flags & VFIO_DMA_MAP_FLAG_MSI_RESERVED_IOVA)
+			return vfio_register_reserved_iova_range(iommu, &map);
+
 		return vfio_dma_do_map(iommu, &map);
 
 	} else if (cmd == VFIO_IOMMU_UNMAP_DMA) {
diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index 43e183b..982e326 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -411,12 +411,21 @@ struct vfio_iommu_type1_info {
  *
  * Map process virtual addresses to IO virtual addresses using the
  * provided struct vfio_dma_map. Caller sets argsz. READ &/ WRITE required.
+ *
+ * In case MSI_RESERVED_IOVA is set, the API only aims at registering an IOVA
+ * region which will be used on some platforms to map the host MSI frame.
+ * In that specific case, vaddr and prot are ignored. The requirement for
+ * provisioning such an IOVA range can be checked by calling
+ * VFIO_IOMMU_GET_INFO with the VFIO_IOMMU_INFO_REQUIRE_MSI_MAP attribute.
+ * A single MSI_RESERVED_IOVA region can be registered.
  */
 struct vfio_iommu_type1_dma_map {
 	__u32	argsz;
 	__u32	flags;
 #define VFIO_DMA_MAP_FLAG_READ (1 << 0)		/* readable from device */
 #define VFIO_DMA_MAP_FLAG_WRITE (1 << 1)	/* writable from device */
+/* reserved iova for MSI vectors */
+#define VFIO_DMA_MAP_FLAG_MSI_RESERVED_IOVA (1 << 2)
 	__u64	vaddr;				/* Process virtual address */
 	__u64	iova;				/* IO virtual address */
 	__u64	size;				/* Size of mapping (bytes) */
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [RFC v2 11/15] msi: Add a new MSI_FLAG_IRQ_REMAPPING flag
@ 2016-02-11 14:34   ` Eric Auger
  0 siblings, 0 replies; 48+ messages in thread
From: Eric Auger @ 2016-02-11 14:34 UTC (permalink / raw)
  To: eric.auger, eric.auger, alex.williamson, will.deacon, joro, tglx,
	jason, marc.zyngier, christoffer.dall, linux-arm-kernel, kvmarm,
	kvm
  Cc: suravee.suthikulpanit, patches, linux-kernel, Manish.Jaggi,
	Bharat.Bhushan, pranav.sawargaonkar, p.fedin, iommu,
	sherry.hurwitz, brijesh.singh, leo.duran, Thomas.Lendacky

Let's introduce a new msi_domain_info flag value, MSI_FLAG_IRQ_REMAPPING,
meant to indicate that the domain supports IRQ remapping, also known as
Interrupt Translation Service. On Intel HW this capability is abstracted
on the IOMMU side while on ARM it is abstracted on the MSI controller side.

GICv3 ITS is the first HW advertising that feature.
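
For illustration, a consumer may test the flag as follows (patch 13 of
this series does this in VFIO); a minimal sketch, assuming dev is a
device whose MSI parent domain has been set:

	struct irq_domain *d = dev_get_msi_domain(dev);
	struct msi_domain_info *info;

	if (d) {
		info = msi_get_domain_info(d);
		if (info && (info->flags & MSI_FLAG_IRQ_REMAPPING))
			; /* MSIs are translated/filtered by the HW */
	}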

Signed-off-by: Eric Auger <eric.auger@linaro.org>
---
 drivers/irqchip/irq-gic-v3-its.c | 1 +
 include/linux/msi.h              | 2 ++
 2 files changed, 3 insertions(+)

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index 3447549..a5e0d8b 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -1544,6 +1544,7 @@ static int its_probe(struct device_node *node, struct irq_domain *parent)
 		inner_domain->parent = parent;
 		inner_domain->bus_token = DOMAIN_BUS_NEXUS;
 		info->ops = &its_msi_domain_ops;
+		info->flags |= MSI_FLAG_IRQ_REMAPPING;
 		info->data = its;
 		inner_domain->host_data = info;
 	}
diff --git a/include/linux/msi.h b/include/linux/msi.h
index a2a0068..03eda72 100644
--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -261,6 +261,8 @@ enum {
 	MSI_FLAG_MULTI_PCI_MSI		= (1 << 3),
 	/* Support PCI MSIX interrupts */
 	MSI_FLAG_PCI_MSIX		= (1 << 4),
+	/* Support MSI IRQ remapping service */
+	MSI_FLAG_IRQ_REMAPPING		= (1 << 5),
 };
 
 int msi_domain_set_affinity(struct irq_data *data, const struct cpumask *mask,
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [RFC v2 12/15] msi: export msi_get_domain_info
@ 2016-02-11 14:34   ` Eric Auger
  0 siblings, 0 replies; 48+ messages in thread
From: Eric Auger @ 2016-02-11 14:34 UTC (permalink / raw)
  To: eric.auger, eric.auger, alex.williamson, will.deacon, joro, tglx,
	jason, marc.zyngier, christoffer.dall, linux-arm-kernel, kvmarm,
	kvm
  Cc: suravee.suthikulpanit, patches, linux-kernel, Manish.Jaggi,
	Bharat.Bhushan, pranav.sawargaonkar, p.fedin, iommu,
	sherry.hurwitz, brijesh.singh, leo.duran, Thomas.Lendacky

We plan to use msi_get_domain_info in the VFIO module, so let's export it.

Signed-off-by: Eric Auger <eric.auger@linaro.org>
---
 include/linux/msi.h | 4 ++++
 kernel/irq/msi.c    | 1 +
 2 files changed, 5 insertions(+)

diff --git a/include/linux/msi.h b/include/linux/msi.h
index 03eda72..df19315 100644
--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -323,6 +323,10 @@ static inline struct irq_domain *pci_msi_get_device_domain(struct pci_dev *pdev)
 {
 	return NULL;
 }
+static inline struct msi_domain_info *msi_get_domain_info(struct irq_domain *domain)
+{
+	return NULL;
+}
 #endif /* CONFIG_PCI_MSI_IRQ_DOMAIN */
 
 #endif /* LINUX_MSI_H */
diff --git a/kernel/irq/msi.c b/kernel/irq/msi.c
index 38e89ce..9b0ba4a 100644
--- a/kernel/irq/msi.c
+++ b/kernel/irq/msi.c
@@ -400,5 +400,6 @@ struct msi_domain_info *msi_get_domain_info(struct irq_domain *domain)
 {
 	return (struct msi_domain_info *)domain->host_data;
 }
+EXPORT_SYMBOL_GPL(msi_get_domain_info);
 
 #endif /* CONFIG_GENERIC_MSI_IRQ_DOMAIN */
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [RFC v2 13/15] vfio/type1: also check IRQ remapping capability at msi domain
@ 2016-02-11 14:34   ` Eric Auger
  0 siblings, 0 replies; 48+ messages in thread
From: Eric Auger @ 2016-02-11 14:34 UTC (permalink / raw)
  To: eric.auger, eric.auger, alex.williamson, will.deacon, joro, tglx,
	jason, marc.zyngier, christoffer.dall, linux-arm-kernel, kvmarm,
	kvm
  Cc: suravee.suthikulpanit, patches, linux-kernel, Manish.Jaggi,
	Bharat.Bhushan, pranav.sawargaonkar, p.fedin, iommu,
	sherry.hurwitz, brijesh.singh, leo.duran, Thomas.Lendacky

On x86 IRQ remapping is abstracted by the IOMMU. On ARM it is abstracted
by the MSI controller. vfio_msi_parent_irq_remapping_capable allows
checking whether interrupts are "safe" for a given device. They are if
the device does not use MSI, or if the device uses MSI and the msi-parent
controller supports IRQ remapping.

Then we check at group level whether all devices have safe interrupts: if
not, the group is only allowed to attach if allow_unsafe_interrupts is set.
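
The group-level result relies on the iommu_group_for_each_dev() contract:
iteration stops at, and returns, the first nonzero callback value. Hence
the negation used in the diff below yields 1 only when every device
passed the check:

	/* 1 only if every device's msi-parent advertises
	 * MSI_FLAG_IRQ_REMAPPING; the callback returns -1 as soon
	 * as one msi-parent does not */
	irq_remapping = !iommu_group_for_each_dev(iommu_group, &bus,
				       vfio_msi_parent_irq_remapping_capable);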

At this point the ARM SMMU still advertises IOMMU_CAP_INTR_REMAP. This is
changed in the next patch.

Signed-off-by: Eric Auger <eric.auger@linaro.org>
---
 drivers/vfio/vfio_iommu_type1.c | 38 ++++++++++++++++++++++++++++++++++++--
 1 file changed, 36 insertions(+), 2 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index c5d3b48..080321b 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -36,6 +36,8 @@
 #include <linux/uaccess.h>
 #include <linux/vfio.h>
 #include <linux/workqueue.h>
+#include <linux/irqdomain.h>
+#include <linux/msi.h>
 
 #define DRIVER_VERSION  "0.2"
 #define DRIVER_AUTHOR   "Alex Williamson <alex.williamson@redhat.com>"
@@ -754,6 +756,31 @@ static int vfio_bus_type(struct device *dev, void *data)
 	return 0;
 }
 
+/**
+ * vfio_msi_parent_irq_remapping_capable: returns whether the device msi-parent
+ * controller supports IRQ remapping, aka interrupt translation
+ *
+ * @dev: device handle
+ * @data: unused
+ * returns 0 if irq remapping is supported or -1 if not supported.
+ */
+static int vfio_msi_parent_irq_remapping_capable(struct device *dev, void *data)
+{
+	struct irq_domain *domain;
+	struct msi_domain_info *info;
+
+	domain = dev_get_msi_domain(dev);
+	if (!domain)
+		return 0;
+
+	info = msi_get_domain_info(domain);
+
+	if (!info || !(info->flags & MSI_FLAG_IRQ_REMAPPING))
+		return -1;
+
+	return 0;
+}
+
 static int vfio_iommu_replay(struct vfio_iommu *iommu,
 			     struct vfio_domain *domain)
 {
@@ -848,7 +875,7 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 	struct vfio_group *group, *g;
 	struct vfio_domain *domain, *d;
 	struct bus_type *bus = NULL;
-	int ret;
+	int ret, irq_remapping;
 
 	mutex_lock(&iommu->lock);
 
@@ -871,6 +898,13 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 
 	group->iommu_group = iommu_group;
 
+	/*
+	 * Determine if all the devices of the group have an MSI-parent that
+	 * supports IRQ remapping
+	 */
+	irq_remapping = !iommu_group_for_each_dev(iommu_group, &bus,
+				       vfio_msi_parent_irq_remapping_capable);
+
 	/* Determine bus_type in order to allocate a domain */
 	ret = iommu_group_for_each_dev(iommu_group, &bus, vfio_bus_type);
 	if (ret)
@@ -899,7 +933,7 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 	list_add(&group->next, &domain->group_list);
 
 	if (!allow_unsafe_interrupts &&
-	    !iommu_capable(bus, IOMMU_CAP_INTR_REMAP)) {
+	    (!iommu_capable(bus, IOMMU_CAP_INTR_REMAP) && !irq_remapping)) {
 		pr_warn("%s: No interrupt remapping support.  Use the module param \"allow_unsafe_interrupts\" to enable VFIO IOMMU support on this platform\n",
 		       __func__);
 		ret = -EPERM;
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [RFC v2 14/15] iommu/arm-smmu: do not advertise IOMMU_CAP_INTR_REMAP
@ 2016-02-11 14:34   ` Eric Auger
  0 siblings, 0 replies; 48+ messages in thread
From: Eric Auger @ 2016-02-11 14:34 UTC (permalink / raw)
  To: eric.auger, eric.auger, alex.williamson, will.deacon, joro, tglx,
	jason, marc.zyngier, christoffer.dall, linux-arm-kernel, kvmarm,
	kvm
  Cc: suravee.suthikulpanit, patches, linux-kernel, Manish.Jaggi,
	Bharat.Bhushan, pranav.sawargaonkar, p.fedin, iommu,
	sherry.hurwitz, brijesh.singh, leo.duran, Thomas.Lendacky

Do not advertise IOMMU_CAP_INTR_REMAP for arm-smmu. Indeed the
IRQ remapping capability is abstracted on the irqchip side for ARM, as
opposed to the Intel IOMMU which features IRQ remapping HW.

So to check the IRQ remapping capability, the MSI domain needs to be
checked instead.

Signed-off-by: Eric Auger <eric.auger@linaro.org>
---
 drivers/iommu/arm-smmu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index ae8a97d..9a83285 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -1354,7 +1354,7 @@ static bool arm_smmu_capable(enum iommu_cap cap)
 		 */
 		return true;
 	case IOMMU_CAP_INTR_REMAP:
-		return true; /* MSIs are just memory writes */
+		return false; /* IRQ remapping is the MSI controller's job */
 	case IOMMU_CAP_NOEXEC:
 		return true;
 	default:
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [RFC v2 15/15] irqchip/gicv2m/v3-its-pci-msi: IOMMU map the MSI frame when needed
@ 2016-02-11 14:34   ` Eric Auger
  0 siblings, 0 replies; 48+ messages in thread
From: Eric Auger @ 2016-02-11 14:34 UTC (permalink / raw)
  To: eric.auger, eric.auger, alex.williamson, will.deacon, joro, tglx,
	jason, marc.zyngier, christoffer.dall, linux-arm-kernel, kvmarm,
	kvm
  Cc: suravee.suthikulpanit, patches, linux-kernel, Manish.Jaggi,
	Bharat.Bhushan, pranav.sawargaonkar, p.fedin, iommu,
	sherry.hurwitz, brijesh.singh, leo.duran, Thomas.Lendacky

In case the msi_desc references a device attached to an IOMMU
domain, the MSI address needs to be mapped in the IOMMU. Otherwise any
MSI write transaction will cause a fault.

gic_set_msi_addr detects that case and allocates an iova bound
to the physical address page comprising the MSI frame. This iova
is then used as the msi_msg address. The unset operation decrements
the reference count on the binding.

The functions are called in the irq_write_msi_msg ops implementation.
At that time we can recognize whether the MSI is being set up or torn
down by looking at the msi_msg content. Indeed msi_domain_deactivate
zeroes all the fields.
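
A sketch of the expected flow for a device attached to an IOMMU domain
(function names from this series; the call chain through the MSI layer
is abridged):

	setup:    irq_write_msi_msg(msg != 0)
	            -> gic_pci_msi_domain_write_msg
	               -> gic_set_msi_addr    /* map doorbell, rewrite msg */
	teardown: msi_domain_deactivate() zeroes the msg
	            -> gic_pci_msi_domain_write_msg
	               -> gic_unset_msi_addr  /* drop the iova binding */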

Signed-off-by: Eric Auger <eric.auger@linaro.org>
---
 drivers/irqchip/irq-gic-common.c         | 60 ++++++++++++++++++++++++++++++++
 drivers/irqchip/irq-gic-common.h         |  5 +++
 drivers/irqchip/irq-gic-v2m.c            |  3 +-
 drivers/irqchip/irq-gic-v3-its-pci-msi.c |  3 +-
 4 files changed, 69 insertions(+), 2 deletions(-)

diff --git a/drivers/irqchip/irq-gic-common.c b/drivers/irqchip/irq-gic-common.c
index f174ce0..690802b 100644
--- a/drivers/irqchip/irq-gic-common.c
+++ b/drivers/irqchip/irq-gic-common.c
@@ -18,6 +18,8 @@
 #include <linux/io.h>
 #include <linux/irq.h>
 #include <linux/irqchip/arm-gic.h>
+#include <linux/iommu.h>
+#include <linux/msi.h>
 
 #include "irq-gic-common.h"
 
@@ -121,3 +123,61 @@ void gic_cpu_config(void __iomem *base, void (*sync_access)(void))
 	if (sync_access)
 		sync_access();
 }
+
+int gic_set_msi_addr(struct irq_data *data, struct msi_msg *msg)
+{
+	struct msi_desc *desc = irq_data_get_msi_desc(data);
+	struct device *dev = msi_desc_to_dev(desc);
+	struct iommu_domain *d;
+	phys_addr_t addr;
+	dma_addr_t iova;
+	int ret;
+
+	d = iommu_get_domain_for_dev(dev);
+	if (!d)
+		return 0;
+
+	addr = ((phys_addr_t)(msg->address_hi) << 32) |
+		msg->address_lo;
+
+	ret = iommu_get_single_reserved(d, addr, IOMMU_WRITE, &iova);
+
+	if (!ret) {
+		msg->address_lo = lower_32_bits(iova);
+		msg->address_hi = upper_32_bits(iova);
+	}
+	return ret;
+}
+
+
+void gic_unset_msi_addr(struct irq_data *data)
+{
+	struct msi_desc *desc = irq_data_get_msi_desc(data);
+	struct device *dev;
+	struct iommu_domain *d;
+	dma_addr_t iova;
+
+	iova = ((dma_addr_t)(desc->msg.address_hi) << 32) |
+		desc->msg.address_lo;
+
+	dev = msi_desc_to_dev(desc);
+	if (!dev)
+		return;
+
+	d = iommu_get_domain_for_dev(dev);
+	if (!d)
+		return;
+
+	iommu_put_single_reserved(d, iova);
+}
+
+void gic_pci_msi_domain_write_msg(struct irq_data *irq_data,
+				  struct msi_msg *msg)
+{
+	if (!msg->address_hi && !msg->address_lo && !msg->data)
+		gic_unset_msi_addr(irq_data); /* deactivate */
+	else
+		gic_set_msi_addr(irq_data, msg); /* activate, set_affinity */
+
+	pci_msi_domain_write_msg(irq_data, msg);
+}
diff --git a/drivers/irqchip/irq-gic-common.h b/drivers/irqchip/irq-gic-common.h
index fff697d..e99e321 100644
--- a/drivers/irqchip/irq-gic-common.h
+++ b/drivers/irqchip/irq-gic-common.h
@@ -35,4 +35,9 @@ void gic_cpu_config(void __iomem *base, void (*sync_access)(void));
 void gic_enable_quirks(u32 iidr, const struct gic_quirk *quirks,
 		void *data);
 
+int gic_set_msi_addr(struct irq_data *data, struct msi_msg *msg);
+void gic_unset_msi_addr(struct irq_data *data);
+void gic_pci_msi_domain_write_msg(struct irq_data *irq_data,
+				  struct msi_msg *msg);
+
 #endif /* _IRQ_GIC_COMMON_H */
diff --git a/drivers/irqchip/irq-gic-v2m.c b/drivers/irqchip/irq-gic-v2m.c
index c779f83..5d7b89f 100644
--- a/drivers/irqchip/irq-gic-v2m.c
+++ b/drivers/irqchip/irq-gic-v2m.c
@@ -24,6 +24,7 @@
 #include <linux/of_pci.h>
 #include <linux/slab.h>
 #include <linux/spinlock.h>
+#include "irq-gic-common.h"
 
 /*
 * MSI_TYPER:
@@ -83,7 +84,7 @@ static struct irq_chip gicv2m_msi_irq_chip = {
 	.irq_mask		= gicv2m_mask_msi_irq,
 	.irq_unmask		= gicv2m_unmask_msi_irq,
 	.irq_eoi		= irq_chip_eoi_parent,
-	.irq_write_msi_msg	= pci_msi_domain_write_msg,
+	.irq_write_msi_msg	= gic_pci_msi_domain_write_msg,
 };
 
 static struct msi_domain_info gicv2m_msi_domain_info = {
diff --git a/drivers/irqchip/irq-gic-v3-its-pci-msi.c b/drivers/irqchip/irq-gic-v3-its-pci-msi.c
index aee60ed..6d5cbce 100644
--- a/drivers/irqchip/irq-gic-v3-its-pci-msi.c
+++ b/drivers/irqchip/irq-gic-v3-its-pci-msi.c
@@ -19,6 +19,7 @@
 #include <linux/of.h>
 #include <linux/of_irq.h>
 #include <linux/of_pci.h>
+#include "irq-gic-common.h"
 
 static void its_mask_msi_irq(struct irq_data *d)
 {
@@ -37,7 +38,7 @@ static struct irq_chip its_msi_irq_chip = {
 	.irq_unmask		= its_unmask_msi_irq,
 	.irq_mask		= its_mask_msi_irq,
 	.irq_eoi		= irq_chip_eoi_parent,
-	.irq_write_msi_msg	= pci_msi_domain_write_msg,
+	.irq_write_msi_msg	= gic_pci_msi_domain_write_msg,
 };
 
 struct its_pci_alias {
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [RFC v2 15/15] irqchip/gicv2m/v3-its-pci-msi: IOMMU map the MSI frame when needed
@ 2016-02-11 14:34   ` Eric Auger
  0 siblings, 0 replies; 48+ messages in thread
From: Eric Auger @ 2016-02-11 14:34 UTC (permalink / raw)
  To: eric.auger, eric.auger, alex.williamson, will.deacon, joro, tglx,
	jason, marc.zyngier, christoffer.dall, linux-arm-kernel, kvmarm,
	kvm
  Cc: Thomas.Lendacky, brijesh.singh, patches, Manish.Jaggi,
	linux-kernel, iommu, leo.duran, sherry.hurwitz

In case the msi_desc references a device attached to an iommu
domain, the msi address needs to be mapped in the IOMMU. Else any
MSI write transaction will cause a fault.

gic_set_msi_addr detects that case and allocates an iova bound
to the physical address page comprising the MSI frame. This iova
then is used as the msi_msg address. Unset operation decrements the
reference on the binding.

The functions are called in the irq_write_msi_msg ops implementation.
At that time we can recognize whether the msi is setup or teared down
looking at the msi_msg content. Indeed msi_domain_deactivate zeroes all
the fields.

Signed-off-by: Eric Auger <eric.auger@linaro.org>
---
 drivers/irqchip/irq-gic-common.c         | 60 ++++++++++++++++++++++++++++++++
 drivers/irqchip/irq-gic-common.h         |  5 +++
 drivers/irqchip/irq-gic-v2m.c            |  3 +-
 drivers/irqchip/irq-gic-v3-its-pci-msi.c |  3 +-
 4 files changed, 69 insertions(+), 2 deletions(-)

diff --git a/drivers/irqchip/irq-gic-common.c b/drivers/irqchip/irq-gic-common.c
index f174ce0..690802b 100644
--- a/drivers/irqchip/irq-gic-common.c
+++ b/drivers/irqchip/irq-gic-common.c
@@ -18,6 +18,8 @@
 #include <linux/io.h>
 #include <linux/irq.h>
 #include <linux/irqchip/arm-gic.h>
+#include <linux/iommu.h>
+#include <linux/msi.h>
 
 #include "irq-gic-common.h"
 
@@ -121,3 +123,61 @@ void gic_cpu_config(void __iomem *base, void (*sync_access)(void))
 	if (sync_access)
 		sync_access();
 }
+
+int gic_set_msi_addr(struct irq_data *data, struct msi_msg *msg)
+{
+	struct msi_desc *desc = irq_data_get_msi_desc(data);
+	struct device *dev = msi_desc_to_dev(desc);
+	struct iommu_domain *d;
+	phys_addr_t addr;
+	dma_addr_t iova;
+	int ret;
+
+	d = iommu_get_domain_for_dev(dev);
+	if (!d)
+		return 0;
+
+	addr = ((phys_addr_t)(msg->address_hi) << 32) |
+		msg->address_lo;
+
+	ret = iommu_get_single_reserved(d, addr, IOMMU_WRITE, &iova);
+
+	if (!ret) {
+		msg->address_lo = lower_32_bits(iova);
+		msg->address_hi = upper_32_bits(iova);
+	}
+	return ret;
+}
+
+
+void gic_unset_msi_addr(struct irq_data *data)
+{
+	struct msi_desc *desc = irq_data_get_msi_desc(data);
+	struct device *dev;
+	struct iommu_domain *d;
+	dma_addr_t iova;
+
+	iova = ((dma_addr_t)(desc->msg.address_hi) << 32) |
+		desc->msg.address_lo;
+
+	dev = msi_desc_to_dev(desc);
+	if (!dev)
+		return;
+
+	d = iommu_get_domain_for_dev(dev);
+	if (!d)
+		return;
+
+	iommu_put_single_reserved(d, iova);
+}
+
+void gic_pci_msi_domain_write_msg(struct irq_data *irq_data,
+				  struct msi_msg *msg)
+{
+	if (!msg->address_hi && !msg->address_lo && !msg->data)
+		gic_unset_msi_addr(irq_data); /* deactivate */
+	else
+		gic_set_msi_addr(irq_data, msg); /* activate, set_affinity */
+
+	pci_msi_domain_write_msg(irq_data, msg);
+}
diff --git a/drivers/irqchip/irq-gic-common.h b/drivers/irqchip/irq-gic-common.h
index fff697d..e99e321 100644
--- a/drivers/irqchip/irq-gic-common.h
+++ b/drivers/irqchip/irq-gic-common.h
@@ -35,4 +35,9 @@ void gic_cpu_config(void __iomem *base, void (*sync_access)(void));
 void gic_enable_quirks(u32 iidr, const struct gic_quirk *quirks,
 		void *data);
 
+int gic_set_msi_addr(struct irq_data *data, struct msi_msg *msg);
+void gic_unset_msi_addr(struct irq_data *data);
+void gic_pci_msi_domain_write_msg(struct irq_data *irq_data,
+				  struct msi_msg *msg);
+
 #endif /* _IRQ_GIC_COMMON_H */
diff --git a/drivers/irqchip/irq-gic-v2m.c b/drivers/irqchip/irq-gic-v2m.c
index c779f83..5d7b89f 100644
--- a/drivers/irqchip/irq-gic-v2m.c
+++ b/drivers/irqchip/irq-gic-v2m.c
@@ -24,6 +24,7 @@
 #include <linux/of_pci.h>
 #include <linux/slab.h>
 #include <linux/spinlock.h>
+#include "irq-gic-common.h"
 
 /*
 * MSI_TYPER:
@@ -83,7 +84,7 @@ static struct irq_chip gicv2m_msi_irq_chip = {
 	.irq_mask		= gicv2m_mask_msi_irq,
 	.irq_unmask		= gicv2m_unmask_msi_irq,
 	.irq_eoi		= irq_chip_eoi_parent,
-	.irq_write_msi_msg	= pci_msi_domain_write_msg,
+	.irq_write_msi_msg	= gic_pci_msi_domain_write_msg,
 };
 
 static struct msi_domain_info gicv2m_msi_domain_info = {
diff --git a/drivers/irqchip/irq-gic-v3-its-pci-msi.c b/drivers/irqchip/irq-gic-v3-its-pci-msi.c
index aee60ed..6d5cbce 100644
--- a/drivers/irqchip/irq-gic-v3-its-pci-msi.c
+++ b/drivers/irqchip/irq-gic-v3-its-pci-msi.c
@@ -19,6 +19,7 @@
 #include <linux/of.h>
 #include <linux/of_irq.h>
 #include <linux/of_pci.h>
+#include "irq-gic-common.h"
 
 static void its_mask_msi_irq(struct irq_data *d)
 {
@@ -37,7 +38,7 @@ static struct irq_chip its_msi_irq_chip = {
 	.irq_unmask		= its_unmask_msi_irq,
 	.irq_mask		= its_mask_msi_irq,
 	.irq_eoi		= irq_chip_eoi_parent,
-	.irq_write_msi_msg	= pci_msi_domain_write_msg,
+	.irq_write_msi_msg	= gic_pci_msi_domain_write_msg,
 };
 
 struct its_pci_alias {
-- 
1.9.1
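
Note: the reserved-IOVA helpers used in this patch are introduced in
[RFC v2 07/15] iommu: iommu_get/put_single_reserved. A sketch of their
contracts, inferred from the call sites above and the cover letter
(the exact signatures and documentation live in that patch):

	/*
	 * Bind the physical page containing @addr at a reserved IOVA in
	 * @domain, or take an extra reference if such a binding already
	 * exists, and return the IOVA through @iova.
	 */
	int iommu_get_single_reserved(struct iommu_domain *domain,
				      phys_addr_t addr, int prot,
				      dma_addr_t *iova);

	/* Drop one reference on the binding; unmap when it hits zero. */
	void iommu_put_single_reserved(struct iommu_domain *domain,
				       dma_addr_t iova);

The reference count lets several MSIs from the same domain share one
mapping of the doorbell page, matching the set/unset pairing in
gic_set_msi_addr/gic_unset_msi_addr above.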


Thread overview:
2016-02-11 14:34 [RFC v2 00/15] KVM PCIe/MSI passthrough on ARM/ARM64 Eric Auger
2016-02-11 14:34 ` [RFC v2 01/15] iommu: Add DOMAIN_ATTR_MSI_MAPPING attribute Eric Auger
2016-02-11 14:34 ` [RFC v2 02/15] vfio: expose MSI mapping requirement through VFIO_IOMMU_GET_INFO Eric Auger
2016-02-11 14:34 ` [RFC v2 03/15] vfio: introduce VFIO_IOVA_RESERVED vfio_dma type Eric Auger
2016-02-11 14:34 ` [RFC v2 04/15] iommu: add alloc/free_reserved_iova_domain Eric Auger
2016-02-11 14:34 ` [RFC v2 05/15] iommu/arm-smmu: implement alloc/free_reserved_iova_domain Eric Auger
2016-02-11 14:34 ` [RFC v2 06/15] iommu/arm-smmu: add a reserved binding RB tree Eric Auger
2016-02-11 14:34 ` [RFC v2 07/15] iommu: iommu_get/put_single_reserved Eric Auger
2016-02-11 14:34 ` [RFC v2 08/15] iommu/arm-smmu: implement iommu_get/put_single_reserved Eric Auger
2016-02-11 14:34 ` [RFC v2 09/15] iommu/arm-smmu: relinquish reserved resources on domain deletion Eric Auger
2016-02-11 14:34 ` [RFC v2 10/15] vfio: allow the user to register reserved iova range for MSI mapping Eric Auger
2016-02-11 14:34 ` [RFC v2 11/15] msi: Add a new MSI_FLAG_IRQ_REMAPPING flag Eric Auger
2016-02-11 14:34 ` [RFC v2 12/15] msi: export msi_get_domain_info Eric Auger
2016-02-11 14:34 ` [RFC v2 13/15] vfio/type1: also check IRQ remapping capability at msi domain Eric Auger
2016-02-11 14:34 ` [RFC v2 14/15] iommu/arm-smmu: do not advertise IOMMU_CAP_INTR_REMAP Eric Auger
2016-02-11 14:34 ` [RFC v2 15/15] irqchip/gicv2m/v3-its-pci-msi: IOMMU map the MSI frame when needed Eric Auger
