* [PATCH v7 00/10] KVM PCIe/MSI passthrough on ARM/ARM64: kernel part 1/3: iommu changes
@ 2016-04-19 16:56 ` Eric Auger
From: Eric Auger @ 2016-04-19 16:56 UTC (permalink / raw)
  To: eric.auger, eric.auger, robin.murphy, alex.williamson,
	will.deacon, joro, tglx, jason, marc.zyngier, christoffer.dall,
	linux-arm-kernel
  Cc: patches, linux-kernel, Bharat.Bhushan, pranav.sawargaonkar,
	p.fedin, iommu, Jean-Philippe.Brucker, julien.grall

This series introduces the dma-reserved-iommu API used to:

- create/destroy an iova domain dedicated to reserved iova bindings
- map/unmap physical addresses onto reserved IOVAs
- search for an existing reserved iova mapping matching a PA window
- determine whether an MSI needs to be IOMMU mapped
- translate an msi_msg PA address into its IOVA counterpart

Currently, reserved IOVAs are meant to map MSI physical doorbells. A
single reserved iova domain exists per iommu domain.

Also a new domain attribute is introduced to signal whether the MSI
addresses must be mapped in the IOMMU.

In the current usage:
the VFIO subsystem is expected to create/destroy the iommu reserved
domain, while the MSI layer allocates/frees the iova mappings.

Since several drivers are likely to use the same doorbell, reference
counting takes place on the bindings. An RB-tree indexed by PA is used
to quickly look up existing mappings at MSI message composition time.
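
For illustration, each binding could be tracked with a small refcounted
node keyed by the doorbell PA (a sketch only; field names do not
necessarily match those introduced later in the series):

	struct iommu_reserved_binding {
		struct rb_node node;	/* RB-tree linkage, indexed by PA */
		phys_addr_t addr;	/* doorbell base physical address */
		dma_addr_t iova;	/* reserved IOVA it is mapped onto */
		size_t size;		/* size of the binding (page aligned) */
		struct kref kref;	/* shared by all MSIs using this doorbell */
	};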

More details & context can be found at:
http://www.linaro.org/blog/core-dump/kvm-pciemsi-passthrough-armarm64/

Best Regards

Eric

Git: complete series available at
https://git.linaro.org/people/eric.auger/linux.git/shortlog/refs/heads/v4.6-rc4-pcie-passthrough-v7

History:

v6 -> v7:
- fixed known lock bugs and multiple page sized slots matching
  (I currently only have a single MSI frame made of a single page)
- reserved_iova_cookie now pointing to a struct that encapsulates the
  iova domain handle + protection attribute passed from VFIO (Alex' req)
- 2 new functions exposed: iommu_msi_mapping_translate_msg,
  iommu_msi_mapping_desc_to_domain: not sure this is the right location/proto
  though
- iommu_put_reserved_iova now takes a phys_addr_t
- everything now is cleanup on iommu_domain destruction

RFC v5 -> patch v6:
- split to ease the review process
- in dma-reserved-api use a spin lock instead of a mutex (reported by
  Jean-Philippe)
- revisit iommu_get_reserved_iova API to pass a size parameter upon
  Marc's request
- Consistently use the page order passed when creating the iova domain.
- init reserved_binding_list (reported by Julien)

RFC v4 -> RFC v5:
- take into account Thomas' comments on MSI related patches
  - split "msi: IOMMU map the doorbell address when needed"
  - increase readability and add comments
  - fix style issues
 - split "iommu: Add DOMAIN_ATTR_MSI_MAPPING attribute"
 - platform ITS now advertises IOMMU_CAP_INTR_REMAP
 - fix compilation issue with CONFIG_IOMMU API unset
 - arm-smmu-v3 now advertises DOMAIN_ATTR_MSI_MAPPING

RFC v3 -> v4:
- Move doorbell mapping/unmapping in msi.c
- fix ref count issue on set_affinity: in case of a change in the address
  the previous address is decremented
- doorbell map/unmap now is done on msi composition. Should allow the use
  case for platform MSI controllers
- create dma-reserved-iommu.h/c exposing/implementing a new API dedicated
  to reserved IOVA management (looking like dma-iommu glue)
- series reordering to ease the review:
  - first part is related to IOMMU
  - second related to MSI sub-system
  - third related to VFIO (except arm-smmu IOMMU_CAP_INTR_REMAP removal)
- expose the number of requested IOVA pages through VFIO_IOMMU_GET_INFO
  [this partially addresses Marc's comments on iommu_get/put_single_reserved
   size/alignment problematic - which I did not ignore - but I don't know
   how much I can do at the moment]

RFC v2 -> RFC v3:
- should fix wrong handling of some CONFIG combinations:
  CONFIG_IOVA, CONFIG_IOMMU_API, CONFIG_PCI_MSI_IRQ_DOMAIN
- fix MSI_FLAG_IRQ_REMAPPING setting in GICv3 ITS (although not tested)

PATCH v1 -> RFC v2:
- reverted to RFC since it looks more reasonable ;-) the code is split
  between VFIO, IOMMU, MSI controller and I am not sure I did the right
  choices. Also API need to be further discussed.
- iova API usage in arm-smmu.c.
- MSI controller natively programs the MSI addr with either the PA or IOVA.
  This is not done anymore in vfio-pci driver as suggested by Alex.
- check irq remapping capability of the group

RFC v1 [2] -> PATCH v1:
- use the existing dma map/unmap ioctl interface with a flag to register a
  reserved IOVA range. Use the legacy Rb to store this special vfio_dma.
- a single reserved IOVA contiguous region now is allowed
- use of an RB tree indexed by PA to store allocated reserved slots
- use of a vfio_domain iova_domain to manage iova allocation within the
  window provided by the userspace
- vfio alloc_map/unmap_free take a vfio_group handle
- vfio_group handle is cached in vfio_pci_device
- add ref counting to bindings
- user modality enabled at the end of the series


Eric Auger (10):
  iommu: Add DOMAIN_ATTR_MSI_MAPPING attribute
  iommu/arm-smmu: advertise DOMAIN_ATTR_MSI_MAPPING attribute
  iommu: introduce a reserved iova cookie
  iommu/dma-reserved-iommu: alloc/free_reserved_iova_domain
  iommu/dma-reserved-iommu: reserved binding rb-tree and helpers
  iommu/dma-reserved-iommu: iommu_get/put_reserved_iova
  iommu/dma-reserved-iommu: delete bindings in
    iommu_free_reserved_iova_domain
  iommu/dma-reserved_iommu: iommu_msi_mapping_desc_to_domain
  iommu/dma-reserved-iommu: iommu_msi_mapping_translate_msg
  iommu/arm-smmu: call iommu_free_reserved_iova_domain on domain
    destruction

 drivers/iommu/Kconfig              |   8 +
 drivers/iommu/Makefile             |   1 +
 drivers/iommu/arm-smmu-v3.c        |   4 +
 drivers/iommu/arm-smmu.c           |   4 +
 drivers/iommu/dma-reserved-iommu.c | 422 +++++++++++++++++++++++++++++++++++++
 drivers/iommu/iommu.c              |   2 +
 include/linux/dma-reserved-iommu.h | 142 +++++++++++++
 include/linux/iommu.h              |   7 +
 8 files changed, 590 insertions(+)
 create mode 100644 drivers/iommu/dma-reserved-iommu.c
 create mode 100644 include/linux/dma-reserved-iommu.h

-- 
1.9.1

* [PATCH v7 01/10] iommu: Add DOMAIN_ATTR_MSI_MAPPING attribute
@ 2016-04-19 16:56   ` Eric Auger
From: Eric Auger @ 2016-04-19 16:56 UTC (permalink / raw)
  To: eric.auger, eric.auger, robin.murphy, alex.williamson,
	will.deacon, joro, tglx, jason, marc.zyngier, christoffer.dall,
	linux-arm-kernel
  Cc: patches, linux-kernel, Bharat.Bhushan, pranav.sawargaonkar,
	p.fedin, iommu, Jean-Philippe.Brucker, julien.grall

Introduce a new DOMAIN_ATTR_MSI_MAPPING domain attribute. When an IOMMU
reports this attribute, the MSI addresses need to be mapped in the
IOMMU.

x86 IOMMUs typically don't expose the attribute since, on x86, MSI
write transaction addresses are always within the 1MB PA region
[FEE0_0000h - FEF0_0000h] which directly targets the APIC configuration
space, hence bypassing the IOMMU. On ARM and PowerPC, however, MSI
transactions are conveyed through the IOMMU.
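
As an illustration, a consumer such as VFIO could probe the requirement
as below (a minimal sketch; for this attribute the data pointer is
unused, so NULL can be passed):

	#include <linux/iommu.h>

	static bool msi_mapping_required(struct iommu_domain *domain)
	{
		/* get_attr returns 0 when the attribute is supported */
		return iommu_domain_get_attr(domain, DOMAIN_ATTR_MSI_MAPPING,
					     NULL) == 0;
	}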

Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com>
Signed-off-by: Eric Auger <eric.auger@linaro.org>

---

v4 -> v5:
- introduce the user in the next patch

RFC v1 -> v1:
- the data field is not used
- for this attribute, domain_get_attr simply returns 0 if the MSI_MAPPING
  capability is needed, or <0 if not.
- removed struct iommu_domain_msi_maps
---
 include/linux/iommu.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 62a5eae..b3e8c5b 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -113,6 +113,7 @@ enum iommu_attr {
 	DOMAIN_ATTR_FSL_PAMU_ENABLE,
 	DOMAIN_ATTR_FSL_PAMUV1,
 	DOMAIN_ATTR_NESTING,	/* two stages of translation */
+	DOMAIN_ATTR_MSI_MAPPING, /* Require MSIs mapping in iommu */
 	DOMAIN_ATTR_MAX,
 };
 
-- 
1.9.1

* [PATCH v7 02/10] iommu/arm-smmu: advertise DOMAIN_ATTR_MSI_MAPPING attribute
@ 2016-04-19 16:56   ` Eric Auger
From: Eric Auger @ 2016-04-19 16:56 UTC (permalink / raw)
  To: eric.auger, eric.auger, robin.murphy, alex.williamson,
	will.deacon, joro, tglx, jason, marc.zyngier, christoffer.dall,
	linux-arm-kernel
  Cc: patches, linux-kernel, Bharat.Bhushan, pranav.sawargaonkar,
	p.fedin, iommu, Jean-Philippe.Brucker, julien.grall

On ARM, MSI write transactions from devices upstream of the smmu
are conveyed through the iommu. Target physical addresses must
therefore be mapped, and DOMAIN_ATTR_MSI_MAPPING is set to advertise
this requirement on arm-smmu and arm-smmu-v3.

Signed-off-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com>

---

v4 -> v5:
- don't handle fsl_pamu_domain anymore
- handle arm-smmu-v3
---
 drivers/iommu/arm-smmu-v3.c | 2 ++
 drivers/iommu/arm-smmu.c    | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 4ff73ff..a077a35 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -1900,6 +1900,8 @@ static int arm_smmu_domain_get_attr(struct iommu_domain *domain,
 	case DOMAIN_ATTR_NESTING:
 		*(int *)data = (smmu_domain->stage == ARM_SMMU_DOMAIN_NESTED);
 		return 0;
+	case DOMAIN_ATTR_MSI_MAPPING:
+		return 0;
 	default:
 		return -ENODEV;
 	}
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 7c39ac4..8cd7b8a 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -1435,6 +1435,8 @@ static int arm_smmu_domain_get_attr(struct iommu_domain *domain,
 	case DOMAIN_ATTR_NESTING:
 		*(int *)data = (smmu_domain->stage == ARM_SMMU_DOMAIN_NESTED);
 		return 0;
+	case DOMAIN_ATTR_MSI_MAPPING:
+		return 0;
 	default:
 		return -ENODEV;
 	}
-- 
1.9.1

* [PATCH v7 03/10] iommu: introduce a reserved iova cookie
@ 2016-04-19 16:56   ` Eric Auger
From: Eric Auger @ 2016-04-19 16:56 UTC (permalink / raw)
  To: eric.auger, eric.auger, robin.murphy, alex.williamson,
	will.deacon, joro, tglx, jason, marc.zyngier, christoffer.dall,
	linux-arm-kernel
  Cc: patches, linux-kernel, Bharat.Bhushan, pranav.sawargaonkar,
	p.fedin, iommu, Jean-Philippe.Brucker, julien.grall

This patch introduces some new fields in the iommu_domain struct,
dedicated to reserved iova management.

In a similar way to the DMA mapping IOVA window, we need to store
information related to the reserved IOVA window.

The reserved_iova_cookie will store the reserved iova_domain
handle. An RB tree indexed by physical address is introduced to
store the host physical addresses bound to reserved IOVAs.

Those physical addresses will correspond to MSI frame base
addresses, also referred to as doorbells. Their number should be
quite limited per domain.

Also a spinlock is introduced to protect accesses to the iova_domain
and the RB tree. A spinlock rather than a mutex is chosen because the
RB tree will need to be accessed from MSI controller code that is not
allowed to sleep.
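
Because of that constraint, lookups will follow the usual irqsave
pattern; a sketch (the binding struct and lookup helper here are
hypothetical at this point in the series):

	static struct iommu_reserved_binding *
	lookup_doorbell(struct iommu_domain *d, phys_addr_t pa, size_t size)
	{
		struct iommu_reserved_binding *b;
		unsigned long flags;

		spin_lock_irqsave(&d->reserved_lock, flags);
		/* walk d->reserved_binding_list, keyed by PA */
		b = find_reserved_binding(d, pa, size); /* hypothetical */
		spin_unlock_irqrestore(&d->reserved_lock, flags);
		return b;
	}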

Signed-off-by: Eric Auger <eric.auger@linaro.org>

---
v5 -> v6:
- initialize reserved_binding_list
- use a spinlock instead of a mutex
---
 drivers/iommu/iommu.c | 2 ++
 include/linux/iommu.h | 6 ++++++
 2 files changed, 8 insertions(+)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index b9df141..f70ef3b 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1073,6 +1073,8 @@ static struct iommu_domain *__iommu_domain_alloc(struct bus_type *bus,
 
 	domain->ops  = bus->iommu_ops;
 	domain->type = type;
+	spin_lock_init(&domain->reserved_lock);
+	domain->reserved_binding_list = RB_ROOT;
 
 	return domain;
 }
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index b3e8c5b..60999db 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -24,6 +24,7 @@
 #include <linux/of.h>
 #include <linux/types.h>
 #include <linux/scatterlist.h>
+#include <linux/spinlock.h>
 #include <trace/events/iommu.h>
 
 #define IOMMU_READ	(1 << 0)
@@ -83,6 +84,11 @@ struct iommu_domain {
 	void *handler_token;
 	struct iommu_domain_geometry geometry;
 	void *iova_cookie;
+	void *reserved_iova_cookie;
+	/* rb tree indexed by PA, for reserved bindings only */
+	struct rb_root reserved_binding_list;
+	/* protects reserved cookie and rbtree manipulation */
+	spinlock_t reserved_lock;
 };
 
 enum iommu_cap {
-- 
1.9.1

* [PATCH v7 04/10] iommu/dma-reserved-iommu: alloc/free_reserved_iova_domain
@ 2016-04-19 16:56   ` Eric Auger
From: Eric Auger @ 2016-04-19 16:56 UTC (permalink / raw)
  To: eric.auger, eric.auger, robin.murphy, alex.williamson,
	will.deacon, joro, tglx, jason, marc.zyngier, christoffer.dall,
	linux-arm-kernel
  Cc: patches, linux-kernel, Bharat.Bhushan, pranav.sawargaonkar,
	p.fedin, iommu, Jean-Philippe.Brucker, julien.grall

Introduce alloc/free_reserved_iova_domain in the IOMMU API.
alloc_reserved_iova_domain initializes an iova domain at a given
iova base address and with a given size. This iova domain will
be used to allocate IOVAs within that window. Those IOVAs will be
reserved for special purposes, typically MSI frame binding. Allocation
functions within the reserved iova domain will be introduced in
subsequent patches.

Those functions are fully implemented when CONFIG_IOMMU_DMA_RESERVED
is set; otherwise static inline stubs are provided.
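
For reference, a minimal usage sketch from a hypothetical caller (the
window base/size, prot flags and page order below are arbitrary
example values):

	#include <linux/dma-reserved-iommu.h>
	#include <linux/iommu.h>
	#include <linux/sizes.h>

	static int reserve_msi_window(struct iommu_domain *domain)
	{
		int ret;

		/* reserve a 1MB IOVA window at 0x8000000, 4kB granule */
		ret = iommu_alloc_reserved_iova_domain(domain, 0x8000000,
						       SZ_1M, IOMMU_WRITE,
						       PAGE_SHIFT);
		if (ret)
			return ret;

		/* ... bind doorbells within the window (later patches) ... */

		iommu_free_reserved_iova_domain(domain);
		return 0;
	}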

Signed-off-by: Eric Auger <eric.auger@linaro.org>

---
v7:
- fix locking
- add iova_cache_get/put
- static inline functions when CONFIG_IOMMU_DMA_RESERVED is not set
- introduce struct reserved_iova_domain to encapsulate prot info &
  add prot parameter in alloc_reserved_iova_domain

v5 -> v6:
- use spin lock instead of mutex

v3 -> v4:
- formerly in "iommu/arm-smmu: implement alloc/free_reserved_iova_domain" &
  "iommu: add alloc/free_reserved_iova_domain"

v2 -> v3:
- remove iommu_alloc_reserved_iova_domain & iommu_free_reserved_iova_domain
  static implementation in case CONFIG_IOMMU_API is not set

v1 -> v2:
- moved from vfio API to IOMMU API
---
 drivers/iommu/Kconfig              |  8 +++
 drivers/iommu/Makefile             |  1 +
 drivers/iommu/dma-reserved-iommu.c | 99 ++++++++++++++++++++++++++++++++++++++
 include/linux/dma-reserved-iommu.h | 59 +++++++++++++++++++++++
 4 files changed, 167 insertions(+)
 create mode 100644 drivers/iommu/dma-reserved-iommu.c
 create mode 100644 include/linux/dma-reserved-iommu.h

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index dd1dc39..a5a1530 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -74,6 +74,12 @@ config IOMMU_DMA
 	select IOMMU_IOVA
 	select NEED_SG_DMA_LENGTH
 
+# IOMMU reserved IOVA mapping (MSI doorbell)
+config IOMMU_DMA_RESERVED
+	bool
+	select IOMMU_API
+	select IOMMU_IOVA
+
 config FSL_PAMU
 	bool "Freescale IOMMU support"
 	depends on PPC32
@@ -307,6 +313,7 @@ config SPAPR_TCE_IOMMU
 config ARM_SMMU
 	bool "ARM Ltd. System MMU (SMMU) Support"
 	depends on (ARM64 || ARM) && MMU
+	select IOMMU_DMA_RESERVED
 	select IOMMU_API
 	select IOMMU_IO_PGTABLE_LPAE
 	select ARM_DMA_USE_IOMMU if ARM
@@ -320,6 +327,7 @@ config ARM_SMMU
 config ARM_SMMU_V3
 	bool "ARM Ltd. System MMU Version 3 (SMMUv3) Support"
 	depends on ARM64 && PCI
+	select IOMMU_DMA_RESERVED
 	select IOMMU_API
 	select IOMMU_IO_PGTABLE_LPAE
 	select GENERIC_MSI_IRQ_DOMAIN
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index c6edb31..6c9ae99 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -2,6 +2,7 @@ obj-$(CONFIG_IOMMU_API) += iommu.o
 obj-$(CONFIG_IOMMU_API) += iommu-traces.o
 obj-$(CONFIG_IOMMU_API) += iommu-sysfs.o
 obj-$(CONFIG_IOMMU_DMA) += dma-iommu.o
+obj-$(CONFIG_IOMMU_DMA_RESERVED) += dma-reserved-iommu.o
 obj-$(CONFIG_IOMMU_IO_PGTABLE) += io-pgtable.o
 obj-$(CONFIG_IOMMU_IO_PGTABLE_ARMV7S) += io-pgtable-arm-v7s.o
 obj-$(CONFIG_IOMMU_IO_PGTABLE_LPAE) += io-pgtable-arm.o
diff --git a/drivers/iommu/dma-reserved-iommu.c b/drivers/iommu/dma-reserved-iommu.c
new file mode 100644
index 0000000..2562af0
--- /dev/null
+++ b/drivers/iommu/dma-reserved-iommu.c
@@ -0,0 +1,99 @@
+/*
+ * Reserved IOVA Management
+ *
+ * Copyright (c) 2015 Linaro Ltd.
+ *              www.linaro.org
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ */
+
+#include <linux/iommu.h>
+#include <linux/iova.h>
+
+struct reserved_iova_domain {
+	struct iova_domain *iovad;
+	int prot; /* iommu protection attributes to be obeyed */
+};
+
+int iommu_alloc_reserved_iova_domain(struct iommu_domain *domain,
+				     dma_addr_t iova, size_t size, int prot,
+				     unsigned long order)
+{
+	unsigned long granule, mask, flags;
+	struct reserved_iova_domain *rid;
+	int ret = 0;
+
+	granule = 1UL << order;
+	mask = granule - 1;
+	if (iova & mask || (!size) || (size & mask))
+		return -EINVAL;
+
+	rid = kzalloc(sizeof(struct reserved_iova_domain), GFP_KERNEL);
+	if (!rid)
+		return -ENOMEM;
+
+	rid->iovad = kzalloc(sizeof(struct iova_domain), GFP_KERNEL);
+	if (!rid->iovad) {
+		kfree(rid);
+		return -ENOMEM;
+	}
+
+	iova_cache_get();
+
+	init_iova_domain(rid->iovad, granule,
+			 iova >> order, (iova + size - 1) >> order);
+
+	spin_lock_irqsave(&domain->reserved_lock, flags);
+
+	if (domain->reserved_iova_cookie) {
+		ret = -EEXIST;
+		goto unlock;
+	}
+
+	domain->reserved_iova_cookie = rid;
+
+unlock:
+	spin_unlock_irqrestore(&domain->reserved_lock, flags);
+	if (ret) {
+		put_iova_domain(rid->iovad);
+		kfree(rid->iovad);
+		kfree(rid);
+		iova_cache_put();
+	}
+	return ret;
+}
+EXPORT_SYMBOL_GPL(iommu_alloc_reserved_iova_domain);
+
+void iommu_free_reserved_iova_domain(struct iommu_domain *domain)
+{
+	struct reserved_iova_domain *rid;
+	unsigned long flags;
+	int ret = 0;
+
+	spin_lock_irqsave(&domain->reserved_lock, flags);
+
+	rid = (struct reserved_iova_domain *)domain->reserved_iova_cookie;
+	if (!rid) {
+		ret = -EINVAL;
+		goto unlock;
+	}
+
+	domain->reserved_iova_cookie = NULL;
+unlock:
+	spin_unlock_irqrestore(&domain->reserved_lock, flags);
+	if (!ret) {
+		put_iova_domain(rid->iovad);
+		kfree(rid->iovad);
+		kfree(rid);
+		iova_cache_put();
+	}
+}
+EXPORT_SYMBOL_GPL(iommu_free_reserved_iova_domain);
diff --git a/include/linux/dma-reserved-iommu.h b/include/linux/dma-reserved-iommu.h
new file mode 100644
index 0000000..01ec385
--- /dev/null
+++ b/include/linux/dma-reserved-iommu.h
@@ -0,0 +1,59 @@
+/*
+ * Copyright (c) 2015 Linaro Ltd.
+ *              www.linaro.org
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ */
+#ifndef __DMA_RESERVED_IOMMU_H
+#define __DMA_RESERVED_IOMMU_H
+
+#include <linux/types.h>
+#include <linux/kernel.h>
+
+struct iommu_domain;
+
+#ifdef CONFIG_IOMMU_DMA_RESERVED
+
+/**
+ * iommu_alloc_reserved_iova_domain: allocate the reserved iova domain
+ *
+ * @domain: iommu domain handle
+ * @iova: base iova address
+ * @size: iova window size
+ * @prot: protection attribute flags
+ * @order: page order
+ */
+int iommu_alloc_reserved_iova_domain(struct iommu_domain *domain,
+				     dma_addr_t iova, size_t size, int prot,
+				     unsigned long order);
+
+/**
+ * iommu_free_reserved_iova_domain: free the reserved iova domain
+ *
+ * @domain: iommu domain handle
+ */
+void iommu_free_reserved_iova_domain(struct iommu_domain *domain);
+
+#else
+
+static inline int
+iommu_alloc_reserved_iova_domain(struct iommu_domain *domain,
+				 dma_addr_t iova, size_t size, int prot,
+				 unsigned long order)
+{
+	return -ENOENT;
+}
+
+static inline void
+iommu_free_reserved_iova_domain(struct iommu_domain *domain) {}
+
+#endif	/* CONFIG_IOMMU_DMA_RESERVED */
+#endif	/* __DMA_RESERVED_IOMMU_H */
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 127+ messages in thread

* [PATCH v7 04/10] iommu/dma-reserved-iommu: alloc/free_reserved_iova_domain
@ 2016-04-19 16:56   ` Eric Auger
  0 siblings, 0 replies; 127+ messages in thread
From: Eric Auger @ 2016-04-19 16:56 UTC (permalink / raw)
  To: eric.auger-qxv4g6HH51o, eric.auger-QSEj5FYQhm4dnm+yROfE0A,
	robin.murphy-5wv7dgnIgG8, alex.williamson-H+wXaHxf7aLQT0dZR+AlfA,
	will.deacon-5wv7dgnIgG8, joro-zLv9SwRftAIdnm+yROfE0A,
	tglx-hfZtesqFncYOwBW4kG4KsQ, jason-NLaQJdtUoK4Be96aLqz0jA,
	marc.zyngier-5wv7dgnIgG8,
	christoffer.dall-QSEj5FYQhm4dnm+yROfE0A,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r
  Cc: julien.grall-5wv7dgnIgG8, patches-QSEj5FYQhm4dnm+yROfE0A,
	p.fedin-Sze3O3UU22JBDgjK7y7TUQ,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	pranav.sawargaonkar-Re5JQEeQqe8AvxtiuMwx3w

Introduce alloc/free_reserved_iova_domain in the IOMMU API.
alloc_reserved_iova_domain initializes an iova domain at a given
iova base address and with a given size. This iova domain will
be used to allocate iova within that window. Those IOVAs will be reserved
for special purpose, typically MSI frame binding. Allocation function
within the reserved iova domain will be introduced in subsequent patches.

Those functions are fully implemented if CONFIG_IOMMU_DMA_RESERVED
is set.

Signed-off-by: Eric Auger <eric.auger-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>

---
v7:
- fix locking
- add iova_cache_get/put
- static inline functions when CONFIG_IOMMU_DMA_RESERVED is not set
- introduce struct reserved_iova_domain to encapsulate prot info &
  add prot parameter in alloc_reserved_iova_domain

v5 -> v6:
- use spin lock instead of mutex

v3 -> v4:
- formerly in "iommu/arm-smmu: implement alloc/free_reserved_iova_domain" &
  "iommu: add alloc/free_reserved_iova_domain"

v2 -> v3:
- remove iommu_alloc_reserved_iova_domain & iommu_free_reserved_iova_domain
  static implementation in case CONFIG_IOMMU_API is not set

v1 -> v2:
- moved from vfio API to IOMMU API
---
 drivers/iommu/Kconfig              |  8 +++
 drivers/iommu/Makefile             |  1 +
 drivers/iommu/dma-reserved-iommu.c | 99 ++++++++++++++++++++++++++++++++++++++
 include/linux/dma-reserved-iommu.h | 59 +++++++++++++++++++++++
 4 files changed, 167 insertions(+)
 create mode 100644 drivers/iommu/dma-reserved-iommu.c
 create mode 100644 include/linux/dma-reserved-iommu.h

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index dd1dc39..a5a1530 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -74,6 +74,12 @@ config IOMMU_DMA
 	select IOMMU_IOVA
 	select NEED_SG_DMA_LENGTH
 
+# IOMMU reserved IOVA mapping (MSI doorbell)
+config IOMMU_DMA_RESERVED
+	bool
+	select IOMMU_API
+	select IOMMU_IOVA
+
 config FSL_PAMU
 	bool "Freescale IOMMU support"
 	depends on PPC32
@@ -307,6 +313,7 @@ config SPAPR_TCE_IOMMU
 config ARM_SMMU
 	bool "ARM Ltd. System MMU (SMMU) Support"
 	depends on (ARM64 || ARM) && MMU
+	select IOMMU_DMA_RESERVED
 	select IOMMU_API
 	select IOMMU_IO_PGTABLE_LPAE
 	select ARM_DMA_USE_IOMMU if ARM
@@ -320,6 +327,7 @@ config ARM_SMMU
 config ARM_SMMU_V3
 	bool "ARM Ltd. System MMU Version 3 (SMMUv3) Support"
 	depends on ARM64 && PCI
+	select IOMMU_DMA_RESERVED
 	select IOMMU_API
 	select IOMMU_IO_PGTABLE_LPAE
 	select GENERIC_MSI_IRQ_DOMAIN
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index c6edb31..6c9ae99 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -2,6 +2,7 @@ obj-$(CONFIG_IOMMU_API) += iommu.o
 obj-$(CONFIG_IOMMU_API) += iommu-traces.o
 obj-$(CONFIG_IOMMU_API) += iommu-sysfs.o
 obj-$(CONFIG_IOMMU_DMA) += dma-iommu.o
+obj-$(CONFIG_IOMMU_DMA_RESERVED) += dma-reserved-iommu.o
 obj-$(CONFIG_IOMMU_IO_PGTABLE) += io-pgtable.o
 obj-$(CONFIG_IOMMU_IO_PGTABLE_ARMV7S) += io-pgtable-arm-v7s.o
 obj-$(CONFIG_IOMMU_IO_PGTABLE_LPAE) += io-pgtable-arm.o
diff --git a/drivers/iommu/dma-reserved-iommu.c b/drivers/iommu/dma-reserved-iommu.c
new file mode 100644
index 0000000..2562af0
--- /dev/null
+++ b/drivers/iommu/dma-reserved-iommu.c
@@ -0,0 +1,99 @@
+/*
+ * Reserved IOVA Management
+ *
+ * Copyright (c) 2015 Linaro Ltd.
+ *              www.linaro.org
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ */
+
+#include <linux/iommu.h>
+#include <linux/iova.h>
+
+struct reserved_iova_domain {
+	struct iova_domain *iovad;
+	int prot; /* iommu protection attributes to be obeyed */
+};
+
+int iommu_alloc_reserved_iova_domain(struct iommu_domain *domain,
+				     dma_addr_t iova, size_t size, int prot,
+				     unsigned long order)
+{
+	unsigned long granule, mask, flags;
+	struct reserved_iova_domain *rid;
+	int ret = 0;
+
+	granule = 1UL << order;
+	mask = granule - 1;
+	if (iova & mask || (!size) || (size & mask))
+		return -EINVAL;
+
+	rid = kzalloc(sizeof(struct reserved_iova_domain), GFP_KERNEL);
+	if (!rid)
+		return -ENOMEM;
+
+	rid->iovad = kzalloc(sizeof(struct iova_domain), GFP_KERNEL);
+	if (!rid->iovad) {
+		kfree(rid);
+		return -ENOMEM;
+	}
+
+	iova_cache_get();
+
+	init_iova_domain(rid->iovad, granule,
+			 iova >> order, (iova + size - 1) >> order);
+
+	spin_lock_irqsave(&domain->reserved_lock, flags);
+
+	if (domain->reserved_iova_cookie) {
+		ret = -EEXIST;
+		goto unlock;
+	}
+
+	domain->reserved_iova_cookie = rid;
+
+unlock:
+	spin_unlock_irqrestore(&domain->reserved_lock, flags);
+	if (ret) {
+		put_iova_domain(rid->iovad);
+		kfree(rid->iovad);
+		kfree(rid);
+		iova_cache_put();
+	}
+	return ret;
+}
+EXPORT_SYMBOL_GPL(iommu_alloc_reserved_iova_domain);
+
+void iommu_free_reserved_iova_domain(struct iommu_domain *domain)
+{
+	struct reserved_iova_domain *rid;
+	unsigned long flags;
+	int ret = 0;
+
+	spin_lock_irqsave(&domain->reserved_lock, flags);
+
+	rid = (struct reserved_iova_domain *)domain->reserved_iova_cookie;
+	if (!rid) {
+		ret = -EINVAL;
+		goto unlock;
+	}
+
+	domain->reserved_iova_cookie = NULL;
+unlock:
+	spin_unlock_irqrestore(&domain->reserved_lock, flags);
+	if (!ret) {
+		put_iova_domain(rid->iovad);
+		kfree(rid->iovad);
+		kfree(rid);
+		iova_cache_put();
+	}
+}
+EXPORT_SYMBOL_GPL(iommu_free_reserved_iova_domain);
diff --git a/include/linux/dma-reserved-iommu.h b/include/linux/dma-reserved-iommu.h
new file mode 100644
index 0000000..01ec385
--- /dev/null
+++ b/include/linux/dma-reserved-iommu.h
@@ -0,0 +1,59 @@
+/*
+ * Copyright (c) 2015 Linaro Ltd.
+ *              www.linaro.org
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ */
+#ifndef __DMA_RESERVED_IOMMU_H
+#define __DMA_RESERVED_IOMMU_H
+
+#include <linux/types.h>
+#include <linux/kernel.h>
+
+struct iommu_domain;
+
+#ifdef CONFIG_IOMMU_DMA_RESERVED
+
+/**
+ * iommu_alloc_reserved_iova_domain: allocate the reserved iova domain
+ *
+ * @domain: iommu domain handle
+ * @iova: base iova address
+ * @size: iova window size
+ * @prot: protection attribute flags
+ * @order: page order
+ */
+int iommu_alloc_reserved_iova_domain(struct iommu_domain *domain,
+				     dma_addr_t iova, size_t size, int prot,
+				     unsigned long order);
+
+/**
+ * iommu_free_reserved_iova_domain: free the reserved iova domain
+ *
+ * @domain: iommu domain handle
+ */
+void iommu_free_reserved_iova_domain(struct iommu_domain *domain);
+
+#else
+
+static inline int
+iommu_alloc_reserved_iova_domain(struct iommu_domain *domain,
+				 dma_addr_t iova, size_t size, int prot,
+				 unsigned long order)
+{
+	return -ENOENT;
+}
+
+static inline void
+iommu_free_reserved_iova_domain(struct iommu_domain *domain) {}
+
+#endif	/* CONFIG_IOMMU_DMA_RESERVED */
+#endif	/* __DMA_RESERVED_IOMMU_H */
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 127+ messages in thread

* [PATCH v7 05/10] iommu/dma-reserved-iommu: reserved binding rb-tree and helpers
@ 2016-04-19 16:56   ` Eric Auger
  0 siblings, 0 replies; 127+ messages in thread
From: Eric Auger @ 2016-04-19 16:56 UTC (permalink / raw)
  To: eric.auger, eric.auger, robin.murphy, alex.williamson,
	will.deacon, joro, tglx, jason, marc.zyngier, christoffer.dall,
	linux-arm-kernel
  Cc: patches, linux-kernel, Bharat.Bhushan, pranav.sawargaonkar,
	p.fedin, iommu, Jean-Philippe.Brucker, julien.grall

We will need to track which host physical addresses are mapped to
reserved IOVAs. To that end, introduce a new RB tree indexed by
physical address. This RB tree is only used for reserved IOVA
bindings.

This RB tree is expected to contain very few bindings; each one
typically corresponds to a single page mapping one MSI frame (a
GICv2m frame or an ITS GITS_TRANSLATER frame).

Signed-off-by: Eric Auger <eric.auger@linaro.org>

---
v5 -> v6:
- add comment about @d->reserved_lock to be held

v3 -> v4:
- that code was formerly in "iommu/arm-smmu: add a reserved binding RB tree"
---
 drivers/iommu/dma-reserved-iommu.c | 63 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 63 insertions(+)

diff --git a/drivers/iommu/dma-reserved-iommu.c b/drivers/iommu/dma-reserved-iommu.c
index 2562af0..f6fa18e 100644
--- a/drivers/iommu/dma-reserved-iommu.c
+++ b/drivers/iommu/dma-reserved-iommu.c
@@ -23,6 +23,69 @@ struct reserved_iova_domain {
 	int prot; /* iommu protection attributes to be obeyed */
 };
 
+struct iommu_reserved_binding {
+	struct kref		kref;
+	struct rb_node		node;
+	struct iommu_domain	*domain;
+	phys_addr_t		addr;
+	dma_addr_t		iova;
+	size_t			size;
+};
+
+/* Reserved binding RB-tree manipulation */
+
+/* @d->reserved_lock must be held */
+static struct iommu_reserved_binding *find_reserved_binding(
+				    struct iommu_domain *d,
+				    phys_addr_t start, size_t size)
+{
+	struct rb_node *node = d->reserved_binding_list.rb_node;
+
+	while (node) {
+		struct iommu_reserved_binding *binding =
+			rb_entry(node, struct iommu_reserved_binding, node);
+
+		if (start + size <= binding->addr)
+			node = node->rb_left;
+		else if (start >= binding->addr + binding->size)
+			node = node->rb_right;
+		else
+			return binding;
+	}
+
+	return NULL;
+}
+
+/* @d->reserved_lock must be held */
+static void link_reserved_binding(struct iommu_domain *d,
+				  struct iommu_reserved_binding *new)
+{
+	struct rb_node **link = &d->reserved_binding_list.rb_node;
+	struct rb_node *parent = NULL;
+	struct iommu_reserved_binding *binding;
+
+	while (*link) {
+		parent = *link;
+		binding = rb_entry(parent, struct iommu_reserved_binding,
+				   node);
+
+		if (new->addr + new->size <= binding->addr)
+			link = &(*link)->rb_left;
+		else
+			link = &(*link)->rb_right;
+	}
+
+	rb_link_node(&new->node, parent, link);
+	rb_insert_color(&new->node, &d->reserved_binding_list);
+}
+
+/* @d->reserved_lock must be held */
+static void unlink_reserved_binding(struct iommu_domain *d,
+				    struct iommu_reserved_binding *old)
+{
+	rb_erase(&old->node, &d->reserved_binding_list);
+}
+
 int iommu_alloc_reserved_iova_domain(struct iommu_domain *domain,
 				     dma_addr_t iova, size_t size, int prot,
 				     unsigned long order)
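
To make the intended use of these helpers concrete, the following sketch
shows how later patches combine them under the domain's reserved_lock. It
assumes the reserved_lock and reserved_binding_list fields added to struct
iommu_domain earlier in the series, and would live inside
dma-reserved-iommu.c since the helpers are static; example_get_or_link is
a made-up name.

/* Sketch: reuse an overlapping binding or insert a new one, sorted by PA;
 * newb->kref is assumed already initialized by the caller.
 */
static struct iommu_reserved_binding *
example_get_or_link(struct iommu_domain *d,
		    struct iommu_reserved_binding *newb)
{
	struct iommu_reserved_binding *b;
	unsigned long flags;

	spin_lock_irqsave(&d->reserved_lock, flags);
	b = find_reserved_binding(d, newb->addr, newb->size);
	if (b)
		kref_get(&b->kref);		/* reuse the existing binding */
	else
		link_reserved_binding(d, newb);	/* insert the new node */
	spin_unlock_irqrestore(&d->reserved_lock, flags);

	return b ? b : newb;
}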
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 127+ messages in thread

* [PATCH v7 06/10] iommu/dma-reserved-iommu: iommu_get/put_reserved_iova
@ 2016-04-19 16:56   ` Eric Auger
  0 siblings, 0 replies; 127+ messages in thread
From: Eric Auger @ 2016-04-19 16:56 UTC (permalink / raw)
  To: eric.auger, eric.auger, robin.murphy, alex.williamson,
	will.deacon, joro, tglx, jason, marc.zyngier, christoffer.dall,
	linux-arm-kernel
  Cc: patches, linux-kernel, Bharat.Bhushan, pranav.sawargaonkar,
	p.fedin, iommu, Jean-Philippe.Brucker, julien.grall

This patch introduces iommu_get/put_reserved_iova.

iommu_get_reserved_iova allows IOMMU-mapping a contiguous physical region
onto a reserved contiguous IOVA region. The physical region base address
does not need to be aligned to the IOMMU page size; iova pages are
allocated and mapped so that they cover the whole physical region. This
mapping is tracked as a whole (and cannot be split) in an RB tree indexed
by PA.

In case a mapping already exists for the physical pages, the IOVA mapped
to the PA base is returned directly.

Each time the get succeeds, a binding ref count is incremented.

iommu_put_reserved_iova decrements the ref count; when it reaches zero,
the mapping is destroyed and the IOVAs are released.

Signed-off-by: Eric Auger <eric.auger@linaro.org>

---
v7:
- change title and rework commit message with new name of the functions
  and size parameter
- fix locking
- rework header doc comments
- put now takes a phys_addr_t
- check prot argument against reserved_iova_domain prot flags

v5 -> v6:
- revisit locking with spin_lock instead of mutex
- do not kref_get on 1st get
- add size parameter to the get function following Marc's request
- use the iova domain shift instead of using the smallest supported page size

v3 -> v4:
- formerly in iommu: iommu_get/put_single_reserved &
  iommu/arm-smmu: implement iommu_get/put_single_reserved
- Attempted to address Marc's doubts about missing size/alignment
  at VFIO level (user-space knows the IOMMU page size and the number
  of IOVA pages to provision)

v2 -> v3:
- remove static implementation of iommu_get_single_reserved &
  iommu_put_single_reserved when CONFIG_IOMMU_API is not set

v1 -> v2:
- previously a VFIO API, named vfio_alloc_map/unmap_free_reserved_iova
---
 drivers/iommu/dma-reserved-iommu.c | 150 +++++++++++++++++++++++++++++++++++++
 include/linux/dma-reserved-iommu.h |  38 ++++++++++
 2 files changed, 188 insertions(+)

diff --git a/drivers/iommu/dma-reserved-iommu.c b/drivers/iommu/dma-reserved-iommu.c
index f6fa18e..426d339 100644
--- a/drivers/iommu/dma-reserved-iommu.c
+++ b/drivers/iommu/dma-reserved-iommu.c
@@ -135,6 +135,22 @@ unlock:
 }
 EXPORT_SYMBOL_GPL(iommu_alloc_reserved_iova_domain);
 
+/* called with domain's reserved_lock held */
+static void reserved_binding_release(struct kref *kref)
+{
+	struct iommu_reserved_binding *b =
+		container_of(kref, struct iommu_reserved_binding, kref);
+	struct iommu_domain *d = b->domain;
+	struct reserved_iova_domain *rid =
+		(struct reserved_iova_domain *)d->reserved_iova_cookie;
+	unsigned long order;
+
+	order = iova_shift(rid->iovad);
+	free_iova(rid->iovad, b->iova >> order);
+	unlink_reserved_binding(d, b);
+	kfree(b);
+}
+
 void iommu_free_reserved_iova_domain(struct iommu_domain *domain)
 {
 	struct reserved_iova_domain *rid;
@@ -160,3 +176,137 @@ unlock:
 	}
 }
 EXPORT_SYMBOL_GPL(iommu_free_reserved_iova_domain);
+
+int iommu_get_reserved_iova(struct iommu_domain *domain,
+			      phys_addr_t addr, size_t size, int prot,
+			      dma_addr_t *iova)
+{
+	unsigned long base_pfn, end_pfn, nb_iommu_pages, order, flags;
+	struct iommu_reserved_binding *b, *newb;
+	size_t iommu_page_size, binding_size;
+	phys_addr_t aligned_base, offset;
+	struct reserved_iova_domain *rid;
+	struct iova_domain *iovad;
+	struct iova *p_iova;
+	int ret = -EINVAL;
+
+	newb = kzalloc(sizeof(*newb), GFP_KERNEL);
+	if (!newb)
+		return -ENOMEM;
+
+	spin_lock_irqsave(&domain->reserved_lock, flags);
+
+	rid = (struct reserved_iova_domain *)domain->reserved_iova_cookie;
+	if (!rid)
+		goto free_newb;
+
+	if (((prot & IOMMU_READ) && !(rid->prot & IOMMU_READ)) ||
+	    ((prot & IOMMU_WRITE) && !(rid->prot & IOMMU_WRITE)))
+		goto free_newb;
+
+	iovad = rid->iovad;
+	order = iova_shift(iovad);
+	base_pfn = addr >> order;
+	end_pfn = (addr + size - 1) >> order;
+	aligned_base = base_pfn << order;
+	offset = addr - aligned_base;
+	nb_iommu_pages = end_pfn - base_pfn + 1;
+	iommu_page_size = 1 << order;
+	binding_size = nb_iommu_pages * iommu_page_size;
+
+	b = find_reserved_binding(domain, aligned_base, binding_size);
+	if (b) {
+		*iova = b->iova + offset + aligned_base - b->addr;
+		kref_get(&b->kref);
+		ret = 0;
+		goto free_newb;
+	}
+
+	p_iova = alloc_iova(iovad, nb_iommu_pages,
+			    iovad->dma_32bit_pfn, true);
+	if (!p_iova) {
+		ret = -ENOMEM;
+		goto free_newb;
+	}
+
+	*iova = iova_dma_addr(iovad, p_iova);
+
+	/* unlock to call iommu_map which is not guaranteed to be atomic */
+	spin_unlock_irqrestore(&domain->reserved_lock, flags);
+
+	ret = iommu_map(domain, *iova, aligned_base, binding_size, prot);
+
+	spin_lock_irqsave(&domain->reserved_lock, flags);
+
+	rid = (struct reserved_iova_domain *) domain->reserved_iova_cookie;
+	if (!rid || (rid->iovad != iovad)) {
+		/* reserved iova domain was destroyed behind our back */
+		ret = -EBUSY;
+		goto free_newb; /* iova already released */
+	}
+
+	/* no change in iova reserved domain but iommu_map failed */
+	if (ret)
+		goto free_iova;
+
+	/* everything is fine, add in the new node in the rb tree */
+	kref_init(&newb->kref);
+	newb->domain = domain;
+	newb->addr = aligned_base;
+	newb->iova = *iova;
+	newb->size = binding_size;
+
+	link_reserved_binding(domain, newb);
+
+	*iova += offset;
+	goto unlock;
+
+free_iova:
+	free_iova(rid->iovad, p_iova->pfn_lo);
+free_newb:
+	kfree(newb);
+unlock:
+	spin_unlock_irqrestore(&domain->reserved_lock, flags);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(iommu_get_reserved_iova);
+
+void iommu_put_reserved_iova(struct iommu_domain *domain, phys_addr_t addr)
+{
+	phys_addr_t aligned_addr, page_size, mask;
+	struct iommu_reserved_binding *b;
+	struct reserved_iova_domain *rid;
+	unsigned long order, flags;
+	struct iommu_domain *d;
+	dma_addr_t iova;
+	size_t size;
+	int ret = 0;
+
+	spin_lock_irqsave(&domain->reserved_lock, flags);
+
+	rid = (struct reserved_iova_domain *)domain->reserved_iova_cookie;
+	if (!rid)
+		goto unlock;
+
+	order = iova_shift(rid->iovad);
+	page_size = (uint64_t)1 << order;
+	mask = page_size - 1;
+	aligned_addr = addr & ~mask;
+
+	b = find_reserved_binding(domain, aligned_addr, page_size);
+	if (!b)
+		goto unlock;
+
+	iova = b->iova;
+	size = b->size;
+	d = b->domain;
+
+	ret = kref_put(&b->kref, reserved_binding_release);
+
+unlock:
+	spin_unlock_irqrestore(&domain->reserved_lock, flags);
+	if (ret)
+		iommu_unmap(d, iova, size);
+}
+EXPORT_SYMBOL_GPL(iommu_put_reserved_iova);
+
diff --git a/include/linux/dma-reserved-iommu.h b/include/linux/dma-reserved-iommu.h
index 01ec385..8722131 100644
--- a/include/linux/dma-reserved-iommu.h
+++ b/include/linux/dma-reserved-iommu.h
@@ -42,6 +42,34 @@ int iommu_alloc_reserved_iova_domain(struct iommu_domain *domain,
  */
 void iommu_free_reserved_iova_domain(struct iommu_domain *domain);
 
+/**
+ * iommu_get_reserved_iova: allocate a contiguous set of iova pages and
+ * map them to the physical range defined by @addr and @size.
+ *
+ * @domain: iommu domain handle
+ * @addr: physical address to bind
+ * @size: size of the binding
+ * @prot: mapping protection attribute
+ * @iova: returned iova
+ *
+ * Mapped physical pfns are within [@addr >> order, (@addr + size -1) >> order]
+ * where order corresponds to the reserved iova domain order.
+ * This mapping is tracked and reference counted with the minimal granularity
+ * of @size.
+ */
+int iommu_get_reserved_iova(struct iommu_domain *domain,
+			    phys_addr_t addr, size_t size, int prot,
+			    dma_addr_t *iova);
+
+/**
+ * iommu_put_reserved_iova: decrement a ref count of the reserved mapping
+ *
+ * @domain: iommu domain handle
+ * @addr: physical address whose binding ref count is decremented
+ *
+ * if the binding ref count is null, destroy the reserved mapping
+ */
+void iommu_put_reserved_iova(struct iommu_domain *domain, phys_addr_t addr);
 #else
 
 static inline int
@@ -55,5 +83,15 @@ iommu_alloc_reserved_iova_domain(struct iommu_domain *domain,
 static inline void
 iommu_free_reserved_iova_domain(struct iommu_domain *domain) {}
 
+static inline int iommu_get_reserved_iova(struct iommu_domain *domain,
+					  phys_addr_t addr, size_t size,
+					  int prot, dma_addr_t *iova)
+{
+	return -ENOENT;
+}
+
+static inline void iommu_put_reserved_iova(struct iommu_domain *domain,
+					   phys_addr_t addr) {}
+
 #endif	/* CONFIG_IOMMU_DMA_RESERVED */
 #endif	/* __DMA_RESERVED_IOMMU_H */
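
As a usage illustration, a consumer mapping an MSI doorbell would be
expected to pair the two calls as below. The doorbell address is a
made-up example and error handling is kept minimal.

/* Sketch: bind a doorbell PA, use the IOVA, then drop the reference */
static int example_bind_doorbell(struct iommu_domain *domain)
{
	phys_addr_t doorbell = 0x8020040;	/* made-up doorbell PA */
	dma_addr_t iova;
	int ret;

	ret = iommu_get_reserved_iova(domain, doorbell, sizeof(u32),
				      IOMMU_WRITE, &iova);
	if (ret)
		return ret;

	/* ... program iova, not the doorbell PA, into the MSI message ... */

	iommu_put_reserved_iova(domain, doorbell);	/* drop the binding ref */
	return 0;
}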
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 127+ messages in thread

* [PATCH v7 07/10] iommu/dma-reserved-iommu: delete bindings in iommu_free_reserved_iova_domain
  2016-04-19 16:56 ` Eric Auger
@ 2016-04-19 16:56   ` Eric Auger
  -1 siblings, 0 replies; 127+ messages in thread
From: Eric Auger @ 2016-04-19 16:56 UTC (permalink / raw)
  To: eric.auger, eric.auger, robin.murphy, alex.williamson,
	will.deacon, joro, tglx, jason, marc.zyngier, christoffer.dall,
	linux-arm-kernel
  Cc: patches, linux-kernel, Bharat.Bhushan, pranav.sawargaonkar,
	p.fedin, iommu, Jean-Philippe.Brucker, julien.grall

Now that reserved bindings can exist, destroy them when destroying
the reserved iova domain. iommu_unmap is not guaranteed to be atomic,
hence the extra complexity in the locking.

Signed-off-by: Eric Auger <eric.auger@linaro.org>

---
v6 -> v7:
- remove [PATCH v6 7/7] dma-reserved-iommu: iommu_unmap_reserved and
  destroy the bindings in iommu_free_reserved_iova_domain

v5 -> v6:
- use spin_lock instead of mutex

v3 -> v4:
- previously "iommu/arm-smmu: relinquish reserved resources on
  domain deletion"
---
 drivers/iommu/dma-reserved-iommu.c | 34 ++++++++++++++++++++++++++++------
 1 file changed, 28 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/dma-reserved-iommu.c b/drivers/iommu/dma-reserved-iommu.c
index 426d339..2522235 100644
--- a/drivers/iommu/dma-reserved-iommu.c
+++ b/drivers/iommu/dma-reserved-iommu.c
@@ -157,14 +157,36 @@ void iommu_free_reserved_iova_domain(struct iommu_domain *domain)
 	unsigned long flags;
 	int ret = 0;
 
-	spin_lock_irqsave(&domain->reserved_lock, flags);
-
-	rid = (struct reserved_iova_domain *)domain->reserved_iova_cookie;
-	if (!rid) {
-		ret = -EINVAL;
-		goto unlock;
+	while (1) {
+		struct iommu_reserved_binding *b;
+		struct rb_node *node;
+		dma_addr_t iova;
+		size_t size;
+
+		spin_lock_irqsave(&domain->reserved_lock, flags);
+
+		rid = (struct reserved_iova_domain *)
+				domain->reserved_iova_cookie;
+		if (!rid) {
+			ret = -EINVAL;
+			goto unlock;
+		}
+
+		node = rb_first(&domain->reserved_binding_list);
+		if (!node)
+			break;
+		b = rb_entry(node, struct iommu_reserved_binding, node);
+
+		iova = b->iova;
+		size = b->size;
+
+		while (!kref_put(&b->kref, reserved_binding_release))
+			;
+		spin_unlock_irqrestore(&domain->reserved_lock, flags);
+		iommu_unmap(domain, iova, size);
 	}
 
+	domain->reserved_binding_list = RB_ROOT;
 	domain->reserved_iova_cookie = NULL;
 unlock:
 	spin_unlock_irqrestore(&domain->reserved_lock, flags);
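
The loop above follows a common pattern for applying a non-atomic
operation to elements guarded by a spinlock: record what the operation
needs and detach the element under the lock, then drop the lock before
operating. Restated as a self-contained sketch (not the patch's exact
code):

static void example_drain_bindings(struct iommu_domain *domain)
{
	unsigned long flags;

	while (1) {
		struct iommu_reserved_binding *b;
		struct rb_node *node;
		dma_addr_t iova;
		size_t size;

		spin_lock_irqsave(&domain->reserved_lock, flags);
		node = rb_first(&domain->reserved_binding_list);
		if (!node) {
			spin_unlock_irqrestore(&domain->reserved_lock, flags);
			break;
		}
		b = rb_entry(node, struct iommu_reserved_binding, node);
		iova = b->iova;
		size = b->size;
		/* force the release: unlinks the node and frees its iova */
		while (!kref_put(&b->kref, reserved_binding_release))
			;
		spin_unlock_irqrestore(&domain->reserved_lock, flags);
		iommu_unmap(domain, iova, size);	/* not atomic */
	}
}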
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 127+ messages in thread

* [PATCH v7 08/10] iommu/dma-reserved_iommu: iommu_msi_mapping_desc_to_domain
@ 2016-04-19 16:56   ` Eric Auger
  0 siblings, 0 replies; 127+ messages in thread
From: Eric Auger @ 2016-04-19 16:56 UTC (permalink / raw)
  To: eric.auger, eric.auger, robin.murphy, alex.williamson,
	will.deacon, joro, tglx, jason, marc.zyngier, christoffer.dall,
	linux-arm-kernel
  Cc: patches, linux-kernel, Bharat.Bhushan, pranav.sawargaonkar,
	p.fedin, iommu, Jean-Philippe.Brucker, julien.grall

This function checks whether
- the device emitting the MSI belongs to a non-default iommu domain
- the iommu domain requires the MSI address to be mapped.

If those conditions are met, the function returns the iommu domain
to be used for mapping the MSI doorbell; else it returns NULL.

Signed-off-by: Eric Auger <eric.auger@linaro.org>
---
 drivers/iommu/dma-reserved-iommu.c | 19 +++++++++++++++++++
 include/linux/dma-reserved-iommu.h | 18 ++++++++++++++++++
 2 files changed, 37 insertions(+)

diff --git a/drivers/iommu/dma-reserved-iommu.c b/drivers/iommu/dma-reserved-iommu.c
index 2522235..907a17f 100644
--- a/drivers/iommu/dma-reserved-iommu.c
+++ b/drivers/iommu/dma-reserved-iommu.c
@@ -17,6 +17,7 @@
 
 #include <linux/iommu.h>
 #include <linux/iova.h>
+#include <linux/msi.h>
 
 struct reserved_iova_domain {
 	struct iova_domain *iovad;
@@ -332,3 +333,21 @@ unlock:
 }
 EXPORT_SYMBOL_GPL(iommu_put_reserved_iova);
 
+struct iommu_domain *iommu_msi_mapping_desc_to_domain(struct msi_desc *desc)
+{
+	struct device *dev;
+	struct iommu_domain *d;
+
+	dev = msi_desc_to_dev(desc);
+
+	d = iommu_get_domain_for_dev(dev);
+
+	if (!d || (d->type == IOMMU_DOMAIN_DMA))
+		return NULL;
+
+	if (iommu_domain_get_attr(d, DOMAIN_ATTR_MSI_MAPPING, NULL))
+		return NULL;
+
+	return d;
+}
+EXPORT_SYMBOL_GPL(iommu_msi_mapping_desc_to_domain);
diff --git a/include/linux/dma-reserved-iommu.h b/include/linux/dma-reserved-iommu.h
index 8722131..8373929 100644
--- a/include/linux/dma-reserved-iommu.h
+++ b/include/linux/dma-reserved-iommu.h
@@ -19,6 +19,7 @@
 #include <linux/kernel.h>
 
 struct iommu_domain;
+struct msi_desc;
 
 #ifdef CONFIG_IOMMU_DMA_RESERVED
 
@@ -70,6 +71,17 @@ int iommu_get_reserved_iova(struct iommu_domain *domain,
  * if the binding ref count is null, destroy the reserved mapping
  */
 void iommu_put_reserved_iova(struct iommu_domain *domain, phys_addr_t addr);
+
+/**
+ * iommu_msi_mapping_desc_to_domain: in case the MSI originates from a device
+ * upstream to an IOMMU and this IOMMU translates the MSI transaction,
+ * this function returns the iommu domain the MSI doorbell address must be
+ * mapped in. Else it returns NULL.
+ *
+ * @desc: msi desc handle
+ */
+struct iommu_domain *iommu_msi_mapping_desc_to_domain(struct msi_desc *desc);
+
 #else
 
 static inline int
@@ -93,5 +105,11 @@ static inline int iommu_get_reserved_iova(struct iommu_domain *domain,
 static inline void iommu_put_reserved_iova(struct iommu_domain *domain,
 					   phys_addr_t addr) {}
 
+static inline struct iommu_domain *
+iommu_msi_mapping_desc_to_domain(struct msi_desc *desc)
+{
+	return NULL;
+}
+
 #endif	/* CONFIG_IOMMU_DMA_RESERVED */
 #endif	/* __DMA_RESERVED_IOMMU_H */
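
For illustration, an MSI compose path would be expected to use this check
as a gate, along these lines (a sketch; example_compose is a made-up name
and the actual doorbell translation comes in the next patch):

#include <linux/msi.h>

/* Sketch: decide whether the composed MSI message needs translation */
static void example_compose(struct msi_desc *desc, struct msi_msg *msg)
{
	struct iommu_domain *d = iommu_msi_mapping_desc_to_domain(desc);

	if (!d)
		return;	/* no translating IOMMU: keep the doorbell PA */

	/* here the doorbell PA in msg must be swapped for its reserved IOVA */
}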
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 127+ messages in thread

* [PATCH v7 09/10] iommu/dma-reserved-iommu: iommu_msi_mapping_translate_msg
@ 2016-04-19 16:56   ` Eric Auger
  0 siblings, 0 replies; 127+ messages in thread
From: Eric Auger @ 2016-04-19 16:56 UTC (permalink / raw)
  To: eric.auger, eric.auger, robin.murphy, alex.williamson,
	will.deacon, joro, tglx, jason, marc.zyngier, christoffer.dall,
	linux-arm-kernel
  Cc: patches, linux-kernel, Bharat.Bhushan, pranav.sawargaonkar,
	p.fedin, iommu, Jean-Philippe.Brucker, julien.grall

Introduce iommu_msi_mapping_translate_msg, whose role consists in
detecting whether the device's MSIs must be mapped into an IOMMU.
In case they must, the function overrides the MSI msg originally
composed and replaces the doorbell's PA with a pre-allocated and
pre-mapped reserved IOVA. In case the corresponding PA region is not
found, the function returns an error.

This function is likely to be called in some code that cannot sleep. This
is the reason why the allocation/mapping is done separately.

Signed-off-by: Eric Auger <eric.auger@linaro.org>

---

v7: creation
---
 drivers/iommu/dma-reserved-iommu.c | 69 ++++++++++++++++++++++++++++++++++++++
 include/linux/dma-reserved-iommu.h | 27 +++++++++++++++
 2 files changed, 96 insertions(+)

diff --git a/drivers/iommu/dma-reserved-iommu.c b/drivers/iommu/dma-reserved-iommu.c
index 907a17f..603ee45 100644
--- a/drivers/iommu/dma-reserved-iommu.c
+++ b/drivers/iommu/dma-reserved-iommu.c
@@ -18,6 +18,14 @@
 #include <linux/iommu.h>
 #include <linux/iova.h>
 #include <linux/msi.h>
+#include <linux/irq.h>
+
+#ifdef CONFIG_PHYS_ADDR_T_64BIT
+#define msg_to_phys_addr(msg) \
+	(((phys_addr_t)((msg)->address_hi) << 32) | (msg)->address_lo)
+#else
+#define msg_to_phys_addr(msg)	((msg)->address_lo)
+#endif
 
 struct reserved_iova_domain {
 	struct iova_domain *iovad;
@@ -351,3 +359,64 @@ struct iommu_domain *iommu_msi_mapping_desc_to_domain(struct msi_desc *desc)
 	return d;
 }
 EXPORT_SYMBOL_GPL(iommu_msi_mapping_desc_to_domain);
+
+static dma_addr_t iommu_find_reserved_iova(struct iommu_domain *domain,
+				    phys_addr_t addr, size_t size)
+{
+	unsigned long  base_pfn, end_pfn, nb_iommu_pages, order, flags;
+	size_t iommu_page_size, binding_size;
+	struct iommu_reserved_binding *b;
+	phys_addr_t aligned_base, offset;
+	dma_addr_t iova = DMA_ERROR_CODE;
+	struct iova_domain *iovad;
+
+	spin_lock_irqsave(&domain->reserved_lock, flags);
+
+	iovad = (struct iova_domain *)domain->reserved_iova_cookie;
+
+	if (!iovad)
+		goto unlock;
+
+	order = iova_shift(iovad);
+	base_pfn = addr >> order;
+	end_pfn = (addr + size - 1) >> order;
+	aligned_base = base_pfn << order;
+	offset = addr - aligned_base;
+	nb_iommu_pages = end_pfn - base_pfn + 1;
+	iommu_page_size = 1 << order;
+	binding_size = nb_iommu_pages * iommu_page_size;
+
+	b = find_reserved_binding(domain, aligned_base, binding_size);
+	if (b && (b->addr <= aligned_base) &&
+		(aligned_base + binding_size <=  b->addr + b->size))
+		iova = b->iova + offset + aligned_base - b->addr;
+unlock:
+	spin_unlock_irqrestore(&domain->reserved_lock, flags);
+	return iova;
+}
+
+int iommu_msi_mapping_translate_msg(struct irq_data *data, struct msi_msg *msg)
+{
+	struct iommu_domain *d;
+	struct msi_desc *desc;
+	dma_addr_t iova;
+
+	desc = irq_data_get_msi_desc(data);
+	if (!desc)
+		return -EINVAL;
+
+	d = iommu_msi_mapping_desc_to_domain(desc);
+	if (!d)
+		return 0;
+
+	iova = iommu_find_reserved_iova(d, msg_to_phys_addr(msg),
+					sizeof(phys_addr_t));
+
+	if (iova == DMA_ERROR_CODE)
+		return -EINVAL;
+
+	msg->address_lo = lower_32_bits(iova);
+	msg->address_hi = upper_32_bits(iova);
+	return 0;
+}
+EXPORT_SYMBOL_GPL(iommu_msi_mapping_translate_msg);
diff --git a/include/linux/dma-reserved-iommu.h b/include/linux/dma-reserved-iommu.h
index 8373929..04e1912f 100644
--- a/include/linux/dma-reserved-iommu.h
+++ b/include/linux/dma-reserved-iommu.h
@@ -20,6 +20,8 @@
 
 struct iommu_domain;
 struct msi_desc;
+struct irq_data;
+struct msi_msg;
 
 #ifdef CONFIG_IOMMU_DMA_RESERVED
 
@@ -82,6 +84,25 @@ void iommu_put_reserved_iova(struct iommu_domain *domain, phys_addr_t addr);
  */
 struct iommu_domain *iommu_msi_mapping_desc_to_domain(struct msi_desc *desc);
 
+/**
+ * iommu_msi_mapping_translate_msg: in case the MSI transaction is translated
+ * by an IOMMU, the msg address must be an IOVA instead of a physical address.
+ * This function overwrites the original MSI message containing the doorbell
+ * physical address, result of the primary composition, with the doorbell IOVA.
+ *
+ * The doorbell physical address must be bound previously to an IOVA using
+ * iommu_get_reserved_iova
+ *
+ * @data: irq data handle
+ * @msg: original msi message containing the PA to be overwritten with
+ * the IOVA
+ *
+ * return 0 if the MSI does not need to be mapped or when the PA/IOVA
+ * were successfully swapped; return -EINVAL if the addresses need
+ * to be swapped but no IOMMU binding is found
+ */
+int iommu_msi_mapping_translate_msg(struct irq_data *data, struct msi_msg *msg);
+
 #else
 
 static inline int
@@ -111,5 +132,11 @@ iommu_msi_mapping_desc_to_domain(struct msi_desc *desc)
 	return NULL;
 }
 
+static inline int iommu_msi_mapping_translate_msg(struct irq_data *data,
+						  struct msi_msg *msg)
+{
+	return 0;
+}
+
 #endif	/* CONFIG_IOMMU_DMA_RESERVED */
 #endif	/* __DMA_RESERVED_IOMMU_H */
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 127+ messages in thread
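
To make the page-alignment arithmetic of iommu_find_reserved_iova above
concrete, here is a stand-alone sketch (plain user-space C, with an
invented doorbell address) that performs the same computation for 4kB
IOMMU pages; the kernel code passes sizeof(phys_addr_t) as the size:

#include <stdio.h>
#include <stdint.h>

int main(void)
{
	uint64_t addr = 0x10023a40;	/* example doorbell PA */
	uint64_t size = 8;		/* sizeof(phys_addr_t) */
	unsigned int order = 12;	/* iova_shift(): 4kB pages */

	uint64_t base_pfn = addr >> order;
	uint64_t end_pfn = (addr + size - 1) >> order;
	uint64_t aligned_base = base_pfn << order;
	uint64_t offset = addr - aligned_base;
	uint64_t binding_size = (end_pfn - base_pfn + 1) << order;

	/*
	 * The binding lookup covers [aligned_base, aligned_base +
	 * binding_size); the returned IOVA re-applies the sub-page offset.
	 */
	printf("aligned_base=%#llx offset=%#llx binding_size=%#llx\n",
	       (unsigned long long)aligned_base,
	       (unsigned long long)offset,
	       (unsigned long long)binding_size);
	return 0;
}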

* [PATCH v7 10/10] iommu/arm-smmu: call iommu_free_reserved_iova_domain on domain destruction
@ 2016-04-19 16:56   ` Eric Auger
  0 siblings, 0 replies; 127+ messages in thread
From: Eric Auger @ 2016-04-19 16:56 UTC (permalink / raw)
  To: eric.auger, eric.auger, robin.murphy, alex.williamson,
	will.deacon, joro, tglx, jason, marc.zyngier, christoffer.dall,
	linux-arm-kernel
  Cc: patches, linux-kernel, Bharat.Bhushan, pranav.sawargaonkar,
	p.fedin, iommu, Jean-Philippe.Brucker, julien.grall

When the domain gets destroyed, let's make sure all reserved iova
resources get released.

The choice is made to put that call in arm-smmu(-v3).c, doing something
similar to what was done for iommu_put_dma_cookie.

Signed-off-by: Eric Auger <eric.auger@linaro.org>

---

v7: new
---
 drivers/iommu/arm-smmu-v3.c | 2 ++
 drivers/iommu/arm-smmu.c    | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index a077a35..afd0dac 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -22,6 +22,7 @@
 
 #include <linux/delay.h>
 #include <linux/dma-iommu.h>
+#include <linux/dma-reserved-iommu.h>
 #include <linux/err.h>
 #include <linux/interrupt.h>
 #include <linux/iommu.h>
@@ -1444,6 +1445,7 @@ static void arm_smmu_domain_free(struct iommu_domain *domain)
 	struct arm_smmu_device *smmu = smmu_domain->smmu;
 
 	iommu_put_dma_cookie(domain);
+	iommu_free_reserved_iova_domain(domain);
 	free_io_pgtable_ops(smmu_domain->pgtbl_ops);
 
 	/* Free the CD and ASID, if we allocated them */
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 8cd7b8a..492339f 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -30,6 +30,7 @@
 
 #include <linux/delay.h>
 #include <linux/dma-iommu.h>
+#include <linux/dma-reserved-iommu.h>
 #include <linux/dma-mapping.h>
 #include <linux/err.h>
 #include <linux/interrupt.h>
@@ -1009,6 +1010,7 @@ static void arm_smmu_domain_free(struct iommu_domain *domain)
 	 * already been detached.
 	 */
 	iommu_put_dma_cookie(domain);
+	iommu_free_reserved_iova_domain(domain);
 	arm_smmu_destroy_domain_context(domain);
 	kfree(smmu_domain);
 }
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 127+ messages in thread
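
For context, the intended split of responsibilities across the series,
sketched as a comment; this is reconstructed from the cover letter and
from the declarations quoted in the earlier patches, not from this hunk:

/*
 * Sketch of the intended call flow:
 *
 * 1. VFIO creates the reserved IOVA domain for the iommu_domain and
 *    binds each MSI doorbell PA to an IOVA with iommu_get_reserved_iova()
 *    (bindings are reference counted and indexed by PA in an RB-tree).
 * 2. At MSI composition time, iommu_msi_mapping_translate_msg() swaps
 *    the doorbell PA in the msi_msg for the bound IOVA.
 * 3. References are dropped with iommu_put_reserved_iova(); whatever is
 *    left over is released by the iommu_free_reserved_iova_domain()
 *    calls added above when the domain itself is freed.
 */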

* Re: [PATCH v7 09/10] iommu/dma-reserved-iommu: iommu_msi_mapping_translate_msg
  2016-04-19 16:56   ` Eric Auger
@ 2016-04-20  9:38     ` Marc Zyngier
  -1 siblings, 0 replies; 127+ messages in thread
From: Marc Zyngier @ 2016-04-20  9:38 UTC (permalink / raw)
  To: Eric Auger, eric.auger, robin.murphy, alex.williamson,
	will.deacon, joro, tglx, jason, christoffer.dall,
	linux-arm-kernel
  Cc: patches, linux-kernel, Bharat.Bhushan, pranav.sawargaonkar,
	p.fedin, iommu, Jean-Philippe.Brucker, julien.grall

On 19/04/16 17:56, Eric Auger wrote:
> Introduce iommu_msi_mapping_translate_msg, whose role consists in
> detecting whether the device's MSIs must be mapped into an IOMMU.
> In case they must, the function overrides the MSI msg originally
> composed and replaces the doorbell's PA with a pre-allocated and
> pre-mapped reserved IOVA. In case the corresponding PA region is not
> found, the function returns an error.
> 
> This function is likely to be called in some code that cannot sleep. This
> is the reason why the allocation/mapping is done separately.
> 
> Signed-off-by: Eric Auger <eric.auger@linaro.org>
> 
> ---
> 
> v7: creation
> ---
>  drivers/iommu/dma-reserved-iommu.c | 69 ++++++++++++++++++++++++++++++++++++++
>  include/linux/dma-reserved-iommu.h | 27 +++++++++++++++
>  2 files changed, 96 insertions(+)
> 
> diff --git a/drivers/iommu/dma-reserved-iommu.c b/drivers/iommu/dma-reserved-iommu.c
> index 907a17f..603ee45 100644
> --- a/drivers/iommu/dma-reserved-iommu.c
> +++ b/drivers/iommu/dma-reserved-iommu.c
> @@ -18,6 +18,14 @@
>  #include <linux/iommu.h>
>  #include <linux/iova.h>
>  #include <linux/msi.h>
> +#include <linux/irq.h>
> +
> +#ifdef CONFIG_PHYS_ADDR_T_64BIT
> +#define msg_to_phys_addr(msg) \
> +	(((phys_addr_t)((msg)->address_hi) << 32) | (msg)->address_lo)
> +#else
> +#define msg_to_phys_addr(msg)	((msg)->address_lo)
> +#endif
>  
>  struct reserved_iova_domain {
>  	struct iova_domain *iovad;
> @@ -351,3 +359,64 @@ struct iommu_domain *iommu_msi_mapping_desc_to_domain(struct msi_desc *desc)
>  	return d;
>  }
>  EXPORT_SYMBOL_GPL(iommu_msi_mapping_desc_to_domain);
> +
> +static dma_addr_t iommu_find_reserved_iova(struct iommu_domain *domain,
> +				    phys_addr_t addr, size_t size)
> +{
> +	unsigned long  base_pfn, end_pfn, nb_iommu_pages, order, flags;
> +	size_t iommu_page_size, binding_size;
> +	struct iommu_reserved_binding *b;
> +	phys_addr_t aligned_base, offset;
> +	dma_addr_t iova = DMA_ERROR_CODE;
> +	struct iova_domain *iovad;
> +
> +	spin_lock_irqsave(&domain->reserved_lock, flags);
> +
> +	iovad = (struct iova_domain *)domain->reserved_iova_cookie;
> +
> +	if (!iovad)
> +		goto unlock;
> +
> +	order = iova_shift(iovad);
> +	base_pfn = addr >> order;
> +	end_pfn = (addr + size - 1) >> order;
> +	aligned_base = base_pfn << order;
> +	offset = addr - aligned_base;
> +	nb_iommu_pages = end_pfn - base_pfn + 1;
> +	iommu_page_size = 1 << order;
> +	binding_size = nb_iommu_pages * iommu_page_size;
> +
> +	b = find_reserved_binding(domain, aligned_base, binding_size);
> +	if (b && (b->addr <= aligned_base) &&
> +		(aligned_base + binding_size <=  b->addr + b->size))
> +		iova = b->iova + offset + aligned_base - b->addr;
> +unlock:
> +	spin_unlock_irqrestore(&domain->reserved_lock, flags);
> +	return iova;
> +}
> +
> +int iommu_msi_mapping_translate_msg(struct irq_data *data, struct msi_msg *msg)
> +{

I'm really not keen on passing a full irq_data to something that doesn't
really care about it. What this really needs is the device that
generates the MSIs, which makes a lot more sense (you get the device and
the address from the msi_msg).

Or am I getting it wrong?

> +	struct iommu_domain *d;
> +	struct msi_desc *desc;
> +	dma_addr_t iova;
> +
> +	desc = irq_data_get_msi_desc(data);
> +	if (!desc)
> +		return -EINVAL;
> +
> +	d = iommu_msi_mapping_desc_to_domain(desc);
> +	if (!d)
> +		return 0;
> +
> +	iova = iommu_find_reserved_iova(d, msg_to_phys_addr(msg),
> +					sizeof(phys_addr_t));
> +
> +	if (iova == DMA_ERROR_CODE)
> +		return -EINVAL;
> +
> +	msg->address_lo = lower_32_bits(iova);
> +	msg->address_hi = upper_32_bits(iova);
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(iommu_msi_mapping_translate_msg);
> diff --git a/include/linux/dma-reserved-iommu.h b/include/linux/dma-reserved-iommu.h
> index 8373929..04e1912f 100644
> --- a/include/linux/dma-reserved-iommu.h
> +++ b/include/linux/dma-reserved-iommu.h
> @@ -20,6 +20,8 @@
>  
>  struct iommu_domain;
>  struct msi_desc;
> +struct irq_data;
> +struct msi_msg;
>  
>  #ifdef CONFIG_IOMMU_DMA_RESERVED
>  
> @@ -82,6 +84,25 @@ void iommu_put_reserved_iova(struct iommu_domain *domain, phys_addr_t addr);
>   */
>  struct iommu_domain *iommu_msi_mapping_desc_to_domain(struct msi_desc *desc);
>  
> +/**
> + * iommu_msi_mapping_translate_msg: in case the MSI transaction is translated
> + * by an IOMMU, the msg address must be an IOVA instead of a physical address.
> + * This function overwrites the original MSI message containing the doorbell
> + * physical address, result of the primary composition, with the doorbell IOVA.
> + *
> + * The doorbell physical address must be bound previously to an IOVA using
> + * iommu_get_reserved_iova
> + *
> + * @data: irq data handle
> + * @msg: original msi message containing the PA to be overwritten with
> + * the IOVA
> + *
> + * return 0 if the MSI does not need to be mapped or when the PA/IOVA
> + * were successfully swapped; return -EINVAL if the addresses need
> + * to be swapped but no IOMMU binding is found
> + */
> +int iommu_msi_mapping_translate_msg(struct irq_data *data, struct msi_msg *msg);
> +
>  #else
>  
>  static inline int
> @@ -111,5 +132,11 @@ iommu_msi_mapping_desc_to_domain(struct msi_desc *desc)
>  	return NULL;
>  }
>  
> +static inline int iommu_msi_mapping_translate_msg(struct irq_data *data,
> +						  struct msi_msg *msg)
> +{
> +	return 0;
> +}
> +
>  #endif	/* CONFIG_IOMMU_DMA_RESERVED */
>  #endif	/* __DMA_RESERVED_IOMMU_H */
> 

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 127+ messages in thread
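
To illustrate the suggestion, a hedged sketch of a device-based entry
point; the name anticipates the iommu_msi_msg_pa_to_va rename Eric
agrees to below, the body merely rearranges code already present in
dma-reserved-iommu.c, and none of this is part of the posted series:

/* Sketch only: take the device instead of irq_data, as suggested above. */
int iommu_msi_msg_pa_to_va(struct device *dev, struct msi_msg *msg)
{
	struct iommu_domain *d = iommu_get_domain_for_dev(dev);
	dma_addr_t iova;

	/* Same filtering as iommu_msi_mapping_desc_to_domain() */
	if (!d || d->type == IOMMU_DOMAIN_DMA ||
	    iommu_domain_get_attr(d, DOMAIN_ATTR_MSI_MAPPING, NULL))
		return 0;	/* nothing to translate */

	iova = iommu_find_reserved_iova(d, msg_to_phys_addr(msg),
					sizeof(phys_addr_t));
	if (iova == DMA_ERROR_CODE)
		return -EINVAL;

	msg->address_lo = lower_32_bits(iova);
	msg->address_hi = upper_32_bits(iova);
	return 0;
}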

* Re: [PATCH v7 01/10] iommu: Add DOMAIN_ATTR_MSI_MAPPING attribute
@ 2016-04-20 12:47     ` Robin Murphy
  0 siblings, 0 replies; 127+ messages in thread
From: Robin Murphy @ 2016-04-20 12:47 UTC (permalink / raw)
  To: Eric Auger, eric.auger, alex.williamson, will.deacon, joro, tglx,
	jason, marc.zyngier, christoffer.dall, linux-arm-kernel
  Cc: patches, linux-kernel, Bharat.Bhushan, pranav.sawargaonkar,
	p.fedin, iommu, Jean-Philippe.Brucker, julien.grall

Hi Eric,

On 19/04/16 17:56, Eric Auger wrote:
> Introduce a new DOMAIN_ATTR_MSI_MAPPING domain attribute. If supported,
> this means the MSI addresses need to be mapped in the IOMMU.
>
> x86 IOMMUs typically don't expose the attribute since on x86, MSI write
> transaction addresses always fall within the 1MB PA window [FEE0_0000h -
> FEF0_0000h], which directly targets the APIC configuration space, so such
> writes bypass the IOMMU. On ARM and PowerPC however MSI transactions are
> conveyed through the IOMMU.

What's stopping us from simply inferring this from the domain's IOMMU 
not advertising interrupt remapping capabilities?

Robin.

> Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com>
> Signed-off-by: Eric Auger <eric.auger@linaro.org>
>
> ---
>
> v4 -> v5:
> - introduce the user in the next patch
>
> RFC v1 -> v1:
> - the data field is not used
> - for this attribute domain_get_attr simply returns 0 if the MSI_MAPPING
>    capability if needed or <0 if not.
> - removed struct iommu_domain_msi_maps
> ---
>   include/linux/iommu.h | 1 +
>   1 file changed, 1 insertion(+)
>
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index 62a5eae..b3e8c5b 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -113,6 +113,7 @@ enum iommu_attr {
>   	DOMAIN_ATTR_FSL_PAMU_ENABLE,
>   	DOMAIN_ATTR_FSL_PAMUV1,
>   	DOMAIN_ATTR_NESTING,	/* two stages of translation */
> +	DOMAIN_ATTR_MSI_MAPPING, /* Require MSIs mapping in iommu */
>   	DOMAIN_ATTR_MAX,
>   };
>
>

^ permalink raw reply	[flat|nested] 127+ messages in thread
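
The alternative Robin hints at can be expressed with the existing
capability API; a sketch, with the caveat that it keys off the bus-level
IOMMU rather than the device's actual domain:

/* Sketch: infer the need for MSI mapping without a new attribute. */
static bool msi_needs_iommu_mapping(struct device *dev)
{
	/*
	 * An IOMMU advertising IOMMU_CAP_INTR_REMAP catches MSI writes
	 * before translation; one that does not translates MSI writes
	 * like any other DMA, so the doorbell must be mapped.
	 */
	return !iommu_capable(dev->bus, IOMMU_CAP_INTR_REMAP);
}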

* [PATCH v7 09/10] iommu/dma-reserved-iommu: iommu_msi_mapping_translate_msg
@ 2016-04-20 12:50       ` Eric Auger
  0 siblings, 0 replies; 127+ messages in thread
From: Eric Auger @ 2016-04-20 12:50 UTC (permalink / raw)
  To: linux-arm-kernel

On 04/20/2016 11:38 AM, Marc Zyngier wrote:
> On 19/04/16 17:56, Eric Auger wrote:
>> Introduce iommu_msi_mapping_translate_msg, whose role consists in
>> detecting whether the device's MSIs must be mapped into an IOMMU.
>> In case they must, the function overrides the MSI msg originally
>> composed and replaces the doorbell's PA with a pre-allocated and
>> pre-mapped reserved IOVA. In case the corresponding PA region is not
>> found, the function returns an error.
>>
>> This function is likely to be called in some code that cannot sleep. This
>> is the reason why the allocation/mapping is done separately.
>>
>> Signed-off-by: Eric Auger <eric.auger@linaro.org>
>>
>> ---
>>
>> v7: creation
>> ---
>>  drivers/iommu/dma-reserved-iommu.c | 69 ++++++++++++++++++++++++++++++++++++++
>>  include/linux/dma-reserved-iommu.h | 27 +++++++++++++++
>>  2 files changed, 96 insertions(+)
>>
>> diff --git a/drivers/iommu/dma-reserved-iommu.c b/drivers/iommu/dma-reserved-iommu.c
>> index 907a17f..603ee45 100644
>> --- a/drivers/iommu/dma-reserved-iommu.c
>> +++ b/drivers/iommu/dma-reserved-iommu.c
>> @@ -18,6 +18,14 @@
>>  #include <linux/iommu.h>
>>  #include <linux/iova.h>
>>  #include <linux/msi.h>
>> +#include <linux/irq.h>
>> +
>> +#ifdef CONFIG_PHYS_ADDR_T_64BIT
>> +#define msg_to_phys_addr(msg) \
>> +	(((phys_addr_t)((msg)->address_hi) << 32) | (msg)->address_lo)
>> +#else
>> +#define msg_to_phys_addr(msg)	((msg)->address_lo)
>> +#endif
>>  
>>  struct reserved_iova_domain {
>>  	struct iova_domain *iovad;
>> @@ -351,3 +359,64 @@ struct iommu_domain *iommu_msi_mapping_desc_to_domain(struct msi_desc *desc)
>>  	return d;
>>  }
>>  EXPORT_SYMBOL_GPL(iommu_msi_mapping_desc_to_domain);
>> +
>> +static dma_addr_t iommu_find_reserved_iova(struct iommu_domain *domain,
>> +				    phys_addr_t addr, size_t size)
>> +{
>> +	unsigned long  base_pfn, end_pfn, nb_iommu_pages, order, flags;
>> +	size_t iommu_page_size, binding_size;
>> +	struct iommu_reserved_binding *b;
>> +	phys_addr_t aligned_base, offset;
>> +	dma_addr_t iova = DMA_ERROR_CODE;
>> +	struct iova_domain *iovad;
>> +
>> +	spin_lock_irqsave(&domain->reserved_lock, flags);
>> +
>> +	iovad = (struct iova_domain *)domain->reserved_iova_cookie;
>> +
>> +	if (!iovad)
>> +		goto unlock;
>> +
>> +	order = iova_shift(iovad);
>> +	base_pfn = addr >> order;
>> +	end_pfn = (addr + size - 1) >> order;
>> +	aligned_base = base_pfn << order;
>> +	offset = addr - aligned_base;
>> +	nb_iommu_pages = end_pfn - base_pfn + 1;
>> +	iommu_page_size = 1 << order;
>> +	binding_size = nb_iommu_pages * iommu_page_size;
>> +
>> +	b = find_reserved_binding(domain, aligned_base, binding_size);
>> +	if (b && (b->addr <= aligned_base) &&
>> +		(aligned_base + binding_size <=  b->addr + b->size))
>> +		iova = b->iova + offset + aligned_base - b->addr;
>> +unlock:
>> +	spin_unlock_irqrestore(&domain->reserved_lock, flags);
>> +	return iova;
>> +}
>> +
>> +int iommu_msi_mapping_translate_msg(struct irq_data *data, struct msi_msg *msg)
>> +{
> 
> I'm really not keen on passing a full irq_data to something that doesn't
> really care about it. What this really needs is the device that
> generates the MSIs, which makes a lot more sense (you get the device and
> the address from the msi_msg).
> 
> Or am I getting it wrong?

No I have no objection. I eventually decided to keep the irq_data*
parameter to be homegeneous with irq_chip_compose_msi_msg and I found
the function name understandable. However I also understand it looks
weird to get this irq-data incursion in this API

will change into iommu_msi_msg_pa_to_va, as you proposed.

Thank you for your time

Eric


> 
>> +	struct iommu_domain *d;
>> +	struct msi_desc *desc;
>> +	dma_addr_t iova;
>> +
>> +	desc = irq_data_get_msi_desc(data);
>> +	if (!desc)
>> +		return -EINVAL;
>> +
>> +	d = iommu_msi_mapping_desc_to_domain(desc);
>> +	if (!d)
>> +		return 0;
>> +
>> +	iova = iommu_find_reserved_iova(d, msg_to_phys_addr(msg),
>> +					sizeof(phys_addr_t));
>> +
>> +	if (iova == DMA_ERROR_CODE)
>> +		return -EINVAL;
>> +
>> +	msg->address_lo = lower_32_bits(iova);
>> +	msg->address_hi = upper_32_bits(iova);
>> +	return 0;
>> +}
>> +EXPORT_SYMBOL_GPL(iommu_msi_mapping_translate_msg);
>> diff --git a/include/linux/dma-reserved-iommu.h b/include/linux/dma-reserved-iommu.h
>> index 8373929..04e1912f 100644
>> --- a/include/linux/dma-reserved-iommu.h
>> +++ b/include/linux/dma-reserved-iommu.h
>> @@ -20,6 +20,8 @@
>>  
>>  struct iommu_domain;
>>  struct msi_desc;
>> +struct irq_data;
>> +struct msi_msg;
>>  
>>  #ifdef CONFIG_IOMMU_DMA_RESERVED
>>  
>> @@ -82,6 +84,25 @@ void iommu_put_reserved_iova(struct iommu_domain *domain, phys_addr_t addr);
>>   */
>>  struct iommu_domain *iommu_msi_mapping_desc_to_domain(struct msi_desc *desc);
>>  
>> +/**
>> + * iommu_msi_mapping_translate_msg: in case the MSI transaction is translated
>> + * by an IOMMU, the msg address must be an IOVA instead of a physical address.
>> + * This function overwrites the original MSI message containing the doorbell
>> + * physical address, result of the primary composition, with the doorbell IOVA.
>> + *
>> + * The doorbell physical address must be bound previously to an IOVA using
>> + * iommu_get_reserved_iova
>> + *
>> + * @data: irq data handle
>> + * @msg: original msi message containing the PA to be overwritten with
>> + * the IOVA
>> + *
>> + * return 0 if the MSI does not need to be mapped or when the PA/IOVA
>> + * were successfully swapped; return -EINVAL if the addresses need
>> + * to be swapped but not IOMMU binding is found
>> + */
>> +int iommu_msi_mapping_translate_msg(struct irq_data *data, struct msi_msg *msg);
>> +
>>  #else
>>  
>>  static inline int
>> @@ -111,5 +132,11 @@ iommu_msi_mapping_desc_to_domain(struct msi_desc *desc)
>>  	return NULL;
>>  }
>>  
>> +static inline int iommu_msi_mapping_translate_msg(struct irq_data *data,
>> +						  struct msi_msg *msg)
>> +{
>> +	return 0;
>> +}
>> +
>>  #endif	/* CONFIG_IOMMU_DMA_RESERVED */
>>  #endif	/* __DMA_RESERVED_IOMMU_H */
>>
> 
> Thanks,
> 
> 	M.
> 

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v7 03/10] iommu: introduce a reserved iova cookie
@ 2016-04-20 12:55     ` Robin Murphy
  0 siblings, 0 replies; 127+ messages in thread
From: Robin Murphy @ 2016-04-20 12:55 UTC (permalink / raw)
  To: Eric Auger, eric.auger, alex.williamson, will.deacon, joro, tglx,
	jason, marc.zyngier, christoffer.dall, linux-arm-kernel
  Cc: patches, linux-kernel, Bharat.Bhushan, pranav.sawargaonkar,
	p.fedin, iommu, Jean-Philippe.Brucker, julien.grall

On 19/04/16 17:56, Eric Auger wrote:
> This patch introduces some new fields in the iommu_domain struct,
> dedicated to reserved iova management.
>
> In a similar way to the DMA mapping IOVA window, we need to store
> information related to a reserved IOVA window.
>
> The reserved_iova_cookie will store the reserved iova_domain
> handle. An RB tree indexed by physical address is introduced to
> store the host physical addresses bound to reserved IOVAs.
>
> Those physical addresses will correspond to MSI frame base
> addresses, also referred to as doorbells. Their number should be
> quite limited per domain.
>
> Also a spin_lock is introduced to protect accesses to the iova_domain
> and RB tree. The choice of a spin_lock is driven by the fact that the
> RB tree will need to be accessed from MSI controller code that is not
> allowed to sleep.
>
> Signed-off-by: Eric Auger <eric.auger@linaro.org>
>
> ---
> v5 -> v6:
> - initialize reserved_binding_list
> - use a spinlock instead of a mutex
> ---
>   drivers/iommu/iommu.c | 2 ++
>   include/linux/iommu.h | 6 ++++++
>   2 files changed, 8 insertions(+)
>
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index b9df141..f70ef3b 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -1073,6 +1073,8 @@ static struct iommu_domain *__iommu_domain_alloc(struct bus_type *bus,
>
>   	domain->ops  = bus->iommu_ops;
>   	domain->type = type;
> +	spin_lock_init(&domain->reserved_lock);
> +	domain->reserved_binding_list = RB_ROOT;
>
>   	return domain;
>   }
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index b3e8c5b..60999db 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -24,6 +24,7 @@
>   #include <linux/of.h>
>   #include <linux/types.h>
>   #include <linux/scatterlist.h>
> +#include <linux/spinlock.h>
>   #include <trace/events/iommu.h>
>
>   #define IOMMU_READ	(1 << 0)
> @@ -83,6 +84,11 @@ struct iommu_domain {
>   	void *handler_token;
>   	struct iommu_domain_geometry geometry;
>   	void *iova_cookie;
> +	void *reserved_iova_cookie;

Why exactly do we need this? From your description, it's for the user of 
the domain to keep track of IOVA allocations in, but then that's 
precisely what the iova_cookie exists for.

> +	/* rb tree indexed by PA, for reserved bindings only */
> +	struct rb_root reserved_binding_list;

Nit: that's more puzzling than helpful - "reserved binding" is 
particularly vague and nondescript, and makes me think of anything but 
MSI descriptors. Plus it's called a list but isn't a list (that said, 
given that we'd typically only expect a handful of entries, and lookups 
are hardly going to be a performance-critical bottleneck, would a simple 
list not suffice?)
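
Something like the below would do (untested sketch; I've assumed a
"node" list_head in the binding struct and a "reserved_bindings" list
head in the domain):

static struct iommu_reserved_binding *
find_binding(struct iommu_domain *domain, phys_addr_t addr, size_t size)
{
	struct iommu_reserved_binding *b;

	/* a linear scan is fine for a handful of doorbells */
	list_for_each_entry(b, &domain->reserved_bindings, node) {
		if (b->addr <= addr && addr + size <= b->addr + b->size)
			return b;
	}
	return NULL;
}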

> +	/* protects reserved cookie and rbtree manipulation */
> +	spinlock_t reserved_lock;

A cookie is an opaque structure, so any locking it needs would normally 
be hidden within. If on the other hand it's not meant to be opaque at 
this level, then it should probably be something more specific than a 
void * (if at all, as above).
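
i.e. a concrete type that owns its own lock, along these lines (sketch
only; member names made up):

struct reserved_iova_cookie {
	struct iova_domain	iovad;
	struct list_head	bindings;
	spinlock_t		lock;	/* internal to the cookie */
};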

Robin.

>   };
>
>   enum iommu_cap {
>

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v7 04/10] iommu/dma-reserved-iommu: alloc/free_reserved_iova_domain
@ 2016-04-20 13:03     ` Robin Murphy
  0 siblings, 0 replies; 127+ messages in thread
From: Robin Murphy @ 2016-04-20 13:03 UTC (permalink / raw)
  To: Eric Auger, eric.auger, alex.williamson, will.deacon, joro, tglx,
	jason, marc.zyngier, christoffer.dall, linux-arm-kernel
  Cc: patches, linux-kernel, Bharat.Bhushan, pranav.sawargaonkar,
	p.fedin, iommu, Jean-Philippe.Brucker, julien.grall

On 19/04/16 17:56, Eric Auger wrote:
> Introduce alloc/free_reserved_iova_domain in the IOMMU API.
> alloc_reserved_iova_domain initializes an iova domain at a given
> iova base address and with a given size. This iova domain will
> be used to allocate IOVAs within that window. Those IOVAs will be reserved
> for special purposes, typically MSI frame binding. Allocation functions
> within the reserved iova domain will be introduced in subsequent patches.

Am I the only one thinking "Yo dawg, I heard you like IOVA allocators, 
so we put an IOVA allocator in your IOVA allocator so you can allocate 
IOVAs while you allocate IOVAs."

> Those functions are fully implemented if CONFIG_IOMMU_DMA_RESERVED
> is set.

DMA? I thought we were going to be dealing with MSI descriptors?

> Signed-off-by: Eric Auger <eric.auger@linaro.org>
>
> ---
> v7:
> - fix locking
> - add iova_cache_get/put
> - static inline functions when CONFIG_IOMMU_DMA_RESERVED is not set
> - introduce struct reserved_iova_domain to encapsulate prot info &
>    add prot parameter in alloc_reserved_iova_domain
>
> v5 -> v6:
> - use spin lock instead of mutex
>
> v3 -> v4:
> - formerly in "iommu/arm-smmu: implement alloc/free_reserved_iova_domain" &
>    "iommu: add alloc/free_reserved_iova_domain"
>
> v2 -> v3:
> - remove iommu_alloc_reserved_iova_domain & iommu_free_reserved_iova_domain
>    static implementation in case CONFIG_IOMMU_API is not set
>
> v1 -> v2:
> - moved from vfio API to IOMMU API
> ---
>   drivers/iommu/Kconfig              |  8 +++
>   drivers/iommu/Makefile             |  1 +
>   drivers/iommu/dma-reserved-iommu.c | 99 ++++++++++++++++++++++++++++++++++++++
>   include/linux/dma-reserved-iommu.h | 59 +++++++++++++++++++++++
>   4 files changed, 167 insertions(+)
>   create mode 100644 drivers/iommu/dma-reserved-iommu.c
>   create mode 100644 include/linux/dma-reserved-iommu.h
>
> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
> index dd1dc39..a5a1530 100644
> --- a/drivers/iommu/Kconfig
> +++ b/drivers/iommu/Kconfig
> @@ -74,6 +74,12 @@ config IOMMU_DMA
>   	select IOMMU_IOVA
>   	select NEED_SG_DMA_LENGTH
>
> +# IOMMU reserved IOVA mapping (MSI doorbell)

The only reference to MSIs I can see, but I can't make head nor tail of 
it. Sorry, I don't understand this patch at all :(

Robin.

> +config IOMMU_DMA_RESERVED
> +	bool
> +	select IOMMU_API
> +	select IOMMU_IOVA
> +
>   config FSL_PAMU
>   	bool "Freescale IOMMU support"
>   	depends on PPC32
> @@ -307,6 +313,7 @@ config SPAPR_TCE_IOMMU
>   config ARM_SMMU
>   	bool "ARM Ltd. System MMU (SMMU) Support"
>   	depends on (ARM64 || ARM) && MMU
> +	select IOMMU_DMA_RESERVED
>   	select IOMMU_API
>   	select IOMMU_IO_PGTABLE_LPAE
>   	select ARM_DMA_USE_IOMMU if ARM
> @@ -320,6 +327,7 @@ config ARM_SMMU
>   config ARM_SMMU_V3
>   	bool "ARM Ltd. System MMU Version 3 (SMMUv3) Support"
>   	depends on ARM64 && PCI
> +	select IOMMU_DMA_RESERVED
>   	select IOMMU_API
>   	select IOMMU_IO_PGTABLE_LPAE
>   	select GENERIC_MSI_IRQ_DOMAIN
> diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
> index c6edb31..6c9ae99 100644
> --- a/drivers/iommu/Makefile
> +++ b/drivers/iommu/Makefile
> @@ -2,6 +2,7 @@ obj-$(CONFIG_IOMMU_API) += iommu.o
>   obj-$(CONFIG_IOMMU_API) += iommu-traces.o
>   obj-$(CONFIG_IOMMU_API) += iommu-sysfs.o
>   obj-$(CONFIG_IOMMU_DMA) += dma-iommu.o
> +obj-$(CONFIG_IOMMU_DMA_RESERVED) += dma-reserved-iommu.o
>   obj-$(CONFIG_IOMMU_IO_PGTABLE) += io-pgtable.o
>   obj-$(CONFIG_IOMMU_IO_PGTABLE_ARMV7S) += io-pgtable-arm-v7s.o
>   obj-$(CONFIG_IOMMU_IO_PGTABLE_LPAE) += io-pgtable-arm.o
> diff --git a/drivers/iommu/dma-reserved-iommu.c b/drivers/iommu/dma-reserved-iommu.c
> new file mode 100644
> index 0000000..2562af0
> --- /dev/null
> +++ b/drivers/iommu/dma-reserved-iommu.c
> @@ -0,0 +1,99 @@
> +/*
> + * Reserved IOVA Management
> + *
> + * Copyright (c) 2015 Linaro Ltd.
> + *              www.linaro.org
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + */
> +
> +#include <linux/iommu.h>
> +#include <linux/iova.h>
> +
> +struct reserved_iova_domain {
> +	struct iova_domain *iovad;
> +	int prot; /* iommu protection attributes to be obeyed */
> +};
> +
> +int iommu_alloc_reserved_iova_domain(struct iommu_domain *domain,
> +				     dma_addr_t iova, size_t size, int prot,
> +				     unsigned long order)
> +{
> +	unsigned long granule, mask, flags;
> +	struct reserved_iova_domain *rid;
> +	int ret = 0;
> +
> +	granule = 1UL << order;
> +	mask = granule - 1;
> +	if (iova & mask || (!size) || (size & mask))
> +		return -EINVAL;
> +
> +	rid = kzalloc(sizeof(struct reserved_iova_domain), GFP_KERNEL);
> +	if (!rid)
> +		return -ENOMEM;
> +
> +	rid->iovad = kzalloc(sizeof(struct iova_domain), GFP_KERNEL);
> +	if (!rid->iovad) {
> +		kfree(rid);
> +		return -ENOMEM;
> +	}
> +
> +	iova_cache_get();
> +
> +	init_iova_domain(rid->iovad, granule,
> +			 iova >> order, (iova + size - 1) >> order);
> +
> +	spin_lock_irqsave(&domain->reserved_lock, flags);
> +
> +	if (domain->reserved_iova_cookie) {
> +		ret = -EEXIST;
> +		goto unlock;
> +	}
> +
> +	domain->reserved_iova_cookie = rid;
> +
> +unlock:
> +	spin_unlock_irqrestore(&domain->reserved_lock, flags);
> +	if (ret) {
> +		put_iova_domain(rid->iovad);
> +		kfree(rid->iovad);
> +		kfree(rid);
> +		iova_cache_put();
> +	}
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(iommu_alloc_reserved_iova_domain);
> +
> +void iommu_free_reserved_iova_domain(struct iommu_domain *domain)
> +{
> +	struct reserved_iova_domain *rid;
> +	unsigned long flags;
> +	int ret = 0;
> +
> +	spin_lock_irqsave(&domain->reserved_lock, flags);
> +
> +	rid = (struct reserved_iova_domain *)domain->reserved_iova_cookie;
> +	if (!rid) {
> +		ret = -EINVAL;
> +		goto unlock;
> +	}
> +
> +	domain->reserved_iova_cookie = NULL;
> +unlock:
> +	spin_unlock_irqrestore(&domain->reserved_lock, flags);
> +	if (!ret) {
> +		put_iova_domain(rid->iovad);
> +		kfree(rid->iovad);
> +		kfree(rid);
> +		iova_cache_put();
> +	}
> +}
> +EXPORT_SYMBOL_GPL(iommu_free_reserved_iova_domain);
> diff --git a/include/linux/dma-reserved-iommu.h b/include/linux/dma-reserved-iommu.h
> new file mode 100644
> index 0000000..01ec385
> --- /dev/null
> +++ b/include/linux/dma-reserved-iommu.h
> @@ -0,0 +1,59 @@
> +/*
> + * Copyright (c) 2015 Linaro Ltd.
> + *              www.linaro.org
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + */
> +#ifndef __DMA_RESERVED_IOMMU_H
> +#define __DMA_RESERVED_IOMMU_H
> +
> +#include <linux/types.h>
> +#include <linux/kernel.h>
> +
> +struct iommu_domain;
> +
> +#ifdef CONFIG_IOMMU_DMA_RESERVED
> +
> +/**
> + * iommu_alloc_reserved_iova_domain: allocate the reserved iova domain
> + *
> + * @domain: iommu domain handle
> + * @iova: base iova address
> + * @size: iova window size
> + * @prot: protection attribute flags
> + * @order: page order
> + */
> +int iommu_alloc_reserved_iova_domain(struct iommu_domain *domain,
> +				     dma_addr_t iova, size_t size, int prot,
> +				     unsigned long order);
> +
> +/**
> + * iommu_free_reserved_iova_domain: free the reserved iova domain
> + *
> + * @domain: iommu domain handle
> + */
> +void iommu_free_reserved_iova_domain(struct iommu_domain *domain);
> +
> +#else
> +
> +static inline int
> +iommu_alloc_reserved_iova_domain(struct iommu_domain *domain,
> +				 dma_addr_t iova, size_t size, int prot,
> +				 unsigned long order)
> +{
> +	return -ENOENT;
> +}
> +
> +static inline void
> +iommu_free_reserved_iova_domain(struct iommu_domain *domain) {}
> +
> +#endif	/* CONFIG_IOMMU_DMA_RESERVED */
> +#endif	/* __DMA_RESERVED_IOMMU_H */
>

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v7 04/10] iommu/dma-reserved-iommu: alloc/free_reserved_iova_domain
@ 2016-04-20 13:11       ` Eric Auger
  0 siblings, 0 replies; 127+ messages in thread
From: Eric Auger @ 2016-04-20 13:11 UTC (permalink / raw)
  To: Robin Murphy, eric.auger, alex.williamson, will.deacon, joro,
	tglx, jason, marc.zyngier, christoffer.dall, linux-arm-kernel
  Cc: patches, linux-kernel, Bharat.Bhushan, pranav.sawargaonkar,
	p.fedin, iommu, Jean-Philippe.Brucker, julien.grall

On 04/20/2016 03:03 PM, Robin Murphy wrote:
> On 19/04/16 17:56, Eric Auger wrote:
>> Introduce alloc/free_reserved_iova_domain in the IOMMU API.
>> alloc_reserved_iova_domain initializes an iova domain at a given
>> iova base address and with a given size. This iova domain will
>> be used to allocate IOVAs within that window. Those IOVAs will be reserved
>> for special purposes, typically MSI frame binding. Allocation functions
>> within the reserved iova domain will be introduced in subsequent patches.
> 
> Am I the only one thinking "Yo dawg, I heard you like IOVA allocators,
> so we put an IOVA allocator in your IOVA allocator so you can allocate
> IOVAs while you allocate IOVAs."

I will try to rephrase ;-)
> 
>> Those functions are fully implemented if CONFIG_IOMMU_DMA_RESERVED
>> is set.
> 
> DMA? I thought we were going to be dealing with MSI descriptors?

Initially this was aimed at mapping MSI frames, but some people
requested that it become a generic API for allocating so-called
"reserved IOVAs", usable to map special physical pages such as MSI
frames.
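
For what it's worth, the intended usage of the posted API is roughly
the following (illustrative values only):

	int ret;

	/* Reserve a 1MB IOVA window at 0x8000000, with a 4kB granule
	 * and write permission, for later doorbell bindings through
	 * iommu_get_reserved_iova(); tear it down when done. */
	ret = iommu_alloc_reserved_iova_domain(domain, 0x8000000, SZ_1M,
					       IOMMU_WRITE, PAGE_SHIFT);
	if (ret)
		return ret;

	/* ... map/unmap MSI doorbell PAs while the domain is in use ... */

	iommu_free_reserved_iova_domain(domain);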

Hope it clarifies a little bit :-(

Eric
> 
>> Signed-off-by: Eric Auger <eric.auger@linaro.org>
>>
>> ---
>> v7:
>> - fix locking
>> - add iova_cache_get/put
>> - static inline functions when CONFIG_IOMMU_DMA_RESERVED is not set
>> - introduce struct reserved_iova_domain to encapsulate prot info &
>>    add prot parameter in alloc_reserved_iova_domain
>>
>> v5 -> v6:
>> - use spin lock instead of mutex
>>
>> v3 -> v4:
>> - formerly in "iommu/arm-smmu: implement
>> alloc/free_reserved_iova_domain" &
>>    "iommu: add alloc/free_reserved_iova_domain"
>>
>> v2 -> v3:
>> - remove iommu_alloc_reserved_iova_domain &
>> iommu_free_reserved_iova_domain
>>    static implementation in case CONFIG_IOMMU_API is not set
>>
>> v1 -> v2:
>> - moved from vfio API to IOMMU API
>> ---
>>   drivers/iommu/Kconfig              |  8 +++
>>   drivers/iommu/Makefile             |  1 +
>>   drivers/iommu/dma-reserved-iommu.c | 99
>> ++++++++++++++++++++++++++++++++++++++
>>   include/linux/dma-reserved-iommu.h | 59 +++++++++++++++++++++++
>>   4 files changed, 167 insertions(+)
>>   create mode 100644 drivers/iommu/dma-reserved-iommu.c
>>   create mode 100644 include/linux/dma-reserved-iommu.h
>>
>> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
>> index dd1dc39..a5a1530 100644
>> --- a/drivers/iommu/Kconfig
>> +++ b/drivers/iommu/Kconfig
>> @@ -74,6 +74,12 @@ config IOMMU_DMA
>>       select IOMMU_IOVA
>>       select NEED_SG_DMA_LENGTH
>>
>> +# IOMMU reserved IOVA mapping (MSI doorbell)
> 
> The only reference to MSIs I can see, but I can't make head nor tail of
> it. Sorry, I don't understand this patch at all :(
> 
> Robin.
> 
>> +config IOMMU_DMA_RESERVED
>> +    bool
>> +    select IOMMU_API
>> +    select IOMMU_IOVA
>> +
>>   config FSL_PAMU
>>       bool "Freescale IOMMU support"
>>       depends on PPC32
>> @@ -307,6 +313,7 @@ config SPAPR_TCE_IOMMU
>>   config ARM_SMMU
>>       bool "ARM Ltd. System MMU (SMMU) Support"
>>       depends on (ARM64 || ARM) && MMU
>> +    select IOMMU_DMA_RESERVED
>>       select IOMMU_API
>>       select IOMMU_IO_PGTABLE_LPAE
>>       select ARM_DMA_USE_IOMMU if ARM
>> @@ -320,6 +327,7 @@ config ARM_SMMU
>>   config ARM_SMMU_V3
>>       bool "ARM Ltd. System MMU Version 3 (SMMUv3) Support"
>>       depends on ARM64 && PCI
>> +    select IOMMU_DMA_RESERVED
>>       select IOMMU_API
>>       select IOMMU_IO_PGTABLE_LPAE
>>       select GENERIC_MSI_IRQ_DOMAIN
>> diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
>> index c6edb31..6c9ae99 100644
>> --- a/drivers/iommu/Makefile
>> +++ b/drivers/iommu/Makefile
>> @@ -2,6 +2,7 @@ obj-$(CONFIG_IOMMU_API) += iommu.o
>>   obj-$(CONFIG_IOMMU_API) += iommu-traces.o
>>   obj-$(CONFIG_IOMMU_API) += iommu-sysfs.o
>>   obj-$(CONFIG_IOMMU_DMA) += dma-iommu.o
>> +obj-$(CONFIG_IOMMU_DMA_RESERVED) += dma-reserved-iommu.o
>>   obj-$(CONFIG_IOMMU_IO_PGTABLE) += io-pgtable.o
>>   obj-$(CONFIG_IOMMU_IO_PGTABLE_ARMV7S) += io-pgtable-arm-v7s.o
>>   obj-$(CONFIG_IOMMU_IO_PGTABLE_LPAE) += io-pgtable-arm.o
>> diff --git a/drivers/iommu/dma-reserved-iommu.c
>> b/drivers/iommu/dma-reserved-iommu.c
>> new file mode 100644
>> index 0000000..2562af0
>> --- /dev/null
>> +++ b/drivers/iommu/dma-reserved-iommu.c
>> @@ -0,0 +1,99 @@
>> +/*
>> + * Reserved IOVA Management
>> + *
>> + * Copyright (c) 2015 Linaro Ltd.
>> + *              www.linaro.org
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + *
>> + */
>> +
>> +#include <linux/iommu.h>
>> +#include <linux/iova.h>
>> +
>> +struct reserved_iova_domain {
>> +    struct iova_domain *iovad;
>> +    int prot; /* iommu protection attributes to be obeyed */
>> +};
>> +
>> +int iommu_alloc_reserved_iova_domain(struct iommu_domain *domain,
>> +                     dma_addr_t iova, size_t size, int prot,
>> +                     unsigned long order)
>> +{
>> +    unsigned long granule, mask, flags;
>> +    struct reserved_iova_domain *rid;
>> +    int ret = 0;
>> +
>> +    granule = 1UL << order;
>> +    mask = granule - 1;
>> +    if (iova & mask || (!size) || (size & mask))
>> +        return -EINVAL;
>> +
>> +    rid = kzalloc(sizeof(struct reserved_iova_domain), GFP_KERNEL);
>> +    if (!rid)
>> +        return -ENOMEM;
>> +
>> +    rid->iovad = kzalloc(sizeof(struct iova_domain), GFP_KERNEL);
>> +    if (!rid->iovad) {
>> +        kfree(rid);
>> +        return -ENOMEM;
>> +    }
>> +
>> +    iova_cache_get();
>> +
>> +    init_iova_domain(rid->iovad, granule,
>> +             iova >> order, (iova + size - 1) >> order);
>> +
>> +    spin_lock_irqsave(&domain->reserved_lock, flags);
>> +
>> +    if (domain->reserved_iova_cookie) {
>> +        ret = -EEXIST;
>> +        goto unlock;
>> +    }
>> +
>> +    domain->reserved_iova_cookie = rid;
>> +
>> +unlock:
>> +    spin_unlock_irqrestore(&domain->reserved_lock, flags);
>> +    if (ret) {
>> +        put_iova_domain(rid->iovad);
>> +        kfree(rid->iovad);
>> +        kfree(rid);
>> +        iova_cache_put();
>> +    }
>> +    return ret;
>> +}
>> +EXPORT_SYMBOL_GPL(iommu_alloc_reserved_iova_domain);
>> +
>> +void iommu_free_reserved_iova_domain(struct iommu_domain *domain)
>> +{
>> +    struct reserved_iova_domain *rid;
>> +    unsigned long flags;
>> +    int ret = 0;
>> +
>> +    spin_lock_irqsave(&domain->reserved_lock, flags);
>> +
>> +    rid = (struct reserved_iova_domain *)domain->reserved_iova_cookie;
>> +    if (!rid) {
>> +        ret = -EINVAL;
>> +        goto unlock;
>> +    }
>> +
>> +    domain->reserved_iova_cookie = NULL;
>> +unlock:
>> +    spin_unlock_irqrestore(&domain->reserved_lock, flags);
>> +    if (!ret) {
>> +        put_iova_domain(rid->iovad);
>> +        kfree(rid->iovad);
>> +        kfree(rid);
>> +        iova_cache_put();
>> +    }
>> +}
>> +EXPORT_SYMBOL_GPL(iommu_free_reserved_iova_domain);
>> diff --git a/include/linux/dma-reserved-iommu.h
>> b/include/linux/dma-reserved-iommu.h
>> new file mode 100644
>> index 0000000..01ec385
>> --- /dev/null
>> +++ b/include/linux/dma-reserved-iommu.h
>> @@ -0,0 +1,59 @@
>> +/*
>> + * Copyright (c) 2015 Linaro Ltd.
>> + *              www.linaro.org
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + *
>> + */
>> +#ifndef __DMA_RESERVED_IOMMU_H
>> +#define __DMA_RESERVED_IOMMU_H
>> +
>> +#include <linux/types.h>
>> +#include <linux/kernel.h>
>> +
>> +struct iommu_domain;
>> +
>> +#ifdef CONFIG_IOMMU_DMA_RESERVED
>> +
>> +/**
>> + * iommu_alloc_reserved_iova_domain: allocate the reserved iova domain
>> + *
>> + * @domain: iommu domain handle
>> + * @iova: base iova address
>> + * @size: iova window size
>> + * @prot: protection attribute flags
>> + * @order: page order
>> + */
>> +int iommu_alloc_reserved_iova_domain(struct iommu_domain *domain,
>> +                     dma_addr_t iova, size_t size, int prot,
>> +                     unsigned long order);
>> +
>> +/**
>> + * iommu_free_reserved_iova_domain: free the reserved iova domain
>> + *
>> + * @domain: iommu domain handle
>> + */
>> +void iommu_free_reserved_iova_domain(struct iommu_domain *domain);
>> +
>> +#else
>> +
>> +static inline int
>> +iommu_alloc_reserved_iova_domain(struct iommu_domain *domain,
>> +                 dma_addr_t iova, size_t size, int prot,
>> +                 unsigned long order)
>> +{
>> +    return -ENOENT;
>> +}
>> +
>> +static inline void
>> +iommu_free_reserved_iova_domain(struct iommu_domain *domain) {}
>> +
>> +#endif    /* CONFIG_IOMMU_DMA_RESERVED */
>> +#endif    /* __DMA_RESERVED_IOMMU_H */
>>
> 

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v7 04/10] iommu/dma-reserved-iommu: alloc/free_reserved_iova_domain
@ 2016-04-20 13:11       ` Eric Auger
  0 siblings, 0 replies; 127+ messages in thread
From: Eric Auger @ 2016-04-20 13:11 UTC (permalink / raw)
  To: Robin Murphy, eric.auger-qxv4g6HH51o,
	alex.williamson-H+wXaHxf7aLQT0dZR+AlfA, will.deacon-5wv7dgnIgG8,
	joro-zLv9SwRftAIdnm+yROfE0A, tglx-hfZtesqFncYOwBW4kG4KsQ,
	jason-NLaQJdtUoK4Be96aLqz0jA, marc.zyngier-5wv7dgnIgG8,
	christoffer.dall-QSEj5FYQhm4dnm+yROfE0A,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r
  Cc: julien.grall-5wv7dgnIgG8, patches-QSEj5FYQhm4dnm+yROfE0A,
	p.fedin-Sze3O3UU22JBDgjK7y7TUQ,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	pranav.sawargaonkar-Re5JQEeQqe8AvxtiuMwx3w

On 04/20/2016 03:03 PM, Robin Murphy wrote:
> On 19/04/16 17:56, Eric Auger wrote:
>> Introduce alloc/free_reserved_iova_domain in the IOMMU API.
>> alloc_reserved_iova_domain initializes an iova domain at a given
>> iova base address and with a given size. This iova domain will
>> be used to allocate iova within that window. Those IOVAs will be reserved
>> for special purpose, typically MSI frame binding. Allocation function
>> within the reserved iova domain will be introduced in subsequent patches.
> 
> Am I the only one thinking "Yo dawg, I heard you like IOVA allocators,
> so we put an IOVA allocator in your IOVA allocator so you can allocate
> IOVAs while you allocate IOVAs."

I will try to rephrase ;-)
> 
>> Those functions are fully implemented if CONFIG_IOMMU_DMA_RESERVED
>> is set.
> 
> DMA? I thought we were going to be dealing with MSI descriptors?

Initially this was aiming at mapping MSI frames but some people
requested it becomes a generic API to allocate so called "reserved IOVA"
usable to map special physical pages, such as MSI frames.

Hope it clarifies a little bit :-(

Eric
> 
>> Signed-off-by: Eric Auger <eric.auger-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
>>
>> ---
>> v7:
>> - fix locking
>> - add iova_cache_get/put
>> - static inline functions when CONFIG_IOMMU_DMA_RESERVED is not set
>> - introduce struct reserved_iova_domain to encapsulate prot info &
>>    add prot parameter in alloc_reserved_iova_domain
>>
>> v5 -> v6:
>> - use spin lock instead of mutex
>>
>> v3 -> v4:
>> - formerly in "iommu/arm-smmu: implement
>> alloc/free_reserved_iova_domain" &
>>    "iommu: add alloc/free_reserved_iova_domain"
>>
>> v2 -> v3:
>> - remove iommu_alloc_reserved_iova_domain &
>> iommu_free_reserved_iova_domain
>>    static implementation in case CONFIG_IOMMU_API is not set
>>
>> v1 -> v2:
>> - moved from vfio API to IOMMU API
>> ---
>>   drivers/iommu/Kconfig              |  8 +++
>>   drivers/iommu/Makefile             |  1 +
>>   drivers/iommu/dma-reserved-iommu.c | 99
>> ++++++++++++++++++++++++++++++++++++++
>>   include/linux/dma-reserved-iommu.h | 59 +++++++++++++++++++++++
>>   4 files changed, 167 insertions(+)
>>   create mode 100644 drivers/iommu/dma-reserved-iommu.c
>>   create mode 100644 include/linux/dma-reserved-iommu.h
>>
>> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
>> index dd1dc39..a5a1530 100644
>> --- a/drivers/iommu/Kconfig
>> +++ b/drivers/iommu/Kconfig
>> @@ -74,6 +74,12 @@ config IOMMU_DMA
>>       select IOMMU_IOVA
>>       select NEED_SG_DMA_LENGTH
>>
>> +# IOMMU reserved IOVA mapping (MSI doorbell)
> 
> The only reference to MSIs I can see, but I can't make head nor tail of
> it. Sorry, I don't understand this patch at all :(
> 
> Robin.
> 
>> +config IOMMU_DMA_RESERVED
>> +    bool
>> +    select IOMMU_API
>> +    select IOMMU_IOVA
>> +
>>   config FSL_PAMU
>>       bool "Freescale IOMMU support"
>>       depends on PPC32
>> @@ -307,6 +313,7 @@ config SPAPR_TCE_IOMMU
>>   config ARM_SMMU
>>       bool "ARM Ltd. System MMU (SMMU) Support"
>>       depends on (ARM64 || ARM) && MMU
>> +    select IOMMU_DMA_RESERVED
>>       select IOMMU_API
>>       select IOMMU_IO_PGTABLE_LPAE
>>       select ARM_DMA_USE_IOMMU if ARM
>> @@ -320,6 +327,7 @@ config ARM_SMMU
>>   config ARM_SMMU_V3
>>       bool "ARM Ltd. System MMU Version 3 (SMMUv3) Support"
>>       depends on ARM64 && PCI
>> +    select IOMMU_DMA_RESERVED
>>       select IOMMU_API
>>       select IOMMU_IO_PGTABLE_LPAE
>>       select GENERIC_MSI_IRQ_DOMAIN
>> diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
>> index c6edb31..6c9ae99 100644
>> --- a/drivers/iommu/Makefile
>> +++ b/drivers/iommu/Makefile
>> @@ -2,6 +2,7 @@ obj-$(CONFIG_IOMMU_API) += iommu.o
>>   obj-$(CONFIG_IOMMU_API) += iommu-traces.o
>>   obj-$(CONFIG_IOMMU_API) += iommu-sysfs.o
>>   obj-$(CONFIG_IOMMU_DMA) += dma-iommu.o
>> +obj-$(CONFIG_IOMMU_DMA_RESERVED) += dma-reserved-iommu.o
>>   obj-$(CONFIG_IOMMU_IO_PGTABLE) += io-pgtable.o
>>   obj-$(CONFIG_IOMMU_IO_PGTABLE_ARMV7S) += io-pgtable-arm-v7s.o
>>   obj-$(CONFIG_IOMMU_IO_PGTABLE_LPAE) += io-pgtable-arm.o
>> diff --git a/drivers/iommu/dma-reserved-iommu.c b/drivers/iommu/dma-reserved-iommu.c
>> new file mode 100644
>> index 0000000..2562af0
>> --- /dev/null
>> +++ b/drivers/iommu/dma-reserved-iommu.c
>> @@ -0,0 +1,99 @@
>> +/*
>> + * Reserved IOVA Management
>> + *
>> + * Copyright (c) 2015 Linaro Ltd.
>> + *              www.linaro.org
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + *
>> + */
>> +
>> +#include <linux/iommu.h>
>> +#include <linux/iova.h>
>> +
>> +struct reserved_iova_domain {
>> +    struct iova_domain *iovad;
>> +    int prot; /* iommu protection attributes to be obeyed */
>> +};
>> +
>> +int iommu_alloc_reserved_iova_domain(struct iommu_domain *domain,
>> +                     dma_addr_t iova, size_t size, int prot,
>> +                     unsigned long order)
>> +{
>> +    unsigned long granule, mask, flags;
>> +    struct reserved_iova_domain *rid;
>> +    int ret = 0;
>> +
>> +    granule = 1UL << order;
>> +    mask = granule - 1;
>> +    if (iova & mask || (!size) || (size & mask))
>> +        return -EINVAL;
>> +
>> +    rid = kzalloc(sizeof(struct reserved_iova_domain), GFP_KERNEL);
>> +    if (!rid)
>> +        return -ENOMEM;
>> +
>> +    rid->iovad = kzalloc(sizeof(struct iova_domain), GFP_KERNEL);
>> +    if (!rid->iovad) {
>> +        kfree(rid);
>> +        return -ENOMEM;
>> +    }
>> +
>> +    iova_cache_get();
>> +
>> +    init_iova_domain(rid->iovad, granule,
>> +             iova >> order, (iova + size - 1) >> order);
>> +
>> +    spin_lock_irqsave(&domain->reserved_lock, flags);
>> +
>> +    if (domain->reserved_iova_cookie) {
>> +        ret = -EEXIST;
>> +        goto unlock;
>> +    }
>> +
>> +    domain->reserved_iova_cookie = rid;
>> +
>> +unlock:
>> +    spin_unlock_irqrestore(&domain->reserved_lock, flags);
>> +    if (ret) {
>> +        put_iova_domain(rid->iovad);
>> +        kfree(rid->iovad);
>> +        kfree(rid);
>> +        iova_cache_put();
>> +    }
>> +    return ret;
>> +}
>> +EXPORT_SYMBOL_GPL(iommu_alloc_reserved_iova_domain);
>> +
>> +void iommu_free_reserved_iova_domain(struct iommu_domain *domain)
>> +{
>> +    struct reserved_iova_domain *rid;
>> +    unsigned long flags;
>> +    int ret = 0;
>> +
>> +    spin_lock_irqsave(&domain->reserved_lock, flags);
>> +
>> +    rid = (struct reserved_iova_domain *)domain->reserved_iova_cookie;
>> +    if (!rid) {
>> +        ret = -EINVAL;
>> +        goto unlock;
>> +    }
>> +
>> +    domain->reserved_iova_cookie = NULL;
>> +unlock:
>> +    spin_unlock_irqrestore(&domain->reserved_lock, flags);
>> +    if (!ret) {
>> +        put_iova_domain(rid->iovad);
>> +        kfree(rid->iovad);
>> +        kfree(rid);
>> +        iova_cache_put();
>> +    }
>> +}
>> +EXPORT_SYMBOL_GPL(iommu_free_reserved_iova_domain);
>> diff --git a/include/linux/dma-reserved-iommu.h b/include/linux/dma-reserved-iommu.h
>> new file mode 100644
>> index 0000000..01ec385
>> --- /dev/null
>> +++ b/include/linux/dma-reserved-iommu.h
>> @@ -0,0 +1,59 @@
>> +/*
>> + * Copyright (c) 2015 Linaro Ltd.
>> + *              www.linaro.org
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + *
>> + */
>> +#ifndef __DMA_RESERVED_IOMMU_H
>> +#define __DMA_RESERVED_IOMMU_H
>> +
>> +#include <linux/types.h>
>> +#include <linux/kernel.h>
>> +
>> +struct iommu_domain;
>> +
>> +#ifdef CONFIG_IOMMU_DMA_RESERVED
>> +
>> +/**
>> + * iommu_alloc_reserved_iova_domain: allocate the reserved iova domain
>> + *
>> + * @domain: iommu domain handle
>> + * @iova: base iova address
>> + * @size: iova window size
>> + * @prot: protection attribute flags
>> + * @order: page order
>> + */
>> +int iommu_alloc_reserved_iova_domain(struct iommu_domain *domain,
>> +                     dma_addr_t iova, size_t size, int prot,
>> +                     unsigned long order);
>> +
>> +/**
>> + * iommu_free_reserved_iova_domain: free the reserved iova domain
>> + *
>> + * @domain: iommu domain handle
>> + */
>> +void iommu_free_reserved_iova_domain(struct iommu_domain *domain);
>> +
>> +#else
>> +
>> +static inline int
>> +iommu_alloc_reserved_iova_domain(struct iommu_domain *domain,
>> +                 dma_addr_t iova, size_t size, int prot,
>> +                 unsigned long order)
>> +{
>> +    return -ENOENT;
>> +}
>> +
>> +static inline void
>> +iommu_free_reserved_iova_domain(struct iommu_domain *domain) {}
>> +
>> +#endif    /* CONFIG_IOMMU_DMA_RESERVED */
>> +#endif    /* __DMA_RESERVED_IOMMU_H */
>>
> 

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v7 05/10] iommu/dma-reserved-iommu: reserved binding rb-tree and helpers
@ 2016-04-20 13:12     ` Robin Murphy
  0 siblings, 0 replies; 127+ messages in thread
From: Robin Murphy @ 2016-04-20 13:12 UTC (permalink / raw)
  To: Eric Auger, eric.auger, alex.williamson, will.deacon, joro, tglx,
	jason, marc.zyngier, christoffer.dall, linux-arm-kernel
  Cc: patches, linux-kernel, Bharat.Bhushan, pranav.sawargaonkar,
	p.fedin, iommu, Jean-Philippe.Brucker, julien.grall

On 19/04/16 17:56, Eric Auger wrote:
> We will need to track which host physical addresses are mapped to
> reserved IOVAs. To that end we introduce a new RB tree indexed
> by physical address. This RB tree is only used for reserved IOVA
> bindings.
>
> It is expected that this RB tree will contain very few bindings.

Sounds like a good reason in favour of using a list, and thus having 
rather less code here ;)
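
Along those lines, a hypothetical list-based lookup could be as small as
the sketch below (illustrative only; it assumes a struct list_head in the
domain and a "list" member in each binding, neither of which is in the
posted series):

/* @d->reserved_lock must be held, as for the rb-tree variant */
static struct iommu_reserved_binding *
find_reserved_binding(struct iommu_domain *d, phys_addr_t start, size_t size)
{
	struct iommu_reserved_binding *b;

	list_for_each_entry(b, &d->reserved_binding_list, list)
		if (start < b->addr + b->size && b->addr < start + size)
			return b;	/* ranges overlap */
	return NULL;
}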

>  Those
> generally correspond to single page mapping one MSI frame (GICv2m
> frame or ITS GITS_TRANSLATER frame).
>
> Signed-off-by: Eric Auger <eric.auger@linaro.org>
>
> ---
> v5 -> v6:
> - add comment about @d->reserved_lock to be held
>
> v3 -> v4:
> - that code was formerly in "iommu/arm-smmu: add a reserved binding RB tree"
> ---
>   drivers/iommu/dma-reserved-iommu.c | 63 ++++++++++++++++++++++++++++++++++++++
>   1 file changed, 63 insertions(+)
>
> diff --git a/drivers/iommu/dma-reserved-iommu.c b/drivers/iommu/dma-reserved-iommu.c
> index 2562af0..f6fa18e 100644
> --- a/drivers/iommu/dma-reserved-iommu.c
> +++ b/drivers/iommu/dma-reserved-iommu.c
> @@ -23,6 +23,69 @@ struct reserved_iova_domain {
>   	int prot; /* iommu protection attributes to be obeyed */
>   };
>
> +struct iommu_reserved_binding {
> +	struct kref		kref;
> +	struct rb_node		node;
> +	struct iommu_domain	*domain;

Hang on, the tree these are in is already embedded in a domain. Ergo we 
can't look them up without first knowing the domain they belong to, so 
what purpose does this guy serve?

Robin.

> +	phys_addr_t		addr;
> +	dma_addr_t		iova;
> +	size_t			size;
> +};
> +
> +/* Reserved binding RB-tree manipulation */
> +
> +/* @d->reserved_lock must be held */
> +static struct iommu_reserved_binding *find_reserved_binding(
> +				    struct iommu_domain *d,
> +				    phys_addr_t start, size_t size)
> +{
> +	struct rb_node *node = d->reserved_binding_list.rb_node;
> +
> +	while (node) {
> +		struct iommu_reserved_binding *binding =
> +			rb_entry(node, struct iommu_reserved_binding, node);
> +
> +		if (start + size <= binding->addr)
> +			node = node->rb_left;
> +		else if (start >= binding->addr + binding->size)
> +			node = node->rb_right;
> +		else
> +			return binding;
> +	}
> +
> +	return NULL;
> +}
> +
> +/* @d->reserved_lock must be held */
> +static void link_reserved_binding(struct iommu_domain *d,
> +				  struct iommu_reserved_binding *new)
> +{
> +	struct rb_node **link = &d->reserved_binding_list.rb_node;
> +	struct rb_node *parent = NULL;
> +	struct iommu_reserved_binding *binding;
> +
> +	while (*link) {
> +		parent = *link;
> +		binding = rb_entry(parent, struct iommu_reserved_binding,
> +				   node);
> +
> +		if (new->addr + new->size <= binding->addr)
> +			link = &(*link)->rb_left;
> +		else
> +			link = &(*link)->rb_right;
> +	}
> +
> +	rb_link_node(&new->node, parent, link);
> +	rb_insert_color(&new->node, &d->reserved_binding_list);
> +}
> +
> +/* @d->reserved_lock must be held */
> +static void unlink_reserved_binding(struct iommu_domain *d,
> +				    struct iommu_reserved_binding *old)
> +{
> +	rb_erase(&old->node, &d->reserved_binding_list);
> +}
> +
>   int iommu_alloc_reserved_iova_domain(struct iommu_domain *domain,
>   				     dma_addr_t iova, size_t size, int prot,
>   				     unsigned long order)
>
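
For context, a hypothetical sketch of how the helpers above combine at
binding time, inferred from the cover letter's description of
reference-counted bindings (the function name is made up; this is not
code quoted from the series):

/* caller holds d->reserved_lock */
static struct iommu_reserved_binding *
get_or_link_binding(struct iommu_domain *d, struct iommu_reserved_binding *new)
{
	struct iommu_reserved_binding *b;

	b = find_reserved_binding(d, new->addr, new->size);
	if (b) {
		kref_get(&b->kref);	/* doorbell already mapped: share it */
		return b;
	}
	kref_init(&new->kref);		/* first user of this doorbell */
	link_reserved_binding(d, new);
	return new;
}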

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v7 01/10] iommu: Add DOMAIN_ATTR_MSI_MAPPING attribute
@ 2016-04-20 15:58       ` Eric Auger
  0 siblings, 0 replies; 127+ messages in thread
From: Eric Auger @ 2016-04-20 15:58 UTC (permalink / raw)
  To: Robin Murphy, eric.auger, alex.williamson, will.deacon, joro,
	tglx, jason, marc.zyngier, christoffer.dall, linux-arm-kernel
  Cc: patches, linux-kernel, Bharat.Bhushan, pranav.sawargaonkar,
	p.fedin, iommu, Jean-Philippe.Brucker, julien.grall

Hi Robin,
On 04/20/2016 02:47 PM, Robin Murphy wrote:
> Hi Eric,
> 
> On 19/04/16 17:56, Eric Auger wrote:
>> Introduce a new DOMAIN_ATTR_MSI_MAPPING domain attribute. If supported,
>> this means the MSI addresses need to be mapped in the IOMMU.
>>
>> x86 IOMMUs typically don't expose the attribute since on x86, MSI write
>> transaction addresses are always within the 1MB PA region [FEE0_0000h -
>> FEF0_0000h] window, which directly targets the APIC configuration space and
>> hence bypasses the IOMMU. On ARM and PowerPC, however, MSI transactions are
>> conveyed through the IOMMU.
> 
> What's stopping us from simply inferring this from the domain's IOMMU
> not advertising interrupt remapping capabilities?
My current understanding is that it is not possible:
on x86, CAP_INTR_REMAP is not systematically exposed (the feature can be
disabled), and MSIs are never mapped in the IOMMU, I think.
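
(For reference, probing the attribute from a caller would look roughly
like this hypothetical snippet; per the changelog quoted below,
domain_get_attr returns 0 when MSI mapping is required:)

/* hypothetical caller-side probe; the data argument is unused here */
if (iommu_domain_get_attr(domain, DOMAIN_ATTR_MSI_MAPPING, NULL) == 0) {
	/* MSI doorbells must be IOMMU-mapped for this domain */
}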

Best Regards

Eric
> 
> Robin.
> 
>> Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com>
>> Signed-off-by: Eric Auger <eric.auger@linaro.org>
>>
>> ---
>>
>> v4 -> v5:
>> - introduce the user in the next patch
>>
>> RFC v1 -> v1:
>> - the data field is not used
>> - for this attribute, domain_get_attr simply returns 0 if the MSI_MAPPING
>>    capability is needed, or <0 if not.
>> - removed struct iommu_domain_msi_maps
>> ---
>>   include/linux/iommu.h | 1 +
>>   1 file changed, 1 insertion(+)
>>
>> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
>> index 62a5eae..b3e8c5b 100644
>> --- a/include/linux/iommu.h
>> +++ b/include/linux/iommu.h
>> @@ -113,6 +113,7 @@ enum iommu_attr {
>>       DOMAIN_ATTR_FSL_PAMU_ENABLE,
>>       DOMAIN_ATTR_FSL_PAMUV1,
>>       DOMAIN_ATTR_NESTING,    /* two stages of translation */
>> +    DOMAIN_ATTR_MSI_MAPPING, /* Require MSIs mapping in iommu */
>>       DOMAIN_ATTR_MAX,
>>   };
>>
>>
> 

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v7 03/10] iommu: introduce a reserved iova cookie
@ 2016-04-20 16:14       ` Eric Auger
  0 siblings, 0 replies; 127+ messages in thread
From: Eric Auger @ 2016-04-20 16:14 UTC (permalink / raw)
  To: Robin Murphy, eric.auger, alex.williamson, will.deacon, joro,
	tglx, jason, marc.zyngier, christoffer.dall, linux-arm-kernel
  Cc: patches, linux-kernel, Bharat.Bhushan, pranav.sawargaonkar,
	p.fedin, iommu, Jean-Philippe.Brucker, julien.grall

Hi Robin,
On 04/20/2016 02:55 PM, Robin Murphy wrote:
> On 19/04/16 17:56, Eric Auger wrote:
>> This patch introduces some new fields in the iommu_domain struct,
>> dedicated to reserved iova management.
>>
>> In a similar way as for the DMA mapping IOVA window, we need to store
>> information related to a reserved IOVA window.
>>
>> The reserved_iova_cookie will store the reserved iova_domain
>> handle. An RB tree indexed by physical address is introduced to
>> store the host physical addresses bound to reserved IOVAs.
>>
>> Those physical addresses will correspond to MSI frame base
>> addresses, also referred to as doorbells. Their number should be
>> quite limited per domain.
>>
>> Also a spin_lock is introduced to protect accesses to the iova_domain
>> and RB tree. The choice of a spin_lock is driven by the fact that the
>> RB tree will need to be accessed from MSI controller code that is not
>> allowed to sleep.
>>
>> Signed-off-by: Eric Auger <eric.auger@linaro.org>
>>
>> ---
>> v5 -> v6:
>> - initialize reserved_binding_list
>> - use a spinlock instead of a mutex
>> ---
>>   drivers/iommu/iommu.c | 2 ++
>>   include/linux/iommu.h | 6 ++++++
>>   2 files changed, 8 insertions(+)
>>
>> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
>> index b9df141..f70ef3b 100644
>> --- a/drivers/iommu/iommu.c
>> +++ b/drivers/iommu/iommu.c
>> @@ -1073,6 +1073,8 @@ static struct iommu_domain
>> *__iommu_domain_alloc(struct bus_type *bus,
>>
>>       domain->ops  = bus->iommu_ops;
>>       domain->type = type;
>> +    spin_lock_init(&domain->reserved_lock);
>> +    domain->reserved_binding_list = RB_ROOT;
>>
>>       return domain;
>>   }
>> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
>> index b3e8c5b..60999db 100644
>> --- a/include/linux/iommu.h
>> +++ b/include/linux/iommu.h
>> @@ -24,6 +24,7 @@
>>   #include <linux/of.h>
>>   #include <linux/types.h>
>>   #include <linux/scatterlist.h>
>> +#include <linux/spinlock.h>
>>   #include <trace/events/iommu.h>
>>
>>   #define IOMMU_READ    (1 << 0)
>> @@ -83,6 +84,11 @@ struct iommu_domain {
>>       void *handler_token;
>>       struct iommu_domain_geometry geometry;
>>       void *iova_cookie;
>> +    void *reserved_iova_cookie;
> 
> Why exactly do we need this? From your description, it's for the user of
> the domain to keep track of IOVA allocations in, but then that's
> precisely what the iova_cookie exists for.

I was not sure whether both APIs might be used concurrently, hence a
separate cookie. If we only consider the MSI mapping use case, I guess we
have either a DMA domain or a domain used for VFIO, and I would agree
with you, i.e. we can reuse the same cookie.
> 
>> +    /* rb tree indexed by PA, for reserved bindings only */
>> +    struct rb_root reserved_binding_list;
> 
> Nit: that's more puzzling than helpful - "reserved binding" is
> particularly vague and nondescript, and makes me think of anything but
> MSI descriptors.
My heart is torn between the advised genericity and the MSI use case. My
natural short-sighted inclination would steer me towards an MSI-mapping
dedicated API, but I am following advice. As discussed with Alex, there
are implementation details pretty much tied to MSI problematics, I think
(the fact we store the "bindings" in an rb-tree/list, the locking).

If Marc & Alex agree, I can retarget this API to be less generic.

> Plus it's called a list but isn't a list (that said,
> given that we'd typically only expect a handful of entries, and lookups
> are hardly going to be a performance-critical bottleneck, would a simple
> list not suffice?)
I fully agree on that point. An rb-tree is overkill today for the MSI use
case. Again, if we were to use this API for anything else, that might
change the decision. But sure, we can refactor afterwards as needs arise.
TBH the rb-tree is inherited from the vfio_iommu_type1 dma tree, where
that code was originally located.
> 
>> +    /* protects reserved cookie and rbtree manipulation */
>> +    spinlock_t reserved_lock;
> 
> A cookie is an opaque structure, so any locking it needs would normally
> be hidden within. If on the other hand it's not meant to be opaque at
> this level, then it should probably be something more specific than a
> void * (if at all, as above).
agreed
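
Something like the following hypothetical shape, i.e. a typed cookie that
carries its own lock (a sketch of Robin's suggestion, not what the posted
series does):

struct reserved_iova_cookie {
	struct iova_domain	*iovad;
	int			prot;		/* iommu protection attributes */
	struct rb_root		bindings;	/* or a simple list, as above */
	spinlock_t		lock;		/* protects iovad and bindings */
};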

Thanks

Eric
> 
> Robin.
> 
>>   };
>>
>>   enum iommu_cap {
>>
> 

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v7 05/10] iommu/dma-reserved-iommu: reserved binding rb-tree and helpers
@ 2016-04-20 16:18       ` Eric Auger
  0 siblings, 0 replies; 127+ messages in thread
From: Eric Auger @ 2016-04-20 16:18 UTC (permalink / raw)
  To: Robin Murphy, eric.auger, alex.williamson, will.deacon, joro,
	tglx, jason, marc.zyngier, christoffer.dall, linux-arm-kernel
  Cc: patches, linux-kernel, Bharat.Bhushan, pranav.sawargaonkar,
	p.fedin, iommu, Jean-Philippe.Brucker, julien.grall

Robin,
On 04/20/2016 03:12 PM, Robin Murphy wrote:
> On 19/04/16 17:56, Eric Auger wrote:
>> We will need to track which host physical addresses are mapped to
>> reserved IOVAs. To that end we introduce a new RB tree indexed
>> by physical address. This RB tree is only used for reserved IOVA
>> bindings.
>>
>> It is expected that this RB tree will contain very few bindings.
> 
> Sounds like a good reason in favour of using a list, and thus having
> rather less code here ;)

OK will move to a simple list.
> 
>>  Those
>> generally correspond to single page mapping one MSI frame (GICv2m
>> frame or ITS GITS_TRANSLATER frame).
>>
>> Signed-off-by: Eric Auger <eric.auger@linaro.org>
>>
>> ---
>> v5 -> v6:
>> - add comment about @d->reserved_lock to be held
>>
>> v3 -> v4:
>> - that code was formerly in "iommu/arm-smmu: add a reserved binding RB tree"
>> ---
>>   drivers/iommu/dma-reserved-iommu.c | 63 ++++++++++++++++++++++++++++++++++++++
>>   1 file changed, 63 insertions(+)
>>
>> diff --git a/drivers/iommu/dma-reserved-iommu.c b/drivers/iommu/dma-reserved-iommu.c
>> index 2562af0..f6fa18e 100644
>> --- a/drivers/iommu/dma-reserved-iommu.c
>> +++ b/drivers/iommu/dma-reserved-iommu.c
>> @@ -23,6 +23,69 @@ struct reserved_iova_domain {
>>       int prot; /* iommu protection attributes to be obeyed */
>>   };
>>
>> +struct iommu_reserved_binding {
>> +    struct kref        kref;
>> +    struct rb_node        node;
>> +    struct iommu_domain    *domain;
> 
> Hang on, the tree these are in is already embedded in a domain. Ergo we
> can't look them up without first knowing the domain they belong to, so
> what purpose does this guy serve?
This is used on the kref_put. The release function takes a kref; from it
we use container_of() to retrieve the binding, and storing the domain
here enables us to unlink the node.
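
Roughly like the following sketch of that release path (hypothetical code
matching the description above, not quoted from the series):

static void reserved_binding_release(struct kref *kref)
{
	struct iommu_reserved_binding *b =
		container_of(kref, struct iommu_reserved_binding, kref);

	/*
	 * The stored domain back-pointer is what makes the unlink
	 * possible; b->domain->reserved_lock is held by the kref_put
	 * caller, as the helpers require.
	 */
	unlink_reserved_binding(b->domain, b);
	kfree(b);
}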

Best Regards

Eric
> 
> Robin.
> 
>> +    phys_addr_t        addr;
>> +    dma_addr_t        iova;
>> +    size_t            size;
>> +};
>> +
>> +/* Reserved binding RB-tree manipulation */
>> +
>> +/* @d->reserved_lock must be held */
>> +static struct iommu_reserved_binding *find_reserved_binding(
>> +                    struct iommu_domain *d,
>> +                    phys_addr_t start, size_t size)
>> +{
>> +    struct rb_node *node = d->reserved_binding_list.rb_node;
>> +
>> +    while (node) {
>> +        struct iommu_reserved_binding *binding =
>> +            rb_entry(node, struct iommu_reserved_binding, node);
>> +
>> +        if (start + size <= binding->addr)
>> +            node = node->rb_left;
>> +        else if (start >= binding->addr + binding->size)
>> +            node = node->rb_right;
>> +        else
>> +            return binding;
>> +    }
>> +
>> +    return NULL;
>> +}
>> +
>> +/* @d->reserved_lock must be held */
>> +static void link_reserved_binding(struct iommu_domain *d,
>> +                  struct iommu_reserved_binding *new)
>> +{
>> +    struct rb_node **link = &d->reserved_binding_list.rb_node;
>> +    struct rb_node *parent = NULL;
>> +    struct iommu_reserved_binding *binding;
>> +
>> +    while (*link) {
>> +        parent = *link;
>> +        binding = rb_entry(parent, struct iommu_reserved_binding,
>> +                   node);
>> +
>> +        if (new->addr + new->size <= binding->addr)
>> +            link = &(*link)->rb_left;
>> +        else
>> +            link = &(*link)->rb_right;
>> +    }
>> +
>> +    rb_link_node(&new->node, parent, link);
>> +    rb_insert_color(&new->node, &d->reserved_binding_list);
>> +}
>> +
>> +/* @d->reserved_lock must be held */
>> +static void unlink_reserved_binding(struct iommu_domain *d,
>> +                    struct iommu_reserved_binding *old)
>> +{
>> +    rb_erase(&old->node, &d->reserved_binding_list);
>> +}
>> +
>>   int iommu_alloc_reserved_iova_domain(struct iommu_domain *domain,
>>                        dma_addr_t iova, size_t size, int prot,
>>                        unsigned long order)
>>
> 

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v7 06/10] iommu/dma-reserved-iommu: iommu_get/put_reserved_iova
@ 2016-04-20 16:58     ` Robin Murphy
  0 siblings, 0 replies; 127+ messages in thread
From: Robin Murphy @ 2016-04-20 16:58 UTC (permalink / raw)
  To: Eric Auger, eric.auger, alex.williamson, will.deacon, joro, tglx,
	jason, marc.zyngier, christoffer.dall, linux-arm-kernel
  Cc: patches, linux-kernel, Bharat.Bhushan, pranav.sawargaonkar,
	p.fedin, iommu, Jean-Philippe.Brucker, julien.grall

On 19/04/16 17:56, Eric Auger wrote:
> This patch introduces iommu_get/put_reserved_iova.
>
> iommu_get_reserved_iova allows the caller to IOMMU-map a contiguous
> physical region onto a reserved contiguous IOVA region. The physical
> region base address does not need to be IOMMU page size aligned. IOVA
> pages are allocated and mapped so that they cover the whole physical
> region. This mapping is tracked as a whole (and cannot be split) in an
> RB tree indexed by PA.
>
> In case a mapping already exists for the physical pages, the IOVA mapped
> to the PA base is directly returned.
>
> Each time the get succeeds, a binding ref count is incremented.
>
> iommu_put_reserved_iova decrements the ref count; when it reaches zero,
> the mapping is destroyed and the IOVAs are released.
>
> Signed-off-by: Eric Auger <eric.auger@linaro.org>
>
> ---
> v7:
> - change title and rework commit message with new name of the functions
>    and size parameter
> - fix locking
> - rework header doc comments
> - put now takes a phys_addr_t
> - check prot argument against reserved_iova_domain prot flags
>
> v5 -> v6:
> - revisit locking with spin_lock instead of mutex
> - do not kref_get on 1st get
> - add size parameter to the get function following Marc's request
> - use the iova domain shift instead of using the smallest supported page size
>
> v3 -> v4:
> - formerly in iommu: iommu_get/put_single_reserved &
>    iommu/arm-smmu: implement iommu_get/put_single_reserved
> - Attempted to address Marc's doubts about missing size/alignment
>    at VFIO level (user-space knows the IOMMU page size and the number
>    of IOVA pages to provision)
>
> v2 -> v3:
> - remove static implementation of iommu_get_single_reserved &
>    iommu_put_single_reserved when CONFIG_IOMMU_API is not set
>
> v1 -> v2:
> - previously a VFIO API, named vfio_alloc_map/unmap_free_reserved_iova
> ---
>   drivers/iommu/dma-reserved-iommu.c | 150 +++++++++++++++++++++++++++++++++++++
>   include/linux/dma-reserved-iommu.h |  38 ++++++++++
>   2 files changed, 188 insertions(+)
>
> diff --git a/drivers/iommu/dma-reserved-iommu.c b/drivers/iommu/dma-reserved-iommu.c
> index f6fa18e..426d339 100644
> --- a/drivers/iommu/dma-reserved-iommu.c
> +++ b/drivers/iommu/dma-reserved-iommu.c
> @@ -135,6 +135,22 @@ unlock:
>   }
>   EXPORT_SYMBOL_GPL(iommu_alloc_reserved_iova_domain);
>
> +/* called with domain's reserved_lock held */
> +static void reserved_binding_release(struct kref *kref)
> +{
> +	struct iommu_reserved_binding *b =
> +		container_of(kref, struct iommu_reserved_binding, kref);
> +	struct iommu_domain *d = b->domain;
> +	struct reserved_iova_domain *rid =
> +		(struct reserved_iova_domain *)d->reserved_iova_cookie;

Either it's a void *, in which case you don't need to cast it, or it 
should be the appropriate type as I mentioned earlier, in which case you 
still wouldn't need to cast it.
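
For illustration, a minimal sketch of the suggested shape (assuming the
cookie stays a void *, which converts implicitly to any object pointer
type in C):

	struct reserved_iova_domain *rid = d->reserved_iova_cookie;

Either way the explicit cast is redundant.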

> +	unsigned long order;
> +
> +	order = iova_shift(rid->iovad);
> +	free_iova(rid->iovad, b->iova >> order);

iova_pfn() ?
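
That is, using the iova_pfn() helper from <linux/iova.h>, the open-coded
shift could plausibly become:

	free_iova(rid->iovad, iova_pfn(rid->iovad, b->iova));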

> +	unlink_reserved_binding(d, b);
> +	kfree(b);
> +}
> +
>   void iommu_free_reserved_iova_domain(struct iommu_domain *domain)
>   {
>   	struct reserved_iova_domain *rid;
> @@ -160,3 +176,137 @@ unlock:
>   	}
>   }
>   EXPORT_SYMBOL_GPL(iommu_free_reserved_iova_domain);
> +
> +int iommu_get_reserved_iova(struct iommu_domain *domain,
> +			      phys_addr_t addr, size_t size, int prot,
> +			      dma_addr_t *iova)
> +{
> +	unsigned long base_pfn, end_pfn, nb_iommu_pages, order, flags;
> +	struct iommu_reserved_binding *b, *newb;
> +	size_t iommu_page_size, binding_size;
> +	phys_addr_t aligned_base, offset;
> +	struct reserved_iova_domain *rid;
> +	struct iova_domain *iovad;
> +	struct iova *p_iova;
> +	int ret = -EINVAL;
> +
> +	newb = kzalloc(sizeof(*newb), GFP_KERNEL);
> +	if (!newb)
> +		return -ENOMEM;
> +
> +	spin_lock_irqsave(&domain->reserved_lock, flags);
> +
> +	rid = (struct reserved_iova_domain *)domain->reserved_iova_cookie;
> +	if (!rid)
> +		goto free_newb;
> +
> +	if (((prot & IOMMU_READ) && !(rid->prot & IOMMU_READ)) ||
> +	    ((prot & IOMMU_WRITE) && !(rid->prot & IOMMU_WRITE)))

Are devices wanting to read from MSI doorbells really a thing?

> +		goto free_newb;
> +
> +	iovad = rid->iovad;
> +	order = iova_shift(iovad);
> +	base_pfn = addr >> order;
> +	end_pfn = (addr + size - 1) >> order;
> +	aligned_base = base_pfn << order;
> +	offset = addr - aligned_base;
> +	nb_iommu_pages = end_pfn - base_pfn + 1;
> +	iommu_page_size = 1 << order;
> +	binding_size = nb_iommu_pages * iommu_page_size;

offset = iova_offset(iovad, addr);
aligned_base = addr - offset;
binding_size = iova_align(iovad, size + offset);

Am I right?
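
A quick worked example with a 4KiB IOVA granule, for addr = 0x10040 and
size = 0x2000: iova_offset() yields offset = 0x40, aligned_base becomes
0x10000, and iova_align(iovad, 0x2040) rounds up to binding_size = 0x3000,
i.e. exactly the three values the open-coded arithmetic computes.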

> +
> +	b = find_reserved_binding(domain, aligned_base, binding_size);
> +	if (b) {
> +		*iova = b->iova + offset + aligned_base - b->addr;
> +		kref_get(&b->kref);
> +		ret = 0;
> +		goto free_newb;
> +	}
> +
> +	p_iova = alloc_iova(iovad, nb_iommu_pages,
> +			    iovad->dma_32bit_pfn, true);
> +	if (!p_iova) {
> +		ret = -ENOMEM;
> +		goto free_newb;
> +	}
> +
> +	*iova = iova_dma_addr(iovad, p_iova);
> +
> +	/* unlock to call iommu_map which is not guaranteed to be atomic */

Hmm, that's concerning, because the ARM DMA mapping code, and 
consequently the iommu-dma layer, has always relied on it being so. On 
brief inspection, it looks to be only the AMD IOMMU doing something 
obviously non-atomic (taking a mutex) in its map callback, but then that 
has a separate DMA ops implementation. It doesn't look like it would be 
too intrusive to change, either, but that's an idea for its own thread.

> +	spin_unlock_irqrestore(&domain->reserved_lock, flags);
> +
> +	ret = iommu_map(domain, *iova, aligned_base, binding_size, prot);
> +
> +	spin_lock_irqsave(&domain->reserved_lock, flags);
> +
> +	rid = (struct reserved_iova_domain *) domain->reserved_iova_cookie;
> +	if (!rid || (rid->iovad != iovad)) {
> +		/* reserved iova domain was destroyed behind our back */

That that could happen at all is terrifying! Surely the reserved domain 
should be set up immediately after iommu_domain_alloc() and torn down 
immediately before iommu_domain_free(). Things going missing while a 
domain is live smacks of horrible brokenness.
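
The lifetime rule being argued for would look roughly like this, using the
series' own entry points (a sketch, not tested):

	d = iommu_domain_alloc(bus);
	iommu_alloc_reserved_iova_domain(d, base, size, prot, order);
	/* ... attach devices, map MSI doorbells ... */
	iommu_free_reserved_iova_domain(d);
	iommu_domain_free(d);

so the reserved IOVA domain can never vanish while the domain is live.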

> +		ret = -EBUSY;
> +		goto free_newb; /* iova already released */
> +	}
> +
> +	/* no change in iova reserved domain but iommu_map failed */
> +	if (ret)
> +		goto free_iova;
> +
> +	/* everything is fine, add in the new node in the rb tree */
> +	kref_init(&newb->kref);
> +	newb->domain = domain;
> +	newb->addr = aligned_base;
> +	newb->iova = *iova;
> +	newb->size = binding_size;
> +
> +	link_reserved_binding(domain, newb);
> +
> +	*iova += offset;
> +	goto unlock;
> +
> +free_iova:
> +	free_iova(rid->iovad, p_iova->pfn_lo);
> +free_newb:
> +	kfree(newb);
> +unlock:
> +	spin_unlock_irqrestore(&domain->reserved_lock, flags);
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(iommu_get_reserved_iova);
> +
> +void iommu_put_reserved_iova(struct iommu_domain *domain, phys_addr_t addr)
> +{
> +	phys_addr_t aligned_addr, page_size, mask;
> +	struct iommu_reserved_binding *b;
> +	struct reserved_iova_domain *rid;
> +	unsigned long order, flags;
> +	struct iommu_domain *d;
> +	dma_addr_t iova;
> +	size_t size;
> +	int ret = 0;
> +
> +	spin_lock_irqsave(&domain->reserved_lock, flags);
> +
> +	rid = (struct reserved_iova_domain *)domain->reserved_iova_cookie;
> +	if (!rid)
> +		goto unlock;
> +
> +	order = iova_shift(rid->iovad);
> +	page_size = (uint64_t)1 << order;
> +	mask = page_size - 1;
> +	aligned_addr = addr & ~mask;

addr & ~iova_mask(rid->iovad)

> +
> +	b = find_reserved_binding(domain, aligned_addr, page_size);
> +	if (!b)
> +		goto unlock;
> +
> +	iova = b->iova;
> +	size = b->size;
> +	d = b->domain;
> +
> +	ret = kref_put(&b->kref, reserved_binding_release);
> +
> +unlock:
> +	spin_unlock_irqrestore(&domain->reserved_lock, flags);
> +	if (ret)
> +		iommu_unmap(d, iova, size);
> +}
> +EXPORT_SYMBOL_GPL(iommu_put_reserved_iova);
> +
> diff --git a/include/linux/dma-reserved-iommu.h b/include/linux/dma-reserved-iommu.h
> index 01ec385..8722131 100644
> --- a/include/linux/dma-reserved-iommu.h
> +++ b/include/linux/dma-reserved-iommu.h
> @@ -42,6 +42,34 @@ int iommu_alloc_reserved_iova_domain(struct iommu_domain *domain,
>    */
>   void iommu_free_reserved_iova_domain(struct iommu_domain *domain);
>
> +/**
> + * iommu_get_reserved_iova: allocate a contiguous set of iova pages and
> + * map them to the physical range defined by @addr and @size.
> + *
> + * @domain: iommu domain handle
> + * @addr: physical address to bind
> + * @size: size of the binding
> + * @prot: mapping protection attribute
> + * @iova: returned iova
> + *
> + * Mapped physical pfns are within [@addr >> order, (@addr + size - 1) >> order]
> + * where order corresponds to the reserved iova domain order.
> + * This mapping is tracked and reference counted with the minimal granularity
> + * of @size.
> + */
> +int iommu_get_reserved_iova(struct iommu_domain *domain,
> +			    phys_addr_t addr, size_t size, int prot,
> +			    dma_addr_t *iova);
> +
> +/**
> + * iommu_put_reserved_iova: decrement the ref count of the reserved mapping
> + *
> + * @domain: iommu domain handle
> + * @addr: physical address whose binding ref count is decremented
> + *
> + * if the binding ref count drops to zero, destroy the reserved mapping
> + */
> +void iommu_put_reserved_iova(struct iommu_domain *domain, phys_addr_t addr);
>   #else
>
>   static inline int
> @@ -55,5 +83,15 @@ iommu_alloc_reserved_iova_domain(struct iommu_domain *domain,
>   static inline void
>   iommu_free_reserved_iova_domain(struct iommu_domain *domain) {}
>
> +static inline int iommu_get_reserved_iova(struct iommu_domain *domain,
> +					  phys_addr_t addr, size_t size,
> +					  int prot, dma_addr_t *iova)
> +{
> +	return -ENOENT;
> +}
> +
> +static inline void iommu_put_reserved_iova(struct iommu_domain *domain,
> +					   phys_addr_t addr) {}
> +
>   #endif	/* CONFIG_IOMMU_DMA_RESERVED */
>   #endif	/* __DMA_RESERVED_IOMMU_H */
>
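
(For context, a hypothetical caller in the MSI layer would pair the two
entry points roughly like so; the exact message plumbing is assumed:

	dma_addr_t iova;

	if (!iommu_get_reserved_iova(domain, doorbell_pa, sizeof(u32),
				     IOMMU_WRITE, &iova))
		msg->address_lo = lower_32_bits(iova);
	/* ... and on teardown ... */
	iommu_put_reserved_iova(domain, doorbell_pa);

with the returned IOVA already carrying the intra-page offset of
doorbell_pa.)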

I worry that this all falls into the trap of trying too hard to abstract 
something which doesn't need abstracting. AFAICS all we need is 
something for VFIO to keep track of its own IOVA usage vs. userspace's, 
plus a list of MSI descriptors (with IOVAs) wrapped in refcounts hanging 
off the iommu_domain, with a handful of functions to manage them. The 
former is as good as solved already - stick an iova_domain or even just 
a bitmap in the iova_cookie and use it directly - and the latter would 
actually be reusable elsewhere (e.g. for iommu-dma domains). What I'm 
seeing here is layers upon layers of complexity with no immediate 
justification, that's 'generic' enough to not directly solve the problem 
at hand, but in a way that still makes it more or less unusable for 
solving equivalent problems elsewhere.
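
A rough sketch of that simpler shape, with purely hypothetical names,
might be no more than:

	struct msi_iova_cookie {
		struct iova_domain	iovad;		/* VFIO-owned IOVA space */
		struct list_head	msi_pages;	/* refcounted doorbell mappings */
		spinlock_t		lock;
	};

hung off the domain's iova_cookie, plus a couple of get/put helpers
walking the list.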

Since I don't like that everything I have to say about this series so 
far seems negative, I'll plan to spend some time next week having a go 
at hardening my 50-line proof-of-concept for stage 1 MSIs, and see if I 
can offer code instead of criticism :)

Robin.

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v7 07/10] iommu/dma-reserved-iommu: delete bindings in iommu_free_reserved_iova_domain
@ 2016-04-20 17:05     ` Robin Murphy
  0 siblings, 0 replies; 127+ messages in thread
From: Robin Murphy @ 2016-04-20 17:05 UTC (permalink / raw)
  To: Eric Auger, eric.auger, alex.williamson, will.deacon, joro, tglx,
	jason, marc.zyngier, christoffer.dall, linux-arm-kernel
  Cc: patches, linux-kernel, Bharat.Bhushan, pranav.sawargaonkar,
	p.fedin, iommu, Jean-Philippe.Brucker, julien.grall

On 19/04/16 17:56, Eric Auger wrote:
> Now that reserved bindings can exist, destroy them when destroying the
> reserved iova domain. iommu_map is not guaranteed to be atomic, hence
> the extra complexity in the locking.
>
> Signed-off-by: Eric Auger <eric.auger@linaro.org>
>
> ---
> v6 -> v7:
> - remove [PATCH v6 7/7] dma-reserved-iommu: iommu_unmap_reserved and
>    destroy the bindings in iommu_free_reserved_iova_domain
>
> v5 -> v6:
> - use spin_lock instead of mutex
>
> v3 -> v4:
> - previously "iommu/arm-smmu: relinquish reserved resources on
>    domain deletion"
> ---
>   drivers/iommu/dma-reserved-iommu.c | 34 ++++++++++++++++++++++++++++------
>   1 file changed, 28 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/iommu/dma-reserved-iommu.c b/drivers/iommu/dma-reserved-iommu.c
> index 426d339..2522235 100644
> --- a/drivers/iommu/dma-reserved-iommu.c
> +++ b/drivers/iommu/dma-reserved-iommu.c
> @@ -157,14 +157,36 @@ void iommu_free_reserved_iova_domain(struct iommu_domain *domain)
>   	unsigned long flags;
>   	int ret = 0;
>
> -	spin_lock_irqsave(&domain->reserved_lock, flags);
> -
> -	rid = (struct reserved_iova_domain *)domain->reserved_iova_cookie;
> -	if (!rid) {
> -		ret = -EINVAL;
> -		goto unlock;
> +	while (1) {
> +		struct iommu_reserved_binding *b;
> +		struct rb_node *node;
> +		dma_addr_t iova;
> +		size_t size;
> +
> +		spin_lock_irqsave(&domain->reserved_lock, flags);
> +
> +		rid = (struct reserved_iova_domain *)
> +				domain->reserved_iova_cookie;

Same comment about casting as before.

> +		if (!rid) {
> +			ret = -EINVAL;
> +			goto unlock;
> +		}
> +
> +		node = rb_first(&domain->reserved_binding_list);
> +		if (!node)
> +			break;
> +		b = rb_entry(node, struct iommu_reserved_binding, node);
> +
> +		iova = b->iova;
> +		size = b->size;
> +
> +		while (!kref_put(&b->kref, reserved_binding_release))
> +			;

Since you're freeing the thing anyway, why not just call the function 
directly?
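
In other words, since teardown is dropping the last references anyway, the
busy loop could plausibly reduce to a direct call (sketch):

	b = rb_entry(node, struct iommu_reserved_binding, node);
	iova = b->iova;
	size = b->size;
	reserved_binding_release(&b->kref);

rather than spinning on kref_put().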

> +		spin_unlock_irqrestore(&domain->reserved_lock, flags);
> +		iommu_unmap(domain, iova, size);
>   	}
>
> +	domain->reserved_binding_list = RB_ROOT;
>   	domain->reserved_iova_cookie = NULL;
>   unlock:
>   	spin_unlock_irqrestore(&domain->reserved_lock, flags);
>

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v7 08/10] iommu/dma-reserved_iommu: iommu_msi_mapping_desc_to_domain
@ 2016-04-20 17:19     ` Robin Murphy
  0 siblings, 0 replies; 127+ messages in thread
From: Robin Murphy @ 2016-04-20 17:19 UTC (permalink / raw)
  To: Eric Auger, eric.auger, alex.williamson, will.deacon, joro, tglx,
	jason, marc.zyngier, christoffer.dall, linux-arm-kernel
  Cc: patches, linux-kernel, Bharat.Bhushan, pranav.sawargaonkar,
	p.fedin, iommu, Jean-Philippe.Brucker, julien.grall

On 19/04/16 17:56, Eric Auger wrote:
> This function checks whether
> - the device emitting the MSI belongs to a non-default iommu domain
> - the iommu domain requires the MSI address to be mapped.
>
> If both conditions are met, the function returns the iommu domain
> to be used for mapping the MSI doorbell; otherwise it returns NULL.
>
> Signed-off-by: Eric Auger <eric.auger@linaro.org>
> ---
>   drivers/iommu/dma-reserved-iommu.c | 19 +++++++++++++++++++
>   include/linux/dma-reserved-iommu.h | 18 ++++++++++++++++++
>   2 files changed, 37 insertions(+)
>
> diff --git a/drivers/iommu/dma-reserved-iommu.c b/drivers/iommu/dma-reserved-iommu.c
> index 2522235..907a17f 100644
> --- a/drivers/iommu/dma-reserved-iommu.c
> +++ b/drivers/iommu/dma-reserved-iommu.c
> @@ -17,6 +17,7 @@
>
>   #include <linux/iommu.h>
>   #include <linux/iova.h>
> +#include <linux/msi.h>
>
>   struct reserved_iova_domain {
>   	struct iova_domain *iovad;
> @@ -332,3 +333,21 @@ unlock:
>   }
>   EXPORT_SYMBOL_GPL(iommu_put_reserved_iova);
>
> +struct iommu_domain *iommu_msi_mapping_desc_to_domain(struct msi_desc *desc)
> +{
> +	struct device *dev;
> +	struct iommu_domain *d;
> +
> +	dev = msi_desc_to_dev(desc);
> +
> +	d = iommu_get_domain_for_dev(dev);
> +
> +	if (!d || (d->type == IOMMU_DOMAIN_DMA))
> +		return NULL;
> +
> +	if (iommu_domain_get_attr(d, DOMAIN_ATTR_MSI_MAPPING, NULL))
> +		return NULL;

Yeah, I don't see why we couldn't just use

if (domain->ops->capable(IOMMU_CAP_INTR_REMAP))
	return NULL;

there instead.
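
Folded into the function via the bus-level helper, the whole check might
look like (a sketch of the alternative, not the posted code):

	if (!d || d->type == IOMMU_DOMAIN_DMA ||
	    iommu_capable(dev->bus, IOMMU_CAP_INTR_REMAP))
		return NULL;

	return d;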

> +
> +	return d;
> +}
> +EXPORT_SYMBOL_GPL(iommu_msi_mapping_desc_to_domain);
> diff --git a/include/linux/dma-reserved-iommu.h b/include/linux/dma-reserved-iommu.h
> index 8722131..8373929 100644
> --- a/include/linux/dma-reserved-iommu.h
> +++ b/include/linux/dma-reserved-iommu.h
> @@ -19,6 +19,7 @@
>   #include <linux/kernel.h>
>
>   struct iommu_domain;
> +struct msi_desc;
>
>   #ifdef CONFIG_IOMMU_DMA_RESERVED
>
> @@ -70,6 +71,17 @@ int iommu_get_reserved_iova(struct iommu_domain *domain,
>    * if the binding ref count drops to zero, destroy the reserved mapping
>    */
>   void iommu_put_reserved_iova(struct iommu_domain *domain, phys_addr_t addr);
> +
> +/**
> + * iommu_msi_mapping_desc_to_domain: in case the MSI originates from a device
> + * upstream of an IOMMU and this IOMMU translates the MSI transaction,
> + * this function returns the iommu domain the MSI doorbell address must be
> + * mapped in; otherwise it returns NULL.
> + *
> + * @desc: msi desc handle
> + */
> +struct iommu_domain *iommu_msi_mapping_desc_to_domain(struct msi_desc *desc);
> +
>   #else
>
>   static inline int
> @@ -93,5 +105,11 @@ static inline int iommu_get_reserved_iova(struct iommu_domain *domain,
>   static inline void iommu_put_reserved_iova(struct iommu_domain *domain,
>   					   phys_addr_t addr) {}
>
> +static inline struct iommu_domain *
> +iommu_msi_mapping_desc_to_domain(struct msi_desc *desc)
> +{
> +	return NULL;
> +}
> +
>   #endif	/* CONFIG_IOMMU_DMA_RESERVED */
>   #endif	/* __DMA_RESERVED_IOMMU_H */
>

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v7 09/10] iommu/dma-reserved-iommu: iommu_msi_mapping_translate_msg
@ 2016-04-20 17:28     ` Robin Murphy
  0 siblings, 0 replies; 127+ messages in thread
From: Robin Murphy @ 2016-04-20 17:28 UTC (permalink / raw)
  To: Eric Auger, eric.auger, alex.williamson, will.deacon, joro, tglx,
	jason, marc.zyngier, christoffer.dall, linux-arm-kernel
  Cc: patches, linux-kernel, Bharat.Bhushan, pranav.sawargaonkar,
	p.fedin, iommu, Jean-Philippe.Brucker, julien.grall

On 19/04/16 17:56, Eric Auger wrote:
> Introduce iommu_msi_mapping_translate_msg, whose role is to detect
> whether the device's MSIs must be mapped into an IOMMU. If so, the
> function overrides the MSI msg originally composed and replaces the
> doorbell's PA with a pre-allocated and pre-mapped reserved IOVA. In
> case the corresponding PA region is not found, the function returns
> an error.
>
> This function is likely to be called from code that cannot sleep, which
> is why the allocation/mapping is done separately.
>
> Signed-off-by: Eric Auger <eric.auger@linaro.org>
>
> ---
>
> v7: creation
> ---
>   drivers/iommu/dma-reserved-iommu.c | 69 ++++++++++++++++++++++++++++++++++++++
>   include/linux/dma-reserved-iommu.h | 27 +++++++++++++++
>   2 files changed, 96 insertions(+)
>
> diff --git a/drivers/iommu/dma-reserved-iommu.c b/drivers/iommu/dma-reserved-iommu.c
> index 907a17f..603ee45 100644
> --- a/drivers/iommu/dma-reserved-iommu.c
> +++ b/drivers/iommu/dma-reserved-iommu.c
> @@ -18,6 +18,14 @@
>   #include <linux/iommu.h>
>   #include <linux/iova.h>
>   #include <linux/msi.h>
> +#include <linux/irq.h>
> +
> +#ifdef CONFIG_PHYS_ADDR_T_64BIT
> +#define msg_to_phys_addr(msg) \
> +	(((phys_addr_t)((msg)->address_hi) << 32) | (msg)->address_lo)
> +#else
> +#define msg_to_phys_addr(msg)	((msg)->address_lo)
> +#endif
>
>   struct reserved_iova_domain {
>   	struct iova_domain *iovad;
> @@ -351,3 +359,64 @@ struct iommu_domain *iommu_msi_mapping_desc_to_domain(struct msi_desc *desc)
>   	return d;
>   }
>   EXPORT_SYMBOL_GPL(iommu_msi_mapping_desc_to_domain);
> +
> +static dma_addr_t iommu_find_reserved_iova(struct iommu_domain *domain,
> +				    phys_addr_t addr, size_t size)
> +{
> +	unsigned long  base_pfn, end_pfn, nb_iommu_pages, order, flags;
> +	size_t iommu_page_size, binding_size;
> +	struct iommu_reserved_binding *b;
> +	phys_addr_t aligned_base, offset;
> +	dma_addr_t iova = DMA_ERROR_CODE;
> +	struct iova_domain *iovad;
> +
> +	spin_lock_irqsave(&domain->reserved_lock, flags);
> +
> +	iovad = (struct iova_domain *)domain->reserved_iova_cookie;
> +
> +	if (!iovad)
> +		goto unlock;
> +
> +	order = iova_shift(iovad);
> +	base_pfn = addr >> order;
> +	end_pfn = (addr + size - 1) >> order;
> +	aligned_base = base_pfn << order;
> +	offset = addr - aligned_base;
> +	nb_iommu_pages = end_pfn - base_pfn + 1;
> +	iommu_page_size = 1 << order;
> +	binding_size = nb_iommu_pages * iommu_page_size;

This all looks rather familiar...

> +	b = find_reserved_binding(domain, aligned_base, binding_size);

...which implies that at least some of it should be factored into that guy.

> +	if (b && (b->addr <= aligned_base) &&
> +		(aligned_base + binding_size <=  b->addr + b->size))
> +		iova = b->iova + offset + aligned_base - b->addr;
> +unlock:
> +	spin_unlock_irqrestore(&domain->reserved_lock, flags);
> +	return iova;
> +}
> +
> +int iommu_msi_mapping_translate_msg(struct irq_data *data, struct msi_msg *msg)
> +{
> +	struct iommu_domain *d;
> +	struct msi_desc *desc;
> +	dma_addr_t iova;
> +
> +	desc = irq_data_get_msi_desc(data);
> +	if (!desc)
> +		return -EINVAL;
> +
> +	d = iommu_msi_mapping_desc_to_domain(desc);
> +	if (!d)
> +		return 0;
> +
> +	iova = iommu_find_reserved_iova(d, msg_to_phys_addr(msg),
> +					sizeof(phys_addr_t));
> +
> +	if (iova == DMA_ERROR_CODE)
> +		return -EINVAL;
> +
> +	msg->address_lo = lower_32_bits(iova);
> +	msg->address_hi = upper_32_bits(iova);
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(iommu_msi_mapping_translate_msg);
> diff --git a/include/linux/dma-reserved-iommu.h b/include/linux/dma-reserved-iommu.h
> index 8373929..04e1912f 100644
> --- a/include/linux/dma-reserved-iommu.h
> +++ b/include/linux/dma-reserved-iommu.h
> @@ -20,6 +20,8 @@
>
>   struct iommu_domain;
>   struct msi_desc;
> +struct irq_data;
> +struct msi_msg;
>
>   #ifdef CONFIG_IOMMU_DMA_RESERVED
>
> @@ -82,6 +84,25 @@ void iommu_put_reserved_iova(struct iommu_domain *domain, phys_addr_t addr);
>    */
>   struct iommu_domain *iommu_msi_mapping_desc_to_domain(struct msi_desc *desc);
>
> +/**
> + * iommu_msi_mapping_translate_msg: in case the MSI transaction is translated
> + * by an IOMMU, the msg address must be an IOVA instead of a physical address.
> + * This function overwrites the original MSI message containing the doorbell
> + * physical address, result of the primary composition, with the doorbell IOVA.
> + *
> + * The doorbell physical address must be bound previously to an IOVA using
> + * iommu_get_reserved_iova
> + *
> + * @data: irq data handle
> + * @msg: original msi message containing the PA to be overwritten with
> + * the IOVA
> + *
> + * return 0 if the MSI does not need to be mapped or when the PA/IOVA
> + * were successfully swapped; return -EINVAL if the addresses need
> + * to be swapped but no IOMMU binding is found
> + */
> +int iommu_msi_mapping_translate_msg(struct irq_data *data, struct msi_msg *msg);
> +
>   #else
>
>   static inline int
> @@ -111,5 +132,11 @@ iommu_msi_mapping_desc_to_domain(struct msi_desc *desc)
>   	return NULL;
>   }
>
> +static inline int iommu_msi_mapping_translate_msg(struct irq_data *data,
> +						  struct msi_msg *msg)
> +{
> +	return 0;
> +}
> +
>   #endif	/* CONFIG_IOMMU_DMA_RESERVED */
>   #endif	/* __DMA_RESERVED_IOMMU_H */
>

^ permalink raw reply	[flat|nested] 127+ messages in thread
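
To make the factoring suggestion above concrete, here is a minimal sketch,
not part of the posted series, of how the alignment arithmetic could
collapse into one lookup helper. It assumes the v7 types and helpers quoted
above (struct iommu_reserved_binding with addr/size/iova fields,
iova_shift() and find_reserved_binding()); note that the original IOVA
expression simplifies to b->iova + (addr - b->addr):

/*
 * Sketch only: return the IOVA backing [addr, addr + size), or
 * DMA_ERROR_CODE if no reserved binding covers the range. The caller
 * is assumed to hold domain->reserved_lock, as in the quoted
 * iommu_find_reserved_iova().
 */
static dma_addr_t __find_reserved_iova(struct iommu_domain *domain,
                                       struct iova_domain *iovad,
                                       phys_addr_t addr, size_t size)
{
        unsigned long order = iova_shift(iovad);
        phys_addr_t aligned_base = (addr >> order) << order;
        size_t binding_size = ((((addr + size - 1) >> order) -
                                (addr >> order)) + 1) << order;
        struct iommu_reserved_binding *b;

        b = find_reserved_binding(domain, aligned_base, binding_size);
        if (!b || b->addr > aligned_base ||
            aligned_base + binding_size > b->addr + b->size)
                return DMA_ERROR_CODE;

        /* offset + aligned_base - b->addr reduces to addr - b->addr */
        return b->iova + (addr - b->addr);
}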

* Re: [PATCH v7 10/10] iommu/arm-smmu: call iommu_free_reserved_iova_domain on domain destruction
@ 2016-04-20 17:35     ` Robin Murphy
  0 siblings, 0 replies; 127+ messages in thread
From: Robin Murphy @ 2016-04-20 17:35 UTC (permalink / raw)
  To: Eric Auger, eric.auger, alex.williamson, will.deacon, joro, tglx,
	jason, marc.zyngier, christoffer.dall, linux-arm-kernel
  Cc: patches, linux-kernel, Bharat.Bhushan, pranav.sawargaonkar,
	p.fedin, iommu, Jean-Philippe.Brucker, julien.grall

On 19/04/16 17:56, Eric Auger wrote:
> When the domain gets destroyed, let's make sure all reserved iova
> resources get released.
>
> The choice is made to put that call in arm-smmu(-v3).c, similar to what
> was done for iommu_put_dma_cookie.
>
> Signed-off-by: Eric Auger <eric.auger@linaro.org>
>
> ---
>
> v7: new
> ---
>   drivers/iommu/arm-smmu-v3.c | 2 ++
>   drivers/iommu/arm-smmu.c    | 2 ++
>   2 files changed, 4 insertions(+)
>
> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
> index a077a35..afd0dac 100644
> --- a/drivers/iommu/arm-smmu-v3.c
> +++ b/drivers/iommu/arm-smmu-v3.c
> @@ -22,6 +22,7 @@
>
>   #include <linux/delay.h>
>   #include <linux/dma-iommu.h>
> +#include <linux/dma-reserved-iommu.h>
>   #include <linux/err.h>
>   #include <linux/interrupt.h>
>   #include <linux/iommu.h>
> @@ -1444,6 +1445,7 @@ static void arm_smmu_domain_free(struct iommu_domain *domain)
>   	struct arm_smmu_device *smmu = smmu_domain->smmu;
>
>   	iommu_put_dma_cookie(domain);
> +	iommu_free_reserved_iova_domain(domain);

Yikes! No, drivers shouldn't be randomly freeing things they didn't 
allocate - the owner of the domain, who presumably allocated the thing, 
can call that right _before_ they call iommu_domain_free().

>   	free_io_pgtable_ops(smmu_domain->pgtbl_ops);
>
>   	/* Free the CD and ASID, if we allocated them */
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index 8cd7b8a..492339f 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -30,6 +30,7 @@
>
>   #include <linux/delay.h>
>   #include <linux/dma-iommu.h>
> +#include <linux/dma-reserved-iommu.h>
>   #include <linux/dma-mapping.h>
>   #include <linux/err.h>
>   #include <linux/interrupt.h>
> @@ -1009,6 +1010,7 @@ static void arm_smmu_domain_free(struct iommu_domain *domain)
>   	 * already been detached.
>   	 */
>   	iommu_put_dma_cookie(domain);
> +	iommu_free_reserved_iova_domain(domain);

...which has the added bonus of preventing needless duplication everywhere.

Robin.

>   	arm_smmu_destroy_domain_context(domain);
>   	kfree(smmu_domain);
>   }
>

^ permalink raw reply	[flat|nested] 127+ messages in thread
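
In code terms, the review comment above amounts to moving the call to the
domain's owner. A hypothetical sketch of the caller-side teardown (the
function name is illustrative, not from the posted series; only the two
kernel/series APIs it calls are taken as given):

/*
 * Sketch only: whoever allocated the reserved IOVA domain, e.g. the
 * VFIO iommu backend, frees it right before destroying the iommu
 * domain, so the smmu drivers never free state they did not create.
 */
static void owner_release_domain(struct iommu_domain *domain)
{
        iommu_free_reserved_iova_domain(domain);
        iommu_domain_free(domain);
}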

* Re: [PATCH v7 10/10] iommu/arm-smmu: call iommu_free_reserved_iova_domain on domain destruction
@ 2016-04-21  8:39       ` Eric Auger
  0 siblings, 0 replies; 127+ messages in thread
From: Eric Auger @ 2016-04-21  8:39 UTC (permalink / raw)
  To: Robin Murphy, eric.auger, alex.williamson, will.deacon, joro,
	tglx, jason, marc.zyngier, christoffer.dall, linux-arm-kernel
  Cc: patches, linux-kernel, Bharat.Bhushan, pranav.sawargaonkar,
	p.fedin, iommu, Jean-Philippe.Brucker, julien.grall

Hi Robin,
On 04/20/2016 07:35 PM, Robin Murphy wrote:
> On 19/04/16 17:56, Eric Auger wrote:
>> When the domain gets destroyed, let's make sure all reserved iova
>> resources get released.
>>
>> The choice is made to put that call in arm-smmu(-v3).c, similar to
>> what was done for iommu_put_dma_cookie.
>>
>> Signed-off-by: Eric Auger <eric.auger@linaro.org>
>>
>> ---
>>
>> v7: new
>> ---
>>   drivers/iommu/arm-smmu-v3.c | 2 ++
>>   drivers/iommu/arm-smmu.c    | 2 ++
>>   2 files changed, 4 insertions(+)
>>
>> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
>> index a077a35..afd0dac 100644
>> --- a/drivers/iommu/arm-smmu-v3.c
>> +++ b/drivers/iommu/arm-smmu-v3.c
>> @@ -22,6 +22,7 @@
>>
>>   #include <linux/delay.h>
>>   #include <linux/dma-iommu.h>
>> +#include <linux/dma-reserved-iommu.h>
>>   #include <linux/err.h>
>>   #include <linux/interrupt.h>
>>   #include <linux/iommu.h>
>> @@ -1444,6 +1445,7 @@ static void arm_smmu_domain_free(struct
>> iommu_domain *domain)
>>       struct arm_smmu_device *smmu = smmu_domain->smmu;
>>
>>       iommu_put_dma_cookie(domain);
>> +    iommu_free_reserved_iova_domain(domain);
> 
> Yikes! No, drivers shouldn't be randomly freeing things they didn't
> allocate - the owner of the domain, who presumably allocated the thing,
> can call that right _before_ they call iommu_domain_free().
OK, I'll move that back to vfio_iommu_type1.c.

Thanks

Eric
> 
>>       free_io_pgtable_ops(smmu_domain->pgtbl_ops);
>>
>>       /* Free the CD and ASID, if we allocated them */
>> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
>> index 8cd7b8a..492339f 100644
>> --- a/drivers/iommu/arm-smmu.c
>> +++ b/drivers/iommu/arm-smmu.c
>> @@ -30,6 +30,7 @@
>>
>>   #include <linux/delay.h>
>>   #include <linux/dma-iommu.h>
>> +#include <linux/dma-reserved-iommu.h>
>>   #include <linux/dma-mapping.h>
>>   #include <linux/err.h>
>>   #include <linux/interrupt.h>
>> @@ -1009,6 +1010,7 @@ static void arm_smmu_domain_free(struct
>> iommu_domain *domain)
>>        * already been detached.
>>        */
>>       iommu_put_dma_cookie(domain);
>> +    iommu_free_reserved_iova_domain(domain);
> 
> ...which has the added bonus of preventing needless duplication everywhere.
> 
> Robin.
> 
>>       arm_smmu_destroy_domain_context(domain);
>>       kfree(smmu_domain);
>>   }
>>
> 

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v7 09/10] iommu/dma-reserved-iommu: iommu_msi_mapping_translate_msg
@ 2016-04-21  8:40       ` Eric Auger
  0 siblings, 0 replies; 127+ messages in thread
From: Eric Auger @ 2016-04-21  8:40 UTC (permalink / raw)
  To: Robin Murphy, eric.auger, alex.williamson, will.deacon, joro,
	tglx, jason, marc.zyngier, christoffer.dall, linux-arm-kernel
  Cc: patches, linux-kernel, Bharat.Bhushan, pranav.sawargaonkar,
	p.fedin, iommu, Jean-Philippe.Brucker, julien.grall

On 04/20/2016 07:28 PM, Robin Murphy wrote:
> On 19/04/16 17:56, Eric Auger wrote:
>> Introduce iommu_msi_mapping_translate_msg, whose role is to detect
>> whether the device's MSIs must be mapped into an IOMMU. If so, the
>> function overrides the MSI msg originally composed and replaces the
>> doorbell's PA with a pre-allocated and pre-mapped reserved IOVA. In
>> case the corresponding PA region is not found, the function returns
>> an error.
>>
>> This function is likely to be called from code that cannot sleep, which
>> is why the allocation/mapping is done separately.
>>
>> Signed-off-by: Eric Auger <eric.auger@linaro.org>
>>
>> ---
>>
>> v7: creation
>> ---
>>   drivers/iommu/dma-reserved-iommu.c | 69
>> ++++++++++++++++++++++++++++++++++++++
>>   include/linux/dma-reserved-iommu.h | 27 +++++++++++++++
>>   2 files changed, 96 insertions(+)
>>
>> diff --git a/drivers/iommu/dma-reserved-iommu.c
>> b/drivers/iommu/dma-reserved-iommu.c
>> index 907a17f..603ee45 100644
>> --- a/drivers/iommu/dma-reserved-iommu.c
>> +++ b/drivers/iommu/dma-reserved-iommu.c
>> @@ -18,6 +18,14 @@
>>   #include <linux/iommu.h>
>>   #include <linux/iova.h>
>>   #include <linux/msi.h>
>> +#include <linux/irq.h>
>> +
>> +#ifdef CONFIG_PHYS_ADDR_T_64BIT
>> +#define msg_to_phys_addr(msg) \
>> +    (((phys_addr_t)((msg)->address_hi) << 32) | (msg)->address_lo)
>> +#else
>> +#define msg_to_phys_addr(msg)    ((msg)->address_lo)
>> +#endif
>>
>>   struct reserved_iova_domain {
>>       struct iova_domain *iovad;
>> @@ -351,3 +359,64 @@ struct iommu_domain
>> *iommu_msi_mapping_desc_to_domain(struct msi_desc *desc)
>>       return d;
>>   }
>>   EXPORT_SYMBOL_GPL(iommu_msi_mapping_desc_to_domain);
>> +
>> +static dma_addr_t iommu_find_reserved_iova(struct iommu_domain *domain,
>> +                    phys_addr_t addr, size_t size)
>> +{
>> +    unsigned long  base_pfn, end_pfn, nb_iommu_pages, order, flags;
>> +    size_t iommu_page_size, binding_size;
>> +    struct iommu_reserved_binding *b;
>> +    phys_addr_t aligned_base, offset;
>> +    dma_addr_t iova = DMA_ERROR_CODE;
>> +    struct iova_domain *iovad;
>> +
>> +    spin_lock_irqsave(&domain->reserved_lock, flags);
>> +
>> +    iovad = (struct iova_domain *)domain->reserved_iova_cookie;
>> +
>> +    if (!iovad)
>> +        goto unlock;
>> +
>> +    order = iova_shift(iovad);
>> +    base_pfn = addr >> order;
>> +    end_pfn = (addr + size - 1) >> order;
>> +    aligned_base = base_pfn << order;
>> +    offset = addr - aligned_base;
>> +    nb_iommu_pages = end_pfn - base_pfn + 1;
>> +    iommu_page_size = 1 << order;
>> +    binding_size = nb_iommu_pages * iommu_page_size;
> 
> This all looks rather familiar...
> 
>> +    b = find_reserved_binding(domain, aligned_base, binding_size);
> 
> ...which implies that at least some of it should be factored into that guy.
OK. Besides, with your compact rewriting proposal, maybe it is not worth it
anymore ;-)

Eric
> 
>> +    if (b && (b->addr <= aligned_base) &&
>> +        (aligned_base + binding_size <=  b->addr + b->size))
>> +        iova = b->iova + offset + aligned_base - b->addr;
>> +unlock:
>> +    spin_unlock_irqrestore(&domain->reserved_lock, flags);
>> +    return iova;
>> +}
>> +
>> +int iommu_msi_mapping_translate_msg(struct irq_data *data, struct
>> msi_msg *msg)
>> +{
>> +    struct iommu_domain *d;
>> +    struct msi_desc *desc;
>> +    dma_addr_t iova;
>> +
>> +    desc = irq_data_get_msi_desc(data);
>> +    if (!desc)
>> +        return -EINVAL;
>> +
>> +    d = iommu_msi_mapping_desc_to_domain(desc);
>> +    if (!d)
>> +        return 0;
>> +
>> +    iova = iommu_find_reserved_iova(d, msg_to_phys_addr(msg),
>> +                    sizeof(phys_addr_t));
>> +
>> +    if (iova == DMA_ERROR_CODE)
>> +        return -EINVAL;
>> +
>> +    msg->address_lo = lower_32_bits(iova);
>> +    msg->address_hi = upper_32_bits(iova);
>> +    return 0;
>> +}
>> +EXPORT_SYMBOL_GPL(iommu_msi_mapping_translate_msg);
>> diff --git a/include/linux/dma-reserved-iommu.h
>> b/include/linux/dma-reserved-iommu.h
>> index 8373929..04e1912f 100644
>> --- a/include/linux/dma-reserved-iommu.h
>> +++ b/include/linux/dma-reserved-iommu.h
>> @@ -20,6 +20,8 @@
>>
>>   struct iommu_domain;
>>   struct msi_desc;
>> +struct irq_data;
>> +struct msi_msg;
>>
>>   #ifdef CONFIG_IOMMU_DMA_RESERVED
>>
>> @@ -82,6 +84,25 @@ void iommu_put_reserved_iova(struct iommu_domain
>> *domain, phys_addr_t addr);
>>    */
>>   struct iommu_domain *iommu_msi_mapping_desc_to_domain(struct
>> msi_desc *desc);
>>
>> +/**
>> + * iommu_msi_mapping_translate_msg: in case the MSI transaction is
>> translated
>> + * by an IOMMU, the msg address must be an IOVA instead of a physical
>> address.
>> + * This function overwrites the original MSI message containing the
>> doorbell
>> + * physical address, result of the primary composition, with the
>> doorbell IOVA.
>> + *
>> + * The doorbell physical address must be bound previously to an IOVA
>> using
>> + * iommu_get_reserved_iova
>> + *
>> + * @data: irq data handle
>> + * @msg: original msi message containing the PA to be overwritten with
>> + * the IOVA
>> + *
>> + * return 0 if the MSI does not need to be mapped or when the PA/IOVA
>> + * were successfully swapped; return -EINVAL if the addresses need
>> + * to be swapped but no IOMMU binding is found
>> + */
>> +int iommu_msi_mapping_translate_msg(struct irq_data *data, struct
>> msi_msg *msg);
>> +
>>   #else
>>
>>   static inline int
>> @@ -111,5 +132,11 @@ iommu_msi_mapping_desc_to_domain(struct msi_desc
>> *desc)
>>       return NULL;
>>   }
>>
>> +static inline int iommu_msi_mapping_translate_msg(struct irq_data *data,
>> +                          struct msi_msg *msg)
>> +{
>> +    return 0;
>> +}
>> +
>>   #endif    /* CONFIG_IOMMU_DMA_RESERVED */
>>   #endif    /* __DMA_RESERVED_IOMMU_H */
>>
> 

^ permalink raw reply	[flat|nested] 127+ messages in thread
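
To illustrate where iommu_msi_mapping_translate_msg() is meant to sit, here
is a hedged sketch of an MSI compose path. The hook name and error handling
are assumptions; only irq_chip_compose_msi_msg() and the series API are
taken as given:

/*
 * Sketch only: after the primary composition fills msg with the
 * doorbell PA, swap in the pre-mapped IOVA. This is safe in atomic
 * context because translate_msg() only performs a lookup under a
 * spinlock; the binding was set up earlier via iommu_get_reserved_iova().
 */
static void my_msi_compose(struct irq_data *data, struct msi_msg *msg)
{
        if (irq_chip_compose_msi_msg(data, msg))
                return;         /* primary composition failed */

        if (iommu_msi_mapping_translate_msg(data, msg))
                pr_warn("MSI doorbell PA has no reserved IOVA mapping\n");
}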

* Re: [PATCH v7 09/10] iommu/dma-reserved-iommu: iommu_msi_mapping_translate_msg
@ 2016-04-21  8:40       ` Eric Auger
  0 siblings, 0 replies; 127+ messages in thread
From: Eric Auger @ 2016-04-21  8:40 UTC (permalink / raw)
  To: Robin Murphy, eric.auger-qxv4g6HH51o,
	alex.williamson-H+wXaHxf7aLQT0dZR+AlfA, will.deacon-5wv7dgnIgG8,
	joro-zLv9SwRftAIdnm+yROfE0A, tglx-hfZtesqFncYOwBW4kG4KsQ,
	jason-NLaQJdtUoK4Be96aLqz0jA, marc.zyngier-5wv7dgnIgG8,
	christoffer.dall-QSEj5FYQhm4dnm+yROfE0A,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r
  Cc: julien.grall-5wv7dgnIgG8, patches-QSEj5FYQhm4dnm+yROfE0A,
	p.fedin-Sze3O3UU22JBDgjK7y7TUQ,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	pranav.sawargaonkar-Re5JQEeQqe8AvxtiuMwx3w

On 04/20/2016 07:28 PM, Robin Murphy wrote:
> On 19/04/16 17:56, Eric Auger wrote:
>> Introduce iommu_msi_mapping_translate_msg whose role consists in
>> detecting whether the device's MSIs must to be mapped into an IOMMU.
>> It case it must, the function overrides the MSI msg originally composed
>> and replaces the doorbell's PA by a pre-allocated and pre-mapped reserved
>> IOVA. In case the corresponding PA region is not found, the function
>> returns an error.
>>
>> This function is likely to be called in some code that cannot sleep. This
>> is the reason why the allocation/mapping is done separately.
>>
>> Signed-off-by: Eric Auger <eric.auger-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
>>
>> ---
>>
>> v7: creation
>> ---
>>   drivers/iommu/dma-reserved-iommu.c | 69
>> ++++++++++++++++++++++++++++++++++++++
>>   include/linux/dma-reserved-iommu.h | 27 +++++++++++++++
>>   2 files changed, 96 insertions(+)
>>
>> diff --git a/drivers/iommu/dma-reserved-iommu.c
>> b/drivers/iommu/dma-reserved-iommu.c
>> index 907a17f..603ee45 100644
>> --- a/drivers/iommu/dma-reserved-iommu.c
>> +++ b/drivers/iommu/dma-reserved-iommu.c
>> @@ -18,6 +18,14 @@
>>   #include <linux/iommu.h>
>>   #include <linux/iova.h>
>>   #include <linux/msi.h>
>> +#include <linux/irq.h>
>> +
>> +#ifdef CONFIG_PHYS_ADDR_T_64BIT
>> +#define msg_to_phys_addr(msg) \
>> +    (((phys_addr_t)((msg)->address_hi) << 32) | (msg)->address_lo)
>> +#else
>> +#define msg_to_phys_addr(msg)    ((msg)->address_lo)
>> +#endif
>>
>>   struct reserved_iova_domain {
>>       struct iova_domain *iovad;
>> @@ -351,3 +359,64 @@ struct iommu_domain
>> *iommu_msi_mapping_desc_to_domain(struct msi_desc *desc)
>>       return d;
>>   }
>>   EXPORT_SYMBOL_GPL(iommu_msi_mapping_desc_to_domain);
>> +
>> +static dma_addr_t iommu_find_reserved_iova(struct iommu_domain *domain,
>> +                    phys_addr_t addr, size_t size)
>> +{
>> +    unsigned long  base_pfn, end_pfn, nb_iommu_pages, order, flags;
>> +    size_t iommu_page_size, binding_size;
>> +    struct iommu_reserved_binding *b;
>> +    phys_addr_t aligned_base, offset;
>> +    dma_addr_t iova = DMA_ERROR_CODE;
>> +    struct iova_domain *iovad;
>> +
>> +    spin_lock_irqsave(&domain->reserved_lock, flags);
>> +
>> +    iovad = (struct iova_domain *)domain->reserved_iova_cookie;
>> +
>> +    if (!iovad)
>> +        goto unlock;
>> +
>> +    order = iova_shift(iovad);
>> +    base_pfn = addr >> order;
>> +    end_pfn = (addr + size - 1) >> order;
>> +    aligned_base = base_pfn << order;
>> +    offset = addr - aligned_base;
>> +    nb_iommu_pages = end_pfn - base_pfn + 1;
>> +    iommu_page_size = 1 << order;
>> +    binding_size = nb_iommu_pages * iommu_page_size;
> 
> This all looks rather familiar...
> 
>> +    b = find_reserved_binding(domain, aligned_base, binding_size);
> 
> ...which implies that at least some of it should be factored into that guy.
ok. Besides, with your compact rewriting proposal, maybe it is not worth
anymore ;-)

Eric
> 
>> +    if (b && (b->addr <= aligned_base) &&
>> +        (aligned_base t + binding_size <=  b->addr + b->size))
>> +        iova = b->iova + offset + aligned_base - b->addr;
>> +unlock:
>> +    spin_unlock_irqrestore(&domain->reserved_lock, flags);
>> +    return iova;
>> +}
>> +
>> +int iommu_msi_mapping_translate_msg(struct irq_data *data, struct
>> msi_msg *msg)
>> +{
>> +    struct iommu_domain *d;
>> +    struct msi_desc *desc;
>> +    dma_addr_t iova;
>> +
>> +    desc = irq_data_get_msi_desc(data);
>> +    if (!desc)
>> +        return -EINVAL;
>> +
>> +    d = iommu_msi_mapping_desc_to_domain(desc);
>> +    if (!d)
>> +        return 0;
>> +
>> +    iova = iommu_find_reserved_iova(d, msg_to_phys_addr(msg),
>> +                    sizeof(phys_addr_t));
>> +
>> +    if (iova == DMA_ERROR_CODE)
>> +        return -EINVAL;
>> +
>> +    msg->address_lo = lower_32_bits(iova);
>> +    msg->address_hi = upper_32_bits(iova);
>> +    return 0;
>> +}
>> +EXPORT_SYMBOL_GPL(iommu_msi_mapping_translate_msg);
>> diff --git a/include/linux/dma-reserved-iommu.h
>> b/include/linux/dma-reserved-iommu.h
>> index 8373929..04e1912f 100644
>> --- a/include/linux/dma-reserved-iommu.h
>> +++ b/include/linux/dma-reserved-iommu.h
>> @@ -20,6 +20,8 @@
>>
>>   struct iommu_domain;
>>   struct msi_desc;
>> +struct irq_data;
>> +struct msi_msg;
>>
>>   #ifdef CONFIG_IOMMU_DMA_RESERVED
>>
>> @@ -82,6 +84,25 @@ void iommu_put_reserved_iova(struct iommu_domain
>> *domain, phys_addr_t addr);
>>    */
>>   struct iommu_domain *iommu_msi_mapping_desc_to_domain(struct
>> msi_desc *desc);
>>
>> +/**
>> + * iommu_msi_mapping_translate_msg: in case the MSI transaction is
>> translated
>> + * by an IOMMU, the msg address must be an IOVA instead of a physical
>> address.
>> + * This function overwrites the original MSI message containing the
>> doorbell
>> + * physical address, result of the primary composition, with the
>> doorbell IOVA.
>> + *
>> + * The doorbell physical address must be bound previously to an IOVA
>> using
>> + * iommu_get_reserved_iova
>> + *
>> + * @data: irq data handle
>> + * @msg: original msi message containing the PA to be overwritten with
>> + * the IOVA
>> + *
>> + * return 0 if the MSI does not need to be mapped or when the PA/IOVA
>> + * were successfully swapped; return -EINVAL if the addresses need
>> + * to be swapped but not IOMMU binding is found
>> + */
>> +int iommu_msi_mapping_translate_msg(struct irq_data *data, struct
>> msi_msg *msg);
>> +
>>   #else
>>
>>   static inline int
>> @@ -111,5 +132,11 @@ iommu_msi_mapping_desc_to_domain(struct msi_desc
>> *desc)
>>       return NULL;
>>   }
>>
>> +static inline int iommu_msi_mapping_translate_msg(struct irq_data *data,
>> +                          struct msi_msg *msg)
>> +{
>> +    return 0;
>> +}
>> +
>>   #endif    /* CONFIG_IOMMU_DMA_RESERVED */
>>   #endif    /* __DMA_RESERVED_IOMMU_H */
>>
> 

^ permalink raw reply	[flat|nested] 127+ messages in thread

* [PATCH v7 09/10] iommu/dma-reserved-iommu: iommu_msi_mapping_translate_msg
@ 2016-04-21  8:40       ` Eric Auger
  0 siblings, 0 replies; 127+ messages in thread
From: Eric Auger @ 2016-04-21  8:40 UTC (permalink / raw)
  To: linux-arm-kernel

On 04/20/2016 07:28 PM, Robin Murphy wrote:
> On 19/04/16 17:56, Eric Auger wrote:
>> Introduce iommu_msi_mapping_translate_msg whose role consists in
>> detecting whether the device's MSIs must to be mapped into an IOMMU.
>> It case it must, the function overrides the MSI msg originally composed
>> and replaces the doorbell's PA by a pre-allocated and pre-mapped reserved
>> IOVA. In case the corresponding PA region is not found, the function
>> returns an error.
>>
>> This function is likely to be called in some code that cannot sleep. This
>> is the reason why the allocation/mapping is done separately.
>>
>> Signed-off-by: Eric Auger <eric.auger@linaro.org>
>>
>> ---
>>
>> v7: creation
>> ---
>>   drivers/iommu/dma-reserved-iommu.c | 69
>> ++++++++++++++++++++++++++++++++++++++
>>   include/linux/dma-reserved-iommu.h | 27 +++++++++++++++
>>   2 files changed, 96 insertions(+)
>>
>> diff --git a/drivers/iommu/dma-reserved-iommu.c
>> b/drivers/iommu/dma-reserved-iommu.c
>> index 907a17f..603ee45 100644
>> --- a/drivers/iommu/dma-reserved-iommu.c
>> +++ b/drivers/iommu/dma-reserved-iommu.c
>> @@ -18,6 +18,14 @@
>>   #include <linux/iommu.h>
>>   #include <linux/iova.h>
>>   #include <linux/msi.h>
>> +#include <linux/irq.h>
>> +
>> +#ifdef CONFIG_PHYS_ADDR_T_64BIT
>> +#define msg_to_phys_addr(msg) \
>> +    (((phys_addr_t)((msg)->address_hi) << 32) | (msg)->address_lo)
>> +#else
>> +#define msg_to_phys_addr(msg)    ((msg)->address_lo)
>> +#endif
>>
>>   struct reserved_iova_domain {
>>       struct iova_domain *iovad;
>> @@ -351,3 +359,64 @@ struct iommu_domain
>> *iommu_msi_mapping_desc_to_domain(struct msi_desc *desc)
>>       return d;
>>   }
>>   EXPORT_SYMBOL_GPL(iommu_msi_mapping_desc_to_domain);
>> +
>> +static dma_addr_t iommu_find_reserved_iova(struct iommu_domain *domain,
>> +                    phys_addr_t addr, size_t size)
>> +{
>> +    unsigned long  base_pfn, end_pfn, nb_iommu_pages, order, flags;
>> +    size_t iommu_page_size, binding_size;
>> +    struct iommu_reserved_binding *b;
>> +    phys_addr_t aligned_base, offset;
>> +    dma_addr_t iova = DMA_ERROR_CODE;
>> +    struct iova_domain *iovad;
>> +
>> +    spin_lock_irqsave(&domain->reserved_lock, flags);
>> +
>> +    iovad = (struct iova_domain *)domain->reserved_iova_cookie;
>> +
>> +    if (!iovad)
>> +        goto unlock;
>> +
>> +    order = iova_shift(iovad);
>> +    base_pfn = addr >> order;
>> +    end_pfn = (addr + size - 1) >> order;
>> +    aligned_base = base_pfn << order;
>> +    offset = addr - aligned_base;
>> +    nb_iommu_pages = end_pfn - base_pfn + 1;
>> +    iommu_page_size = 1 << order;
>> +    binding_size = nb_iommu_pages * iommu_page_size;
> 
> This all looks rather familiar...
> 
>> +    b = find_reserved_binding(domain, aligned_base, binding_size);
> 
> ...which implies that at least some of it should be factored into that guy.
ok. Besides, with your compact rewriting proposal, maybe it is not worth
anymore ;-)

Eric
> 
>> +    if (b && (b->addr <= aligned_base) &&
>> +        (aligned_base t + binding_size <=  b->addr + b->size))
>> +        iova = b->iova + offset + aligned_base - b->addr;
>> +unlock:
>> +    spin_unlock_irqrestore(&domain->reserved_lock, flags);
>> +    return iova;
>> +}
>> +
>> +int iommu_msi_mapping_translate_msg(struct irq_data *data, struct
>> msi_msg *msg)
>> +{
>> +    struct iommu_domain *d;
>> +    struct msi_desc *desc;
>> +    dma_addr_t iova;
>> +
>> +    desc = irq_data_get_msi_desc(data);
>> +    if (!desc)
>> +        return -EINVAL;
>> +
>> +    d = iommu_msi_mapping_desc_to_domain(desc);
>> +    if (!d)
>> +        return 0;
>> +
>> +    iova = iommu_find_reserved_iova(d, msg_to_phys_addr(msg),
>> +                    sizeof(phys_addr_t));
>> +
>> +    if (iova == DMA_ERROR_CODE)
>> +        return -EINVAL;
>> +
>> +    msg->address_lo = lower_32_bits(iova);
>> +    msg->address_hi = upper_32_bits(iova);
>> +    return 0;
>> +}
>> +EXPORT_SYMBOL_GPL(iommu_msi_mapping_translate_msg);
>> diff --git a/include/linux/dma-reserved-iommu.h
>> b/include/linux/dma-reserved-iommu.h
>> index 8373929..04e1912f 100644
>> --- a/include/linux/dma-reserved-iommu.h
>> +++ b/include/linux/dma-reserved-iommu.h
>> @@ -20,6 +20,8 @@
>>
>>   struct iommu_domain;
>>   struct msi_desc;
>> +struct irq_data;
>> +struct msi_msg;
>>
>>   #ifdef CONFIG_IOMMU_DMA_RESERVED
>>
>> @@ -82,6 +84,25 @@ void iommu_put_reserved_iova(struct iommu_domain
>> *domain, phys_addr_t addr);
>>    */
>>   struct iommu_domain *iommu_msi_mapping_desc_to_domain(struct
>> msi_desc *desc);
>>
>> +/**
>> + * iommu_msi_mapping_translate_msg: in case the MSI transaction is
>> translated
>> + * by an IOMMU, the msg address must be an IOVA instead of a physical
>> address.
>> + * This function overwrites the original MSI message, which contains the
>> + * doorbell physical address resulting from the primary composition, with
>> + * the doorbell IOVA.
>> + *
>> + * The doorbell physical address must have been previously bound to an
>> + * IOVA using iommu_get_reserved_iova().
>> + *
>> + * @data: irq data handle
>> + * @msg: original msi message containing the PA to be overwritten with
>> + * the IOVA
>> + *
>> + * return 0 if the MSI does not need to be mapped or if the PA/IOVA
>> + * were successfully swapped; return -EINVAL if the addresses need
>> + * to be swapped but no IOMMU binding is found
>> + */
>> +int iommu_msi_mapping_translate_msg(struct irq_data *data, struct
>> msi_msg *msg);
>> +
>>   #else
>>
>>   static inline int
>> @@ -111,5 +132,11 @@ iommu_msi_mapping_desc_to_domain(struct msi_desc
>> *desc)
>>       return NULL;
>>   }
>>
>> +static inline int iommu_msi_mapping_translate_msg(struct irq_data *data,
>> +                          struct msi_msg *msg)
>> +{
>> +    return 0;
>> +}
>> +
>>   #endif    /* CONFIG_IOMMU_DMA_RESERVED */
>>   #endif    /* __DMA_RESERVED_IOMMU_H */
>>
> 
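
For context, the expected call site of iommu_msi_mapping_translate_msg
is the irq-chip MSI composition path. A minimal sketch follows; the foo_
driver, its doorbell address and the use of irqd_to_hwirq() for the
payload are all illustrative:

#define FOO_DOORBELL_PA	0x08020040UL	/* illustrative doorbell PA */

static void foo_irq_compose_msi_msg(struct irq_data *data,
				    struct msi_msg *msg)
{
	/* primary composition: fill in the doorbell PA and payload */
	msg->address_lo = lower_32_bits(FOO_DOORBELL_PA);
	msg->address_hi = upper_32_bits(FOO_DOORBELL_PA);
	msg->data = irqd_to_hwirq(data);

	/* swap the PA for its IOVA whenever an IOMMU translates the MSI */
	if (iommu_msi_mapping_translate_msg(data, msg))
		pr_warn("MSI doorbell has no IOMMU binding\n");
}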

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v7 08/10] iommu/dma-reserved_iommu: iommu_msi_mapping_desc_to_domain
@ 2016-04-21  8:40       ` Eric Auger
  0 siblings, 0 replies; 127+ messages in thread
From: Eric Auger @ 2016-04-21  8:40 UTC (permalink / raw)
  To: Robin Murphy, eric.auger, alex.williamson, will.deacon, joro,
	tglx, jason, marc.zyngier, christoffer.dall, linux-arm-kernel
  Cc: patches, linux-kernel, Bharat.Bhushan, pranav.sawargaonkar,
	p.fedin, iommu, Jean-Philippe.Brucker, julien.grall

Hi Robin,
On 04/20/2016 07:19 PM, Robin Murphy wrote:
> On 19/04/16 17:56, Eric Auger wrote:
>> This function checks whether
>> - the device emitting the MSI belongs to a non default iommu domain
>> - the iommu domain requires the MSI address to be mapped.
>>
>> If those conditions are met, the function returns the iommu domain
>> to be used for mapping the MSI doorbell; else it returns NULL.
>>
>> Signed-off-by: Eric Auger <eric.auger@linaro.org>
>> ---
>>   drivers/iommu/dma-reserved-iommu.c | 19 +++++++++++++++++++
>>   include/linux/dma-reserved-iommu.h | 18 ++++++++++++++++++
>>   2 files changed, 37 insertions(+)
>>
>> diff --git a/drivers/iommu/dma-reserved-iommu.c
>> b/drivers/iommu/dma-reserved-iommu.c
>> index 2522235..907a17f 100644
>> --- a/drivers/iommu/dma-reserved-iommu.c
>> +++ b/drivers/iommu/dma-reserved-iommu.c
>> @@ -17,6 +17,7 @@
>>
>>   #include <linux/iommu.h>
>>   #include <linux/iova.h>
>> +#include <linux/msi.h>
>>
>>   struct reserved_iova_domain {
>>       struct iova_domain *iovad;
>> @@ -332,3 +333,21 @@ unlock:
>>   }
>>   EXPORT_SYMBOL_GPL(iommu_put_reserved_iova);
>>
>> +struct iommu_domain *iommu_msi_mapping_desc_to_domain(struct msi_desc
>> *desc)
>> +{
>> +    struct device *dev;
>> +    struct iommu_domain *d;
>> +
>> +    dev = msi_desc_to_dev(desc);
>> +
>> +    d = iommu_get_domain_for_dev(dev);
>> +
>> +    if (!d || (d->type == IOMMU_DOMAIN_DMA))
>> +        return NULL;
>> +
>> +    if (iommu_domain_get_attr(d, DOMAIN_ATTR_MSI_MAPPING, NULL))
>> +        return NULL;
> 
> Yeah, I don't see why we couldn't just use
> 
> if (domain->ops->capable(IOMMU_CAP_INTR_REMAP))
>     return NULL
I don't think this works. This will lead to MSI iommu mapping on x86
when irq_remapping is disabled. To be further checked though.

Eric
> 
> there instead.
> 
>> +
>> +    return d;
>> +}
>> +EXPORT_SYMBOL_GPL(iommu_msi_mapping_desc_to_domain);
>> diff --git a/include/linux/dma-reserved-iommu.h
>> b/include/linux/dma-reserved-iommu.h
>> index 8722131..8373929 100644
>> --- a/include/linux/dma-reserved-iommu.h
>> +++ b/include/linux/dma-reserved-iommu.h
>> @@ -19,6 +19,7 @@
>>   #include <linux/kernel.h>
>>
>>   struct iommu_domain;
>> +struct msi_desc;
>>
>>   #ifdef CONFIG_IOMMU_DMA_RESERVED
>>
>> @@ -70,6 +71,17 @@ int iommu_get_reserved_iova(struct iommu_domain
>> *domain,
>>    * if the binding ref count is null, destroy the reserved mapping
>>    */
>>   void iommu_put_reserved_iova(struct iommu_domain *domain,
>> phys_addr_t addr);
>> +
>> +/**
>> + * iommu_msi_mapping_desc_to_domain: in case the MSI originates from
>> a device
>> + * upstream to an IOMMU and this IOMMU translates the MSI transaction,
>> + * this function returns the iommu domain the MSI doorbell address
>> must be
>> + * mapped in. Else it returns NULL.
>> + *
>> + * @desc: msi desc handle
>> + */
>> +struct iommu_domain *iommu_msi_mapping_desc_to_domain(struct msi_desc
>> *desc);
>> +
>>   #else
>>
>>   static inline int
>> @@ -93,5 +105,11 @@ static inline int iommu_get_reserved_iova(struct
>> iommu_domain *domain,
>>   static inline void iommu_put_reserved_iova(struct iommu_domain *domain,
>>                          phys_addr_t addr) {}
>>
>> +static inline struct iommu_domain *
>> +iommu_msi_mapping_desc_to_domain(struct msi_desc *desc)
>> +{
>> +    return NULL;
>> +}
>> +
>>   #endif    /* CONFIG_IOMMU_DMA_RESERVED */
>>   #endif    /* __DMA_RESERVED_IOMMU_H */
>>
> 
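
For completeness, the consumer pattern on the MSI-layer side would be
roughly the following (sketch only; the foo_ helper and its
doorbell_pa/doorbell_size parameters are illustrative, and
iommu_get_reserved_iova() comes from patch 06/10):

static int foo_map_msi_doorbell(struct msi_desc *desc,
				phys_addr_t doorbell_pa,
				size_t doorbell_size)
{
	struct iommu_domain *d;
	dma_addr_t iova;

	d = iommu_msi_mapping_desc_to_domain(desc);
	if (!d)
		return 0;	/* no IOMMU mapping needed for this MSI */

	/* map (or reuse) a reserved iova covering the doorbell */
	return iommu_get_reserved_iova(d, doorbell_pa, doorbell_size,
				       IOMMU_WRITE, &iova);
}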

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v7 07/10] iommu/dma-reserved-iommu: delete bindings in iommu_free_reserved_iova_domain
@ 2016-04-21  8:40       ` Eric Auger
  0 siblings, 0 replies; 127+ messages in thread
From: Eric Auger @ 2016-04-21  8:40 UTC (permalink / raw)
  To: Robin Murphy, eric.auger, alex.williamson, will.deacon, joro,
	tglx, jason, marc.zyngier, christoffer.dall, linux-arm-kernel
  Cc: patches, linux-kernel, Bharat.Bhushan, pranav.sawargaonkar,
	p.fedin, iommu, Jean-Philippe.Brucker, julien.grall

Hi,
On 04/20/2016 07:05 PM, Robin Murphy wrote:
> On 19/04/16 17:56, Eric Auger wrote:
>> Now that reserved bindings can exist, destroy them when destroying
>> the reserved iova domain. iommu_map is not supposed to be atomic,
>> hence the extra complexity in the locking.
>>
>> Signed-off-by: Eric Auger <eric.auger@linaro.org>
>>
>> ---
>> v6 -> v7:
>> - remove [PATCH v6 7/7] dma-reserved-iommu: iommu_unmap_reserved and
>>    destroy the bindings in iommu_free_reserved_iova_domain
>>
>> v5 -> v6:
>> - use spin_lock instead of mutex
>>
>> v3 -> v4:
>> - previously "iommu/arm-smmu: relinquish reserved resources on
>>    domain deletion"
>> ---
>>   drivers/iommu/dma-reserved-iommu.c | 34
>> ++++++++++++++++++++++++++++------
>>   1 file changed, 28 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/iommu/dma-reserved-iommu.c
>> b/drivers/iommu/dma-reserved-iommu.c
>> index 426d339..2522235 100644
>> --- a/drivers/iommu/dma-reserved-iommu.c
>> +++ b/drivers/iommu/dma-reserved-iommu.c
>> @@ -157,14 +157,36 @@ void iommu_free_reserved_iova_domain(struct
>> iommu_domain *domain)
>>       unsigned long flags;
>>       int ret = 0;
>>
>> -    spin_lock_irqsave(&domain->reserved_lock, flags);
>> -
>> -    rid = (struct reserved_iova_domain *)domain->reserved_iova_cookie;
>> -    if (!rid) {
>> -        ret = -EINVAL;
>> -        goto unlock;
>> +    while (1) {
>> +        struct iommu_reserved_binding *b;
>> +        struct rb_node *node;
>> +        dma_addr_t iova;
>> +        size_t size;
>> +
>> +        spin_lock_irqsave(&domain->reserved_lock, flags);
>> +
>> +        rid = (struct reserved_iova_domain *)
>> +                domain->reserved_iova_cookie;
> 
> Same comment about casting as before.
OK
> 
>> +        if (!rid) {
>> +            ret = -EINVAL;
>> +            goto unlock;
>> +        }
>> +
>> +        node = rb_first(&domain->reserved_binding_list);
>> +        if (!node)
>> +            break;
>> +        b = rb_entry(node, struct iommu_reserved_binding, node);
>> +
>> +        iova = b->iova;
>> +        size = b->size;
>> +
>> +        while (!kref_put(&b->kref, reserved_binding_release))
>> +            ;
> 
> Since you're freeing the thing anyway, why not just call the function
> directly?
makes sense indeed.
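
i.e. something like this in the teardown loop (sketch):

	b = rb_entry(node, struct iommu_reserved_binding, node);
	iova = b->iova;
	size = b->size;
	/* teardown owns the remaining references: release directly
	 * instead of draining the kref */
	reserved_binding_release(&b->kref);
	spin_unlock_irqrestore(&domain->reserved_lock, flags);
	iommu_unmap(domain, iova, size);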

Thanks

Eric
> 
>> +        spin_unlock_irqrestore(&domain->reserved_lock, flags);
>> +        iommu_unmap(domain, iova, size);
>>       }
>>
>> +    domain->reserved_binding_list = RB_ROOT;
>>       domain->reserved_iova_cookie = NULL;
>>   unlock:
>>       spin_unlock_irqrestore(&domain->reserved_lock, flags);
>>
> 

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v7 06/10] iommu/dma-reserved-iommu: iommu_get/put_reserved_iova
@ 2016-04-21  8:43       ` Eric Auger
  0 siblings, 0 replies; 127+ messages in thread
From: Eric Auger @ 2016-04-21  8:43 UTC (permalink / raw)
  To: Robin Murphy, eric.auger, alex.williamson, will.deacon, joro,
	tglx, jason, marc.zyngier, christoffer.dall, linux-arm-kernel
  Cc: patches, linux-kernel, Bharat.Bhushan, pranav.sawargaonkar,
	p.fedin, iommu, Jean-Philippe.Brucker, julien.grall

Hi Robin,
On 04/20/2016 06:58 PM, Robin Murphy wrote:
> On 19/04/16 17:56, Eric Auger wrote:
>> This patch introduces iommu_get/put_reserved_iova.
>>
>> iommu_get_reserved_iova allows the caller to iommu-map a contiguous
>> physical region onto a reserved contiguous IOVA region. The physical
>> region base address does not need to be iommu page size aligned. iova
>> pages are allocated and mapped so that they cover the whole physical
>> region. This mapping is tracked as a whole (and cannot be split) in an
>> RB tree indexed by PA.
>>
>> In case a mapping already exists for the physical pages, the IOVA mapped
>> to the PA base is directly returned.
>>
>> Each time the get succeeds a binding ref count is incremented.
>>
>> iommu_put_reserved_iova decrements the ref count and, when it reaches
>> zero, the mapping is destroyed and the iovas are released.
>>
>> Signed-off-by: Eric Auger <eric.auger@linaro.org>
>>
>> ---
>> v7:
>> - change title and rework commit message with new name of the functions
>>    and size parameter
>> - fix locking
>> - rework header doc comments
>> - put now takes a phys_addr_t
>> - check prot argument against reserved_iova_domain prot flags
>>
>> v5 -> v6:
>> - revisit locking with spin_lock instead of mutex
>> - do not kref_get on 1st get
>> - add size parameter to the get function following Marc's request
>> - use the iova domain shift instead of using the smallest supported
>> page size
>>
>> v3 -> v4:
>> - formerly in iommu: iommu_get/put_single_reserved &
>>    iommu/arm-smmu: implement iommu_get/put_single_reserved
>> - Attempted to address Marc's doubts about missing size/alignment
>>    at VFIO level (user-space knows the IOMMU page size and the number
>>    of IOVA pages to provision)
>>
>> v2 -> v3:
>> - remove static implementation of iommu_get_single_reserved &
>>    iommu_put_single_reserved when CONFIG_IOMMU_API is not set
>>
>> v1 -> v2:
>> - previously a VFIO API, named vfio_alloc_map/unmap_free_reserved_iova
>> ---
>>   drivers/iommu/dma-reserved-iommu.c | 150
>> +++++++++++++++++++++++++++++++++++++
>>   include/linux/dma-reserved-iommu.h |  38 ++++++++++
>>   2 files changed, 188 insertions(+)
>>
>> diff --git a/drivers/iommu/dma-reserved-iommu.c
>> b/drivers/iommu/dma-reserved-iommu.c
>> index f6fa18e..426d339 100644
>> --- a/drivers/iommu/dma-reserved-iommu.c
>> +++ b/drivers/iommu/dma-reserved-iommu.c
>> @@ -135,6 +135,22 @@ unlock:
>>   }
>>   EXPORT_SYMBOL_GPL(iommu_alloc_reserved_iova_domain);
>>
>> +/* called with domain's reserved_lock held */
>> +static void reserved_binding_release(struct kref *kref)
>> +{
>> +    struct iommu_reserved_binding *b =
>> +        container_of(kref, struct iommu_reserved_binding, kref);
>> +    struct iommu_domain *d = b->domain;
>> +    struct reserved_iova_domain *rid =
>> +        (struct reserved_iova_domain *)d->reserved_iova_cookie;
> 
> Either it's a void *, in which case you don't need to cast it, or it
> should be the appropriate type as I mentioned earlier, in which case you
> still wouldn't need to cast it.
ok
> 
>> +    unsigned long order;
>> +
>> +    order = iova_shift(rid->iovad);
>> +    free_iova(rid->iovad, b->iova >> order);
> 
> iova_pfn() ?
ok
> 
>> +    unlink_reserved_binding(d, b);
>> +    kfree(b);
>> +}
>> +
>>   void iommu_free_reserved_iova_domain(struct iommu_domain *domain)
>>   {
>>       struct reserved_iova_domain *rid;
>> @@ -160,3 +176,137 @@ unlock:
>>       }
>>   }
>>   EXPORT_SYMBOL_GPL(iommu_free_reserved_iova_domain);
>> +
>> +int iommu_get_reserved_iova(struct iommu_domain *domain,
>> +                  phys_addr_t addr, size_t size, int prot,
>> +                  dma_addr_t *iova)
>> +{
>> +    unsigned long base_pfn, end_pfn, nb_iommu_pages, order, flags;
>> +    struct iommu_reserved_binding *b, *newb;
>> +    size_t iommu_page_size, binding_size;
>> +    phys_addr_t aligned_base, offset;
>> +    struct reserved_iova_domain *rid;
>> +    struct iova_domain *iovad;
>> +    struct iova *p_iova;
>> +    int ret = -EINVAL;
>> +
>> +    newb = kzalloc(sizeof(*newb), GFP_KERNEL);
>> +    if (!newb)
>> +        return -ENOMEM;
>> +
>> +    spin_lock_irqsave(&domain->reserved_lock, flags);
>> +
>> +    rid = (struct reserved_iova_domain *)domain->reserved_iova_cookie;
>> +    if (!rid)
>> +        goto free_newb;
>> +
>> +    if (((prot & IOMMU_READ) && !(rid->prot & IOMMU_READ)) ||
>> +        ((prot & IOMMU_WRITE) && !(rid->prot & IOMMU_WRITE)))
> 
> Are devices wanting to read from MSI doorbells really a thing?
the rationale is that Alex asked for the VFIO DMA MAP prot flag to be
propagated down to this API (for genericity). The latter is stored in
rid->prot, and I was just checking that the iova is mapped according to
the direction the userspace expects.
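
The test itself could probably be written more compactly, e.g. (sketch):

	/* reject any requested direction the reserved domain lacks */
	if (prot & ~rid->prot & (IOMMU_READ | IOMMU_WRITE))
		goto free_newb;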
> 
>> +        goto free_newb;
>> +
>> +    iovad = rid->iovad;
>> +    order = iova_shift(iovad);
>> +    base_pfn = addr >> order;
>> +    end_pfn = (addr + size - 1) >> order;
>> +    aligned_base = base_pfn << order;
>> +    offset = addr - aligned_base;
>> +    nb_iommu_pages = end_pfn - base_pfn + 1;
>> +    iommu_page_size = 1 << order;
>> +    binding_size = nb_iommu_pages * iommu_page_size;
> 
> offset = iova_offset(iovad, addr);
> aligned_base = addr - offset;
> binding_size = iova_align(iovad, size + offset);
> 
> Am I right?
Looks so. Will further test it. Thanks
> 
>> +
>> +    b = find_reserved_binding(domain, aligned_base, binding_size);
>> +    if (b) {
>> +        *iova = b->iova + offset + aligned_base - b->addr;
>> +        kref_get(&b->kref);
>> +        ret = 0;
>> +        goto free_newb;
>> +    }
>> +
>> +    p_iova = alloc_iova(iovad, nb_iommu_pages,
>> +                iovad->dma_32bit_pfn, true);
>> +    if (!p_iova) {
>> +        ret = -ENOMEM;
>> +        goto free_newb;
>> +    }
>> +
>> +    *iova = iova_dma_addr(iovad, p_iova);
>> +
>> +    /* unlock to call iommu_map which is not guaranteed to be atomic */
> 
> Hmm, that's concerning, because the ARM DMA mapping code, and
> consequently the iommu-dma layer, has always relied on it being so. On
> brief inspection, it looks to be only the AMD IOMMU doing something
> obviously non-atomic (taking a mutex) in its map callback, but then that
> has a separate DMA ops implementation. It doesn't look like it would be
> too intrusive to change, either, but that's an idea for its own thread.
Yes. Making no assumption about the atomicity of the iommu_map/unmap ops
brought some extra complexity here. It also forced me to separate the
iova alloc/map step from the iommu "binding" lookup. Now that it is
done, though, I think it brings some added value: typically, the
irq-chip ops we introduced to retrieve the doorbell characteristics also
makes it possible to enumerate how many doorbells there are.
> 
>> +    spin_unlock_irqrestore(&domain->reserved_lock, flags);
>> +
>> +    ret = iommu_map(domain, *iova, aligned_base, binding_size, prot);
>> +
>> +    spin_lock_irqsave(&domain->reserved_lock, flags);
>> +
>> +    rid = (struct reserved_iova_domain *) domain->reserved_iova_cookie;
>> +    if (!rid || (rid->iovad != iovad)) {
>> +        /* reserved iova domain was destroyed in our back */
> 
> That that could happen at all is terrifying! Surely the reserved domain
> should be set up immediately after iommu_domain_alloc() and torn down
> immediately before iommu_domain_free(). Things going missing while a
> domain is live smacks of horrible brokenness.
The VFIO user client creates the "reserved iova domain" using the vfio
VFIO_IOMMU_MAP_DMA ioctl. This can happen anytime after the iommu domain
creation (on the VFIO_SET_IOMMU ioctl). The user-space is currently also
allowed to unregister this iova domain at any time. I think this is
wrong: I should have two reserved iova domain destroy functions, one
used by user-space and one used by the kernel. In the user-space
implementation I should reject any attempt to destroy the reserved iova
domain while there are existing bindings.
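
i.e. something like (the second entry point is hypothetical):

	/* kernel path, called on iommu domain destruction:
	 * unconditionally tears the reserved iova domain down */
	void iommu_free_reserved_iova_domain(struct iommu_domain *domain);

	/* userspace path, called from the VFIO unmap ioctl: would return
	 * -EBUSY while MSI bindings still reference the iova domain */
	int iommu_unregister_reserved_iova_domain(struct iommu_domain *domain);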

> 
>> +        ret = -EBUSY;
>> +        goto free_newb; /* iova already released */
>> +    }
>> +
>> +    /* no change in iova reserved domain but iommu_map failed */
>> +    if (ret)
>> +        goto free_iova;
>> +
>> +    /* everything is fine, add in the new node in the rb tree */
>> +    kref_init(&newb->kref);
>> +    newb->domain = domain;
>> +    newb->addr = aligned_base;
>> +    newb->iova = *iova;
>> +    newb->size = binding_size;
>> +
>> +    link_reserved_binding(domain, newb);
>> +
>> +    *iova += offset;
>> +    goto unlock;
>> +
>> +free_iova:
>> +    free_iova(rid->iovad, p_iova->pfn_lo);
>> +free_newb:
>> +    kfree(newb);
>> +unlock:
>> +    spin_unlock_irqrestore(&domain->reserved_lock, flags);
>> +    return ret;
>> +}
>> +EXPORT_SYMBOL_GPL(iommu_get_reserved_iova);
>> +
>> +void iommu_put_reserved_iova(struct iommu_domain *domain, phys_addr_t
>> addr)
>> +{
>> +    phys_addr_t aligned_addr, page_size, mask;
>> +    struct iommu_reserved_binding *b;
>> +    struct reserved_iova_domain *rid;
>> +    unsigned long order, flags;
>> +    struct iommu_domain *d;
>> +    dma_addr_t iova;
>> +    size_t size;
>> +    int ret = 0;
>> +
>> +    spin_lock_irqsave(&domain->reserved_lock, flags);
>> +
>> +    rid = (struct reserved_iova_domain *)domain->reserved_iova_cookie;
>> +    if (!rid)
>> +        goto unlock;
>> +
>> +    order = iova_shift(rid->iovad);
>> +    page_size = (uint64_t)1 << order;
>> +    mask = page_size - 1;
>> +    aligned_addr = addr & ~mask;
> 
> addr & ~iova_mask(rid->iovad)
OK
> 
>> +
>> +    b = find_reserved_binding(domain, aligned_addr, page_size);
>> +    if (!b)
>> +        goto unlock;
>> +
>> +    iova = b->iova;
>> +    size = b->size;
>> +    d = b->domain;
>> +
>> +    ret = kref_put(&b->kref, reserved_binding_release);
>> +
>> +unlock:
>> +    spin_unlock_irqrestore(&domain->reserved_lock, flags);
>> +    if (ret)
>> +        iommu_unmap(d, iova, size);
>> +}
>> +EXPORT_SYMBOL_GPL(iommu_put_reserved_iova);
>> +
>> diff --git a/include/linux/dma-reserved-iommu.h
>> b/include/linux/dma-reserved-iommu.h
>> index 01ec385..8722131 100644
>> --- a/include/linux/dma-reserved-iommu.h
>> +++ b/include/linux/dma-reserved-iommu.h
>> @@ -42,6 +42,34 @@ int iommu_alloc_reserved_iova_domain(struct
>> iommu_domain *domain,
>>    */
>>   void iommu_free_reserved_iova_domain(struct iommu_domain *domain);
>>
>> +/**
>> + * iommu_get_reserved_iova: allocate a contiguous set of iova pages and
>> + * map them to the physical range defined by @addr and @size.
>> + *
>> + * @domain: iommu domain handle
>> + * @addr: physical address to bind
>> + * @size: size of the binding
>> + * @prot: mapping protection attribute
>> + * @iova: returned iova
>> + *
>> + * Mapped physical pfns are within [@addr >> order, (@addr + size - 1)
>> + * >> order], where order corresponds to the reserved iova domain order.
>> + * This mapping is tracked and reference counted with the minimal
>> + * granularity of @size.
>> + */
>> +int iommu_get_reserved_iova(struct iommu_domain *domain,
>> +                phys_addr_t addr, size_t size, int prot,
>> +                dma_addr_t *iova);
>> +
>> +/**
>> + * iommu_put_reserved_iova: decrement a ref count of the reserved
>> mapping
>> + *
>> + * @domain: iommu domain handle
>> + * @addr: physical address whose binding ref count is decremented
>> + *
>> + * if the binding ref count is null, destroy the reserved mapping
>> + */
>> +void iommu_put_reserved_iova(struct iommu_domain *domain, phys_addr_t
>> addr);
>>   #else
>>
>>   static inline int
>> @@ -55,5 +83,15 @@ iommu_alloc_reserved_iova_domain(struct
>> iommu_domain *domain,
>>   static inline void
>>   iommu_free_reserved_iova_domain(struct iommu_domain *domain) {}
>>
>> +static inline int iommu_get_reserved_iova(struct iommu_domain *domain,
>> +                      phys_addr_t addr, size_t size,
>> +                      int prot, dma_addr_t *iova)
>> +{
>> +    return -ENOENT;
>> +}
>> +
>> +static inline void iommu_put_reserved_iova(struct iommu_domain *domain,
>> +                       phys_addr_t addr) {}
>> +
>>   #endif    /* CONFIG_IOMMU_DMA_RESERVED */
>>   #endif    /* __DMA_RESERVED_IOMMU_H */
>>
> 
> I worry that this all falls into the trap of trying too hard to abstract
> something which doesn't need abstracting. AFAICS all we need is
> something for VFIO to keep track of its own IOVA usage vs. userspace's,
> plus a list of MSI descriptors (with IOVAs) wrapped in refcounts hanging
> off the iommu_domain, with a handful of functions to manage them. The
> former is as good as solved already - stick an iova_domain or even just
> a bitmap in the iova_cookie and use it directly - and the latter would
> actually be reusable elsewhere (e.g. for iommu-dma domains). What I'm
> seeing here is layers upon layers of complexity with no immediate
> justification, that's 'generic' enough to not directly solve the problem
> at hand, but in a way that still makes it more or less unusable for
> solving equivalent problems elsewhere.
> 
> Since I don't like that everything I have to say about this series so
> far seems negative, I'll plan to spend some time next week having a go
> at hardening my 50-line proof-of-concept for stage 1 MSIs, and see if I
> can offer code instead of criticism :)
No worries. I really appreciate the time you've already spent reading
this code ;-) I acknowledge it is a lot of trouble for mapping a single
page (in my case)! Anyway I will take your comments into account and
simplify things accordingly. Let's see how we can converge...

Best Regards

Eric
> 
> Robin.

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v7 06/10] iommu/dma-reserved-iommu: iommu_get/put_reserved_iova
@ 2016-04-21  8:43       ` Eric Auger
  0 siblings, 0 replies; 127+ messages in thread
From: Eric Auger @ 2016-04-21  8:43 UTC (permalink / raw)
  To: Robin Murphy, eric.auger-qxv4g6HH51o,
	alex.williamson-H+wXaHxf7aLQT0dZR+AlfA, will.deacon-5wv7dgnIgG8,
	joro-zLv9SwRftAIdnm+yROfE0A, tglx-hfZtesqFncYOwBW4kG4KsQ,
	jason-NLaQJdtUoK4Be96aLqz0jA, marc.zyngier-5wv7dgnIgG8,
	christoffer.dall-QSEj5FYQhm4dnm+yROfE0A,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r
  Cc: julien.grall-5wv7dgnIgG8, patches-QSEj5FYQhm4dnm+yROfE0A,
	p.fedin-Sze3O3UU22JBDgjK7y7TUQ,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	pranav.sawargaonkar-Re5JQEeQqe8AvxtiuMwx3w

Hi Robin,
On 04/20/2016 06:58 PM, Robin Murphy wrote:
> On 19/04/16 17:56, Eric Auger wrote:
>> This patch introduces iommu_get/put_reserved_iova.
>>
>> iommu_get_reserved_iova allows to iommu map a contiguous physical region
>> onto a reserved contiguous IOVA region. The physical region base address
>> does not need to be iommu page size aligned. iova pages are allocated and
>> mapped so that they cover all the physical region. This mapping is
>> tracked as a whole (and cannot be split) in an RB tree indexed by PA.
>>
>> In case a mapping already exists for the physical pages, the IOVA mapped
>> to the PA base is directly returned.
>>
>> Each time the get succeeds a binding ref count is incremented.
>>
>> iommu_put_reserved_iova decrements the ref count and when this latter
>> is null, the mapping is destroyed and the iovas are released.
>>
>> Signed-off-by: Eric Auger <eric.auger-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
>>
>> ---
>> v7:
>> - change title and rework commit message with new name of the functions
>>    and size parameter
>> - fix locking
>> - rework header doc comments
>> - put now takes a phys_addr_t
>> - check prot argument against reserved_iova_domain prot flags
>>
>> v5 -> v6:
>> - revisit locking with spin_lock instead of mutex
>> - do not kref_get on 1st get
>> - add size parameter to the get function following Marc's request
>> - use the iova domain shift instead of using the smallest supported
>> page size
>>
>> v3 -> v4:
>> - formerly in iommu: iommu_get/put_single_reserved &
>>    iommu/arm-smmu: implement iommu_get/put_single_reserved
>> - Attempted to address Marc's doubts about missing size/alignment
>>    at VFIO level (user-space knows the IOMMU page size and the number
>>    of IOVA pages to provision)
>>
>> v2 -> v3:
>> - remove static implementation of iommu_get_single_reserved &
>>    iommu_put_single_reserved when CONFIG_IOMMU_API is not set
>>
>> v1 -> v2:
>> - previously a VFIO API, named vfio_alloc_map/unmap_free_reserved_iova
>> ---
>>   drivers/iommu/dma-reserved-iommu.c | 150
>> +++++++++++++++++++++++++++++++++++++
>>   include/linux/dma-reserved-iommu.h |  38 ++++++++++
>>   2 files changed, 188 insertions(+)
>>
>> diff --git a/drivers/iommu/dma-reserved-iommu.c
>> b/drivers/iommu/dma-reserved-iommu.c
>> index f6fa18e..426d339 100644
>> --- a/drivers/iommu/dma-reserved-iommu.c
>> +++ b/drivers/iommu/dma-reserved-iommu.c
>> @@ -135,6 +135,22 @@ unlock:
>>   }
>>   EXPORT_SYMBOL_GPL(iommu_alloc_reserved_iova_domain);
>>
>> +/* called with domain's reserved_lock held */
>> +static void reserved_binding_release(struct kref *kref)
>> +{
>> +    struct iommu_reserved_binding *b =
>> +        container_of(kref, struct iommu_reserved_binding, kref);
>> +    struct iommu_domain *d = b->domain;
>> +    struct reserved_iova_domain *rid =
>> +        (struct reserved_iova_domain *)d->reserved_iova_cookie;
> 
> Either it's a void *, in which case you don't need to cast it, or it
> should be the appropriate type as I mentioned earlier, in which case you
> still wouldn't need to cast it.
ok
> 
>> +    unsigned long order;
>> +
>> +    order = iova_shift(rid->iovad);
>> +    free_iova(rid->iovad, b->iova >> order);
> 
> iova_pfn() ?
ok
> 
>> +    unlink_reserved_binding(d, b);
>> +    kfree(b);
>> +}
>> +
>>   void iommu_free_reserved_iova_domain(struct iommu_domain *domain)
>>   {
>>       struct reserved_iova_domain *rid;
>> @@ -160,3 +176,137 @@ unlock:
>>       }
>>   }
>>   EXPORT_SYMBOL_GPL(iommu_free_reserved_iova_domain);
>> +
>> +int iommu_get_reserved_iova(struct iommu_domain *domain,
>> +                  phys_addr_t addr, size_t size, int prot,
>> +                  dma_addr_t *iova)
>> +{
>> +    unsigned long base_pfn, end_pfn, nb_iommu_pages, order, flags;
>> +    struct iommu_reserved_binding *b, *newb;
>> +    size_t iommu_page_size, binding_size;
>> +    phys_addr_t aligned_base, offset;
>> +    struct reserved_iova_domain *rid;
>> +    struct iova_domain *iovad;
>> +    struct iova *p_iova;
>> +    int ret = -EINVAL;
>> +
>> +    newb = kzalloc(sizeof(*newb), GFP_KERNEL);
>> +    if (!newb)
>> +        return -ENOMEM;
>> +
>> +    spin_lock_irqsave(&domain->reserved_lock, flags);
>> +
>> +    rid = (struct reserved_iova_domain *)domain->reserved_iova_cookie;
>> +    if (!rid)
>> +        goto free_newb;
>> +
>> +    if ((prot & IOMMU_READ & !(rid->prot & IOMMU_READ)) ||
>> +        (prot & IOMMU_WRITE & !(rid->prot & IOMMU_WRITE)))
> 
> Are devices wanting to read from MSI doorbells really a thing?
the rationale is Alex asked for propagating the VFIO DMA MAP prot flag
downto this API (genericity context). This later is stored in rid->prot
and I was just checking the iova was mapped according to the direction
the userspace expected.
> 
>> +        goto free_newb;
>> +
>> +    iovad = rid->iovad;
>> +    order = iova_shift(iovad);
>> +    base_pfn = addr >> order;
>> +    end_pfn = (addr + size - 1) >> order;
>> +    aligned_base = base_pfn << order;
>> +    offset = addr - aligned_base;
>> +    nb_iommu_pages = end_pfn - base_pfn + 1;
>> +    iommu_page_size = 1 << order;
>> +    binding_size = nb_iommu_pages * iommu_page_size;
> 
> offset = iova_offset(iovad, addr);
> aligned_base = addr - offset;
> binding_size = iova_align(iovad, size + offset);
> 
> Am I right?
Looks so. Will further test it. Thanks
> 
>> +
>> +    b = find_reserved_binding(domain, aligned_base, binding_size);
>> +    if (b) {
>> +        *iova = b->iova + offset + aligned_base - b->addr;
>> +        kref_get(&b->kref);
>> +        ret = 0;
>> +        goto free_newb;
>> +    }
>> +
>> +    p_iova = alloc_iova(iovad, nb_iommu_pages,
>> +                iovad->dma_32bit_pfn, true);
>> +    if (!p_iova) {
>> +        ret = -ENOMEM;
>> +        goto free_newb;
>> +    }
>> +
>> +    *iova = iova_dma_addr(iovad, p_iova);
>> +
>> +    /* unlock to call iommu_map which is not guaranteed to be atomic */
> 
> Hmm, that's concerning, because the ARM DMA mapping code, and
> consequently the iommu-dma layer, has always relied on it being so. On
> brief inspection, it looks to be only the AMD IOMMU doing something
> obviously non-atomic (taking a mutex) in its map callback, but then that
> has a separate DMA ops implementation. It doesn't look like it would be
> too intrusive to change, either, but that's an idea for its own thread.
yes. Making no hypothesis on the atomicity of iommu_map/unmap ops
brought some extra complexity here. Also it obliged to separate the
alloc/map from the iommu "binding" lookup. But now it is done I think it
brings some added value. Typically the fact we introduced an irq-chip
ops to retrieve the doorbells characteristics is valuable to enumerate
their number.
> 
>> +    spin_unlock_irqrestore(&domain->reserved_lock, flags);
>> +
>> +    ret = iommu_map(domain, *iova, aligned_base, binding_size, prot);
>> +
>> +    spin_lock_irqsave(&domain->reserved_lock, flags);
>> +
>> +    rid = (struct reserved_iova_domain *) domain->reserved_iova_cookie;
>> +    if (!rid || (rid->iovad != iovad)) {
>> +        /* reserved iova domain was destroyed in our back */
> 
> That that could happen at all is terrifying! Surely the reserved domain
> should be set up immediately after iommu_domain_alloc() and torn down
> immediately before iommu_domain_free(). Things going missing while a
> domain is live smacks of horrible brokenness.
The VFIO user client creates the "reserved iova domain" using the vfio
VFIO_IOMMU_MAP_DMA ioctl. This can happen anytime after the iommu domain
creation (on VFIO_SET_IOMMU ioctl). The user-space is currently allowed
to unregister this iova domain at any time too. I think this is wrong: I
should have 2 reserved iova domain destroy functions, one used by
user-space and one used by kernel. In the user-space implementation I
should reject any attempt to destroy the reserved iova domain until
there are existing bindings.

> 
>> +        ret = -EBUSY;
>> +        goto free_newb; /* iova already released */
>> +    }
>> +
>> +    /* no change in iova reserved domain but iommu_map failed */
>> +    if (ret)
>> +        goto free_iova;
>> +
>> +    /* everything is fine, add in the new node in the rb tree */
>> +    kref_init(&newb->kref);
>> +    newb->domain = domain;
>> +    newb->addr = aligned_base;
>> +    newb->iova = *iova;
>> +    newb->size = binding_size;
>> +
>> +    link_reserved_binding(domain, newb);
>> +
>> +    *iova += offset;
>> +    goto unlock;
>> +
>> +free_iova:
>> +    free_iova(rid->iovad, p_iova->pfn_lo);
>> +free_newb:
>> +    kfree(newb);
>> +unlock:
>> +    spin_unlock_irqrestore(&domain->reserved_lock, flags);
>> +    return ret;
>> +}
>> +EXPORT_SYMBOL_GPL(iommu_get_reserved_iova);
>> +
>> +void iommu_put_reserved_iova(struct iommu_domain *domain, phys_addr_t
>> addr)
>> +{
>> +    phys_addr_t aligned_addr, page_size, mask;
>> +    struct iommu_reserved_binding *b;
>> +    struct reserved_iova_domain *rid;
>> +    unsigned long order, flags;
>> +    struct iommu_domain *d;
>> +    dma_addr_t iova;
>> +    size_t size;
>> +    int ret = 0;
>> +
>> +    spin_lock_irqsave(&domain->reserved_lock, flags);
>> +
>> +    rid = (struct reserved_iova_domain *)domain->reserved_iova_cookie;
>> +    if (!rid)
>> +        goto unlock;
>> +
>> +    order = iova_shift(rid->iovad);
>> +    page_size = (uint64_t)1 << order;
>> +    mask = page_size - 1;
>> +    aligned_addr = addr & ~mask;
> 
> addr & ~iova_mask(rid->iovad)
OK
> 
>> +
>> +    b = find_reserved_binding(domain, aligned_addr, page_size);
>> +    if (!b)
>> +        goto unlock;
>> +
>> +    iova = b->iova;
>> +    size = b->size;
>> +    d = b->domain;
>> +
>> +    ret = kref_put(&b->kref, reserved_binding_release);
>> +
>> +unlock:
>> +    spin_unlock_irqrestore(&domain->reserved_lock, flags);
>> +    if (ret)
>> +        iommu_unmap(d, iova, size);
>> +}
>> +EXPORT_SYMBOL_GPL(iommu_put_reserved_iova);
>> +
>> diff --git a/include/linux/dma-reserved-iommu.h
>> b/include/linux/dma-reserved-iommu.h
>> index 01ec385..8722131 100644
>> --- a/include/linux/dma-reserved-iommu.h
>> +++ b/include/linux/dma-reserved-iommu.h
>> @@ -42,6 +42,34 @@ int iommu_alloc_reserved_iova_domain(struct
>> iommu_domain *domain,
>>    */
>>   void iommu_free_reserved_iova_domain(struct iommu_domain *domain);
>>
>> +/**
>> + * iommu_get_reserved_iova: allocate a contiguous set of iova pages and
>> + * map them to the physical range defined by @addr and @size.
>> + *
>> + * @domain: iommu domain handle
>> + * @addr: physical address to bind
>> + * @size: size of the binding
>> + * @prot: mapping protection attribute
>> + * @iova: returned iova
>> + *
>> + * Mapped physical pfns are within [@addr >> order, (@addr + size -1)
>> >> order]
>> + * where order corresponds to the reserved iova domain order.
>> + * This mapping is tracked and reference counted with the minimal
>> granularity
>> + * of @size.
>> + */
>> +int iommu_get_reserved_iova(struct iommu_domain *domain,
>> +                phys_addr_t addr, size_t size, int prot,
>> +                dma_addr_t *iova);
>> +
>> +/**
>> + * iommu_put_reserved_iova: decrement a ref count of the reserved
>> mapping
>> + *
>> + * @domain: iommu domain handle
>> + * @addr: physical address whose binding ref count is decremented
>> + *
>> + * if the binding ref count is null, destroy the reserved mapping
>> + */
>> +void iommu_put_reserved_iova(struct iommu_domain *domain, phys_addr_t
>> addr);
>>   #else
>>
>>   static inline int
>> @@ -55,5 +83,15 @@ iommu_alloc_reserved_iova_domain(struct
>> iommu_domain *domain,
>>   static inline void
>>   iommu_free_reserved_iova_domain(struct iommu_domain *domain) {}
>>
>> +static inline int iommu_get_reserved_iova(struct iommu_domain *domain,
>> +                      phys_addr_t addr, size_t size,
>> +                      int prot, dma_addr_t *iova)
>> +{
>> +    return -ENOENT;
>> +}
>> +
>> +static inline void iommu_put_reserved_iova(struct iommu_domain *domain,
>> +                       phys_addr_t addr) {}
>> +
>>   #endif    /* CONFIG_IOMMU_DMA_RESERVED */
>>   #endif    /* __DMA_RESERVED_IOMMU_H */
>>
> 
> I worry that this all falls into the trap of trying too hard to abstract
> something which doesn't need abstracting. AFAICS all we need is
> something for VFIO to keep track of its own IOVA usage vs. userspace's,
> plus a list of MSI descriptors (with IOVAs) wrapped in refcounts hanging
> off the iommu_domain, with a handful of functions to manage them. The
> former is as good as solved already - stick an iova_domain or even just
> a bitmap in the iova_cookie and use it directly - and the latter would
> actually be reusable elsewhere (e.g. for iommu-dma domains). What I'm
> seeing here is layers upon layers of complexity with no immediate
> justification, that's 'generic' enough to not directly solve the problem
> at hand, but in a way that still makes it more or less unusable for
> solving equivalent problems elsewhere.
> 
> Since I don't like that everything I have to say about this series so
> far seems negative, I'll plan to spend some time next week having a go
> at hardening my 50-line proof-of-concept for stage 1 MSIs, and see if I
> can offer code instead of criticism :)
No worries. I really appreciate the time you've already spent reading
this code ;-) I aknowledge it is a lot of trouble for mapping a single
page - in my case - ! Anyway I will take into account your comments and
simplify things accordingly. Let's see how we can converge...

Best Regards

Eric
> 
> Robin.

^ permalink raw reply	[flat|nested] 127+ messages in thread

* [PATCH v7 06/10] iommu/dma-reserved-iommu: iommu_get/put_reserved_iova
@ 2016-04-21  8:43       ` Eric Auger
  0 siblings, 0 replies; 127+ messages in thread
From: Eric Auger @ 2016-04-21  8:43 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Robin,
On 04/20/2016 06:58 PM, Robin Murphy wrote:
> On 19/04/16 17:56, Eric Auger wrote:
>> This patch introduces iommu_get/put_reserved_iova.
>>
>> iommu_get_reserved_iova allows to iommu map a contiguous physical region
>> onto a reserved contiguous IOVA region. The physical region base address
>> does not need to be iommu page size aligned. iova pages are allocated and
>> mapped so that they cover all the physical region. This mapping is
>> tracked as a whole (and cannot be split) in an RB tree indexed by PA.
>>
>> In case a mapping already exists for the physical pages, the IOVA mapped
>> to the PA base is directly returned.
>>
>> Each time the get succeeds a binding ref count is incremented.
>>
>> iommu_put_reserved_iova decrements the ref count and when this latter
>> is null, the mapping is destroyed and the iovas are released.
>>
>> Signed-off-by: Eric Auger <eric.auger@linaro.org>
>>
>> ---
>> v7:
>> - change title and rework commit message with new name of the functions
>>    and size parameter
>> - fix locking
>> - rework header doc comments
>> - put now takes a phys_addr_t
>> - check prot argument against reserved_iova_domain prot flags
>>
>> v5 -> v6:
>> - revisit locking with spin_lock instead of mutex
>> - do not kref_get on 1st get
>> - add size parameter to the get function following Marc's request
>> - use the iova domain shift instead of using the smallest supported
>> page size
>>
>> v3 -> v4:
>> - formerly in iommu: iommu_get/put_single_reserved &
>>    iommu/arm-smmu: implement iommu_get/put_single_reserved
>> - Attempted to address Marc's doubts about missing size/alignment
>>    at VFIO level (user-space knows the IOMMU page size and the number
>>    of IOVA pages to provision)
>>
>> v2 -> v3:
>> - remove static implementation of iommu_get_single_reserved &
>>    iommu_put_single_reserved when CONFIG_IOMMU_API is not set
>>
>> v1 -> v2:
>> - previously a VFIO API, named vfio_alloc_map/unmap_free_reserved_iova
>> ---
>>   drivers/iommu/dma-reserved-iommu.c | 150
>> +++++++++++++++++++++++++++++++++++++
>>   include/linux/dma-reserved-iommu.h |  38 ++++++++++
>>   2 files changed, 188 insertions(+)
>>
>> diff --git a/drivers/iommu/dma-reserved-iommu.c
>> b/drivers/iommu/dma-reserved-iommu.c
>> index f6fa18e..426d339 100644
>> --- a/drivers/iommu/dma-reserved-iommu.c
>> +++ b/drivers/iommu/dma-reserved-iommu.c
>> @@ -135,6 +135,22 @@ unlock:
>>   }
>>   EXPORT_SYMBOL_GPL(iommu_alloc_reserved_iova_domain);
>>
>> +/* called with domain's reserved_lock held */
>> +static void reserved_binding_release(struct kref *kref)
>> +{
>> +    struct iommu_reserved_binding *b =
>> +        container_of(kref, struct iommu_reserved_binding, kref);
>> +    struct iommu_domain *d = b->domain;
>> +    struct reserved_iova_domain *rid =
>> +        (struct reserved_iova_domain *)d->reserved_iova_cookie;
> 
> Either it's a void *, in which case you don't need to cast it, or it
> should be the appropriate type as I mentioned earlier, in which case you
> still wouldn't need to cast it.
ok
> 
>> +    unsigned long order;
>> +
>> +    order = iova_shift(rid->iovad);
>> +    free_iova(rid->iovad, b->iova >> order);
> 
> iova_pfn() ?
ok
> 
>> +    unlink_reserved_binding(d, b);
>> +    kfree(b);
>> +}
>> +
>>   void iommu_free_reserved_iova_domain(struct iommu_domain *domain)
>>   {
>>       struct reserved_iova_domain *rid;
>> @@ -160,3 +176,137 @@ unlock:
>>       }
>>   }
>>   EXPORT_SYMBOL_GPL(iommu_free_reserved_iova_domain);
>> +
>> +int iommu_get_reserved_iova(struct iommu_domain *domain,
>> +                  phys_addr_t addr, size_t size, int prot,
>> +                  dma_addr_t *iova)
>> +{
>> +    unsigned long base_pfn, end_pfn, nb_iommu_pages, order, flags;
>> +    struct iommu_reserved_binding *b, *newb;
>> +    size_t iommu_page_size, binding_size;
>> +    phys_addr_t aligned_base, offset;
>> +    struct reserved_iova_domain *rid;
>> +    struct iova_domain *iovad;
>> +    struct iova *p_iova;
>> +    int ret = -EINVAL;
>> +
>> +    newb = kzalloc(sizeof(*newb), GFP_KERNEL);
>> +    if (!newb)
>> +        return -ENOMEM;
>> +
>> +    spin_lock_irqsave(&domain->reserved_lock, flags);
>> +
>> +    rid = (struct reserved_iova_domain *)domain->reserved_iova_cookie;
>> +    if (!rid)
>> +        goto free_newb;
>> +
>> +    if ((prot & IOMMU_READ & !(rid->prot & IOMMU_READ)) ||
>> +        (prot & IOMMU_WRITE & !(rid->prot & IOMMU_WRITE)))
> 
> Are devices wanting to read from MSI doorbells really a thing?
the rationale is Alex asked for propagating the VFIO DMA MAP prot flag
downto this API (genericity context). This later is stored in rid->prot
and I was just checking the iova was mapped according to the direction
the userspace expected.
> 
>> +        goto free_newb;
>> +
>> +    iovad = rid->iovad;
>> +    order = iova_shift(iovad);
>> +    base_pfn = addr >> order;
>> +    end_pfn = (addr + size - 1) >> order;
>> +    aligned_base = base_pfn << order;
>> +    offset = addr - aligned_base;
>> +    nb_iommu_pages = end_pfn - base_pfn + 1;
>> +    iommu_page_size = 1 << order;
>> +    binding_size = nb_iommu_pages * iommu_page_size;
> 
> offset = iova_offset(iovad, addr);
> aligned_base = addr - offset;
> binding_size = iova_align(iovad, size + offset);
> 
> Am I right?
Looks so. Will further test it. Thanks
> 
>> +
>> +    b = find_reserved_binding(domain, aligned_base, binding_size);
>> +    if (b) {
>> +        *iova = b->iova + offset + aligned_base - b->addr;
>> +        kref_get(&b->kref);
>> +        ret = 0;
>> +        goto free_newb;
>> +    }
>> +
>> +    p_iova = alloc_iova(iovad, nb_iommu_pages,
>> +                iovad->dma_32bit_pfn, true);
>> +    if (!p_iova) {
>> +        ret = -ENOMEM;
>> +        goto free_newb;
>> +    }
>> +
>> +    *iova = iova_dma_addr(iovad, p_iova);
>> +
>> +    /* unlock to call iommu_map which is not guaranteed to be atomic */
> 
> Hmm, that's concerning, because the ARM DMA mapping code, and
> consequently the iommu-dma layer, has always relied on it being so. On
> brief inspection, it looks to be only the AMD IOMMU doing something
> obviously non-atomic (taking a mutex) in its map callback, but then that
> has a separate DMA ops implementation. It doesn't look like it would be
> too intrusive to change, either, but that's an idea for its own thread.
yes. Making no assumption about the atomicity of the iommu_map/unmap
ops brought some extra complexity here. It also forced me to separate
the alloc/map from the iommu "binding" lookup. But now that it is done,
I think it brings some added value. For instance, the irq-chip ops we
introduced to retrieve the doorbell characteristics is also valuable for
enumerating them.
> 
>> +    spin_unlock_irqrestore(&domain->reserved_lock, flags);
>> +
>> +    ret = iommu_map(domain, *iova, aligned_base, binding_size, prot);
>> +
>> +    spin_lock_irqsave(&domain->reserved_lock, flags);
>> +
>> +    rid = (struct reserved_iova_domain *) domain->reserved_iova_cookie;
>> +    if (!rid || (rid->iovad != iovad)) {
>> +        /* reserved iova domain was destroyed behind our back */
> 
> That that could happen at all is terrifying! Surely the reserved domain
> should be set up immediately after iommu_domain_alloc() and torn down
> immediately before iommu_domain_free(). Things going missing while a
> domain is live smacks of horrible brokenness.
The VFIO user client creates the "reserved iova domain" using the
VFIO_IOMMU_MAP_DMA ioctl. This can happen anytime after the iommu domain
creation (on the VFIO_SET_IOMMU ioctl). User-space currently is allowed
to unregister this iova domain at any time too. I think this is wrong: I
should have 2 reserved iova domain destroy functions, one used by
user-space and one used by the kernel. The user-space path should reject
any attempt to destroy the reserved iova domain while bindings still
exist.
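
As an illustration (the function name is invented here, and the type of
the domain's reserved_binding_list field is an assumption; locking
omitted for brevity):

	/* user-space facing teardown: fails while MSI bindings remain */
	int iommu_free_reserved_iova_domain_user(struct iommu_domain *d)
	{
		if (!RB_EMPTY_ROOT(&d->reserved_binding_list))
			return -EBUSY;
		iommu_free_reserved_iova_domain(d);
		return 0;
	}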

> 
>> +        ret = -EBUSY;
>> +        goto free_newb; /* iova already released */
>> +    }
>> +
>> +    /* no change in iova reserved domain but iommu_map failed */
>> +    if (ret)
>> +        goto free_iova;
>> +
>> +    /* everything is fine, add in the new node in the rb tree */
>> +    kref_init(&newb->kref);
>> +    newb->domain = domain;
>> +    newb->addr = aligned_base;
>> +    newb->iova = *iova;
>> +    newb->size = binding_size;
>> +
>> +    link_reserved_binding(domain, newb);
>> +
>> +    *iova += offset;
>> +    goto unlock;
>> +
>> +free_iova:
>> +    free_iova(rid->iovad, p_iova->pfn_lo);
>> +free_newb:
>> +    kfree(newb);
>> +unlock:
>> +    spin_unlock_irqrestore(&domain->reserved_lock, flags);
>> +    return ret;
>> +}
>> +EXPORT_SYMBOL_GPL(iommu_get_reserved_iova);
>> +
>> +void iommu_put_reserved_iova(struct iommu_domain *domain, phys_addr_t
>> addr)
>> +{
>> +    phys_addr_t aligned_addr, page_size, mask;
>> +    struct iommu_reserved_binding *b;
>> +    struct reserved_iova_domain *rid;
>> +    unsigned long order, flags;
>> +    struct iommu_domain *d;
>> +    dma_addr_t iova;
>> +    size_t size;
>> +    int ret = 0;
>> +
>> +    spin_lock_irqsave(&domain->reserved_lock, flags);
>> +
>> +    rid = (struct reserved_iova_domain *)domain->reserved_iova_cookie;
>> +    if (!rid)
>> +        goto unlock;
>> +
>> +    order = iova_shift(rid->iovad);
>> +    page_size = (uint64_t)1 << order;
>> +    mask = page_size - 1;
>> +    aligned_addr = addr & ~mask;
> 
> addr & ~iova_mask(rid->iovad)
OK
> 
>> +
>> +    b = find_reserved_binding(domain, aligned_addr, page_size);
>> +    if (!b)
>> +        goto unlock;
>> +
>> +    iova = b->iova;
>> +    size = b->size;
>> +    d = b->domain;
>> +
>> +    ret = kref_put(&b->kref, reserved_binding_release);
>> +
>> +unlock:
>> +    spin_unlock_irqrestore(&domain->reserved_lock, flags);
>> +    if (ret)
>> +        iommu_unmap(d, iova, size);
>> +}
>> +EXPORT_SYMBOL_GPL(iommu_put_reserved_iova);
>> +
>> diff --git a/include/linux/dma-reserved-iommu.h
>> b/include/linux/dma-reserved-iommu.h
>> index 01ec385..8722131 100644
>> --- a/include/linux/dma-reserved-iommu.h
>> +++ b/include/linux/dma-reserved-iommu.h
>> @@ -42,6 +42,34 @@ int iommu_alloc_reserved_iova_domain(struct
>> iommu_domain *domain,
>>    */
>>   void iommu_free_reserved_iova_domain(struct iommu_domain *domain);
>>
>> +/**
>> + * iommu_get_reserved_iova: allocate a contiguous set of iova pages and
>> + * map them to the physical range defined by @addr and @size.
>> + *
>> + * @domain: iommu domain handle
>> + * @addr: physical address to bind
>> + * @size: size of the binding
>> + * @prot: mapping protection attribute
>> + * @iova: returned iova
>> + *
>> + * Mapped physical pfns are within [@addr >> order, (@addr + size -1)
>> >> order]
>> + * where order corresponds to the reserved iova domain order.
>> + * This mapping is tracked and reference counted with the minimal
>> granularity
>> + * of @size.
>> + */
>> +int iommu_get_reserved_iova(struct iommu_domain *domain,
>> +                phys_addr_t addr, size_t size, int prot,
>> +                dma_addr_t *iova);
>> +
>> +/**
>> + * iommu_put_reserved_iova: decrement a ref count of the reserved
>> mapping
>> + *
>> + * @domain: iommu domain handle
>> + * @addr: physical address whose binding ref count is decremented
>> + *
>> + * If the binding ref count drops to zero, destroy the reserved mapping
>> + */
>> +void iommu_put_reserved_iova(struct iommu_domain *domain, phys_addr_t
>> addr);
>>   #else
>>
>>   static inline int
>> @@ -55,5 +83,15 @@ iommu_alloc_reserved_iova_domain(struct
>> iommu_domain *domain,
>>   static inline void
>>   iommu_free_reserved_iova_domain(struct iommu_domain *domain) {}
>>
>> +static inline int iommu_get_reserved_iova(struct iommu_domain *domain,
>> +                      phys_addr_t addr, size_t size,
>> +                      int prot, dma_addr_t *iova)
>> +{
>> +    return -ENOENT;
>> +}
>> +
>> +static inline void iommu_put_reserved_iova(struct iommu_domain *domain,
>> +                       phys_addr_t addr) {}
>> +
>>   #endif    /* CONFIG_IOMMU_DMA_RESERVED */
>>   #endif    /* __DMA_RESERVED_IOMMU_H */
>>
> 
> I worry that this all falls into the trap of trying too hard to abstract
> something which doesn't need abstracting. AFAICS all we need is
> something for VFIO to keep track of its own IOVA usage vs. userspace's,
> plus a list of MSI descriptors (with IOVAs) wrapped in refcounts hanging
> off the iommu_domain, with a handful of functions to manage them. The
> former is as good as solved already - stick an iova_domain or even just
> a bitmap in the iova_cookie and use it directly - and the latter would
> actually be reusable elsewhere (e.g. for iommu-dma domains). What I'm
> seeing here is layers upon layers of complexity with no immediate
> justification, that's 'generic' enough to not directly solve the problem
> at hand, but in a way that still makes it more or less unusable for
> solving equivalent problems elsewhere.
> 
> Since I don't like that everything I have to say about this series so
> far seems negative, I'll plan to spend some time next week having a go
> at hardening my 50-line proof-of-concept for stage 1 MSIs, and see if I
> can offer code instead of criticism :)
No worries. I really appreciate the time you've already spent reading
this code ;-) I acknowledge it is a lot of trouble for mapping - in my
case - a single page! Anyway I will take your comments into account and
simplify things accordingly. Let's see how we can converge...

Best Regards

Eric
> 
> Robin.

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v7 00/10] KVM PCIe/MSI passthrough on ARM/ARM64: kernel part 1/3: iommu changes
@ 2016-04-21 12:18   ` Eric Auger
  0 siblings, 0 replies; 127+ messages in thread
From: Eric Auger @ 2016-04-21 12:18 UTC (permalink / raw)
  To: eric.auger, robin.murphy, alex.williamson, will.deacon, joro,
	tglx, jason, marc.zyngier, christoffer.dall, linux-arm-kernel
  Cc: patches, linux-kernel, Bharat.Bhushan, pranav.sawargaonkar,
	p.fedin, iommu, Jean-Philippe.Brucker, julien.grall

Hi Alex, Robin,
On 04/19/2016 06:56 PM, Eric Auger wrote:
> This series introduces the dma-reserved-iommu api used to:
> 
> - create/destroy an iova domain dedicated to reserved iova bindings
> - map/unmap physical addresses onto reserved IOVAs.
> - search for an existing reserved iova mapping matching a PA window
> - determine whether an msi needs to be iommu mapped
> - translate an msi_msg PA address into its IOVA counterpart

Following Robin's review, I understand one important point we have to
clarify is how generic this API has to be.

I agree with Robin that there is quite a lot of duplication between
this dma-reserved-iommu implementation and the dma-iommu one. Maybe we
could consider an msi-mapping API implemented on top of dma-iommu.c.
This implementation would add the MSI doorbell binding list management,
including ref counting and locking.

We would need to add a map/unmap function taking an iova/pa/size as
parameters to the current dma-iommu.c.
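
For concreteness, hypothetical prototypes (names invented here; nothing
like this exists in dma-iommu.c today):

	/* map/unmap a PA window at a caller-chosen IOVA, e.g. an MSI doorbell */
	int iommu_dma_map_window(struct iommu_domain *domain, dma_addr_t iova,
				 phys_addr_t paddr, size_t size, int prot);
	void iommu_dma_unmap_window(struct iommu_domain *domain, dma_addr_t iova,
				    size_t size);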

An important assumption is that the dma-mapping API and the msi-mapping
API must not be used concurrently (we would be trying to use the same
cookie to store a different iova_domain).

Any thought/suggestion?

Best Regards

Eric


> 
> Currently reserved IOVAs are meant to map MSI physical doorbells. A single
> reserved domain does exist per iommu domain.
> 
> Also a new domain attribute is introduced to signal whether the MSI
> addresses must be mapped in the IOMMU.
> 
> In current usage:
> VFIO subsystem is supposed to create/destroy the iommu reserved domain.
> The MSI layer is supposed to allocate/free iova mappings
> 
> Since several drivers are likely to use the same doorbell, a reference
> counting takes place on the bindings. An RB-tree indexed by PA is used
> to easily lookup for existing mappings at MSI message composition time.
> 
> More details & context can be found at:
> http://www.linaro.org/blog/core-dump/kvm-pciemsi-passthrough-armarm64/
> 
> Best Regards
> 
> Eric
> 
> Git: complete series available at
> https://git.linaro.org/people/eric.auger/linux.git/shortlog/refs/heads/v4.6-rc4-pcie-passthrough-v7
> 
> History:
> 
> v6 -> v7:
> - fixed known lock bugs and multiple page sized slots matching
>   (I currently only have a single MSI frame made of a single page)
> - reserved_iova_cookie now pointing to a struct that encapsulates the
>   iova domain handle + protection attribute passed from VFIO (Alex' req)
> - 2 new functions exposed: iommu_msi_mapping_translate_msg,
>   iommu_msi_mapping_desc_to_domain: not sure this is the right location/proto
>   though
> - iommu_put_reserved_iova now takes a phys_addr_t
> - everything now is cleanup on iommu_domain destruction
> 
> RFC v5 -> patch v6:
> - split to ease the review process
> - in dma-reserved-api use a spin lock instead of a mutex (reported by
>   Jean-Philippe)
> - revisit iommu_get_reserved_iova API to pass a size parameter upon
>   Marc's request
> - Consistently use the page order passed when creating the iova domain.
> - init reserved_binding_list (reported by Julien)
> 
> RFC v4 -> RFC v5:
> - take into account Thomas' comments on MSI related patches
>   - split "msi: IOMMU map the doorbell address when needed"
>   - increase readability and add comments
>   - fix style issues
>  - split "iommu: Add DOMAIN_ATTR_MSI_MAPPING attribute"
>  - platform ITS now advertises IOMMU_CAP_INTR_REMAP
>  - fix compilation issue with CONFIG_IOMMU API unset
>  - arm-smmu-v3 now advertises DOMAIN_ATTR_MSI_MAPPING
> 
> RFC v3 -> v4:
> - Move doorbell mapping/unmapping in msi.c
> - fix ref count issue on set_affinity: in case of a change in the address
>   the previous address is decremented
> - doorbell map/unmap now is done on msi composition. Should allow the use
>   case for platform MSI controllers
> - create dma-reserved-iommu.h/c exposing/implementing a new API dedicated
>   to reserved IOVA management (looking like dma-iommu glue)
> - series reordering to ease the review:
>   - first part is related to IOMMU
>   - second related to MSI sub-system
>   - third related to VFIO (except arm-smmu IOMMU_CAP_INTR_REMAP removal)
> - expose the number of requested IOVA pages through VFIO_IOMMU_GET_INFO
>   [this partially addresses Marc's comments on iommu_get/put_single_reserved
>    size/alignment problematic - which I did not ignore - but I don't know
>    how much I can do at the moment]
> 
> RFC v2 -> RFC v3:
> - should fix wrong handling of some CONFIG combinations:
>   CONFIG_IOVA, CONFIG_IOMMU_API, CONFIG_PCI_MSI_IRQ_DOMAIN
> - fix MSI_FLAG_IRQ_REMAPPING setting in GICv3 ITS (although not tested)
> 
> PATCH v1 -> RFC v2:
> - reverted to RFC since it looks more reasonable ;-) the code is split
>   between VFIO, IOMMU, MSI controller and I am not sure I did the right
>   choices. Also API need to be further discussed.
> - iova API usage in arm-smmu.c.
> - MSI controller natively programs the MSI addr with either the PA or IOVA.
>   This is not done anymore in vfio-pci driver as suggested by Alex.
> - check irq remapping capability of the group
> 
> RFC v1 [2] -> PATCH v1:
> - use the existing dma map/unmap ioctl interface with a flag to register a
>   reserved IOVA range. Use the legacy Rb to store this special vfio_dma.
> - a single reserved IOVA contiguous region now is allowed
> - use of an RB tree indexed by PA to store allocated reserved slots
> - use of a vfio_domain iova_domain to manage iova allocation within the
>   window provided by the userspace
> - vfio alloc_map/unmap_free take a vfio_group handle
> - vfio_group handle is cached in vfio_pci_device
> - add ref counting to bindings
> - user modality enabled at the end of the series
> 
> 
> Eric Auger (10):
>   iommu: Add DOMAIN_ATTR_MSI_MAPPING attribute
>   iommu/arm-smmu: advertise DOMAIN_ATTR_MSI_MAPPING attribute
>   iommu: introduce a reserved iova cookie
>   iommu/dma-reserved-iommu: alloc/free_reserved_iova_domain
>   iommu/dma-reserved-iommu: reserved binding rb-tree and helpers
>   iommu/dma-reserved-iommu: iommu_get/put_reserved_iova
>   iommu/dma-reserved-iommu: delete bindings in
>     iommu_free_reserved_iova_domain
>   iommu/dma-reserved_iommu: iommu_msi_mapping_desc_to_domain
>   iommu/dma-reserved-iommu: iommu_msi_mapping_translate_msg
>   iommu/arm-smmu: call iommu_free_reserved_iova_domain on domain
>     destruction
> 
>  drivers/iommu/Kconfig              |   8 +
>  drivers/iommu/Makefile             |   1 +
>  drivers/iommu/arm-smmu-v3.c        |   4 +
>  drivers/iommu/arm-smmu.c           |   4 +
>  drivers/iommu/dma-reserved-iommu.c | 422 +++++++++++++++++++++++++++++++++++++
>  drivers/iommu/iommu.c              |   2 +
>  include/linux/dma-reserved-iommu.h | 142 +++++++++++++
>  include/linux/iommu.h              |   7 +
>  8 files changed, 590 insertions(+)
>  create mode 100644 drivers/iommu/dma-reserved-iommu.c
>  create mode 100644 include/linux/dma-reserved-iommu.h
> 

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v7 00/10] KVM PCIe/MSI passthrough on ARM/ARM64: kernel part 1/3: iommu changes
@ 2016-04-21 19:32     ` Alex Williamson
  0 siblings, 0 replies; 127+ messages in thread
From: Alex Williamson @ 2016-04-21 19:32 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger, robin.murphy, will.deacon, joro, tglx, jason,
	marc.zyngier, christoffer.dall, linux-arm-kernel, patches,
	linux-kernel, Bharat.Bhushan, pranav.sawargaonkar, p.fedin,
	iommu, Jean-Philippe.Brucker, julien.grall

On Thu, 21 Apr 2016 14:18:09 +0200
Eric Auger <eric.auger@linaro.org> wrote:

> Hi Alex, Robin,
> On 04/19/2016 06:56 PM, Eric Auger wrote:
> > This series introduces the dma-reserved-iommu api used to:
> > 
> > - create/destroy an iova domain dedicated to reserved iova bindings
> > - map/unmap physical addresses onto reserved IOVAs.
> > - search for an existing reserved iova mapping matching a PA window
> > - determine whether an msi needs to be iommu mapped
> > - translate an msi_msg PA address into its IOVA counterpart  
> 
> Following Robin's review, I understand one important point we have to
> clarify is how much this API has to be generic.
> 
> I agree with Robin on the fact there is quite a lot of duplication
> between this dma-reserved-iommu implementation and dma-iommu
> implementation. Maybe we could consider an msi-mapping API
> implementation upon dma-iommu.c. This implementation would add MSI
> doorbell binding list management, including, ref counting and locking.
> 
> We would need to add a map/unmap function taking an iova/pa/size as
> parameters in current dma-iommu.c
> 
> An important assumption is that the dma-mapping API and the msi-mapping
> API must not be used concurrently (be would be trying to use the same
> cookie to store a different iova_domain).
> 
> Any thought/suggestion?

Hi Eric,

I'm not attached to a generic interface; the important part for me is
that if we have an iommu domain with space reserved for MSI, the MSI
setup and allocation code should handle that so we don't need to play
the remapping tricks between vfio-pci and a vfio iommu driver that we
saw in early drafts of this.  My first inclination is always to try to
make a generic, re-usable interface, but I apologize if that's led us
astray here and we really do want the simpler, MSI-specific
interface.

For the IOMMU API, rather than just a DOMAIN_ATTR_MSI_MAPPING flag,
what about DOMAIN_ATTR_MSI_GEOMETRY with both a get and set attribute?
Maybe something like:

struct iommu_domain_msi_geometry {
	dma_addr_t	aperture_start;
	dma_addr_t	aperture_end;
	bool		fixed; /* or 'programmable' depending on your polarity preference */
};

Calling *get* on arm would return { 0, 0, false }, indicating it's
programmable; *set* would allocate the iovad as specified. That would
make it very easy to expand the API to x86 with reporting of the fixed
MSI range and it operates within the existing IOMMU API interfaces.
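
For illustration, usage through the existing attribute accessors might
look like this (the struct and attribute are only a proposal here, not
existing API; base/size stand for the IOVA window VFIO would choose):

	struct iommu_domain_msi_geometry geo;
	int ret;

	ret = iommu_domain_get_attr(domain, DOMAIN_ATTR_MSI_GEOMETRY, &geo);
	if (!ret && !geo.fixed) {
		/* programmable: reserve an IOVA window for MSI doorbells */
		geo.aperture_start = base;
		geo.aperture_end = base + size - 1;
		ret = iommu_domain_set_attr(domain, DOMAIN_ATTR_MSI_GEOMETRY, &geo);
	}
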
Thanks,

Alex

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v7 01/10] iommu: Add DOMAIN_ATTR_MSI_MAPPING attribute
@ 2016-04-22 11:31         ` Robin Murphy
  0 siblings, 0 replies; 127+ messages in thread
From: Robin Murphy @ 2016-04-22 11:31 UTC (permalink / raw)
  To: Eric Auger, eric.auger, alex.williamson, will.deacon, joro, tglx,
	jason, marc.zyngier, christoffer.dall, linux-arm-kernel
  Cc: patches, linux-kernel, Bharat.Bhushan, pranav.sawargaonkar,
	p.fedin, iommu, Jean-Philippe.Brucker, julien.grall

On 20/04/16 16:58, Eric Auger wrote:
> Hi Robin,
> On 04/20/2016 02:47 PM, Robin Murphy wrote:
>> Hi Eric,
>>
>> On 19/04/16 17:56, Eric Auger wrote:
>>> Introduce a new DOMAIN_ATTR_MSI_MAPPING domain attribute. If supported,
>>> this means the MSI addresses need to be mapped in the IOMMU.
>>>
>>> x86 IOMMUs typically don't expose the attribute since on x86, MSI write
>>> transaction addresses always are within the 1MB PA region [FEE0_0000h -
>>> FEF0_0000h] window which directly targets the APIC configuration space and
>>> hence bypass the IOMMU. On ARM and PowerPC however MSI transactions are
>>> conveyed through the IOMMU.
>>
>> What's stopping us from simply inferring this from the domain's IOMMU
>> not advertising interrupt remapping capabilities?
> My current understanding is it is not possible:
> on x86 CAP_INTR_REMAP is not systematically exposed (the feature can be
> disabled) and MSIs are never mapped in the IOMMU I think.

Not sure I follow - if the feature is disabled such that the IOMMU 
doesn't isolate MSIs, then it's no different a situation from the SMMU, no?

My point was that this logic:

	if (IOMMU_CAP_INTR_REMAP)
		we're good
	else if (DOMAIN_ATTR_MSI_MAPPING)
		if (acquire_msi_remapping_resources(domain))
			we're good
		else
			oh no!
	else
		oh no!

should be easily reducible to this:

	if (IOMMU_CAP_INTR_REMAP)
		we're good
	else if (acquire_msi_remapping_resources(domain))
		we're good
	else
		oh no!	// Don't care whether the domain ran out of
			// resources or simply doesn't support it,
			// either way we can't proceed.
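
In C, the reduced check might boil down to something like this (bus
being the device's bus_type, and acquire_msi_remapping_resources()
still a placeholder name from this discussion, not a real function):

	if (!iommu_capable(bus, IOMMU_CAP_INTR_REMAP) &&
	    !acquire_msi_remapping_resources(domain))
		return -EPERM;	/* MSIs cannot be isolated for this domain */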

Robin.

> Best Regards
>
> Eric
>>
>> Robin.
>>
>>> Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com>
>>> Signed-off-by: Eric Auger <eric.auger@linaro.org>
>>>
>>> ---
>>>
>>> v4 -> v5:
>>> - introduce the user in the next patch
>>>
>>> RFC v1 -> v1:
>>> - the data field is not used
>>> - for this attribute domain_get_attr simply returns 0 if the MSI_MAPPING
>>>     capability if needed or <0 if not.
>>> - removed struct iommu_domain_msi_maps
>>> ---
>>>    include/linux/iommu.h | 1 +
>>>    1 file changed, 1 insertion(+)
>>>
>>> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
>>> index 62a5eae..b3e8c5b 100644
>>> --- a/include/linux/iommu.h
>>> +++ b/include/linux/iommu.h
>>> @@ -113,6 +113,7 @@ enum iommu_attr {
>>>        DOMAIN_ATTR_FSL_PAMU_ENABLE,
>>>        DOMAIN_ATTR_FSL_PAMUV1,
>>>        DOMAIN_ATTR_NESTING,    /* two stages of translation */
>>> +    DOMAIN_ATTR_MSI_MAPPING, /* Require MSIs mapping in iommu */
>>>        DOMAIN_ATTR_MAX,
>>>    };
>>>
>>>
>>
>

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v7 01/10] iommu: Add DOMAIN_ATTR_MSI_MAPPING attribute
@ 2016-04-22 12:00           ` Eric Auger
  0 siblings, 0 replies; 127+ messages in thread
From: Eric Auger @ 2016-04-22 12:00 UTC (permalink / raw)
  To: Robin Murphy, eric.auger, alex.williamson, will.deacon, joro,
	tglx, jason, marc.zyngier, christoffer.dall, linux-arm-kernel
  Cc: patches, linux-kernel, Bharat.Bhushan, pranav.sawargaonkar,
	p.fedin, iommu, Jean-Philippe.Brucker, julien.grall

Hi Robin,
On 04/22/2016 01:31 PM, Robin Murphy wrote:
> On 20/04/16 16:58, Eric Auger wrote:
>> Hi Robin,
>> On 04/20/2016 02:47 PM, Robin Murphy wrote:
>>> Hi Eric,
>>>
>>> On 19/04/16 17:56, Eric Auger wrote:
>>>> Introduce a new DOMAIN_ATTR_MSI_MAPPING domain attribute. If supported,
>>>> this means the MSI addresses need to be mapped in the IOMMU.
>>>>
>>>> x86 IOMMUs typically don't expose the attribute since on x86, MSI write
>>>> transaction addresses always are within the 1MB PA region [FEE0_0000h -
>>>> FEF0_0000h] window which directly targets the APIC configuration
>>>> space and
>>>> hence bypass the IOMMU. On ARM and PowerPC however MSI transactions are
>>>> conveyed through the IOMMU.
>>>
>>> What's stopping us from simply inferring this from the domain's IOMMU
>>> not advertising interrupt remapping capabilities?
>> My current understanding is it is not possible:
>> on x86 CAP_INTR_REMAP is not systematically exposed (the feature can be
>> disabled) and MSIs are never mapped in the IOMMU I think.
> 
> Not sure I follow - if the feature is disabled such that the IOMMU
> doesn't isolate MSIs, then it's no different a situation from the SMMU, no?

sorry, I understood you wanted to use IOMMU_CAP_INTR_REMAP as the sole
criterion for detecting whether MSI mapping was requested.
> 
> My point was that this logic:
> 
>     if (IOMMU_CAP_INTR_REMAP)
>         we're good
>     else if (DOMAIN_ATTR_MSI_MAPPING)
>         if (acquire_msi_remapping_resources(domain))
>             we're good
>         else
>             oh no!
>     else
>         oh no!
> 
> should be easily reducible to this:
> 
>     if (IOMMU_CAP_INTR_REMAP)
>         we're good
>     else if (acquire_msi_remapping_resources(domain))

But can't we imagine a mix of SMMUs on the same platform, some
requiring MSI mapping and some that don't? As soon as one SMMU requires
MSI mapping, CONFIG_IOMMU_DMA_RESERVED is set and
acquire_msi_remapping_resources(domain) will be implemented & succeed.
Doesn't that lead to a wrong decision? Am I missing something, or do you
consider this situation far-fetched?

Thanks

Eric

>         we're good
>     else
>         oh no!    // Don't care whether the domain ran out of
>             // resources or simply doesn't support it,
>             // either way we can't proceed.
> 
> Robin.
> 
>> Best Regards
>>
>> Eric
>>>
>>> Robin.
>>>
>>>> Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com>
>>>> Signed-off-by: Eric Auger <eric.auger@linaro.org>
>>>>
>>>> ---
>>>>
>>>> v4 -> v5:
>>>> - introduce the user in the next patch
>>>>
>>>> RFC v1 -> v1:
>>>> - the data field is not used
>>>> - for this attribute domain_get_attr simply returns 0 if the
>>>> MSI_MAPPING
>>>>     capability if needed or <0 if not.
>>>> - removed struct iommu_domain_msi_maps
>>>> ---
>>>>    include/linux/iommu.h | 1 +
>>>>    1 file changed, 1 insertion(+)
>>>>
>>>> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
>>>> index 62a5eae..b3e8c5b 100644
>>>> --- a/include/linux/iommu.h
>>>> +++ b/include/linux/iommu.h
>>>> @@ -113,6 +113,7 @@ enum iommu_attr {
>>>>        DOMAIN_ATTR_FSL_PAMU_ENABLE,
>>>>        DOMAIN_ATTR_FSL_PAMUV1,
>>>>        DOMAIN_ATTR_NESTING,    /* two stages of translation */
>>>> +    DOMAIN_ATTR_MSI_MAPPING, /* Require MSIs mapping in iommu */
>>>>        DOMAIN_ATTR_MAX,
>>>>    };
>>>>
>>>>
>>>
>>
> 

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v7 01/10] iommu: Add DOMAIN_ATTR_MSI_MAPPING attribute
@ 2016-04-22 12:00           ` Eric Auger
  0 siblings, 0 replies; 127+ messages in thread
From: Eric Auger @ 2016-04-22 12:00 UTC (permalink / raw)
  To: Robin Murphy, eric.auger-qxv4g6HH51o,
	alex.williamson-H+wXaHxf7aLQT0dZR+AlfA, will.deacon-5wv7dgnIgG8,
	joro-zLv9SwRftAIdnm+yROfE0A, tglx-hfZtesqFncYOwBW4kG4KsQ,
	jason-NLaQJdtUoK4Be96aLqz0jA, marc.zyngier-5wv7dgnIgG8,
	christoffer.dall-QSEj5FYQhm4dnm+yROfE0A,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r
  Cc: julien.grall-5wv7dgnIgG8, patches-QSEj5FYQhm4dnm+yROfE0A,
	p.fedin-Sze3O3UU22JBDgjK7y7TUQ,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	pranav.sawargaonkar-Re5JQEeQqe8AvxtiuMwx3w

Hi Robin,
On 04/22/2016 01:31 PM, Robin Murphy wrote:
> On 20/04/16 16:58, Eric Auger wrote:
>> Hi Robin,
>> On 04/20/2016 02:47 PM, Robin Murphy wrote:
>>> Hi Eric,
>>>
>>> On 19/04/16 17:56, Eric Auger wrote:
>>>> Introduce a new DOMAIN_ATTR_MSI_MAPPING domain attribute. If supported,
>>>> this means the MSI addresses need to be mapped in the IOMMU.
>>>>
>>>> x86 IOMMUs typically don't expose the attribute since on x86, MSI write
>>>> transaction addresses always are within the 1MB PA region [FEE0_0000h -
>>>> FEF0_000h] window which directly targets the APIC configuration
>>>> space and
>>>> hence bypass the sMMU. On ARM and PowerPC however MSI transactions are
>>>> conveyed through the IOMMU.
>>>
>>> What's stopping us from simply inferring this from the domain's IOMMU
>>> not advertising interrupt remapping capabilities?
>> My current understanding is it is not possible:
>> on x86 CAP_INTR_REMAP is not systematically exposed (the feature can be
>> disabled) and MSIs are never mapped in the IOMMU I think.
> 
> Not sure I follow - if the feature is disabled such that the IOMMU
> doesn't isolate MSIs, then it's no different a situation from the SMMU, no?

sorry I understood you wanted to use IOMMU_CAP_INTR_REMAP as the sole
criteria to detect whether MSI mapping was requested.
> 
> My point was that this logic:
> 
>     if (IOMMU_CAP_INTR_REMAP)
>         we're good
>     else if (DOMAIN_ATTR_MSI_MAPPING)
>         if (acquire_msi_remapping_resources(domain))
>             we're good
>         else
>             oh no!
>     else
>         oh no!
> 
> should be easily reducible to this:
> 
>     if (IOMMU_CAP_INTR_REMAP)
>         we're good
>     else if (acquire_msi_remapping_resources(domain))

But Can't we imagine a mix of smmus on the same platform, some
requesting MSI mapping and some which don't. As soon as an smmu requires
MSI mapping, CONFIG_IOMMU_DMA_RESERVED is set and
acquire_msi_remapping_resources(domain) will be implemented & succeed.
Doesn't it lead to a wrong decision. Do I miss something, or do you
consider this situation as far-fetched?

Thanks

Eric

>         we're good
>     else
>         oh no!    // Don't care whether the domain ran out of
>             // resources or simply doesn't support it,
>             // either way we can't proceed.
> 
> Robin.
> 
>> Best Regards
>>
>> Eric
>>>
>>> Robin.
>>>
>>>> Signed-off-by: Bharat Bhushan <Bharat.Bhushan-KZfg59tc24xl57MIdRCFDg@public.gmane.org>
>>>> Signed-off-by: Eric Auger <eric.auger-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
>>>>
>>>> ---
>>>>
>>>> v4 -> v5:
>>>> - introduce the user in the next patch
>>>>
>>>> RFC v1 -> v1:
>>>> - the data field is not used
>>>> - for this attribute domain_get_attr simply returns 0 if the
>>>> MSI_MAPPING
>>>>     capability if needed or <0 if not.
>>>> - removed struct iommu_domain_msi_maps
>>>> ---
>>>>    include/linux/iommu.h | 1 +
>>>>    1 file changed, 1 insertion(+)
>>>>
>>>> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
>>>> index 62a5eae..b3e8c5b 100644
>>>> --- a/include/linux/iommu.h
>>>> +++ b/include/linux/iommu.h
>>>> @@ -113,6 +113,7 @@ enum iommu_attr {
>>>>        DOMAIN_ATTR_FSL_PAMU_ENABLE,
>>>>        DOMAIN_ATTR_FSL_PAMUV1,
>>>>        DOMAIN_ATTR_NESTING,    /* two stages of translation */
>>>> +    DOMAIN_ATTR_MSI_MAPPING, /* Require MSIs mapping in iommu */
>>>>        DOMAIN_ATTR_MAX,
>>>>    };
>>>>
>>>>
>>>
>>
> 

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v7 00/10] KVM PCIe/MSI passthrough on ARM/ARM64: kernel part 1/3: iommu changes
@ 2016-04-22 12:31       ` Eric Auger
  0 siblings, 0 replies; 127+ messages in thread
From: Eric Auger @ 2016-04-22 12:31 UTC (permalink / raw)
  To: Alex Williamson
  Cc: eric.auger, robin.murphy, will.deacon, joro, tglx, jason,
	marc.zyngier, christoffer.dall, linux-arm-kernel, patches,
	linux-kernel, Bharat.Bhushan, pranav.sawargaonkar, p.fedin,
	iommu, Jean-Philippe.Brucker, julien.grall

Hi Alex,
On 04/21/2016 09:32 PM, Alex Williamson wrote:
> On Thu, 21 Apr 2016 14:18:09 +0200
> Eric Auger <eric.auger@linaro.org> wrote:
> 
>> Hi Alex, Robin,
>> On 04/19/2016 06:56 PM, Eric Auger wrote:
>>> This series introduces the dma-reserved-iommu api used to:
>>>
>>> - create/destroy an iova domain dedicated to reserved iova bindings
>>> - map/unmap physical addresses onto reserved IOVAs.
>>> - search for an existing reserved iova mapping matching a PA window
>>> - determine whether an msi needs to be iommu mapped
>>> - translate an msi_msg PA address into its IOVA counterpart  
>>
>> Following Robin's review, I understand that one important point we
>> have to clarify is how generic this API has to be.
>>
>> I agree with Robin on the fact that there is quite a lot of
>> duplication between this dma-reserved-iommu implementation and the
>> dma-iommu implementation. Maybe we could consider an msi-mapping API
>> implemented upon dma-iommu.c. This implementation would add MSI
>> doorbell binding list management, including ref counting and locking.
>>
>> We would need to add a map/unmap function taking an iova/pa/size as
>> parameters in current dma-iommu.c
>>
>> An important assumption is that the dma-mapping API and the
>> msi-mapping API must not be used concurrently (we would be trying to
>> use the same cookie to store a different iova_domain).
>>
>> Any thought/suggestion?
> 
> Hi Eric,
> 
> I'm not attached to a generic interface; the important part for me is
> that if we have an iommu domain with space reserved for MSI, the MSI
> setup and allocation code should handle that so we don't need to play
> the remapping tricks between vfio-pci and a vfio iommu driver that we
> saw in early drafts of this.  My first inclination is always to try to
> make a generic, re-usable interface, but I apologize if that's led us
> astray here and we really do want the simpler, MSI-specific
> interface.
> 
> For the IOMMU API, rather than just a DOMAIN_ATTR_MSI_MAPPING flag,
> what about DOMAIN_ATTR_MSI_GEOMETRY with both a get and set attribute?
> Maybe something like:
> 
> struct iommu_domain_msi_geometry {
> 	dma_addr_t	aperture_start;
> 	dma_addr_t	aperture_end;
> 	bool		fixed; /* or 'programmable' depending on your polarity preference */
> };
> 
> Calling 'get' on arm would return { 0, 0, false }, indicating it's
> programmable; 'set' would allocate the iovad as specified.  That would
> make it very easy to expand the API to x86 with reporting of the fixed
> MSI range, and it operates within the existing IOMMU API interfaces.
> Thanks,
Yes, I would be happy to handle this x86 query requirement. I would be
more inclined to define it at the "MSI mapping API" level, since the
IOMMU API implementation does not handle iova allocation, as Robin
argued at the beginning. When the "MSI mapping API" CONFIG is unset, I
would return the default x86 aperture.

Does it make sense?

Best Regards

Eric
> 
> Alex
> 
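
As a rough sketch, a caller could use the proposed attribute like this
(DOMAIN_ATTR_MSI_GEOMETRY and the struct are the proposal above, not
yet part of the IOMMU API; msi_base/msi_size are assumed inputs):

        struct iommu_domain_msi_geometry geo;
        int ret;

        ret = iommu_domain_get_attr(domain, DOMAIN_ATTR_MSI_GEOMETRY,
                                    &geo);
        if (ret)
                return ret;     /* attribute not supported */

        if (!geo.fixed) {
                /* ARM-like case: program an aperture of our choosing */
                geo.aperture_start = msi_base;
                geo.aperture_end = msi_base + msi_size - 1;
                ret = iommu_domain_set_attr(domain,
                                            DOMAIN_ATTR_MSI_GEOMETRY,
                                            &geo);
        }
        /* x86-like case: geo already describes the fixed MSI window */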

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v7 03/10] iommu: introduce a reserved iova cookie
@ 2016-04-22 12:36         ` Robin Murphy
  0 siblings, 0 replies; 127+ messages in thread
From: Robin Murphy @ 2016-04-22 12:36 UTC (permalink / raw)
  To: Eric Auger, eric.auger, alex.williamson, will.deacon, joro, tglx,
	jason, marc.zyngier, christoffer.dall, linux-arm-kernel
  Cc: patches, linux-kernel, Bharat.Bhushan, pranav.sawargaonkar,
	p.fedin, iommu, Jean-Philippe.Brucker, julien.grall

On 20/04/16 17:14, Eric Auger wrote:
> Hi Robin,
> On 04/20/2016 02:55 PM, Robin Murphy wrote:
>> On 19/04/16 17:56, Eric Auger wrote:
>>> This patch introduces some new fields in the iommu_domain struct,
>>> dedicated to reserved iova management.
>>>
>>> In a similar way as for the DMA mapping IOVA window, we need to
>>> store information related to a reserved IOVA window.
>>>
>>> The reserved_iova_cookie will store the reserved iova_domain
>>> handle. An RB tree indexed by physical address is introduced to
>>> store the host physical addresses bound to reserved IOVAs.
>>>
>>> Those physical addresses will correspond to MSI frame base
>>> addresses, also referred to as doorbells. Their number should be
>>> quite limited per domain.
>>>
>>> Also, a spin_lock is introduced to protect accesses to the
>>> iova_domain and RB tree. The choice of a spin_lock is driven by the
>>> fact that the RB tree will need to be accessed in MSI controller
>>> code that is not allowed to sleep.
>>>
>>> Signed-off-by: Eric Auger <eric.auger@linaro.org>
>>>
>>> ---
>>> v5 -> v6:
>>> - initialize reserved_binding_list
>>> - use a spinlock instead of a mutex
>>> ---
>>>    drivers/iommu/iommu.c | 2 ++
>>>    include/linux/iommu.h | 6 ++++++
>>>    2 files changed, 8 insertions(+)
>>>
>>> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
>>> index b9df141..f70ef3b 100644
>>> --- a/drivers/iommu/iommu.c
>>> +++ b/drivers/iommu/iommu.c
>>> @@ -1073,6 +1073,8 @@ static struct iommu_domain
>>> *__iommu_domain_alloc(struct bus_type *bus,
>>>
>>>        domain->ops  = bus->iommu_ops;
>>>        domain->type = type;
>>> +    spin_lock_init(&domain->reserved_lock);
>>> +    domain->reserved_binding_list = RB_ROOT;
>>>
>>>        return domain;
>>>    }
>>> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
>>> index b3e8c5b..60999db 100644
>>> --- a/include/linux/iommu.h
>>> +++ b/include/linux/iommu.h
>>> @@ -24,6 +24,7 @@
>>>    #include <linux/of.h>
>>>    #include <linux/types.h>
>>>    #include <linux/scatterlist.h>
>>> +#include <linux/spinlock.h>
>>>    #include <trace/events/iommu.h>
>>>
>>>    #define IOMMU_READ    (1 << 0)
>>> @@ -83,6 +84,11 @@ struct iommu_domain {
>>>        void *handler_token;
>>>        struct iommu_domain_geometry geometry;
>>>        void *iova_cookie;
>>> +    void *reserved_iova_cookie;
>>
>> Why exactly do we need this? From your description, it's for the user of
>> the domain to keep track of IOVA allocations in, but then that's
>> precisely what the iova_cookie exists for.
>
> I was not sure whether both APIs could be used concurrently, hence a
> separate cookie. If we only consider the MSI mapping use case, I guess
> we have either a DMA domain or a domain for VFIO, and I would agree
> with you, i.e. we can reuse the same cookie.

Unless somebody cooks up some paravirtualised monstrosity where the
guest driver somehow uses the host kernel's DMA mapping ops (thankfully,
I'm not sure how that would even be possible), they should always
be mutually exclusive.

(That said, I should probably add a sanity check to
iommu_put_dma_cookie() to ensure it only touches the cookies of
IOMMU_DOMAIN_DMA domains...)
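
A minimal sketch of such a guard, assuming the iommu_put_dma_cookie()
from dma-iommu.c (the exact placement of the check is illustrative):

void iommu_put_dma_cookie(struct iommu_domain *domain)
{
        /* Only DMA-type domains are expected to own a dma cookie */
        if (WARN_ON(domain->type != IOMMU_DOMAIN_DMA))
                return;

        /* existing cookie/iova_domain teardown continues here */
}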

>>> +    /* rb tree indexed by PA, for reserved bindings only */
>>> +    struct rb_root reserved_binding_list;
>>
>> Nit: that's more puzzling than helpful - "reserved binding" is
>> particularly vague and nondescript, and makes me think of anything but
>> MSI descriptors.
> My heart is torn between the advised genericity and the MSI use case.
> My natural, short-sighted inclination would steer me toward an MSI
> mapping dedicated API, but I am following the advice. As discussed
> with Alex, there are implementation details quite specific to the MSI
> problem, I think (the fact we store the "bindings" in an
> rb-tree/list, the locking).
>
> If Marc & Alex agree, I can retarget this API to be less generic.
>
>   Plus it's called a list but isn't a list (that said,
>> given that we'd typically only expect a handful of entries, and lookups
>> are hardly going to be a performance-critical bottleneck, would a simple
>> list not suffice?)
> I fully agree on that point. An rb-tree is overkill today for the MSI
> use case. Again, if we were to use this API for anything else, that
> might change the decision. But sure, we can refactor afterwards as
> needs arise. TBH the rb-tree is inherited from the vfio_iommu_type1
> dma tree, where that code was originally located.

Thinking some more, how feasible would it be to handle the IOVA 
management aspect within the existing tree, i.e. extend struct vfio_dma 
so an entry can represent different types of thing - DMA pages, MSI 
pages, arbitrary reservations - and link to more implementation-specific 
data (e.g. a refcounted MSI descriptor stored elsewhere in the domain) 
as necessary?

Robin.
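
A sketch of that direction, with the 'type' enum and 'priv' pointer as
hypothetical additions; the first five fields mirror the existing
vfio_dma in vfio_iommu_type1.c:

enum vfio_entry_type {
        VFIO_DMA_USER,          /* ordinary pinned user memory */
        VFIO_DMA_MSI,           /* MSI doorbell page(s) */
        VFIO_DMA_RESERVED,      /* other arbitrary reservation */
};

struct vfio_dma {
        struct rb_node          node;
        dma_addr_t              iova;   /* device address */
        unsigned long           vaddr;  /* process virtual addr */
        size_t                  size;   /* map size (bytes) */
        int                     prot;   /* IOMMU_READ/WRITE */
        enum vfio_entry_type    type;   /* hypothetical tag */
        void                    *priv;  /* e.g. refcounted MSI desc */
};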

>>
>>> +    /* protects reserved cookie and rbtree manipulation */
>>> +    spinlock_t reserved_lock;
>>
>> A cookie is an opaque structure, so any locking it needs would normally
>> be hidden within. If on the other hand it's not meant to be opaque at
>> this level, then it should probably be something more specific than a
>> void * (if at all, as above).
> agreed
>
> Thanks
>
> Eric
>>
>> Robin.
>>
>>>    };
>>>
>>>    enum iommu_cap {
>>>
>>
>

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v7 03/10] iommu: introduce a reserved iova cookie
@ 2016-04-22 13:02           ` Eric Auger
  0 siblings, 0 replies; 127+ messages in thread
From: Eric Auger @ 2016-04-22 13:02 UTC (permalink / raw)
  To: Robin Murphy, eric.auger, alex.williamson, will.deacon, joro,
	tglx, jason, marc.zyngier, christoffer.dall, linux-arm-kernel
  Cc: patches, linux-kernel, Bharat.Bhushan, pranav.sawargaonkar,
	p.fedin, iommu, Jean-Philippe.Brucker, julien.grall

Hi Robin,
On 04/22/2016 02:36 PM, Robin Murphy wrote:
> On 20/04/16 17:14, Eric Auger wrote:
>> Hi Robin,
>> On 04/20/2016 02:55 PM, Robin Murphy wrote:
>>> On 19/04/16 17:56, Eric Auger wrote:
>>>> This patch introduces some new fields in the iommu_domain struct,
>>>> dedicated to reserved iova management.
>>>>
>>>> In a similar way as for the DMA mapping IOVA window, we need to
>>>> store information related to a reserved IOVA window.
>>>>
>>>> The reserved_iova_cookie will store the reserved iova_domain
>>>> handle. An RB tree indexed by physical address is introduced to
>>>> store the host physical addresses bound to reserved IOVAs.
>>>>
>>>> Those physical addresses will correspond to MSI frame base
>>>> addresses, also referred to as doorbells. Their number should be
>>>> quite limited per domain.
>>>>
>>>> Also, a spin_lock is introduced to protect accesses to the
>>>> iova_domain and RB tree. The choice of a spin_lock is driven by the
>>>> fact that the RB tree will need to be accessed in MSI controller
>>>> code that is not allowed to sleep.
>>>>
>>>> Signed-off-by: Eric Auger <eric.auger@linaro.org>
>>>>
>>>> ---
>>>> v5 -> v6:
>>>> - initialize reserved_binding_list
>>>> - use a spinlock instead of a mutex
>>>> ---
>>>>    drivers/iommu/iommu.c | 2 ++
>>>>    include/linux/iommu.h | 6 ++++++
>>>>    2 files changed, 8 insertions(+)
>>>>
>>>> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
>>>> index b9df141..f70ef3b 100644
>>>> --- a/drivers/iommu/iommu.c
>>>> +++ b/drivers/iommu/iommu.c
>>>> @@ -1073,6 +1073,8 @@ static struct iommu_domain
>>>> *__iommu_domain_alloc(struct bus_type *bus,
>>>>
>>>>        domain->ops  = bus->iommu_ops;
>>>>        domain->type = type;
>>>> +    spin_lock_init(&domain->reserved_lock);
>>>> +    domain->reserved_binding_list = RB_ROOT;
>>>>
>>>>        return domain;
>>>>    }
>>>> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
>>>> index b3e8c5b..60999db 100644
>>>> --- a/include/linux/iommu.h
>>>> +++ b/include/linux/iommu.h
>>>> @@ -24,6 +24,7 @@
>>>>    #include <linux/of.h>
>>>>    #include <linux/types.h>
>>>>    #include <linux/scatterlist.h>
>>>> +#include <linux/spinlock.h>
>>>>    #include <trace/events/iommu.h>
>>>>
>>>>    #define IOMMU_READ    (1 << 0)
>>>> @@ -83,6 +84,11 @@ struct iommu_domain {
>>>>        void *handler_token;
>>>>        struct iommu_domain_geometry geometry;
>>>>        void *iova_cookie;
>>>> +    void *reserved_iova_cookie;
>>>
>>> Why exactly do we need this? From your description, it's for the user of
>>> the domain to keep track of IOVA allocations in, but then that's
>>> precisely what the iova_cookie exists for.
>>
>> I was not sure whether both APIs could be used concurrently, hence a
>> separate cookie. If we only consider the MSI mapping use case, I guess
>> we have either a DMA domain or a domain for VFIO, and I would agree
>> with you, i.e. we can reuse the same cookie.
> 
> Unless somebody cooks up some paravirtualised monstrosity where the
> guest driver somehow uses the host kernel's DMA mapping ops (thankfully,
> I'm not sure how that would even be possible), they should always
> be mutually exclusive.

OK thanks
> 
> (That said, I should probably add a sanity check to
> iommu_put_dma_cookie() to ensure it only touches the cookies of
> IOMMU_DOMAIN_DMA domains...)
> 
>>>> +    /* rb tree indexed by PA, for reserved bindings only */
>>>> +    struct rb_root reserved_binding_list;
>>>
>>> Nit: that's more puzzling than helpful - "reserved binding" is
>>> particularly vague and nondescript, and makes me think of anything but
>>> MSI descriptors.
>> My heart is torn between the advised genericity and the MSI use case.
>> My natural, short-sighted inclination would steer me toward an MSI
>> mapping dedicated API, but I am following the advice. As discussed
>> with Alex, there are implementation details quite specific to the MSI
>> problem, I think (the fact we store the "bindings" in an
>> rb-tree/list, the locking).
>>
>> If Marc & Alex agree, I can retarget this API to be less generic.
>>
>>   Plus it's called a list but isn't a list (that said,
>>> given that we'd typically only expect a handful of entries, and lookups
>>> are hardly going to be a performance-critical bottleneck, would a simple
>>> list not suffice?)
>> I fully agree on that point. An rb-tree is overkill today for the MSI
>> use case. Again, if we were to use this API for anything else, that
>> might change the decision. But sure, we can refactor afterwards as
>> needs arise. TBH the rb-tree is inherited from the vfio_iommu_type1
>> dma tree, where that code was originally located.
> 
> Thinking some more, how feasible would it be to handle the IOVA
> management aspect within the existing tree, i.e. extend struct vfio_dma
> so an entry can represent different types of thing - DMA pages, MSI
> pages, arbitrary reservations - and link to more implementation-specific
> data (e.g. a refcounted MSI descriptor stored elsewhere in the domain)
> as necessary?
It is feasible, and was approximately done at the early stages of the
series:
https://lkml.org/lkml/2016/1/28/803 &
https://lists.cs.columbia.edu/pipermail/kvmarm/2015-September/016607.html

Then, with the intent of doing something reusable, the trend was to put
it in the iommu layer instead of vfio_iommu_typeX.

I am currently trying to implement the "msi-iommu api" on top of the
dma-iommu api, based on the simplifications that we discussed (a rough
sketch follows the list):
- reuse the iova_cookie for the iova domain
- add an opaque msi_cookie that hides the msi doorbell list + its
  spinlock
- simplify locking by making sure the msi domain cannot disappear
  before the iommu domain is destroyed
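
Under those assumptions, the cookie could be shaped roughly like this
(all names are illustrative, not the final API):

struct iommu_msi_cookie {
        spinlock_t              lock;           /* protects doorbells */
        struct list_head        doorbells;      /* doorbell bindings */
};

struct iommu_msi_doorbell {
        struct list_head        next;
        struct kref             kref;   /* shared by MSIs using it */
        phys_addr_t             addr;   /* doorbell PA, e.g. the
                                           GITS_TRANSLATER frame */
        dma_addr_t              iova;   /* reserved IOVA mapping it */
        size_t                  size;
};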

If you agree, I would suggest we wait and see the outcome of this new
design and make a shared decision based on that code. It should be
available next week.

Best Regards

Eric

> 
> Robin.
> 
>>>
>>>> +    /* protects reserved cookie and rbtree manipulation */
>>>> +    spinlock_t reserved_lock;
>>>
>>> A cookie is an opaque structure, so any locking it needs would normally
>>> be hidden within. If on the other hand it's not meant to be opaque at
>>> this level, then it should probably be something more specific than a
>>> void * (if at all, as above).
>> agreed
>>
>> Thanks
>>
>> Eric
>>>
>>> Robin.
>>>
>>>>    };
>>>>
>>>>    enum iommu_cap {
>>>>
>>>
>>
> 

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v7 05/10] iommu/dma-reserved-iommu: reserved binding rb-tree and helpers
@ 2016-04-22 13:05         ` Robin Murphy
  0 siblings, 0 replies; 127+ messages in thread
From: Robin Murphy @ 2016-04-22 13:05 UTC (permalink / raw)
  To: Eric Auger, eric.auger, alex.williamson, will.deacon, joro, tglx,
	jason, marc.zyngier, christoffer.dall, linux-arm-kernel
  Cc: patches, linux-kernel, Bharat.Bhushan, pranav.sawargaonkar,
	p.fedin, iommu, Jean-Philippe.Brucker, julien.grall

On 20/04/16 17:18, Eric Auger wrote:
> Robin,
> On 04/20/2016 03:12 PM, Robin Murphy wrote:
>> On 19/04/16 17:56, Eric Auger wrote:
>>> We will need to track which host physical addresses are mapped to
>>> reserved IOVAs. To that end, we introduce a new RB tree indexed
>>> by physical address. This RB tree is used only for reserved IOVA
>>> bindings.
>>>
>>> It is expected that this RB tree will contain very few bindings.
>>
>> Sounds like a good reason in favour of using a list, and thus having
>> rather less code here ;)
>
> OK will move to a simple list.
>>
>>>   Those
>>> generally correspond to a single page mapping one MSI frame (a GICv2m
>>> frame or an ITS GITS_TRANSLATER frame).
>>>
>>> Signed-off-by: Eric Auger <eric.auger@linaro.org>
>>>
>>> ---
>>> v5 -> v6:
>>> - add comment about @d->reserved_lock to be held
>>>
>>> v3 -> v4:
>>> - that code was formerly in "iommu/arm-smmu: add a reserved binding RB
>>> tree"
>>> ---
>>>    drivers/iommu/dma-reserved-iommu.c | 63
>>> ++++++++++++++++++++++++++++++++++++++
>>>    1 file changed, 63 insertions(+)
>>>
>>> diff --git a/drivers/iommu/dma-reserved-iommu.c
>>> b/drivers/iommu/dma-reserved-iommu.c
>>> index 2562af0..f6fa18e 100644
>>> --- a/drivers/iommu/dma-reserved-iommu.c
>>> +++ b/drivers/iommu/dma-reserved-iommu.c
>>> @@ -23,6 +23,69 @@ struct reserved_iova_domain {
>>>        int prot; /* iommu protection attributes to be obeyed */
>>>    };
>>>
>>> +struct iommu_reserved_binding {
>>> +    struct kref        kref;
>>> +    struct rb_node        node;
>>> +    struct iommu_domain    *domain;
>>
>> Hang on, the tree these are in is already embedded in a domain. Ergo we
>> can't look them up without first knowing the domain they belong to, so
>> what purpose does this guy serve?
> This is used on kref_put. The release function takes a kref; we then
> use the container to retrieve the binding, and storing the domain
> here makes it possible to unlink the node.

Ah yes, I see now - that's annoyingly awkward. I think it could possibly 
be avoided in the list case (if the kref_put callback just did 
list_del_init(), the entry could then be checked for an empty list and 
disposed of outside the lock), but I'm not sure whether that's really 
worth the fuss. Oh well.

Robin.
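
A sketch of that list-based release, with illustrative names - note
that the release callback no longer needs a domain back-pointer:

static void msi_binding_release(struct kref *kref)
{
        struct iommu_msi_binding *b =
                container_of(kref, struct iommu_msi_binding, kref);

        /* Caller holds the list lock; just unlink, don't free here */
        list_del_init(&b->list);
}

void iommu_msi_put_binding(struct iommu_domain *d,
                           struct iommu_msi_binding *b)
{
        bool unlinked;

        spin_lock(&d->msi_lock);
        kref_put(&b->kref, msi_binding_release);
        unlinked = list_empty(&b->list); /* true only after release */
        spin_unlock(&d->msi_lock);

        if (unlinked)
                kfree(b);
}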

> Best Regards
>
> Eric
>>
>> Robin.
>>
>>> +    phys_addr_t        addr;
>>> +    dma_addr_t        iova;
>>> +    size_t            size;
>>> +};
>>> +
>>> +/* Reserved binding RB-tree manipulation */
>>> +
>>> +/* @d->reserved_lock must be held */
>>> +static struct iommu_reserved_binding *find_reserved_binding(
>>> +                    struct iommu_domain *d,
>>> +                    phys_addr_t start, size_t size)
>>> +{
>>> +    struct rb_node *node = d->reserved_binding_list.rb_node;
>>> +
>>> +    while (node) {
>>> +        struct iommu_reserved_binding *binding =
>>> +            rb_entry(node, struct iommu_reserved_binding, node);
>>> +
>>> +        if (start + size <= binding->addr)
>>> +            node = node->rb_left;
>>> +        else if (start >= binding->addr + binding->size)
>>> +            node = node->rb_right;
>>> +        else
>>> +            return binding;
>>> +    }
>>> +
>>> +    return NULL;
>>> +}
>>> +
>>> +/* @d->reserved_lock must be held */
>>> +static void link_reserved_binding(struct iommu_domain *d,
>>> +                  struct iommu_reserved_binding *new)
>>> +{
>>> +    struct rb_node **link = &d->reserved_binding_list.rb_node;
>>> +    struct rb_node *parent = NULL;
>>> +    struct iommu_reserved_binding *binding;
>>> +
>>> +    while (*link) {
>>> +        parent = *link;
>>> +        binding = rb_entry(parent, struct iommu_reserved_binding,
>>> +                   node);
>>> +
>>> +        if (new->addr + new->size <= binding->addr)
>>> +            link = &(*link)->rb_left;
>>> +        else
>>> +            link = &(*link)->rb_right;
>>> +    }
>>> +
>>> +    rb_link_node(&new->node, parent, link);
>>> +    rb_insert_color(&new->node, &d->reserved_binding_list);
>>> +}
>>> +
>>> +/* @d->reserved_lock must be held */
>>> +static void unlink_reserved_binding(struct iommu_domain *d,
>>> +                    struct iommu_reserved_binding *old)
>>> +{
>>> +    rb_erase(&old->node, &d->reserved_binding_list);
>>> +}
>>> +
>>>    int iommu_alloc_reserved_iova_domain(struct iommu_domain *domain,
>>>                         dma_addr_t iova, size_t size, int prot,
>>>                         unsigned long order)
>>>
>>
>

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v7 01/10] iommu: Add DOMAIN_ATTR_MSI_MAPPING attribute
@ 2016-04-22 14:49             ` Robin Murphy
  0 siblings, 0 replies; 127+ messages in thread
From: Robin Murphy @ 2016-04-22 14:49 UTC (permalink / raw)
  To: Eric Auger, eric.auger, alex.williamson, will.deacon, joro, tglx,
	jason, marc.zyngier, christoffer.dall, linux-arm-kernel
  Cc: patches, linux-kernel, Bharat.Bhushan, pranav.sawargaonkar,
	p.fedin, iommu, Jean-Philippe.Brucker, julien.grall

On 22/04/16 13:00, Eric Auger wrote:
> Hi Robin,
> On 04/22/2016 01:31 PM, Robin Murphy wrote:
>> On 20/04/16 16:58, Eric Auger wrote:
>>> Hi Robin,
>>> On 04/20/2016 02:47 PM, Robin Murphy wrote:
>>>> Hi Eric,
>>>>
>>>> On 19/04/16 17:56, Eric Auger wrote:
>>>>> Introduce a new DOMAIN_ATTR_MSI_MAPPING domain attribute. If supported,
>>>>> this means the MSI addresses need to be mapped in the IOMMU.
>>>>>
>>>>> x86 IOMMUs typically don't expose the attribute since on x86, MSI write
>>>>> transaction addresses are always within the 1MB PA region [FEE0_0000h -
>>>>> FEF0_0000h] window, which directly targets the APIC configuration space
>>>>> and hence bypasses the IOMMU. On ARM and PowerPC however MSI
>>>>> transactions are conveyed through the IOMMU.
>>>>
>>>> What's stopping us from simply inferring this from the domain's IOMMU
>>>> not advertising interrupt remapping capabilities?
>>> My current understanding is that it is not possible:
>>> on x86 CAP_INTR_REMAP is not systematically exposed (the feature can be
>>> disabled) and MSIs are never mapped in the IOMMU I think.
>>
>> Not sure I follow - if the feature is disabled such that the IOMMU
>> doesn't isolate MSIs, then it's no different a situation from the SMMU, no?
>
> sorry, I understood you wanted to use IOMMU_CAP_INTR_REMAP as the sole
> criterion to detect whether MSI mapping was requested.
>>
>> My point was that this logic:
>>
>>      if (IOMMU_CAP_INTR_REMAP)
>>          we're good
>>      else if (DOMAIN_ATTR_MSI_MAPPING)
>>          if (acquire_msi_remapping_resources(domain))
>>              we're good
>>          else
>>              oh no!
>>      else
>>          oh no!
>>
>> should be easily reducible to this:
>>
>>      if (IOMMU_CAP_INTR_REMAP)
>>          we're good
>>      else if (acquire_msi_remapping_resources(domain))
>
> But can't we imagine a mix of SMMUs on the same platform, some
> requesting MSI mapping and some that don't? As soon as an SMMU requires
> MSI mapping, CONFIG_IOMMU_DMA_RESERVED is set and
> acquire_msi_remapping_resources(domain) will be implemented & succeed.
> Doesn't that lead to a wrong decision? Am I missing something, or do you
> consider this situation far-fetched?

Sorry, what was implicit there was that the imaginary 
acquire_msi_remapping_resources(*IOMMU* domain) call still involves 
going all the way down to check for MSI_FLAG_IRQ_REMAPPING in the 
relevant place.

Either way, now that I consider it further, a flag on the IOMMU domain 
is not just unnecessary, I think it's actually fundamentally incorrect: 
picture a system which for some crazy reason has both a GICv3 ITS plus 
some other dumb v2m-like MMIO-SPI bridge - whether a device's MSI domain 
targets the (safe) ITS or the (unsafe) bridge isn't a property of the 
IOMMU domain it's trying to attach to; if you can't rely on the IOMMU 
itself to isolate MSIs, then you can only know whether to allow or 
reject any given group by inspecting every device in that group to make 
sure they all have isolation provided by their MSI domains and that the 
IOMMU domain has all the relevant doorbell mappings ready.
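
A rough sketch of that per-group inspection, assuming the
MSI_FLAG_IRQ_REMAPPING msi_domain_info flag discussed in this thread
(the helper name is made up):

        static int msi_isolation_check(struct device *dev, void *data)
        {
                struct irq_domain *d = dev_get_msi_domain(dev);
                struct msi_domain_info *info = d ? msi_get_domain_info(d) : NULL;

                /* fail if this device's MSI domain does not isolate MSIs */
                return (info && (info->flags & MSI_FLAG_IRQ_REMAPPING)) ?
                        0 : -EPERM;
        }

        /* the group is only acceptable if every device in it passes */
        bool safe = !iommu_group_for_each_dev(group, NULL, msi_isolation_check);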

I guess the allow_unsafe_interrupts case might complicate matters beyond 
the logic I sketched out, because then we might actually care about the 
difference between "is isolation provided?" and "are sufficient 
IOVA/descriptor resources available?", but the main point still stands.

Robin.

> Thanks
>
> Eric
>
>>          we're good
>>      else
>>          oh no!    // Don't care whether the domain ran out of
>>              // resources or simply doesn't support it,
>>              // either way we can't proceed.
>>
>> Robin.
>>
>>> Best Regards
>>>
>>> Eric
>>>>
>>>> Robin.
>>>>
>>>>> Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com>
>>>>> Signed-off-by: Eric Auger <eric.auger@linaro.org>
>>>>>
>>>>> ---
>>>>>
>>>>> v4 -> v5:
>>>>> - introduce the user in the next patch
>>>>>
>>>>> RFC v1 -> v1:
>>>>> - the data field is not used
>>>>> - for this attribute domain_get_attr simply returns 0 if the
>>>>> MSI_MAPPING
>>>>>      capability is needed or <0 if not.
>>>>> - removed struct iommu_domain_msi_maps
>>>>> ---
>>>>>     include/linux/iommu.h | 1 +
>>>>>     1 file changed, 1 insertion(+)
>>>>>
>>>>> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
>>>>> index 62a5eae..b3e8c5b 100644
>>>>> --- a/include/linux/iommu.h
>>>>> +++ b/include/linux/iommu.h
>>>>> @@ -113,6 +113,7 @@ enum iommu_attr {
>>>>>         DOMAIN_ATTR_FSL_PAMU_ENABLE,
>>>>>         DOMAIN_ATTR_FSL_PAMUV1,
>>>>>         DOMAIN_ATTR_NESTING,    /* two stages of translation */
>>>>> +    DOMAIN_ATTR_MSI_MAPPING, /* Require MSIs mapping in iommu */
>>>>>         DOMAIN_ATTR_MAX,
>>>>>     };
>>>>>
>>>>>
>>>>
>>>
>>
>
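
For reference, a hedged sketch of how a caller such as VFIO could probe
the proposed attribute through the existing iommu_domain_get_attr()
interface (per the changelog above, the data argument is unused for
this attribute):

        /* returns 0 when the domain requires MSIs to be IOMMU-mapped */
        if (iommu_domain_get_attr(domain, DOMAIN_ATTR_MSI_MAPPING, NULL) == 0) {
                /* set up the reserved IOVA domain for MSI doorbells */
        }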

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v7 03/10] iommu: introduce a reserved iova cookie
@ 2016-04-22 14:53             ` Eric Auger
  0 siblings, 0 replies; 127+ messages in thread
From: Eric Auger @ 2016-04-22 14:53 UTC (permalink / raw)
  To: Robin Murphy, eric.auger, alex.williamson, will.deacon, joro,
	tglx, jason, marc.zyngier, christoffer.dall, linux-arm-kernel
  Cc: patches, linux-kernel, Bharat.Bhushan, pranav.sawargaonkar,
	p.fedin, iommu, Jean-Philippe.Brucker, julien.grall

Hi Robin,
On 04/22/2016 03:02 PM, Eric Auger wrote:
> Hi Robin,
> On 04/22/2016 02:36 PM, Robin Murphy wrote:
>> On 20/04/16 17:14, Eric Auger wrote:
>>> Hi Robin,
>>> On 04/20/2016 02:55 PM, Robin Murphy wrote:
>>>> On 19/04/16 17:56, Eric Auger wrote:
>>>>> This patch introduces some new fields in the iommu_domain struct,
>>>>> dedicated to reserved iova management.
>>>>>
>>>>> In a similar way to the DMA mapping IOVA window, we need to store
>>>>> information related to a reserved IOVA window.
>>>>>
>>>>> The reserved_iova_cookie will store the reserved iova_domain
>>>>> handle. An RB tree indexed by physical address is introduced to
>>>>> store the host physical addresses bound to reserved IOVAs.
>>>>>
>>>>> Those physical addresses will correspond to MSI frame base
>>>>> addresses, also referred to as doorbells. Their number should be
>>>>> quite limited per domain.
>>>>>
>>>>> Also a spin_lock is introduced to protect accesses to the iova_domain
>>>>> and RB tree. The choice of a spin_lock is driven by the fact that the
>>>>> RB tree will need to be accessed from MSI controller code that is not
>>>>> allowed to sleep.
>>>>>
>>>>> Signed-off-by: Eric Auger <eric.auger@linaro.org>
>>>>>
>>>>> ---
>>>>> v5 -> v6:
>>>>> - initialize reserved_binding_list
>>>>> - use a spinlock instead of a mutex
>>>>> ---
>>>>>    drivers/iommu/iommu.c | 2 ++
>>>>>    include/linux/iommu.h | 6 ++++++
>>>>>    2 files changed, 8 insertions(+)
>>>>>
>>>>> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
>>>>> index b9df141..f70ef3b 100644
>>>>> --- a/drivers/iommu/iommu.c
>>>>> +++ b/drivers/iommu/iommu.c
>>>>> @@ -1073,6 +1073,8 @@ static struct iommu_domain
>>>>> *__iommu_domain_alloc(struct bus_type *bus,
>>>>>
>>>>>        domain->ops  = bus->iommu_ops;
>>>>>        domain->type = type;
>>>>> +    spin_lock_init(&domain->reserved_lock);
>>>>> +    domain->reserved_binding_list = RB_ROOT;
>>>>>
>>>>>        return domain;
>>>>>    }
>>>>> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
>>>>> index b3e8c5b..60999db 100644
>>>>> --- a/include/linux/iommu.h
>>>>> +++ b/include/linux/iommu.h
>>>>> @@ -24,6 +24,7 @@
>>>>>    #include <linux/of.h>
>>>>>    #include <linux/types.h>
>>>>>    #include <linux/scatterlist.h>
>>>>> +#include <linux/spinlock.h>
>>>>>    #include <trace/events/iommu.h>
>>>>>
>>>>>    #define IOMMU_READ    (1 << 0)
>>>>> @@ -83,6 +84,11 @@ struct iommu_domain {
>>>>>        void *handler_token;
>>>>>        struct iommu_domain_geometry geometry;
>>>>>        void *iova_cookie;
>>>>> +    void *reserved_iova_cookie;
>>>>
>>>> Why exactly do we need this? From your description, it's for the user of
>>>> the domain to keep track of IOVA allocations in, but then that's
>>>> precisely what the iova_cookie exists for.
>>>
>>> I was not sure whether both APIs could be used concurrently, hence a
>>> separate cookie. If we only consider the MSI mapping use case I guess we
>>> are either with a DMA domain or with a domain for VFIO, and I would agree
>>> with you, i.e. we can reuse the same cookie.
>>
>> Unless somebody cooks up some paravirtualised monstrosity where the
>> guest driver somehow uses the host kernel's DMA mapping ops (thankfully,
>> I'm not sure how that would even be possible), then they should always
>> be mutually exclusive.
> 
> OK thanks
>>
>> (That said, I should probably add a sanity check to
>> iommu_dma_put_cookie() to ensure it only touches the cookies of
>> IOMMU_DOMAIN_DMA domains...)
>>
>>>>> +    /* rb tree indexed by PA, for reserved bindings only */
>>>>> +    struct rb_root reserved_binding_list;
>>>>
>>>> Nit: that's more puzzling than helpful - "reserved binding" is
>>>> particularly vague and nondescript, and makes me think of anything but
>>>> MSI descriptors.
>>> my heart is torn between the advised genericity and the MSI use case. My
>>> natural short-sighted inclination would lead me toward a dedicated MSI
>>> mapping API, but I am following advice. As discussed with Alex, there are
>>> implementation details pretty specific to the MSI problem I think (the
>>> fact we store the "bindings" in an rb-tree/list, locking).
>>>
>>> If Marc & Alex agree, I can retarget this API to be less generic.
>>>
>>>>   Plus it's called a list but isn't a list (that said,
>>>> given that we'd typically only expect a handful of entries, and lookups
>>>> are hardly going to be a performance-critical bottleneck, would a simple
>>>> list not suffice?)
>>> I fully agree on that point. An rb-tree is overkill today for the MSI use
>>> case. Again, if we were to use this API for anything else, that may
>>> change the decision. But sure, we can refactor afterwards as needed. TBH
>>> the rb-tree is inherited from the vfio_iommu_type1 dma tree where that
>>> code was originally located.
>>
>> Thinking some more, how feasible would it be to handle the IOVA
>> management aspect within the existing tree, i.e. extend struct vfio_dma
>> so an entry can represent different types of thing - DMA pages, MSI
>> pages, arbitrary reservations - and link to more implementation-specific
>> data (e.g. a refcounted MSI descriptor stored elsewhere in the domain)
>> as necessary?
> it is feasible and was approximately done in the early versions of
> the series:
> https://lkml.org/lkml/2016/1/28/803 &
> https://lists.cs.columbia.edu/pipermail/kvmarm/2015-September/016607.html

I forgot to mention that the locking mechanism currently used in
vfio_iommu_type1 is based on a mutex. Hence it is not compatible with MSI
layer requirements, where msi_domain_(de)activate and msi_set_affinity
must be atomic.

In the MSI case I think an RCU list might be quite appropriate.
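
A minimal sketch of such an RCU-protected doorbell list (structure and
names hypothetical): lookups from atomic MSI paths run under
rcu_read_lock(), while insertion/removal stay under a spinlock, with
list_del_rcu() followed by kfree_rcu() on the removal side.

        struct msi_doorbell_binding {
                struct list_head        node;
                phys_addr_t             addr;
                size_t                  size;
                dma_addr_t              iova;
                struct kref             kref;
                struct rcu_head         rcu;
        };

        /* atomic context: called at MSI message composition time,
         * under rcu_read_lock() */
        static struct msi_doorbell_binding *
        find_doorbell_binding(struct list_head *doorbells, phys_addr_t addr)
        {
                struct msi_doorbell_binding *b;

                list_for_each_entry_rcu(b, doorbells, node)
                        if (addr >= b->addr && addr < b->addr + b->size)
                                return b;
                return NULL;
        }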

Best Regards

Eric
> 
> Then, with the intent of doing something reusable, the trend was to put it
> in the iommu layer instead of vfio_iommu_typeX.
> 
> I am currently trying to implement the "msi-iommu api" on top of the
> dma-iommu api, based on the simplifications that we discussed:
> - reuse iova_cookie for iova domain
> - add an opaque msi_cookie that hides the msi doorbell list + its spinlock
> - simplify locking by making sure the msi domain cannot disappear before
> the iommu domain destruction
> 
> If you agree, I would suggest waiting for the outcome of this new
> design and making a shared decision based on that code. It should be
> available next week.
> 
> Best Regards
> 
> Eric
> 
>>
>> Robin.
>>
>>>>
>>>>> +    /* protects reserved cookie and rbtree manipulation */
>>>>> +    spinlock_t reserved_lock;
>>>>
>>>> A cookie is an opaque structure, so any locking it needs would normally
>>>> be hidden within. If on the other hand it's not meant to be opaque at
>>>> this level, then it should probably be something more specific than a
>>>> void * (if at all, as above).
>>> agreed
>>>
>>> Thanks
>>>
>>> Eric
>>>>
>>>> Robin.
>>>>
>>>>>    };
>>>>>
>>>>>    enum iommu_cap {
>>>>>
>>>>
>>>
>>
> 

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v7 01/10] iommu: Add DOMAIN_ATTR_MSI_MAPPING attribute
@ 2016-04-22 15:33               ` Eric Auger
  0 siblings, 0 replies; 127+ messages in thread
From: Eric Auger @ 2016-04-22 15:33 UTC (permalink / raw)
  To: Robin Murphy, eric.auger, alex.williamson, will.deacon, joro,
	tglx, jason, marc.zyngier, christoffer.dall, linux-arm-kernel
  Cc: patches, linux-kernel, Bharat.Bhushan, pranav.sawargaonkar,
	p.fedin, iommu, Jean-Philippe.Brucker, julien.grall

Hi Robin,
On 04/22/2016 04:49 PM, Robin Murphy wrote:
> On 22/04/16 13:00, Eric Auger wrote:
>> Hi Robin,
>> On 04/22/2016 01:31 PM, Robin Murphy wrote:
>>> On 20/04/16 16:58, Eric Auger wrote:
>>>> Hi Robin,
>>>> On 04/20/2016 02:47 PM, Robin Murphy wrote:
>>>>> Hi Eric,
>>>>>
>>>>> On 19/04/16 17:56, Eric Auger wrote:
>>>>>> Introduce a new DOMAIN_ATTR_MSI_MAPPING domain attribute. If
>>>>>> supported,
>>>>>> this means the MSI addresses need to be mapped in the IOMMU.
>>>>>>
>>>>>> x86 IOMMUs typically don't expose the attribute since on x86, MSI
>>>>>> write transaction addresses are always within the 1MB PA region
>>>>>> [FEE0_0000h - FEF0_0000h] window, which directly targets the APIC
>>>>>> configuration space and hence bypasses the IOMMU. On ARM and PowerPC
>>>>>> however MSI transactions are conveyed through the IOMMU.
>>>>>
>>>>> What's stopping us from simply inferring this from the domain's IOMMU
>>>>> not advertising interrupt remapping capabilities?
>>>> My current understanding is that it is not possible:
>>>> on x86 CAP_INTR_REMAP is not systematically exposed (the feature can be
>>>> disabled) and MSIs are never mapped in the IOMMU I think.
>>>
>>> Not sure I follow - if the feature is disabled such that the IOMMU
>>> doesn't isolate MSIs, then it's no different a situation from the
>>> SMMU, no?
>>
>> sorry, I understood you wanted to use IOMMU_CAP_INTR_REMAP as the sole
>> criterion to detect whether MSI mapping was requested.
>>>
>>> My point was that this logic:
>>>
>>>      if (IOMMU_CAP_INTR_REMAP)
>>>          we're good
>>>      else if (DOMAIN_ATTR_MSI_MAPPING)
>>>          if (acquire_msi_remapping_resources(domain))
>>>              we're good
>>>          else
>>>              oh no!
>>>      else
>>>          oh no!
>>>
>>> should be easily reducible to this:
>>>
>>>      if (IOMMU_CAP_INTR_REMAP)
>>>          we're good
>>>      else if (acquire_msi_remapping_resources(domain))
>>
>> But can't we imagine a mix of SMMUs on the same platform, some
>> requesting MSI mapping and some that don't? As soon as an SMMU requires
>> MSI mapping, CONFIG_IOMMU_DMA_RESERVED is set and
>> acquire_msi_remapping_resources(domain) will be implemented & succeed.
>> Doesn't that lead to a wrong decision? Am I missing something, or do you
>> consider this situation far-fetched?
> 
> Sorry, what was implicit there was that the imaginary
> acquire_msi_remapping_resources(*IOMMU* domain) call still involves
> going all the way down to check for MSI_FLAG_IRQ_REMAPPING in the
> relevant place.
> 
> Either way, now that I consider it further, a flag on the IOMMU domain
> is not just unnecessary, I think it's actually fundamentally incorrect:
> picture a system which for some crazy reason has both a GICv3 ITS plus
> some other dumb v2m-like MMIO-SPI bridge - whether a device's MSI domain
> targets the (safe) ITS or the (unsafe) bridge isn't a property of the
> IOMMU domain it's trying to attach to; if you can't rely on the IOMMU
> itself to isolate MSIs, then you can only know whether to allow or
> reject any given group by inspecting every device in that group to make
> sure they all have isolation provided by their MSI domains and that the
> IOMMU domain has all the relevant doorbell mappings ready.

Yes, we already discussed that (inspection of all devices) with Alex and
we concluded it was too complex for the benefits. What is currently
implemented in vfio_iommu_type1 to figure out whether IRQs are safe is:
- either the iommu domain implements IRQ remapping (x86 case), or
- the interrupt controller implements IRQ remapping (arm case)

as a result, assigning a device connected to a GICv2m is unsafe and
assigning a device connected to an ITS is safe.

Whether or not an IOVA is available impacts the assignment
functionality, but not really the safety (iommu faults).
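
A condensed sketch of that check: the existing iommu_capable() test
covers the IOMMU side, and a hypothetical vfio_group_msi_isolated()
helper stands in for the interrupt-controller side.

        bool msi_remap;

        msi_remap = iommu_capable(bus, IOMMU_CAP_INTR_REMAP) ||
                    vfio_group_msi_isolated(group);  /* hypothetical */

        if (!allow_unsafe_interrupts && !msi_remap) {
                pr_warn("%s: No interrupt remapping support\n", __func__);
                return -EPERM;
        }
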
> 
> I guess the allow_unsafe_interrupts case might complicate matters beyond
> the logic I sketched out, because then we might actually care about the
> difference between "is isolation provided?" and "are sufficient
> IOVA/descriptor resources available?", but the main point still stands.

So just to make sure I did not misunderstand you, since we are talking
about two orthogonal concepts:
1) the requirement to map the MSI in the IOMMU,
2) the capability to isolate MSIs,

do we eventually:
- keep the IOMMU domain attribute, DOMAIN_ATTR_MSI_MAPPING, that tells
whether the MSIs need to be mapped,
- keep the MSI domain attribute that tells whether the interrupt
controller implements MSI isolation,
- remove the IRQ_REMAPPING cap from arm-smmus?

Best Regards

Eric
> 
> Robin.
> 
>> Thanks
>>
>> Eric
>>
>>>          we're good
>>>      else
>>>          oh no!    // Don't care whether the domain ran out of
>>>              // resources or simply doesn't support it,
>>>              // either way we can't proceed.
>>>
>>> Robin.
>>>
>>>> Best Regards
>>>>
>>>> Eric
>>>>>
>>>>> Robin.
>>>>>
>>>>>> Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com>
>>>>>> Signed-off-by: Eric Auger <eric.auger@linaro.org>
>>>>>>
>>>>>> ---
>>>>>>
>>>>>> v4 -> v5:
>>>>>> - introduce the user in the next patch
>>>>>>
>>>>>> RFC v1 -> v1:
>>>>>> - the data field is not used
>>>>>> - for this attribute domain_get_attr simply returns 0 if the
>>>>>> MSI_MAPPING
>>>>>>      capability is needed or <0 if not.
>>>>>> - removed struct iommu_domain_msi_maps
>>>>>> ---
>>>>>>     include/linux/iommu.h | 1 +
>>>>>>     1 file changed, 1 insertion(+)
>>>>>>
>>>>>> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
>>>>>> index 62a5eae..b3e8c5b 100644
>>>>>> --- a/include/linux/iommu.h
>>>>>> +++ b/include/linux/iommu.h
>>>>>> @@ -113,6 +113,7 @@ enum iommu_attr {
>>>>>>         DOMAIN_ATTR_FSL_PAMU_ENABLE,
>>>>>>         DOMAIN_ATTR_FSL_PAMUV1,
>>>>>>         DOMAIN_ATTR_NESTING,    /* two stages of translation */
>>>>>> +    DOMAIN_ATTR_MSI_MAPPING, /* Require MSIs mapping in iommu */
>>>>>>         DOMAIN_ATTR_MAX,
>>>>>>     };
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
> 

* Re: [PATCH v7 00/10] KVM PCIe/MSI passthrough on ARM/ARM64: kernel part 1/3: iommu changes
@ 2016-04-22 19:07         ` Alex Williamson
  0 siblings, 0 replies; 127+ messages in thread
From: Alex Williamson @ 2016-04-22 19:07 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger, robin.murphy, will.deacon, joro, tglx, jason,
	marc.zyngier, christoffer.dall, linux-arm-kernel, patches,
	linux-kernel, Bharat.Bhushan, pranav.sawargaonkar, p.fedin,
	iommu, Jean-Philippe.Brucker, julien.grall

On Fri, 22 Apr 2016 14:31:18 +0200
Eric Auger <eric.auger@linaro.org> wrote:

> Hi Alex,
> On 04/21/2016 09:32 PM, Alex Williamson wrote:
> > On Thu, 21 Apr 2016 14:18:09 +0200
> > Eric Auger <eric.auger@linaro.org> wrote:
> >   
> >> Hi Alex, Robin,
> >> On 04/19/2016 06:56 PM, Eric Auger wrote:  
> >>> This series introduces the dma-reserved-iommu api used to:
> >>>
> >>> - create/destroy an iova domain dedicated to reserved iova bindings
> >>> - map/unmap physical addresses onto reserved IOVAs.
> >>> - search for an existing reserved iova mapping matching a PA window
> >>> - determine whether an msi needs to be iommu mapped
> >>> - translate an msi_msg PA address into its IOVA counterpart    
> >>
> >> Following Robin's review, I understand one important point we have to
> >> clarify is how generic this API has to be.
> >>
> >> I agree with Robin on the fact there is quite a lot of duplication
> >> between this dma-reserved-iommu implementation and dma-iommu
> >> implementation. Maybe we could consider an msi-mapping API
> >> implementation upon dma-iommu.c. This implementation would add MSI
> >> doorbell binding list management, including ref counting and locking.
> >>
> >> We would need to add a map/unmap function taking an iova/pa/size as
> >> parameters in current dma-iommu.c
> >>
> >> An important assumption is that the dma-mapping API and the msi-mapping
> >> API must not be used concurrently (we would be trying to use the same
> >> cookie to store a different iova_domain).
> >>
> >> Any thought/suggestion?  
> > 
> > Hi Eric,
> > 
> > I'm not attached to a generic interface, the important part for me is
> > that if we have an iommu domain with space reserved for MSI, the MSI
> > setup and allocation code should handle that so we don't need to play
> > the remapping tricks between vfio-pci and a vfio iommu driver that we
> > saw in early drafts of this.  My first inclination is always to try to
> > make a generic, re-usable interface, but I apologize if that's led us
> > astray here and we really do want the more simple, MSI specific
> > interface.
> > 
> > For the IOMMU API, rather than just a DOMAIN_ATTR_MSI_MAPPING flag,
> > what about DOMAIN_ATTR_MSI_GEOMETRY with both a get and set attribute?
> > Maybe something like:
> > 
> > struct iommu_domain_msi_geometry {
> > 	dma_addr_t	aperture_start;
> > 	dma_addr_t	aperture_end;
> > 	bool		fixed; /* or 'programmable' depending on your polarity preference */
> > };
> > 
> > Calling 'get' on arm would return { 0, 0, false }, indicating it's
> > programmable; 'set' would allocate the iovad as specified.  That would
> > make it very easy to expand the API to x86 with reporting of the fixed
> > MSI range and it operates within the existing IOMMU API interfaces.
> > Thanks,  
> Yes I would be happy to handle this x86 query requirement. I would be
> more inclined to define it at the "MSI mapping API" level, since the
> IOMMU API implementation does not handle iova allocation, as Robin
> argued at the beginning. When the "MSI MAPPING API" CONFIG is unset, I
> would return the default x86 aperture.
> 
> Does it make sense?

It's not entirely clear to me if x86 would be participating in this MSI
mapping API given the implicit handling within iommu/irq-remapping.
It might make sense if x86 iommus simply left a gap in their existing
geometry reporting through the iommu api.  I guess we'll see in your
next draft ;)  Thanks,

Alex
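
As a rough illustration of the get/set semantics Alex describes, a caller such as VFIO might drive the proposed attribute through the existing iommu_domain_get_attr()/iommu_domain_set_attr() calls as below. This is a sketch of the proposal only: DOMAIN_ATTR_MSI_GEOMETRY and struct iommu_domain_msi_geometry exist nowhere but in the mail above, and MSI_IOVA_BASE/MSI_IOVA_SIZE are made-up placeholders for whatever window the caller picks.

    static int vfio_setup_msi_geometry(struct iommu_domain *domain)
    {
            struct iommu_domain_msi_geometry geo;
            int ret;

            ret = iommu_domain_get_attr(domain, DOMAIN_ATTR_MSI_GEOMETRY, &geo);
            if (ret)
                    return ret;     /* IOMMU doesn't report an MSI geometry */

            if (geo.fixed) {
                    /* x86-like: a window is already carved out; just keep
                     * user DMA mappings clear of
                     * [aperture_start, aperture_end] */
                    return 0;
            }

            /* ARM-like: pick a window and ask the IOMMU layer to
             * allocate the iova domain backing it */
            geo.aperture_start = MSI_IOVA_BASE;
            geo.aperture_end = MSI_IOVA_BASE + MSI_IOVA_SIZE - 1;
            return iommu_domain_set_attr(domain, DOMAIN_ATTR_MSI_GEOMETRY,
                                         &geo);
    }

An x86 IOMMU could then answer the 'get' with its fixed MSI window and skip the 'set' path entirely, which matches Alex's suggestion of simply leaving a gap in the existing geometry reporting.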

end of thread, other threads:[~2016-04-22 19:07 UTC | newest]

Thread overview: 127+ messages
-- links below jump to the message on this page --
2016-04-19 16:56 [PATCH v7 00/10] KVM PCIe/MSI passthrough on ARM/ARM64: kernel part 1/3: iommu changes Eric Auger
2016-04-19 16:56 ` [PATCH v7 01/10] iommu: Add DOMAIN_ATTR_MSI_MAPPING attribute Eric Auger
2016-04-20 12:47   ` Robin Murphy
2016-04-20 15:58     ` Eric Auger
2016-04-22 11:31       ` Robin Murphy
2016-04-22 12:00         ` Eric Auger
2016-04-22 14:49           ` Robin Murphy
2016-04-22 15:33             ` Eric Auger
2016-04-19 16:56 ` [PATCH v7 02/10] iommu/arm-smmu: advertise " Eric Auger
2016-04-19 16:56 ` [PATCH v7 03/10] iommu: introduce a reserved iova cookie Eric Auger
2016-04-20 12:55   ` Robin Murphy
2016-04-20 16:14     ` Eric Auger
2016-04-22 12:36       ` Robin Murphy
2016-04-22 13:02         ` Eric Auger
2016-04-22 14:53           ` Eric Auger
2016-04-19 16:56 ` [PATCH v7 04/10] iommu/dma-reserved-iommu: alloc/free_reserved_iova_domain Eric Auger
2016-04-20 13:03   ` Robin Murphy
2016-04-20 13:11     ` Eric Auger
2016-04-19 16:56 ` [PATCH v7 05/10] iommu/dma-reserved-iommu: reserved binding rb-tree and helpers Eric Auger
2016-04-20 13:12   ` Robin Murphy
2016-04-20 16:18     ` Eric Auger
2016-04-22 13:05       ` Robin Murphy
2016-04-19 16:56 ` [PATCH v7 06/10] iommu/dma-reserved-iommu: iommu_get/put_reserved_iova Eric Auger
2016-04-20 16:58   ` Robin Murphy
2016-04-21  8:43     ` Eric Auger
2016-04-19 16:56 ` [PATCH v7 07/10] iommu/dma-reserved-iommu: delete bindings in iommu_free_reserved_iova_domain Eric Auger
2016-04-20 17:05   ` Robin Murphy
2016-04-21  8:40     ` Eric Auger
2016-04-19 16:56 ` [PATCH v7 08/10] iommu/dma-reserved_iommu: iommu_msi_mapping_desc_to_domain Eric Auger
2016-04-20 17:19   ` Robin Murphy
2016-04-21  8:40     ` Eric Auger
2016-04-19 16:56 ` [PATCH v7 09/10] iommu/dma-reserved-iommu: iommu_msi_mapping_translate_msg Eric Auger
2016-04-20  9:38   ` Marc Zyngier
2016-04-20 12:50     ` Eric Auger
2016-04-20 17:28   ` Robin Murphy
2016-04-21  8:40     ` Eric Auger
2016-04-19 16:56 ` [PATCH v7 10/10] iommu/arm-smmu: call iommu_free_reserved_iova_domain on domain destruction Eric Auger
2016-04-20 17:35   ` Robin Murphy
2016-04-21  8:39     ` Eric Auger
2016-04-21 12:18 ` [PATCH v7 00/10] KVM PCIe/MSI passthrough on ARM/ARM64: kernel part 1/3: iommu changes Eric Auger
2016-04-21 19:32   ` Alex Williamson
2016-04-22 12:31     ` Eric Auger
2016-04-22 19:07       ` Alex Williamson
