All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC PATCH] iommu/arm-smmu: Add module parameter to set msi iova address
@ 2020-05-27 16:03 ` Srinath Mannam via iommu
  0 siblings, 0 replies; 37+ messages in thread
From: Srinath Mannam @ 2020-05-27 16:03 UTC (permalink / raw)
  To: Will Deacon, Robin Murphy, Joerg Roedel
  Cc: bcm-kernel-feedback-list, linux-arm-kernel, iommu, linux-kernel,
	Srinath Mannam

This patch gives the provision to change default value of MSI IOVA base
to platform's suitable IOVA using module parameter. The present
hardcoded MSI IOVA base may not be the accessible IOVA ranges of platform.

Since commit aadad097cd46 ("iommu/dma: Reserve IOVA for PCIe inaccessible
DMA address"), inaccessible IOVA address ranges parsed from dma-ranges
property are reserved.

If any platform has the limitaion to access default MSI IOVA, then it can
be changed using "arm-smmu.msi_iova_base=0xa0000000" command line argument.

Signed-off-by: Srinath Mannam <srinath.mannam@broadcom.com>
---
 drivers/iommu/arm-smmu.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 4f1a350..5e59c9d 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -72,6 +72,9 @@ static bool disable_bypass =
 module_param(disable_bypass, bool, S_IRUGO);
 MODULE_PARM_DESC(disable_bypass,
 	"Disable bypass streams such that incoming transactions from devices that are not attached to an iommu domain will report an abort back to the device and will not be allowed to pass through the SMMU.");
+static unsigned long msi_iova_base = MSI_IOVA_BASE;
+module_param(msi_iova_base, ulong, S_IRUGO);
+MODULE_PARM_DESC(msi_iova_base, "msi iova base address.");
 
 struct arm_smmu_s2cr {
 	struct iommu_group		*group;
@@ -1566,7 +1569,7 @@ static void arm_smmu_get_resv_regions(struct device *dev,
 	struct iommu_resv_region *region;
 	int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
 
-	region = iommu_alloc_resv_region(MSI_IOVA_BASE, MSI_IOVA_LENGTH,
+	region = iommu_alloc_resv_region(msi_iova_base, MSI_IOVA_LENGTH,
 					 prot, IOMMU_RESV_SW_MSI);
 	if (!region)
 		return;
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [RFC PATCH] iommu/arm-smmu: Add module parameter to set msi iova address
@ 2020-05-27 16:03 ` Srinath Mannam via iommu
  0 siblings, 0 replies; 37+ messages in thread
From: Srinath Mannam via iommu @ 2020-05-27 16:03 UTC (permalink / raw)
  To: Will Deacon, Robin Murphy, Joerg Roedel
  Cc: Srinath Mannam, iommu, bcm-kernel-feedback-list, linux-kernel,
	linux-arm-kernel

This patch gives the provision to change default value of MSI IOVA base
to platform's suitable IOVA using module parameter. The present
hardcoded MSI IOVA base may not be the accessible IOVA ranges of platform.

Since commit aadad097cd46 ("iommu/dma: Reserve IOVA for PCIe inaccessible
DMA address"), inaccessible IOVA address ranges parsed from dma-ranges
property are reserved.

If any platform has the limitaion to access default MSI IOVA, then it can
be changed using "arm-smmu.msi_iova_base=0xa0000000" command line argument.

Signed-off-by: Srinath Mannam <srinath.mannam@broadcom.com>
---
 drivers/iommu/arm-smmu.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 4f1a350..5e59c9d 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -72,6 +72,9 @@ static bool disable_bypass =
 module_param(disable_bypass, bool, S_IRUGO);
 MODULE_PARM_DESC(disable_bypass,
 	"Disable bypass streams such that incoming transactions from devices that are not attached to an iommu domain will report an abort back to the device and will not be allowed to pass through the SMMU.");
+static unsigned long msi_iova_base = MSI_IOVA_BASE;
+module_param(msi_iova_base, ulong, S_IRUGO);
+MODULE_PARM_DESC(msi_iova_base, "msi iova base address.");
 
 struct arm_smmu_s2cr {
 	struct iommu_group		*group;
@@ -1566,7 +1569,7 @@ static void arm_smmu_get_resv_regions(struct device *dev,
 	struct iommu_resv_region *region;
 	int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
 
-	region = iommu_alloc_resv_region(MSI_IOVA_BASE, MSI_IOVA_LENGTH,
+	region = iommu_alloc_resv_region(msi_iova_base, MSI_IOVA_LENGTH,
 					 prot, IOMMU_RESV_SW_MSI);
 	if (!region)
 		return;
-- 
2.7.4

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [RFC PATCH] iommu/arm-smmu: Add module parameter to set msi iova address
@ 2020-05-27 16:03 ` Srinath Mannam via iommu
  0 siblings, 0 replies; 37+ messages in thread
From: Srinath Mannam @ 2020-05-27 16:03 UTC (permalink / raw)
  To: Will Deacon, Robin Murphy, Joerg Roedel
  Cc: Srinath Mannam, iommu, bcm-kernel-feedback-list, linux-kernel,
	linux-arm-kernel

This patch gives the provision to change default value of MSI IOVA base
to platform's suitable IOVA using module parameter. The present
hardcoded MSI IOVA base may not be the accessible IOVA ranges of platform.

Since commit aadad097cd46 ("iommu/dma: Reserve IOVA for PCIe inaccessible
DMA address"), inaccessible IOVA address ranges parsed from dma-ranges
property are reserved.

If any platform has the limitaion to access default MSI IOVA, then it can
be changed using "arm-smmu.msi_iova_base=0xa0000000" command line argument.

Signed-off-by: Srinath Mannam <srinath.mannam@broadcom.com>
---
 drivers/iommu/arm-smmu.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 4f1a350..5e59c9d 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -72,6 +72,9 @@ static bool disable_bypass =
 module_param(disable_bypass, bool, S_IRUGO);
 MODULE_PARM_DESC(disable_bypass,
 	"Disable bypass streams such that incoming transactions from devices that are not attached to an iommu domain will report an abort back to the device and will not be allowed to pass through the SMMU.");
+static unsigned long msi_iova_base = MSI_IOVA_BASE;
+module_param(msi_iova_base, ulong, S_IRUGO);
+MODULE_PARM_DESC(msi_iova_base, "msi iova base address.");
 
 struct arm_smmu_s2cr {
 	struct iommu_group		*group;
@@ -1566,7 +1569,7 @@ static void arm_smmu_get_resv_regions(struct device *dev,
 	struct iommu_resv_region *region;
 	int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
 
-	region = iommu_alloc_resv_region(MSI_IOVA_BASE, MSI_IOVA_LENGTH,
+	region = iommu_alloc_resv_region(msi_iova_base, MSI_IOVA_LENGTH,
 					 prot, IOMMU_RESV_SW_MSI);
 	if (!region)
 		return;
-- 
2.7.4


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [RFC PATCH] iommu/arm-smmu: Add module parameter to set msi iova address
  2020-05-27 16:03 ` Srinath Mannam via iommu
  (?)
@ 2020-05-27 17:30   ` Robin Murphy
  -1 siblings, 0 replies; 37+ messages in thread
From: Robin Murphy @ 2020-05-27 17:30 UTC (permalink / raw)
  To: Srinath Mannam, Will Deacon, Joerg Roedel
  Cc: bcm-kernel-feedback-list, linux-arm-kernel, iommu, linux-kernel,
	Eric Auger, Alex Williamson, Jean-Philippe Brucker

On 2020-05-27 17:03, Srinath Mannam wrote:
> This patch gives the provision to change default value of MSI IOVA base
> to platform's suitable IOVA using module parameter. The present
> hardcoded MSI IOVA base may not be the accessible IOVA ranges of platform.

That in itself doesn't seem entirely unreasonable; IIRC the current 
address is just an arbitrary choice to fit nicely into Qemu's memory 
map, and there was always the possibility that it wouldn't suit everything.

> Since commit aadad097cd46 ("iommu/dma: Reserve IOVA for PCIe inaccessible
> DMA address"), inaccessible IOVA address ranges parsed from dma-ranges
> property are reserved.

That, however, doesn't seem to fit here; iommu-dma maps MSI doorbells 
dynamically, so they aren't affected by reserved regions any more than 
regular DMA pages are. In fact, it explicitly ignores the software MSI 
region, since as the comment says, it *is* the software that manages those.

The MSI_IOVA_BASE region exists for VFIO, precisely because in that case 
the kernel *doesn't* control the address space, but still needs some way 
to steal a bit of it for MSIs that the guest doesn't necessarily know 
about, and give userspace a fighting chance of knowing what it's taken. 
I think at the time we discussed the idea of adding something to the 
VFIO uapi such that userspace could move this around if it wanted or 
needed to, but decided we could live without that initially. Perhaps now 
the time has come?

Robin.

> If any platform has the limitaion to access default MSI IOVA, then it can
> be changed using "arm-smmu.msi_iova_base=0xa0000000" command line argument.
> 
> Signed-off-by: Srinath Mannam <srinath.mannam@broadcom.com>
> ---
>   drivers/iommu/arm-smmu.c | 5 ++++-
>   1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index 4f1a350..5e59c9d 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -72,6 +72,9 @@ static bool disable_bypass =
>   module_param(disable_bypass, bool, S_IRUGO);
>   MODULE_PARM_DESC(disable_bypass,
>   	"Disable bypass streams such that incoming transactions from devices that are not attached to an iommu domain will report an abort back to the device and will not be allowed to pass through the SMMU.");
> +static unsigned long msi_iova_base = MSI_IOVA_BASE;
> +module_param(msi_iova_base, ulong, S_IRUGO);
> +MODULE_PARM_DESC(msi_iova_base, "msi iova base address.");
>   
>   struct arm_smmu_s2cr {
>   	struct iommu_group		*group;
> @@ -1566,7 +1569,7 @@ static void arm_smmu_get_resv_regions(struct device *dev,
>   	struct iommu_resv_region *region;
>   	int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
>   
> -	region = iommu_alloc_resv_region(MSI_IOVA_BASE, MSI_IOVA_LENGTH,
> +	region = iommu_alloc_resv_region(msi_iova_base, MSI_IOVA_LENGTH,
>   					 prot, IOMMU_RESV_SW_MSI);
>   	if (!region)
>   		return;
> 

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC PATCH] iommu/arm-smmu: Add module parameter to set msi iova address
@ 2020-05-27 17:30   ` Robin Murphy
  0 siblings, 0 replies; 37+ messages in thread
From: Robin Murphy @ 2020-05-27 17:30 UTC (permalink / raw)
  To: Srinath Mannam, Will Deacon, Joerg Roedel
  Cc: Jean-Philippe Brucker, iommu, linux-kernel, Alex Williamson,
	bcm-kernel-feedback-list, linux-arm-kernel

On 2020-05-27 17:03, Srinath Mannam wrote:
> This patch gives the provision to change default value of MSI IOVA base
> to platform's suitable IOVA using module parameter. The present
> hardcoded MSI IOVA base may not be the accessible IOVA ranges of platform.

That in itself doesn't seem entirely unreasonable; IIRC the current 
address is just an arbitrary choice to fit nicely into Qemu's memory 
map, and there was always the possibility that it wouldn't suit everything.

> Since commit aadad097cd46 ("iommu/dma: Reserve IOVA for PCIe inaccessible
> DMA address"), inaccessible IOVA address ranges parsed from dma-ranges
> property are reserved.

That, however, doesn't seem to fit here; iommu-dma maps MSI doorbells 
dynamically, so they aren't affected by reserved regions any more than 
regular DMA pages are. In fact, it explicitly ignores the software MSI 
region, since as the comment says, it *is* the software that manages those.

The MSI_IOVA_BASE region exists for VFIO, precisely because in that case 
the kernel *doesn't* control the address space, but still needs some way 
to steal a bit of it for MSIs that the guest doesn't necessarily know 
about, and give userspace a fighting chance of knowing what it's taken. 
I think at the time we discussed the idea of adding something to the 
VFIO uapi such that userspace could move this around if it wanted or 
needed to, but decided we could live without that initially. Perhaps now 
the time has come?

Robin.

> If any platform has the limitaion to access default MSI IOVA, then it can
> be changed using "arm-smmu.msi_iova_base=0xa0000000" command line argument.
> 
> Signed-off-by: Srinath Mannam <srinath.mannam@broadcom.com>
> ---
>   drivers/iommu/arm-smmu.c | 5 ++++-
>   1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index 4f1a350..5e59c9d 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -72,6 +72,9 @@ static bool disable_bypass =
>   module_param(disable_bypass, bool, S_IRUGO);
>   MODULE_PARM_DESC(disable_bypass,
>   	"Disable bypass streams such that incoming transactions from devices that are not attached to an iommu domain will report an abort back to the device and will not be allowed to pass through the SMMU.");
> +static unsigned long msi_iova_base = MSI_IOVA_BASE;
> +module_param(msi_iova_base, ulong, S_IRUGO);
> +MODULE_PARM_DESC(msi_iova_base, "msi iova base address.");
>   
>   struct arm_smmu_s2cr {
>   	struct iommu_group		*group;
> @@ -1566,7 +1569,7 @@ static void arm_smmu_get_resv_regions(struct device *dev,
>   	struct iommu_resv_region *region;
>   	int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
>   
> -	region = iommu_alloc_resv_region(MSI_IOVA_BASE, MSI_IOVA_LENGTH,
> +	region = iommu_alloc_resv_region(msi_iova_base, MSI_IOVA_LENGTH,
>   					 prot, IOMMU_RESV_SW_MSI);
>   	if (!region)
>   		return;
> 
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC PATCH] iommu/arm-smmu: Add module parameter to set msi iova address
@ 2020-05-27 17:30   ` Robin Murphy
  0 siblings, 0 replies; 37+ messages in thread
From: Robin Murphy @ 2020-05-27 17:30 UTC (permalink / raw)
  To: Srinath Mannam, Will Deacon, Joerg Roedel
  Cc: Jean-Philippe Brucker, iommu, linux-kernel, Eric Auger,
	Alex Williamson, bcm-kernel-feedback-list, linux-arm-kernel

On 2020-05-27 17:03, Srinath Mannam wrote:
> This patch gives the provision to change default value of MSI IOVA base
> to platform's suitable IOVA using module parameter. The present
> hardcoded MSI IOVA base may not be the accessible IOVA ranges of platform.

That in itself doesn't seem entirely unreasonable; IIRC the current 
address is just an arbitrary choice to fit nicely into Qemu's memory 
map, and there was always the possibility that it wouldn't suit everything.

> Since commit aadad097cd46 ("iommu/dma: Reserve IOVA for PCIe inaccessible
> DMA address"), inaccessible IOVA address ranges parsed from dma-ranges
> property are reserved.

That, however, doesn't seem to fit here; iommu-dma maps MSI doorbells 
dynamically, so they aren't affected by reserved regions any more than 
regular DMA pages are. In fact, it explicitly ignores the software MSI 
region, since as the comment says, it *is* the software that manages those.

The MSI_IOVA_BASE region exists for VFIO, precisely because in that case 
the kernel *doesn't* control the address space, but still needs some way 
to steal a bit of it for MSIs that the guest doesn't necessarily know 
about, and give userspace a fighting chance of knowing what it's taken. 
I think at the time we discussed the idea of adding something to the 
VFIO uapi such that userspace could move this around if it wanted or 
needed to, but decided we could live without that initially. Perhaps now 
the time has come?

Robin.

> If any platform has the limitaion to access default MSI IOVA, then it can
> be changed using "arm-smmu.msi_iova_base=0xa0000000" command line argument.
> 
> Signed-off-by: Srinath Mannam <srinath.mannam@broadcom.com>
> ---
>   drivers/iommu/arm-smmu.c | 5 ++++-
>   1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index 4f1a350..5e59c9d 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -72,6 +72,9 @@ static bool disable_bypass =
>   module_param(disable_bypass, bool, S_IRUGO);
>   MODULE_PARM_DESC(disable_bypass,
>   	"Disable bypass streams such that incoming transactions from devices that are not attached to an iommu domain will report an abort back to the device and will not be allowed to pass through the SMMU.");
> +static unsigned long msi_iova_base = MSI_IOVA_BASE;
> +module_param(msi_iova_base, ulong, S_IRUGO);
> +MODULE_PARM_DESC(msi_iova_base, "msi iova base address.");
>   
>   struct arm_smmu_s2cr {
>   	struct iommu_group		*group;
> @@ -1566,7 +1569,7 @@ static void arm_smmu_get_resv_regions(struct device *dev,
>   	struct iommu_resv_region *region;
>   	int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
>   
> -	region = iommu_alloc_resv_region(MSI_IOVA_BASE, MSI_IOVA_LENGTH,
> +	region = iommu_alloc_resv_region(msi_iova_base, MSI_IOVA_LENGTH,
>   					 prot, IOMMU_RESV_SW_MSI);
>   	if (!region)
>   		return;
> 

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC PATCH] iommu/arm-smmu: Add module parameter to set msi iova address
  2020-05-27 17:30   ` Robin Murphy
  (?)
@ 2020-05-28  5:15     ` Srinath Mannam via iommu
  -1 siblings, 0 replies; 37+ messages in thread
From: Srinath Mannam @ 2020-05-28  5:15 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Will Deacon, Joerg Roedel, BCM Kernel Feedback, Linux ARM, iommu,
	Linux Kernel Mailing List, Eric Auger, Alex Williamson,
	Jean-Philippe Brucker

On Wed, May 27, 2020 at 11:00 PM Robin Murphy <robin.murphy@arm.com> wrote:
>
Thanks Robin for your quick response.
> On 2020-05-27 17:03, Srinath Mannam wrote:
> > This patch gives the provision to change default value of MSI IOVA base
> > to platform's suitable IOVA using module parameter. The present
> > hardcoded MSI IOVA base may not be the accessible IOVA ranges of platform.
>
> That in itself doesn't seem entirely unreasonable; IIRC the current
> address is just an arbitrary choice to fit nicely into Qemu's memory
> map, and there was always the possibility that it wouldn't suit everything.
>
> > Since commit aadad097cd46 ("iommu/dma: Reserve IOVA for PCIe inaccessible
> > DMA address"), inaccessible IOVA address ranges parsed from dma-ranges
> > property are reserved.
>
> That, however, doesn't seem to fit here; iommu-dma maps MSI doorbells
> dynamically, so they aren't affected by reserved regions any more than
> regular DMA pages are. In fact, it explicitly ignores the software MSI
> region, since as the comment says, it *is* the software that manages those.
Yes you are right, we don't see any issues with kernel drivers(PCI EP) because
MSI IOVA allocated dynamically by honouring reserved regions same as DMA pages.
>
> The MSI_IOVA_BASE region exists for VFIO, precisely because in that case
> the kernel *doesn't* control the address space, but still needs some way
> to steal a bit of it for MSIs that the guest doesn't necessarily know
> about, and give userspace a fighting chance of knowing what it's taken.
> I think at the time we discussed the idea of adding something to the
> VFIO uapi such that userspace could move this around if it wanted or
> needed to, but decided we could live without that initially. Perhaps now
> the time has come?
Yes, we see issues only with user-space drivers(DPDK) in which MSI_IOVA_BASE
region is considered to map MSI registers. This patch helps us to fix the issue.

Thanks,
Srinath.
>
> Robin.
>
> > If any platform has the limitaion to access default MSI IOVA, then it can
> > be changed using "arm-smmu.msi_iova_base=0xa0000000" command line argument.
> >
> > Signed-off-by: Srinath Mannam <srinath.mannam@broadcom.com>
> > ---
> >   drivers/iommu/arm-smmu.c | 5 ++++-
> >   1 file changed, 4 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> > index 4f1a350..5e59c9d 100644
> > --- a/drivers/iommu/arm-smmu.c
> > +++ b/drivers/iommu/arm-smmu.c
> > @@ -72,6 +72,9 @@ static bool disable_bypass =
> >   module_param(disable_bypass, bool, S_IRUGO);
> >   MODULE_PARM_DESC(disable_bypass,
> >       "Disable bypass streams such that incoming transactions from devices that are not attached to an iommu domain will report an abort back to the device and will not be allowed to pass through the SMMU.");
> > +static unsigned long msi_iova_base = MSI_IOVA_BASE;
> > +module_param(msi_iova_base, ulong, S_IRUGO);
> > +MODULE_PARM_DESC(msi_iova_base, "msi iova base address.");
> >
> >   struct arm_smmu_s2cr {
> >       struct iommu_group              *group;
> > @@ -1566,7 +1569,7 @@ static void arm_smmu_get_resv_regions(struct device *dev,
> >       struct iommu_resv_region *region;
> >       int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
> >
> > -     region = iommu_alloc_resv_region(MSI_IOVA_BASE, MSI_IOVA_LENGTH,
> > +     region = iommu_alloc_resv_region(msi_iova_base, MSI_IOVA_LENGTH,
> >                                        prot, IOMMU_RESV_SW_MSI);
> >       if (!region)
> >               return;
> >

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC PATCH] iommu/arm-smmu: Add module parameter to set msi iova address
@ 2020-05-28  5:15     ` Srinath Mannam via iommu
  0 siblings, 0 replies; 37+ messages in thread
From: Srinath Mannam via iommu @ 2020-05-28  5:15 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Jean-Philippe Brucker, Alex Williamson,
	Linux Kernel Mailing List, iommu, BCM Kernel Feedback,
	Will Deacon, Linux ARM

On Wed, May 27, 2020 at 11:00 PM Robin Murphy <robin.murphy@arm.com> wrote:
>
Thanks Robin for your quick response.
> On 2020-05-27 17:03, Srinath Mannam wrote:
> > This patch gives the provision to change default value of MSI IOVA base
> > to platform's suitable IOVA using module parameter. The present
> > hardcoded MSI IOVA base may not be the accessible IOVA ranges of platform.
>
> That in itself doesn't seem entirely unreasonable; IIRC the current
> address is just an arbitrary choice to fit nicely into Qemu's memory
> map, and there was always the possibility that it wouldn't suit everything.
>
> > Since commit aadad097cd46 ("iommu/dma: Reserve IOVA for PCIe inaccessible
> > DMA address"), inaccessible IOVA address ranges parsed from dma-ranges
> > property are reserved.
>
> That, however, doesn't seem to fit here; iommu-dma maps MSI doorbells
> dynamically, so they aren't affected by reserved regions any more than
> regular DMA pages are. In fact, it explicitly ignores the software MSI
> region, since as the comment says, it *is* the software that manages those.
Yes you are right, we don't see any issues with kernel drivers(PCI EP) because
MSI IOVA allocated dynamically by honouring reserved regions same as DMA pages.
>
> The MSI_IOVA_BASE region exists for VFIO, precisely because in that case
> the kernel *doesn't* control the address space, but still needs some way
> to steal a bit of it for MSIs that the guest doesn't necessarily know
> about, and give userspace a fighting chance of knowing what it's taken.
> I think at the time we discussed the idea of adding something to the
> VFIO uapi such that userspace could move this around if it wanted or
> needed to, but decided we could live without that initially. Perhaps now
> the time has come?
Yes, we see issues only with user-space drivers(DPDK) in which MSI_IOVA_BASE
region is considered to map MSI registers. This patch helps us to fix the issue.

Thanks,
Srinath.
>
> Robin.
>
> > If any platform has the limitaion to access default MSI IOVA, then it can
> > be changed using "arm-smmu.msi_iova_base=0xa0000000" command line argument.
> >
> > Signed-off-by: Srinath Mannam <srinath.mannam@broadcom.com>
> > ---
> >   drivers/iommu/arm-smmu.c | 5 ++++-
> >   1 file changed, 4 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> > index 4f1a350..5e59c9d 100644
> > --- a/drivers/iommu/arm-smmu.c
> > +++ b/drivers/iommu/arm-smmu.c
> > @@ -72,6 +72,9 @@ static bool disable_bypass =
> >   module_param(disable_bypass, bool, S_IRUGO);
> >   MODULE_PARM_DESC(disable_bypass,
> >       "Disable bypass streams such that incoming transactions from devices that are not attached to an iommu domain will report an abort back to the device and will not be allowed to pass through the SMMU.");
> > +static unsigned long msi_iova_base = MSI_IOVA_BASE;
> > +module_param(msi_iova_base, ulong, S_IRUGO);
> > +MODULE_PARM_DESC(msi_iova_base, "msi iova base address.");
> >
> >   struct arm_smmu_s2cr {
> >       struct iommu_group              *group;
> > @@ -1566,7 +1569,7 @@ static void arm_smmu_get_resv_regions(struct device *dev,
> >       struct iommu_resv_region *region;
> >       int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
> >
> > -     region = iommu_alloc_resv_region(MSI_IOVA_BASE, MSI_IOVA_LENGTH,
> > +     region = iommu_alloc_resv_region(msi_iova_base, MSI_IOVA_LENGTH,
> >                                        prot, IOMMU_RESV_SW_MSI);
> >       if (!region)
> >               return;
> >
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC PATCH] iommu/arm-smmu: Add module parameter to set msi iova address
@ 2020-05-28  5:15     ` Srinath Mannam via iommu
  0 siblings, 0 replies; 37+ messages in thread
From: Srinath Mannam @ 2020-05-28  5:15 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Jean-Philippe Brucker, Alex Williamson, Joerg Roedel,
	Linux Kernel Mailing List, Eric Auger, iommu,
	BCM Kernel Feedback, Will Deacon, Linux ARM

On Wed, May 27, 2020 at 11:00 PM Robin Murphy <robin.murphy@arm.com> wrote:
>
Thanks Robin for your quick response.
> On 2020-05-27 17:03, Srinath Mannam wrote:
> > This patch gives the provision to change default value of MSI IOVA base
> > to platform's suitable IOVA using module parameter. The present
> > hardcoded MSI IOVA base may not be the accessible IOVA ranges of platform.
>
> That in itself doesn't seem entirely unreasonable; IIRC the current
> address is just an arbitrary choice to fit nicely into Qemu's memory
> map, and there was always the possibility that it wouldn't suit everything.
>
> > Since commit aadad097cd46 ("iommu/dma: Reserve IOVA for PCIe inaccessible
> > DMA address"), inaccessible IOVA address ranges parsed from dma-ranges
> > property are reserved.
>
> That, however, doesn't seem to fit here; iommu-dma maps MSI doorbells
> dynamically, so they aren't affected by reserved regions any more than
> regular DMA pages are. In fact, it explicitly ignores the software MSI
> region, since as the comment says, it *is* the software that manages those.
Yes you are right, we don't see any issues with kernel drivers(PCI EP) because
MSI IOVA allocated dynamically by honouring reserved regions same as DMA pages.
>
> The MSI_IOVA_BASE region exists for VFIO, precisely because in that case
> the kernel *doesn't* control the address space, but still needs some way
> to steal a bit of it for MSIs that the guest doesn't necessarily know
> about, and give userspace a fighting chance of knowing what it's taken.
> I think at the time we discussed the idea of adding something to the
> VFIO uapi such that userspace could move this around if it wanted or
> needed to, but decided we could live without that initially. Perhaps now
> the time has come?
Yes, we see issues only with user-space drivers(DPDK) in which MSI_IOVA_BASE
region is considered to map MSI registers. This patch helps us to fix the issue.

Thanks,
Srinath.
>
> Robin.
>
> > If any platform has the limitaion to access default MSI IOVA, then it can
> > be changed using "arm-smmu.msi_iova_base=0xa0000000" command line argument.
> >
> > Signed-off-by: Srinath Mannam <srinath.mannam@broadcom.com>
> > ---
> >   drivers/iommu/arm-smmu.c | 5 ++++-
> >   1 file changed, 4 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> > index 4f1a350..5e59c9d 100644
> > --- a/drivers/iommu/arm-smmu.c
> > +++ b/drivers/iommu/arm-smmu.c
> > @@ -72,6 +72,9 @@ static bool disable_bypass =
> >   module_param(disable_bypass, bool, S_IRUGO);
> >   MODULE_PARM_DESC(disable_bypass,
> >       "Disable bypass streams such that incoming transactions from devices that are not attached to an iommu domain will report an abort back to the device and will not be allowed to pass through the SMMU.");
> > +static unsigned long msi_iova_base = MSI_IOVA_BASE;
> > +module_param(msi_iova_base, ulong, S_IRUGO);
> > +MODULE_PARM_DESC(msi_iova_base, "msi iova base address.");
> >
> >   struct arm_smmu_s2cr {
> >       struct iommu_group              *group;
> > @@ -1566,7 +1569,7 @@ static void arm_smmu_get_resv_regions(struct device *dev,
> >       struct iommu_resv_region *region;
> >       int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
> >
> > -     region = iommu_alloc_resv_region(MSI_IOVA_BASE, MSI_IOVA_LENGTH,
> > +     region = iommu_alloc_resv_region(msi_iova_base, MSI_IOVA_LENGTH,
> >                                        prot, IOMMU_RESV_SW_MSI);
> >       if (!region)
> >               return;
> >

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC PATCH] iommu/arm-smmu: Add module parameter to set msi iova address
  2020-05-27 17:30   ` Robin Murphy
  (?)
@ 2020-05-28  7:21     ` Auger Eric
  -1 siblings, 0 replies; 37+ messages in thread
From: Auger Eric @ 2020-05-28  7:21 UTC (permalink / raw)
  To: Robin Murphy, Srinath Mannam, Will Deacon, Joerg Roedel
  Cc: bcm-kernel-feedback-list, linux-arm-kernel, iommu, linux-kernel,
	Alex Williamson, Jean-Philippe Brucker

Hi,

On 5/27/20 7:30 PM, Robin Murphy wrote:
> On 2020-05-27 17:03, Srinath Mannam wrote:
>> This patch gives the provision to change default value of MSI IOVA base
>> to platform's suitable IOVA using module parameter. The present
>> hardcoded MSI IOVA base may not be the accessible IOVA ranges of
>> platform.
> 
> That in itself doesn't seem entirely unreasonable; IIRC the current
> address is just an arbitrary choice to fit nicely into Qemu's memory
> map,
correct
 and there was always the possibility that it wouldn't suit everything.

Indeed I also recently had this case of PCI host bridge collision with
the SW MSI reserved window - maybe that's the same ;-) -.
> 
>> Since commit aadad097cd46 ("iommu/dma: Reserve IOVA for PCIe inaccessible
>> DMA address"), inaccessible IOVA address ranges parsed from dma-ranges
>> property are reserved.
> 
> That, however, doesn't seem to fit here; iommu-dma maps MSI doorbells
> dynamically, so they aren't affected by reserved regions any more than
> regular DMA pages are. In fact, it explicitly ignores the software MSI
> region, since as the comment says, it *is* the software that manages those.
> 
> The MSI_IOVA_BASE region exists for VFIO, precisely because in that case
> the kernel *doesn't* control the address space, but still needs some way
> to steal a bit of it for MSIs that the guest doesn't necessarily know
> about, and give userspace a fighting chance of knowing what it's taken.
> I think at the time we discussed the idea of adding something to the
> VFIO uapi such that userspace could move this around if it wanted or
> needed to, but decided we could live without that initially.

Yes indeed ;-)

 Perhaps now
> the time has come?

Do you mean you would welcome a VFIO based approach or would a driver
parameter be sufficient?

Thanks

Eric


> 
> Robin.
> 
>> If any platform has the limitaion to access default MSI IOVA, then it can
>> be changed using "arm-smmu.msi_iova_base=0xa0000000" command line
>> argument.
>>
>> Signed-off-by: Srinath Mannam <srinath.mannam@broadcom.com>
>> ---
>>   drivers/iommu/arm-smmu.c | 5 ++++-
>>   1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
>> index 4f1a350..5e59c9d 100644
>> --- a/drivers/iommu/arm-smmu.c
>> +++ b/drivers/iommu/arm-smmu.c
>> @@ -72,6 +72,9 @@ static bool disable_bypass =
>>   module_param(disable_bypass, bool, S_IRUGO);
>>   MODULE_PARM_DESC(disable_bypass,
>>       "Disable bypass streams such that incoming transactions from
>> devices that are not attached to an iommu domain will report an abort
>> back to the device and will not be allowed to pass through the SMMU.");
>> +static unsigned long msi_iova_base = MSI_IOVA_BASE;
>> +module_param(msi_iova_base, ulong, S_IRUGO);
>> +MODULE_PARM_DESC(msi_iova_base, "msi iova base address.");
>>     struct arm_smmu_s2cr {
>>       struct iommu_group        *group;
>> @@ -1566,7 +1569,7 @@ static void arm_smmu_get_resv_regions(struct
>> device *dev,
>>       struct iommu_resv_region *region;
>>       int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
>>   -    region = iommu_alloc_resv_region(MSI_IOVA_BASE, MSI_IOVA_LENGTH,
>> +    region = iommu_alloc_resv_region(msi_iova_base, MSI_IOVA_LENGTH,
>>                        prot, IOMMU_RESV_SW_MSI);
>>       if (!region)
>>           return;
>>
> 


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC PATCH] iommu/arm-smmu: Add module parameter to set msi iova address
@ 2020-05-28  7:21     ` Auger Eric
  0 siblings, 0 replies; 37+ messages in thread
From: Auger Eric @ 2020-05-28  7:21 UTC (permalink / raw)
  To: Robin Murphy, Srinath Mannam, Will Deacon, Joerg Roedel
  Cc: Jean-Philippe Brucker, iommu, linux-kernel, Alex Williamson,
	bcm-kernel-feedback-list, linux-arm-kernel

Hi,

On 5/27/20 7:30 PM, Robin Murphy wrote:
> On 2020-05-27 17:03, Srinath Mannam wrote:
>> This patch gives the provision to change default value of MSI IOVA base
>> to platform's suitable IOVA using module parameter. The present
>> hardcoded MSI IOVA base may not be the accessible IOVA ranges of
>> platform.
> 
> That in itself doesn't seem entirely unreasonable; IIRC the current
> address is just an arbitrary choice to fit nicely into Qemu's memory
> map,
correct
 and there was always the possibility that it wouldn't suit everything.

Indeed I also recently had this case of PCI host bridge collision with
the SW MSI reserved window - maybe that's the same ;-) -.
> 
>> Since commit aadad097cd46 ("iommu/dma: Reserve IOVA for PCIe inaccessible
>> DMA address"), inaccessible IOVA address ranges parsed from dma-ranges
>> property are reserved.
> 
> That, however, doesn't seem to fit here; iommu-dma maps MSI doorbells
> dynamically, so they aren't affected by reserved regions any more than
> regular DMA pages are. In fact, it explicitly ignores the software MSI
> region, since as the comment says, it *is* the software that manages those.
> 
> The MSI_IOVA_BASE region exists for VFIO, precisely because in that case
> the kernel *doesn't* control the address space, but still needs some way
> to steal a bit of it for MSIs that the guest doesn't necessarily know
> about, and give userspace a fighting chance of knowing what it's taken.
> I think at the time we discussed the idea of adding something to the
> VFIO uapi such that userspace could move this around if it wanted or
> needed to, but decided we could live without that initially.

Yes indeed ;-)

 Perhaps now
> the time has come?

Do you mean you would welcome a VFIO based approach or would a driver
parameter be sufficient?

Thanks

Eric


> 
> Robin.
> 
>> If any platform has the limitaion to access default MSI IOVA, then it can
>> be changed using "arm-smmu.msi_iova_base=0xa0000000" command line
>> argument.
>>
>> Signed-off-by: Srinath Mannam <srinath.mannam@broadcom.com>
>> ---
>>   drivers/iommu/arm-smmu.c | 5 ++++-
>>   1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
>> index 4f1a350..5e59c9d 100644
>> --- a/drivers/iommu/arm-smmu.c
>> +++ b/drivers/iommu/arm-smmu.c
>> @@ -72,6 +72,9 @@ static bool disable_bypass =
>>   module_param(disable_bypass, bool, S_IRUGO);
>>   MODULE_PARM_DESC(disable_bypass,
>>       "Disable bypass streams such that incoming transactions from
>> devices that are not attached to an iommu domain will report an abort
>> back to the device and will not be allowed to pass through the SMMU.");
>> +static unsigned long msi_iova_base = MSI_IOVA_BASE;
>> +module_param(msi_iova_base, ulong, S_IRUGO);
>> +MODULE_PARM_DESC(msi_iova_base, "msi iova base address.");
>>     struct arm_smmu_s2cr {
>>       struct iommu_group        *group;
>> @@ -1566,7 +1569,7 @@ static void arm_smmu_get_resv_regions(struct
>> device *dev,
>>       struct iommu_resv_region *region;
>>       int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
>>   -    region = iommu_alloc_resv_region(MSI_IOVA_BASE, MSI_IOVA_LENGTH,
>> +    region = iommu_alloc_resv_region(msi_iova_base, MSI_IOVA_LENGTH,
>>                        prot, IOMMU_RESV_SW_MSI);
>>       if (!region)
>>           return;
>>
> 

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC PATCH] iommu/arm-smmu: Add module parameter to set msi iova address
@ 2020-05-28  7:21     ` Auger Eric
  0 siblings, 0 replies; 37+ messages in thread
From: Auger Eric @ 2020-05-28  7:21 UTC (permalink / raw)
  To: Robin Murphy, Srinath Mannam, Will Deacon, Joerg Roedel
  Cc: Jean-Philippe Brucker, iommu, linux-kernel, Alex Williamson,
	bcm-kernel-feedback-list, linux-arm-kernel

Hi,

On 5/27/20 7:30 PM, Robin Murphy wrote:
> On 2020-05-27 17:03, Srinath Mannam wrote:
>> This patch gives the provision to change default value of MSI IOVA base
>> to platform's suitable IOVA using module parameter. The present
>> hardcoded MSI IOVA base may not be the accessible IOVA ranges of
>> platform.
> 
> That in itself doesn't seem entirely unreasonable; IIRC the current
> address is just an arbitrary choice to fit nicely into Qemu's memory
> map,
correct
 and there was always the possibility that it wouldn't suit everything.

Indeed I also recently had this case of PCI host bridge collision with
the SW MSI reserved window - maybe that's the same ;-) -.
> 
>> Since commit aadad097cd46 ("iommu/dma: Reserve IOVA for PCIe inaccessible
>> DMA address"), inaccessible IOVA address ranges parsed from dma-ranges
>> property are reserved.
> 
> That, however, doesn't seem to fit here; iommu-dma maps MSI doorbells
> dynamically, so they aren't affected by reserved regions any more than
> regular DMA pages are. In fact, it explicitly ignores the software MSI
> region, since as the comment says, it *is* the software that manages those.
> 
> The MSI_IOVA_BASE region exists for VFIO, precisely because in that case
> the kernel *doesn't* control the address space, but still needs some way
> to steal a bit of it for MSIs that the guest doesn't necessarily know
> about, and give userspace a fighting chance of knowing what it's taken.
> I think at the time we discussed the idea of adding something to the
> VFIO uapi such that userspace could move this around if it wanted or
> needed to, but decided we could live without that initially.

Yes indeed ;-)

 Perhaps now
> the time has come?

Do you mean you would welcome a VFIO based approach or would a driver
parameter be sufficient?

Thanks

Eric


> 
> Robin.
> 
>> If any platform has the limitaion to access default MSI IOVA, then it can
>> be changed using "arm-smmu.msi_iova_base=0xa0000000" command line
>> argument.
>>
>> Signed-off-by: Srinath Mannam <srinath.mannam@broadcom.com>
>> ---
>>   drivers/iommu/arm-smmu.c | 5 ++++-
>>   1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
>> index 4f1a350..5e59c9d 100644
>> --- a/drivers/iommu/arm-smmu.c
>> +++ b/drivers/iommu/arm-smmu.c
>> @@ -72,6 +72,9 @@ static bool disable_bypass =
>>   module_param(disable_bypass, bool, S_IRUGO);
>>   MODULE_PARM_DESC(disable_bypass,
>>       "Disable bypass streams such that incoming transactions from
>> devices that are not attached to an iommu domain will report an abort
>> back to the device and will not be allowed to pass through the SMMU.");
>> +static unsigned long msi_iova_base = MSI_IOVA_BASE;
>> +module_param(msi_iova_base, ulong, S_IRUGO);
>> +MODULE_PARM_DESC(msi_iova_base, "msi iova base address.");
>>     struct arm_smmu_s2cr {
>>       struct iommu_group        *group;
>> @@ -1566,7 +1569,7 @@ static void arm_smmu_get_resv_regions(struct
>> device *dev,
>>       struct iommu_resv_region *region;
>>       int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
>>   -    region = iommu_alloc_resv_region(MSI_IOVA_BASE, MSI_IOVA_LENGTH,
>> +    region = iommu_alloc_resv_region(msi_iova_base, MSI_IOVA_LENGTH,
>>                        prot, IOMMU_RESV_SW_MSI);
>>       if (!region)
>>           return;
>>
> 


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC PATCH] iommu/arm-smmu: Add module parameter to set msi iova address
  2020-05-28  5:15     ` Srinath Mannam via iommu
  (?)
@ 2020-05-28  7:23       ` Jean-Philippe Brucker
  -1 siblings, 0 replies; 37+ messages in thread
From: Jean-Philippe Brucker @ 2020-05-28  7:23 UTC (permalink / raw)
  To: Srinath Mannam
  Cc: Robin Murphy, Will Deacon, Joerg Roedel, BCM Kernel Feedback,
	Linux ARM, iommu, Linux Kernel Mailing List, Eric Auger,
	Alex Williamson

On Thu, May 28, 2020 at 10:45:14AM +0530, Srinath Mannam wrote:
> On Wed, May 27, 2020 at 11:00 PM Robin Murphy <robin.murphy@arm.com> wrote:
> >
> Thanks Robin for your quick response.
> > On 2020-05-27 17:03, Srinath Mannam wrote:
> > > This patch gives the provision to change default value of MSI IOVA base
> > > to platform's suitable IOVA using module parameter. The present
> > > hardcoded MSI IOVA base may not be the accessible IOVA ranges of platform.
> >
> > That in itself doesn't seem entirely unreasonable; IIRC the current
> > address is just an arbitrary choice to fit nicely into Qemu's memory
> > map, and there was always the possibility that it wouldn't suit everything.
> >
> > > Since commit aadad097cd46 ("iommu/dma: Reserve IOVA for PCIe inaccessible
> > > DMA address"), inaccessible IOVA address ranges parsed from dma-ranges
> > > property are reserved.

I don't understand why we only reserve the PCIe windows for DMA domains.
Shouldn't VFIO also prevent userspace from mapping them?  If they were
part of the common reserved regions then we could have VFIO choose a
SW_MSI region among the remaining free space. It would just need a
different way of asking the IOMMU driver if a SW_MSI is needed, for
example with a domain attribute.

Thanks,
Jean

> >
> > That, however, doesn't seem to fit here; iommu-dma maps MSI doorbells
> > dynamically, so they aren't affected by reserved regions any more than
> > regular DMA pages are. In fact, it explicitly ignores the software MSI
> > region, since as the comment says, it *is* the software that manages those.
> Yes you are right, we don't see any issues with kernel drivers(PCI EP) because
> MSI IOVA allocated dynamically by honouring reserved regions same as DMA pages.
> >
> > The MSI_IOVA_BASE region exists for VFIO, precisely because in that case
> > the kernel *doesn't* control the address space, but still needs some way
> > to steal a bit of it for MSIs that the guest doesn't necessarily know
> > about, and give userspace a fighting chance of knowing what it's taken.
> > I think at the time we discussed the idea of adding something to the
> > VFIO uapi such that userspace could move this around if it wanted or
> > needed to, but decided we could live without that initially. Perhaps now
> > the time has come?
> Yes, we see issues only with user-space drivers(DPDK) in which MSI_IOVA_BASE
> region is considered to map MSI registers. This patch helps us to fix the issue.
> 
> Thanks,
> Srinath.
> >
> > Robin.
> >
> > > If any platform has the limitaion to access default MSI IOVA, then it can
> > > be changed using "arm-smmu.msi_iova_base=0xa0000000" command line argument.
> > >
> > > Signed-off-by: Srinath Mannam <srinath.mannam@broadcom.com>
> > > ---
> > >   drivers/iommu/arm-smmu.c | 5 ++++-
> > >   1 file changed, 4 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> > > index 4f1a350..5e59c9d 100644
> > > --- a/drivers/iommu/arm-smmu.c
> > > +++ b/drivers/iommu/arm-smmu.c
> > > @@ -72,6 +72,9 @@ static bool disable_bypass =
> > >   module_param(disable_bypass, bool, S_IRUGO);
> > >   MODULE_PARM_DESC(disable_bypass,
> > >       "Disable bypass streams such that incoming transactions from devices that are not attached to an iommu domain will report an abort back to the device and will not be allowed to pass through the SMMU.");
> > > +static unsigned long msi_iova_base = MSI_IOVA_BASE;
> > > +module_param(msi_iova_base, ulong, S_IRUGO);
> > > +MODULE_PARM_DESC(msi_iova_base, "msi iova base address.");
> > >
> > >   struct arm_smmu_s2cr {
> > >       struct iommu_group              *group;
> > > @@ -1566,7 +1569,7 @@ static void arm_smmu_get_resv_regions(struct device *dev,
> > >       struct iommu_resv_region *region;
> > >       int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
> > >
> > > -     region = iommu_alloc_resv_region(MSI_IOVA_BASE, MSI_IOVA_LENGTH,
> > > +     region = iommu_alloc_resv_region(msi_iova_base, MSI_IOVA_LENGTH,
> > >                                        prot, IOMMU_RESV_SW_MSI);
> > >       if (!region)
> > >               return;
> > >

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC PATCH] iommu/arm-smmu: Add module parameter to set msi iova address
@ 2020-05-28  7:23       ` Jean-Philippe Brucker
  0 siblings, 0 replies; 37+ messages in thread
From: Jean-Philippe Brucker @ 2020-05-28  7:23 UTC (permalink / raw)
  To: Srinath Mannam
  Cc: Alex Williamson, Robin Murphy, Linux Kernel Mailing List, iommu,
	BCM Kernel Feedback, Will Deacon, Linux ARM

On Thu, May 28, 2020 at 10:45:14AM +0530, Srinath Mannam wrote:
> On Wed, May 27, 2020 at 11:00 PM Robin Murphy <robin.murphy@arm.com> wrote:
> >
> Thanks Robin for your quick response.
> > On 2020-05-27 17:03, Srinath Mannam wrote:
> > > This patch gives the provision to change default value of MSI IOVA base
> > > to platform's suitable IOVA using module parameter. The present
> > > hardcoded MSI IOVA base may not be the accessible IOVA ranges of platform.
> >
> > That in itself doesn't seem entirely unreasonable; IIRC the current
> > address is just an arbitrary choice to fit nicely into Qemu's memory
> > map, and there was always the possibility that it wouldn't suit everything.
> >
> > > Since commit aadad097cd46 ("iommu/dma: Reserve IOVA for PCIe inaccessible
> > > DMA address"), inaccessible IOVA address ranges parsed from dma-ranges
> > > property are reserved.

I don't understand why we only reserve the PCIe windows for DMA domains.
Shouldn't VFIO also prevent userspace from mapping them?  If they were
part of the common reserved regions then we could have VFIO choose a
SW_MSI region among the remaining free space. It would just need a
different way of asking the IOMMU driver if a SW_MSI is needed, for
example with a domain attribute.

Thanks,
Jean

> >
> > That, however, doesn't seem to fit here; iommu-dma maps MSI doorbells
> > dynamically, so they aren't affected by reserved regions any more than
> > regular DMA pages are. In fact, it explicitly ignores the software MSI
> > region, since as the comment says, it *is* the software that manages those.
> Yes you are right, we don't see any issues with kernel drivers(PCI EP) because
> MSI IOVA allocated dynamically by honouring reserved regions same as DMA pages.
> >
> > The MSI_IOVA_BASE region exists for VFIO, precisely because in that case
> > the kernel *doesn't* control the address space, but still needs some way
> > to steal a bit of it for MSIs that the guest doesn't necessarily know
> > about, and give userspace a fighting chance of knowing what it's taken.
> > I think at the time we discussed the idea of adding something to the
> > VFIO uapi such that userspace could move this around if it wanted or
> > needed to, but decided we could live without that initially. Perhaps now
> > the time has come?
> Yes, we see issues only with user-space drivers(DPDK) in which MSI_IOVA_BASE
> region is considered to map MSI registers. This patch helps us to fix the issue.
> 
> Thanks,
> Srinath.
> >
> > Robin.
> >
> > > If any platform has the limitaion to access default MSI IOVA, then it can
> > > be changed using "arm-smmu.msi_iova_base=0xa0000000" command line argument.
> > >
> > > Signed-off-by: Srinath Mannam <srinath.mannam@broadcom.com>
> > > ---
> > >   drivers/iommu/arm-smmu.c | 5 ++++-
> > >   1 file changed, 4 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> > > index 4f1a350..5e59c9d 100644
> > > --- a/drivers/iommu/arm-smmu.c
> > > +++ b/drivers/iommu/arm-smmu.c
> > > @@ -72,6 +72,9 @@ static bool disable_bypass =
> > >   module_param(disable_bypass, bool, S_IRUGO);
> > >   MODULE_PARM_DESC(disable_bypass,
> > >       "Disable bypass streams such that incoming transactions from devices that are not attached to an iommu domain will report an abort back to the device and will not be allowed to pass through the SMMU.");
> > > +static unsigned long msi_iova_base = MSI_IOVA_BASE;
> > > +module_param(msi_iova_base, ulong, S_IRUGO);
> > > +MODULE_PARM_DESC(msi_iova_base, "msi iova base address.");
> > >
> > >   struct arm_smmu_s2cr {
> > >       struct iommu_group              *group;
> > > @@ -1566,7 +1569,7 @@ static void arm_smmu_get_resv_regions(struct device *dev,
> > >       struct iommu_resv_region *region;
> > >       int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
> > >
> > > -     region = iommu_alloc_resv_region(MSI_IOVA_BASE, MSI_IOVA_LENGTH,
> > > +     region = iommu_alloc_resv_region(msi_iova_base, MSI_IOVA_LENGTH,
> > >                                        prot, IOMMU_RESV_SW_MSI);
> > >       if (!region)
> > >               return;
> > >
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC PATCH] iommu/arm-smmu: Add module parameter to set msi iova address
@ 2020-05-28  7:23       ` Jean-Philippe Brucker
  0 siblings, 0 replies; 37+ messages in thread
From: Jean-Philippe Brucker @ 2020-05-28  7:23 UTC (permalink / raw)
  To: Srinath Mannam
  Cc: Alex Williamson, Robin Murphy, Joerg Roedel,
	Linux Kernel Mailing List, Eric Auger, iommu,
	BCM Kernel Feedback, Will Deacon, Linux ARM

On Thu, May 28, 2020 at 10:45:14AM +0530, Srinath Mannam wrote:
> On Wed, May 27, 2020 at 11:00 PM Robin Murphy <robin.murphy@arm.com> wrote:
> >
> Thanks Robin for your quick response.
> > On 2020-05-27 17:03, Srinath Mannam wrote:
> > > This patch gives the provision to change default value of MSI IOVA base
> > > to platform's suitable IOVA using module parameter. The present
> > > hardcoded MSI IOVA base may not be the accessible IOVA ranges of platform.
> >
> > That in itself doesn't seem entirely unreasonable; IIRC the current
> > address is just an arbitrary choice to fit nicely into Qemu's memory
> > map, and there was always the possibility that it wouldn't suit everything.
> >
> > > Since commit aadad097cd46 ("iommu/dma: Reserve IOVA for PCIe inaccessible
> > > DMA address"), inaccessible IOVA address ranges parsed from dma-ranges
> > > property are reserved.

I don't understand why we only reserve the PCIe windows for DMA domains.
Shouldn't VFIO also prevent userspace from mapping them?  If they were
part of the common reserved regions then we could have VFIO choose a
SW_MSI region among the remaining free space. It would just need a
different way of asking the IOMMU driver if a SW_MSI is needed, for
example with a domain attribute.

Thanks,
Jean

> >
> > That, however, doesn't seem to fit here; iommu-dma maps MSI doorbells
> > dynamically, so they aren't affected by reserved regions any more than
> > regular DMA pages are. In fact, it explicitly ignores the software MSI
> > region, since as the comment says, it *is* the software that manages those.
> Yes you are right, we don't see any issues with kernel drivers(PCI EP) because
> MSI IOVA allocated dynamically by honouring reserved regions same as DMA pages.
> >
> > The MSI_IOVA_BASE region exists for VFIO, precisely because in that case
> > the kernel *doesn't* control the address space, but still needs some way
> > to steal a bit of it for MSIs that the guest doesn't necessarily know
> > about, and give userspace a fighting chance of knowing what it's taken.
> > I think at the time we discussed the idea of adding something to the
> > VFIO uapi such that userspace could move this around if it wanted or
> > needed to, but decided we could live without that initially. Perhaps now
> > the time has come?
> Yes, we see issues only with user-space drivers(DPDK) in which MSI_IOVA_BASE
> region is considered to map MSI registers. This patch helps us to fix the issue.
> 
> Thanks,
> Srinath.
> >
> > Robin.
> >
> > > If any platform has the limitaion to access default MSI IOVA, then it can
> > > be changed using "arm-smmu.msi_iova_base=0xa0000000" command line argument.
> > >
> > > Signed-off-by: Srinath Mannam <srinath.mannam@broadcom.com>
> > > ---
> > >   drivers/iommu/arm-smmu.c | 5 ++++-
> > >   1 file changed, 4 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> > > index 4f1a350..5e59c9d 100644
> > > --- a/drivers/iommu/arm-smmu.c
> > > +++ b/drivers/iommu/arm-smmu.c
> > > @@ -72,6 +72,9 @@ static bool disable_bypass =
> > >   module_param(disable_bypass, bool, S_IRUGO);
> > >   MODULE_PARM_DESC(disable_bypass,
> > >       "Disable bypass streams such that incoming transactions from devices that are not attached to an iommu domain will report an abort back to the device and will not be allowed to pass through the SMMU.");
> > > +static unsigned long msi_iova_base = MSI_IOVA_BASE;
> > > +module_param(msi_iova_base, ulong, S_IRUGO);
> > > +MODULE_PARM_DESC(msi_iova_base, "msi iova base address.");
> > >
> > >   struct arm_smmu_s2cr {
> > >       struct iommu_group              *group;
> > > @@ -1566,7 +1569,7 @@ static void arm_smmu_get_resv_regions(struct device *dev,
> > >       struct iommu_resv_region *region;
> > >       int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
> > >
> > > -     region = iommu_alloc_resv_region(MSI_IOVA_BASE, MSI_IOVA_LENGTH,
> > > +     region = iommu_alloc_resv_region(msi_iova_base, MSI_IOVA_LENGTH,
> > >                                        prot, IOMMU_RESV_SW_MSI);
> > >       if (!region)
> > >               return;
> > >

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC PATCH] iommu/arm-smmu: Add module parameter to set msi iova address
  2020-05-28  7:23       ` Jean-Philippe Brucker
  (?)
@ 2020-05-28  7:43         ` Auger Eric
  -1 siblings, 0 replies; 37+ messages in thread
From: Auger Eric @ 2020-05-28  7:43 UTC (permalink / raw)
  To: Jean-Philippe Brucker, Srinath Mannam
  Cc: Alex Williamson, Robin Murphy, Joerg Roedel,
	Linux Kernel Mailing List, iommu, BCM Kernel Feedback,
	Will Deacon, Linux ARM

Hi,

On 5/28/20 9:23 AM, Jean-Philippe Brucker wrote:
> On Thu, May 28, 2020 at 10:45:14AM +0530, Srinath Mannam wrote:
>> On Wed, May 27, 2020 at 11:00 PM Robin Murphy <robin.murphy@arm.com> wrote:
>>>
>> Thanks Robin for your quick response.
>>> On 2020-05-27 17:03, Srinath Mannam wrote:
>>>> This patch gives the provision to change default value of MSI IOVA base
>>>> to platform's suitable IOVA using module parameter. The present
>>>> hardcoded MSI IOVA base may not be the accessible IOVA ranges of platform.
>>>
>>> That in itself doesn't seem entirely unreasonable; IIRC the current
>>> address is just an arbitrary choice to fit nicely into Qemu's memory
>>> map, and there was always the possibility that it wouldn't suit everything.
>>>
>>>> Since commit aadad097cd46 ("iommu/dma: Reserve IOVA for PCIe inaccessible
>>>> DMA address"), inaccessible IOVA address ranges parsed from dma-ranges
>>>> property are reserved.
> 
> I don't understand why we only reserve the PCIe windows for DMA domains.
> Shouldn't VFIO also prevent userspace from mapping them?

VFIO prevents userspace from DMA mapping iovas within reserved regions:
9b77e5c79840  vfio/type1: check dma map request is within a valid iova range

but it does not prevent the SW MSI region chosen by the kernel from
colliding with other reserved regions (esp. PCIe host bridge windows).

  If they were
> part of the common reserved regions then we could have VFIO choose a
> SW_MSI region among the remaining free space.
As Robin said this was the initial chosen approach
[PATCH 10/10] vfio: allow the user to register reserved iova range for
MSI mapping
https://patchwork.kernel.org/patch/8121641/

Some additional background about why the static SW MSI region chosen by
the kernel was later chosen:
Summary of LPC guest MSI discussion in Santa Fe (was: Re: [RFC 0/8] KVM
PCIe/MSI passthrough on ARM/ARM64 (Alt II))
https://lists.linuxfoundation.org/pipermail/iommu/2016-November/019060.html

Thanks

Eric


 It would just need a
> different way of asking the IOMMU driver if a SW_MSI is needed, for
> example with a domain attribute.
> 
> Thanks,
> Jean
> 
>>>
>>> That, however, doesn't seem to fit here; iommu-dma maps MSI doorbells
>>> dynamically, so they aren't affected by reserved regions any more than
>>> regular DMA pages are. In fact, it explicitly ignores the software MSI
>>> region, since as the comment says, it *is* the software that manages those.
>> Yes you are right, we don't see any issues with kernel drivers(PCI EP) because
>> MSI IOVA allocated dynamically by honouring reserved regions same as DMA pages.
>>>
>>> The MSI_IOVA_BASE region exists for VFIO, precisely because in that case
>>> the kernel *doesn't* control the address space, but still needs some way
>>> to steal a bit of it for MSIs that the guest doesn't necessarily know
>>> about, and give userspace a fighting chance of knowing what it's taken.
>>> I think at the time we discussed the idea of adding something to the
>>> VFIO uapi such that userspace could move this around if it wanted or
>>> needed to, but decided we could live without that initially. Perhaps now
>>> the time has come?
>> Yes, we see issues only with user-space drivers(DPDK) in which MSI_IOVA_BASE
>> region is considered to map MSI registers. This patch helps us to fix the issue.
>>
>> Thanks,
>> Srinath.
>>>
>>> Robin.
>>>
>>>> If any platform has the limitaion to access default MSI IOVA, then it can
>>>> be changed using "arm-smmu.msi_iova_base=0xa0000000" command line argument.
>>>>
>>>> Signed-off-by: Srinath Mannam <srinath.mannam@broadcom.com>
>>>> ---
>>>>   drivers/iommu/arm-smmu.c | 5 ++++-
>>>>   1 file changed, 4 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
>>>> index 4f1a350..5e59c9d 100644
>>>> --- a/drivers/iommu/arm-smmu.c
>>>> +++ b/drivers/iommu/arm-smmu.c
>>>> @@ -72,6 +72,9 @@ static bool disable_bypass =
>>>>   module_param(disable_bypass, bool, S_IRUGO);
>>>>   MODULE_PARM_DESC(disable_bypass,
>>>>       "Disable bypass streams such that incoming transactions from devices that are not attached to an iommu domain will report an abort back to the device and will not be allowed to pass through the SMMU.");
>>>> +static unsigned long msi_iova_base = MSI_IOVA_BASE;
>>>> +module_param(msi_iova_base, ulong, S_IRUGO);
>>>> +MODULE_PARM_DESC(msi_iova_base, "msi iova base address.");
>>>>
>>>>   struct arm_smmu_s2cr {
>>>>       struct iommu_group              *group;
>>>> @@ -1566,7 +1569,7 @@ static void arm_smmu_get_resv_regions(struct device *dev,
>>>>       struct iommu_resv_region *region;
>>>>       int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
>>>>
>>>> -     region = iommu_alloc_resv_region(MSI_IOVA_BASE, MSI_IOVA_LENGTH,
>>>> +     region = iommu_alloc_resv_region(msi_iova_base, MSI_IOVA_LENGTH,
>>>>                                        prot, IOMMU_RESV_SW_MSI);
>>>>       if (!region)
>>>>               return;
>>>>
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> 


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC PATCH] iommu/arm-smmu: Add module parameter to set msi iova address
@ 2020-05-28  7:43         ` Auger Eric
  0 siblings, 0 replies; 37+ messages in thread
From: Auger Eric @ 2020-05-28  7:43 UTC (permalink / raw)
  To: Jean-Philippe Brucker, Srinath Mannam
  Cc: Will Deacon, iommu, Linux Kernel Mailing List, Alex Williamson,
	BCM Kernel Feedback, Robin Murphy, Linux ARM

Hi,

On 5/28/20 9:23 AM, Jean-Philippe Brucker wrote:
> On Thu, May 28, 2020 at 10:45:14AM +0530, Srinath Mannam wrote:
>> On Wed, May 27, 2020 at 11:00 PM Robin Murphy <robin.murphy@arm.com> wrote:
>>>
>> Thanks Robin for your quick response.
>>> On 2020-05-27 17:03, Srinath Mannam wrote:
>>>> This patch gives the provision to change default value of MSI IOVA base
>>>> to platform's suitable IOVA using module parameter. The present
>>>> hardcoded MSI IOVA base may not be the accessible IOVA ranges of platform.
>>>
>>> That in itself doesn't seem entirely unreasonable; IIRC the current
>>> address is just an arbitrary choice to fit nicely into Qemu's memory
>>> map, and there was always the possibility that it wouldn't suit everything.
>>>
>>>> Since commit aadad097cd46 ("iommu/dma: Reserve IOVA for PCIe inaccessible
>>>> DMA address"), inaccessible IOVA address ranges parsed from dma-ranges
>>>> property are reserved.
> 
> I don't understand why we only reserve the PCIe windows for DMA domains.
> Shouldn't VFIO also prevent userspace from mapping them?

VFIO prevents userspace from DMA mapping iovas within reserved regions:
9b77e5c79840  vfio/type1: check dma map request is within a valid iova range

but it does not prevent the SW MSI region chosen by the kernel from
colliding with other reserved regions (esp. PCIe host bridge windows).

  If they were
> part of the common reserved regions then we could have VFIO choose a
> SW_MSI region among the remaining free space.
As Robin said this was the initial chosen approach
[PATCH 10/10] vfio: allow the user to register reserved iova range for
MSI mapping
https://patchwork.kernel.org/patch/8121641/

Some additional background about why the static SW MSI region chosen by
the kernel was later chosen:
Summary of LPC guest MSI discussion in Santa Fe (was: Re: [RFC 0/8] KVM
PCIe/MSI passthrough on ARM/ARM64 (Alt II))
https://lists.linuxfoundation.org/pipermail/iommu/2016-November/019060.html

Thanks

Eric


 It would just need a
> different way of asking the IOMMU driver if a SW_MSI is needed, for
> example with a domain attribute.
> 
> Thanks,
> Jean
> 
>>>
>>> That, however, doesn't seem to fit here; iommu-dma maps MSI doorbells
>>> dynamically, so they aren't affected by reserved regions any more than
>>> regular DMA pages are. In fact, it explicitly ignores the software MSI
>>> region, since as the comment says, it *is* the software that manages those.
>> Yes you are right, we don't see any issues with kernel drivers(PCI EP) because
>> MSI IOVA allocated dynamically by honouring reserved regions same as DMA pages.
>>>
>>> The MSI_IOVA_BASE region exists for VFIO, precisely because in that case
>>> the kernel *doesn't* control the address space, but still needs some way
>>> to steal a bit of it for MSIs that the guest doesn't necessarily know
>>> about, and give userspace a fighting chance of knowing what it's taken.
>>> I think at the time we discussed the idea of adding something to the
>>> VFIO uapi such that userspace could move this around if it wanted or
>>> needed to, but decided we could live without that initially. Perhaps now
>>> the time has come?
>> Yes, we see issues only with user-space drivers(DPDK) in which MSI_IOVA_BASE
>> region is considered to map MSI registers. This patch helps us to fix the issue.
>>
>> Thanks,
>> Srinath.
>>>
>>> Robin.
>>>
>>>> If any platform has the limitaion to access default MSI IOVA, then it can
>>>> be changed using "arm-smmu.msi_iova_base=0xa0000000" command line argument.
>>>>
>>>> Signed-off-by: Srinath Mannam <srinath.mannam@broadcom.com>
>>>> ---
>>>>   drivers/iommu/arm-smmu.c | 5 ++++-
>>>>   1 file changed, 4 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
>>>> index 4f1a350..5e59c9d 100644
>>>> --- a/drivers/iommu/arm-smmu.c
>>>> +++ b/drivers/iommu/arm-smmu.c
>>>> @@ -72,6 +72,9 @@ static bool disable_bypass =
>>>>   module_param(disable_bypass, bool, S_IRUGO);
>>>>   MODULE_PARM_DESC(disable_bypass,
>>>>       "Disable bypass streams such that incoming transactions from devices that are not attached to an iommu domain will report an abort back to the device and will not be allowed to pass through the SMMU.");
>>>> +static unsigned long msi_iova_base = MSI_IOVA_BASE;
>>>> +module_param(msi_iova_base, ulong, S_IRUGO);
>>>> +MODULE_PARM_DESC(msi_iova_base, "msi iova base address.");
>>>>
>>>>   struct arm_smmu_s2cr {
>>>>       struct iommu_group              *group;
>>>> @@ -1566,7 +1569,7 @@ static void arm_smmu_get_resv_regions(struct device *dev,
>>>>       struct iommu_resv_region *region;
>>>>       int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
>>>>
>>>> -     region = iommu_alloc_resv_region(MSI_IOVA_BASE, MSI_IOVA_LENGTH,
>>>> +     region = iommu_alloc_resv_region(msi_iova_base, MSI_IOVA_LENGTH,
>>>>                                        prot, IOMMU_RESV_SW_MSI);
>>>>       if (!region)
>>>>               return;
>>>>
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> 

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC PATCH] iommu/arm-smmu: Add module parameter to set msi iova address
@ 2020-05-28  7:43         ` Auger Eric
  0 siblings, 0 replies; 37+ messages in thread
From: Auger Eric @ 2020-05-28  7:43 UTC (permalink / raw)
  To: Jean-Philippe Brucker, Srinath Mannam
  Cc: Will Deacon, Joerg Roedel, iommu, Linux Kernel Mailing List,
	Alex Williamson, BCM Kernel Feedback, Robin Murphy, Linux ARM

Hi,

On 5/28/20 9:23 AM, Jean-Philippe Brucker wrote:
> On Thu, May 28, 2020 at 10:45:14AM +0530, Srinath Mannam wrote:
>> On Wed, May 27, 2020 at 11:00 PM Robin Murphy <robin.murphy@arm.com> wrote:
>>>
>> Thanks Robin for your quick response.
>>> On 2020-05-27 17:03, Srinath Mannam wrote:
>>>> This patch gives the provision to change default value of MSI IOVA base
>>>> to platform's suitable IOVA using module parameter. The present
>>>> hardcoded MSI IOVA base may not be the accessible IOVA ranges of platform.
>>>
>>> That in itself doesn't seem entirely unreasonable; IIRC the current
>>> address is just an arbitrary choice to fit nicely into Qemu's memory
>>> map, and there was always the possibility that it wouldn't suit everything.
>>>
>>>> Since commit aadad097cd46 ("iommu/dma: Reserve IOVA for PCIe inaccessible
>>>> DMA address"), inaccessible IOVA address ranges parsed from dma-ranges
>>>> property are reserved.
> 
> I don't understand why we only reserve the PCIe windows for DMA domains.
> Shouldn't VFIO also prevent userspace from mapping them?

VFIO prevents userspace from DMA mapping iovas within reserved regions:
9b77e5c79840  vfio/type1: check dma map request is within a valid iova range

but it does not prevent the SW MSI region chosen by the kernel from
colliding with other reserved regions (esp. PCIe host bridge windows).

  If they were
> part of the common reserved regions then we could have VFIO choose a
> SW_MSI region among the remaining free space.
As Robin said this was the initial chosen approach
[PATCH 10/10] vfio: allow the user to register reserved iova range for
MSI mapping
https://patchwork.kernel.org/patch/8121641/

Some additional background about why the static SW MSI region chosen by
the kernel was later chosen:
Summary of LPC guest MSI discussion in Santa Fe (was: Re: [RFC 0/8] KVM
PCIe/MSI passthrough on ARM/ARM64 (Alt II))
https://lists.linuxfoundation.org/pipermail/iommu/2016-November/019060.html

Thanks

Eric


 It would just need a
> different way of asking the IOMMU driver if a SW_MSI is needed, for
> example with a domain attribute.
> 
> Thanks,
> Jean
> 
>>>
>>> That, however, doesn't seem to fit here; iommu-dma maps MSI doorbells
>>> dynamically, so they aren't affected by reserved regions any more than
>>> regular DMA pages are. In fact, it explicitly ignores the software MSI
>>> region, since as the comment says, it *is* the software that manages those.
>> Yes you are right, we don't see any issues with kernel drivers(PCI EP) because
>> MSI IOVA allocated dynamically by honouring reserved regions same as DMA pages.
>>>
>>> The MSI_IOVA_BASE region exists for VFIO, precisely because in that case
>>> the kernel *doesn't* control the address space, but still needs some way
>>> to steal a bit of it for MSIs that the guest doesn't necessarily know
>>> about, and give userspace a fighting chance of knowing what it's taken.
>>> I think at the time we discussed the idea of adding something to the
>>> VFIO uapi such that userspace could move this around if it wanted or
>>> needed to, but decided we could live without that initially. Perhaps now
>>> the time has come?
>> Yes, we see issues only with user-space drivers(DPDK) in which MSI_IOVA_BASE
>> region is considered to map MSI registers. This patch helps us to fix the issue.
>>
>> Thanks,
>> Srinath.
>>>
>>> Robin.
>>>
>>>> If any platform has the limitaion to access default MSI IOVA, then it can
>>>> be changed using "arm-smmu.msi_iova_base=0xa0000000" command line argument.
>>>>
>>>> Signed-off-by: Srinath Mannam <srinath.mannam@broadcom.com>
>>>> ---
>>>>   drivers/iommu/arm-smmu.c | 5 ++++-
>>>>   1 file changed, 4 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
>>>> index 4f1a350..5e59c9d 100644
>>>> --- a/drivers/iommu/arm-smmu.c
>>>> +++ b/drivers/iommu/arm-smmu.c
>>>> @@ -72,6 +72,9 @@ static bool disable_bypass =
>>>>   module_param(disable_bypass, bool, S_IRUGO);
>>>>   MODULE_PARM_DESC(disable_bypass,
>>>>       "Disable bypass streams such that incoming transactions from devices that are not attached to an iommu domain will report an abort back to the device and will not be allowed to pass through the SMMU.");
>>>> +static unsigned long msi_iova_base = MSI_IOVA_BASE;
>>>> +module_param(msi_iova_base, ulong, S_IRUGO);
>>>> +MODULE_PARM_DESC(msi_iova_base, "msi iova base address.");
>>>>
>>>>   struct arm_smmu_s2cr {
>>>>       struct iommu_group              *group;
>>>> @@ -1566,7 +1569,7 @@ static void arm_smmu_get_resv_regions(struct device *dev,
>>>>       struct iommu_resv_region *region;
>>>>       int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
>>>>
>>>> -     region = iommu_alloc_resv_region(MSI_IOVA_BASE, MSI_IOVA_LENGTH,
>>>> +     region = iommu_alloc_resv_region(msi_iova_base, MSI_IOVA_LENGTH,
>>>>                                        prot, IOMMU_RESV_SW_MSI);
>>>>       if (!region)
>>>>               return;
>>>>
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> 


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC PATCH] iommu/arm-smmu: Add module parameter to set msi iova address
  2020-05-28  7:43         ` Auger Eric
  (?)
@ 2020-05-28  8:38           ` Jean-Philippe Brucker
  -1 siblings, 0 replies; 37+ messages in thread
From: Jean-Philippe Brucker @ 2020-05-28  8:38 UTC (permalink / raw)
  To: Auger Eric
  Cc: Srinath Mannam, Alex Williamson, Robin Murphy, Joerg Roedel,
	Linux Kernel Mailing List, iommu, BCM Kernel Feedback,
	Will Deacon, Linux ARM, shameerali.kolothum.thodi

[+ Shameer]

On Thu, May 28, 2020 at 09:43:46AM +0200, Auger Eric wrote:
> Hi,
> 
> On 5/28/20 9:23 AM, Jean-Philippe Brucker wrote:
> > On Thu, May 28, 2020 at 10:45:14AM +0530, Srinath Mannam wrote:
> >> On Wed, May 27, 2020 at 11:00 PM Robin Murphy <robin.murphy@arm.com> wrote:
> >>>
> >> Thanks Robin for your quick response.
> >>> On 2020-05-27 17:03, Srinath Mannam wrote:
> >>>> This patch gives the provision to change default value of MSI IOVA base
> >>>> to platform's suitable IOVA using module parameter. The present
> >>>> hardcoded MSI IOVA base may not be the accessible IOVA ranges of platform.
> >>>
> >>> That in itself doesn't seem entirely unreasonable; IIRC the current
> >>> address is just an arbitrary choice to fit nicely into Qemu's memory
> >>> map, and there was always the possibility that it wouldn't suit everything.
> >>>
> >>>> Since commit aadad097cd46 ("iommu/dma: Reserve IOVA for PCIe inaccessible
> >>>> DMA address"), inaccessible IOVA address ranges parsed from dma-ranges
> >>>> property are reserved.
> > 
> > I don't understand why we only reserve the PCIe windows for DMA domains.
> > Shouldn't VFIO also prevent userspace from mapping them?
> 
> VFIO prevents userspace from DMA mapping iovas within reserved regions:
> 9b77e5c79840  vfio/type1: check dma map request is within a valid iova range

Right but I was asking specifically about the IOVA reservation introduced
by commit aadad097cd46. They are not registered as reserved regions within
the IOMMU core, they are only taken into account by dma-iommu.c when
creating a DMA domain. As VFIO uses UNMANAGED domains, it isn't aware of
those regions and they won't be seen by vfio_iommu_resv_exclude().

It looks like the PCIe regions used to be common until cd2c9fcf5c66
("iommu/dma: Move PCI window region reservation back into dma specific
path.") But I couldn't find the justification for this commit.

The thing is, if VFIO isn't aware of the reserved PCIe windows, then
allowing VFIO or userspace to choose MSI_IOVA_BASE won't solve the problem
reported by Srinath, because they could well choose an IOVA within the
PCIe window...

Thanks,
Jean

> but it does not prevent the SW MSI region chosen by the kernel from
> colliding with other reserved regions (esp. PCIe host bridge windows).
> 
>   If they were
> > part of the common reserved regions then we could have VFIO choose a
> > SW_MSI region among the remaining free space.
> As Robin said this was the initial chosen approach
> [PATCH 10/10] vfio: allow the user to register reserved iova range for
> MSI mapping
> https://patchwork.kernel.org/patch/8121641/
> 
> Some additional background about why the static SW MSI region chosen by
> the kernel was later chosen:
> Summary of LPC guest MSI discussion in Santa Fe (was: Re: [RFC 0/8] KVM
> PCIe/MSI passthrough on ARM/ARM64 (Alt II))
> https://lists.linuxfoundation.org/pipermail/iommu/2016-November/019060.html
> 
> Thanks
> 
> Eric
> 
> 
>  It would just need a
> > different way of asking the IOMMU driver if a SW_MSI is needed, for
> > example with a domain attribute.
> > 
> > Thanks,
> > Jean
> > 
> >>>
> >>> That, however, doesn't seem to fit here; iommu-dma maps MSI doorbells
> >>> dynamically, so they aren't affected by reserved regions any more than
> >>> regular DMA pages are. In fact, it explicitly ignores the software MSI
> >>> region, since as the comment says, it *is* the software that manages those.
> >> Yes you are right, we don't see any issues with kernel drivers(PCI EP) because
> >> MSI IOVA allocated dynamically by honouring reserved regions same as DMA pages.
> >>>
> >>> The MSI_IOVA_BASE region exists for VFIO, precisely because in that case
> >>> the kernel *doesn't* control the address space, but still needs some way
> >>> to steal a bit of it for MSIs that the guest doesn't necessarily know
> >>> about, and give userspace a fighting chance of knowing what it's taken.
> >>> I think at the time we discussed the idea of adding something to the
> >>> VFIO uapi such that userspace could move this around if it wanted or
> >>> needed to, but decided we could live without that initially. Perhaps now
> >>> the time has come?
> >> Yes, we see issues only with user-space drivers(DPDK) in which MSI_IOVA_BASE
> >> region is considered to map MSI registers. This patch helps us to fix the issue.
> >>
> >> Thanks,
> >> Srinath.
> >>>
> >>> Robin.
> >>>
> >>>> If any platform has the limitaion to access default MSI IOVA, then it can
> >>>> be changed using "arm-smmu.msi_iova_base=0xa0000000" command line argument.
> >>>>
> >>>> Signed-off-by: Srinath Mannam <srinath.mannam@broadcom.com>
> >>>> ---
> >>>>   drivers/iommu/arm-smmu.c | 5 ++++-
> >>>>   1 file changed, 4 insertions(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> >>>> index 4f1a350..5e59c9d 100644
> >>>> --- a/drivers/iommu/arm-smmu.c
> >>>> +++ b/drivers/iommu/arm-smmu.c
> >>>> @@ -72,6 +72,9 @@ static bool disable_bypass =
> >>>>   module_param(disable_bypass, bool, S_IRUGO);
> >>>>   MODULE_PARM_DESC(disable_bypass,
> >>>>       "Disable bypass streams such that incoming transactions from devices that are not attached to an iommu domain will report an abort back to the device and will not be allowed to pass through the SMMU.");
> >>>> +static unsigned long msi_iova_base = MSI_IOVA_BASE;
> >>>> +module_param(msi_iova_base, ulong, S_IRUGO);
> >>>> +MODULE_PARM_DESC(msi_iova_base, "msi iova base address.");
> >>>>
> >>>>   struct arm_smmu_s2cr {
> >>>>       struct iommu_group              *group;
> >>>> @@ -1566,7 +1569,7 @@ static void arm_smmu_get_resv_regions(struct device *dev,
> >>>>       struct iommu_resv_region *region;
> >>>>       int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
> >>>>
> >>>> -     region = iommu_alloc_resv_region(MSI_IOVA_BASE, MSI_IOVA_LENGTH,
> >>>> +     region = iommu_alloc_resv_region(msi_iova_base, MSI_IOVA_LENGTH,
> >>>>                                        prot, IOMMU_RESV_SW_MSI);
> >>>>       if (!region)
> >>>>               return;
> >>>>
> > 
> > _______________________________________________
> > linux-arm-kernel mailing list
> > linux-arm-kernel@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> > 
> 

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC PATCH] iommu/arm-smmu: Add module parameter to set msi iova address
@ 2020-05-28  8:38           ` Jean-Philippe Brucker
  0 siblings, 0 replies; 37+ messages in thread
From: Jean-Philippe Brucker @ 2020-05-28  8:38 UTC (permalink / raw)
  To: Auger Eric
  Cc: Will Deacon, iommu, Linux Kernel Mailing List, Alex Williamson,
	Srinath Mannam, BCM Kernel Feedback, Robin Murphy, Linux ARM

[+ Shameer]

On Thu, May 28, 2020 at 09:43:46AM +0200, Auger Eric wrote:
> Hi,
> 
> On 5/28/20 9:23 AM, Jean-Philippe Brucker wrote:
> > On Thu, May 28, 2020 at 10:45:14AM +0530, Srinath Mannam wrote:
> >> On Wed, May 27, 2020 at 11:00 PM Robin Murphy <robin.murphy@arm.com> wrote:
> >>>
> >> Thanks Robin for your quick response.
> >>> On 2020-05-27 17:03, Srinath Mannam wrote:
> >>>> This patch gives the provision to change default value of MSI IOVA base
> >>>> to platform's suitable IOVA using module parameter. The present
> >>>> hardcoded MSI IOVA base may not be the accessible IOVA ranges of platform.
> >>>
> >>> That in itself doesn't seem entirely unreasonable; IIRC the current
> >>> address is just an arbitrary choice to fit nicely into Qemu's memory
> >>> map, and there was always the possibility that it wouldn't suit everything.
> >>>
> >>>> Since commit aadad097cd46 ("iommu/dma: Reserve IOVA for PCIe inaccessible
> >>>> DMA address"), inaccessible IOVA address ranges parsed from dma-ranges
> >>>> property are reserved.
> > 
> > I don't understand why we only reserve the PCIe windows for DMA domains.
> > Shouldn't VFIO also prevent userspace from mapping them?
> 
> VFIO prevents userspace from DMA mapping iovas within reserved regions:
> 9b77e5c79840  vfio/type1: check dma map request is within a valid iova range

Right but I was asking specifically about the IOVA reservation introduced
by commit aadad097cd46. They are not registered as reserved regions within
the IOMMU core, they are only taken into account by dma-iommu.c when
creating a DMA domain. As VFIO uses UNMANAGED domains, it isn't aware of
those regions and they won't be seen by vfio_iommu_resv_exclude().

It looks like the PCIe regions used to be common until cd2c9fcf5c66
("iommu/dma: Move PCI window region reservation back into dma specific
path.") But I couldn't find the justification for this commit.

The thing is, if VFIO isn't aware of the reserved PCIe windows, then
allowing VFIO or userspace to choose MSI_IOVA_BASE won't solve the problem
reported by Srinath, because they could well choose an IOVA within the
PCIe window...

Thanks,
Jean

> but it does not prevent the SW MSI region chosen by the kernel from
> colliding with other reserved regions (esp. PCIe host bridge windows).
> 
>   If they were
> > part of the common reserved regions then we could have VFIO choose a
> > SW_MSI region among the remaining free space.
> As Robin said this was the initial chosen approach
> [PATCH 10/10] vfio: allow the user to register reserved iova range for
> MSI mapping
> https://patchwork.kernel.org/patch/8121641/
> 
> Some additional background about why the static SW MSI region chosen by
> the kernel was later chosen:
> Summary of LPC guest MSI discussion in Santa Fe (was: Re: [RFC 0/8] KVM
> PCIe/MSI passthrough on ARM/ARM64 (Alt II))
> https://lists.linuxfoundation.org/pipermail/iommu/2016-November/019060.html
> 
> Thanks
> 
> Eric
> 
> 
>  It would just need a
> > different way of asking the IOMMU driver if a SW_MSI is needed, for
> > example with a domain attribute.
> > 
> > Thanks,
> > Jean
> > 
> >>>
> >>> That, however, doesn't seem to fit here; iommu-dma maps MSI doorbells
> >>> dynamically, so they aren't affected by reserved regions any more than
> >>> regular DMA pages are. In fact, it explicitly ignores the software MSI
> >>> region, since as the comment says, it *is* the software that manages those.
> >> Yes you are right, we don't see any issues with kernel drivers(PCI EP) because
> >> MSI IOVA allocated dynamically by honouring reserved regions same as DMA pages.
> >>>
> >>> The MSI_IOVA_BASE region exists for VFIO, precisely because in that case
> >>> the kernel *doesn't* control the address space, but still needs some way
> >>> to steal a bit of it for MSIs that the guest doesn't necessarily know
> >>> about, and give userspace a fighting chance of knowing what it's taken.
> >>> I think at the time we discussed the idea of adding something to the
> >>> VFIO uapi such that userspace could move this around if it wanted or
> >>> needed to, but decided we could live without that initially. Perhaps now
> >>> the time has come?
> >> Yes, we see issues only with user-space drivers(DPDK) in which MSI_IOVA_BASE
> >> region is considered to map MSI registers. This patch helps us to fix the issue.
> >>
> >> Thanks,
> >> Srinath.
> >>>
> >>> Robin.
> >>>
> >>>> If any platform has the limitaion to access default MSI IOVA, then it can
> >>>> be changed using "arm-smmu.msi_iova_base=0xa0000000" command line argument.
> >>>>
> >>>> Signed-off-by: Srinath Mannam <srinath.mannam@broadcom.com>
> >>>> ---
> >>>>   drivers/iommu/arm-smmu.c | 5 ++++-
> >>>>   1 file changed, 4 insertions(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> >>>> index 4f1a350..5e59c9d 100644
> >>>> --- a/drivers/iommu/arm-smmu.c
> >>>> +++ b/drivers/iommu/arm-smmu.c
> >>>> @@ -72,6 +72,9 @@ static bool disable_bypass =
> >>>>   module_param(disable_bypass, bool, S_IRUGO);
> >>>>   MODULE_PARM_DESC(disable_bypass,
> >>>>       "Disable bypass streams such that incoming transactions from devices that are not attached to an iommu domain will report an abort back to the device and will not be allowed to pass through the SMMU.");
> >>>> +static unsigned long msi_iova_base = MSI_IOVA_BASE;
> >>>> +module_param(msi_iova_base, ulong, S_IRUGO);
> >>>> +MODULE_PARM_DESC(msi_iova_base, "msi iova base address.");
> >>>>
> >>>>   struct arm_smmu_s2cr {
> >>>>       struct iommu_group              *group;
> >>>> @@ -1566,7 +1569,7 @@ static void arm_smmu_get_resv_regions(struct device *dev,
> >>>>       struct iommu_resv_region *region;
> >>>>       int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
> >>>>
> >>>> -     region = iommu_alloc_resv_region(MSI_IOVA_BASE, MSI_IOVA_LENGTH,
> >>>> +     region = iommu_alloc_resv_region(msi_iova_base, MSI_IOVA_LENGTH,
> >>>>                                        prot, IOMMU_RESV_SW_MSI);
> >>>>       if (!region)
> >>>>               return;
> >>>>
> > 
> > _______________________________________________
> > linux-arm-kernel mailing list
> > linux-arm-kernel@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> > 
> 
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC PATCH] iommu/arm-smmu: Add module parameter to set msi iova address
@ 2020-05-28  8:38           ` Jean-Philippe Brucker
  0 siblings, 0 replies; 37+ messages in thread
From: Jean-Philippe Brucker @ 2020-05-28  8:38 UTC (permalink / raw)
  To: Auger Eric
  Cc: Will Deacon, Joerg Roedel, iommu, shameerali.kolothum.thodi,
	Linux Kernel Mailing List, Alex Williamson, Srinath Mannam,
	BCM Kernel Feedback, Robin Murphy, Linux ARM

[+ Shameer]

On Thu, May 28, 2020 at 09:43:46AM +0200, Auger Eric wrote:
> Hi,
> 
> On 5/28/20 9:23 AM, Jean-Philippe Brucker wrote:
> > On Thu, May 28, 2020 at 10:45:14AM +0530, Srinath Mannam wrote:
> >> On Wed, May 27, 2020 at 11:00 PM Robin Murphy <robin.murphy@arm.com> wrote:
> >>>
> >> Thanks Robin for your quick response.
> >>> On 2020-05-27 17:03, Srinath Mannam wrote:
> >>>> This patch gives the provision to change default value of MSI IOVA base
> >>>> to platform's suitable IOVA using module parameter. The present
> >>>> hardcoded MSI IOVA base may not be the accessible IOVA ranges of platform.
> >>>
> >>> That in itself doesn't seem entirely unreasonable; IIRC the current
> >>> address is just an arbitrary choice to fit nicely into Qemu's memory
> >>> map, and there was always the possibility that it wouldn't suit everything.
> >>>
> >>>> Since commit aadad097cd46 ("iommu/dma: Reserve IOVA for PCIe inaccessible
> >>>> DMA address"), inaccessible IOVA address ranges parsed from dma-ranges
> >>>> property are reserved.
> > 
> > I don't understand why we only reserve the PCIe windows for DMA domains.
> > Shouldn't VFIO also prevent userspace from mapping them?
> 
> VFIO prevents userspace from DMA mapping iovas within reserved regions:
> 9b77e5c79840  vfio/type1: check dma map request is within a valid iova range

Right but I was asking specifically about the IOVA reservation introduced
by commit aadad097cd46. They are not registered as reserved regions within
the IOMMU core, they are only taken into account by dma-iommu.c when
creating a DMA domain. As VFIO uses UNMANAGED domains, it isn't aware of
those regions and they won't be seen by vfio_iommu_resv_exclude().

It looks like the PCIe regions used to be common until cd2c9fcf5c66
("iommu/dma: Move PCI window region reservation back into dma specific
path.") But I couldn't find the justification for this commit.

The thing is, if VFIO isn't aware of the reserved PCIe windows, then
allowing VFIO or userspace to choose MSI_IOVA_BASE won't solve the problem
reported by Srinath, because they could well choose an IOVA within the
PCIe window...

Thanks,
Jean

> but it does not prevent the SW MSI region chosen by the kernel from
> colliding with other reserved regions (esp. PCIe host bridge windows).
> 
>   If they were
> > part of the common reserved regions then we could have VFIO choose a
> > SW_MSI region among the remaining free space.
> As Robin said this was the initial chosen approach
> [PATCH 10/10] vfio: allow the user to register reserved iova range for
> MSI mapping
> https://patchwork.kernel.org/patch/8121641/
> 
> Some additional background about why the static SW MSI region chosen by
> the kernel was later chosen:
> Summary of LPC guest MSI discussion in Santa Fe (was: Re: [RFC 0/8] KVM
> PCIe/MSI passthrough on ARM/ARM64 (Alt II))
> https://lists.linuxfoundation.org/pipermail/iommu/2016-November/019060.html
> 
> Thanks
> 
> Eric
> 
> 
>  It would just need a
> > different way of asking the IOMMU driver if a SW_MSI is needed, for
> > example with a domain attribute.
> > 
> > Thanks,
> > Jean
> > 
> >>>
> >>> That, however, doesn't seem to fit here; iommu-dma maps MSI doorbells
> >>> dynamically, so they aren't affected by reserved regions any more than
> >>> regular DMA pages are. In fact, it explicitly ignores the software MSI
> >>> region, since as the comment says, it *is* the software that manages those.
> >> Yes you are right, we don't see any issues with kernel drivers(PCI EP) because
> >> MSI IOVA allocated dynamically by honouring reserved regions same as DMA pages.
> >>>
> >>> The MSI_IOVA_BASE region exists for VFIO, precisely because in that case
> >>> the kernel *doesn't* control the address space, but still needs some way
> >>> to steal a bit of it for MSIs that the guest doesn't necessarily know
> >>> about, and give userspace a fighting chance of knowing what it's taken.
> >>> I think at the time we discussed the idea of adding something to the
> >>> VFIO uapi such that userspace could move this around if it wanted or
> >>> needed to, but decided we could live without that initially. Perhaps now
> >>> the time has come?
> >> Yes, we see issues only with user-space drivers(DPDK) in which MSI_IOVA_BASE
> >> region is considered to map MSI registers. This patch helps us to fix the issue.
> >>
> >> Thanks,
> >> Srinath.
> >>>
> >>> Robin.
> >>>
> >>>> If any platform has the limitaion to access default MSI IOVA, then it can
> >>>> be changed using "arm-smmu.msi_iova_base=0xa0000000" command line argument.
> >>>>
> >>>> Signed-off-by: Srinath Mannam <srinath.mannam@broadcom.com>
> >>>> ---
> >>>>   drivers/iommu/arm-smmu.c | 5 ++++-
> >>>>   1 file changed, 4 insertions(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> >>>> index 4f1a350..5e59c9d 100644
> >>>> --- a/drivers/iommu/arm-smmu.c
> >>>> +++ b/drivers/iommu/arm-smmu.c
> >>>> @@ -72,6 +72,9 @@ static bool disable_bypass =
> >>>>   module_param(disable_bypass, bool, S_IRUGO);
> >>>>   MODULE_PARM_DESC(disable_bypass,
> >>>>       "Disable bypass streams such that incoming transactions from devices that are not attached to an iommu domain will report an abort back to the device and will not be allowed to pass through the SMMU.");
> >>>> +static unsigned long msi_iova_base = MSI_IOVA_BASE;
> >>>> +module_param(msi_iova_base, ulong, S_IRUGO);
> >>>> +MODULE_PARM_DESC(msi_iova_base, "msi iova base address.");
> >>>>
> >>>>   struct arm_smmu_s2cr {
> >>>>       struct iommu_group              *group;
> >>>> @@ -1566,7 +1569,7 @@ static void arm_smmu_get_resv_regions(struct device *dev,
> >>>>       struct iommu_resv_region *region;
> >>>>       int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
> >>>>
> >>>> -     region = iommu_alloc_resv_region(MSI_IOVA_BASE, MSI_IOVA_LENGTH,
> >>>> +     region = iommu_alloc_resv_region(msi_iova_base, MSI_IOVA_LENGTH,
> >>>>                                        prot, IOMMU_RESV_SW_MSI);
> >>>>       if (!region)
> >>>>               return;
> >>>>
> > 
> > _______________________________________________
> > linux-arm-kernel mailing list
> > linux-arm-kernel@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> > 
> 

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC PATCH] iommu/arm-smmu: Add module parameter to set msi iova address
  2020-05-28  8:38           ` Jean-Philippe Brucker
  (?)
@ 2020-05-28  8:53             ` Auger Eric
  -1 siblings, 0 replies; 37+ messages in thread
From: Auger Eric @ 2020-05-28  8:53 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: Will Deacon, Joerg Roedel, iommu, shameerali.kolothum.thodi,
	Linux Kernel Mailing List, Alex Williamson, Srinath Mannam,
	BCM Kernel Feedback, Robin Murphy, Linux ARM

Hi,

On 5/28/20 10:38 AM, Jean-Philippe Brucker wrote:
> [+ Shameer]
> 
> On Thu, May 28, 2020 at 09:43:46AM +0200, Auger Eric wrote:
>> Hi,
>>
>> On 5/28/20 9:23 AM, Jean-Philippe Brucker wrote:
>>> On Thu, May 28, 2020 at 10:45:14AM +0530, Srinath Mannam wrote:
>>>> On Wed, May 27, 2020 at 11:00 PM Robin Murphy <robin.murphy@arm.com> wrote:
>>>>>
>>>> Thanks Robin for your quick response.
>>>>> On 2020-05-27 17:03, Srinath Mannam wrote:
>>>>>> This patch gives the provision to change default value of MSI IOVA base
>>>>>> to platform's suitable IOVA using module parameter. The present
>>>>>> hardcoded MSI IOVA base may not be the accessible IOVA ranges of platform.
>>>>>
>>>>> That in itself doesn't seem entirely unreasonable; IIRC the current
>>>>> address is just an arbitrary choice to fit nicely into Qemu's memory
>>>>> map, and there was always the possibility that it wouldn't suit everything.
>>>>>
>>>>>> Since commit aadad097cd46 ("iommu/dma: Reserve IOVA for PCIe inaccessible
>>>>>> DMA address"), inaccessible IOVA address ranges parsed from dma-ranges
>>>>>> property are reserved.
>>>
>>> I don't understand why we only reserve the PCIe windows for DMA domains.
>>> Shouldn't VFIO also prevent userspace from mapping them?
>>
>> VFIO prevents userspace from DMA mapping iovas within reserved regions:
>> 9b77e5c79840  vfio/type1: check dma map request is within a valid iova range
> 
> Right but I was asking specifically about the IOVA reservation introduced
> by commit aadad097cd46. They are not registered as reserved regions within
> the IOMMU core, they are only taken into account by dma-iommu.c when
> creating a DMA domain. As VFIO uses UNMANAGED domains, it isn't aware of
> those regions and they won't be seen by vfio_iommu_resv_exclude().
> 
> It looks like the PCIe regions used to be common until cd2c9fcf5c66
> ("iommu/dma: Move PCI window region reservation back into dma specific
> path.") But I couldn't find the justification for this commit.

Yes I noticed that as well when debugging the above mentioned case
before and after cd2c9fcf5c66. I do not remember about the rationale of
removing the DMA host brige windows from the resv regions. Did it break
a legacy case?
> 
> The thing is, if VFIO isn't aware of the reserved PCIe windows, then
> allowing VFIO or userspace to choose MSI_IOVA_BASE won't solve the problem
> reported by Srinath, because they could well choose an IOVA within the
> PCIe window...
I agree with you

Thanks

Eric
> 
> Thanks,
> Jean
> 
>> but it does not prevent the SW MSI region chosen by the kernel from
>> colliding with other reserved regions (esp. PCIe host bridge windows).
>>
>>   If they were
>>> part of the common reserved regions then we could have VFIO choose a
>>> SW_MSI region among the remaining free space.
>> As Robin said this was the initial chosen approach
>> [PATCH 10/10] vfio: allow the user to register reserved iova range for
>> MSI mapping
>> https://patchwork.kernel.org/patch/8121641/
>>
>> Some additional background about why the static SW MSI region chosen by
>> the kernel was later chosen:
>> Summary of LPC guest MSI discussion in Santa Fe (was: Re: [RFC 0/8] KVM
>> PCIe/MSI passthrough on ARM/ARM64 (Alt II))
>> https://lists.linuxfoundation.org/pipermail/iommu/2016-November/019060.html
>>
>> Thanks
>>
>> Eric
>>
>>
>>  It would just need a
>>> different way of asking the IOMMU driver if a SW_MSI is needed, for
>>> example with a domain attribute.
>>>
>>> Thanks,
>>> Jean
>>>
>>>>>
>>>>> That, however, doesn't seem to fit here; iommu-dma maps MSI doorbells
>>>>> dynamically, so they aren't affected by reserved regions any more than
>>>>> regular DMA pages are. In fact, it explicitly ignores the software MSI
>>>>> region, since as the comment says, it *is* the software that manages those.
>>>> Yes you are right, we don't see any issues with kernel drivers(PCI EP) because
>>>> MSI IOVA allocated dynamically by honouring reserved regions same as DMA pages.
>>>>>
>>>>> The MSI_IOVA_BASE region exists for VFIO, precisely because in that case
>>>>> the kernel *doesn't* control the address space, but still needs some way
>>>>> to steal a bit of it for MSIs that the guest doesn't necessarily know
>>>>> about, and give userspace a fighting chance of knowing what it's taken.
>>>>> I think at the time we discussed the idea of adding something to the
>>>>> VFIO uapi such that userspace could move this around if it wanted or
>>>>> needed to, but decided we could live without that initially. Perhaps now
>>>>> the time has come?
>>>> Yes, we see issues only with user-space drivers(DPDK) in which MSI_IOVA_BASE
>>>> region is considered to map MSI registers. This patch helps us to fix the issue.
>>>>
>>>> Thanks,
>>>> Srinath.
>>>>>
>>>>> Robin.
>>>>>
>>>>>> If any platform has the limitaion to access default MSI IOVA, then it can
>>>>>> be changed using "arm-smmu.msi_iova_base=0xa0000000" command line argument.
>>>>>>
>>>>>> Signed-off-by: Srinath Mannam <srinath.mannam@broadcom.com>
>>>>>> ---
>>>>>>   drivers/iommu/arm-smmu.c | 5 ++++-
>>>>>>   1 file changed, 4 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
>>>>>> index 4f1a350..5e59c9d 100644
>>>>>> --- a/drivers/iommu/arm-smmu.c
>>>>>> +++ b/drivers/iommu/arm-smmu.c
>>>>>> @@ -72,6 +72,9 @@ static bool disable_bypass =
>>>>>>   module_param(disable_bypass, bool, S_IRUGO);
>>>>>>   MODULE_PARM_DESC(disable_bypass,
>>>>>>       "Disable bypass streams such that incoming transactions from devices that are not attached to an iommu domain will report an abort back to the device and will not be allowed to pass through the SMMU.");
>>>>>> +static unsigned long msi_iova_base = MSI_IOVA_BASE;
>>>>>> +module_param(msi_iova_base, ulong, S_IRUGO);
>>>>>> +MODULE_PARM_DESC(msi_iova_base, "msi iova base address.");
>>>>>>
>>>>>>   struct arm_smmu_s2cr {
>>>>>>       struct iommu_group              *group;
>>>>>> @@ -1566,7 +1569,7 @@ static void arm_smmu_get_resv_regions(struct device *dev,
>>>>>>       struct iommu_resv_region *region;
>>>>>>       int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
>>>>>>
>>>>>> -     region = iommu_alloc_resv_region(MSI_IOVA_BASE, MSI_IOVA_LENGTH,
>>>>>> +     region = iommu_alloc_resv_region(msi_iova_base, MSI_IOVA_LENGTH,
>>>>>>                                        prot, IOMMU_RESV_SW_MSI);
>>>>>>       if (!region)
>>>>>>               return;
>>>>>>
>>>
>>> _______________________________________________
>>> linux-arm-kernel mailing list
>>> linux-arm-kernel@lists.infradead.org
>>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>>>
>>
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> 


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC PATCH] iommu/arm-smmu: Add module parameter to set msi iova address
@ 2020-05-28  8:53             ` Auger Eric
  0 siblings, 0 replies; 37+ messages in thread
From: Auger Eric @ 2020-05-28  8:53 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: Robin Murphy, iommu, Linux Kernel Mailing List, Alex Williamson,
	Srinath Mannam, BCM Kernel Feedback, Will Deacon, Linux ARM

Hi,

On 5/28/20 10:38 AM, Jean-Philippe Brucker wrote:
> [+ Shameer]
> 
> On Thu, May 28, 2020 at 09:43:46AM +0200, Auger Eric wrote:
>> Hi,
>>
>> On 5/28/20 9:23 AM, Jean-Philippe Brucker wrote:
>>> On Thu, May 28, 2020 at 10:45:14AM +0530, Srinath Mannam wrote:
>>>> On Wed, May 27, 2020 at 11:00 PM Robin Murphy <robin.murphy@arm.com> wrote:
>>>>>
>>>> Thanks Robin for your quick response.
>>>>> On 2020-05-27 17:03, Srinath Mannam wrote:
>>>>>> This patch gives the provision to change default value of MSI IOVA base
>>>>>> to platform's suitable IOVA using module parameter. The present
>>>>>> hardcoded MSI IOVA base may not be the accessible IOVA ranges of platform.
>>>>>
>>>>> That in itself doesn't seem entirely unreasonable; IIRC the current
>>>>> address is just an arbitrary choice to fit nicely into Qemu's memory
>>>>> map, and there was always the possibility that it wouldn't suit everything.
>>>>>
>>>>>> Since commit aadad097cd46 ("iommu/dma: Reserve IOVA for PCIe inaccessible
>>>>>> DMA address"), inaccessible IOVA address ranges parsed from dma-ranges
>>>>>> property are reserved.
>>>
>>> I don't understand why we only reserve the PCIe windows for DMA domains.
>>> Shouldn't VFIO also prevent userspace from mapping them?
>>
>> VFIO prevents userspace from DMA mapping iovas within reserved regions:
>> 9b77e5c79840  vfio/type1: check dma map request is within a valid iova range
> 
> Right but I was asking specifically about the IOVA reservation introduced
> by commit aadad097cd46. They are not registered as reserved regions within
> the IOMMU core, they are only taken into account by dma-iommu.c when
> creating a DMA domain. As VFIO uses UNMANAGED domains, it isn't aware of
> those regions and they won't be seen by vfio_iommu_resv_exclude().
> 
> It looks like the PCIe regions used to be common until cd2c9fcf5c66
> ("iommu/dma: Move PCI window region reservation back into dma specific
> path.") But I couldn't find the justification for this commit.

Yes I noticed that as well when debugging the above mentioned case
before and after cd2c9fcf5c66. I do not remember about the rationale of
removing the DMA host brige windows from the resv regions. Did it break
a legacy case?
> 
> The thing is, if VFIO isn't aware of the reserved PCIe windows, then
> allowing VFIO or userspace to choose MSI_IOVA_BASE won't solve the problem
> reported by Srinath, because they could well choose an IOVA within the
> PCIe window...
I agree with you

Thanks

Eric
> 
> Thanks,
> Jean
> 
>> but it does not prevent the SW MSI region chosen by the kernel from
>> colliding with other reserved regions (esp. PCIe host bridge windows).
>>
>>   If they were
>>> part of the common reserved regions then we could have VFIO choose a
>>> SW_MSI region among the remaining free space.
>> As Robin said this was the initial chosen approach
>> [PATCH 10/10] vfio: allow the user to register reserved iova range for
>> MSI mapping
>> https://patchwork.kernel.org/patch/8121641/
>>
>> Some additional background about why the static SW MSI region chosen by
>> the kernel was later chosen:
>> Summary of LPC guest MSI discussion in Santa Fe (was: Re: [RFC 0/8] KVM
>> PCIe/MSI passthrough on ARM/ARM64 (Alt II))
>> https://lists.linuxfoundation.org/pipermail/iommu/2016-November/019060.html
>>
>> Thanks
>>
>> Eric
>>
>>
>>  It would just need a
>>> different way of asking the IOMMU driver if a SW_MSI is needed, for
>>> example with a domain attribute.
>>>
>>> Thanks,
>>> Jean
>>>
>>>>>
>>>>> That, however, doesn't seem to fit here; iommu-dma maps MSI doorbells
>>>>> dynamically, so they aren't affected by reserved regions any more than
>>>>> regular DMA pages are. In fact, it explicitly ignores the software MSI
>>>>> region, since as the comment says, it *is* the software that manages those.
>>>> Yes you are right, we don't see any issues with kernel drivers(PCI EP) because
>>>> MSI IOVA allocated dynamically by honouring reserved regions same as DMA pages.
>>>>>
>>>>> The MSI_IOVA_BASE region exists for VFIO, precisely because in that case
>>>>> the kernel *doesn't* control the address space, but still needs some way
>>>>> to steal a bit of it for MSIs that the guest doesn't necessarily know
>>>>> about, and give userspace a fighting chance of knowing what it's taken.
>>>>> I think at the time we discussed the idea of adding something to the
>>>>> VFIO uapi such that userspace could move this around if it wanted or
>>>>> needed to, but decided we could live without that initially. Perhaps now
>>>>> the time has come?
>>>> Yes, we see issues only with user-space drivers(DPDK) in which MSI_IOVA_BASE
>>>> region is considered to map MSI registers. This patch helps us to fix the issue.
>>>>
>>>> Thanks,
>>>> Srinath.
>>>>>
>>>>> Robin.
>>>>>
>>>>>> If any platform has the limitaion to access default MSI IOVA, then it can
>>>>>> be changed using "arm-smmu.msi_iova_base=0xa0000000" command line argument.
>>>>>>
>>>>>> Signed-off-by: Srinath Mannam <srinath.mannam@broadcom.com>
>>>>>> ---
>>>>>>   drivers/iommu/arm-smmu.c | 5 ++++-
>>>>>>   1 file changed, 4 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
>>>>>> index 4f1a350..5e59c9d 100644
>>>>>> --- a/drivers/iommu/arm-smmu.c
>>>>>> +++ b/drivers/iommu/arm-smmu.c
>>>>>> @@ -72,6 +72,9 @@ static bool disable_bypass =
>>>>>>   module_param(disable_bypass, bool, S_IRUGO);
>>>>>>   MODULE_PARM_DESC(disable_bypass,
>>>>>>       "Disable bypass streams such that incoming transactions from devices that are not attached to an iommu domain will report an abort back to the device and will not be allowed to pass through the SMMU.");
>>>>>> +static unsigned long msi_iova_base = MSI_IOVA_BASE;
>>>>>> +module_param(msi_iova_base, ulong, S_IRUGO);
>>>>>> +MODULE_PARM_DESC(msi_iova_base, "msi iova base address.");
>>>>>>
>>>>>>   struct arm_smmu_s2cr {
>>>>>>       struct iommu_group              *group;
>>>>>> @@ -1566,7 +1569,7 @@ static void arm_smmu_get_resv_regions(struct device *dev,
>>>>>>       struct iommu_resv_region *region;
>>>>>>       int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
>>>>>>
>>>>>> -     region = iommu_alloc_resv_region(MSI_IOVA_BASE, MSI_IOVA_LENGTH,
>>>>>> +     region = iommu_alloc_resv_region(msi_iova_base, MSI_IOVA_LENGTH,
>>>>>>                                        prot, IOMMU_RESV_SW_MSI);
>>>>>>       if (!region)
>>>>>>               return;
>>>>>>
>>>
>>> _______________________________________________
>>> linux-arm-kernel mailing list
>>> linux-arm-kernel@lists.infradead.org
>>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>>>
>>
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> 

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC PATCH] iommu/arm-smmu: Add module parameter to set msi iova address
@ 2020-05-28  8:53             ` Auger Eric
  0 siblings, 0 replies; 37+ messages in thread
From: Auger Eric @ 2020-05-28  8:53 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: Robin Murphy, Joerg Roedel, iommu, shameerali.kolothum.thodi,
	Linux Kernel Mailing List, Alex Williamson, Srinath Mannam,
	BCM Kernel Feedback, Will Deacon, Linux ARM

Hi,

On 5/28/20 10:38 AM, Jean-Philippe Brucker wrote:
> [+ Shameer]
> 
> On Thu, May 28, 2020 at 09:43:46AM +0200, Auger Eric wrote:
>> Hi,
>>
>> On 5/28/20 9:23 AM, Jean-Philippe Brucker wrote:
>>> On Thu, May 28, 2020 at 10:45:14AM +0530, Srinath Mannam wrote:
>>>> On Wed, May 27, 2020 at 11:00 PM Robin Murphy <robin.murphy@arm.com> wrote:
>>>>>
>>>> Thanks Robin for your quick response.
>>>>> On 2020-05-27 17:03, Srinath Mannam wrote:
>>>>>> This patch gives the provision to change default value of MSI IOVA base
>>>>>> to platform's suitable IOVA using module parameter. The present
>>>>>> hardcoded MSI IOVA base may not be the accessible IOVA ranges of platform.
>>>>>
>>>>> That in itself doesn't seem entirely unreasonable; IIRC the current
>>>>> address is just an arbitrary choice to fit nicely into Qemu's memory
>>>>> map, and there was always the possibility that it wouldn't suit everything.
>>>>>
>>>>>> Since commit aadad097cd46 ("iommu/dma: Reserve IOVA for PCIe inaccessible
>>>>>> DMA address"), inaccessible IOVA address ranges parsed from dma-ranges
>>>>>> property are reserved.
>>>
>>> I don't understand why we only reserve the PCIe windows for DMA domains.
>>> Shouldn't VFIO also prevent userspace from mapping them?
>>
>> VFIO prevents userspace from DMA mapping iovas within reserved regions:
>> 9b77e5c79840  vfio/type1: check dma map request is within a valid iova range
> 
> Right but I was asking specifically about the IOVA reservation introduced
> by commit aadad097cd46. They are not registered as reserved regions within
> the IOMMU core, they are only taken into account by dma-iommu.c when
> creating a DMA domain. As VFIO uses UNMANAGED domains, it isn't aware of
> those regions and they won't be seen by vfio_iommu_resv_exclude().
> 
> It looks like the PCIe regions used to be common until cd2c9fcf5c66
> ("iommu/dma: Move PCI window region reservation back into dma specific
> path.") But I couldn't find the justification for this commit.

Yes I noticed that as well when debugging the above mentioned case
before and after cd2c9fcf5c66. I do not remember about the rationale of
removing the DMA host brige windows from the resv regions. Did it break
a legacy case?
> 
> The thing is, if VFIO isn't aware of the reserved PCIe windows, then
> allowing VFIO or userspace to choose MSI_IOVA_BASE won't solve the problem
> reported by Srinath, because they could well choose an IOVA within the
> PCIe window...
I agree with you

Thanks

Eric
> 
> Thanks,
> Jean
> 
>> but it does not prevent the SW MSI region chosen by the kernel from
>> colliding with other reserved regions (esp. PCIe host bridge windows).
>>
>>   If they were
>>> part of the common reserved regions then we could have VFIO choose a
>>> SW_MSI region among the remaining free space.
>> As Robin said this was the initial chosen approach
>> [PATCH 10/10] vfio: allow the user to register reserved iova range for
>> MSI mapping
>> https://patchwork.kernel.org/patch/8121641/
>>
>> Some additional background about why the static SW MSI region chosen by
>> the kernel was later chosen:
>> Summary of LPC guest MSI discussion in Santa Fe (was: Re: [RFC 0/8] KVM
>> PCIe/MSI passthrough on ARM/ARM64 (Alt II))
>> https://lists.linuxfoundation.org/pipermail/iommu/2016-November/019060.html
>>
>> Thanks
>>
>> Eric
>>
>>
>>  It would just need a
>>> different way of asking the IOMMU driver if a SW_MSI is needed, for
>>> example with a domain attribute.
>>>
>>> Thanks,
>>> Jean
>>>
>>>>>
>>>>> That, however, doesn't seem to fit here; iommu-dma maps MSI doorbells
>>>>> dynamically, so they aren't affected by reserved regions any more than
>>>>> regular DMA pages are. In fact, it explicitly ignores the software MSI
>>>>> region, since as the comment says, it *is* the software that manages those.
>>>> Yes you are right, we don't see any issues with kernel drivers(PCI EP) because
>>>> MSI IOVA allocated dynamically by honouring reserved regions same as DMA pages.
>>>>>
>>>>> The MSI_IOVA_BASE region exists for VFIO, precisely because in that case
>>>>> the kernel *doesn't* control the address space, but still needs some way
>>>>> to steal a bit of it for MSIs that the guest doesn't necessarily know
>>>>> about, and give userspace a fighting chance of knowing what it's taken.
>>>>> I think at the time we discussed the idea of adding something to the
>>>>> VFIO uapi such that userspace could move this around if it wanted or
>>>>> needed to, but decided we could live without that initially. Perhaps now
>>>>> the time has come?
>>>> Yes, we see issues only with user-space drivers(DPDK) in which MSI_IOVA_BASE
>>>> region is considered to map MSI registers. This patch helps us to fix the issue.
>>>>
>>>> Thanks,
>>>> Srinath.
>>>>>
>>>>> Robin.
>>>>>
>>>>>> If any platform has the limitaion to access default MSI IOVA, then it can
>>>>>> be changed using "arm-smmu.msi_iova_base=0xa0000000" command line argument.
>>>>>>
>>>>>> Signed-off-by: Srinath Mannam <srinath.mannam@broadcom.com>
>>>>>> ---
>>>>>>   drivers/iommu/arm-smmu.c | 5 ++++-
>>>>>>   1 file changed, 4 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
>>>>>> index 4f1a350..5e59c9d 100644
>>>>>> --- a/drivers/iommu/arm-smmu.c
>>>>>> +++ b/drivers/iommu/arm-smmu.c
>>>>>> @@ -72,6 +72,9 @@ static bool disable_bypass =
>>>>>>   module_param(disable_bypass, bool, S_IRUGO);
>>>>>>   MODULE_PARM_DESC(disable_bypass,
>>>>>>       "Disable bypass streams such that incoming transactions from devices that are not attached to an iommu domain will report an abort back to the device and will not be allowed to pass through the SMMU.");
>>>>>> +static unsigned long msi_iova_base = MSI_IOVA_BASE;
>>>>>> +module_param(msi_iova_base, ulong, S_IRUGO);
>>>>>> +MODULE_PARM_DESC(msi_iova_base, "msi iova base address.");
>>>>>>
>>>>>>   struct arm_smmu_s2cr {
>>>>>>       struct iommu_group              *group;
>>>>>> @@ -1566,7 +1569,7 @@ static void arm_smmu_get_resv_regions(struct device *dev,
>>>>>>       struct iommu_resv_region *region;
>>>>>>       int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
>>>>>>
>>>>>> -     region = iommu_alloc_resv_region(MSI_IOVA_BASE, MSI_IOVA_LENGTH,
>>>>>> +     region = iommu_alloc_resv_region(msi_iova_base, MSI_IOVA_LENGTH,
>>>>>>                                        prot, IOMMU_RESV_SW_MSI);
>>>>>>       if (!region)
>>>>>>               return;
>>>>>>
>>>
>>> _______________________________________________
>>> linux-arm-kernel mailing list
>>> linux-arm-kernel@lists.infradead.org
>>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>>>
>>
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> 


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* RE: [RFC PATCH] iommu/arm-smmu: Add module parameter to set msi iova address
  2020-05-28  8:53             ` Auger Eric
  (?)
@ 2020-05-28  9:15               ` Shameerali Kolothum Thodi
  -1 siblings, 0 replies; 37+ messages in thread
From: Shameerali Kolothum Thodi @ 2020-05-28  9:15 UTC (permalink / raw)
  To: Auger Eric, Jean-Philippe Brucker
  Cc: Will Deacon, Joerg Roedel, iommu, Linux Kernel Mailing List,
	Alex Williamson, Srinath Mannam, BCM Kernel Feedback,
	Robin Murphy, Linux ARM



> -----Original Message-----
> From: Auger Eric [mailto:eric.auger@redhat.com]
> Sent: 28 May 2020 09:54
> To: Jean-Philippe Brucker <jean-philippe@linaro.org>
> Cc: Will Deacon <will@kernel.org>; Joerg Roedel <joro@8bytes.org>;
> iommu@lists.linux-foundation.org; Shameerali Kolothum Thodi
> <shameerali.kolothum.thodi@huawei.com>; Linux Kernel Mailing List
> <linux-kernel@vger.kernel.org>; Alex Williamson
> <alex.williamson@redhat.com>; Srinath Mannam
> <srinath.mannam@broadcom.com>; BCM Kernel Feedback
> <bcm-kernel-feedback-list@broadcom.com>; Robin Murphy
> <robin.murphy@arm.com>; Linux ARM <linux-arm-kernel@lists.infradead.org>
> Subject: Re: [RFC PATCH] iommu/arm-smmu: Add module parameter to set msi
> iova address
> 
> Hi,
> 
> On 5/28/20 10:38 AM, Jean-Philippe Brucker wrote:
> > [+ Shameer]
> >
> > On Thu, May 28, 2020 at 09:43:46AM +0200, Auger Eric wrote:
> >> Hi,
> >>
> >> On 5/28/20 9:23 AM, Jean-Philippe Brucker wrote:
> >>> On Thu, May 28, 2020 at 10:45:14AM +0530, Srinath Mannam wrote:
> >>>> On Wed, May 27, 2020 at 11:00 PM Robin Murphy
> <robin.murphy@arm.com> wrote:
> >>>>>
> >>>> Thanks Robin for your quick response.
> >>>>> On 2020-05-27 17:03, Srinath Mannam wrote:
> >>>>>> This patch gives the provision to change default value of MSI IOVA base
> >>>>>> to platform's suitable IOVA using module parameter. The present
> >>>>>> hardcoded MSI IOVA base may not be the accessible IOVA ranges of
> platform.
> >>>>>
> >>>>> That in itself doesn't seem entirely unreasonable; IIRC the current
> >>>>> address is just an arbitrary choice to fit nicely into Qemu's memory
> >>>>> map, and there was always the possibility that it wouldn't suit
> everything.
> >>>>>
> >>>>>> Since commit aadad097cd46 ("iommu/dma: Reserve IOVA for PCIe
> inaccessible
> >>>>>> DMA address"), inaccessible IOVA address ranges parsed from
> dma-ranges
> >>>>>> property are reserved.
> >>>
> >>> I don't understand why we only reserve the PCIe windows for DMA
> domains.
> >>> Shouldn't VFIO also prevent userspace from mapping them?
> >>
> >> VFIO prevents userspace from DMA mapping iovas within reserved regions:
> >> 9b77e5c79840  vfio/type1: check dma map request is within a valid iova
> range
> >
> > Right but I was asking specifically about the IOVA reservation introduced
> > by commit aadad097cd46. They are not registered as reserved regions within
> > the IOMMU core, they are only taken into account by dma-iommu.c when
> > creating a DMA domain. As VFIO uses UNMANAGED domains, it isn't aware
> of
> > those regions and they won't be seen by vfio_iommu_resv_exclude().
> >
> > It looks like the PCIe regions used to be common until cd2c9fcf5c66
> > ("iommu/dma: Move PCI window region reservation back into dma specific
> > path.") But I couldn't find the justification for this commit.
> 
> Yes I noticed that as well when debugging the above mentioned case
> before and after cd2c9fcf5c66. I do not remember about the rationale of
> removing the DMA host brige windows from the resv regions. Did it break
> a legacy case?
> >

I think yes. And going through the ML discussions, this was done so because with the 
" vfio/type1: Add support for valid iova list management" series you reported
an issue with Seattle platform. See the full discussion here,

https://lore.kernel.org/patchwork/patch/889012/

Cheers,
Shameer

> > The thing is, if VFIO isn't aware of the reserved PCIe windows, then
> > allowing VFIO or userspace to choose MSI_IOVA_BASE won't solve the
> problem
> > reported by Srinath, because they could well choose an IOVA within the
> > PCIe window...
> I agree with you
> 
> Thanks
> 
> Eric
> >
> > Thanks,
> > Jean
> >
> >> but it does not prevent the SW MSI region chosen by the kernel from
> >> colliding with other reserved regions (esp. PCIe host bridge windows).
> >>
> >>   If they were
> >>> part of the common reserved regions then we could have VFIO choose a
> >>> SW_MSI region among the remaining free space.
> >> As Robin said this was the initial chosen approach
> >> [PATCH 10/10] vfio: allow the user to register reserved iova range for
> >> MSI mapping
> >> https://patchwork.kernel.org/patch/8121641/
> >>
> >> Some additional background about why the static SW MSI region chosen by
> >> the kernel was later chosen:
> >> Summary of LPC guest MSI discussion in Santa Fe (was: Re: [RFC 0/8] KVM
> >> PCIe/MSI passthrough on ARM/ARM64 (Alt II))
> >>
> https://lists.linuxfoundation.org/pipermail/iommu/2016-November/019060.ht
> ml
> >>
> >> Thanks
> >>
> >> Eric
> >>
> >>
> >>  It would just need a
> >>> different way of asking the IOMMU driver if a SW_MSI is needed, for
> >>> example with a domain attribute.
> >>>
> >>> Thanks,
> >>> Jean
> >>>
> >>>>>
> >>>>> That, however, doesn't seem to fit here; iommu-dma maps MSI
> doorbells
> >>>>> dynamically, so they aren't affected by reserved regions any more than
> >>>>> regular DMA pages are. In fact, it explicitly ignores the software MSI
> >>>>> region, since as the comment says, it *is* the software that manages
> those.
> >>>> Yes you are right, we don't see any issues with kernel drivers(PCI EP)
> because
> >>>> MSI IOVA allocated dynamically by honouring reserved regions same as
> DMA pages.
> >>>>>
> >>>>> The MSI_IOVA_BASE region exists for VFIO, precisely because in that
> case
> >>>>> the kernel *doesn't* control the address space, but still needs some way
> >>>>> to steal a bit of it for MSIs that the guest doesn't necessarily know
> >>>>> about, and give userspace a fighting chance of knowing what it's taken.
> >>>>> I think at the time we discussed the idea of adding something to the
> >>>>> VFIO uapi such that userspace could move this around if it wanted or
> >>>>> needed to, but decided we could live without that initially. Perhaps now
> >>>>> the time has come?
> >>>> Yes, we see issues only with user-space drivers(DPDK) in which
> MSI_IOVA_BASE
> >>>> region is considered to map MSI registers. This patch helps us to fix the
> issue.
> >>>>
> >>>> Thanks,
> >>>> Srinath.
> >>>>>
> >>>>> Robin.
> >>>>>
> >>>>>> If any platform has the limitaion to access default MSI IOVA, then it can
> >>>>>> be changed using "arm-smmu.msi_iova_base=0xa0000000" command
> line argument.
> >>>>>>
> >>>>>> Signed-off-by: Srinath Mannam <srinath.mannam@broadcom.com>
> >>>>>> ---
> >>>>>>   drivers/iommu/arm-smmu.c | 5 ++++-
> >>>>>>   1 file changed, 4 insertions(+), 1 deletion(-)
> >>>>>>
> >>>>>> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> >>>>>> index 4f1a350..5e59c9d 100644
> >>>>>> --- a/drivers/iommu/arm-smmu.c
> >>>>>> +++ b/drivers/iommu/arm-smmu.c
> >>>>>> @@ -72,6 +72,9 @@ static bool disable_bypass =
> >>>>>>   module_param(disable_bypass, bool, S_IRUGO);
> >>>>>>   MODULE_PARM_DESC(disable_bypass,
> >>>>>>       "Disable bypass streams such that incoming transactions from
> devices that are not attached to an iommu domain will report an abort back to
> the device and will not be allowed to pass through the SMMU.");
> >>>>>> +static unsigned long msi_iova_base = MSI_IOVA_BASE;
> >>>>>> +module_param(msi_iova_base, ulong, S_IRUGO);
> >>>>>> +MODULE_PARM_DESC(msi_iova_base, "msi iova base address.");
> >>>>>>
> >>>>>>   struct arm_smmu_s2cr {
> >>>>>>       struct iommu_group              *group;
> >>>>>> @@ -1566,7 +1569,7 @@ static void
> arm_smmu_get_resv_regions(struct device *dev,
> >>>>>>       struct iommu_resv_region *region;
> >>>>>>       int prot = IOMMU_WRITE | IOMMU_NOEXEC |
> IOMMU_MMIO;
> >>>>>>
> >>>>>> -     region = iommu_alloc_resv_region(MSI_IOVA_BASE,
> MSI_IOVA_LENGTH,
> >>>>>> +     region = iommu_alloc_resv_region(msi_iova_base,
> MSI_IOVA_LENGTH,
> >>>>>>                                        prot,
> IOMMU_RESV_SW_MSI);
> >>>>>>       if (!region)
> >>>>>>               return;
> >>>>>>
> >>>
> >>> _______________________________________________
> >>> linux-arm-kernel mailing list
> >>> linux-arm-kernel@lists.infradead.org
> >>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> >>>
> >>
> >
> > _______________________________________________
> > linux-arm-kernel mailing list
> > linux-arm-kernel@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> >


^ permalink raw reply	[flat|nested] 37+ messages in thread

* RE: [RFC PATCH] iommu/arm-smmu: Add module parameter to set msi iova address
@ 2020-05-28  9:15               ` Shameerali Kolothum Thodi
  0 siblings, 0 replies; 37+ messages in thread
From: Shameerali Kolothum Thodi @ 2020-05-28  9:15 UTC (permalink / raw)
  To: Auger Eric, Jean-Philippe Brucker
  Cc: Robin Murphy, iommu, Linux Kernel Mailing List, Alex Williamson,
	Srinath Mannam, BCM Kernel Feedback, Will Deacon, Linux ARM



> -----Original Message-----
> From: Auger Eric [mailto:eric.auger@redhat.com]
> Sent: 28 May 2020 09:54
> To: Jean-Philippe Brucker <jean-philippe@linaro.org>
> Cc: Will Deacon <will@kernel.org>; Joerg Roedel <joro@8bytes.org>;
> iommu@lists.linux-foundation.org; Shameerali Kolothum Thodi
> <shameerali.kolothum.thodi@huawei.com>; Linux Kernel Mailing List
> <linux-kernel@vger.kernel.org>; Alex Williamson
> <alex.williamson@redhat.com>; Srinath Mannam
> <srinath.mannam@broadcom.com>; BCM Kernel Feedback
> <bcm-kernel-feedback-list@broadcom.com>; Robin Murphy
> <robin.murphy@arm.com>; Linux ARM <linux-arm-kernel@lists.infradead.org>
> Subject: Re: [RFC PATCH] iommu/arm-smmu: Add module parameter to set msi
> iova address
> 
> Hi,
> 
> On 5/28/20 10:38 AM, Jean-Philippe Brucker wrote:
> > [+ Shameer]
> >
> > On Thu, May 28, 2020 at 09:43:46AM +0200, Auger Eric wrote:
> >> Hi,
> >>
> >> On 5/28/20 9:23 AM, Jean-Philippe Brucker wrote:
> >>> On Thu, May 28, 2020 at 10:45:14AM +0530, Srinath Mannam wrote:
> >>>> On Wed, May 27, 2020 at 11:00 PM Robin Murphy
> <robin.murphy@arm.com> wrote:
> >>>>>
> >>>> Thanks Robin for your quick response.
> >>>>> On 2020-05-27 17:03, Srinath Mannam wrote:
> >>>>>> This patch gives the provision to change default value of MSI IOVA base
> >>>>>> to platform's suitable IOVA using module parameter. The present
> >>>>>> hardcoded MSI IOVA base may not be the accessible IOVA ranges of
> platform.
> >>>>>
> >>>>> That in itself doesn't seem entirely unreasonable; IIRC the current
> >>>>> address is just an arbitrary choice to fit nicely into Qemu's memory
> >>>>> map, and there was always the possibility that it wouldn't suit
> everything.
> >>>>>
> >>>>>> Since commit aadad097cd46 ("iommu/dma: Reserve IOVA for PCIe
> inaccessible
> >>>>>> DMA address"), inaccessible IOVA address ranges parsed from
> dma-ranges
> >>>>>> property are reserved.
> >>>
> >>> I don't understand why we only reserve the PCIe windows for DMA
> domains.
> >>> Shouldn't VFIO also prevent userspace from mapping them?
> >>
> >> VFIO prevents userspace from DMA mapping iovas within reserved regions:
> >> 9b77e5c79840  vfio/type1: check dma map request is within a valid iova
> range
> >
> > Right but I was asking specifically about the IOVA reservation introduced
> > by commit aadad097cd46. They are not registered as reserved regions within
> > the IOMMU core, they are only taken into account by dma-iommu.c when
> > creating a DMA domain. As VFIO uses UNMANAGED domains, it isn't aware
> of
> > those regions and they won't be seen by vfio_iommu_resv_exclude().
> >
> > It looks like the PCIe regions used to be common until cd2c9fcf5c66
> > ("iommu/dma: Move PCI window region reservation back into dma specific
> > path.") But I couldn't find the justification for this commit.
> 
> Yes I noticed that as well when debugging the above mentioned case
> before and after cd2c9fcf5c66. I do not remember about the rationale of
> removing the DMA host brige windows from the resv regions. Did it break
> a legacy case?
> >

I think yes. And going through the ML discussions, this was done so because with the 
" vfio/type1: Add support for valid iova list management" series you reported
an issue with Seattle platform. See the full discussion here,

https://lore.kernel.org/patchwork/patch/889012/

Cheers,
Shameer

> > The thing is, if VFIO isn't aware of the reserved PCIe windows, then
> > allowing VFIO or userspace to choose MSI_IOVA_BASE won't solve the
> problem
> > reported by Srinath, because they could well choose an IOVA within the
> > PCIe window...
> I agree with you
> 
> Thanks
> 
> Eric
> >
> > Thanks,
> > Jean
> >
> >> but it does not prevent the SW MSI region chosen by the kernel from
> >> colliding with other reserved regions (esp. PCIe host bridge windows).
> >>
> >>   If they were
> >>> part of the common reserved regions then we could have VFIO choose a
> >>> SW_MSI region among the remaining free space.
> >> As Robin said this was the initial chosen approach
> >> [PATCH 10/10] vfio: allow the user to register reserved iova range for
> >> MSI mapping
> >> https://patchwork.kernel.org/patch/8121641/
> >>
> >> Some additional background about why the static SW MSI region chosen by
> >> the kernel was later chosen:
> >> Summary of LPC guest MSI discussion in Santa Fe (was: Re: [RFC 0/8] KVM
> >> PCIe/MSI passthrough on ARM/ARM64 (Alt II))
> >>
> https://lists.linuxfoundation.org/pipermail/iommu/2016-November/019060.ht
> ml
> >>
> >> Thanks
> >>
> >> Eric
> >>
> >>
> >>  It would just need a
> >>> different way of asking the IOMMU driver if a SW_MSI is needed, for
> >>> example with a domain attribute.
> >>>
> >>> Thanks,
> >>> Jean
> >>>
> >>>>>
> >>>>> That, however, doesn't seem to fit here; iommu-dma maps MSI
> doorbells
> >>>>> dynamically, so they aren't affected by reserved regions any more than
> >>>>> regular DMA pages are. In fact, it explicitly ignores the software MSI
> >>>>> region, since as the comment says, it *is* the software that manages
> those.
> >>>> Yes you are right, we don't see any issues with kernel drivers(PCI EP)
> because
> >>>> MSI IOVA allocated dynamically by honouring reserved regions same as
> DMA pages.
> >>>>>
> >>>>> The MSI_IOVA_BASE region exists for VFIO, precisely because in that
> case
> >>>>> the kernel *doesn't* control the address space, but still needs some way
> >>>>> to steal a bit of it for MSIs that the guest doesn't necessarily know
> >>>>> about, and give userspace a fighting chance of knowing what it's taken.
> >>>>> I think at the time we discussed the idea of adding something to the
> >>>>> VFIO uapi such that userspace could move this around if it wanted or
> >>>>> needed to, but decided we could live without that initially. Perhaps now
> >>>>> the time has come?
> >>>> Yes, we see issues only with user-space drivers(DPDK) in which
> MSI_IOVA_BASE
> >>>> region is considered to map MSI registers. This patch helps us to fix the
> issue.
> >>>>
> >>>> Thanks,
> >>>> Srinath.
> >>>>>
> >>>>> Robin.
> >>>>>
> >>>>>> If any platform has the limitaion to access default MSI IOVA, then it can
> >>>>>> be changed using "arm-smmu.msi_iova_base=0xa0000000" command
> line argument.
> >>>>>>
> >>>>>> Signed-off-by: Srinath Mannam <srinath.mannam@broadcom.com>
> >>>>>> ---
> >>>>>>   drivers/iommu/arm-smmu.c | 5 ++++-
> >>>>>>   1 file changed, 4 insertions(+), 1 deletion(-)
> >>>>>>
> >>>>>> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> >>>>>> index 4f1a350..5e59c9d 100644
> >>>>>> --- a/drivers/iommu/arm-smmu.c
> >>>>>> +++ b/drivers/iommu/arm-smmu.c
> >>>>>> @@ -72,6 +72,9 @@ static bool disable_bypass =
> >>>>>>   module_param(disable_bypass, bool, S_IRUGO);
> >>>>>>   MODULE_PARM_DESC(disable_bypass,
> >>>>>>       "Disable bypass streams such that incoming transactions from
> devices that are not attached to an iommu domain will report an abort back to
> the device and will not be allowed to pass through the SMMU.");
> >>>>>> +static unsigned long msi_iova_base = MSI_IOVA_BASE;
> >>>>>> +module_param(msi_iova_base, ulong, S_IRUGO);
> >>>>>> +MODULE_PARM_DESC(msi_iova_base, "msi iova base address.");
> >>>>>>
> >>>>>>   struct arm_smmu_s2cr {
> >>>>>>       struct iommu_group              *group;
> >>>>>> @@ -1566,7 +1569,7 @@ static void
> arm_smmu_get_resv_regions(struct device *dev,
> >>>>>>       struct iommu_resv_region *region;
> >>>>>>       int prot = IOMMU_WRITE | IOMMU_NOEXEC |
> IOMMU_MMIO;
> >>>>>>
> >>>>>> -     region = iommu_alloc_resv_region(MSI_IOVA_BASE,
> MSI_IOVA_LENGTH,
> >>>>>> +     region = iommu_alloc_resv_region(msi_iova_base,
> MSI_IOVA_LENGTH,
> >>>>>>                                        prot,
> IOMMU_RESV_SW_MSI);
> >>>>>>       if (!region)
> >>>>>>               return;
> >>>>>>
> >>>
> >>> _______________________________________________
> >>> linux-arm-kernel mailing list
> >>> linux-arm-kernel@lists.infradead.org
> >>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> >>>
> >>
> >
> > _______________________________________________
> > linux-arm-kernel mailing list
> > linux-arm-kernel@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> >

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 37+ messages in thread

* RE: [RFC PATCH] iommu/arm-smmu: Add module parameter to set msi iova address
@ 2020-05-28  9:15               ` Shameerali Kolothum Thodi
  0 siblings, 0 replies; 37+ messages in thread
From: Shameerali Kolothum Thodi @ 2020-05-28  9:15 UTC (permalink / raw)
  To: Auger Eric, Jean-Philippe Brucker
  Cc: Robin Murphy, Joerg Roedel, iommu, Linux Kernel Mailing List,
	Alex Williamson, Srinath Mannam, BCM Kernel Feedback,
	Will Deacon, Linux ARM



> -----Original Message-----
> From: Auger Eric [mailto:eric.auger@redhat.com]
> Sent: 28 May 2020 09:54
> To: Jean-Philippe Brucker <jean-philippe@linaro.org>
> Cc: Will Deacon <will@kernel.org>; Joerg Roedel <joro@8bytes.org>;
> iommu@lists.linux-foundation.org; Shameerali Kolothum Thodi
> <shameerali.kolothum.thodi@huawei.com>; Linux Kernel Mailing List
> <linux-kernel@vger.kernel.org>; Alex Williamson
> <alex.williamson@redhat.com>; Srinath Mannam
> <srinath.mannam@broadcom.com>; BCM Kernel Feedback
> <bcm-kernel-feedback-list@broadcom.com>; Robin Murphy
> <robin.murphy@arm.com>; Linux ARM <linux-arm-kernel@lists.infradead.org>
> Subject: Re: [RFC PATCH] iommu/arm-smmu: Add module parameter to set msi
> iova address
> 
> Hi,
> 
> On 5/28/20 10:38 AM, Jean-Philippe Brucker wrote:
> > [+ Shameer]
> >
> > On Thu, May 28, 2020 at 09:43:46AM +0200, Auger Eric wrote:
> >> Hi,
> >>
> >> On 5/28/20 9:23 AM, Jean-Philippe Brucker wrote:
> >>> On Thu, May 28, 2020 at 10:45:14AM +0530, Srinath Mannam wrote:
> >>>> On Wed, May 27, 2020 at 11:00 PM Robin Murphy
> <robin.murphy@arm.com> wrote:
> >>>>>
> >>>> Thanks Robin for your quick response.
> >>>>> On 2020-05-27 17:03, Srinath Mannam wrote:
> >>>>>> This patch gives the provision to change default value of MSI IOVA base
> >>>>>> to platform's suitable IOVA using module parameter. The present
> >>>>>> hardcoded MSI IOVA base may not be the accessible IOVA ranges of
> platform.
> >>>>>
> >>>>> That in itself doesn't seem entirely unreasonable; IIRC the current
> >>>>> address is just an arbitrary choice to fit nicely into Qemu's memory
> >>>>> map, and there was always the possibility that it wouldn't suit
> everything.
> >>>>>
> >>>>>> Since commit aadad097cd46 ("iommu/dma: Reserve IOVA for PCIe
> inaccessible
> >>>>>> DMA address"), inaccessible IOVA address ranges parsed from
> dma-ranges
> >>>>>> property are reserved.
> >>>
> >>> I don't understand why we only reserve the PCIe windows for DMA
> domains.
> >>> Shouldn't VFIO also prevent userspace from mapping them?
> >>
> >> VFIO prevents userspace from DMA mapping iovas within reserved regions:
> >> 9b77e5c79840  vfio/type1: check dma map request is within a valid iova
> range
> >
> > Right but I was asking specifically about the IOVA reservation introduced
> > by commit aadad097cd46. They are not registered as reserved regions within
> > the IOMMU core, they are only taken into account by dma-iommu.c when
> > creating a DMA domain. As VFIO uses UNMANAGED domains, it isn't aware
> of
> > those regions and they won't be seen by vfio_iommu_resv_exclude().
> >
> > It looks like the PCIe regions used to be common until cd2c9fcf5c66
> > ("iommu/dma: Move PCI window region reservation back into dma specific
> > path.") But I couldn't find the justification for this commit.
> 
> Yes I noticed that as well when debugging the above mentioned case
> before and after cd2c9fcf5c66. I do not remember about the rationale of
> removing the DMA host brige windows from the resv regions. Did it break
> a legacy case?
> >

I think yes. And going through the ML discussions, this was done so because with the 
" vfio/type1: Add support for valid iova list management" series you reported
an issue with Seattle platform. See the full discussion here,

https://lore.kernel.org/patchwork/patch/889012/

Cheers,
Shameer

> > The thing is, if VFIO isn't aware of the reserved PCIe windows, then
> > allowing VFIO or userspace to choose MSI_IOVA_BASE won't solve the
> problem
> > reported by Srinath, because they could well choose an IOVA within the
> > PCIe window...
> I agree with you
> 
> Thanks
> 
> Eric
> >
> > Thanks,
> > Jean
> >
> >> but it does not prevent the SW MSI region chosen by the kernel from
> >> colliding with other reserved regions (esp. PCIe host bridge windows).
> >>
> >>   If they were
> >>> part of the common reserved regions then we could have VFIO choose a
> >>> SW_MSI region among the remaining free space.
> >> As Robin said this was the initial chosen approach
> >> [PATCH 10/10] vfio: allow the user to register reserved iova range for
> >> MSI mapping
> >> https://patchwork.kernel.org/patch/8121641/
> >>
> >> Some additional background about why the static SW MSI region chosen by
> >> the kernel was later chosen:
> >> Summary of LPC guest MSI discussion in Santa Fe (was: Re: [RFC 0/8] KVM
> >> PCIe/MSI passthrough on ARM/ARM64 (Alt II))
> >>
> https://lists.linuxfoundation.org/pipermail/iommu/2016-November/019060.ht
> ml
> >>
> >> Thanks
> >>
> >> Eric
> >>
> >>
> >>  It would just need a
> >>> different way of asking the IOMMU driver if a SW_MSI is needed, for
> >>> example with a domain attribute.
> >>>
> >>> Thanks,
> >>> Jean
> >>>
> >>>>>
> >>>>> That, however, doesn't seem to fit here; iommu-dma maps MSI
> doorbells
> >>>>> dynamically, so they aren't affected by reserved regions any more than
> >>>>> regular DMA pages are. In fact, it explicitly ignores the software MSI
> >>>>> region, since as the comment says, it *is* the software that manages
> those.
> >>>> Yes you are right, we don't see any issues with kernel drivers(PCI EP)
> because
> >>>> MSI IOVA allocated dynamically by honouring reserved regions same as
> DMA pages.
> >>>>>
> >>>>> The MSI_IOVA_BASE region exists for VFIO, precisely because in that
> case
> >>>>> the kernel *doesn't* control the address space, but still needs some way
> >>>>> to steal a bit of it for MSIs that the guest doesn't necessarily know
> >>>>> about, and give userspace a fighting chance of knowing what it's taken.
> >>>>> I think at the time we discussed the idea of adding something to the
> >>>>> VFIO uapi such that userspace could move this around if it wanted or
> >>>>> needed to, but decided we could live without that initially. Perhaps now
> >>>>> the time has come?
> >>>> Yes, we see issues only with user-space drivers(DPDK) in which
> MSI_IOVA_BASE
> >>>> region is considered to map MSI registers. This patch helps us to fix the
> issue.
> >>>>
> >>>> Thanks,
> >>>> Srinath.
> >>>>>
> >>>>> Robin.
> >>>>>
> >>>>>> If any platform has the limitaion to access default MSI IOVA, then it can
> >>>>>> be changed using "arm-smmu.msi_iova_base=0xa0000000" command
> line argument.
> >>>>>>
> >>>>>> Signed-off-by: Srinath Mannam <srinath.mannam@broadcom.com>
> >>>>>> ---
> >>>>>>   drivers/iommu/arm-smmu.c | 5 ++++-
> >>>>>>   1 file changed, 4 insertions(+), 1 deletion(-)
> >>>>>>
> >>>>>> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> >>>>>> index 4f1a350..5e59c9d 100644
> >>>>>> --- a/drivers/iommu/arm-smmu.c
> >>>>>> +++ b/drivers/iommu/arm-smmu.c
> >>>>>> @@ -72,6 +72,9 @@ static bool disable_bypass =
> >>>>>>   module_param(disable_bypass, bool, S_IRUGO);
> >>>>>>   MODULE_PARM_DESC(disable_bypass,
> >>>>>>       "Disable bypass streams such that incoming transactions from
> devices that are not attached to an iommu domain will report an abort back to
> the device and will not be allowed to pass through the SMMU.");
> >>>>>> +static unsigned long msi_iova_base = MSI_IOVA_BASE;
> >>>>>> +module_param(msi_iova_base, ulong, S_IRUGO);
> >>>>>> +MODULE_PARM_DESC(msi_iova_base, "msi iova base address.");
> >>>>>>
> >>>>>>   struct arm_smmu_s2cr {
> >>>>>>       struct iommu_group              *group;
> >>>>>> @@ -1566,7 +1569,7 @@ static void
> arm_smmu_get_resv_regions(struct device *dev,
> >>>>>>       struct iommu_resv_region *region;
> >>>>>>       int prot = IOMMU_WRITE | IOMMU_NOEXEC |
> IOMMU_MMIO;
> >>>>>>
> >>>>>> -     region = iommu_alloc_resv_region(MSI_IOVA_BASE,
> MSI_IOVA_LENGTH,
> >>>>>> +     region = iommu_alloc_resv_region(msi_iova_base,
> MSI_IOVA_LENGTH,
> >>>>>>                                        prot,
> IOMMU_RESV_SW_MSI);
> >>>>>>       if (!region)
> >>>>>>               return;
> >>>>>>
> >>>
> >>> _______________________________________________
> >>> linux-arm-kernel mailing list
> >>> linux-arm-kernel@lists.infradead.org
> >>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> >>>
> >>
> >
> > _______________________________________________
> > linux-arm-kernel mailing list
> > linux-arm-kernel@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> >

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC PATCH] iommu/arm-smmu: Add module parameter to set msi iova address
  2020-05-28  9:15               ` Shameerali Kolothum Thodi
  (?)
  (?)
@ 2020-05-28 11:47               ` Auger Eric
  -1 siblings, 0 replies; 37+ messages in thread
From: Auger Eric @ 2020-05-28 11:47 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Shameer,

On 5/28/20 11:15 AM, Shameerali Kolothum Thodi wrote:
> 
> 
>> -----Original Message-----
>> From: Auger Eric [mailto:eric.auger@redhat.com]
>> Sent: 28 May 2020 09:54
>> To: Jean-Philippe Brucker <jean-philippe@linaro.org>
>> Cc: Will Deacon <will@kernel.org>; Joerg Roedel <joro@8bytes.org>;
>> iommu@lists.linux-foundation.org; Shameerali Kolothum Thodi
>> <shameerali.kolothum.thodi@huawei.com>; Linux Kernel Mailing List
>> <linux-kernel@vger.kernel.org>; Alex Williamson
>> <alex.williamson@redhat.com>; Srinath Mannam
>> <srinath.mannam@broadcom.com>; BCM Kernel Feedback
>> <bcm-kernel-feedback-list@broadcom.com>; Robin Murphy
>> <robin.murphy@arm.com>; Linux ARM <linux-arm-kernel@lists.infradead.org>
>> Subject: Re: [RFC PATCH] iommu/arm-smmu: Add module parameter to set msi
>> iova address
>>
>> Hi,
>>
>> On 5/28/20 10:38 AM, Jean-Philippe Brucker wrote:
>>> [+ Shameer]
>>>
>>> On Thu, May 28, 2020 at 09:43:46AM +0200, Auger Eric wrote:
>>>> Hi,
>>>>
>>>> On 5/28/20 9:23 AM, Jean-Philippe Brucker wrote:
>>>>> On Thu, May 28, 2020 at 10:45:14AM +0530, Srinath Mannam wrote:
>>>>>> On Wed, May 27, 2020 at 11:00 PM Robin Murphy
>> <robin.murphy@arm.com> wrote:
>>>>>>>
>>>>>> Thanks Robin for your quick response.
>>>>>>> On 2020-05-27 17:03, Srinath Mannam wrote:
>>>>>>>> This patch gives the provision to change default value of MSI IOVA base
>>>>>>>> to platform's suitable IOVA using module parameter. The present
>>>>>>>> hardcoded MSI IOVA base may not be the accessible IOVA ranges of
>> platform.
>>>>>>>
>>>>>>> That in itself doesn't seem entirely unreasonable; IIRC the current
>>>>>>> address is just an arbitrary choice to fit nicely into Qemu's memory
>>>>>>> map, and there was always the possibility that it wouldn't suit
>> everything.
>>>>>>>
>>>>>>>> Since commit aadad097cd46 ("iommu/dma: Reserve IOVA for PCIe
>> inaccessible
>>>>>>>> DMA address"), inaccessible IOVA address ranges parsed from
>> dma-ranges
>>>>>>>> property are reserved.
>>>>>
>>>>> I don't understand why we only reserve the PCIe windows for DMA
>> domains.
>>>>> Shouldn't VFIO also prevent userspace from mapping them?
>>>>
>>>> VFIO prevents userspace from DMA mapping iovas within reserved regions:
>>>> 9b77e5c79840  vfio/type1: check dma map request is within a valid iova
>> range
>>>
>>> Right but I was asking specifically about the IOVA reservation introduced
>>> by commit aadad097cd46. They are not registered as reserved regions within
>>> the IOMMU core, they are only taken into account by dma-iommu.c when
>>> creating a DMA domain. As VFIO uses UNMANAGED domains, it isn't aware
>> of
>>> those regions and they won't be seen by vfio_iommu_resv_exclude().
>>>
>>> It looks like the PCIe regions used to be common until cd2c9fcf5c66
>>> ("iommu/dma: Move PCI window region reservation back into dma specific
>>> path.") But I couldn't find the justification for this commit.
>>
>> Yes I noticed that as well when debugging the above mentioned case
>> before and after cd2c9fcf5c66. I do not remember about the rationale of
>> removing the DMA host brige windows from the resv regions. Did it break
>> a legacy case?
>>>
> 
> I think yes. And going through the ML discussions, this was done so because with the 
> " vfio/type1: Add support for valid iova list management" series you reported
> an issue with Seattle platform. See the full discussion here,
> 
> https://lore.kernel.org/patchwork/patch/889012/

Hey thank you for reminding me of the Seattle case :-) Now I also recall
that, if I am not wrong, this also caused some trouble on some x86
platforms as well, reported by Alex? Maybe we should still report PCI
host bridge windows in the reserved regions, if possible/feasible tag
them differently from other reserved regions and not reject any VFIO
DMA_MAP colliding with them?

Thanks

Eric
> 
> Cheers,
> Shameer
> 
>>> The thing is, if VFIO isn't aware of the reserved PCIe windows, then
>>> allowing VFIO or userspace to choose MSI_IOVA_BASE won't solve the
>> problem
>>> reported by Srinath, because they could well choose an IOVA within the
>>> PCIe window...
>> I agree with you
>>
>> Thanks
>>
>> Eric
>>>
>>> Thanks,
>>> Jean
>>>
>>>> but it does not prevent the SW MSI region chosen by the kernel from
>>>> colliding with other reserved regions (esp. PCIe host bridge windows).
>>>>
>>>>   If they were
>>>>> part of the common reserved regions then we could have VFIO choose a
>>>>> SW_MSI region among the remaining free space.
>>>> As Robin said this was the initial chosen approach
>>>> [PATCH 10/10] vfio: allow the user to register reserved iova range for
>>>> MSI mapping
>>>> https://patchwork.kernel.org/patch/8121641/
>>>>
>>>> Some additional background about why the static SW MSI region chosen by
>>>> the kernel was later chosen:
>>>> Summary of LPC guest MSI discussion in Santa Fe (was: Re: [RFC 0/8] KVM
>>>> PCIe/MSI passthrough on ARM/ARM64 (Alt II))
>>>>
>> https://lists.linuxfoundation.org/pipermail/iommu/2016-November/019060.ht
>> ml
>>>>
>>>> Thanks
>>>>
>>>> Eric
>>>>
>>>>
>>>>  It would just need a
>>>>> different way of asking the IOMMU driver if a SW_MSI is needed, for
>>>>> example with a domain attribute.
>>>>>
>>>>> Thanks,
>>>>> Jean
>>>>>
>>>>>>>
>>>>>>> That, however, doesn't seem to fit here; iommu-dma maps MSI
>> doorbells
>>>>>>> dynamically, so they aren't affected by reserved regions any more than
>>>>>>> regular DMA pages are. In fact, it explicitly ignores the software MSI
>>>>>>> region, since as the comment says, it *is* the software that manages
>> those.
>>>>>> Yes you are right, we don't see any issues with kernel drivers(PCI EP)
>> because
>>>>>> MSI IOVA allocated dynamically by honouring reserved regions same as
>> DMA pages.
>>>>>>>
>>>>>>> The MSI_IOVA_BASE region exists for VFIO, precisely because in that
>> case
>>>>>>> the kernel *doesn't* control the address space, but still needs some way
>>>>>>> to steal a bit of it for MSIs that the guest doesn't necessarily know
>>>>>>> about, and give userspace a fighting chance of knowing what it's taken.
>>>>>>> I think at the time we discussed the idea of adding something to the
>>>>>>> VFIO uapi such that userspace could move this around if it wanted or
>>>>>>> needed to, but decided we could live without that initially. Perhaps now
>>>>>>> the time has come?
>>>>>> Yes, we see issues only with user-space drivers(DPDK) in which
>> MSI_IOVA_BASE
>>>>>> region is considered to map MSI registers. This patch helps us to fix the
>> issue.
>>>>>>
>>>>>> Thanks,
>>>>>> Srinath.
>>>>>>>
>>>>>>> Robin.
>>>>>>>
>>>>>>>> If any platform has the limitaion to access default MSI IOVA, then it can
>>>>>>>> be changed using "arm-smmu.msi_iova_base=0xa0000000" command
>> line argument.
>>>>>>>>
>>>>>>>> Signed-off-by: Srinath Mannam <srinath.mannam@broadcom.com>
>>>>>>>> ---
>>>>>>>>   drivers/iommu/arm-smmu.c | 5 ++++-
>>>>>>>>   1 file changed, 4 insertions(+), 1 deletion(-)
>>>>>>>>
>>>>>>>> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
>>>>>>>> index 4f1a350..5e59c9d 100644
>>>>>>>> --- a/drivers/iommu/arm-smmu.c
>>>>>>>> +++ b/drivers/iommu/arm-smmu.c
>>>>>>>> @@ -72,6 +72,9 @@ static bool disable_bypass =
>>>>>>>>   module_param(disable_bypass, bool, S_IRUGO);
>>>>>>>>   MODULE_PARM_DESC(disable_bypass,
>>>>>>>>       "Disable bypass streams such that incoming transactions from
>> devices that are not attached to an iommu domain will report an abort back to
>> the device and will not be allowed to pass through the SMMU.");
>>>>>>>> +static unsigned long msi_iova_base = MSI_IOVA_BASE;
>>>>>>>> +module_param(msi_iova_base, ulong, S_IRUGO);
>>>>>>>> +MODULE_PARM_DESC(msi_iova_base, "msi iova base address.");
>>>>>>>>
>>>>>>>>   struct arm_smmu_s2cr {
>>>>>>>>       struct iommu_group              *group;
>>>>>>>> @@ -1566,7 +1569,7 @@ static void
>> arm_smmu_get_resv_regions(struct device *dev,
>>>>>>>>       struct iommu_resv_region *region;
>>>>>>>>       int prot = IOMMU_WRITE | IOMMU_NOEXEC |
>> IOMMU_MMIO;
>>>>>>>>
>>>>>>>> -     region = iommu_alloc_resv_region(MSI_IOVA_BASE,
>> MSI_IOVA_LENGTH,
>>>>>>>> +     region = iommu_alloc_resv_region(msi_iova_base,
>> MSI_IOVA_LENGTH,
>>>>>>>>                                        prot,
>> IOMMU_RESV_SW_MSI);
>>>>>>>>       if (!region)
>>>>>>>>               return;
>>>>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> linux-arm-kernel mailing list
>>>>> linux-arm-kernel@lists.infradead.org
>>>>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>>>>>
>>>>
>>>
>>> _______________________________________________
>>> linux-arm-kernel mailing list
>>> linux-arm-kernel@lists.infradead.org
>>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>>>
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> 


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC PATCH] iommu/arm-smmu: Add module parameter to set msi iova address
  2020-05-28  9:15               ` Shameerali Kolothum Thodi
  (?)
@ 2020-05-28 11:48                 ` Auger Eric
  -1 siblings, 0 replies; 37+ messages in thread
From: Auger Eric @ 2020-05-28 11:48 UTC (permalink / raw)
  To: Shameerali Kolothum Thodi, Jean-Philippe Brucker
  Cc: Robin Murphy, Joerg Roedel, iommu, Linux Kernel Mailing List,
	Alex Williamson, Srinath Mannam, BCM Kernel Feedback,
	Will Deacon, Linux ARM



On 5/28/20 11:15 AM, Shameerali Kolothum Thodi wrote:
> 
> 
>> -----Original Message-----
>> From: Auger Eric [mailto:eric.auger@redhat.com]
>> Sent: 28 May 2020 09:54
>> To: Jean-Philippe Brucker <jean-philippe@linaro.org>
>> Cc: Will Deacon <will@kernel.org>; Joerg Roedel <joro@8bytes.org>;
>> iommu@lists.linux-foundation.org; Shameerali Kolothum Thodi
>> <shameerali.kolothum.thodi@huawei.com>; Linux Kernel Mailing List
>> <linux-kernel@vger.kernel.org>; Alex Williamson
>> <alex.williamson@redhat.com>; Srinath Mannam
>> <srinath.mannam@broadcom.com>; BCM Kernel Feedback
>> <bcm-kernel-feedback-list@broadcom.com>; Robin Murphy
>> <robin.murphy@arm.com>; Linux ARM <linux-arm-kernel@lists.infradead.org>
>> Subject: Re: [RFC PATCH] iommu/arm-smmu: Add module parameter to set msi
>> iova address
>>
>> Hi,
>>
>> On 5/28/20 10:38 AM, Jean-Philippe Brucker wrote:
>>> [+ Shameer]
>>>
>>> On Thu, May 28, 2020 at 09:43:46AM +0200, Auger Eric wrote:
>>>> Hi,
>>>>
>>>> On 5/28/20 9:23 AM, Jean-Philippe Brucker wrote:
>>>>> On Thu, May 28, 2020 at 10:45:14AM +0530, Srinath Mannam wrote:
>>>>>> On Wed, May 27, 2020 at 11:00 PM Robin Murphy
>> <robin.murphy@arm.com> wrote:
>>>>>>>
>>>>>> Thanks Robin for your quick response.
>>>>>>> On 2020-05-27 17:03, Srinath Mannam wrote:
>>>>>>>> This patch gives the provision to change default value of MSI IOVA base
>>>>>>>> to platform's suitable IOVA using module parameter. The present
>>>>>>>> hardcoded MSI IOVA base may not be the accessible IOVA ranges of
>> platform.
>>>>>>>
>>>>>>> That in itself doesn't seem entirely unreasonable; IIRC the current
>>>>>>> address is just an arbitrary choice to fit nicely into Qemu's memory
>>>>>>> map, and there was always the possibility that it wouldn't suit
>> everything.
>>>>>>>
>>>>>>>> Since commit aadad097cd46 ("iommu/dma: Reserve IOVA for PCIe
>> inaccessible
>>>>>>>> DMA address"), inaccessible IOVA address ranges parsed from
>> dma-ranges
>>>>>>>> property are reserved.
>>>>>
>>>>> I don't understand why we only reserve the PCIe windows for DMA
>> domains.
>>>>> Shouldn't VFIO also prevent userspace from mapping them?
>>>>
>>>> VFIO prevents userspace from DMA mapping iovas within reserved regions:
>>>> 9b77e5c79840  vfio/type1: check dma map request is within a valid iova
>> range
>>>
>>> Right but I was asking specifically about the IOVA reservation introduced
>>> by commit aadad097cd46. They are not registered as reserved regions within
>>> the IOMMU core, they are only taken into account by dma-iommu.c when
>>> creating a DMA domain. As VFIO uses UNMANAGED domains, it isn't aware
>> of
>>> those regions and they won't be seen by vfio_iommu_resv_exclude().
>>>
>>> It looks like the PCIe regions used to be common until cd2c9fcf5c66
>>> ("iommu/dma: Move PCI window region reservation back into dma specific
>>> path.") But I couldn't find the justification for this commit.
>>
>> Yes I noticed that as well when debugging the above mentioned case
>> before and after cd2c9fcf5c66. I do not remember about the rationale of
>> removing the DMA host brige windows from the resv regions. Did it break
>> a legacy case?
>>>
> 
> I think yes. And going through the ML discussions, this was done so because with the 
> " vfio/type1: Add support for valid iova list management" series you reported
> an issue with Seattle platform. See the full discussion here,
> 
> https://lore.kernel.org/patchwork/patch/889012/

Hey thank you for reminding me of the Seattle case :-) Now I also recall
that, if I am not wrong, this also caused some trouble on some x86
platforms as well, reported by Alex? Maybe we should still report PCI
host bridge windows in the reserved regions, if possible/feasible tag
them differently from other reserved regions and not reject any VFIO
DMA_MAP colliding with them?

Thanks

Eric
> 
> Cheers,
> Shameer
> 
>>> The thing is, if VFIO isn't aware of the reserved PCIe windows, then
>>> allowing VFIO or userspace to choose MSI_IOVA_BASE won't solve the
>> problem
>>> reported by Srinath, because they could well choose an IOVA within the
>>> PCIe window...
>> I agree with you
>>
>> Thanks
>>
>> Eric
>>>
>>> Thanks,
>>> Jean
>>>
>>>> but it does not prevent the SW MSI region chosen by the kernel from
>>>> colliding with other reserved regions (esp. PCIe host bridge windows).
>>>>
>>>>   If they were
>>>>> part of the common reserved regions then we could have VFIO choose a
>>>>> SW_MSI region among the remaining free space.
>>>> As Robin said this was the initial chosen approach
>>>> [PATCH 10/10] vfio: allow the user to register reserved iova range for
>>>> MSI mapping
>>>> https://patchwork.kernel.org/patch/8121641/
>>>>
>>>> Some additional background about why the static SW MSI region chosen by
>>>> the kernel was later chosen:
>>>> Summary of LPC guest MSI discussion in Santa Fe (was: Re: [RFC 0/8] KVM
>>>> PCIe/MSI passthrough on ARM/ARM64 (Alt II))
>>>>
>> https://lists.linuxfoundation.org/pipermail/iommu/2016-November/019060.ht
>> ml
>>>>
>>>> Thanks
>>>>
>>>> Eric
>>>>
>>>>
>>>>  It would just need a
>>>>> different way of asking the IOMMU driver if a SW_MSI is needed, for
>>>>> example with a domain attribute.
>>>>>
>>>>> Thanks,
>>>>> Jean
>>>>>
>>>>>>>
>>>>>>> That, however, doesn't seem to fit here; iommu-dma maps MSI
>> doorbells
>>>>>>> dynamically, so they aren't affected by reserved regions any more than
>>>>>>> regular DMA pages are. In fact, it explicitly ignores the software MSI
>>>>>>> region, since as the comment says, it *is* the software that manages
>> those.
>>>>>> Yes you are right, we don't see any issues with kernel drivers(PCI EP)
>> because
>>>>>> MSI IOVA allocated dynamically by honouring reserved regions same as
>> DMA pages.
>>>>>>>
>>>>>>> The MSI_IOVA_BASE region exists for VFIO, precisely because in that
>> case
>>>>>>> the kernel *doesn't* control the address space, but still needs some way
>>>>>>> to steal a bit of it for MSIs that the guest doesn't necessarily know
>>>>>>> about, and give userspace a fighting chance of knowing what it's taken.
>>>>>>> I think at the time we discussed the idea of adding something to the
>>>>>>> VFIO uapi such that userspace could move this around if it wanted or
>>>>>>> needed to, but decided we could live without that initially. Perhaps now
>>>>>>> the time has come?
>>>>>> Yes, we see issues only with user-space drivers(DPDK) in which
>> MSI_IOVA_BASE
>>>>>> region is considered to map MSI registers. This patch helps us to fix the
>> issue.
>>>>>>
>>>>>> Thanks,
>>>>>> Srinath.
>>>>>>>
>>>>>>> Robin.
>>>>>>>
>>>>>>>> If any platform has the limitaion to access default MSI IOVA, then it can
>>>>>>>> be changed using "arm-smmu.msi_iova_base=0xa0000000" command
>> line argument.
>>>>>>>>
>>>>>>>> Signed-off-by: Srinath Mannam <srinath.mannam@broadcom.com>
>>>>>>>> ---
>>>>>>>>   drivers/iommu/arm-smmu.c | 5 ++++-
>>>>>>>>   1 file changed, 4 insertions(+), 1 deletion(-)
>>>>>>>>
>>>>>>>> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
>>>>>>>> index 4f1a350..5e59c9d 100644
>>>>>>>> --- a/drivers/iommu/arm-smmu.c
>>>>>>>> +++ b/drivers/iommu/arm-smmu.c
>>>>>>>> @@ -72,6 +72,9 @@ static bool disable_bypass =
>>>>>>>>   module_param(disable_bypass, bool, S_IRUGO);
>>>>>>>>   MODULE_PARM_DESC(disable_bypass,
>>>>>>>>       "Disable bypass streams such that incoming transactions from
>> devices that are not attached to an iommu domain will report an abort back to
>> the device and will not be allowed to pass through the SMMU.");
>>>>>>>> +static unsigned long msi_iova_base = MSI_IOVA_BASE;
>>>>>>>> +module_param(msi_iova_base, ulong, S_IRUGO);
>>>>>>>> +MODULE_PARM_DESC(msi_iova_base, "msi iova base address.");
>>>>>>>>
>>>>>>>>   struct arm_smmu_s2cr {
>>>>>>>>       struct iommu_group              *group;
>>>>>>>> @@ -1566,7 +1569,7 @@ static void
>> arm_smmu_get_resv_regions(struct device *dev,
>>>>>>>>       struct iommu_resv_region *region;
>>>>>>>>       int prot = IOMMU_WRITE | IOMMU_NOEXEC |
>> IOMMU_MMIO;
>>>>>>>>
>>>>>>>> -     region = iommu_alloc_resv_region(MSI_IOVA_BASE,
>> MSI_IOVA_LENGTH,
>>>>>>>> +     region = iommu_alloc_resv_region(msi_iova_base,
>> MSI_IOVA_LENGTH,
>>>>>>>>                                        prot,
>> IOMMU_RESV_SW_MSI);
>>>>>>>>       if (!region)
>>>>>>>>               return;
>>>>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> linux-arm-kernel mailing list
>>>>> linux-arm-kernel@lists.infradead.org
>>>>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>>>>>
>>>>
>>>
>>> _______________________________________________
>>> linux-arm-kernel mailing list
>>> linux-arm-kernel@lists.infradead.org
>>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>>>
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> 


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC PATCH] iommu/arm-smmu: Add module parameter to set msi iova address
@ 2020-05-28 11:48                 ` Auger Eric
  0 siblings, 0 replies; 37+ messages in thread
From: Auger Eric @ 2020-05-28 11:48 UTC (permalink / raw)
  To: Shameerali Kolothum Thodi, Jean-Philippe Brucker
  Cc: Will Deacon, iommu, Linux Kernel Mailing List, Alex Williamson,
	Srinath Mannam, BCM Kernel Feedback, Robin Murphy, Linux ARM



On 5/28/20 11:15 AM, Shameerali Kolothum Thodi wrote:
> 
> 
>> -----Original Message-----
>> From: Auger Eric [mailto:eric.auger@redhat.com]
>> Sent: 28 May 2020 09:54
>> To: Jean-Philippe Brucker <jean-philippe@linaro.org>
>> Cc: Will Deacon <will@kernel.org>; Joerg Roedel <joro@8bytes.org>;
>> iommu@lists.linux-foundation.org; Shameerali Kolothum Thodi
>> <shameerali.kolothum.thodi@huawei.com>; Linux Kernel Mailing List
>> <linux-kernel@vger.kernel.org>; Alex Williamson
>> <alex.williamson@redhat.com>; Srinath Mannam
>> <srinath.mannam@broadcom.com>; BCM Kernel Feedback
>> <bcm-kernel-feedback-list@broadcom.com>; Robin Murphy
>> <robin.murphy@arm.com>; Linux ARM <linux-arm-kernel@lists.infradead.org>
>> Subject: Re: [RFC PATCH] iommu/arm-smmu: Add module parameter to set msi
>> iova address
>>
>> Hi,
>>
>> On 5/28/20 10:38 AM, Jean-Philippe Brucker wrote:
>>> [+ Shameer]
>>>
>>> On Thu, May 28, 2020 at 09:43:46AM +0200, Auger Eric wrote:
>>>> Hi,
>>>>
>>>> On 5/28/20 9:23 AM, Jean-Philippe Brucker wrote:
>>>>> On Thu, May 28, 2020 at 10:45:14AM +0530, Srinath Mannam wrote:
>>>>>> On Wed, May 27, 2020 at 11:00 PM Robin Murphy
>> <robin.murphy@arm.com> wrote:
>>>>>>>
>>>>>> Thanks Robin for your quick response.
>>>>>>> On 2020-05-27 17:03, Srinath Mannam wrote:
>>>>>>>> This patch gives the provision to change default value of MSI IOVA base
>>>>>>>> to platform's suitable IOVA using module parameter. The present
>>>>>>>> hardcoded MSI IOVA base may not be the accessible IOVA ranges of
>> platform.
>>>>>>>
>>>>>>> That in itself doesn't seem entirely unreasonable; IIRC the current
>>>>>>> address is just an arbitrary choice to fit nicely into Qemu's memory
>>>>>>> map, and there was always the possibility that it wouldn't suit
>> everything.
>>>>>>>
>>>>>>>> Since commit aadad097cd46 ("iommu/dma: Reserve IOVA for PCIe
>> inaccessible
>>>>>>>> DMA address"), inaccessible IOVA address ranges parsed from
>> dma-ranges
>>>>>>>> property are reserved.
>>>>>
>>>>> I don't understand why we only reserve the PCIe windows for DMA
>> domains.
>>>>> Shouldn't VFIO also prevent userspace from mapping them?
>>>>
>>>> VFIO prevents userspace from DMA mapping iovas within reserved regions:
>>>> 9b77e5c79840  vfio/type1: check dma map request is within a valid iova
>> range
>>>
>>> Right but I was asking specifically about the IOVA reservation introduced
>>> by commit aadad097cd46. They are not registered as reserved regions within
>>> the IOMMU core, they are only taken into account by dma-iommu.c when
>>> creating a DMA domain. As VFIO uses UNMANAGED domains, it isn't aware
>> of
>>> those regions and they won't be seen by vfio_iommu_resv_exclude().
>>>
>>> It looks like the PCIe regions used to be common until cd2c9fcf5c66
>>> ("iommu/dma: Move PCI window region reservation back into dma specific
>>> path.") But I couldn't find the justification for this commit.
>>
>> Yes I noticed that as well when debugging the above mentioned case
>> before and after cd2c9fcf5c66. I do not remember about the rationale of
>> removing the DMA host brige windows from the resv regions. Did it break
>> a legacy case?
>>>
> 
> I think yes. And going through the ML discussions, this was done so because with the 
> " vfio/type1: Add support for valid iova list management" series you reported
> an issue with Seattle platform. See the full discussion here,
> 
> https://lore.kernel.org/patchwork/patch/889012/

Hey thank you for reminding me of the Seattle case :-) Now I also recall
that, if I am not wrong, this also caused some trouble on some x86
platforms as well, reported by Alex? Maybe we should still report PCI
host bridge windows in the reserved regions, if possible/feasible tag
them differently from other reserved regions and not reject any VFIO
DMA_MAP colliding with them?

Thanks

Eric
> 
> Cheers,
> Shameer
> 
>>> The thing is, if VFIO isn't aware of the reserved PCIe windows, then
>>> allowing VFIO or userspace to choose MSI_IOVA_BASE won't solve the
>> problem
>>> reported by Srinath, because they could well choose an IOVA within the
>>> PCIe window...
>> I agree with you
>>
>> Thanks
>>
>> Eric
>>>
>>> Thanks,
>>> Jean
>>>
>>>> but it does not prevent the SW MSI region chosen by the kernel from
>>>> colliding with other reserved regions (esp. PCIe host bridge windows).
>>>>
>>>>   If they were
>>>>> part of the common reserved regions then we could have VFIO choose a
>>>>> SW_MSI region among the remaining free space.
>>>> As Robin said this was the initial chosen approach
>>>> [PATCH 10/10] vfio: allow the user to register reserved iova range for
>>>> MSI mapping
>>>> https://patchwork.kernel.org/patch/8121641/
>>>>
>>>> Some additional background about why the static SW MSI region chosen by
>>>> the kernel was later chosen:
>>>> Summary of LPC guest MSI discussion in Santa Fe (was: Re: [RFC 0/8] KVM
>>>> PCIe/MSI passthrough on ARM/ARM64 (Alt II))
>>>>
>> https://lists.linuxfoundation.org/pipermail/iommu/2016-November/019060.ht
>> ml
>>>>
>>>> Thanks
>>>>
>>>> Eric
>>>>
>>>>
>>>>  It would just need a
>>>>> different way of asking the IOMMU driver if a SW_MSI is needed, for
>>>>> example with a domain attribute.
>>>>>
>>>>> Thanks,
>>>>> Jean
>>>>>
>>>>>>>
>>>>>>> That, however, doesn't seem to fit here; iommu-dma maps MSI
>> doorbells
>>>>>>> dynamically, so they aren't affected by reserved regions any more than
>>>>>>> regular DMA pages are. In fact, it explicitly ignores the software MSI
>>>>>>> region, since as the comment says, it *is* the software that manages
>> those.
>>>>>> Yes you are right, we don't see any issues with kernel drivers(PCI EP)
>> because
>>>>>> MSI IOVA allocated dynamically by honouring reserved regions same as
>> DMA pages.
>>>>>>>
>>>>>>> The MSI_IOVA_BASE region exists for VFIO, precisely because in that
>> case
>>>>>>> the kernel *doesn't* control the address space, but still needs some way
>>>>>>> to steal a bit of it for MSIs that the guest doesn't necessarily know
>>>>>>> about, and give userspace a fighting chance of knowing what it's taken.
>>>>>>> I think at the time we discussed the idea of adding something to the
>>>>>>> VFIO uapi such that userspace could move this around if it wanted or
>>>>>>> needed to, but decided we could live without that initially. Perhaps now
>>>>>>> the time has come?
>>>>>> Yes, we see issues only with user-space drivers(DPDK) in which
>> MSI_IOVA_BASE
>>>>>> region is considered to map MSI registers. This patch helps us to fix the
>> issue.
>>>>>>
>>>>>> Thanks,
>>>>>> Srinath.
>>>>>>>
>>>>>>> Robin.
>>>>>>>
>>>>>>>> If any platform has the limitaion to access default MSI IOVA, then it can
>>>>>>>> be changed using "arm-smmu.msi_iova_base=0xa0000000" command
>> line argument.
>>>>>>>>
>>>>>>>> Signed-off-by: Srinath Mannam <srinath.mannam@broadcom.com>
>>>>>>>> ---
>>>>>>>>   drivers/iommu/arm-smmu.c | 5 ++++-
>>>>>>>>   1 file changed, 4 insertions(+), 1 deletion(-)
>>>>>>>>
>>>>>>>> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
>>>>>>>> index 4f1a350..5e59c9d 100644
>>>>>>>> --- a/drivers/iommu/arm-smmu.c
>>>>>>>> +++ b/drivers/iommu/arm-smmu.c
>>>>>>>> @@ -72,6 +72,9 @@ static bool disable_bypass =
>>>>>>>>   module_param(disable_bypass, bool, S_IRUGO);
>>>>>>>>   MODULE_PARM_DESC(disable_bypass,
>>>>>>>>       "Disable bypass streams such that incoming transactions from
>> devices that are not attached to an iommu domain will report an abort back to
>> the device and will not be allowed to pass through the SMMU.");
>>>>>>>> +static unsigned long msi_iova_base = MSI_IOVA_BASE;
>>>>>>>> +module_param(msi_iova_base, ulong, S_IRUGO);
>>>>>>>> +MODULE_PARM_DESC(msi_iova_base, "msi iova base address.");
>>>>>>>>
>>>>>>>>   struct arm_smmu_s2cr {
>>>>>>>>       struct iommu_group              *group;
>>>>>>>> @@ -1566,7 +1569,7 @@ static void
>> arm_smmu_get_resv_regions(struct device *dev,
>>>>>>>>       struct iommu_resv_region *region;
>>>>>>>>       int prot = IOMMU_WRITE | IOMMU_NOEXEC |
>> IOMMU_MMIO;
>>>>>>>>
>>>>>>>> -     region = iommu_alloc_resv_region(MSI_IOVA_BASE,
>> MSI_IOVA_LENGTH,
>>>>>>>> +     region = iommu_alloc_resv_region(msi_iova_base,
>> MSI_IOVA_LENGTH,
>>>>>>>>                                        prot,
>> IOMMU_RESV_SW_MSI);
>>>>>>>>       if (!region)
>>>>>>>>               return;
>>>>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> linux-arm-kernel mailing list
>>>>> linux-arm-kernel@lists.infradead.org
>>>>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>>>>>
>>>>
>>>
>>> _______________________________________________
>>> linux-arm-kernel mailing list
>>> linux-arm-kernel@lists.infradead.org
>>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>>>
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> 

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC PATCH] iommu/arm-smmu: Add module parameter to set msi iova address
@ 2020-05-28 11:48                 ` Auger Eric
  0 siblings, 0 replies; 37+ messages in thread
From: Auger Eric @ 2020-05-28 11:48 UTC (permalink / raw)
  To: Shameerali Kolothum Thodi, Jean-Philippe Brucker
  Cc: Will Deacon, Joerg Roedel, iommu, Linux Kernel Mailing List,
	Alex Williamson, Srinath Mannam, BCM Kernel Feedback,
	Robin Murphy, Linux ARM



On 5/28/20 11:15 AM, Shameerali Kolothum Thodi wrote:
> 
> 
>> -----Original Message-----
>> From: Auger Eric [mailto:eric.auger@redhat.com]
>> Sent: 28 May 2020 09:54
>> To: Jean-Philippe Brucker <jean-philippe@linaro.org>
>> Cc: Will Deacon <will@kernel.org>; Joerg Roedel <joro@8bytes.org>;
>> iommu@lists.linux-foundation.org; Shameerali Kolothum Thodi
>> <shameerali.kolothum.thodi@huawei.com>; Linux Kernel Mailing List
>> <linux-kernel@vger.kernel.org>; Alex Williamson
>> <alex.williamson@redhat.com>; Srinath Mannam
>> <srinath.mannam@broadcom.com>; BCM Kernel Feedback
>> <bcm-kernel-feedback-list@broadcom.com>; Robin Murphy
>> <robin.murphy@arm.com>; Linux ARM <linux-arm-kernel@lists.infradead.org>
>> Subject: Re: [RFC PATCH] iommu/arm-smmu: Add module parameter to set msi
>> iova address
>>
>> Hi,
>>
>> On 5/28/20 10:38 AM, Jean-Philippe Brucker wrote:
>>> [+ Shameer]
>>>
>>> On Thu, May 28, 2020 at 09:43:46AM +0200, Auger Eric wrote:
>>>> Hi,
>>>>
>>>> On 5/28/20 9:23 AM, Jean-Philippe Brucker wrote:
>>>>> On Thu, May 28, 2020 at 10:45:14AM +0530, Srinath Mannam wrote:
>>>>>> On Wed, May 27, 2020 at 11:00 PM Robin Murphy
>> <robin.murphy@arm.com> wrote:
>>>>>>>
>>>>>> Thanks Robin for your quick response.
>>>>>>> On 2020-05-27 17:03, Srinath Mannam wrote:
>>>>>>>> This patch gives the provision to change default value of MSI IOVA base
>>>>>>>> to platform's suitable IOVA using module parameter. The present
>>>>>>>> hardcoded MSI IOVA base may not be the accessible IOVA ranges of
>> platform.
>>>>>>>
>>>>>>> That in itself doesn't seem entirely unreasonable; IIRC the current
>>>>>>> address is just an arbitrary choice to fit nicely into Qemu's memory
>>>>>>> map, and there was always the possibility that it wouldn't suit
>> everything.
>>>>>>>
>>>>>>>> Since commit aadad097cd46 ("iommu/dma: Reserve IOVA for PCIe
>> inaccessible
>>>>>>>> DMA address"), inaccessible IOVA address ranges parsed from
>> dma-ranges
>>>>>>>> property are reserved.
>>>>>
>>>>> I don't understand why we only reserve the PCIe windows for DMA
>> domains.
>>>>> Shouldn't VFIO also prevent userspace from mapping them?
>>>>
>>>> VFIO prevents userspace from DMA mapping iovas within reserved regions:
>>>> 9b77e5c79840  vfio/type1: check dma map request is within a valid iova
>> range
>>>
>>> Right but I was asking specifically about the IOVA reservation introduced
>>> by commit aadad097cd46. They are not registered as reserved regions within
>>> the IOMMU core, they are only taken into account by dma-iommu.c when
>>> creating a DMA domain. As VFIO uses UNMANAGED domains, it isn't aware
>> of
>>> those regions and they won't be seen by vfio_iommu_resv_exclude().
>>>
>>> It looks like the PCIe regions used to be common until cd2c9fcf5c66
>>> ("iommu/dma: Move PCI window region reservation back into dma specific
>>> path.") But I couldn't find the justification for this commit.
>>
>> Yes I noticed that as well when debugging the above mentioned case
>> before and after cd2c9fcf5c66. I do not remember about the rationale of
>> removing the DMA host brige windows from the resv regions. Did it break
>> a legacy case?
>>>
> 
> I think yes. And going through the ML discussions, this was done so because with the 
> " vfio/type1: Add support for valid iova list management" series you reported
> an issue with Seattle platform. See the full discussion here,
> 
> https://lore.kernel.org/patchwork/patch/889012/

Hey thank you for reminding me of the Seattle case :-) Now I also recall
that, if I am not wrong, this also caused some trouble on some x86
platforms as well, reported by Alex? Maybe we should still report PCI
host bridge windows in the reserved regions, if possible/feasible tag
them differently from other reserved regions and not reject any VFIO
DMA_MAP colliding with them?

Thanks

Eric
> 
> Cheers,
> Shameer
> 
>>> The thing is, if VFIO isn't aware of the reserved PCIe windows, then
>>> allowing VFIO or userspace to choose MSI_IOVA_BASE won't solve the
>> problem
>>> reported by Srinath, because they could well choose an IOVA within the
>>> PCIe window...
>> I agree with you
>>
>> Thanks
>>
>> Eric
>>>
>>> Thanks,
>>> Jean
>>>
>>>> but it does not prevent the SW MSI region chosen by the kernel from
>>>> colliding with other reserved regions (esp. PCIe host bridge windows).
>>>>
>>>>   If they were
>>>>> part of the common reserved regions then we could have VFIO choose a
>>>>> SW_MSI region among the remaining free space.
>>>> As Robin said this was the initial chosen approach
>>>> [PATCH 10/10] vfio: allow the user to register reserved iova range for
>>>> MSI mapping
>>>> https://patchwork.kernel.org/patch/8121641/
>>>>
>>>> Some additional background about why the static SW MSI region chosen by
>>>> the kernel was later chosen:
>>>> Summary of LPC guest MSI discussion in Santa Fe (was: Re: [RFC 0/8] KVM
>>>> PCIe/MSI passthrough on ARM/ARM64 (Alt II))
>>>>
>> https://lists.linuxfoundation.org/pipermail/iommu/2016-November/019060.ht
>> ml
>>>>
>>>> Thanks
>>>>
>>>> Eric
>>>>
>>>>
>>>>  It would just need a
>>>>> different way of asking the IOMMU driver if a SW_MSI is needed, for
>>>>> example with a domain attribute.
>>>>>
>>>>> Thanks,
>>>>> Jean
>>>>>
>>>>>>>
>>>>>>> That, however, doesn't seem to fit here; iommu-dma maps MSI
>> doorbells
>>>>>>> dynamically, so they aren't affected by reserved regions any more than
>>>>>>> regular DMA pages are. In fact, it explicitly ignores the software MSI
>>>>>>> region, since as the comment says, it *is* the software that manages
>> those.
>>>>>> Yes you are right, we don't see any issues with kernel drivers(PCI EP)
>> because
>>>>>> MSI IOVA allocated dynamically by honouring reserved regions same as
>> DMA pages.
>>>>>>>
>>>>>>> The MSI_IOVA_BASE region exists for VFIO, precisely because in that
>> case
>>>>>>> the kernel *doesn't* control the address space, but still needs some way
>>>>>>> to steal a bit of it for MSIs that the guest doesn't necessarily know
>>>>>>> about, and give userspace a fighting chance of knowing what it's taken.
>>>>>>> I think at the time we discussed the idea of adding something to the
>>>>>>> VFIO uapi such that userspace could move this around if it wanted or
>>>>>>> needed to, but decided we could live without that initially. Perhaps now
>>>>>>> the time has come?
>>>>>> Yes, we see issues only with user-space drivers(DPDK) in which
>> MSI_IOVA_BASE
>>>>>> region is considered to map MSI registers. This patch helps us to fix the
>> issue.
>>>>>>
>>>>>> Thanks,
>>>>>> Srinath.
>>>>>>>
>>>>>>> Robin.
>>>>>>>
>>>>>>>> If any platform has the limitaion to access default MSI IOVA, then it can
>>>>>>>> be changed using "arm-smmu.msi_iova_base=0xa0000000" command
>> line argument.
>>>>>>>>
>>>>>>>> Signed-off-by: Srinath Mannam <srinath.mannam@broadcom.com>
>>>>>>>> ---
>>>>>>>>   drivers/iommu/arm-smmu.c | 5 ++++-
>>>>>>>>   1 file changed, 4 insertions(+), 1 deletion(-)
>>>>>>>>
>>>>>>>> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
>>>>>>>> index 4f1a350..5e59c9d 100644
>>>>>>>> --- a/drivers/iommu/arm-smmu.c
>>>>>>>> +++ b/drivers/iommu/arm-smmu.c
>>>>>>>> @@ -72,6 +72,9 @@ static bool disable_bypass =
>>>>>>>>   module_param(disable_bypass, bool, S_IRUGO);
>>>>>>>>   MODULE_PARM_DESC(disable_bypass,
>>>>>>>>       "Disable bypass streams such that incoming transactions from
>> devices that are not attached to an iommu domain will report an abort back to
>> the device and will not be allowed to pass through the SMMU.");
>>>>>>>> +static unsigned long msi_iova_base = MSI_IOVA_BASE;
>>>>>>>> +module_param(msi_iova_base, ulong, S_IRUGO);
>>>>>>>> +MODULE_PARM_DESC(msi_iova_base, "msi iova base address.");
>>>>>>>>
>>>>>>>>   struct arm_smmu_s2cr {
>>>>>>>>       struct iommu_group              *group;
>>>>>>>> @@ -1566,7 +1569,7 @@ static void
>> arm_smmu_get_resv_regions(struct device *dev,
>>>>>>>>       struct iommu_resv_region *region;
>>>>>>>>       int prot = IOMMU_WRITE | IOMMU_NOEXEC |
>> IOMMU_MMIO;
>>>>>>>>
>>>>>>>> -     region = iommu_alloc_resv_region(MSI_IOVA_BASE,
>> MSI_IOVA_LENGTH,
>>>>>>>> +     region = iommu_alloc_resv_region(msi_iova_base,
>> MSI_IOVA_LENGTH,
>>>>>>>>                                        prot,
>> IOMMU_RESV_SW_MSI);
>>>>>>>>       if (!region)
>>>>>>>>               return;
>>>>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> linux-arm-kernel mailing list
>>>>> linux-arm-kernel@lists.infradead.org
>>>>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>>>>>
>>>>
>>>
>>> _______________________________________________
>>> linux-arm-kernel mailing list
>>> linux-arm-kernel@lists.infradead.org
>>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>>>
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> 


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* RE: [RFC PATCH] iommu/arm-smmu: Add module parameter to set msi iova address
  2020-05-28 11:48                 ` Auger Eric
  (?)
@ 2020-05-28 12:09                   ` Shameerali Kolothum Thodi
  -1 siblings, 0 replies; 37+ messages in thread
From: Shameerali Kolothum Thodi @ 2020-05-28 12:09 UTC (permalink / raw)
  To: Auger Eric, Jean-Philippe Brucker
  Cc: Robin Murphy, Joerg Roedel, iommu, Linux Kernel Mailing List,
	Alex Williamson, Srinath Mannam, BCM Kernel Feedback,
	Will Deacon, Linux ARM



> -----Original Message-----
> From: Auger Eric [mailto:eric.auger@redhat.com]
> Sent: 28 May 2020 12:48
> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>;
> Jean-Philippe Brucker <jean-philippe@linaro.org>
> Cc: Robin Murphy <robin.murphy@arm.com>; Joerg Roedel
> <joro@8bytes.org>; iommu@lists.linux-foundation.org; Linux Kernel Mailing
> List <linux-kernel@vger.kernel.org>; Alex Williamson
> <alex.williamson@redhat.com>; Srinath Mannam
> <srinath.mannam@broadcom.com>; BCM Kernel Feedback
> <bcm-kernel-feedback-list@broadcom.com>; Will Deacon <will@kernel.org>;
> Linux ARM <linux-arm-kernel@lists.infradead.org>
> Subject: Re: [RFC PATCH] iommu/arm-smmu: Add module parameter to set msi
> iova address
> 
> 
> 
> On 5/28/20 11:15 AM, Shameerali Kolothum Thodi wrote:
> >
> >
> >> -----Original Message-----
> >> From: Auger Eric [mailto:eric.auger@redhat.com]
> >> Sent: 28 May 2020 09:54
> >> To: Jean-Philippe Brucker <jean-philippe@linaro.org>
> >> Cc: Will Deacon <will@kernel.org>; Joerg Roedel <joro@8bytes.org>;
> >> iommu@lists.linux-foundation.org; Shameerali Kolothum Thodi
> >> <shameerali.kolothum.thodi@huawei.com>; Linux Kernel Mailing List
> >> <linux-kernel@vger.kernel.org>; Alex Williamson
> >> <alex.williamson@redhat.com>; Srinath Mannam
> >> <srinath.mannam@broadcom.com>; BCM Kernel Feedback
> >> <bcm-kernel-feedback-list@broadcom.com>; Robin Murphy
> >> <robin.murphy@arm.com>; Linux ARM
> <linux-arm-kernel@lists.infradead.org>
> >> Subject: Re: [RFC PATCH] iommu/arm-smmu: Add module parameter to set
> msi
> >> iova address
> >>
> >> Hi,
> >>
> >> On 5/28/20 10:38 AM, Jean-Philippe Brucker wrote:
> >>> [+ Shameer]
> >>>
> >>> On Thu, May 28, 2020 at 09:43:46AM +0200, Auger Eric wrote:
> >>>> Hi,
> >>>>
> >>>> On 5/28/20 9:23 AM, Jean-Philippe Brucker wrote:
> >>>>> On Thu, May 28, 2020 at 10:45:14AM +0530, Srinath Mannam wrote:
> >>>>>> On Wed, May 27, 2020 at 11:00 PM Robin Murphy
> >> <robin.murphy@arm.com> wrote:
> >>>>>>>
> >>>>>> Thanks Robin for your quick response.
> >>>>>>> On 2020-05-27 17:03, Srinath Mannam wrote:
> >>>>>>>> This patch gives the provision to change default value of MSI IOVA
> base
> >>>>>>>> to platform's suitable IOVA using module parameter. The present
> >>>>>>>> hardcoded MSI IOVA base may not be the accessible IOVA ranges of
> >> platform.
> >>>>>>>
> >>>>>>> That in itself doesn't seem entirely unreasonable; IIRC the current
> >>>>>>> address is just an arbitrary choice to fit nicely into Qemu's memory
> >>>>>>> map, and there was always the possibility that it wouldn't suit
> >> everything.
> >>>>>>>
> >>>>>>>> Since commit aadad097cd46 ("iommu/dma: Reserve IOVA for PCIe
> >> inaccessible
> >>>>>>>> DMA address"), inaccessible IOVA address ranges parsed from
> >> dma-ranges
> >>>>>>>> property are reserved.
> >>>>>
> >>>>> I don't understand why we only reserve the PCIe windows for DMA
> >> domains.
> >>>>> Shouldn't VFIO also prevent userspace from mapping them?
> >>>>
> >>>> VFIO prevents userspace from DMA mapping iovas within reserved
> regions:
> >>>> 9b77e5c79840  vfio/type1: check dma map request is within a valid iova
> >> range
> >>>
> >>> Right but I was asking specifically about the IOVA reservation introduced
> >>> by commit aadad097cd46. They are not registered as reserved regions
> within
> >>> the IOMMU core, they are only taken into account by dma-iommu.c when
> >>> creating a DMA domain. As VFIO uses UNMANAGED domains, it isn't
> aware
> >> of
> >>> those regions and they won't be seen by vfio_iommu_resv_exclude().
> >>>
> >>> It looks like the PCIe regions used to be common until cd2c9fcf5c66
> >>> ("iommu/dma: Move PCI window region reservation back into dma specific
> >>> path.") But I couldn't find the justification for this commit.
> >>
> >> Yes I noticed that as well when debugging the above mentioned case
> >> before and after cd2c9fcf5c66. I do not remember about the rationale of
> >> removing the DMA host brige windows from the resv regions. Did it break
> >> a legacy case?
> >>>
> >
> > I think yes. And going through the ML discussions, this was done so because
> with the
> > " vfio/type1: Add support for valid iova list management" series you reported
> > an issue with Seattle platform. See the full discussion here,
> >
> > https://lore.kernel.org/patchwork/patch/889012/
> 
> Hey thank you for reminding me of the Seattle case :-) Now I also recall
> that, if I am not wrong, this also caused some trouble on some x86
> platforms as well, reported by Alex? 

True, Alex reported that VT-d RMRR ranges were causing issues[1] as well.
And then you came with IOMMU_RESV_DIRECT_RELAXABLE regions
to exclude those[2]

Maybe we should still report PCI
> host bridge windows in the reserved regions, if possible/feasible tag
> them differently from other reserved regions and not reject any VFIO
> DMA_MAP colliding with them?

I guess that is possible. But current interface is to report the regions that are safe
from a IOMMU transaction point of view and I am not sure PCI window regions 
comes under that.

Thanks,
Shameer

1. https://lkml.org/lkml/2018/6/5/760
2. https://lore.kernel.org/patchwork/cover/1083072/

> Thanks
> 
> Eric
> >
> > Cheers,
> > Shameer
> >
> >>> The thing is, if VFIO isn't aware of the reserved PCIe windows, then
> >>> allowing VFIO or userspace to choose MSI_IOVA_BASE won't solve the
> >> problem
> >>> reported by Srinath, because they could well choose an IOVA within the
> >>> PCIe window...
> >> I agree with you
> >>
> >> Thanks
> >>
> >> Eric
> >>>
> >>> Thanks,
> >>> Jean
> >>>
> >>>> but it does not prevent the SW MSI region chosen by the kernel from
> >>>> colliding with other reserved regions (esp. PCIe host bridge windows).
> >>>>
> >>>>   If they were
> >>>>> part of the common reserved regions then we could have VFIO choose a
> >>>>> SW_MSI region among the remaining free space.
> >>>> As Robin said this was the initial chosen approach
> >>>> [PATCH 10/10] vfio: allow the user to register reserved iova range for
> >>>> MSI mapping
> >>>> https://patchwork.kernel.org/patch/8121641/
> >>>>
> >>>> Some additional background about why the static SW MSI region chosen
> by
> >>>> the kernel was later chosen:
> >>>> Summary of LPC guest MSI discussion in Santa Fe (was: Re: [RFC 0/8] KVM
> >>>> PCIe/MSI passthrough on ARM/ARM64 (Alt II))
> >>>>
> >>
> https://lists.linuxfoundation.org/pipermail/iommu/2016-November/019060.ht
> >> ml
> >>>>
> >>>> Thanks
> >>>>
> >>>> Eric
> >>>>
> >>>>
> >>>>  It would just need a
> >>>>> different way of asking the IOMMU driver if a SW_MSI is needed, for
> >>>>> example with a domain attribute.
> >>>>>
> >>>>> Thanks,
> >>>>> Jean
> >>>>>
> >>>>>>>
> >>>>>>> That, however, doesn't seem to fit here; iommu-dma maps MSI
> >> doorbells
> >>>>>>> dynamically, so they aren't affected by reserved regions any more
> than
> >>>>>>> regular DMA pages are. In fact, it explicitly ignores the software MSI
> >>>>>>> region, since as the comment says, it *is* the software that manages
> >> those.
> >>>>>> Yes you are right, we don't see any issues with kernel drivers(PCI EP)
> >> because
> >>>>>> MSI IOVA allocated dynamically by honouring reserved regions same as
> >> DMA pages.
> >>>>>>>
> >>>>>>> The MSI_IOVA_BASE region exists for VFIO, precisely because in that
> >> case
> >>>>>>> the kernel *doesn't* control the address space, but still needs some
> way
> >>>>>>> to steal a bit of it for MSIs that the guest doesn't necessarily know
> >>>>>>> about, and give userspace a fighting chance of knowing what it's
> taken.
> >>>>>>> I think at the time we discussed the idea of adding something to the
> >>>>>>> VFIO uapi such that userspace could move this around if it wanted or
> >>>>>>> needed to, but decided we could live without that initially. Perhaps
> now
> >>>>>>> the time has come?
> >>>>>> Yes, we see issues only with user-space drivers(DPDK) in which
> >> MSI_IOVA_BASE
> >>>>>> region is considered to map MSI registers. This patch helps us to fix the
> >> issue.
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Srinath.
> >>>>>>>
> >>>>>>> Robin.
> >>>>>>>
> >>>>>>>> If any platform has the limitaion to access default MSI IOVA, then it
> can
> >>>>>>>> be changed using "arm-smmu.msi_iova_base=0xa0000000"
> command
> >> line argument.
> >>>>>>>>
> >>>>>>>> Signed-off-by: Srinath Mannam <srinath.mannam@broadcom.com>
> >>>>>>>> ---
> >>>>>>>>   drivers/iommu/arm-smmu.c | 5 ++++-
> >>>>>>>>   1 file changed, 4 insertions(+), 1 deletion(-)
> >>>>>>>>
> >>>>>>>> diff --git a/drivers/iommu/arm-smmu.c
> b/drivers/iommu/arm-smmu.c
> >>>>>>>> index 4f1a350..5e59c9d 100644
> >>>>>>>> --- a/drivers/iommu/arm-smmu.c
> >>>>>>>> +++ b/drivers/iommu/arm-smmu.c
> >>>>>>>> @@ -72,6 +72,9 @@ static bool disable_bypass =
> >>>>>>>>   module_param(disable_bypass, bool, S_IRUGO);
> >>>>>>>>   MODULE_PARM_DESC(disable_bypass,
> >>>>>>>>       "Disable bypass streams such that incoming transactions
> from
> >> devices that are not attached to an iommu domain will report an abort back
> to
> >> the device and will not be allowed to pass through the SMMU.");
> >>>>>>>> +static unsigned long msi_iova_base = MSI_IOVA_BASE;
> >>>>>>>> +module_param(msi_iova_base, ulong, S_IRUGO);
> >>>>>>>> +MODULE_PARM_DESC(msi_iova_base, "msi iova base address.");
> >>>>>>>>
> >>>>>>>>   struct arm_smmu_s2cr {
> >>>>>>>>       struct iommu_group              *group;
> >>>>>>>> @@ -1566,7 +1569,7 @@ static void
> >> arm_smmu_get_resv_regions(struct device *dev,
> >>>>>>>>       struct iommu_resv_region *region;
> >>>>>>>>       int prot = IOMMU_WRITE | IOMMU_NOEXEC |
> >> IOMMU_MMIO;
> >>>>>>>>
> >>>>>>>> -     region = iommu_alloc_resv_region(MSI_IOVA_BASE,
> >> MSI_IOVA_LENGTH,
> >>>>>>>> +     region = iommu_alloc_resv_region(msi_iova_base,
> >> MSI_IOVA_LENGTH,
> >>>>>>>>                                        prot,
> >> IOMMU_RESV_SW_MSI);
> >>>>>>>>       if (!region)
> >>>>>>>>               return;
> >>>>>>>>
> >>>>>
> >>>>> _______________________________________________
> >>>>> linux-arm-kernel mailing list
> >>>>> linux-arm-kernel@lists.infradead.org
> >>>>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> >>>>>
> >>>>
> >>>
> >>> _______________________________________________
> >>> linux-arm-kernel mailing list
> >>> linux-arm-kernel@lists.infradead.org
> >>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> >>>
> >
> > _______________________________________________
> > linux-arm-kernel mailing list
> > linux-arm-kernel@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> >


^ permalink raw reply	[flat|nested] 37+ messages in thread

* RE: [RFC PATCH] iommu/arm-smmu: Add module parameter to set msi iova address
@ 2020-05-28 12:09                   ` Shameerali Kolothum Thodi
  0 siblings, 0 replies; 37+ messages in thread
From: Shameerali Kolothum Thodi @ 2020-05-28 12:09 UTC (permalink / raw)
  To: Auger Eric, Jean-Philippe Brucker
  Cc: Will Deacon, iommu, Linux Kernel Mailing List, Alex Williamson,
	Srinath Mannam, BCM Kernel Feedback, Robin Murphy, Linux ARM



> -----Original Message-----
> From: Auger Eric [mailto:eric.auger@redhat.com]
> Sent: 28 May 2020 12:48
> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>;
> Jean-Philippe Brucker <jean-philippe@linaro.org>
> Cc: Robin Murphy <robin.murphy@arm.com>; Joerg Roedel
> <joro@8bytes.org>; iommu@lists.linux-foundation.org; Linux Kernel Mailing
> List <linux-kernel@vger.kernel.org>; Alex Williamson
> <alex.williamson@redhat.com>; Srinath Mannam
> <srinath.mannam@broadcom.com>; BCM Kernel Feedback
> <bcm-kernel-feedback-list@broadcom.com>; Will Deacon <will@kernel.org>;
> Linux ARM <linux-arm-kernel@lists.infradead.org>
> Subject: Re: [RFC PATCH] iommu/arm-smmu: Add module parameter to set msi
> iova address
> 
> 
> 
> On 5/28/20 11:15 AM, Shameerali Kolothum Thodi wrote:
> >
> >
> >> -----Original Message-----
> >> From: Auger Eric [mailto:eric.auger@redhat.com]
> >> Sent: 28 May 2020 09:54
> >> To: Jean-Philippe Brucker <jean-philippe@linaro.org>
> >> Cc: Will Deacon <will@kernel.org>; Joerg Roedel <joro@8bytes.org>;
> >> iommu@lists.linux-foundation.org; Shameerali Kolothum Thodi
> >> <shameerali.kolothum.thodi@huawei.com>; Linux Kernel Mailing List
> >> <linux-kernel@vger.kernel.org>; Alex Williamson
> >> <alex.williamson@redhat.com>; Srinath Mannam
> >> <srinath.mannam@broadcom.com>; BCM Kernel Feedback
> >> <bcm-kernel-feedback-list@broadcom.com>; Robin Murphy
> >> <robin.murphy@arm.com>; Linux ARM
> <linux-arm-kernel@lists.infradead.org>
> >> Subject: Re: [RFC PATCH] iommu/arm-smmu: Add module parameter to set
> msi
> >> iova address
> >>
> >> Hi,
> >>
> >> On 5/28/20 10:38 AM, Jean-Philippe Brucker wrote:
> >>> [+ Shameer]
> >>>
> >>> On Thu, May 28, 2020 at 09:43:46AM +0200, Auger Eric wrote:
> >>>> Hi,
> >>>>
> >>>> On 5/28/20 9:23 AM, Jean-Philippe Brucker wrote:
> >>>>> On Thu, May 28, 2020 at 10:45:14AM +0530, Srinath Mannam wrote:
> >>>>>> On Wed, May 27, 2020 at 11:00 PM Robin Murphy
> >> <robin.murphy@arm.com> wrote:
> >>>>>>>
> >>>>>> Thanks Robin for your quick response.
> >>>>>>> On 2020-05-27 17:03, Srinath Mannam wrote:
> >>>>>>>> This patch gives the provision to change default value of MSI IOVA
> base
> >>>>>>>> to platform's suitable IOVA using module parameter. The present
> >>>>>>>> hardcoded MSI IOVA base may not be the accessible IOVA ranges of
> >> platform.
> >>>>>>>
> >>>>>>> That in itself doesn't seem entirely unreasonable; IIRC the current
> >>>>>>> address is just an arbitrary choice to fit nicely into Qemu's memory
> >>>>>>> map, and there was always the possibility that it wouldn't suit
> >> everything.
> >>>>>>>
> >>>>>>>> Since commit aadad097cd46 ("iommu/dma: Reserve IOVA for PCIe
> >> inaccessible
> >>>>>>>> DMA address"), inaccessible IOVA address ranges parsed from
> >> dma-ranges
> >>>>>>>> property are reserved.
> >>>>>
> >>>>> I don't understand why we only reserve the PCIe windows for DMA
> >> domains.
> >>>>> Shouldn't VFIO also prevent userspace from mapping them?
> >>>>
> >>>> VFIO prevents userspace from DMA mapping iovas within reserved
> regions:
> >>>> 9b77e5c79840  vfio/type1: check dma map request is within a valid iova
> >> range
> >>>
> >>> Right but I was asking specifically about the IOVA reservation introduced
> >>> by commit aadad097cd46. They are not registered as reserved regions
> within
> >>> the IOMMU core, they are only taken into account by dma-iommu.c when
> >>> creating a DMA domain. As VFIO uses UNMANAGED domains, it isn't
> aware
> >> of
> >>> those regions and they won't be seen by vfio_iommu_resv_exclude().
> >>>
> >>> It looks like the PCIe regions used to be common until cd2c9fcf5c66
> >>> ("iommu/dma: Move PCI window region reservation back into dma specific
> >>> path.") But I couldn't find the justification for this commit.
> >>
> >> Yes I noticed that as well when debugging the above mentioned case
> >> before and after cd2c9fcf5c66. I do not remember about the rationale of
> >> removing the DMA host brige windows from the resv regions. Did it break
> >> a legacy case?
> >>>
> >
> > I think yes. And going through the ML discussions, this was done so because
> with the
> > " vfio/type1: Add support for valid iova list management" series you reported
> > an issue with Seattle platform. See the full discussion here,
> >
> > https://lore.kernel.org/patchwork/patch/889012/
> 
> Hey thank you for reminding me of the Seattle case :-) Now I also recall
> that, if I am not wrong, this also caused some trouble on some x86
> platforms as well, reported by Alex? 

True, Alex reported that VT-d RMRR ranges were causing issues[1] as well.
And then you came with IOMMU_RESV_DIRECT_RELAXABLE regions
to exclude those[2]

Maybe we should still report PCI
> host bridge windows in the reserved regions, if possible/feasible tag
> them differently from other reserved regions and not reject any VFIO
> DMA_MAP colliding with them?

I guess that is possible. But current interface is to report the regions that are safe
from a IOMMU transaction point of view and I am not sure PCI window regions 
comes under that.

Thanks,
Shameer

1. https://lkml.org/lkml/2018/6/5/760
2. https://lore.kernel.org/patchwork/cover/1083072/

> Thanks
> 
> Eric
> >
> > Cheers,
> > Shameer
> >
> >>> The thing is, if VFIO isn't aware of the reserved PCIe windows, then
> >>> allowing VFIO or userspace to choose MSI_IOVA_BASE won't solve the
> >> problem
> >>> reported by Srinath, because they could well choose an IOVA within the
> >>> PCIe window...
> >> I agree with you
> >>
> >> Thanks
> >>
> >> Eric
> >>>
> >>> Thanks,
> >>> Jean
> >>>
> >>>> but it does not prevent the SW MSI region chosen by the kernel from
> >>>> colliding with other reserved regions (esp. PCIe host bridge windows).
> >>>>
> >>>>   If they were
> >>>>> part of the common reserved regions then we could have VFIO choose a
> >>>>> SW_MSI region among the remaining free space.
> >>>> As Robin said this was the initial chosen approach
> >>>> [PATCH 10/10] vfio: allow the user to register reserved iova range for
> >>>> MSI mapping
> >>>> https://patchwork.kernel.org/patch/8121641/
> >>>>
> >>>> Some additional background about why the static SW MSI region chosen
> by
> >>>> the kernel was later chosen:
> >>>> Summary of LPC guest MSI discussion in Santa Fe (was: Re: [RFC 0/8] KVM
> >>>> PCIe/MSI passthrough on ARM/ARM64 (Alt II))
> >>>>
> >>
> https://lists.linuxfoundation.org/pipermail/iommu/2016-November/019060.ht
> >> ml
> >>>>
> >>>> Thanks
> >>>>
> >>>> Eric
> >>>>
> >>>>
> >>>>  It would just need a
> >>>>> different way of asking the IOMMU driver if a SW_MSI is needed, for
> >>>>> example with a domain attribute.
> >>>>>
> >>>>> Thanks,
> >>>>> Jean
> >>>>>
> >>>>>>>
> >>>>>>> That, however, doesn't seem to fit here; iommu-dma maps MSI
> >> doorbells
> >>>>>>> dynamically, so they aren't affected by reserved regions any more
> than
> >>>>>>> regular DMA pages are. In fact, it explicitly ignores the software MSI
> >>>>>>> region, since as the comment says, it *is* the software that manages
> >> those.
> >>>>>> Yes you are right, we don't see any issues with kernel drivers(PCI EP)
> >> because
> >>>>>> MSI IOVA allocated dynamically by honouring reserved regions same as
> >> DMA pages.
> >>>>>>>
> >>>>>>> The MSI_IOVA_BASE region exists for VFIO, precisely because in that
> >> case
> >>>>>>> the kernel *doesn't* control the address space, but still needs some
> way
> >>>>>>> to steal a bit of it for MSIs that the guest doesn't necessarily know
> >>>>>>> about, and give userspace a fighting chance of knowing what it's
> taken.
> >>>>>>> I think at the time we discussed the idea of adding something to the
> >>>>>>> VFIO uapi such that userspace could move this around if it wanted or
> >>>>>>> needed to, but decided we could live without that initially. Perhaps
> now
> >>>>>>> the time has come?
> >>>>>> Yes, we see issues only with user-space drivers(DPDK) in which
> >> MSI_IOVA_BASE
> >>>>>> region is considered to map MSI registers. This patch helps us to fix the
> >> issue.
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Srinath.
> >>>>>>>
> >>>>>>> Robin.
> >>>>>>>
> >>>>>>>> If any platform has the limitaion to access default MSI IOVA, then it
> can
> >>>>>>>> be changed using "arm-smmu.msi_iova_base=0xa0000000"
> command
> >> line argument.
> >>>>>>>>
> >>>>>>>> Signed-off-by: Srinath Mannam <srinath.mannam@broadcom.com>
> >>>>>>>> ---
> >>>>>>>>   drivers/iommu/arm-smmu.c | 5 ++++-
> >>>>>>>>   1 file changed, 4 insertions(+), 1 deletion(-)
> >>>>>>>>
> >>>>>>>> diff --git a/drivers/iommu/arm-smmu.c
> b/drivers/iommu/arm-smmu.c
> >>>>>>>> index 4f1a350..5e59c9d 100644
> >>>>>>>> --- a/drivers/iommu/arm-smmu.c
> >>>>>>>> +++ b/drivers/iommu/arm-smmu.c
> >>>>>>>> @@ -72,6 +72,9 @@ static bool disable_bypass =
> >>>>>>>>   module_param(disable_bypass, bool, S_IRUGO);
> >>>>>>>>   MODULE_PARM_DESC(disable_bypass,
> >>>>>>>>       "Disable bypass streams such that incoming transactions
> from
> >> devices that are not attached to an iommu domain will report an abort back
> to
> >> the device and will not be allowed to pass through the SMMU.");
> >>>>>>>> +static unsigned long msi_iova_base = MSI_IOVA_BASE;
> >>>>>>>> +module_param(msi_iova_base, ulong, S_IRUGO);
> >>>>>>>> +MODULE_PARM_DESC(msi_iova_base, "msi iova base address.");
> >>>>>>>>
> >>>>>>>>   struct arm_smmu_s2cr {
> >>>>>>>>       struct iommu_group              *group;
> >>>>>>>> @@ -1566,7 +1569,7 @@ static void
> >> arm_smmu_get_resv_regions(struct device *dev,
> >>>>>>>>       struct iommu_resv_region *region;
> >>>>>>>>       int prot = IOMMU_WRITE | IOMMU_NOEXEC |
> >> IOMMU_MMIO;
> >>>>>>>>
> >>>>>>>> -     region = iommu_alloc_resv_region(MSI_IOVA_BASE,
> >> MSI_IOVA_LENGTH,
> >>>>>>>> +     region = iommu_alloc_resv_region(msi_iova_base,
> >> MSI_IOVA_LENGTH,
> >>>>>>>>                                        prot,
> >> IOMMU_RESV_SW_MSI);
> >>>>>>>>       if (!region)
> >>>>>>>>               return;
> >>>>>>>>
> >>>>>
> >>>>> _______________________________________________
> >>>>> linux-arm-kernel mailing list
> >>>>> linux-arm-kernel@lists.infradead.org
> >>>>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> >>>>>
> >>>>
> >>>
> >>> _______________________________________________
> >>> linux-arm-kernel mailing list
> >>> linux-arm-kernel@lists.infradead.org
> >>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> >>>
> >
> > _______________________________________________
> > linux-arm-kernel mailing list
> > linux-arm-kernel@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> >

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 37+ messages in thread

* RE: [RFC PATCH] iommu/arm-smmu: Add module parameter to set msi iova address
@ 2020-05-28 12:09                   ` Shameerali Kolothum Thodi
  0 siblings, 0 replies; 37+ messages in thread
From: Shameerali Kolothum Thodi @ 2020-05-28 12:09 UTC (permalink / raw)
  To: Auger Eric, Jean-Philippe Brucker
  Cc: Will Deacon, Joerg Roedel, iommu, Linux Kernel Mailing List,
	Alex Williamson, Srinath Mannam, BCM Kernel Feedback,
	Robin Murphy, Linux ARM



> -----Original Message-----
> From: Auger Eric [mailto:eric.auger@redhat.com]
> Sent: 28 May 2020 12:48
> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>;
> Jean-Philippe Brucker <jean-philippe@linaro.org>
> Cc: Robin Murphy <robin.murphy@arm.com>; Joerg Roedel
> <joro@8bytes.org>; iommu@lists.linux-foundation.org; Linux Kernel Mailing
> List <linux-kernel@vger.kernel.org>; Alex Williamson
> <alex.williamson@redhat.com>; Srinath Mannam
> <srinath.mannam@broadcom.com>; BCM Kernel Feedback
> <bcm-kernel-feedback-list@broadcom.com>; Will Deacon <will@kernel.org>;
> Linux ARM <linux-arm-kernel@lists.infradead.org>
> Subject: Re: [RFC PATCH] iommu/arm-smmu: Add module parameter to set msi
> iova address
> 
> 
> 
> On 5/28/20 11:15 AM, Shameerali Kolothum Thodi wrote:
> >
> >
> >> -----Original Message-----
> >> From: Auger Eric [mailto:eric.auger@redhat.com]
> >> Sent: 28 May 2020 09:54
> >> To: Jean-Philippe Brucker <jean-philippe@linaro.org>
> >> Cc: Will Deacon <will@kernel.org>; Joerg Roedel <joro@8bytes.org>;
> >> iommu@lists.linux-foundation.org; Shameerali Kolothum Thodi
> >> <shameerali.kolothum.thodi@huawei.com>; Linux Kernel Mailing List
> >> <linux-kernel@vger.kernel.org>; Alex Williamson
> >> <alex.williamson@redhat.com>; Srinath Mannam
> >> <srinath.mannam@broadcom.com>; BCM Kernel Feedback
> >> <bcm-kernel-feedback-list@broadcom.com>; Robin Murphy
> >> <robin.murphy@arm.com>; Linux ARM
> <linux-arm-kernel@lists.infradead.org>
> >> Subject: Re: [RFC PATCH] iommu/arm-smmu: Add module parameter to set
> msi
> >> iova address
> >>
> >> Hi,
> >>
> >> On 5/28/20 10:38 AM, Jean-Philippe Brucker wrote:
> >>> [+ Shameer]
> >>>
> >>> On Thu, May 28, 2020 at 09:43:46AM +0200, Auger Eric wrote:
> >>>> Hi,
> >>>>
> >>>> On 5/28/20 9:23 AM, Jean-Philippe Brucker wrote:
> >>>>> On Thu, May 28, 2020 at 10:45:14AM +0530, Srinath Mannam wrote:
> >>>>>> On Wed, May 27, 2020 at 11:00 PM Robin Murphy
> >> <robin.murphy@arm.com> wrote:
> >>>>>>>
> >>>>>> Thanks Robin for your quick response.
> >>>>>>> On 2020-05-27 17:03, Srinath Mannam wrote:
> >>>>>>>> This patch gives the provision to change default value of MSI IOVA
> base
> >>>>>>>> to platform's suitable IOVA using module parameter. The present
> >>>>>>>> hardcoded MSI IOVA base may not be the accessible IOVA ranges of
> >> platform.
> >>>>>>>
> >>>>>>> That in itself doesn't seem entirely unreasonable; IIRC the current
> >>>>>>> address is just an arbitrary choice to fit nicely into Qemu's memory
> >>>>>>> map, and there was always the possibility that it wouldn't suit
> >> everything.
> >>>>>>>
> >>>>>>>> Since commit aadad097cd46 ("iommu/dma: Reserve IOVA for PCIe
> >> inaccessible
> >>>>>>>> DMA address"), inaccessible IOVA address ranges parsed from
> >> dma-ranges
> >>>>>>>> property are reserved.
> >>>>>
> >>>>> I don't understand why we only reserve the PCIe windows for DMA
> >> domains.
> >>>>> Shouldn't VFIO also prevent userspace from mapping them?
> >>>>
> >>>> VFIO prevents userspace from DMA mapping iovas within reserved
> regions:
> >>>> 9b77e5c79840  vfio/type1: check dma map request is within a valid iova
> >> range
> >>>
> >>> Right but I was asking specifically about the IOVA reservation introduced
> >>> by commit aadad097cd46. They are not registered as reserved regions
> within
> >>> the IOMMU core, they are only taken into account by dma-iommu.c when
> >>> creating a DMA domain. As VFIO uses UNMANAGED domains, it isn't
> aware
> >> of
> >>> those regions and they won't be seen by vfio_iommu_resv_exclude().
> >>>
> >>> It looks like the PCIe regions used to be common until cd2c9fcf5c66
> >>> ("iommu/dma: Move PCI window region reservation back into dma specific
> >>> path.") But I couldn't find the justification for this commit.
> >>
> >> Yes I noticed that as well when debugging the above mentioned case
> >> before and after cd2c9fcf5c66. I do not remember about the rationale of
> >> removing the DMA host brige windows from the resv regions. Did it break
> >> a legacy case?
> >>>
> >
> > I think yes. And going through the ML discussions, this was done so because
> with the
> > " vfio/type1: Add support for valid iova list management" series you reported
> > an issue with Seattle platform. See the full discussion here,
> >
> > https://lore.kernel.org/patchwork/patch/889012/
> 
> Hey thank you for reminding me of the Seattle case :-) Now I also recall
> that, if I am not wrong, this also caused some trouble on some x86
> platforms as well, reported by Alex? 

True, Alex reported that VT-d RMRR ranges were causing issues[1] as well.
And then you came with IOMMU_RESV_DIRECT_RELAXABLE regions
to exclude those[2]

Maybe we should still report PCI
> host bridge windows in the reserved regions, if possible/feasible tag
> them differently from other reserved regions and not reject any VFIO
> DMA_MAP colliding with them?

I guess that is possible. But current interface is to report the regions that are safe
from a IOMMU transaction point of view and I am not sure PCI window regions 
comes under that.

Thanks,
Shameer

1. https://lkml.org/lkml/2018/6/5/760
2. https://lore.kernel.org/patchwork/cover/1083072/

> Thanks
> 
> Eric
> >
> > Cheers,
> > Shameer
> >
> >>> The thing is, if VFIO isn't aware of the reserved PCIe windows, then
> >>> allowing VFIO or userspace to choose MSI_IOVA_BASE won't solve the
> >> problem
> >>> reported by Srinath, because they could well choose an IOVA within the
> >>> PCIe window...
> >> I agree with you
> >>
> >> Thanks
> >>
> >> Eric
> >>>
> >>> Thanks,
> >>> Jean
> >>>
> >>>> but it does not prevent the SW MSI region chosen by the kernel from
> >>>> colliding with other reserved regions (esp. PCIe host bridge windows).
> >>>>
> >>>>   If they were
> >>>>> part of the common reserved regions then we could have VFIO choose a
> >>>>> SW_MSI region among the remaining free space.
> >>>> As Robin said this was the initial chosen approach
> >>>> [PATCH 10/10] vfio: allow the user to register reserved iova range for
> >>>> MSI mapping
> >>>> https://patchwork.kernel.org/patch/8121641/
> >>>>
> >>>> Some additional background about why the static SW MSI region chosen
> by
> >>>> the kernel was later chosen:
> >>>> Summary of LPC guest MSI discussion in Santa Fe (was: Re: [RFC 0/8] KVM
> >>>> PCIe/MSI passthrough on ARM/ARM64 (Alt II))
> >>>>
> >>
> https://lists.linuxfoundation.org/pipermail/iommu/2016-November/019060.ht
> >> ml
> >>>>
> >>>> Thanks
> >>>>
> >>>> Eric
> >>>>
> >>>>
> >>>>  It would just need a
> >>>>> different way of asking the IOMMU driver if a SW_MSI is needed, for
> >>>>> example with a domain attribute.
> >>>>>
> >>>>> Thanks,
> >>>>> Jean
> >>>>>
> >>>>>>>
> >>>>>>> That, however, doesn't seem to fit here; iommu-dma maps MSI
> >> doorbells
> >>>>>>> dynamically, so they aren't affected by reserved regions any more
> than
> >>>>>>> regular DMA pages are. In fact, it explicitly ignores the software MSI
> >>>>>>> region, since as the comment says, it *is* the software that manages
> >> those.
> >>>>>> Yes you are right, we don't see any issues with kernel drivers(PCI EP)
> >> because
> >>>>>> MSI IOVA allocated dynamically by honouring reserved regions same as
> >> DMA pages.
> >>>>>>>
> >>>>>>> The MSI_IOVA_BASE region exists for VFIO, precisely because in that
> >> case
> >>>>>>> the kernel *doesn't* control the address space, but still needs some
> way
> >>>>>>> to steal a bit of it for MSIs that the guest doesn't necessarily know
> >>>>>>> about, and give userspace a fighting chance of knowing what it's
> taken.
> >>>>>>> I think at the time we discussed the idea of adding something to the
> >>>>>>> VFIO uapi such that userspace could move this around if it wanted or
> >>>>>>> needed to, but decided we could live without that initially. Perhaps
> now
> >>>>>>> the time has come?
> >>>>>> Yes, we see issues only with user-space drivers(DPDK) in which
> >> MSI_IOVA_BASE
> >>>>>> region is considered to map MSI registers. This patch helps us to fix the
> >> issue.
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Srinath.
> >>>>>>>
> >>>>>>> Robin.
> >>>>>>>
> >>>>>>>> If any platform has the limitaion to access default MSI IOVA, then it
> can
> >>>>>>>> be changed using "arm-smmu.msi_iova_base=0xa0000000"
> command
> >> line argument.
> >>>>>>>>
> >>>>>>>> Signed-off-by: Srinath Mannam <srinath.mannam@broadcom.com>
> >>>>>>>> ---
> >>>>>>>>   drivers/iommu/arm-smmu.c | 5 ++++-
> >>>>>>>>   1 file changed, 4 insertions(+), 1 deletion(-)
> >>>>>>>>
> >>>>>>>> diff --git a/drivers/iommu/arm-smmu.c
> b/drivers/iommu/arm-smmu.c
> >>>>>>>> index 4f1a350..5e59c9d 100644
> >>>>>>>> --- a/drivers/iommu/arm-smmu.c
> >>>>>>>> +++ b/drivers/iommu/arm-smmu.c
> >>>>>>>> @@ -72,6 +72,9 @@ static bool disable_bypass =
> >>>>>>>>   module_param(disable_bypass, bool, S_IRUGO);
> >>>>>>>>   MODULE_PARM_DESC(disable_bypass,
> >>>>>>>>       "Disable bypass streams such that incoming transactions
> from
> >> devices that are not attached to an iommu domain will report an abort back
> to
> >> the device and will not be allowed to pass through the SMMU.");
> >>>>>>>> +static unsigned long msi_iova_base = MSI_IOVA_BASE;
> >>>>>>>> +module_param(msi_iova_base, ulong, S_IRUGO);
> >>>>>>>> +MODULE_PARM_DESC(msi_iova_base, "msi iova base address.");
> >>>>>>>>
> >>>>>>>>   struct arm_smmu_s2cr {
> >>>>>>>>       struct iommu_group              *group;
> >>>>>>>> @@ -1566,7 +1569,7 @@ static void
> >> arm_smmu_get_resv_regions(struct device *dev,
> >>>>>>>>       struct iommu_resv_region *region;
> >>>>>>>>       int prot = IOMMU_WRITE | IOMMU_NOEXEC |
> >> IOMMU_MMIO;
> >>>>>>>>
> >>>>>>>> -     region = iommu_alloc_resv_region(MSI_IOVA_BASE,
> >> MSI_IOVA_LENGTH,
> >>>>>>>> +     region = iommu_alloc_resv_region(msi_iova_base,
> >> MSI_IOVA_LENGTH,
> >>>>>>>>                                        prot,
> >> IOMMU_RESV_SW_MSI);
> >>>>>>>>       if (!region)
> >>>>>>>>               return;
> >>>>>>>>
> >>>>>
> >>>>> _______________________________________________
> >>>>> linux-arm-kernel mailing list
> >>>>> linux-arm-kernel@lists.infradead.org
> >>>>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> >>>>>
> >>>>
> >>>
> >>> _______________________________________________
> >>> linux-arm-kernel mailing list
> >>> linux-arm-kernel@lists.infradead.org
> >>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> >>>
> >
> > _______________________________________________
> > linux-arm-kernel mailing list
> > linux-arm-kernel@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> >

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC PATCH] iommu/arm-smmu: Add module parameter to set msi iova address
  2020-05-28 12:09                   ` Shameerali Kolothum Thodi
  (?)
@ 2020-05-28 13:11                     ` Auger Eric
  -1 siblings, 0 replies; 37+ messages in thread
From: Auger Eric @ 2020-05-28 13:11 UTC (permalink / raw)
  To: Shameerali Kolothum Thodi, Jean-Philippe Brucker
  Cc: Will Deacon, Joerg Roedel, iommu, Linux Kernel Mailing List,
	Alex Williamson, Srinath Mannam, BCM Kernel Feedback,
	Robin Murphy, Linux ARM

Hi Shameer,

On 5/28/20 2:09 PM, Shameerali Kolothum Thodi wrote:
> 
> 
>> -----Original Message-----
>> From: Auger Eric [mailto:eric.auger@redhat.com]
>> Sent: 28 May 2020 12:48
>> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>;
>> Jean-Philippe Brucker <jean-philippe@linaro.org>
>> Cc: Robin Murphy <robin.murphy@arm.com>; Joerg Roedel
>> <joro@8bytes.org>; iommu@lists.linux-foundation.org; Linux Kernel Mailing
>> List <linux-kernel@vger.kernel.org>; Alex Williamson
>> <alex.williamson@redhat.com>; Srinath Mannam
>> <srinath.mannam@broadcom.com>; BCM Kernel Feedback
>> <bcm-kernel-feedback-list@broadcom.com>; Will Deacon <will@kernel.org>;
>> Linux ARM <linux-arm-kernel@lists.infradead.org>
>> Subject: Re: [RFC PATCH] iommu/arm-smmu: Add module parameter to set msi
>> iova address
>>
>>
>>
>> On 5/28/20 11:15 AM, Shameerali Kolothum Thodi wrote:
>>>
>>>
>>>> -----Original Message-----
>>>> From: Auger Eric [mailto:eric.auger@redhat.com]
>>>> Sent: 28 May 2020 09:54
>>>> To: Jean-Philippe Brucker <jean-philippe@linaro.org>
>>>> Cc: Will Deacon <will@kernel.org>; Joerg Roedel <joro@8bytes.org>;
>>>> iommu@lists.linux-foundation.org; Shameerali Kolothum Thodi
>>>> <shameerali.kolothum.thodi@huawei.com>; Linux Kernel Mailing List
>>>> <linux-kernel@vger.kernel.org>; Alex Williamson
>>>> <alex.williamson@redhat.com>; Srinath Mannam
>>>> <srinath.mannam@broadcom.com>; BCM Kernel Feedback
>>>> <bcm-kernel-feedback-list@broadcom.com>; Robin Murphy
>>>> <robin.murphy@arm.com>; Linux ARM
>> <linux-arm-kernel@lists.infradead.org>
>>>> Subject: Re: [RFC PATCH] iommu/arm-smmu: Add module parameter to set
>> msi
>>>> iova address
>>>>
>>>> Hi,
>>>>
>>>> On 5/28/20 10:38 AM, Jean-Philippe Brucker wrote:
>>>>> [+ Shameer]
>>>>>
>>>>> On Thu, May 28, 2020 at 09:43:46AM +0200, Auger Eric wrote:
>>>>>> Hi,
>>>>>>
>>>>>> On 5/28/20 9:23 AM, Jean-Philippe Brucker wrote:
>>>>>>> On Thu, May 28, 2020 at 10:45:14AM +0530, Srinath Mannam wrote:
>>>>>>>> On Wed, May 27, 2020 at 11:00 PM Robin Murphy
>>>> <robin.murphy@arm.com> wrote:
>>>>>>>>>
>>>>>>>> Thanks Robin for your quick response.
>>>>>>>>> On 2020-05-27 17:03, Srinath Mannam wrote:
>>>>>>>>>> This patch gives the provision to change default value of MSI IOVA
>> base
>>>>>>>>>> to platform's suitable IOVA using module parameter. The present
>>>>>>>>>> hardcoded MSI IOVA base may not be the accessible IOVA ranges of
>>>> platform.
>>>>>>>>>
>>>>>>>>> That in itself doesn't seem entirely unreasonable; IIRC the current
>>>>>>>>> address is just an arbitrary choice to fit nicely into Qemu's memory
>>>>>>>>> map, and there was always the possibility that it wouldn't suit
>>>> everything.
>>>>>>>>>
>>>>>>>>>> Since commit aadad097cd46 ("iommu/dma: Reserve IOVA for PCIe
>>>> inaccessible
>>>>>>>>>> DMA address"), inaccessible IOVA address ranges parsed from
>>>> dma-ranges
>>>>>>>>>> property are reserved.
>>>>>>>
>>>>>>> I don't understand why we only reserve the PCIe windows for DMA
>>>> domains.
>>>>>>> Shouldn't VFIO also prevent userspace from mapping them?
>>>>>>
>>>>>> VFIO prevents userspace from DMA mapping iovas within reserved
>> regions:
>>>>>> 9b77e5c79840  vfio/type1: check dma map request is within a valid iova
>>>> range
>>>>>
>>>>> Right but I was asking specifically about the IOVA reservation introduced
>>>>> by commit aadad097cd46. They are not registered as reserved regions
>> within
>>>>> the IOMMU core, they are only taken into account by dma-iommu.c when
>>>>> creating a DMA domain. As VFIO uses UNMANAGED domains, it isn't
>> aware
>>>> of
>>>>> those regions and they won't be seen by vfio_iommu_resv_exclude().
>>>>>
>>>>> It looks like the PCIe regions used to be common until cd2c9fcf5c66
>>>>> ("iommu/dma: Move PCI window region reservation back into dma specific
>>>>> path.") But I couldn't find the justification for this commit.
>>>>
>>>> Yes I noticed that as well when debugging the above mentioned case
>>>> before and after cd2c9fcf5c66. I do not remember about the rationale of
>>>> removing the DMA host brige windows from the resv regions. Did it break
>>>> a legacy case?
>>>>>
>>>
>>> I think yes. And going through the ML discussions, this was done so because
>> with the
>>> " vfio/type1: Add support for valid iova list management" series you reported
>>> an issue with Seattle platform. See the full discussion here,
>>>
>>> https://lore.kernel.org/patchwork/patch/889012/
>>
>> Hey thank you for reminding me of the Seattle case :-) Now I also recall
>> that, if I am not wrong, this also caused some trouble on some x86
>> platforms as well, reported by Alex? 
> 
> True, Alex reported that VT-d RMRR ranges were causing issues[1] as well.
> And then you came with IOMMU_RESV_DIRECT_RELAXABLE regions
> to exclude those[2]
I thought we also had the case of RESERVED regions but anyway.
> 
> Maybe we should still report PCI
>> host bridge windows in the reserved regions, if possible/feasible tag
>> them differently from other reserved regions and not reject any VFIO
>> DMA_MAP colliding with them?
> 
> I guess that is possible. But current interface is to report the regions that are safe
> from a IOMMU transaction point of view and I am not sure PCI window regions 
> comes under that.
yes only the sysfs interface could expose them at the moment.

Thanks

Eric
> 
> Thanks,
> Shameer
> 
> 1. https://lkml.org/lkml/2018/6/5/760
> 2. https://lore.kernel.org/patchwork/cover/1083072/
> 
>> Thanks
>>
>> Eric
>>>
>>> Cheers,
>>> Shameer
>>>
>>>>> The thing is, if VFIO isn't aware of the reserved PCIe windows, then
>>>>> allowing VFIO or userspace to choose MSI_IOVA_BASE won't solve the
>>>> problem
>>>>> reported by Srinath, because they could well choose an IOVA within the
>>>>> PCIe window...
>>>> I agree with you
>>>>
>>>> Thanks
>>>>
>>>> Eric
>>>>>
>>>>> Thanks,
>>>>> Jean
>>>>>
>>>>>> but it does not prevent the SW MSI region chosen by the kernel from
>>>>>> colliding with other reserved regions (esp. PCIe host bridge windows).
>>>>>>
>>>>>>   If they were
>>>>>>> part of the common reserved regions then we could have VFIO choose a
>>>>>>> SW_MSI region among the remaining free space.
>>>>>> As Robin said this was the initial chosen approach
>>>>>> [PATCH 10/10] vfio: allow the user to register reserved iova range for
>>>>>> MSI mapping
>>>>>> https://patchwork.kernel.org/patch/8121641/
>>>>>>
>>>>>> Some additional background about why the static SW MSI region chosen
>> by
>>>>>> the kernel was later chosen:
>>>>>> Summary of LPC guest MSI discussion in Santa Fe (was: Re: [RFC 0/8] KVM
>>>>>> PCIe/MSI passthrough on ARM/ARM64 (Alt II))
>>>>>>
>>>>
>> https://lists.linuxfoundation.org/pipermail/iommu/2016-November/019060.ht
>>>> ml
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>> Eric
>>>>>>
>>>>>>
>>>>>>  It would just need a
>>>>>>> different way of asking the IOMMU driver if a SW_MSI is needed, for
>>>>>>> example with a domain attribute.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Jean
>>>>>>>
>>>>>>>>>
>>>>>>>>> That, however, doesn't seem to fit here; iommu-dma maps MSI
>>>> doorbells
>>>>>>>>> dynamically, so they aren't affected by reserved regions any more
>> than
>>>>>>>>> regular DMA pages are. In fact, it explicitly ignores the software MSI
>>>>>>>>> region, since as the comment says, it *is* the software that manages
>>>> those.
>>>>>>>> Yes you are right, we don't see any issues with kernel drivers(PCI EP)
>>>> because
>>>>>>>> MSI IOVA allocated dynamically by honouring reserved regions same as
>>>> DMA pages.
>>>>>>>>>
>>>>>>>>> The MSI_IOVA_BASE region exists for VFIO, precisely because in that
>>>> case
>>>>>>>>> the kernel *doesn't* control the address space, but still needs some
>> way
>>>>>>>>> to steal a bit of it for MSIs that the guest doesn't necessarily know
>>>>>>>>> about, and give userspace a fighting chance of knowing what it's
>> taken.
>>>>>>>>> I think at the time we discussed the idea of adding something to the
>>>>>>>>> VFIO uapi such that userspace could move this around if it wanted or
>>>>>>>>> needed to, but decided we could live without that initially. Perhaps
>> now
>>>>>>>>> the time has come?
>>>>>>>> Yes, we see issues only with user-space drivers(DPDK) in which
>>>> MSI_IOVA_BASE
>>>>>>>> region is considered to map MSI registers. This patch helps us to fix the
>>>> issue.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Srinath.
>>>>>>>>>
>>>>>>>>> Robin.
>>>>>>>>>
>>>>>>>>>> If any platform has the limitaion to access default MSI IOVA, then it
>> can
>>>>>>>>>> be changed using "arm-smmu.msi_iova_base=0xa0000000"
>> command
>>>> line argument.
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Srinath Mannam <srinath.mannam@broadcom.com>
>>>>>>>>>> ---
>>>>>>>>>>   drivers/iommu/arm-smmu.c | 5 ++++-
>>>>>>>>>>   1 file changed, 4 insertions(+), 1 deletion(-)
>>>>>>>>>>
>>>>>>>>>> diff --git a/drivers/iommu/arm-smmu.c
>> b/drivers/iommu/arm-smmu.c
>>>>>>>>>> index 4f1a350..5e59c9d 100644
>>>>>>>>>> --- a/drivers/iommu/arm-smmu.c
>>>>>>>>>> +++ b/drivers/iommu/arm-smmu.c
>>>>>>>>>> @@ -72,6 +72,9 @@ static bool disable_bypass =
>>>>>>>>>>   module_param(disable_bypass, bool, S_IRUGO);
>>>>>>>>>>   MODULE_PARM_DESC(disable_bypass,
>>>>>>>>>>       "Disable bypass streams such that incoming transactions
>> from
>>>> devices that are not attached to an iommu domain will report an abort back
>> to
>>>> the device and will not be allowed to pass through the SMMU.");
>>>>>>>>>> +static unsigned long msi_iova_base = MSI_IOVA_BASE;
>>>>>>>>>> +module_param(msi_iova_base, ulong, S_IRUGO);
>>>>>>>>>> +MODULE_PARM_DESC(msi_iova_base, "msi iova base address.");
>>>>>>>>>>
>>>>>>>>>>   struct arm_smmu_s2cr {
>>>>>>>>>>       struct iommu_group              *group;
>>>>>>>>>> @@ -1566,7 +1569,7 @@ static void
>>>> arm_smmu_get_resv_regions(struct device *dev,
>>>>>>>>>>       struct iommu_resv_region *region;
>>>>>>>>>>       int prot = IOMMU_WRITE | IOMMU_NOEXEC |
>>>> IOMMU_MMIO;
>>>>>>>>>>
>>>>>>>>>> -     region = iommu_alloc_resv_region(MSI_IOVA_BASE,
>>>> MSI_IOVA_LENGTH,
>>>>>>>>>> +     region = iommu_alloc_resv_region(msi_iova_base,
>>>> MSI_IOVA_LENGTH,
>>>>>>>>>>                                        prot,
>>>> IOMMU_RESV_SW_MSI);
>>>>>>>>>>       if (!region)
>>>>>>>>>>               return;
>>>>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> linux-arm-kernel mailing list
>>>>>>> linux-arm-kernel@lists.infradead.org
>>>>>>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>>>>>>>
>>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> linux-arm-kernel mailing list
>>>>> linux-arm-kernel@lists.infradead.org
>>>>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>>>>>
>>>
>>> _______________________________________________
>>> linux-arm-kernel mailing list
>>> linux-arm-kernel@lists.infradead.org
>>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>>>
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> 


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC PATCH] iommu/arm-smmu: Add module parameter to set msi iova address
@ 2020-05-28 13:11                     ` Auger Eric
  0 siblings, 0 replies; 37+ messages in thread
From: Auger Eric @ 2020-05-28 13:11 UTC (permalink / raw)
  To: Shameerali Kolothum Thodi, Jean-Philippe Brucker
  Cc: Robin Murphy, iommu, Linux Kernel Mailing List, Alex Williamson,
	Srinath Mannam, BCM Kernel Feedback, Will Deacon, Linux ARM

Hi Shameer,

On 5/28/20 2:09 PM, Shameerali Kolothum Thodi wrote:
> 
> 
>> -----Original Message-----
>> From: Auger Eric [mailto:eric.auger@redhat.com]
>> Sent: 28 May 2020 12:48
>> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>;
>> Jean-Philippe Brucker <jean-philippe@linaro.org>
>> Cc: Robin Murphy <robin.murphy@arm.com>; Joerg Roedel
>> <joro@8bytes.org>; iommu@lists.linux-foundation.org; Linux Kernel Mailing
>> List <linux-kernel@vger.kernel.org>; Alex Williamson
>> <alex.williamson@redhat.com>; Srinath Mannam
>> <srinath.mannam@broadcom.com>; BCM Kernel Feedback
>> <bcm-kernel-feedback-list@broadcom.com>; Will Deacon <will@kernel.org>;
>> Linux ARM <linux-arm-kernel@lists.infradead.org>
>> Subject: Re: [RFC PATCH] iommu/arm-smmu: Add module parameter to set msi
>> iova address
>>
>>
>>
>> On 5/28/20 11:15 AM, Shameerali Kolothum Thodi wrote:
>>>
>>>
>>>> -----Original Message-----
>>>> From: Auger Eric [mailto:eric.auger@redhat.com]
>>>> Sent: 28 May 2020 09:54
>>>> To: Jean-Philippe Brucker <jean-philippe@linaro.org>
>>>> Cc: Will Deacon <will@kernel.org>; Joerg Roedel <joro@8bytes.org>;
>>>> iommu@lists.linux-foundation.org; Shameerali Kolothum Thodi
>>>> <shameerali.kolothum.thodi@huawei.com>; Linux Kernel Mailing List
>>>> <linux-kernel@vger.kernel.org>; Alex Williamson
>>>> <alex.williamson@redhat.com>; Srinath Mannam
>>>> <srinath.mannam@broadcom.com>; BCM Kernel Feedback
>>>> <bcm-kernel-feedback-list@broadcom.com>; Robin Murphy
>>>> <robin.murphy@arm.com>; Linux ARM
>> <linux-arm-kernel@lists.infradead.org>
>>>> Subject: Re: [RFC PATCH] iommu/arm-smmu: Add module parameter to set
>> msi
>>>> iova address
>>>>
>>>> Hi,
>>>>
>>>> On 5/28/20 10:38 AM, Jean-Philippe Brucker wrote:
>>>>> [+ Shameer]
>>>>>
>>>>> On Thu, May 28, 2020 at 09:43:46AM +0200, Auger Eric wrote:
>>>>>> Hi,
>>>>>>
>>>>>> On 5/28/20 9:23 AM, Jean-Philippe Brucker wrote:
>>>>>>> On Thu, May 28, 2020 at 10:45:14AM +0530, Srinath Mannam wrote:
>>>>>>>> On Wed, May 27, 2020 at 11:00 PM Robin Murphy
>>>> <robin.murphy@arm.com> wrote:
>>>>>>>>>
>>>>>>>> Thanks Robin for your quick response.
>>>>>>>>> On 2020-05-27 17:03, Srinath Mannam wrote:
>>>>>>>>>> This patch gives the provision to change default value of MSI IOVA
>> base
>>>>>>>>>> to platform's suitable IOVA using module parameter. The present
>>>>>>>>>> hardcoded MSI IOVA base may not be the accessible IOVA ranges of
>>>> platform.
>>>>>>>>>
>>>>>>>>> That in itself doesn't seem entirely unreasonable; IIRC the current
>>>>>>>>> address is just an arbitrary choice to fit nicely into Qemu's memory
>>>>>>>>> map, and there was always the possibility that it wouldn't suit
>>>> everything.
>>>>>>>>>
>>>>>>>>>> Since commit aadad097cd46 ("iommu/dma: Reserve IOVA for PCIe
>>>> inaccessible
>>>>>>>>>> DMA address"), inaccessible IOVA address ranges parsed from
>>>> dma-ranges
>>>>>>>>>> property are reserved.
>>>>>>>
>>>>>>> I don't understand why we only reserve the PCIe windows for DMA
>>>> domains.
>>>>>>> Shouldn't VFIO also prevent userspace from mapping them?
>>>>>>
>>>>>> VFIO prevents userspace from DMA mapping iovas within reserved
>> regions:
>>>>>> 9b77e5c79840  vfio/type1: check dma map request is within a valid iova
>>>> range
>>>>>
>>>>> Right but I was asking specifically about the IOVA reservation introduced
>>>>> by commit aadad097cd46. They are not registered as reserved regions
>> within
>>>>> the IOMMU core, they are only taken into account by dma-iommu.c when
>>>>> creating a DMA domain. As VFIO uses UNMANAGED domains, it isn't
>> aware
>>>> of
>>>>> those regions and they won't be seen by vfio_iommu_resv_exclude().
>>>>>
>>>>> It looks like the PCIe regions used to be common until cd2c9fcf5c66
>>>>> ("iommu/dma: Move PCI window region reservation back into dma specific
>>>>> path.") But I couldn't find the justification for this commit.
>>>>
>>>> Yes I noticed that as well when debugging the above mentioned case
>>>> before and after cd2c9fcf5c66. I do not remember about the rationale of
>>>> removing the DMA host brige windows from the resv regions. Did it break
>>>> a legacy case?
>>>>>
>>>
>>> I think yes. And going through the ML discussions, this was done so because
>> with the
>>> " vfio/type1: Add support for valid iova list management" series you reported
>>> an issue with Seattle platform. See the full discussion here,
>>>
>>> https://lore.kernel.org/patchwork/patch/889012/
>>
>> Hey thank you for reminding me of the Seattle case :-) Now I also recall
>> that, if I am not wrong, this also caused some trouble on some x86
>> platforms as well, reported by Alex? 
> 
> True, Alex reported that VT-d RMRR ranges were causing issues[1] as well.
> And then you came with IOMMU_RESV_DIRECT_RELAXABLE regions
> to exclude those[2]
I thought we also had the case of RESERVED regions but anyway.
> 
> Maybe we should still report PCI
>> host bridge windows in the reserved regions, if possible/feasible tag
>> them differently from other reserved regions and not reject any VFIO
>> DMA_MAP colliding with them?
> 
> I guess that is possible. But current interface is to report the regions that are safe
> from a IOMMU transaction point of view and I am not sure PCI window regions 
> comes under that.
yes only the sysfs interface could expose them at the moment.

Thanks

Eric
> 
> Thanks,
> Shameer
> 
> 1. https://lkml.org/lkml/2018/6/5/760
> 2. https://lore.kernel.org/patchwork/cover/1083072/
> 
>> Thanks
>>
>> Eric
>>>
>>> Cheers,
>>> Shameer
>>>
>>>>> The thing is, if VFIO isn't aware of the reserved PCIe windows, then
>>>>> allowing VFIO or userspace to choose MSI_IOVA_BASE won't solve the
>>>> problem
>>>>> reported by Srinath, because they could well choose an IOVA within the
>>>>> PCIe window...
>>>> I agree with you
>>>>
>>>> Thanks
>>>>
>>>> Eric
>>>>>
>>>>> Thanks,
>>>>> Jean
>>>>>
>>>>>> but it does not prevent the SW MSI region chosen by the kernel from
>>>>>> colliding with other reserved regions (esp. PCIe host bridge windows).
>>>>>>
>>>>>>   If they were
>>>>>>> part of the common reserved regions then we could have VFIO choose a
>>>>>>> SW_MSI region among the remaining free space.
>>>>>> As Robin said this was the initial chosen approach
>>>>>> [PATCH 10/10] vfio: allow the user to register reserved iova range for
>>>>>> MSI mapping
>>>>>> https://patchwork.kernel.org/patch/8121641/
>>>>>>
>>>>>> Some additional background about why the static SW MSI region chosen
>> by
>>>>>> the kernel was later chosen:
>>>>>> Summary of LPC guest MSI discussion in Santa Fe (was: Re: [RFC 0/8] KVM
>>>>>> PCIe/MSI passthrough on ARM/ARM64 (Alt II))
>>>>>>
>>>>
>> https://lists.linuxfoundation.org/pipermail/iommu/2016-November/019060.ht
>>>> ml
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>> Eric
>>>>>>
>>>>>>
>>>>>>  It would just need a
>>>>>>> different way of asking the IOMMU driver if a SW_MSI is needed, for
>>>>>>> example with a domain attribute.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Jean
>>>>>>>
>>>>>>>>>
>>>>>>>>> That, however, doesn't seem to fit here; iommu-dma maps MSI
>>>> doorbells
>>>>>>>>> dynamically, so they aren't affected by reserved regions any more
>> than
>>>>>>>>> regular DMA pages are. In fact, it explicitly ignores the software MSI
>>>>>>>>> region, since as the comment says, it *is* the software that manages
>>>> those.
>>>>>>>> Yes you are right, we don't see any issues with kernel drivers(PCI EP)
>>>> because
>>>>>>>> MSI IOVA allocated dynamically by honouring reserved regions same as
>>>> DMA pages.
>>>>>>>>>
>>>>>>>>> The MSI_IOVA_BASE region exists for VFIO, precisely because in that
>>>> case
>>>>>>>>> the kernel *doesn't* control the address space, but still needs some
>> way
>>>>>>>>> to steal a bit of it for MSIs that the guest doesn't necessarily know
>>>>>>>>> about, and give userspace a fighting chance of knowing what it's
>> taken.
>>>>>>>>> I think at the time we discussed the idea of adding something to the
>>>>>>>>> VFIO uapi such that userspace could move this around if it wanted or
>>>>>>>>> needed to, but decided we could live without that initially. Perhaps
>> now
>>>>>>>>> the time has come?
>>>>>>>> Yes, we see issues only with user-space drivers(DPDK) in which
>>>> MSI_IOVA_BASE
>>>>>>>> region is considered to map MSI registers. This patch helps us to fix the
>>>> issue.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Srinath.
>>>>>>>>>
>>>>>>>>> Robin.
>>>>>>>>>
>>>>>>>>>> If any platform has the limitaion to access default MSI IOVA, then it
>> can
>>>>>>>>>> be changed using "arm-smmu.msi_iova_base=0xa0000000"
>> command
>>>> line argument.
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Srinath Mannam <srinath.mannam@broadcom.com>
>>>>>>>>>> ---
>>>>>>>>>>   drivers/iommu/arm-smmu.c | 5 ++++-
>>>>>>>>>>   1 file changed, 4 insertions(+), 1 deletion(-)
>>>>>>>>>>
>>>>>>>>>> diff --git a/drivers/iommu/arm-smmu.c
>> b/drivers/iommu/arm-smmu.c
>>>>>>>>>> index 4f1a350..5e59c9d 100644
>>>>>>>>>> --- a/drivers/iommu/arm-smmu.c
>>>>>>>>>> +++ b/drivers/iommu/arm-smmu.c
>>>>>>>>>> @@ -72,6 +72,9 @@ static bool disable_bypass =
>>>>>>>>>>   module_param(disable_bypass, bool, S_IRUGO);
>>>>>>>>>>   MODULE_PARM_DESC(disable_bypass,
>>>>>>>>>>       "Disable bypass streams such that incoming transactions
>> from
>>>> devices that are not attached to an iommu domain will report an abort back
>> to
>>>> the device and will not be allowed to pass through the SMMU.");
>>>>>>>>>> +static unsigned long msi_iova_base = MSI_IOVA_BASE;
>>>>>>>>>> +module_param(msi_iova_base, ulong, S_IRUGO);
>>>>>>>>>> +MODULE_PARM_DESC(msi_iova_base, "msi iova base address.");
>>>>>>>>>>
>>>>>>>>>>   struct arm_smmu_s2cr {
>>>>>>>>>>       struct iommu_group              *group;
>>>>>>>>>> @@ -1566,7 +1569,7 @@ static void
>>>> arm_smmu_get_resv_regions(struct device *dev,
>>>>>>>>>>       struct iommu_resv_region *region;
>>>>>>>>>>       int prot = IOMMU_WRITE | IOMMU_NOEXEC |
>>>> IOMMU_MMIO;
>>>>>>>>>>
>>>>>>>>>> -     region = iommu_alloc_resv_region(MSI_IOVA_BASE,
>>>> MSI_IOVA_LENGTH,
>>>>>>>>>> +     region = iommu_alloc_resv_region(msi_iova_base,
>>>> MSI_IOVA_LENGTH,
>>>>>>>>>>                                        prot,
>>>> IOMMU_RESV_SW_MSI);
>>>>>>>>>>       if (!region)
>>>>>>>>>>               return;
>>>>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> linux-arm-kernel mailing list
>>>>>>> linux-arm-kernel@lists.infradead.org
>>>>>>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>>>>>>>
>>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> linux-arm-kernel mailing list
>>>>> linux-arm-kernel@lists.infradead.org
>>>>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>>>>>
>>>
>>> _______________________________________________
>>> linux-arm-kernel mailing list
>>> linux-arm-kernel@lists.infradead.org
>>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>>>
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> 

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [RFC PATCH] iommu/arm-smmu: Add module parameter to set msi iova address
@ 2020-05-28 13:11                     ` Auger Eric
  0 siblings, 0 replies; 37+ messages in thread
From: Auger Eric @ 2020-05-28 13:11 UTC (permalink / raw)
  To: Shameerali Kolothum Thodi, Jean-Philippe Brucker
  Cc: Robin Murphy, Joerg Roedel, iommu, Linux Kernel Mailing List,
	Alex Williamson, Srinath Mannam, BCM Kernel Feedback,
	Will Deacon, Linux ARM

Hi Shameer,

On 5/28/20 2:09 PM, Shameerali Kolothum Thodi wrote:
> 
> 
>> -----Original Message-----
>> From: Auger Eric [mailto:eric.auger@redhat.com]
>> Sent: 28 May 2020 12:48
>> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>;
>> Jean-Philippe Brucker <jean-philippe@linaro.org>
>> Cc: Robin Murphy <robin.murphy@arm.com>; Joerg Roedel
>> <joro@8bytes.org>; iommu@lists.linux-foundation.org; Linux Kernel Mailing
>> List <linux-kernel@vger.kernel.org>; Alex Williamson
>> <alex.williamson@redhat.com>; Srinath Mannam
>> <srinath.mannam@broadcom.com>; BCM Kernel Feedback
>> <bcm-kernel-feedback-list@broadcom.com>; Will Deacon <will@kernel.org>;
>> Linux ARM <linux-arm-kernel@lists.infradead.org>
>> Subject: Re: [RFC PATCH] iommu/arm-smmu: Add module parameter to set msi
>> iova address
>>
>>
>>
>> On 5/28/20 11:15 AM, Shameerali Kolothum Thodi wrote:
>>>
>>>
>>>> -----Original Message-----
>>>> From: Auger Eric [mailto:eric.auger@redhat.com]
>>>> Sent: 28 May 2020 09:54
>>>> To: Jean-Philippe Brucker <jean-philippe@linaro.org>
>>>> Cc: Will Deacon <will@kernel.org>; Joerg Roedel <joro@8bytes.org>;
>>>> iommu@lists.linux-foundation.org; Shameerali Kolothum Thodi
>>>> <shameerali.kolothum.thodi@huawei.com>; Linux Kernel Mailing List
>>>> <linux-kernel@vger.kernel.org>; Alex Williamson
>>>> <alex.williamson@redhat.com>; Srinath Mannam
>>>> <srinath.mannam@broadcom.com>; BCM Kernel Feedback
>>>> <bcm-kernel-feedback-list@broadcom.com>; Robin Murphy
>>>> <robin.murphy@arm.com>; Linux ARM
>> <linux-arm-kernel@lists.infradead.org>
>>>> Subject: Re: [RFC PATCH] iommu/arm-smmu: Add module parameter to set
>> msi
>>>> iova address
>>>>
>>>> Hi,
>>>>
>>>> On 5/28/20 10:38 AM, Jean-Philippe Brucker wrote:
>>>>> [+ Shameer]
>>>>>
>>>>> On Thu, May 28, 2020 at 09:43:46AM +0200, Auger Eric wrote:
>>>>>> Hi,
>>>>>>
>>>>>> On 5/28/20 9:23 AM, Jean-Philippe Brucker wrote:
>>>>>>> On Thu, May 28, 2020 at 10:45:14AM +0530, Srinath Mannam wrote:
>>>>>>>> On Wed, May 27, 2020 at 11:00 PM Robin Murphy
>>>> <robin.murphy@arm.com> wrote:
>>>>>>>>>
>>>>>>>> Thanks Robin for your quick response.
>>>>>>>>> On 2020-05-27 17:03, Srinath Mannam wrote:
>>>>>>>>>> This patch gives the provision to change default value of MSI IOVA
>> base
>>>>>>>>>> to platform's suitable IOVA using module parameter. The present
>>>>>>>>>> hardcoded MSI IOVA base may not be the accessible IOVA ranges of
>>>> platform.
>>>>>>>>>
>>>>>>>>> That in itself doesn't seem entirely unreasonable; IIRC the current
>>>>>>>>> address is just an arbitrary choice to fit nicely into Qemu's memory
>>>>>>>>> map, and there was always the possibility that it wouldn't suit
>>>> everything.
>>>>>>>>>
>>>>>>>>>> Since commit aadad097cd46 ("iommu/dma: Reserve IOVA for PCIe
>>>> inaccessible
>>>>>>>>>> DMA address"), inaccessible IOVA address ranges parsed from
>>>> dma-ranges
>>>>>>>>>> property are reserved.
>>>>>>>
>>>>>>> I don't understand why we only reserve the PCIe windows for DMA
>>>> domains.
>>>>>>> Shouldn't VFIO also prevent userspace from mapping them?
>>>>>>
>>>>>> VFIO prevents userspace from DMA mapping iovas within reserved
>> regions:
>>>>>> 9b77e5c79840  vfio/type1: check dma map request is within a valid iova
>>>> range
>>>>>
>>>>> Right but I was asking specifically about the IOVA reservation introduced
>>>>> by commit aadad097cd46. They are not registered as reserved regions
>> within
>>>>> the IOMMU core, they are only taken into account by dma-iommu.c when
>>>>> creating a DMA domain. As VFIO uses UNMANAGED domains, it isn't
>> aware
>>>> of
>>>>> those regions and they won't be seen by vfio_iommu_resv_exclude().
>>>>>
>>>>> It looks like the PCIe regions used to be common until cd2c9fcf5c66
>>>>> ("iommu/dma: Move PCI window region reservation back into dma specific
>>>>> path.") But I couldn't find the justification for this commit.
>>>>
>>>> Yes I noticed that as well when debugging the above mentioned case
>>>> before and after cd2c9fcf5c66. I do not remember about the rationale of
>>>> removing the DMA host brige windows from the resv regions. Did it break
>>>> a legacy case?
>>>>>
>>>
>>> I think yes. And going through the ML discussions, this was done so because
>> with the
>>> " vfio/type1: Add support for valid iova list management" series you reported
>>> an issue with Seattle platform. See the full discussion here,
>>>
>>> https://lore.kernel.org/patchwork/patch/889012/
>>
>> Hey thank you for reminding me of the Seattle case :-) Now I also recall
>> that, if I am not wrong, this also caused some trouble on some x86
>> platforms as well, reported by Alex? 
> 
> True, Alex reported that VT-d RMRR ranges were causing issues[1] as well.
> And then you came with IOMMU_RESV_DIRECT_RELAXABLE regions
> to exclude those[2]
I thought we also had the case of RESERVED regions but anyway.
> 
> Maybe we should still report PCI
>> host bridge windows in the reserved regions, if possible/feasible tag
>> them differently from other reserved regions and not reject any VFIO
>> DMA_MAP colliding with them?
> 
> I guess that is possible. But current interface is to report the regions that are safe
> from a IOMMU transaction point of view and I am not sure PCI window regions 
> comes under that.
yes only the sysfs interface could expose them at the moment.

Thanks

Eric
> 
> Thanks,
> Shameer
> 
> 1. https://lkml.org/lkml/2018/6/5/760
> 2. https://lore.kernel.org/patchwork/cover/1083072/
> 
>> Thanks
>>
>> Eric
>>>
>>> Cheers,
>>> Shameer
>>>
>>>>> The thing is, if VFIO isn't aware of the reserved PCIe windows, then
>>>>> allowing VFIO or userspace to choose MSI_IOVA_BASE won't solve the
>>>> problem
>>>>> reported by Srinath, because they could well choose an IOVA within the
>>>>> PCIe window...
>>>> I agree with you
>>>>
>>>> Thanks
>>>>
>>>> Eric
>>>>>
>>>>> Thanks,
>>>>> Jean
>>>>>
>>>>>> but it does not prevent the SW MSI region chosen by the kernel from
>>>>>> colliding with other reserved regions (esp. PCIe host bridge windows).
>>>>>>
>>>>>>   If they were
>>>>>>> part of the common reserved regions then we could have VFIO choose a
>>>>>>> SW_MSI region among the remaining free space.
>>>>>> As Robin said this was the initial chosen approach
>>>>>> [PATCH 10/10] vfio: allow the user to register reserved iova range for
>>>>>> MSI mapping
>>>>>> https://patchwork.kernel.org/patch/8121641/
>>>>>>
>>>>>> Some additional background about why the static SW MSI region chosen
>> by
>>>>>> the kernel was later chosen:
>>>>>> Summary of LPC guest MSI discussion in Santa Fe (was: Re: [RFC 0/8] KVM
>>>>>> PCIe/MSI passthrough on ARM/ARM64 (Alt II))
>>>>>>
>>>>
>> https://lists.linuxfoundation.org/pipermail/iommu/2016-November/019060.ht
>>>> ml
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>> Eric
>>>>>>
>>>>>>
>>>>>>  It would just need a
>>>>>>> different way of asking the IOMMU driver if a SW_MSI is needed, for
>>>>>>> example with a domain attribute.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Jean
>>>>>>>
>>>>>>>>>
>>>>>>>>> That, however, doesn't seem to fit here; iommu-dma maps MSI
>>>> doorbells
>>>>>>>>> dynamically, so they aren't affected by reserved regions any more
>> than
>>>>>>>>> regular DMA pages are. In fact, it explicitly ignores the software MSI
>>>>>>>>> region, since as the comment says, it *is* the software that manages
>>>> those.
>>>>>>>> Yes you are right, we don't see any issues with kernel drivers(PCI EP)
>>>> because
>>>>>>>> MSI IOVA allocated dynamically by honouring reserved regions same as
>>>> DMA pages.
>>>>>>>>>
>>>>>>>>> The MSI_IOVA_BASE region exists for VFIO, precisely because in that
>>>> case
>>>>>>>>> the kernel *doesn't* control the address space, but still needs some
>> way
>>>>>>>>> to steal a bit of it for MSIs that the guest doesn't necessarily know
>>>>>>>>> about, and give userspace a fighting chance of knowing what it's
>> taken.
>>>>>>>>> I think at the time we discussed the idea of adding something to the
>>>>>>>>> VFIO uapi such that userspace could move this around if it wanted or
>>>>>>>>> needed to, but decided we could live without that initially. Perhaps
>> now
>>>>>>>>> the time has come?
>>>>>>>> Yes, we see issues only with user-space drivers(DPDK) in which
>>>> MSI_IOVA_BASE
>>>>>>>> region is considered to map MSI registers. This patch helps us to fix the
>>>> issue.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Srinath.
>>>>>>>>>
>>>>>>>>> Robin.
>>>>>>>>>
>>>>>>>>>> If any platform has the limitaion to access default MSI IOVA, then it
>> can
>>>>>>>>>> be changed using "arm-smmu.msi_iova_base=0xa0000000"
>> command
>>>> line argument.
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Srinath Mannam <srinath.mannam@broadcom.com>
>>>>>>>>>> ---
>>>>>>>>>>   drivers/iommu/arm-smmu.c | 5 ++++-
>>>>>>>>>>   1 file changed, 4 insertions(+), 1 deletion(-)
>>>>>>>>>>
>>>>>>>>>> diff --git a/drivers/iommu/arm-smmu.c
>> b/drivers/iommu/arm-smmu.c
>>>>>>>>>> index 4f1a350..5e59c9d 100644
>>>>>>>>>> --- a/drivers/iommu/arm-smmu.c
>>>>>>>>>> +++ b/drivers/iommu/arm-smmu.c
>>>>>>>>>> @@ -72,6 +72,9 @@ static bool disable_bypass =
>>>>>>>>>>   module_param(disable_bypass, bool, S_IRUGO);
>>>>>>>>>>   MODULE_PARM_DESC(disable_bypass,
>>>>>>>>>>       "Disable bypass streams such that incoming transactions
>> from
>>>> devices that are not attached to an iommu domain will report an abort back
>> to
>>>> the device and will not be allowed to pass through the SMMU.");
>>>>>>>>>> +static unsigned long msi_iova_base = MSI_IOVA_BASE;
>>>>>>>>>> +module_param(msi_iova_base, ulong, S_IRUGO);
>>>>>>>>>> +MODULE_PARM_DESC(msi_iova_base, "msi iova base address.");
>>>>>>>>>>
>>>>>>>>>>   struct arm_smmu_s2cr {
>>>>>>>>>>       struct iommu_group              *group;
>>>>>>>>>> @@ -1566,7 +1569,7 @@ static void
>>>> arm_smmu_get_resv_regions(struct device *dev,
>>>>>>>>>>       struct iommu_resv_region *region;
>>>>>>>>>>       int prot = IOMMU_WRITE | IOMMU_NOEXEC |
>>>> IOMMU_MMIO;
>>>>>>>>>>
>>>>>>>>>> -     region = iommu_alloc_resv_region(MSI_IOVA_BASE,
>>>> MSI_IOVA_LENGTH,
>>>>>>>>>> +     region = iommu_alloc_resv_region(msi_iova_base,
>>>> MSI_IOVA_LENGTH,
>>>>>>>>>>                                        prot,
>>>> IOMMU_RESV_SW_MSI);
>>>>>>>>>>       if (!region)
>>>>>>>>>>               return;
>>>>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> linux-arm-kernel mailing list
>>>>>>> linux-arm-kernel@lists.infradead.org
>>>>>>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>>>>>>>
>>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> linux-arm-kernel mailing list
>>>>> linux-arm-kernel@lists.infradead.org
>>>>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>>>>>
>>>
>>> _______________________________________________
>>> linux-arm-kernel mailing list
>>> linux-arm-kernel@lists.infradead.org
>>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>>>
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> 


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 37+ messages in thread

end of thread, other threads:[~2020-05-28 13:11 UTC | newest]

Thread overview: 37+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-27 16:03 [RFC PATCH] iommu/arm-smmu: Add module parameter to set msi iova address Srinath Mannam
2020-05-27 16:03 ` Srinath Mannam
2020-05-27 16:03 ` Srinath Mannam via iommu
2020-05-27 17:30 ` Robin Murphy
2020-05-27 17:30   ` Robin Murphy
2020-05-27 17:30   ` Robin Murphy
2020-05-28  5:15   ` Srinath Mannam
2020-05-28  5:15     ` Srinath Mannam
2020-05-28  5:15     ` Srinath Mannam via iommu
2020-05-28  7:23     ` Jean-Philippe Brucker
2020-05-28  7:23       ` Jean-Philippe Brucker
2020-05-28  7:23       ` Jean-Philippe Brucker
2020-05-28  7:43       ` Auger Eric
2020-05-28  7:43         ` Auger Eric
2020-05-28  7:43         ` Auger Eric
2020-05-28  8:38         ` Jean-Philippe Brucker
2020-05-28  8:38           ` Jean-Philippe Brucker
2020-05-28  8:38           ` Jean-Philippe Brucker
2020-05-28  8:53           ` Auger Eric
2020-05-28  8:53             ` Auger Eric
2020-05-28  8:53             ` Auger Eric
2020-05-28  9:15             ` Shameerali Kolothum Thodi
2020-05-28  9:15               ` Shameerali Kolothum Thodi
2020-05-28  9:15               ` Shameerali Kolothum Thodi
2020-05-28 11:47               ` Auger Eric
2020-05-28 11:48               ` Auger Eric
2020-05-28 11:48                 ` Auger Eric
2020-05-28 11:48                 ` Auger Eric
2020-05-28 12:09                 ` Shameerali Kolothum Thodi
2020-05-28 12:09                   ` Shameerali Kolothum Thodi
2020-05-28 12:09                   ` Shameerali Kolothum Thodi
2020-05-28 13:11                   ` Auger Eric
2020-05-28 13:11                     ` Auger Eric
2020-05-28 13:11                     ` Auger Eric
2020-05-28  7:21   ` Auger Eric
2020-05-28  7:21     ` Auger Eric
2020-05-28  7:21     ` Auger Eric

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.