* [PATCH v4 1/5] iommu/arm-smmu-v3: fix the implementation of flush_iotlb_all hook
2018-08-06 12:26 [PATCH v4 0/5] add non-strict mode support for arm-smmu-v3 Zhen Lei
@ 2018-08-06 12:27 ` Zhen Lei
2018-08-09 10:25 ` Robin Murphy
2018-08-06 12:27 ` [PATCH v4 2/5] iommu/dma: add support for non-strict mode Zhen Lei
` (3 subsequent siblings)
4 siblings, 1 reply; 15+ messages in thread
From: Zhen Lei @ 2018-08-06 12:27 UTC (permalink / raw)
To: Robin Murphy, Will Deacon, Joerg Roedel, linux-arm-kernel, iommu,
linux-kernel
Cc: Zhen Lei, LinuxArm, Hanjun Guo, Libin
.flush_iotlb_all cannot merely wait for previous TLBI operations to
complete; it must also invalidate all TLBs of the related domain.
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
drivers/iommu/arm-smmu-v3.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 4810f61..2f1304b 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -1770,6 +1770,14 @@ static int arm_smmu_map(struct iommu_domain *domain, unsigned long iova,
return ops->unmap(ops, iova, size);
}
+static void arm_smmu_flush_iotlb_all(struct iommu_domain *domain)
+{
+ struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+
+ if (smmu_domain->smmu)
+ arm_smmu_tlb_inv_context(smmu_domain);
+}
+
static void arm_smmu_iotlb_sync(struct iommu_domain *domain)
{
struct arm_smmu_device *smmu = to_smmu_domain(domain)->smmu;
@@ -1998,7 +2006,7 @@ static void arm_smmu_put_resv_regions(struct device *dev,
.map = arm_smmu_map,
.unmap = arm_smmu_unmap,
.map_sg = default_iommu_map_sg,
- .flush_iotlb_all = arm_smmu_iotlb_sync,
+ .flush_iotlb_all = arm_smmu_flush_iotlb_all,
.iotlb_sync = arm_smmu_iotlb_sync,
.iova_to_phys = arm_smmu_iova_to_phys,
.add_device = arm_smmu_add_device,
--
1.8.3
^ permalink raw reply related [flat|nested] 15+ messages in thread
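The semantic difference fixed by the patch above can be modelled with a small userspace sketch: merely waiting for already-issued invalidation commands to drain (the old behaviour, where flush_iotlb_all aliased iotlb_sync) leaves stale TLB entries in place, whereas invalidating the whole domain context removes them. All names here are illustrative toys, not the kernel's SMMU API.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Toy model: a tiny TLB cache plus a count of pending TLBI commands.
 * Hypothetical names; not kernel code. */
#define TLB_ENTRIES 8

struct toy_smmu {
	bool tlb_valid[TLB_ENTRIES]; /* cached translations */
	int pending_tlbi;            /* TLBI commands issued, not yet completed */
};

/* Old behaviour: flush_iotlb_all == iotlb_sync, i.e. only wait for
 * previously issued commands to complete; nothing new is invalidated. */
static void toy_sync(struct toy_smmu *s)
{
	s->pending_tlbi = 0;
}

/* Fixed behaviour: invalidate the whole domain context, then sync. */
static void toy_tlb_inv_context(struct toy_smmu *s)
{
	memset(s->tlb_valid, 0, sizeof(s->tlb_valid));
	toy_sync(s);
}

static int toy_count_valid(const struct toy_smmu *s)
{
	int n = 0;
	for (int i = 0; i < TLB_ENTRIES; i++)
		n += s->tlb_valid[i];
	return n;
}
```

Running the two operations against the same initial state shows why the old hook could leave translations live after a caller asked for a full flush.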
* Re: [PATCH v4 1/5] iommu/arm-smmu-v3: fix the implementation of flush_iotlb_all hook
2018-08-06 12:27 ` [PATCH v4 1/5] iommu/arm-smmu-v3: fix the implementation of flush_iotlb_all hook Zhen Lei
@ 2018-08-09 10:25 ` Robin Murphy
0 siblings, 0 replies; 15+ messages in thread
From: Robin Murphy @ 2018-08-09 10:25 UTC (permalink / raw)
To: Zhen Lei, Will Deacon, Joerg Roedel, linux-arm-kernel, iommu,
linux-kernel
Cc: LinuxArm, Hanjun Guo, Libin
On 06/08/18 13:27, Zhen Lei wrote:
> .flush_iotlb_all cannot merely wait for previous TLBI operations to
> complete; it must also invalidate all TLBs of the related domain.
I think it was like that because the only caller in practice was
iommu_group_create_direct_mappings(), and at that point any relevant
invalidations would have already been issued anyway. Once
flush_iotlb_all actually does what it's supposed to we'll get a bit of
unavoidable over-invalidation on that path, but it's no big deal.
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
> ---
> drivers/iommu/arm-smmu-v3.c | 10 +++++++++-
> 1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
> index 4810f61..2f1304b 100644
> --- a/drivers/iommu/arm-smmu-v3.c
> +++ b/drivers/iommu/arm-smmu-v3.c
> @@ -1770,6 +1770,14 @@ static int arm_smmu_map(struct iommu_domain *domain, unsigned long iova,
> return ops->unmap(ops, iova, size);
> }
>
> +static void arm_smmu_flush_iotlb_all(struct iommu_domain *domain)
> +{
> + struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
> +
> + if (smmu_domain->smmu)
> + arm_smmu_tlb_inv_context(smmu_domain);
> +}
> +
> static void arm_smmu_iotlb_sync(struct iommu_domain *domain)
> {
> struct arm_smmu_device *smmu = to_smmu_domain(domain)->smmu;
> @@ -1998,7 +2006,7 @@ static void arm_smmu_put_resv_regions(struct device *dev,
> .map = arm_smmu_map,
> .unmap = arm_smmu_unmap,
> .map_sg = default_iommu_map_sg,
> - .flush_iotlb_all = arm_smmu_iotlb_sync,
> + .flush_iotlb_all = arm_smmu_flush_iotlb_all,
> .iotlb_sync = arm_smmu_iotlb_sync,
> .iova_to_phys = arm_smmu_iova_to_phys,
> .add_device = arm_smmu_add_device,
> --
> 1.8.3
>
>
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH v4 2/5] iommu/dma: add support for non-strict mode
2018-08-06 12:26 [PATCH v4 0/5] add non-strict mode support for arm-smmu-v3 Zhen Lei
2018-08-06 12:27 ` [PATCH v4 1/5] iommu/arm-smmu-v3: fix the implementation of flush_iotlb_all hook Zhen Lei
@ 2018-08-06 12:27 ` Zhen Lei
2018-08-09 10:46 ` Robin Murphy
2018-08-06 12:27 ` [PATCH v4 3/5] iommu/io-pgtable-arm: " Zhen Lei
` (2 subsequent siblings)
4 siblings, 1 reply; 15+ messages in thread
From: Zhen Lei @ 2018-08-06 12:27 UTC (permalink / raw)
To: Robin Murphy, Will Deacon, Joerg Roedel, linux-arm-kernel, iommu,
linux-kernel
Cc: Zhen Lei, LinuxArm, Hanjun Guo, Libin
1. Save the related domain pointer in struct iommu_dma_cookie, so that the
iovad layer can call domain->ops->flush_iotlb_all to flush the TLB.
2. During iommu domain initialization, check the domain->non_strict field
to see whether non-strict mode is enabled. If so, call
init_iova_flush_queue to register the iovad->flush_cb callback.
3. All unmap (including iova-free) APIs eventually invoke __iommu_dma_unmap
-->iommu_dma_free_iova. If the domain is non-strict, call queue_iova to
defer the iova freeing.
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
drivers/iommu/dma-iommu.c | 23 +++++++++++++++++++++++
drivers/iommu/iommu.c | 1 +
include/linux/iommu.h | 1 +
3 files changed, 25 insertions(+)
diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index ddcbbdb..213e62a 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -55,6 +55,7 @@ struct iommu_dma_cookie {
};
struct list_head msi_page_list;
spinlock_t msi_lock;
+ struct iommu_domain *domain;
};
static inline size_t cookie_msi_granule(struct iommu_dma_cookie *cookie)
@@ -257,6 +258,17 @@ static int iova_reserve_iommu_regions(struct device *dev,
return ret;
}
+static void iommu_dma_flush_iotlb_all(struct iova_domain *iovad)
+{
+ struct iommu_dma_cookie *cookie;
+ struct iommu_domain *domain;
+
+ cookie = container_of(iovad, struct iommu_dma_cookie, iovad);
+ domain = cookie->domain;
+
+ domain->ops->flush_iotlb_all(domain);
+}
+
/**
* iommu_dma_init_domain - Initialise a DMA mapping domain
* @domain: IOMMU domain previously prepared by iommu_get_dma_cookie()
@@ -308,6 +320,14 @@ int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base,
}
init_iova_domain(iovad, 1UL << order, base_pfn);
+
+ if (domain->non_strict) {
+ BUG_ON(!domain->ops->flush_iotlb_all);
+
+ cookie->domain = domain;
+ init_iova_flush_queue(iovad, iommu_dma_flush_iotlb_all, NULL);
+ }
+
if (!dev)
return 0;
@@ -390,6 +410,9 @@ static void iommu_dma_free_iova(struct iommu_dma_cookie *cookie,
/* The MSI case is only ever cleaning up its most recent allocation */
if (cookie->type == IOMMU_DMA_MSI_COOKIE)
cookie->msi_iova -= size;
+ else if (cookie->domain && cookie->domain->non_strict)
+ queue_iova(iovad, iova_pfn(iovad, iova),
+ size >> iova_shift(iovad), 0);
else
free_iova_fast(iovad, iova_pfn(iovad, iova),
size >> iova_shift(iovad));
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 63b3756..7811fde 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1263,6 +1263,7 @@ static struct iommu_domain *__iommu_domain_alloc(struct bus_type *bus,
domain->ops = bus->iommu_ops;
domain->type = type;
+ domain->non_strict = 0;
/* Assume all sizes by default; the driver may override this later */
domain->pgsize_bitmap = bus->iommu_ops->pgsize_bitmap;
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 19938ee..0a0fb48 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -88,6 +88,7 @@ struct iommu_domain_geometry {
struct iommu_domain {
unsigned type;
+ int non_strict;
const struct iommu_ops *ops;
unsigned long pgsize_bitmap; /* Bitmap of page sizes in use */
iommu_fault_handler_t handler;
--
1.8.3
^ permalink raw reply related [flat|nested] 15+ messages in thread
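The deferred-freeing path described in the commit message above can be sketched as a toy flush queue: unmapped IOVAs are queued instead of released, and the queue's flush callback performs a single whole-domain TLB invalidation for the batch before the IOVAs become reusable. The names below are hypothetical and the structure is deliberately simplified; it is not the kernel's iova API.

```c
#include <assert.h>

/* Toy model of non-strict unmap: defer IOVA release behind one batched
 * TLB flush. Illustrative only. */
#define QUEUE_DEPTH 16

struct toy_iovad {
	unsigned long queued[QUEUE_DEPTH];
	int nr_queued;
	int tlb_flushes; /* whole-domain invalidations issued */
	int freed;       /* IOVAs actually released for reuse */
};

/* Analogue of queue_iova(): no TLBI, no free - just remember the IOVA. */
static void toy_queue_iova(struct toy_iovad *iovad, unsigned long pfn)
{
	iovad->queued[iovad->nr_queued++] = pfn;
}

/* Analogue of the flush-queue callback (iommu_dma_flush_iotlb_all)
 * followed by releasing the queued IOVAs. */
static void toy_fq_flush(struct toy_iovad *iovad)
{
	iovad->tlb_flushes++;             /* one flush_iotlb_all per batch */
	iovad->freed += iovad->nr_queued; /* now safe to reuse the IOVAs */
	iovad->nr_queued = 0;
}
```

The payoff of the design is visible in the counters: three unmaps cost one TLB flush instead of three, at the price of the IOVAs staying unavailable until the flush runs.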
* Re: [PATCH v4 2/5] iommu/dma: add support for non-strict mode
2018-08-06 12:27 ` [PATCH v4 2/5] iommu/dma: add support for non-strict mode Zhen Lei
@ 2018-08-09 10:46 ` Robin Murphy
2018-08-09 11:01 ` Leizhen (ThunderTown)
0 siblings, 1 reply; 15+ messages in thread
From: Robin Murphy @ 2018-08-09 10:46 UTC (permalink / raw)
To: Zhen Lei, Will Deacon, Joerg Roedel, linux-arm-kernel, iommu,
linux-kernel
Cc: LinuxArm, Hanjun Guo, Libin
On 06/08/18 13:27, Zhen Lei wrote:
> 1. Save the related domain pointer in struct iommu_dma_cookie, so that the
> iovad layer can call domain->ops->flush_iotlb_all to flush the TLB.
> 2. During iommu domain initialization, check the domain->non_strict field
> to see whether non-strict mode is enabled. If so, call
> init_iova_flush_queue to register the iovad->flush_cb callback.
> 3. All unmap (including iova-free) APIs eventually invoke __iommu_dma_unmap
> -->iommu_dma_free_iova. If the domain is non-strict, call queue_iova to
> defer the iova freeing.
>
> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
> ---
> drivers/iommu/dma-iommu.c | 23 +++++++++++++++++++++++
> drivers/iommu/iommu.c | 1 +
> include/linux/iommu.h | 1 +
> 3 files changed, 25 insertions(+)
>
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index ddcbbdb..213e62a 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -55,6 +55,7 @@ struct iommu_dma_cookie {
> };
> struct list_head msi_page_list;
> spinlock_t msi_lock;
> + struct iommu_domain *domain;
> };
>
> static inline size_t cookie_msi_granule(struct iommu_dma_cookie *cookie)
> @@ -257,6 +258,17 @@ static int iova_reserve_iommu_regions(struct device *dev,
> return ret;
> }
>
> +static void iommu_dma_flush_iotlb_all(struct iova_domain *iovad)
> +{
> + struct iommu_dma_cookie *cookie;
> + struct iommu_domain *domain;
> +
> + cookie = container_of(iovad, struct iommu_dma_cookie, iovad);
> + domain = cookie->domain;
> +
> + domain->ops->flush_iotlb_all(domain);
> +}
> +
> /**
> * iommu_dma_init_domain - Initialise a DMA mapping domain
> * @domain: IOMMU domain previously prepared by iommu_get_dma_cookie()
> @@ -308,6 +320,14 @@ int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base,
> }
>
> init_iova_domain(iovad, 1UL << order, base_pfn);
> +
> + if (domain->non_strict) {
> + BUG_ON(!domain->ops->flush_iotlb_all);
> +
> + cookie->domain = domain;
cookie->domain will only be non-NULL if domain->non_strict is true...
> + init_iova_flush_queue(iovad, iommu_dma_flush_iotlb_all, NULL);
> + }
> +
> if (!dev)
> return 0;
>
> @@ -390,6 +410,9 @@ static void iommu_dma_free_iova(struct iommu_dma_cookie *cookie,
> /* The MSI case is only ever cleaning up its most recent allocation */
> if (cookie->type == IOMMU_DMA_MSI_COOKIE)
> cookie->msi_iova -= size;
> + else if (cookie->domain && cookie->domain->non_strict)
...so we don't need to re-check non_strict every time here.
> + queue_iova(iovad, iova_pfn(iovad, iova),
> + size >> iova_shift(iovad), 0);
> else
> free_iova_fast(iovad, iova_pfn(iovad, iova),
> size >> iova_shift(iovad));
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index 63b3756..7811fde 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -1263,6 +1263,7 @@ static struct iommu_domain *__iommu_domain_alloc(struct bus_type *bus,
>
> domain->ops = bus->iommu_ops;
> domain->type = type;
> + domain->non_strict = 0;
> /* Assume all sizes by default; the driver may override this later */
> domain->pgsize_bitmap = bus->iommu_ops->pgsize_bitmap;
>
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index 19938ee..0a0fb48 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -88,6 +88,7 @@ struct iommu_domain_geometry {
>
> struct iommu_domain {
> unsigned type;
> + int non_strict;
bool?
Robin.
> const struct iommu_ops *ops;
> unsigned long pgsize_bitmap; /* Bitmap of page sizes in use */
> iommu_fault_handler_t handler;
> --
> 1.8.3
>
>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v4 2/5] iommu/dma: add support for non-strict mode
2018-08-09 10:46 ` Robin Murphy
@ 2018-08-09 11:01 ` Leizhen (ThunderTown)
0 siblings, 0 replies; 15+ messages in thread
From: Leizhen (ThunderTown) @ 2018-08-09 11:01 UTC (permalink / raw)
To: Robin Murphy, Will Deacon, Joerg Roedel, linux-arm-kernel, iommu,
linux-kernel
Cc: LinuxArm, Hanjun Guo, Libin
On 2018/8/9 18:46, Robin Murphy wrote:
> On 06/08/18 13:27, Zhen Lei wrote:
>> 1. Save the related domain pointer in struct iommu_dma_cookie, so that the
>> iovad layer can call domain->ops->flush_iotlb_all to flush the TLB.
>> 2. During iommu domain initialization, check the domain->non_strict field
>> to see whether non-strict mode is enabled. If so, call
>> init_iova_flush_queue to register the iovad->flush_cb callback.
>> 3. All unmap (including iova-free) APIs eventually invoke __iommu_dma_unmap
>> -->iommu_dma_free_iova. If the domain is non-strict, call queue_iova to
>> defer the iova freeing.
>>
>> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
>> ---
>> drivers/iommu/dma-iommu.c | 23 +++++++++++++++++++++++
>> drivers/iommu/iommu.c | 1 +
>> include/linux/iommu.h | 1 +
>> 3 files changed, 25 insertions(+)
>>
>> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
>> index ddcbbdb..213e62a 100644
>> --- a/drivers/iommu/dma-iommu.c
>> +++ b/drivers/iommu/dma-iommu.c
>> @@ -55,6 +55,7 @@ struct iommu_dma_cookie {
>> };
>> struct list_head msi_page_list;
>> spinlock_t msi_lock;
>> + struct iommu_domain *domain;
>> };
>>
>> static inline size_t cookie_msi_granule(struct iommu_dma_cookie *cookie)
>> @@ -257,6 +258,17 @@ static int iova_reserve_iommu_regions(struct device *dev,
>> return ret;
>> }
>>
>> +static void iommu_dma_flush_iotlb_all(struct iova_domain *iovad)
>> +{
>> + struct iommu_dma_cookie *cookie;
>> + struct iommu_domain *domain;
>> +
>> + cookie = container_of(iovad, struct iommu_dma_cookie, iovad);
>> + domain = cookie->domain;
>> +
>> + domain->ops->flush_iotlb_all(domain);
>> +}
>> +
>> /**
>> * iommu_dma_init_domain - Initialise a DMA mapping domain
>> * @domain: IOMMU domain previously prepared by iommu_get_dma_cookie()
>> @@ -308,6 +320,14 @@ int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base,
>> }
>>
>> init_iova_domain(iovad, 1UL << order, base_pfn);
>> +
>> + if (domain->non_strict) {
>> + BUG_ON(!domain->ops->flush_iotlb_all);
>> +
>> + cookie->domain = domain;
>
> cookie->domain will only be non-NULL if domain->non_strict is true...
>
>> + init_iova_flush_queue(iovad, iommu_dma_flush_iotlb_all, NULL);
>> + }
>> +
>> if (!dev)
>> return 0;
>>
>> @@ -390,6 +410,9 @@ static void iommu_dma_free_iova(struct iommu_dma_cookie *cookie,
>> /* The MSI case is only ever cleaning up its most recent allocation */
>> if (cookie->type == IOMMU_DMA_MSI_COOKIE)
>> cookie->msi_iova -= size;
>> + else if (cookie->domain && cookie->domain->non_strict)
>
> ...so we don't need to re-check non_strict every time here.
OK, I will change it to a comment.
>
>> + queue_iova(iovad, iova_pfn(iovad, iova),
>> + size >> iova_shift(iovad), 0);
>> else
>> free_iova_fast(iovad, iova_pfn(iovad, iova),
>> size >> iova_shift(iovad));
>> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
>> index 63b3756..7811fde 100644
>> --- a/drivers/iommu/iommu.c
>> +++ b/drivers/iommu/iommu.c
>> @@ -1263,6 +1263,7 @@ static struct iommu_domain *__iommu_domain_alloc(struct bus_type *bus,
>>
>> domain->ops = bus->iommu_ops;
>> domain->type = type;
>> + domain->non_strict = 0;
>> /* Assume all sizes by default; the driver may override this later */
>> domain->pgsize_bitmap = bus->iommu_ops->pgsize_bitmap;
>>
>> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
>> index 19938ee..0a0fb48 100644
>> --- a/include/linux/iommu.h
>> +++ b/include/linux/iommu.h
>> @@ -88,6 +88,7 @@ struct iommu_domain_geometry {
>>
>> struct iommu_domain {
>> unsigned type;
>> + int non_strict;
>
> bool?
OK
>
> Robin.
>
>> const struct iommu_ops *ops;
>> unsigned long pgsize_bitmap; /* Bitmap of page sizes in use */
>> iommu_fault_handler_t handler;
>> --
>> 1.8.3
>>
>>
>
> .
>
--
Thanks!
Best Regards
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH v4 3/5] iommu/io-pgtable-arm: add support for non-strict mode
2018-08-06 12:26 [PATCH v4 0/5] add non-strict mode support for arm-smmu-v3 Zhen Lei
2018-08-06 12:27 ` [PATCH v4 1/5] iommu/arm-smmu-v3: fix the implementation of flush_iotlb_all hook Zhen Lei
2018-08-06 12:27 ` [PATCH v4 2/5] iommu/dma: add support for non-strict mode Zhen Lei
@ 2018-08-06 12:27 ` Zhen Lei
2018-08-09 10:54 ` Robin Murphy
2018-08-06 12:27 ` [PATCH v4 4/5] iommu/arm-smmu-v3: " Zhen Lei
2018-08-06 12:27 ` [PATCH v4 5/5] iommu/arm-smmu-v3: add bootup option "arm_iommu" Zhen Lei
4 siblings, 1 reply; 15+ messages in thread
From: Zhen Lei @ 2018-08-06 12:27 UTC (permalink / raw)
To: Robin Murphy, Will Deacon, Joerg Roedel, linux-arm-kernel, iommu,
linux-kernel
Cc: Zhen Lei, LinuxArm, Hanjun Guo, Libin
To support non-strict mode, only perform TLBI and sync operations in
strict mode. For the non-leaf case, however, always follow strict mode.
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
drivers/iommu/io-pgtable-arm.c | 27 ++++++++++++++++++---------
drivers/iommu/io-pgtable.h | 3 +++
2 files changed, 21 insertions(+), 9 deletions(-)
diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 010a254..bb61bef 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -292,7 +292,7 @@ static void __arm_lpae_set_pte(arm_lpae_iopte *ptep, arm_lpae_iopte pte,
static size_t __arm_lpae_unmap(struct arm_lpae_io_pgtable *data,
unsigned long iova, size_t size, int lvl,
- arm_lpae_iopte *ptep);
+ arm_lpae_iopte *ptep, bool strict);
static void __arm_lpae_init_pte(struct arm_lpae_io_pgtable *data,
phys_addr_t paddr, arm_lpae_iopte prot,
@@ -319,6 +319,7 @@ static int arm_lpae_init_pte(struct arm_lpae_io_pgtable *data,
arm_lpae_iopte prot, int lvl,
arm_lpae_iopte *ptep)
{
+ size_t unmapped;
arm_lpae_iopte pte = *ptep;
if (iopte_leaf(pte, lvl)) {
@@ -334,7 +335,8 @@ static int arm_lpae_init_pte(struct arm_lpae_io_pgtable *data,
size_t sz = ARM_LPAE_BLOCK_SIZE(lvl, data);
tblp = ptep - ARM_LPAE_LVL_IDX(iova, lvl, data);
- if (WARN_ON(__arm_lpae_unmap(data, iova, sz, lvl, tblp) != sz))
+ unmapped = __arm_lpae_unmap(data, iova, sz, lvl, tblp, true);
+ if (WARN_ON(unmapped != sz))
return -EINVAL;
}
@@ -576,15 +578,17 @@ static size_t arm_lpae_split_blk_unmap(struct arm_lpae_io_pgtable *data,
}
if (unmap_idx < 0)
- return __arm_lpae_unmap(data, iova, size, lvl, tablep);
+ return __arm_lpae_unmap(data, iova, size, lvl, tablep, true);
io_pgtable_tlb_add_flush(&data->iop, iova, size, size, true);
+ io_pgtable_tlb_sync(&data->iop);
+
return size;
}
static size_t __arm_lpae_unmap(struct arm_lpae_io_pgtable *data,
unsigned long iova, size_t size, int lvl,
- arm_lpae_iopte *ptep)
+ arm_lpae_iopte *ptep, bool strict)
{
arm_lpae_iopte pte;
struct io_pgtable *iop = &data->iop;
@@ -609,7 +613,7 @@ static size_t __arm_lpae_unmap(struct arm_lpae_io_pgtable *data,
io_pgtable_tlb_sync(iop);
ptep = iopte_deref(pte, data);
__arm_lpae_free_pgtable(data, lvl + 1, ptep);
- } else {
+ } else if (strict) {
io_pgtable_tlb_add_flush(iop, iova, size, size, true);
}
@@ -625,12 +629,13 @@ static size_t __arm_lpae_unmap(struct arm_lpae_io_pgtable *data,
/* Keep on walkin' */
ptep = iopte_deref(pte, data);
- return __arm_lpae_unmap(data, iova, size, lvl + 1, ptep);
+ return __arm_lpae_unmap(data, iova, size, lvl + 1, ptep, strict);
}
static size_t arm_lpae_unmap(struct io_pgtable_ops *ops, unsigned long iova,
size_t size)
{
+ bool strict;
struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops);
arm_lpae_iopte *ptep = data->pgd;
int lvl = ARM_LPAE_START_LVL(data);
@@ -638,7 +643,9 @@ static size_t arm_lpae_unmap(struct io_pgtable_ops *ops, unsigned long iova,
if (WARN_ON(iova >= (1ULL << data->iop.cfg.ias)))
return 0;
- return __arm_lpae_unmap(data, iova, size, lvl, ptep);
+ strict = !(data->iop.cfg.quirks & IO_PGTABLE_QUIRK_NON_STRICT);
+
+ return __arm_lpae_unmap(data, iova, size, lvl, ptep, strict);
}
static phys_addr_t arm_lpae_iova_to_phys(struct io_pgtable_ops *ops,
@@ -771,7 +778,8 @@ static void arm_lpae_restrict_pgsizes(struct io_pgtable_cfg *cfg)
u64 reg;
struct arm_lpae_io_pgtable *data;
- if (cfg->quirks & ~(IO_PGTABLE_QUIRK_ARM_NS | IO_PGTABLE_QUIRK_NO_DMA))
+ if (cfg->quirks & ~(IO_PGTABLE_QUIRK_ARM_NS | IO_PGTABLE_QUIRK_NO_DMA |
+ IO_PGTABLE_QUIRK_NON_STRICT))
return NULL;
data = arm_lpae_alloc_pgtable(cfg);
@@ -863,7 +871,8 @@ static void arm_lpae_restrict_pgsizes(struct io_pgtable_cfg *cfg)
struct arm_lpae_io_pgtable *data;
/* The NS quirk doesn't apply at stage 2 */
- if (cfg->quirks & ~IO_PGTABLE_QUIRK_NO_DMA)
+ if (cfg->quirks & ~(IO_PGTABLE_QUIRK_NO_DMA |
+ IO_PGTABLE_QUIRK_NON_STRICT))
return NULL;
data = arm_lpae_alloc_pgtable(cfg);
diff --git a/drivers/iommu/io-pgtable.h b/drivers/iommu/io-pgtable.h
index 2df7909..beb14a3 100644
--- a/drivers/iommu/io-pgtable.h
+++ b/drivers/iommu/io-pgtable.h
@@ -71,12 +71,15 @@ struct io_pgtable_cfg {
* be accessed by a fully cache-coherent IOMMU or CPU (e.g. for a
* software-emulated IOMMU), such that pagetable updates need not
* be treated as explicit DMA data.
+ * IO_PGTABLE_QUIRK_NON_STRICT: Put off TLBs invalidation and release
+ * memory first.
*/
#define IO_PGTABLE_QUIRK_ARM_NS BIT(0)
#define IO_PGTABLE_QUIRK_NO_PERMS BIT(1)
#define IO_PGTABLE_QUIRK_TLBI_ON_MAP BIT(2)
#define IO_PGTABLE_QUIRK_ARM_MTK_4GB BIT(3)
#define IO_PGTABLE_QUIRK_NO_DMA BIT(4)
+ #define IO_PGTABLE_QUIRK_NON_STRICT BIT(5)
unsigned long quirks;
unsigned long pgsize_bitmap;
unsigned int ias;
--
1.8.3
^ permalink raw reply related [flat|nested] 15+ messages in thread
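The control flow the patch above threads through __arm_lpae_unmap can be reduced to one predicate: a leaf unmap flushes only in strict mode, while a non-leaf (table) unmap always flushes, regardless of the quirk. A minimal sketch, with hypothetical names standing in for the io-pgtable machinery:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the strict/non-strict unmap decision. The quirk bit
 * mirrors IO_PGTABLE_QUIRK_NON_STRICT but this is not kernel code. */
#define TOY_QUIRK_NON_STRICT (1UL << 5)

struct toy_pgtable {
	unsigned long quirks;
	int flushes; /* immediate TLB flushes issued by unmap */
};

static void toy_unmap(struct toy_pgtable *iop, bool leaf)
{
	bool strict = !(iop->quirks & TOY_QUIRK_NON_STRICT);

	/* Non-leaf unmaps free page-table memory, so they must always
	 * flush immediately; leaf flushes may be deferred in non-strict
	 * mode (handled later by flush_iotlb_all). */
	if (!leaf || strict)
		iop->flushes++;
}
```

This also shows why Robin's review suggestion works: since the quirk is the only input to the decision, testing iop->quirks at the flush site is equivalent to passing a "strict" flag down the recursion.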
* Re: [PATCH v4 3/5] iommu/io-pgtable-arm: add support for non-strict mode
2018-08-06 12:27 ` [PATCH v4 3/5] iommu/io-pgtable-arm: " Zhen Lei
@ 2018-08-09 10:54 ` Robin Murphy
2018-08-09 11:20 ` Leizhen (ThunderTown)
0 siblings, 1 reply; 15+ messages in thread
From: Robin Murphy @ 2018-08-09 10:54 UTC (permalink / raw)
To: Zhen Lei, Will Deacon, Joerg Roedel, linux-arm-kernel, iommu,
linux-kernel
Cc: LinuxArm, Hanjun Guo, Libin
On 06/08/18 13:27, Zhen Lei wrote:
> To support non-strict mode, only perform TLBI and sync operations in
> strict mode. For the non-leaf case, however, always follow strict mode.
>
> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
> ---
> drivers/iommu/io-pgtable-arm.c | 27 ++++++++++++++++++---------
> drivers/iommu/io-pgtable.h | 3 +++
> 2 files changed, 21 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> index 010a254..bb61bef 100644
> --- a/drivers/iommu/io-pgtable-arm.c
> +++ b/drivers/iommu/io-pgtable-arm.c
> @@ -292,7 +292,7 @@ static void __arm_lpae_set_pte(arm_lpae_iopte *ptep, arm_lpae_iopte pte,
>
> static size_t __arm_lpae_unmap(struct arm_lpae_io_pgtable *data,
> unsigned long iova, size_t size, int lvl,
> - arm_lpae_iopte *ptep);
> + arm_lpae_iopte *ptep, bool strict);
>
> static void __arm_lpae_init_pte(struct arm_lpae_io_pgtable *data,
> phys_addr_t paddr, arm_lpae_iopte prot,
> @@ -319,6 +319,7 @@ static int arm_lpae_init_pte(struct arm_lpae_io_pgtable *data,
> arm_lpae_iopte prot, int lvl,
> arm_lpae_iopte *ptep)
> {
> + size_t unmapped;
> arm_lpae_iopte pte = *ptep;
>
> if (iopte_leaf(pte, lvl)) {
> @@ -334,7 +335,8 @@ static int arm_lpae_init_pte(struct arm_lpae_io_pgtable *data,
> size_t sz = ARM_LPAE_BLOCK_SIZE(lvl, data);
>
> tblp = ptep - ARM_LPAE_LVL_IDX(iova, lvl, data);
> - if (WARN_ON(__arm_lpae_unmap(data, iova, sz, lvl, tblp) != sz))
> + unmapped = __arm_lpae_unmap(data, iova, sz, lvl, tblp, true);
> + if (WARN_ON(unmapped != sz))
What's the extra local variable for?
> return -EINVAL;
> }
>
> @@ -576,15 +578,17 @@ static size_t arm_lpae_split_blk_unmap(struct arm_lpae_io_pgtable *data,
> }
>
> if (unmap_idx < 0)
> - return __arm_lpae_unmap(data, iova, size, lvl, tablep);
> + return __arm_lpae_unmap(data, iova, size, lvl, tablep, true);
>
> io_pgtable_tlb_add_flush(&data->iop, iova, size, size, true);
> + io_pgtable_tlb_sync(&data->iop);
> +
> return size;
> }
>
> static size_t __arm_lpae_unmap(struct arm_lpae_io_pgtable *data,
> unsigned long iova, size_t size, int lvl,
> - arm_lpae_iopte *ptep)
> + arm_lpae_iopte *ptep, bool strict)
> {
> arm_lpae_iopte pte;
> struct io_pgtable *iop = &data->iop;
> @@ -609,7 +613,7 @@ static size_t __arm_lpae_unmap(struct arm_lpae_io_pgtable *data,
> io_pgtable_tlb_sync(iop);
> ptep = iopte_deref(pte, data);
> __arm_lpae_free_pgtable(data, lvl + 1, ptep);
> - } else {
> + } else if (strict) {
Since this is the only place we ever actually evaluate "strict", can't
we just test iop->cfg.quirks directly at this point instead of playing
pass-the-parcel with the extra argument?
Robin.
> io_pgtable_tlb_add_flush(iop, iova, size, size, true);
> }
>
> @@ -625,12 +629,13 @@ static size_t __arm_lpae_unmap(struct arm_lpae_io_pgtable *data,
>
> /* Keep on walkin' */
> ptep = iopte_deref(pte, data);
> - return __arm_lpae_unmap(data, iova, size, lvl + 1, ptep);
> + return __arm_lpae_unmap(data, iova, size, lvl + 1, ptep, strict);
> }
>
> static size_t arm_lpae_unmap(struct io_pgtable_ops *ops, unsigned long iova,
> size_t size)
> {
> + bool strict;
> struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops);
> arm_lpae_iopte *ptep = data->pgd;
> int lvl = ARM_LPAE_START_LVL(data);
> @@ -638,7 +643,9 @@ static size_t arm_lpae_unmap(struct io_pgtable_ops *ops, unsigned long iova,
> if (WARN_ON(iova >= (1ULL << data->iop.cfg.ias)))
> return 0;
>
> - return __arm_lpae_unmap(data, iova, size, lvl, ptep);
> + strict = !(data->iop.cfg.quirks & IO_PGTABLE_QUIRK_NON_STRICT);
> +
> + return __arm_lpae_unmap(data, iova, size, lvl, ptep, strict);
> }
>
> static phys_addr_t arm_lpae_iova_to_phys(struct io_pgtable_ops *ops,
> @@ -771,7 +778,8 @@ static void arm_lpae_restrict_pgsizes(struct io_pgtable_cfg *cfg)
> u64 reg;
> struct arm_lpae_io_pgtable *data;
>
> - if (cfg->quirks & ~(IO_PGTABLE_QUIRK_ARM_NS | IO_PGTABLE_QUIRK_NO_DMA))
> + if (cfg->quirks & ~(IO_PGTABLE_QUIRK_ARM_NS | IO_PGTABLE_QUIRK_NO_DMA |
> + IO_PGTABLE_QUIRK_NON_STRICT))
> return NULL;
>
> data = arm_lpae_alloc_pgtable(cfg);
> @@ -863,7 +871,8 @@ static void arm_lpae_restrict_pgsizes(struct io_pgtable_cfg *cfg)
> struct arm_lpae_io_pgtable *data;
>
> /* The NS quirk doesn't apply at stage 2 */
> - if (cfg->quirks & ~IO_PGTABLE_QUIRK_NO_DMA)
> + if (cfg->quirks & ~(IO_PGTABLE_QUIRK_NO_DMA |
> + IO_PGTABLE_QUIRK_NON_STRICT))
> return NULL;
>
> data = arm_lpae_alloc_pgtable(cfg);
> diff --git a/drivers/iommu/io-pgtable.h b/drivers/iommu/io-pgtable.h
> index 2df7909..beb14a3 100644
> --- a/drivers/iommu/io-pgtable.h
> +++ b/drivers/iommu/io-pgtable.h
> @@ -71,12 +71,15 @@ struct io_pgtable_cfg {
> * be accessed by a fully cache-coherent IOMMU or CPU (e.g. for a
> * software-emulated IOMMU), such that pagetable updates need not
> * be treated as explicit DMA data.
> + * IO_PGTABLE_QUIRK_NON_STRICT: Put off TLBs invalidation and release
> + * memory first.
> */
> #define IO_PGTABLE_QUIRK_ARM_NS BIT(0)
> #define IO_PGTABLE_QUIRK_NO_PERMS BIT(1)
> #define IO_PGTABLE_QUIRK_TLBI_ON_MAP BIT(2)
> #define IO_PGTABLE_QUIRK_ARM_MTK_4GB BIT(3)
> #define IO_PGTABLE_QUIRK_NO_DMA BIT(4)
> + #define IO_PGTABLE_QUIRK_NON_STRICT BIT(5)
> unsigned long quirks;
> unsigned long pgsize_bitmap;
> unsigned int ias;
> --
> 1.8.3
>
>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v4 3/5] iommu/io-pgtable-arm: add support for non-strict mode
2018-08-09 10:54 ` Robin Murphy
@ 2018-08-09 11:20 ` Leizhen (ThunderTown)
0 siblings, 0 replies; 15+ messages in thread
From: Leizhen (ThunderTown) @ 2018-08-09 11:20 UTC (permalink / raw)
To: Robin Murphy, Will Deacon, Joerg Roedel, linux-arm-kernel, iommu,
linux-kernel
Cc: LinuxArm, Hanjun Guo, Libin
On 2018/8/9 18:54, Robin Murphy wrote:
> On 06/08/18 13:27, Zhen Lei wrote:
>> To support non-strict mode, only perform TLBI and sync operations in
>> strict mode. For the non-leaf case, however, always follow strict mode.
>>
>> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
>> ---
>> drivers/iommu/io-pgtable-arm.c | 27 ++++++++++++++++++---------
>> drivers/iommu/io-pgtable.h | 3 +++
>> 2 files changed, 21 insertions(+), 9 deletions(-)
>>
>> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
>> index 010a254..bb61bef 100644
>> --- a/drivers/iommu/io-pgtable-arm.c
>> +++ b/drivers/iommu/io-pgtable-arm.c
>> @@ -292,7 +292,7 @@ static void __arm_lpae_set_pte(arm_lpae_iopte *ptep, arm_lpae_iopte pte,
>>
>> static size_t __arm_lpae_unmap(struct arm_lpae_io_pgtable *data,
>> unsigned long iova, size_t size, int lvl,
>> - arm_lpae_iopte *ptep);
>> + arm_lpae_iopte *ptep, bool strict);
>>
>> static void __arm_lpae_init_pte(struct arm_lpae_io_pgtable *data,
>> phys_addr_t paddr, arm_lpae_iopte prot,
>> @@ -319,6 +319,7 @@ static int arm_lpae_init_pte(struct arm_lpae_io_pgtable *data,
>> arm_lpae_iopte prot, int lvl,
>> arm_lpae_iopte *ptep)
>> {
>> + size_t unmapped;
>> arm_lpae_iopte pte = *ptep;
>>
>> if (iopte_leaf(pte, lvl)) {
>> @@ -334,7 +335,8 @@ static int arm_lpae_init_pte(struct arm_lpae_io_pgtable *data,
>> size_t sz = ARM_LPAE_BLOCK_SIZE(lvl, data);
>>
>> tblp = ptep - ARM_LPAE_LVL_IDX(iova, lvl, data);
>> - if (WARN_ON(__arm_lpae_unmap(data, iova, sz, lvl, tblp) != sz))
>> + unmapped = __arm_lpae_unmap(data, iova, sz, lvl, tblp, true);
>> + if (WARN_ON(unmapped != sz))
>
> What's the extra local variable for?
To avoid the checkpatch warning about lines longer than 80 characters.
>
>> return -EINVAL;
>> }
>>
>> @@ -576,15 +578,17 @@ static size_t arm_lpae_split_blk_unmap(struct arm_lpae_io_pgtable *data,
>> }
>>
>> if (unmap_idx < 0)
>> - return __arm_lpae_unmap(data, iova, size, lvl, tablep);
>> + return __arm_lpae_unmap(data, iova, size, lvl, tablep, true);
>>
>> io_pgtable_tlb_add_flush(&data->iop, iova, size, size, true);
>> + io_pgtable_tlb_sync(&data->iop);
>> +
>> return size;
>> }
>>
>> static size_t __arm_lpae_unmap(struct arm_lpae_io_pgtable *data,
>> unsigned long iova, size_t size, int lvl,
>> - arm_lpae_iopte *ptep)
>> + arm_lpae_iopte *ptep, bool strict)
>> {
>> arm_lpae_iopte pte;
>> struct io_pgtable *iop = &data->iop;
>> @@ -609,7 +613,7 @@ static size_t __arm_lpae_unmap(struct arm_lpae_io_pgtable *data,
>> io_pgtable_tlb_sync(iop);
>> ptep = iopte_deref(pte, data);
>> __arm_lpae_free_pgtable(data, lvl + 1, ptep);
>> - } else {
>> + } else if (strict) {
>
> Since this is the only place we ever actually evaluate "strict", can't we just test iop->cfg.quirks directly at this point instead of playing pass-the-parcel with the extra argument?
Wonderful, you're right!
>
> Robin.
>
>> io_pgtable_tlb_add_flush(iop, iova, size, size, true);
>> }
>>
>> @@ -625,12 +629,13 @@ static size_t __arm_lpae_unmap(struct arm_lpae_io_pgtable *data,
>>
>> /* Keep on walkin' */
>> ptep = iopte_deref(pte, data);
>> - return __arm_lpae_unmap(data, iova, size, lvl + 1, ptep);
>> + return __arm_lpae_unmap(data, iova, size, lvl + 1, ptep, strict);
>> }
>>
>> static size_t arm_lpae_unmap(struct io_pgtable_ops *ops, unsigned long iova,
>> size_t size)
>> {
>> + bool strict;
>> struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops);
>> arm_lpae_iopte *ptep = data->pgd;
>> int lvl = ARM_LPAE_START_LVL(data);
>> @@ -638,7 +643,9 @@ static size_t arm_lpae_unmap(struct io_pgtable_ops *ops, unsigned long iova,
>> if (WARN_ON(iova >= (1ULL << data->iop.cfg.ias)))
>> return 0;
>>
>> - return __arm_lpae_unmap(data, iova, size, lvl, ptep);
>> + strict = !(data->iop.cfg.quirks & IO_PGTABLE_QUIRK_NON_STRICT);
>> +
>> + return __arm_lpae_unmap(data, iova, size, lvl, ptep, strict);
>> }
>>
>> static phys_addr_t arm_lpae_iova_to_phys(struct io_pgtable_ops *ops,
>> @@ -771,7 +778,8 @@ static void arm_lpae_restrict_pgsizes(struct io_pgtable_cfg *cfg)
>> u64 reg;
>> struct arm_lpae_io_pgtable *data;
>>
>> - if (cfg->quirks & ~(IO_PGTABLE_QUIRK_ARM_NS | IO_PGTABLE_QUIRK_NO_DMA))
>> + if (cfg->quirks & ~(IO_PGTABLE_QUIRK_ARM_NS | IO_PGTABLE_QUIRK_NO_DMA |
>> + IO_PGTABLE_QUIRK_NON_STRICT))
>> return NULL;
>>
>> data = arm_lpae_alloc_pgtable(cfg);
>> @@ -863,7 +871,8 @@ static void arm_lpae_restrict_pgsizes(struct io_pgtable_cfg *cfg)
>> struct arm_lpae_io_pgtable *data;
>>
>> /* The NS quirk doesn't apply at stage 2 */
>> - if (cfg->quirks & ~IO_PGTABLE_QUIRK_NO_DMA)
>> + if (cfg->quirks & ~(IO_PGTABLE_QUIRK_NO_DMA |
>> + IO_PGTABLE_QUIRK_NON_STRICT))
>> return NULL;
>>
>> data = arm_lpae_alloc_pgtable(cfg);
>> diff --git a/drivers/iommu/io-pgtable.h b/drivers/iommu/io-pgtable.h
>> index 2df7909..beb14a3 100644
>> --- a/drivers/iommu/io-pgtable.h
>> +++ b/drivers/iommu/io-pgtable.h
>> @@ -71,12 +71,15 @@ struct io_pgtable_cfg {
>> * be accessed by a fully cache-coherent IOMMU or CPU (e.g. for a
>> * software-emulated IOMMU), such that pagetable updates need not
>> * be treated as explicit DMA data.
>> + * IO_PGTABLE_QUIRK_NON_STRICT: Defer TLB invalidation and release the
>> + * memory first.
>> */
>> #define IO_PGTABLE_QUIRK_ARM_NS BIT(0)
>> #define IO_PGTABLE_QUIRK_NO_PERMS BIT(1)
>> #define IO_PGTABLE_QUIRK_TLBI_ON_MAP BIT(2)
>> #define IO_PGTABLE_QUIRK_ARM_MTK_4GB BIT(3)
>> #define IO_PGTABLE_QUIRK_NO_DMA BIT(4)
>> + #define IO_PGTABLE_QUIRK_NON_STRICT BIT(5)
>> unsigned long quirks;
>> unsigned long pgsize_bitmap;
>> unsigned int ias;
>> --
>> 1.8.3
>>
>>
>
> .
>
--
Thanks!
Best Regards
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH v4 4/5] iommu/arm-smmu-v3: add support for non-strict mode
2018-08-06 12:26 [PATCH v4 0/5] add non-strict mode support for arm-smmu-v3 Zhen Lei
` (2 preceding siblings ...)
2018-08-06 12:27 ` [PATCH v4 3/5] iommu/io-pgtable-arm: " Zhen Lei
@ 2018-08-06 12:27 ` Zhen Lei
2018-08-09 11:06 ` Robin Murphy
2018-08-06 12:27 ` [PATCH v4 5/5] iommu/arm-smmu-v3: add bootup option "arm_iommu" Zhen Lei
4 siblings, 1 reply; 15+ messages in thread
From: Zhen Lei @ 2018-08-06 12:27 UTC (permalink / raw)
To: Robin Murphy, Will Deacon, Joerg Roedel, linux-arm-kernel, iommu,
linux-kernel
Cc: Zhen Lei, LinuxArm, Hanjun Guo, Libin
Dynamically choose strict or non-strict mode for page table config based
on the iommu domain type.
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
drivers/iommu/arm-smmu-v3.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 2f1304b..904bc1e 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -1622,6 +1622,11 @@ static int arm_smmu_domain_finalise(struct iommu_domain *domain)
if (smmu->features & ARM_SMMU_FEAT_COHERENCY)
pgtbl_cfg.quirks = IO_PGTABLE_QUIRK_NO_DMA;
+ if (domain->type == IOMMU_DOMAIN_DMA) {
+ domain->non_strict = 1;
+ pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_NON_STRICT;
+ }
+
pgtbl_ops = alloc_io_pgtable_ops(fmt, &pgtbl_cfg, smmu_domain);
if (!pgtbl_ops)
return -ENOMEM;
@@ -1782,7 +1787,7 @@ static void arm_smmu_iotlb_sync(struct iommu_domain *domain)
{
struct arm_smmu_device *smmu = to_smmu_domain(domain)->smmu;
- if (smmu)
+ if (smmu && !domain->non_strict)
__arm_smmu_tlb_sync(smmu);
}
--
1.8.3
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH v4 4/5] iommu/arm-smmu-v3: add support for non-strict mode
2018-08-06 12:27 ` [PATCH v4 4/5] iommu/arm-smmu-v3: " Zhen Lei
@ 2018-08-09 11:06 ` Robin Murphy
2018-08-14 1:49 ` Leizhen (ThunderTown)
0 siblings, 1 reply; 15+ messages in thread
From: Robin Murphy @ 2018-08-09 11:06 UTC (permalink / raw)
To: Zhen Lei, Will Deacon, Joerg Roedel, linux-arm-kernel, iommu,
linux-kernel
Cc: LinuxArm, Hanjun Guo, Libin
On 06/08/18 13:27, Zhen Lei wrote:
> Dynamically choose strict or non-strict mode for page table config based
> on the iommu domain type.
>
> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
> ---
> drivers/iommu/arm-smmu-v3.c | 7 ++++++-
> 1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
> index 2f1304b..904bc1e 100644
> --- a/drivers/iommu/arm-smmu-v3.c
> +++ b/drivers/iommu/arm-smmu-v3.c
> @@ -1622,6 +1622,11 @@ static int arm_smmu_domain_finalise(struct iommu_domain *domain)
> if (smmu->features & ARM_SMMU_FEAT_COHERENCY)
> pgtbl_cfg.quirks = IO_PGTABLE_QUIRK_NO_DMA;
>
> + if (domain->type == IOMMU_DOMAIN_DMA) {
> + domain->non_strict = 1;
> + pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_NON_STRICT;
> + }
> +
> pgtbl_ops = alloc_io_pgtable_ops(fmt, &pgtbl_cfg, smmu_domain);
> if (!pgtbl_ops)
> return -ENOMEM;
> @@ -1782,7 +1787,7 @@ static void arm_smmu_iotlb_sync(struct iommu_domain *domain)
> {
> struct arm_smmu_device *smmu = to_smmu_domain(domain)->smmu;
>
> - if (smmu)
> + if (smmu && !domain->non_strict)
That doesn't smell right - even in non-strict domains we still need
stuff like walk cache invalidations for non-leaf unmaps to be
synchronous, so we can't just ignore all sync operations at the driver
level. I think the right thing to do to elide the "normal" sync on unmap
is to first convert __iommu_dma_unmap to use
iommu_unmap_fast()/iommu_tlb_sync(), then make it not issue that sync at
all for non-strict domains.
Robin.
> __arm_smmu_tlb_sync(smmu);
> }
>
> --
> 1.8.3
>
>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v4 4/5] iommu/arm-smmu-v3: add support for non-strict mode
2018-08-09 11:06 ` Robin Murphy
@ 2018-08-14 1:49 ` Leizhen (ThunderTown)
0 siblings, 0 replies; 15+ messages in thread
From: Leizhen (ThunderTown) @ 2018-08-14 1:49 UTC (permalink / raw)
To: Robin Murphy, Will Deacon, Joerg Roedel, linux-arm-kernel, iommu,
linux-kernel
Cc: LinuxArm, Hanjun Guo, Libin
On 2018/8/9 19:06, Robin Murphy wrote:
> On 06/08/18 13:27, Zhen Lei wrote:
>> Dynamically choose strict or non-strict mode for page table config based
>> on the iommu domain type.
>>
>> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
>> ---
>> drivers/iommu/arm-smmu-v3.c | 7 ++++++-
>> 1 file changed, 6 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
>> index 2f1304b..904bc1e 100644
>> --- a/drivers/iommu/arm-smmu-v3.c
>> +++ b/drivers/iommu/arm-smmu-v3.c
>> @@ -1622,6 +1622,11 @@ static int arm_smmu_domain_finalise(struct iommu_domain *domain)
>> if (smmu->features & ARM_SMMU_FEAT_COHERENCY)
>> pgtbl_cfg.quirks = IO_PGTABLE_QUIRK_NO_DMA;
>>
>> + if (domain->type == IOMMU_DOMAIN_DMA) {
>> + domain->non_strict = 1;
>> + pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_NON_STRICT;
>> + }
>> +
>> pgtbl_ops = alloc_io_pgtable_ops(fmt, &pgtbl_cfg, smmu_domain);
>> if (!pgtbl_ops)
>> return -ENOMEM;
>> @@ -1782,7 +1787,7 @@ static void arm_smmu_iotlb_sync(struct iommu_domain *domain)
>> {
>> struct arm_smmu_device *smmu = to_smmu_domain(domain)->smmu;
>>
>> - if (smmu)
>> + if (smmu && !domain->non_strict)
>
> That doesn't smell right - even in non-strict domains we still need stuff like walk cache invalidations for non-leaf unmaps to be synchronous, so we can't just ignore all sync operations at the driver level. I think the right thing to do to elide the "normal" sync on unmap is to first convert __iommu_dma_unmap to use iommu_unmap_fast()/iommu_tlb_sync(), then make it not issue that sync at all for non-strict domains.
OK, I will try it.
>
> Robin.
>
>> __arm_smmu_tlb_sync(smmu);
>> }
>>
>> --
>> 1.8.3
>>
>>
>
> .
>
--
Thanks!
Best Regards
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH v4 5/5] iommu/arm-smmu-v3: add bootup option "arm_iommu"
2018-08-06 12:26 [PATCH v4 0/5] add non-strict mode support for arm-smmu-v3 Zhen Lei
` (3 preceding siblings ...)
2018-08-06 12:27 ` [PATCH v4 4/5] iommu/arm-smmu-v3: " Zhen Lei
@ 2018-08-06 12:27 ` Zhen Lei
2018-08-09 11:08 ` Robin Murphy
4 siblings, 1 reply; 15+ messages in thread
From: Zhen Lei @ 2018-08-06 12:27 UTC (permalink / raw)
To: Robin Murphy, Will Deacon, Joerg Roedel, linux-arm-kernel, iommu,
linux-kernel
Cc: Zhen Lei, LinuxArm, Hanjun Guo, Libin
Add a boot option so that the system administrator can choose which mode
to use. The default mode is strict.
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
Documentation/admin-guide/kernel-parameters.txt | 9 +++++++++
drivers/iommu/arm-smmu-v3.c | 17 ++++++++++++++++-
2 files changed, 25 insertions(+), 1 deletion(-)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 533ff5c..426e989 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1720,6 +1720,15 @@
nobypass [PPC/POWERNV]
Disable IOMMU bypass, using IOMMU for PCI devices.
+ arm_iommu= [ARM64]
+ non-strict [Default Off]
+ Defer TLB invalidation and release the memory first.
+ This is good for scatter-gather performance but weakens
+ isolation: an untrusted device may access reused memory
+ because stale TLB entries can still be valid. Consider
+ this trade-off carefully before choosing this mode.
+ Note that VFIO will always use strict mode.
+
iommu.passthrough=
[ARM64] Configure DMA to bypass the IOMMU by default.
Format: { "0" | "1" }
diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 904bc1e..9a30892 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -631,6 +631,21 @@ struct arm_smmu_option_prop {
{ 0, NULL},
};
+static u32 smmu_non_strict __read_mostly;
+
+static int __init arm_smmu_setup(char *str)
+{
+ if (!strncmp(str, "non-strict", 10)) {
+ smmu_non_strict = 1;
+ pr_warn("WARNING: iommu non-strict mode is chosen.\n"
+ "It's good for scatter-gather performance but lacks full isolation\n");
+ add_taint(TAINT_WARN, LOCKDEP_STILL_OK);
+ }
+
+ return 0;
+}
+early_param("arm_iommu", arm_smmu_setup);
+
static inline void __iomem *arm_smmu_page1_fixup(unsigned long offset,
struct arm_smmu_device *smmu)
{
@@ -1622,7 +1637,7 @@ static int arm_smmu_domain_finalise(struct iommu_domain *domain)
if (smmu->features & ARM_SMMU_FEAT_COHERENCY)
pgtbl_cfg.quirks = IO_PGTABLE_QUIRK_NO_DMA;
- if (domain->type == IOMMU_DOMAIN_DMA) {
+ if ((domain->type == IOMMU_DOMAIN_DMA) && smmu_non_strict) {
domain->non_strict = 1;
pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_NON_STRICT;
}
--
1.8.3
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH v4 5/5] iommu/arm-smmu-v3: add bootup option "arm_iommu"
2018-08-06 12:27 ` [PATCH v4 5/5] iommu/arm-smmu-v3: add bootup option "arm_iommu" Zhen Lei
@ 2018-08-09 11:08 ` Robin Murphy
2018-08-13 7:50 ` Leizhen (ThunderTown)
0 siblings, 1 reply; 15+ messages in thread
From: Robin Murphy @ 2018-08-09 11:08 UTC (permalink / raw)
To: Zhen Lei, Will Deacon, Joerg Roedel, linux-arm-kernel, iommu,
linux-kernel
Cc: LinuxArm, Hanjun Guo, Libin
On 06/08/18 13:27, Zhen Lei wrote:
> Add a boot option so that the system administrator can choose which mode
> to use. The default mode is strict.
>
> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
> ---
> Documentation/admin-guide/kernel-parameters.txt | 9 +++++++++
> drivers/iommu/arm-smmu-v3.c | 17 ++++++++++++++++-
> 2 files changed, 25 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index 533ff5c..426e989 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -1720,6 +1720,15 @@
> nobypass [PPC/POWERNV]
> Disable IOMMU bypass, using IOMMU for PCI devices.
>
> + arm_iommu= [ARM64]
> + non-strict [Default Off]
Again, I'd much rather have "iommu.non_strict= { "0" | "1" }" in line
with the passthrough option.
Robin.
> + Defer TLB invalidation and release the memory first.
> + This is good for scatter-gather performance but weakens
> + isolation: an untrusted device may access reused memory
> + because stale TLB entries can still be valid. Consider
> + this trade-off carefully before choosing this mode.
> + Note that VFIO will always use strict mode.
> +
> iommu.passthrough=
> [ARM64] Configure DMA to bypass the IOMMU by default.
> Format: { "0" | "1" }
> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
> index 904bc1e..9a30892 100644
> --- a/drivers/iommu/arm-smmu-v3.c
> +++ b/drivers/iommu/arm-smmu-v3.c
> @@ -631,6 +631,21 @@ struct arm_smmu_option_prop {
> { 0, NULL},
> };
>
> +static u32 smmu_non_strict __read_mostly;
> +
> +static int __init arm_smmu_setup(char *str)
> +{
> + if (!strncmp(str, "non-strict", 10)) {
> + smmu_non_strict = 1;
> + pr_warn("WARNING: iommu non-strict mode is chosen.\n"
> + "It's good for scatter-gather performance but lacks full isolation\n");
> + add_taint(TAINT_WARN, LOCKDEP_STILL_OK);
> + }
> +
> + return 0;
> +}
> +early_param("arm_iommu", arm_smmu_setup);
> +
> static inline void __iomem *arm_smmu_page1_fixup(unsigned long offset,
> struct arm_smmu_device *smmu)
> {
> @@ -1622,7 +1637,7 @@ static int arm_smmu_domain_finalise(struct iommu_domain *domain)
> if (smmu->features & ARM_SMMU_FEAT_COHERENCY)
> pgtbl_cfg.quirks = IO_PGTABLE_QUIRK_NO_DMA;
>
> - if (domain->type == IOMMU_DOMAIN_DMA) {
> + if ((domain->type == IOMMU_DOMAIN_DMA) && smmu_non_strict) {
> domain->non_strict = 1;
> pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_NON_STRICT;
> }
> --
> 1.8.3
>
>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v4 5/5] iommu/arm-smmu-v3: add bootup option "arm_iommu"
2018-08-09 11:08 ` Robin Murphy
@ 2018-08-13 7:50 ` Leizhen (ThunderTown)
0 siblings, 0 replies; 15+ messages in thread
From: Leizhen (ThunderTown) @ 2018-08-13 7:50 UTC (permalink / raw)
To: Robin Murphy, Will Deacon, Joerg Roedel, linux-arm-kernel, iommu,
linux-kernel
Cc: LinuxArm, Hanjun Guo, Libin
On 2018/8/9 19:08, Robin Murphy wrote:
> On 06/08/18 13:27, Zhen Lei wrote:
>> Add a boot option so that the system administrator can choose which mode
>> to use. The default mode is strict.
>>
>> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
>> ---
>> Documentation/admin-guide/kernel-parameters.txt | 9 +++++++++
>> drivers/iommu/arm-smmu-v3.c | 17 ++++++++++++++++-
>> 2 files changed, 25 insertions(+), 1 deletion(-)
>>
>> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
>> index 533ff5c..426e989 100644
>> --- a/Documentation/admin-guide/kernel-parameters.txt
>> +++ b/Documentation/admin-guide/kernel-parameters.txt
>> @@ -1720,6 +1720,15 @@
>> nobypass [PPC/POWERNV]
>> Disable IOMMU bypass, using IOMMU for PCI devices.
>>
>> + arm_iommu= [ARM64]
>> + non-strict [Default Off]
>
> Again, I'd much rather have "iommu.non_strict= { "0" | "1" }" in line with the passthrough option.
OK, I will change it in the next version.
>
> Robin.
>
>> + Defer TLB invalidation and release the memory first.
>> + This is good for scatter-gather performance but weakens
>> + isolation: an untrusted device may access reused memory
>> + because stale TLB entries can still be valid. Consider
>> + this trade-off carefully before choosing this mode.
>> + Note that VFIO will always use strict mode.
>> +
>> iommu.passthrough=
>> [ARM64] Configure DMA to bypass the IOMMU by default.
>> Format: { "0" | "1" }
>> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
>> index 904bc1e..9a30892 100644
>> --- a/drivers/iommu/arm-smmu-v3.c
>> +++ b/drivers/iommu/arm-smmu-v3.c
>> @@ -631,6 +631,21 @@ struct arm_smmu_option_prop {
>> { 0, NULL},
>> };
>>
>> +static u32 smmu_non_strict __read_mostly;
>> +
>> +static int __init arm_smmu_setup(char *str)
>> +{
>> + if (!strncmp(str, "non-strict", 10)) {
>> + smmu_non_strict = 1;
>> + pr_warn("WARNING: iommu non-strict mode is chosen.\n"
>> + "It's good for scatter-gather performance but lacks full isolation\n");
>> + add_taint(TAINT_WARN, LOCKDEP_STILL_OK);
>> + }
>> +
>> + return 0;
>> +}
>> +early_param("arm_iommu", arm_smmu_setup);
>> +
>> static inline void __iomem *arm_smmu_page1_fixup(unsigned long offset,
>> struct arm_smmu_device *smmu)
>> {
>> @@ -1622,7 +1637,7 @@ static int arm_smmu_domain_finalise(struct iommu_domain *domain)
>> if (smmu->features & ARM_SMMU_FEAT_COHERENCY)
>> pgtbl_cfg.quirks = IO_PGTABLE_QUIRK_NO_DMA;
>>
>> - if (domain->type == IOMMU_DOMAIN_DMA) {
>> + if ((domain->type == IOMMU_DOMAIN_DMA) && smmu_non_strict) {
>> domain->non_strict = 1;
>> pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_NON_STRICT;
>> }
>> --
>> 1.8.3
>>
>>
>
> .
>
--
Thanks!
Best Regards
^ permalink raw reply [flat|nested] 15+ messages in thread