* [PATCH v2] hw/arm/smmuv3: Simplify range invalidation
@ 2021-08-23  7:50 Liu, Renwei
  2021-08-31 14:46 ` Eric Auger
  0 siblings, 1 reply; 5+ messages in thread
From: Liu, Renwei @ 2021-08-23  7:50 UTC (permalink / raw)
  To: Eric Auger, Peter Maydell
  Cc: qemu-arm, Wen, Jianxian, qemu-devel, Li, Chunming

Simplify range invalidation to avoid iterating over all IOTLB entries
multiple times. For instance, an invalidation pattern such as
"invalidate 32 4kB pages starting from 0xffacd000" needs to iterate over
all IOTLB entries 6 times (num_pages: 1, 2, 16, 8, 4, 1). With the new
implementation it only needs to iterate over them once.
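[Editor's illustration, not part of the original mail: the power-of-2 splitting that this patch removes can be modeled in Python. The sketch below mimics how the existing loop, driven by dma_aligned_pow2_mask(), decomposes the example above into the six sub-invalidations; it is a simplified model, not QEMU code.]

```python
def aligned_pow2_chunks(addr, num_pages, page_shift=12):
    """Split [addr, addr + num_pages * page_size) into naturally aligned
    power-of-2 runs, mimicking the dma_aligned_pow2_mask() loop."""
    end = addr + (num_pages << page_shift) - 1
    chunks = []
    while addr != end + 1:
        remaining = end - addr + 1
        # Largest power-of-2 block that is both naturally aligned at addr
        # and no larger than what is left to invalidate.
        alignment = addr & -addr if addr else remaining
        size = 1 << (remaining.bit_length() - 1)
        block = min(alignment, size)
        chunks.append(block >> page_shift)
        addr += block
    return chunks

# The example from the commit message: 32 4kB pages at 0xffacd000.
print(aligned_pow2_chunks(0xffacd000, 32))  # [1, 2, 16, 8, 4, 1]
```

Each chunk is naturally aligned and a power of 2, so every sub-invalidation yields a well-formed addr_mask for downstream consumers.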

Signed-off-by: Renwei Liu <renwei.liu@verisilicon.com>
---
v2:
 - Remove file mode change.

 hw/arm/smmu-common.c   |  6 +++---
 hw/arm/smmu-internal.h |  2 +-
 hw/arm/smmuv3.c        | 22 ++++------------------
 3 files changed, 8 insertions(+), 22 deletions(-)

diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
index 0459850a93..ccb085f83c 100644
--- a/hw/arm/smmu-common.c
+++ b/hw/arm/smmu-common.c
@@ -142,8 +142,8 @@ static gboolean smmu_hash_remove_by_asid_iova(gpointer key, gpointer value,
     if (info->asid >= 0 && info->asid != SMMU_IOTLB_ASID(iotlb_key)) {
         return false;
     }
-    return ((info->iova & ~entry->addr_mask) == entry->iova) ||
-           ((entry->iova & ~info->mask) == info->iova);
+    return (entry->iova >= info->iova) &&
+           ((entry->iova + entry->addr_mask) < (info->iova + info->range));
 }
 
 inline void
@@ -167,7 +167,7 @@ smmu_iotlb_inv_iova(SMMUState *s, int asid, dma_addr_t iova,
 
     SMMUIOTLBPageInvInfo info = {
         .asid = asid, .iova = iova,
-        .mask = (num_pages * 1 << granule) - 1};
+        .range = num_pages * 1 << granule};
 
     g_hash_table_foreach_remove(s->iotlb,
                                 smmu_hash_remove_by_asid_iova,
diff --git a/hw/arm/smmu-internal.h b/hw/arm/smmu-internal.h
index 2d75b31953..f0e3a777af 100644
--- a/hw/arm/smmu-internal.h
+++ b/hw/arm/smmu-internal.h
@@ -101,7 +101,7 @@ uint64_t iova_level_offset(uint64_t iova, int inputsize,
 typedef struct SMMUIOTLBPageInvInfo {
     int asid;
     uint64_t iova;
-    uint64_t mask;
+    uint64_t range;
 } SMMUIOTLBPageInvInfo;
 
 typedef struct SMMUSIDRange {
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index 01b60bee49..0b009107d1 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -857,7 +857,7 @@ static void smmuv3_inv_notifiers_iova(SMMUState *s, int asid, dma_addr_t iova,
 
 static void smmuv3_s1_range_inval(SMMUState *s, Cmd *cmd)
 {
-    dma_addr_t end, addr = CMD_ADDR(cmd);
+    dma_addr_t addr = CMD_ADDR(cmd);
     uint8_t type = CMD_TYPE(cmd);
     uint16_t vmid = CMD_VMID(cmd);
     uint8_t scale = CMD_SCALE(cmd);
@@ -866,7 +866,6 @@ static void smmuv3_s1_range_inval(SMMUState *s, Cmd *cmd)
     bool leaf = CMD_LEAF(cmd);
     uint8_t tg = CMD_TG(cmd);
     uint64_t num_pages;
-    uint8_t granule;
     int asid = -1;
 
     if (type == SMMU_CMD_TLBI_NH_VA) {
@@ -880,23 +879,10 @@ static void smmuv3_s1_range_inval(SMMUState *s, Cmd *cmd)
         return;
     }
 
-    /* RIL in use */
-
     num_pages = (num + 1) * BIT_ULL(scale);
-    granule = tg * 2 + 10;
-
-    /* Split invalidations into ^2 range invalidations */
-    end = addr + (num_pages << granule) - 1;
-
-    while (addr != end + 1) {
-        uint64_t mask = dma_aligned_pow2_mask(addr, end, 64);
-
-        num_pages = (mask + 1) >> granule;
-        trace_smmuv3_s1_range_inval(vmid, asid, addr, tg, num_pages, ttl, leaf);
-        smmuv3_inv_notifiers_iova(s, asid, addr, tg, num_pages);
-        smmu_iotlb_inv_iova(s, asid, addr, tg, num_pages, ttl);
-        addr += mask + 1;
-    }
+    trace_smmuv3_s1_range_inval(vmid, asid, addr, tg, num_pages, ttl, leaf);
+    smmuv3_inv_notifiers_iova(s, asid, addr, tg, num_pages);
+    smmu_iotlb_inv_iova(s, asid, addr, tg, num_pages, ttl);
 }
 
 static gboolean
-- 
2.32.0




* Re: [PATCH v2] hw/arm/smmuv3: Simplify range invalidation
  2021-08-23  7:50 [PATCH v2] hw/arm/smmuv3: Simplify range invalidation Liu, Renwei
@ 2021-08-31 14:46 ` Eric Auger
  2021-09-01  6:33   ` Liu, Renwei
  0 siblings, 1 reply; 5+ messages in thread
From: Eric Auger @ 2021-08-31 14:46 UTC (permalink / raw)
  To: Liu, Renwei, Peter Maydell
  Cc: qemu-arm, Wen, Jianxian, qemu-devel, Li, Chunming

Hi Liu,

On 8/23/21 9:50 AM, Liu, Renwei wrote:
> Simplify range invalidation which can avoid to iterate over all
> iotlb entries multi-times. For instance invalidations patterns like
> "invalidate 32 4kB pages starting from 0xffacd000" need to iterate over
> all iotlb entries 6 times (num_pages: 1, 2, 16, 8, 4, 1). It only needs
> to iterate over all iotlb entries once with new implementation.

This wouldn't work. It reverts commit
6d9cd115b9df ("hw/arm/smmuv3: Enforce invalidation on a power of two range"),
which is mandated for VFIO and virtio to work. IOTLB invalidations must
be naturally aligned and cover a power-of-2 range, hence this iteration.

Thanks

Eric
>
> Signed-off-by: Renwei Liu <renwei.liu@verisilicon.com>
> ---
> v2:
>  - Remove file mode change.
>
>  hw/arm/smmu-common.c   |  6 +++---
>  hw/arm/smmu-internal.h |  2 +-
>  hw/arm/smmuv3.c        | 22 ++++------------------
>  3 files changed, 8 insertions(+), 22 deletions(-)
>
> diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
> index 0459850a93..ccb085f83c 100644
> --- a/hw/arm/smmu-common.c
> +++ b/hw/arm/smmu-common.c
> @@ -142,8 +142,8 @@ static gboolean smmu_hash_remove_by_asid_iova(gpointer key, gpointer value,
>      if (info->asid >= 0 && info->asid != SMMU_IOTLB_ASID(iotlb_key)) {
>          return false;
>      }
> -    return ((info->iova & ~entry->addr_mask) == entry->iova) ||
> -           ((entry->iova & ~info->mask) == info->iova);
> +    return (entry->iova >= info->iova) &&
> +           ((entry->iova + entry->addr_mask) < (info->iova + info->range));
>  }
>  
>  inline void
> @@ -167,7 +167,7 @@ smmu_iotlb_inv_iova(SMMUState *s, int asid, dma_addr_t iova,
>  
>      SMMUIOTLBPageInvInfo info = {
>          .asid = asid, .iova = iova,
> -        .mask = (num_pages * 1 << granule) - 1};
> +        .range = num_pages * 1 << granule};
>  
>      g_hash_table_foreach_remove(s->iotlb,
>                                  smmu_hash_remove_by_asid_iova,
> diff --git a/hw/arm/smmu-internal.h b/hw/arm/smmu-internal.h
> index 2d75b31953..f0e3a777af 100644
> --- a/hw/arm/smmu-internal.h
> +++ b/hw/arm/smmu-internal.h
> @@ -101,7 +101,7 @@ uint64_t iova_level_offset(uint64_t iova, int inputsize,
>  typedef struct SMMUIOTLBPageInvInfo {
>      int asid;
>      uint64_t iova;
> -    uint64_t mask;
> +    uint64_t range;
>  } SMMUIOTLBPageInvInfo;
>  
>  typedef struct SMMUSIDRange {
> diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
> index 01b60bee49..0b009107d1 100644
> --- a/hw/arm/smmuv3.c
> +++ b/hw/arm/smmuv3.c
> @@ -857,7 +857,7 @@ static void smmuv3_inv_notifiers_iova(SMMUState *s, int asid, dma_addr_t iova,
>  
>  static void smmuv3_s1_range_inval(SMMUState *s, Cmd *cmd)
>  {
> -    dma_addr_t end, addr = CMD_ADDR(cmd);
> +    dma_addr_t addr = CMD_ADDR(cmd);
>      uint8_t type = CMD_TYPE(cmd);
>      uint16_t vmid = CMD_VMID(cmd);
>      uint8_t scale = CMD_SCALE(cmd);
> @@ -866,7 +866,6 @@ static void smmuv3_s1_range_inval(SMMUState *s, Cmd *cmd)
>      bool leaf = CMD_LEAF(cmd);
>      uint8_t tg = CMD_TG(cmd);
>      uint64_t num_pages;
> -    uint8_t granule;
>      int asid = -1;
>  
>      if (type == SMMU_CMD_TLBI_NH_VA) {
> @@ -880,23 +879,10 @@ static void smmuv3_s1_range_inval(SMMUState *s, Cmd *cmd)
>          return;
>      }
>  
> -    /* RIL in use */
> -
>      num_pages = (num + 1) * BIT_ULL(scale);
> -    granule = tg * 2 + 10;
> -
> -    /* Split invalidations into ^2 range invalidations */
> -    end = addr + (num_pages << granule) - 1;
> -
> -    while (addr != end + 1) {
> -        uint64_t mask = dma_aligned_pow2_mask(addr, end, 64);
> -
> -        num_pages = (mask + 1) >> granule;
> -        trace_smmuv3_s1_range_inval(vmid, asid, addr, tg, num_pages, ttl, leaf);
> -        smmuv3_inv_notifiers_iova(s, asid, addr, tg, num_pages);
> -        smmu_iotlb_inv_iova(s, asid, addr, tg, num_pages, ttl);
> -        addr += mask + 1;
> -    }
> +    trace_smmuv3_s1_range_inval(vmid, asid, addr, tg, num_pages, ttl, leaf);
> +    smmuv3_inv_notifiers_iova(s, asid, addr, tg, num_pages);
> +    smmu_iotlb_inv_iova(s, asid, addr, tg, num_pages, ttl);
>  }
>  
>  static gboolean




* RE: [PATCH v2] hw/arm/smmuv3: Simplify range invalidation
  2021-08-31 14:46 ` Eric Auger
@ 2021-09-01  6:33   ` Liu, Renwei
  2021-09-01 13:14     ` Eric Auger
  0 siblings, 1 reply; 5+ messages in thread
From: Liu, Renwei @ 2021-09-01  6:33 UTC (permalink / raw)
  To: eric.auger, Peter Maydell
  Cc: qemu-arm, Wen, Jianxian, qemu-devel, Li, Chunming

> -----Original Message-----
> From: Eric Auger [mailto:eric.auger@redhat.com]
> Sent: Tuesday, August 31, 2021 10:46 PM
> To: Liu, Renwei; Peter Maydell
> Cc: qemu-arm@nongnu.org; qemu-devel@nongnu.org; Li, Chunming; Wen,
> Jianxian
> Subject: Re: [PATCH v2] hw/arm/smmuv3: Simplify range invalidation
> 
> Hi Liu,
> 
> On 8/23/21 9:50 AM, Liu, Renwei wrote:
> > Simplify range invalidation which can avoid to iterate over all
> > iotlb entries multi-times. For instance invalidations patterns like
> > "invalidate 32 4kB pages starting from 0xffacd000" need to iterate
> over
> > all iotlb entries 6 times (num_pages: 1, 2, 16, 8, 4, 1). It only
> needs
> > to iterate over all iotlb entries once with new implementation.
> 
> This wouldn't work. This reverts commit
> 6d9cd115b9df ("hw/arm/smmuv3: Enforce invalidation on a power of two
> range")
> which is mandated for VFIO and virtio to work. IOTLB invalidations must
> be naturally aligned and with a power of 2 range, hence this iteration.
> 
> Thanks
>
> Eric
Hi Eric,

Could you try the patch first? I would like to know whether it fails
in your application scenario with this implementation.
I agree with you that an IOTLB entry must be naturally aligned and
cover a power-of-2 range. But we can invalidate multiple IOTLB entries
in one iteration: we check the overlap between the invalidation range
and each IOTLB entry's range, rather than checking the mask. The final
result is the same as with your implementation (splitting into multiple
power-of-2 ranges).
I wonder why we can't implement it directly, given that the application
can send an invalidation command with a non-power-of-2 range.
We have tested it in our application scenario and have not found any
failure.

In addition, from the code, smmu_iotlb_inv_iova() should be fine.
The other path, smmuv3_inv_notifiers_iova() ->
smmuv3_notify_iova() -> memory_region_notify_iommu_one(),
also checks range overlap, so it should be fine even if the range
is not a power of 2.

Could you take a look at it again?

Thanks
Renwei Liu


* Re: [PATCH v2] hw/arm/smmuv3: Simplify range invalidation
  2021-09-01  6:33   ` Liu, Renwei
@ 2021-09-01 13:14     ` Eric Auger
  2021-09-02  1:47       ` Liu, Renwei
  0 siblings, 1 reply; 5+ messages in thread
From: Eric Auger @ 2021-09-01 13:14 UTC (permalink / raw)
  To: Liu, Renwei, Peter Maydell
  Cc: qemu-arm, Wen, Jianxian, qemu-devel, Li, Chunming

Hi,

On 9/1/21 8:33 AM, Liu, Renwei wrote:
>> -----Original Message-----
>> From: Eric Auger [mailto:eric.auger@redhat.com]
>> Sent: Tuesday, August 31, 2021 10:46 PM
>> To: Liu, Renwei; Peter Maydell
>> Cc: qemu-arm@nongnu.org; qemu-devel@nongnu.org; Li, Chunming; Wen,
>> Jianxian
>> Subject: Re: [PATCH v2] hw/arm/smmuv3: Simplify range invalidation
>>
>> Hi Liu,
>>
>> On 8/23/21 9:50 AM, Liu, Renwei wrote:
>>> Simplify range invalidation which can avoid to iterate over all
>>> iotlb entries multi-times. For instance invalidations patterns like
>>> "invalidate 32 4kB pages starting from 0xffacd000" need to iterate
>> over
>>> all iotlb entries 6 times (num_pages: 1, 2, 16, 8, 4, 1). It only
>> needs
>>> to iterate over all iotlb entries once with new implementation.
>> This wouldn't work. This reverts commit
>> 6d9cd115b9df ("hw/arm/smmuv3: Enforce invalidation on a power of two
>> range")
>> which is mandated for VFIO and virtio to work. IOTLB invalidations must
>> be naturally aligned and with a power of 2 range, hence this iteration.
>>
>> Thanks
>>
>> Eric
> Hi Eric,
>
> Could you try the patch firstly? I want to know whether it's failed
> in your application scenario with this implementation.
There are many test cases: virtio-pci, vhost, VFIO, ...
> I agree with you that IOTLB entry must be naturally aligned and
> with a power of 2 range. But we can invalidate multi IOTLB entries
> in one iteration. We check the overlap between invalidation range
> and IOTLB range, not check mask.
This smmu_hash_remove_by_asid_iova() change only affects the internal
SMMUv3 IOTLB hash table lookup. However, IOTLB invalidation
notifications are also sent to components that registered notifiers,
i.e. the smmuv3_notify_iova() path.
>  The final result is same with
> your implementation (split to multi times with a power of 2 range).
> I wonder why we can't implement it directly when the application can
> send an invalidation command with a non power of 2 range.
> We have tested it in our application scenario and not find any fail.
Assume the driver invalidates 5 * 4kB pages = a 0x5000 range. Without
the loop you removed, in smmuv3_notify_iova() we get
event.entry.addr_mask = num_pages * (1 << granule) - 1 = 0x4FFF. This
addr_mask is invalid, yet the entry is passed to components that
registered invalidation notifiers, such as vhost or VFIO. For instance,
VFIO applies '&' operations on addr_mask, which is expected to be the
mask of a power-of-2 range.
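[Editor's illustration, not part of the original mail: the property notifiers rely on can be sketched as a small Python check. This is illustrative only, not actual QEMU or VFIO code; it shows why 0x4FFF fails where an aligned power-of-2 mask like 0x0FFF passes.]

```python
def valid_notifier_entry(iova, addr_mask):
    """Notifiers assume addr_mask covers a power-of-2 sized region and
    that iova is naturally aligned to it (illustrative check only)."""
    size = addr_mask + 1
    return size & (size - 1) == 0 and (iova & addr_mask) == 0

print(valid_notifier_entry(0xffacd000, 0x0FFF))  # True: one aligned 4kB page
print(valid_notifier_entry(0xffacd000, 0x4FFF))  # False: 0x5000 is not a power of 2
```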

Does it clarify?

Thanks

Eric
>
> In addition, from the code implementation, smmu_iotlb_inv_iova()
> should be OK. In another call smmuv3_inv_notifiers_iova() ->
> smmuv3_notify_iova() -> memory_region_notify_iommu_one(),
> it also checks range overlap. So it should be OK if the range
> is not a power of 2.
>
> Could you take a look at it again?
>
> Thanks
> Renwei Liu




* RE: [PATCH v2] hw/arm/smmuv3: Simplify range invalidation
  2021-09-01 13:14     ` Eric Auger
@ 2021-09-02  1:47       ` Liu, Renwei
  0 siblings, 0 replies; 5+ messages in thread
From: Liu, Renwei @ 2021-09-02  1:47 UTC (permalink / raw)
  To: eric.auger, Peter Maydell
  Cc: qemu-arm, Wen, Jianxian, qemu-devel, Li, Chunming

> -----Original Message-----
> From: Eric Auger [mailto:eric.auger@redhat.com]
> Sent: Wednesday, September 01, 2021 9:14 PM
> To: Liu, Renwei; Peter Maydell
> Cc: qemu-arm@nongnu.org; qemu-devel@nongnu.org; Li, Chunming; Wen,
> Jianxian
> Subject: Re: [PATCH v2] hw/arm/smmuv3: Simplify range invalidation
> 
> Hi,
> 
> On 9/1/21 8:33 AM, Liu, Renwei wrote:
> >> -----Original Message-----
> >> From: Eric Auger [mailto:eric.auger@redhat.com]
> >> Sent: Tuesday, August 31, 2021 10:46 PM
> >> To: Liu, Renwei; Peter Maydell
> >> Cc: qemu-arm@nongnu.org; qemu-devel@nongnu.org; Li, Chunming; Wen,
> >> Jianxian
> >> Subject: Re: [PATCH v2] hw/arm/smmuv3: Simplify range invalidation
> >>
> >> Hi Liu,
> >>
> >> On 8/23/21 9:50 AM, Liu, Renwei wrote:
> >>> Simplify range invalidation which can avoid to iterate over all
> >>> iotlb entries multi-times. For instance invalidations patterns like
> >>> "invalidate 32 4kB pages starting from 0xffacd000" need to iterate
> >> over
> >>> all iotlb entries 6 times (num_pages: 1, 2, 16, 8, 4, 1). It only
> >> needs
> >>> to iterate over all iotlb entries once with new implementation.
> >> This wouldn't work. This reverts commit
> >> 6d9cd115b9df ("hw/arm/smmuv3: Enforce invalidation on a power of two
> >> range")
> >> which is mandated for VFIO and virtio to work. IOTLB invalidations
> must
> >> be naturally aligned and with a power of 2 range, hence this
> iteration.
> >>
> >> Thanks
> >>
> >> Eric
> > Hi Eric,
> >
> > Could you try the patch firstly? I want to know whether it's failed
> > in your application scenario with this implementation.
> There are many test cases, virtio-pci, vhost, VFIO, ...
> > I agree with you that IOTLB entry must be naturally aligned and
> > with a power of 2 range. But we can invalidate multi IOTLB entries
> > in one iteration. We check the overlap between invalidation range
> > and IOTLB range, not check mask.
> This smmu_hash_remove_by_asid_iova() change only affects the internal
> SMMUv3 IOTLB hash table lookup. However there are also IOTLB
> invalidation notifications sent to components who registered notifiers,
> ie. smmuv3_notify_iova path.
> >  The final result is same with
> > your implementation (split to multi times with a power of 2 range).
> > I wonder why we can't implement it directly when the application can
> > send an invalidation command with a non power of 2 range.
> > We have tested it in our application scenario and not find any fail.
> Assume the driver invalidates 5 * 4kB pages =0x5000 range.  Without the
> loop you removed
> 
> in smmuv3_notify_iova()  event.entry.addr_mask = num_pages * (1 <<
> granule) - 1 = 0x4FFF. This addr_mask  is an invalid mask
> this entry is passed to components who registered invalidation
> notifiers
> such as vhost or vfio. for instance in VFIO you have '&' ops on the
> addr_mask.
> addr_mask is expected to be a mask of a power of 2 range.
> 
> Does it clarify?
> 
> Thanks
> 
> Eric
Hi Eric,

Got it, thanks a lot for the clarification.
Indeed, I did not consider the further notifiers from vhost or VFIO,
because they are not registered in our application scenario.
Let's keep the previous implementation and ignore this patch.

Thanks
Renwei Liu

> >
> > In addition, from the code implementation, smmu_iotlb_inv_iova()
> > should be OK. In another call smmuv3_inv_notifiers_iova() ->
> > smmuv3_notify_iova() -> memory_region_notify_iommu_one(),
> > it also checks range overlap. So it should be OK if the range
> > is not a power of 2.
> >
> > Could you take a look at it again?
> >
> > Thanks
> > Renwei Liu


