qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH] memory: do not do out of bound notification
@ 2019-06-19  8:49 Yan Zhao
  2019-06-19 13:17 ` Auger Eric
  0 siblings, 1 reply; 17+ messages in thread
From: Yan Zhao @ 2019-06-19  8:49 UTC (permalink / raw)
  To: pbonzini; +Cc: Yan Zhao, qemu-devel

even if an entry overlaps with notifier's range, should not map/unmap
out of bound part in the entry.

This would cause problem in below case:
1. initially there are two notifiers with ranges
0-0xfedfffff, 0xfef00000-0xffffffffffffffff,
IOVAs from 0x3c000000 - 0x3c1fffff is in shadow page table.

2. in vfio, memory_region_register_iommu_notifier() is followed by
memory_region_iommu_replay(), which will first call address space unmap,
and walk and add back all entries in vtd shadow page table. e.g.
(1) for notifier 0-0xfedfffff,
    IOVAs from 0 - 0xffffffff get unmapped,
    and IOVAs from 0x3c000000 - 0x3c1fffff get mapped
(2) for notifier 0xfef00000-0xffffffffffffffff
    IOVAs from 0 - 0x7fffffffff get unmapped,
    but IOVAs from 0x3c000000 - 0x3c1fffff cannot get mapped back.

Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
---
 memory.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/memory.c b/memory.c
index 07c8315..a6b9da6 100644
--- a/memory.c
+++ b/memory.c
@@ -1948,6 +1948,14 @@ void memory_region_notify_one(IOMMUNotifier *notifier,
         return;
     }
 
+    if (entry->iova < notifier->start) {
+        entry->iova = notifier->start;
+    }
+
+    if (entry->iova + entry->addr_mask > notifier->end) {
+        entry->addr_mask = notifier->end - entry->iova;
+    }
+
     if (entry->perm & IOMMU_RW) {
         printf("map %lx %lx\n", entry->iova, entry->iova + entry->addr_mask);
         request_flags = IOMMU_NOTIFIER_MAP;
-- 
2.7.4



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH] memory: do not do out of bound notification
  2019-06-19  8:49 [Qemu-devel] [PATCH] memory: do not do out of bound notification Yan Zhao
@ 2019-06-19 13:17 ` Auger Eric
  2019-06-20  1:46   ` Yan Zhao
  2019-06-20  4:02   ` Peter Xu
  0 siblings, 2 replies; 17+ messages in thread
From: Auger Eric @ 2019-06-19 13:17 UTC (permalink / raw)
  To: Yan Zhao, pbonzini; +Cc: qemu-devel, Peter Xu

Hi Yan,

[+ Peter]

On 6/19/19 10:49 AM, Yan Zhao wrote:
> even if an entry overlaps with notifier's range, should not map/unmap
> out of bound part in the entry.

I don't think the patch was based on the master as the trace at the very
end if not part of the upstream code.
> 
> This would cause problem in below case:
> 1. initially there are two notifiers with ranges
> 0-0xfedfffff, 0xfef00000-0xffffffffffffffff,
> IOVAs from 0x3c000000 - 0x3c1fffff is in shadow page table.
> 
> 2. in vfio, memory_region_register_iommu_notifier() is followed by
> memory_region_iommu_replay(), which will first call address space unmap,
> and walk and add back all entries in vtd shadow page table. e.g.
> (1) for notifier 0-0xfedfffff,
>     IOVAs from 0 - 0xffffffff get unmapped,
>     and IOVAs from 0x3c000000 - 0x3c1fffff get mapped

While the patch looks sensible, the issue is the notifier scope used in
vtd_address_space_unmap is not a valid mask (ctpop64(size) != 1). Then
the size is recomputed (either using n = 64 - clz64(size) for the 1st
notifier or n = s->aw_bits for the 2d) and also the entry (especially
for the 2d notifier where it becomes 0) to get a proper alignment.

vtd_page_walk sends notifications per block or page (with valid
addr_mask) so stays within the notifier.

Modifying the entry->iova/addr_mask again in memory_region_notify_one
leads to unaligned start address / addr_mask. I don't think we want that.

Can't we modity the vtd_address_space_unmap() implementation to split
the invalidation in smaller chunks instead?


Thanks

Eric


> (2) for notifier 0xfef00000-0xffffffffffffffff
>     IOVAs from 0 - 0x7fffffffff get unmapped,>     but IOVAs from 0x3c000000 - 0x3c1fffff cannot get mapped back.
> 
> Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
> ---
>  memory.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/memory.c b/memory.c
> index 07c8315..a6b9da6 100644
> --- a/memory.c
> +++ b/memory.c
> @@ -1948,6 +1948,14 @@ void memory_region_notify_one(IOMMUNotifier *notifier,
>          return;
>      }
>  
> +    if (entry->iova < notifier->start) {
> +        entry->iova = notifier->start;
> +    }
> +
> +    if (entry->iova + entry->addr_mask > notifier->end) {
> +        entry->addr_mask = notifier->end - entry->iova;> +    }
> +
>      if (entry->perm & IOMMU_RW) {
>          printf("map %lx %lx\n", entry->iova, entry->iova + entry->addr_mask);
>          request_flags = IOMMU_NOTIFIER_MAP;

> 


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH] memory: do not do out of bound notification
  2019-06-19 13:17 ` Auger Eric
@ 2019-06-20  1:46   ` Yan Zhao
  2019-06-20  4:02   ` Peter Xu
  1 sibling, 0 replies; 17+ messages in thread
From: Yan Zhao @ 2019-06-20  1:46 UTC (permalink / raw)
  To: Auger Eric; +Cc: pbonzini, qemu-devel, Peter Xu

hi Eric,
Thanks for your reply.

On Wed, Jun 19, 2019 at 09:17:41PM +0800, Auger Eric wrote:
> Hi Yan,
> 
> [+ Peter]
> 
> On 6/19/19 10:49 AM, Yan Zhao wrote:
> > even if an entry overlaps with notifier's range, should not map/unmap
> > out of bound part in the entry.
> 
> I don't think the patch was based on the master as the trace at the very
> end if not part of the upstream code.
> > 
It's indeed based on the latest master branch. but I added a debug log
and forgot to remove that before sending out the patch. sorry for that :)


> > This would cause problem in below case:
> > 1. initially there are two notifiers with ranges
> > 0-0xfedfffff, 0xfef00000-0xffffffffffffffff,
> > IOVAs from 0x3c000000 - 0x3c1fffff is in shadow page table.
> > 
> > 2. in vfio, memory_region_register_iommu_notifier() is followed by
> > memory_region_iommu_replay(), which will first call address space unmap,
> > and walk and add back all entries in vtd shadow page table. e.g.
> > (1) for notifier 0-0xfedfffff,
> >     IOVAs from 0 - 0xffffffff get unmapped,
> >     and IOVAs from 0x3c000000 - 0x3c1fffff get mapped
> 
> While the patch looks sensible, the issue is the notifier scope used in
> vtd_address_space_unmap is not a valid mask (ctpop64(size) != 1). Then
> the size is recomputed (either using n = 64 - clz64(size) for the 1st
> notifier or n = s->aw_bits for the 2d) and also the entry (especially
> for the 2d notifier where it becomes 0) to get a proper alignment.
>
maybe the size is calculated right, but 0 for the 2d notifier is because
this line below ?
 entry.iova = n->start & ~(size - 1);

> vtd_page_walk sends notifications per block or page (with valid
> addr_mask) so stays within the notifier.
> 
> Modifying the entry->iova/addr_mask again in memory_region_notify_one
> leads to unaligned start address / addr_mask. I don't think we want that.
>
if the notifier's start and end is aligned, and entry->iova/addr_mask is
aligned before modification,  then after modification, the start addr
/addr_mask are still aligned ?

> Can't we modity the vtd_address_space_unmap() implementation to split
> the invalidation in smaller chunks instead?
>
as this is an API, it cannot reply on the caller to ensure the entry is
within its address range. Do you think it's reasonable?

Thanks
Yan


> Thanks
> 
> Eric
> 
> 
> > (2) for notifier 0xfef00000-0xffffffffffffffff
> >     IOVAs from 0 - 0x7fffffffff get unmapped,>     but IOVAs from 0x3c000000 - 0x3c1fffff cannot get mapped back.
> > 
> > Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
> > ---
> >  memory.c | 8 ++++++++
> >  1 file changed, 8 insertions(+)
> > 
> > diff --git a/memory.c b/memory.c
> > index 07c8315..a6b9da6 100644
> > --- a/memory.c
> > +++ b/memory.c
> > @@ -1948,6 +1948,14 @@ void memory_region_notify_one(IOMMUNotifier *notifier,
> >          return;
> >      }
> >  
> > +    if (entry->iova < notifier->start) {
> > +        entry->iova = notifier->start;
> > +    }
> > +
> > +    if (entry->iova + entry->addr_mask > notifier->end) {
> > +        entry->addr_mask = notifier->end - entry->iova;> +    }
> > +
> >      if (entry->perm & IOMMU_RW) {
> >          printf("map %lx %lx\n", entry->iova, entry->iova + entry->addr_mask);
> >          request_flags = IOMMU_NOTIFIER_MAP;
> 
> > 


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH] memory: do not do out of bound notification
  2019-06-19 13:17 ` Auger Eric
  2019-06-20  1:46   ` Yan Zhao
@ 2019-06-20  4:02   ` Peter Xu
  2019-06-20  4:14     ` Yan Zhao
  2019-06-20  8:35     ` Paolo Bonzini
  1 sibling, 2 replies; 17+ messages in thread
From: Peter Xu @ 2019-06-20  4:02 UTC (permalink / raw)
  To: Auger Eric; +Cc: pbonzini, Yan Zhao, qemu-devel

On Wed, Jun 19, 2019 at 03:17:41PM +0200, Auger Eric wrote:
> Hi Yan,
> 
> [+ Peter]
> 
> On 6/19/19 10:49 AM, Yan Zhao wrote:
> > even if an entry overlaps with notifier's range, should not map/unmap
> > out of bound part in the entry.
> 
> I don't think the patch was based on the master as the trace at the very
> end if not part of the upstream code.
> > 
> > This would cause problem in below case:
> > 1. initially there are two notifiers with ranges
> > 0-0xfedfffff, 0xfef00000-0xffffffffffffffff,
> > IOVAs from 0x3c000000 - 0x3c1fffff is in shadow page table.
> > 
> > 2. in vfio, memory_region_register_iommu_notifier() is followed by
> > memory_region_iommu_replay(), which will first call address space unmap,
> > and walk and add back all entries in vtd shadow page table. e.g.
> > (1) for notifier 0-0xfedfffff,
> >     IOVAs from 0 - 0xffffffff get unmapped,
> >     and IOVAs from 0x3c000000 - 0x3c1fffff get mapped
> 
> While the patch looks sensible, the issue is the notifier scope used in
> vtd_address_space_unmap is not a valid mask (ctpop64(size) != 1). Then
> the size is recomputed (either using n = 64 - clz64(size) for the 1st
> notifier or n = s->aw_bits for the 2d) and also the entry (especially
> for the 2d notifier where it becomes 0) to get a proper alignment.
> 
> vtd_page_walk sends notifications per block or page (with valid
> addr_mask) so stays within the notifier.
> 
> Modifying the entry->iova/addr_mask again in memory_region_notify_one
> leads to unaligned start address / addr_mask. I don't think we want that.
> 
> Can't we modity the vtd_address_space_unmap() implementation to split
> the invalidation in smaller chunks instead?

Seems workable, to be explicit - we can even cut it into chunks with
different size to be efficient.  Like, this range:

  0x0e00_0000 - 0x1_0000_0000 (size 0xf200_0000)

can be one of this:

  0x0e000000 - 0x1000_0000 (size 0x0200_0000)

plus one of this:

  0x1000_0000 - 0x1_0000_0000 (size 0xf000_0000)

Yan, could you help explain the issue better on how to reproduce and
what's the error when the problem occurs?  For example, is that
happened when a device hot-plugged into an existing VFIO container
(with some mapped IOVAs)?  Did you get host DMA errors later on?

Thanks,

-- 
Peter Xu


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH] memory: do not do out of bound notification
  2019-06-20  4:02   ` Peter Xu
@ 2019-06-20  4:14     ` Yan Zhao
  2019-06-20  8:14       ` Peter Xu
  2019-06-20  8:35     ` Paolo Bonzini
  1 sibling, 1 reply; 17+ messages in thread
From: Yan Zhao @ 2019-06-20  4:14 UTC (permalink / raw)
  To: Peter Xu; +Cc: Auger Eric, qemu-devel, pbonzini

On Thu, Jun 20, 2019 at 12:02:30PM +0800, Peter Xu wrote:
> On Wed, Jun 19, 2019 at 03:17:41PM +0200, Auger Eric wrote:
> > Hi Yan,
> > 
> > [+ Peter]
> > 
> > On 6/19/19 10:49 AM, Yan Zhao wrote:
> > > even if an entry overlaps with notifier's range, should not map/unmap
> > > out of bound part in the entry.
> > 
> > I don't think the patch was based on the master as the trace at the very
> > end if not part of the upstream code.
> > > 
> > > This would cause problem in below case:
> > > 1. initially there are two notifiers with ranges
> > > 0-0xfedfffff, 0xfef00000-0xffffffffffffffff,
> > > IOVAs from 0x3c000000 - 0x3c1fffff is in shadow page table.
> > > 
> > > 2. in vfio, memory_region_register_iommu_notifier() is followed by
> > > memory_region_iommu_replay(), which will first call address space unmap,
> > > and walk and add back all entries in vtd shadow page table. e.g.
> > > (1) for notifier 0-0xfedfffff,
> > >     IOVAs from 0 - 0xffffffff get unmapped,
> > >     and IOVAs from 0x3c000000 - 0x3c1fffff get mapped
> > 
> > While the patch looks sensible, the issue is the notifier scope used in
> > vtd_address_space_unmap is not a valid mask (ctpop64(size) != 1). Then
> > the size is recomputed (either using n = 64 - clz64(size) for the 1st
> > notifier or n = s->aw_bits for the 2d) and also the entry (especially
> > for the 2d notifier where it becomes 0) to get a proper alignment.
> > 
> > vtd_page_walk sends notifications per block or page (with valid
> > addr_mask) so stays within the notifier.
> > 
> > Modifying the entry->iova/addr_mask again in memory_region_notify_one
> > leads to unaligned start address / addr_mask. I don't think we want that.
> > 
> > Can't we modity the vtd_address_space_unmap() implementation to split
> > the invalidation in smaller chunks instead?
> 
> Seems workable, to be explicit - we can even cut it into chunks with
> different size to be efficient.  Like, this range:
> 
>   0x0e00_0000 - 0x1_0000_0000 (size 0xf200_0000)
> 
> can be one of this:
> 
>   0x0e000000 - 0x1000_0000 (size 0x0200_0000)
> 
> plus one of this:
> 
>   0x1000_0000 - 0x1_0000_0000 (size 0xf000_0000)
> 
> Yan, could you help explain the issue better on how to reproduce and
> what's the error when the problem occurs?  For example, is that
> happened when a device hot-plugged into an existing VFIO container
> (with some mapped IOVAs)?  Did you get host DMA errors later on?
> 
> Thanks,
> 
> -- 
> Peter Xu

Hi Peter
it happens when there's an RMRR region in my guest iommu driver.
if not adding this range check, IOVAs in this region would be unmapped and DMA
faults are met in host.

Thanks
Yan


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH] memory: do not do out of bound notification
  2019-06-20  8:14       ` Peter Xu
@ 2019-06-20  8:13         ` Yan Zhao
  0 siblings, 0 replies; 17+ messages in thread
From: Yan Zhao @ 2019-06-20  8:13 UTC (permalink / raw)
  To: Peter Xu; +Cc: Auger Eric, qemu-devel, pbonzini

On Thu, Jun 20, 2019 at 04:14:37PM +0800, Peter Xu wrote:
> On Thu, Jun 20, 2019 at 12:14:00AM -0400, Yan Zhao wrote:
> > On Thu, Jun 20, 2019 at 12:02:30PM +0800, Peter Xu wrote:
> > > On Wed, Jun 19, 2019 at 03:17:41PM +0200, Auger Eric wrote:
> > > > Hi Yan,
> > > > 
> > > > [+ Peter]
> > > > 
> > > > On 6/19/19 10:49 AM, Yan Zhao wrote:
> > > > > even if an entry overlaps with notifier's range, should not map/unmap
> > > > > out of bound part in the entry.
> > > > 
> > > > I don't think the patch was based on the master as the trace at the very
> > > > end if not part of the upstream code.
> > > > > 
> > > > > This would cause problem in below case:
> > > > > 1. initially there are two notifiers with ranges
> > > > > 0-0xfedfffff, 0xfef00000-0xffffffffffffffff,
> > > > > IOVAs from 0x3c000000 - 0x3c1fffff is in shadow page table.
> > > > > 
> > > > > 2. in vfio, memory_region_register_iommu_notifier() is followed by
> > > > > memory_region_iommu_replay(), which will first call address space unmap,
> > > > > and walk and add back all entries in vtd shadow page table. e.g.
> > > > > (1) for notifier 0-0xfedfffff,
> > > > >     IOVAs from 0 - 0xffffffff get unmapped,
> > > > >     and IOVAs from 0x3c000000 - 0x3c1fffff get mapped
> > > > 
> > > > While the patch looks sensible, the issue is the notifier scope used in
> > > > vtd_address_space_unmap is not a valid mask (ctpop64(size) != 1). Then
> > > > the size is recomputed (either using n = 64 - clz64(size) for the 1st
> > > > notifier or n = s->aw_bits for the 2d) and also the entry (especially
> > > > for the 2d notifier where it becomes 0) to get a proper alignment.
> > > > 
> > > > vtd_page_walk sends notifications per block or page (with valid
> > > > addr_mask) so stays within the notifier.
> > > > 
> > > > Modifying the entry->iova/addr_mask again in memory_region_notify_one
> > > > leads to unaligned start address / addr_mask. I don't think we want that.
> > > > 
> > > > Can't we modity the vtd_address_space_unmap() implementation to split
> > > > the invalidation in smaller chunks instead?
> > > 
> > > Seems workable, to be explicit - we can even cut it into chunks with
> > > different size to be efficient.  Like, this range:
> > > 
> > >   0x0e00_0000 - 0x1_0000_0000 (size 0xf200_0000)
> > > 
> > > can be one of this:
> > > 
> > >   0x0e000000 - 0x1000_0000 (size 0x0200_0000)
> > > 
> > > plus one of this:
> > > 
> > >   0x1000_0000 - 0x1_0000_0000 (size 0xf000_0000)
> > > 
> > > Yan, could you help explain the issue better on how to reproduce and
> > > what's the error when the problem occurs?  For example, is that
> > > happened when a device hot-plugged into an existing VFIO container
> > > (with some mapped IOVAs)?  Did you get host DMA errors later on?
> > > 
> > > Thanks,
> > > 
> > > -- 
> > > Peter Xu
> > 
> > Hi Peter
> > it happens when there's an RMRR region in my guest iommu driver.
> 
> Do you mean a RMRR region in the ACPI table?  AFAIK current QEMU VT-d
> does not have RMRR at all, so that's a customized QEMU?

it can be a customized QEMU with RMRR region in ACPI table. or simply
hardcode in guest kernel.

> 
> > if not adding this range check, IOVAs in this region would be unmapped and DMA
> > faults are met in host.
> 
> I see, thanks.
> 
> -- 
> Peter Xu


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH] memory: do not do out of bound notification
  2019-06-20  4:14     ` Yan Zhao
@ 2019-06-20  8:14       ` Peter Xu
  2019-06-20  8:13         ` Yan Zhao
  0 siblings, 1 reply; 17+ messages in thread
From: Peter Xu @ 2019-06-20  8:14 UTC (permalink / raw)
  To: Yan Zhao; +Cc: Auger Eric, qemu-devel, pbonzini

On Thu, Jun 20, 2019 at 12:14:00AM -0400, Yan Zhao wrote:
> On Thu, Jun 20, 2019 at 12:02:30PM +0800, Peter Xu wrote:
> > On Wed, Jun 19, 2019 at 03:17:41PM +0200, Auger Eric wrote:
> > > Hi Yan,
> > > 
> > > [+ Peter]
> > > 
> > > On 6/19/19 10:49 AM, Yan Zhao wrote:
> > > > even if an entry overlaps with notifier's range, should not map/unmap
> > > > out of bound part in the entry.
> > > 
> > > I don't think the patch was based on the master as the trace at the very
> > > end if not part of the upstream code.
> > > > 
> > > > This would cause problem in below case:
> > > > 1. initially there are two notifiers with ranges
> > > > 0-0xfedfffff, 0xfef00000-0xffffffffffffffff,
> > > > IOVAs from 0x3c000000 - 0x3c1fffff is in shadow page table.
> > > > 
> > > > 2. in vfio, memory_region_register_iommu_notifier() is followed by
> > > > memory_region_iommu_replay(), which will first call address space unmap,
> > > > and walk and add back all entries in vtd shadow page table. e.g.
> > > > (1) for notifier 0-0xfedfffff,
> > > >     IOVAs from 0 - 0xffffffff get unmapped,
> > > >     and IOVAs from 0x3c000000 - 0x3c1fffff get mapped
> > > 
> > > While the patch looks sensible, the issue is the notifier scope used in
> > > vtd_address_space_unmap is not a valid mask (ctpop64(size) != 1). Then
> > > the size is recomputed (either using n = 64 - clz64(size) for the 1st
> > > notifier or n = s->aw_bits for the 2d) and also the entry (especially
> > > for the 2d notifier where it becomes 0) to get a proper alignment.
> > > 
> > > vtd_page_walk sends notifications per block or page (with valid
> > > addr_mask) so stays within the notifier.
> > > 
> > > Modifying the entry->iova/addr_mask again in memory_region_notify_one
> > > leads to unaligned start address / addr_mask. I don't think we want that.
> > > 
> > > Can't we modity the vtd_address_space_unmap() implementation to split
> > > the invalidation in smaller chunks instead?
> > 
> > Seems workable, to be explicit - we can even cut it into chunks with
> > different size to be efficient.  Like, this range:
> > 
> >   0x0e00_0000 - 0x1_0000_0000 (size 0xf200_0000)
> > 
> > can be one of this:
> > 
> >   0x0e000000 - 0x1000_0000 (size 0x0200_0000)
> > 
> > plus one of this:
> > 
> >   0x1000_0000 - 0x1_0000_0000 (size 0xf000_0000)
> > 
> > Yan, could you help explain the issue better on how to reproduce and
> > what's the error when the problem occurs?  For example, is that
> > happened when a device hot-plugged into an existing VFIO container
> > (with some mapped IOVAs)?  Did you get host DMA errors later on?
> > 
> > Thanks,
> > 
> > -- 
> > Peter Xu
> 
> Hi Peter
> it happens when there's an RMRR region in my guest iommu driver.

Do you mean a RMRR region in the ACPI table?  AFAIK current QEMU VT-d
does not have RMRR at all, so that's a customized QEMU?

> if not adding this range check, IOVAs in this region would be unmapped and DMA
> faults are met in host.

I see, thanks.

-- 
Peter Xu


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH] memory: do not do out of bound notification
  2019-06-20  4:02   ` Peter Xu
  2019-06-20  4:14     ` Yan Zhao
@ 2019-06-20  8:35     ` Paolo Bonzini
  2019-06-20 10:57       ` Yan Zhao
  2019-06-20 12:59       ` Peter Xu
  1 sibling, 2 replies; 17+ messages in thread
From: Paolo Bonzini @ 2019-06-20  8:35 UTC (permalink / raw)
  To: Peter Xu, Auger Eric; +Cc: Yan Zhao, qemu-devel

On 20/06/19 06:02, Peter Xu wrote:
> Seems workable, to be explicit - we can even cut it into chunks with
> different size to be efficient.

Yes, this is not hard (completely untested):

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 44b1231157..541538bc6c 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -3388,39 +3388,34 @@ static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n)
     }
 
     assert(start <= end);
-    size = end - start;
+    while (end > start) {
+        size = end - start;
+        /* Only keep the lowest bit of either size or start.  */
+        size = MIN(size & -size, start & -start);
+        /* Should not happen, but limit to address width too just in case */
+        size = MIN(size, 1ULL << s->aw_bits);
 
-    if (ctpop64(size) != 1) {
-        /*
-         * This size cannot format a correct mask. Let's enlarge it to
-         * suite the minimum available mask.
-         */
-        int n = 64 - clz64(size);
-        if (n > s->aw_bits) {
-            /* should not happen, but in case it happens, limit it */
-            n = s->aw_bits;
-        }
-        size = 1ULL << n;
-    }
+        assert((start & (size - 1)) == 0);
 
-    entry.target_as = &address_space_memory;
-    /* Adjust iova for the size */
-    entry.iova = n->start & ~(size - 1);
-    /* This field is meaningless for unmap */
-    entry.translated_addr = 0;
-    entry.perm = IOMMU_NONE;
-    entry.addr_mask = size - 1;
+        entry.target_as = &address_space_memory;
+        entry.iova = start;
+        /* This field is meaningless for unmap */
+        entry.translated_addr = 0;
+        entry.perm = IOMMU_NONE;
+        entry.addr_mask = size - 1;
 
-    trace_vtd_as_unmap_whole(pci_bus_num(as->bus),
-                             VTD_PCI_SLOT(as->devfn),
-                             VTD_PCI_FUNC(as->devfn),
-                             entry.iova, size);
+        trace_vtd_as_unmap_whole(pci_bus_num(as->bus),
+                                 VTD_PCI_SLOT(as->devfn),
+                                 VTD_PCI_FUNC(as->devfn),
+                                 entry.iova, size);
 
-    map.iova = entry.iova;
-    map.size = entry.addr_mask;
-    iova_tree_remove(as->iova_tree, &map);
+        map.iova = entry.iova;
+        map.size = entry.addr_mask;
+        iova_tree_remove(as->iova_tree, &map);
 
-    memory_region_notify_one(n, &entry);
+        memory_region_notify_one(n, &entry);
+        start += size;
+    }
 }
 
 static void vtd_address_space_unmap_all(IntelIOMMUState *s)


Yan,

if something like this works for you, let me know and I will submit it
as a proper patch.

Paolo


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH] memory: do not do out of bound notification
  2019-06-20  8:35     ` Paolo Bonzini
@ 2019-06-20 10:57       ` Yan Zhao
  2019-06-20 12:04         ` Paolo Bonzini
  2019-06-20 12:59       ` Peter Xu
  1 sibling, 1 reply; 17+ messages in thread
From: Yan Zhao @ 2019-06-20 10:57 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Auger Eric, qemu-devel, Peter Xu

On Thu, Jun 20, 2019 at 04:35:29PM +0800, Paolo Bonzini wrote:
> On 20/06/19 06:02, Peter Xu wrote:
> > Seems workable, to be explicit - we can even cut it into chunks with
> > different size to be efficient.
> 
> Yes, this is not hard (completely untested):
> 
> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> index 44b1231157..541538bc6c 100644
> --- a/hw/i386/intel_iommu.c
> +++ b/hw/i386/intel_iommu.c
> @@ -3388,39 +3388,34 @@ static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n)
>      }
>  
>      assert(start <= end);
> -    size = end - start;
> +    while (end > start) {
> +        size = end - start;
> +        /* Only keep the lowest bit of either size or start.  */
> +        size = MIN(size & -size, start & -start);
> +        /* Should not happen, but limit to address width too just in case */
> +        size = MIN(size, 1ULL << s->aw_bits);
>  
> -    if (ctpop64(size) != 1) {
> -        /*
> -         * This size cannot format a correct mask. Let's enlarge it to
> -         * suite the minimum available mask.
> -         */
> -        int n = 64 - clz64(size);
> -        if (n > s->aw_bits) {
> -            /* should not happen, but in case it happens, limit it */
> -            n = s->aw_bits;
> -        }
> -        size = 1ULL << n;
> -    }
> +        assert((start & (size - 1)) == 0);
>  
> -    entry.target_as = &address_space_memory;
> -    /* Adjust iova for the size */
> -    entry.iova = n->start & ~(size - 1);
> -    /* This field is meaningless for unmap */
> -    entry.translated_addr = 0;
> -    entry.perm = IOMMU_NONE;
> -    entry.addr_mask = size - 1;
> +        entry.target_as = &address_space_memory;
> +        entry.iova = start;
> +        /* This field is meaningless for unmap */
> +        entry.translated_addr = 0;
> +        entry.perm = IOMMU_NONE;
> +        entry.addr_mask = size - 1;
>  
> -    trace_vtd_as_unmap_whole(pci_bus_num(as->bus),
> -                             VTD_PCI_SLOT(as->devfn),
> -                             VTD_PCI_FUNC(as->devfn),
> -                             entry.iova, size);
> +        trace_vtd_as_unmap_whole(pci_bus_num(as->bus),
> +                                 VTD_PCI_SLOT(as->devfn),
> +                                 VTD_PCI_FUNC(as->devfn),
> +                                 entry.iova, size);
>  
> -    map.iova = entry.iova;
> -    map.size = entry.addr_mask;
> -    iova_tree_remove(as->iova_tree, &map);
> +        map.iova = entry.iova;
> +        map.size = entry.addr_mask;
> +        iova_tree_remove(as->iova_tree, &map);
>  
> -    memory_region_notify_one(n, &entry);
> +        memory_region_notify_one(n, &entry);
> +        start += size;
> +    }
>  }
>  
>  static void vtd_address_space_unmap_all(IntelIOMMUState *s)
> 
> 
> Yan,
> 
> if something like this works for you, let me know and I will submit it
> as a proper patch.
> 
> Paolo

hi Paolo

Thanks and I'll try it tomorrow and let you know the result.
But may I know why it cannot simply be like below?

Thanks
Yan


diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index b0d8a1c..2956db6 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -3374,7 +3374,6 @@ static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n)
     IntelIOMMUState *s = as->iommu_state;
     DMAMap map;

     /*
      * Note: all the codes in this function has a assumption that IOVA
      * bits are no more than VTD_MGAW bits (which is restricted by
@@ -3392,23 +3391,8 @@ static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n)
     assert(start <= end);
     size = end - start;

-    if (ctpop64(size) != 1) {
-        /*
-         * This size cannot format a correct mask. Let's enlarge it to
-         * suite the minimum available mask.
-         */
-        int n = 64 - clz64(size);
-        if (n > s->aw_bits) {
-            /* should not happen, but in case it happens, limit it */
-            n = s->aw_bits;
-        }
-        size = 1ULL << n;
-    }
-
-
     entry.target_as = &address_space_memory;
-    /* Adjust iova for the size */
-    entry.iova = n->start & ~(size - 1);
+    entry.iova = n->start;
     /* This field is meaningless for unmap */
     entry.translated_addr = 0;
     entry.perm = IOMMU_NONE;





^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH] memory: do not do out of bound notification
  2019-06-20 10:57       ` Yan Zhao
@ 2019-06-20 12:04         ` Paolo Bonzini
  0 siblings, 0 replies; 17+ messages in thread
From: Paolo Bonzini @ 2019-06-20 12:04 UTC (permalink / raw)
  To: Yan Zhao; +Cc: Auger Eric, qemu-devel, Peter Xu

On 20/06/19 12:57, Yan Zhao wrote:
> On Thu, Jun 20, 2019 at 04:35:29PM +0800, Paolo Bonzini wrote:
>> On 20/06/19 06:02, Peter Xu wrote:
>>> Seems workable, to be explicit - we can even cut it into chunks with
>>> different size to be efficient.
>>
>> Yes, this is not hard (completely untested):
>>
>> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
>> index 44b1231157..541538bc6c 100644
>> --- a/hw/i386/intel_iommu.c
>> +++ b/hw/i386/intel_iommu.c
>> @@ -3388,39 +3388,34 @@ static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n)
>>      }
>>  
>>      assert(start <= end);
>> -    size = end - start;
>> +    while (end > start) {
>> +        size = end - start;
>> +        /* Only keep the lowest bit of either size or start.  */
>> +        size = MIN(size & -size, start & -start);
>> +        /* Should not happen, but limit to address width too just in case */
>> +        size = MIN(size, 1ULL << s->aw_bits);
>>  
>> -    if (ctpop64(size) != 1) {
>> -        /*
>> -         * This size cannot format a correct mask. Let's enlarge it to
>> -         * suite the minimum available mask.
>> -         */
>> -        int n = 64 - clz64(size);
>> -        if (n > s->aw_bits) {
>> -            /* should not happen, but in case it happens, limit it */
>> -            n = s->aw_bits;
>> -        }
>> -        size = 1ULL << n;
>> -    }
>> +        assert((start & (size - 1)) == 0);
>>  
>> -    entry.target_as = &address_space_memory;
>> -    /* Adjust iova for the size */
>> -    entry.iova = n->start & ~(size - 1);
>> -    /* This field is meaningless for unmap */
>> -    entry.translated_addr = 0;
>> -    entry.perm = IOMMU_NONE;
>> -    entry.addr_mask = size - 1;
>> +        entry.target_as = &address_space_memory;
>> +        entry.iova = start;
>> +        /* This field is meaningless for unmap */
>> +        entry.translated_addr = 0;
>> +        entry.perm = IOMMU_NONE;
>> +        entry.addr_mask = size - 1;
>>  
>> -    trace_vtd_as_unmap_whole(pci_bus_num(as->bus),
>> -                             VTD_PCI_SLOT(as->devfn),
>> -                             VTD_PCI_FUNC(as->devfn),
>> -                             entry.iova, size);
>> +        trace_vtd_as_unmap_whole(pci_bus_num(as->bus),
>> +                                 VTD_PCI_SLOT(as->devfn),
>> +                                 VTD_PCI_FUNC(as->devfn),
>> +                                 entry.iova, size);
>>  
>> -    map.iova = entry.iova;
>> -    map.size = entry.addr_mask;
>> -    iova_tree_remove(as->iova_tree, &map);
>> +        map.iova = entry.iova;
>> +        map.size = entry.addr_mask;
>> +        iova_tree_remove(as->iova_tree, &map);
>>  
>> -    memory_region_notify_one(n, &entry);
>> +        memory_region_notify_one(n, &entry);
>> +        start += size;
>> +    }
>>  }
>>  
>>  static void vtd_address_space_unmap_all(IntelIOMMUState *s)
>>
>>
>> Yan,
>>
>> if something like this works for you, let me know and I will submit it
>> as a proper patch.
> 
> Thanks and I'll try it tomorrow and let you know the result.
> But may I know why it cannot simply be like below?

Because the API is that addr_mask is a power of two minus 1.

Paolo

> Thanks
> Yan
> 
> 
> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> index b0d8a1c..2956db6 100644
> --- a/hw/i386/intel_iommu.c
> +++ b/hw/i386/intel_iommu.c
> @@ -3374,7 +3374,6 @@ static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n)
>      IntelIOMMUState *s = as->iommu_state;
>      DMAMap map;
> 
>      /*
>       * Note: all the codes in this function has a assumption that IOVA
>       * bits are no more than VTD_MGAW bits (which is restricted by
> @@ -3392,23 +3391,8 @@ static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n)
>      assert(start <= end);
>      size = end - start;
> 
> -    if (ctpop64(size) != 1) {
> -        /*
> -         * This size cannot format a correct mask. Let's enlarge it to
> -         * suite the minimum available mask.
> -         */
> -        int n = 64 - clz64(size);
> -        if (n > s->aw_bits) {
> -            /* should not happen, but in case it happens, limit it */
> -            n = s->aw_bits;
> -        }
> -        size = 1ULL << n;
> -    }
> -
> -
>      entry.target_as = &address_space_memory;
> -    /* Adjust iova for the size */
> -    entry.iova = n->start & ~(size - 1);
> +    entry.iova = n->start;
>      /* This field is meaningless for unmap */
>      entry.translated_addr = 0;
>      entry.perm = IOMMU_NONE;
> 
> 
> 



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH] memory: do not do out of bound notification
  2019-06-20  8:35     ` Paolo Bonzini
  2019-06-20 10:57       ` Yan Zhao
@ 2019-06-20 12:59       ` Peter Xu
  2019-06-20 13:04         ` Peter Xu
                           ` (2 more replies)
  1 sibling, 3 replies; 17+ messages in thread
From: Peter Xu @ 2019-06-20 12:59 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Auger Eric, Yan Zhao, qemu-devel

On Thu, Jun 20, 2019 at 10:35:29AM +0200, Paolo Bonzini wrote:
> On 20/06/19 06:02, Peter Xu wrote:
> > Seems workable, to be explicit - we can even cut it into chunks with
> > different size to be efficient.
> 
> Yes, this is not hard (completely untested):
> 
> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> index 44b1231157..541538bc6c 100644
> --- a/hw/i386/intel_iommu.c
> +++ b/hw/i386/intel_iommu.c
> @@ -3388,39 +3388,34 @@ static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n)
>      }
>  
>      assert(start <= end);
> -    size = end - start;
> +    while (end > start) {
> +        size = end - start;
> +        /* Only keep the lowest bit of either size or start.  */
> +        size = MIN(size & -size, start & -start);

I feel like this can be problematic.  I'm imaging:

start=0x1000_0000, size=0x1000_1000

This will get size=0x1000 but actually we can do size=0x1000_0000 as
the first.

> +        /* Should not happen, but limit to address width too just in case */
> +        size = MIN(size, 1ULL << s->aw_bits);
>  
> -    if (ctpop64(size) != 1) {
> -        /*
> -         * This size cannot format a correct mask. Let's enlarge it to
> -         * suite the minimum available mask.
> -         */
> -        int n = 64 - clz64(size);
> -        if (n > s->aw_bits) {
> -            /* should not happen, but in case it happens, limit it */
> -            n = s->aw_bits;
> -        }
> -        size = 1ULL << n;
> -    }
> +        assert((start & (size - 1)) == 0);
>  
> -    entry.target_as = &address_space_memory;
> -    /* Adjust iova for the size */
> -    entry.iova = n->start & ~(size - 1);
> -    /* This field is meaningless for unmap */
> -    entry.translated_addr = 0;
> -    entry.perm = IOMMU_NONE;
> -    entry.addr_mask = size - 1;
> +        entry.target_as = &address_space_memory;
> +        entry.iova = start;
> +        /* This field is meaningless for unmap */
> +        entry.translated_addr = 0;
> +        entry.perm = IOMMU_NONE;
> +        entry.addr_mask = size - 1;

(some of the fields can be moved out of loop because they are
 constants)

>  
> -    trace_vtd_as_unmap_whole(pci_bus_num(as->bus),
> -                             VTD_PCI_SLOT(as->devfn),
> -                             VTD_PCI_FUNC(as->devfn),
> -                             entry.iova, size);
> +        trace_vtd_as_unmap_whole(pci_bus_num(as->bus),
> +                                 VTD_PCI_SLOT(as->devfn),
> +                                 VTD_PCI_FUNC(as->devfn),
> +                                 entry.iova, size);

Can move this out because this is a trace only so we don't have
restriction on mask?

>  
> -    map.iova = entry.iova;
> -    map.size = entry.addr_mask;
> -    iova_tree_remove(as->iova_tree, &map);
> +        map.iova = entry.iova;
> +        map.size = entry.addr_mask;
> +        iova_tree_remove(as->iova_tree, &map);

Same here?

>  
> -    memory_region_notify_one(n, &entry);
> +        memory_region_notify_one(n, &entry);
> +        start += size;
> +    }
>  }
>  
>  static void vtd_address_space_unmap_all(IntelIOMMUState *s)
> 
> 
> Yan,
> 
> if something like this works for you, let me know and I will submit it
> as a proper patch.
> 
> Paolo

Since during review I'm thinking how to generate a correct sequence of
these masks... here's my try below with above issues fixed... :)

I've tried compile but not tested.  Yan can test it, or I can do it
too tomorrow after I find some machines.

Thanks,

------------------------------------------------------------
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 44b1231157..cfbd225f0a 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -3363,11 +3363,32 @@ VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus, int devfn)
     return vtd_dev_as;
 }

+static uint64_t vtd_get_next_mask(uint64_t start, uint64_t size, int gaw)
+{
+    /* Tries to find smallest mask from start first */
+    uint64_t rmask = start & -start, max_mask = 1ULL << gaw;
+
+    assert(size && gaw > 0 && gaw < 64);
+
+    /* Zero start, or too big */
+    if (!rmask || rmask > max_mask) {
+        rmask = max_mask;
+    }
+
+    /* If the start mask worked, then use it */
+    if (rmask <= size) {
+        return rmask;
+    }
+
+    /* Find the largest page mask from size */
+    return 1ULL << (63 - clz64(size));
+}
+
 /* Unmap the whole range in the notifier's scope. */
 static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n)
 {
     IOMMUTLBEntry entry;
-    hwaddr size;
+    hwaddr size, remain;
     hwaddr start = n->start;
     hwaddr end = n->end;
     IntelIOMMUState *s = as->iommu_state;
@@ -3388,39 +3409,28 @@ static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n)
     }

     assert(start <= end);
-    size = end - start;
-
-    if (ctpop64(size) != 1) {
-        /*
-         * This size cannot format a correct mask. Let's enlarge it to
-         * suite the minimum available mask.
-         */
-        int n = 64 - clz64(size);
-        if (n > s->aw_bits) {
-            /* should not happen, but in case it happens, limit it */
-            n = s->aw_bits;
-        }
-        size = 1ULL << n;
-    }
-
+    size = remain = end - start;
     entry.target_as = &address_space_memory;
-    /* Adjust iova for the size */
-    entry.iova = n->start & ~(size - 1);
+    entry.perm = IOMMU_NONE;
     /* This field is meaningless for unmap */
     entry.translated_addr = 0;
-    entry.perm = IOMMU_NONE;
-    entry.addr_mask = size - 1;
+
+    while (remain) {
+        uint64_t mask = vtd_get_next_mask(start, remain, s->aw_bits);
+
+        entry.iova = start;
+        entry.addr_mask = mask - 1;
+        memory_region_notify_one(n, &entry);
+    }

     trace_vtd_as_unmap_whole(pci_bus_num(as->bus),
                              VTD_PCI_SLOT(as->devfn),
                              VTD_PCI_FUNC(as->devfn),
-                             entry.iova, size);
+                             n->start, size);

-    map.iova = entry.iova;
-    map.size = entry.addr_mask;
+    map.iova = n->start;
+    map.size = size;
     iova_tree_remove(as->iova_tree, &map);
-
-    memory_region_notify_one(n, &entry);
 }

 static void vtd_address_space_unmap_all(IntelIOMMUState *s)
------------------------------------------------------------

Regards,

-- 
Peter Xu


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH] memory: do not do out of bound notification
  2019-06-20 12:59       ` Peter Xu
@ 2019-06-20 13:04         ` Peter Xu
  2019-06-24  5:22           ` Yan Zhao
  2019-06-20 13:14         ` Paolo Bonzini
  2019-06-21  7:57         ` Yan Zhao
  2 siblings, 1 reply; 17+ messages in thread
From: Peter Xu @ 2019-06-20 13:04 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Auger Eric, Yan Zhao, qemu-devel

On Thu, Jun 20, 2019 at 08:59:55PM +0800, Peter Xu wrote:
> On Thu, Jun 20, 2019 at 10:35:29AM +0200, Paolo Bonzini wrote:
> > On 20/06/19 06:02, Peter Xu wrote:
> > > Seems workable, to be explicit - we can even cut it into chunks with
> > > different size to be efficient.
> > 
> > Yes, this is not hard (completely untested):
> > 
> > diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> > index 44b1231157..541538bc6c 100644
> > --- a/hw/i386/intel_iommu.c
> > +++ b/hw/i386/intel_iommu.c
> > @@ -3388,39 +3388,34 @@ static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n)
> >      }
> >  
> >      assert(start <= end);
> > -    size = end - start;
> > +    while (end > start) {
> > +        size = end - start;
> > +        /* Only keep the lowest bit of either size or start.  */
> > +        size = MIN(size & -size, start & -start);
> 
> I feel like this can be problematic.  I'm imaging:
> 
> start=0x1000_0000, size=0x1000_1000
> 
> This will get size=0x1000 but actually we can do size=0x1000_0000 as
> the first.
> 
> > +        /* Should not happen, but limit to address width too just in case */
> > +        size = MIN(size, 1ULL << s->aw_bits);
> >  
> > -    if (ctpop64(size) != 1) {
> > -        /*
> > -         * This size cannot format a correct mask. Let's enlarge it to
> > -         * suite the minimum available mask.
> > -         */
> > -        int n = 64 - clz64(size);
> > -        if (n > s->aw_bits) {
> > -            /* should not happen, but in case it happens, limit it */
> > -            n = s->aw_bits;
> > -        }
> > -        size = 1ULL << n;
> > -    }
> > +        assert((start & (size - 1)) == 0);
> >  
> > -    entry.target_as = &address_space_memory;
> > -    /* Adjust iova for the size */
> > -    entry.iova = n->start & ~(size - 1);
> > -    /* This field is meaningless for unmap */
> > -    entry.translated_addr = 0;
> > -    entry.perm = IOMMU_NONE;
> > -    entry.addr_mask = size - 1;
> > +        entry.target_as = &address_space_memory;
> > +        entry.iova = start;
> > +        /* This field is meaningless for unmap */
> > +        entry.translated_addr = 0;
> > +        entry.perm = IOMMU_NONE;
> > +        entry.addr_mask = size - 1;
> 
> (some of the fields can be moved out of loop because they are
>  constants)
> 
> >  
> > -    trace_vtd_as_unmap_whole(pci_bus_num(as->bus),
> > -                             VTD_PCI_SLOT(as->devfn),
> > -                             VTD_PCI_FUNC(as->devfn),
> > -                             entry.iova, size);
> > +        trace_vtd_as_unmap_whole(pci_bus_num(as->bus),
> > +                                 VTD_PCI_SLOT(as->devfn),
> > +                                 VTD_PCI_FUNC(as->devfn),
> > +                                 entry.iova, size);
> 
> Can move this out because this is a trace only so we don't have
> restriction on mask?
> 
> >  
> > -    map.iova = entry.iova;
> > -    map.size = entry.addr_mask;
> > -    iova_tree_remove(as->iova_tree, &map);
> > +        map.iova = entry.iova;
> > +        map.size = entry.addr_mask;
> > +        iova_tree_remove(as->iova_tree, &map);
> 
> Same here?
> 
> >  
> > -    memory_region_notify_one(n, &entry);
> > +        memory_region_notify_one(n, &entry);
> > +        start += size;
> > +    }
> >  }
> >  
> >  static void vtd_address_space_unmap_all(IntelIOMMUState *s)
> > 
> > 
> > Yan,
> > 
> > if something like this works for you, let me know and I will submit it
> > as a proper patch.
> > 
> > Paolo
> 
> Since during review I'm thinking how to generate a correct sequence of
> these masks... here's my try below with above issues fixed... :)
> 
> I've tried compile but not tested.  Yan can test it, or I can do it
> too tomorrow after I find some machines.
> 
> Thanks,
> 
> ------------------------------------------------------------
> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> index 44b1231157..cfbd225f0a 100644
> --- a/hw/i386/intel_iommu.c
> +++ b/hw/i386/intel_iommu.c
> @@ -3363,11 +3363,32 @@ VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus, int devfn)
>      return vtd_dev_as;
>  }
> 
> +static uint64_t vtd_get_next_mask(uint64_t start, uint64_t size, int gaw)
> +{
> +    /* Tries to find smallest mask from start first */
> +    uint64_t rmask = start & -start, max_mask = 1ULL << gaw;
> +
> +    assert(size && gaw > 0 && gaw < 64);
> +
> +    /* Zero start, or too big */
> +    if (!rmask || rmask > max_mask) {
> +        rmask = max_mask;
> +    }
> +
> +    /* If the start mask worked, then use it */
> +    if (rmask <= size) {
> +        return rmask;
> +    }
> +
> +    /* Find the largest page mask from size */
> +    return 1ULL << (63 - clz64(size));
> +}
> +
>  /* Unmap the whole range in the notifier's scope. */
>  static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n)
>  {
>      IOMMUTLBEntry entry;
> -    hwaddr size;
> +    hwaddr size, remain;
>      hwaddr start = n->start;
>      hwaddr end = n->end;
>      IntelIOMMUState *s = as->iommu_state;
> @@ -3388,39 +3409,28 @@ static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n)
>      }
> 
>      assert(start <= end);
> -    size = end - start;
> -
> -    if (ctpop64(size) != 1) {
> -        /*
> -         * This size cannot format a correct mask. Let's enlarge it to
> -         * suite the minimum available mask.
> -         */
> -        int n = 64 - clz64(size);
> -        if (n > s->aw_bits) {
> -            /* should not happen, but in case it happens, limit it */
> -            n = s->aw_bits;
> -        }
> -        size = 1ULL << n;
> -    }
> -
> +    size = remain = end - start;
>      entry.target_as = &address_space_memory;
> -    /* Adjust iova for the size */
> -    entry.iova = n->start & ~(size - 1);
> +    entry.perm = IOMMU_NONE;
>      /* This field is meaningless for unmap */
>      entry.translated_addr = 0;
> -    entry.perm = IOMMU_NONE;
> -    entry.addr_mask = size - 1;
> +
> +    while (remain) {
> +        uint64_t mask = vtd_get_next_mask(start, remain, s->aw_bits);
> +
> +        entry.iova = start;
> +        entry.addr_mask = mask - 1;
> +        memory_region_notify_one(n, &entry);

Sorry, I at least missed these lines:

           start += mask;
           remain -= mask;

> +    }
> 
>      trace_vtd_as_unmap_whole(pci_bus_num(as->bus),
>                               VTD_PCI_SLOT(as->devfn),
>                               VTD_PCI_FUNC(as->devfn),
> -                             entry.iova, size);
> +                             n->start, size);
> 
> -    map.iova = entry.iova;
> -    map.size = entry.addr_mask;
> +    map.iova = n->start;
> +    map.size = size;
>      iova_tree_remove(as->iova_tree, &map);
> -
> -    memory_region_notify_one(n, &entry);
>  }
> 
>  static void vtd_address_space_unmap_all(IntelIOMMUState *s)
> ------------------------------------------------------------
> 
> Regards,
> 
> -- 
> Peter Xu

Regards,

-- 
Peter Xu


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH] memory: do not do out of bound notification
  2019-06-20 12:59       ` Peter Xu
  2019-06-20 13:04         ` Peter Xu
@ 2019-06-20 13:14         ` Paolo Bonzini
  2019-06-21  2:36           ` Peter Xu
  2019-06-21  7:57         ` Yan Zhao
  2 siblings, 1 reply; 17+ messages in thread
From: Paolo Bonzini @ 2019-06-20 13:14 UTC (permalink / raw)
  To: Peter Xu; +Cc: Auger Eric, Yan Zhao, qemu-devel

On 20/06/19 14:59, Peter Xu wrote:
> I feel like this can be problematic.  I'm imaging:
> 
> start=0x1000_0000, size=0x1000_1000
> 
> This will get size=0x1000 but actually we can do size=0x1000_0000 as
> the first.

Right, we can do:

/*
 * If a naturally aligned region starting at "start" ends before "end",
 * use it.  Otherwise, keep the lowest bit of size.
 */
if (size > (start & -start))
    size = start & -start;
else
    size = size & -size;

>>
>> +        trace_vtd_as_unmap_whole(pci_bus_num(as->bus),
>> +                                 VTD_PCI_SLOT(as->devfn),
>> +                                 VTD_PCI_FUNC(as->devfn),
>> +                                 entry.iova, size);
> 
> Can move this out because this is a trace only so we don't have
> restriction on mask?
> 
>>
>> -    map.iova = entry.iova;
>> -    map.size = entry.addr_mask;
>> -    iova_tree_remove(as->iova_tree, &map);
>> +        map.iova = entry.iova;
>> +        map.size = entry.addr_mask;
>> +        iova_tree_remove(as->iova_tree, &map);
> 
> Same here?
> 

Yes, I would move these and the iova_tree_remove outside the loop, while
keeping entry's initialization inside looks cleaner.

Paolo


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH] memory: do not do out of bound notification
  2019-06-20 13:14         ` Paolo Bonzini
@ 2019-06-21  2:36           ` Peter Xu
  0 siblings, 0 replies; 17+ messages in thread
From: Peter Xu @ 2019-06-21  2:36 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Auger Eric, Yan Zhao, qemu-devel

On Thu, Jun 20, 2019 at 03:14:04PM +0200, Paolo Bonzini wrote:
> On 20/06/19 14:59, Peter Xu wrote:
> > I feel like this can be problematic.  I'm imaging:
> > 
> > start=0x1000_0000, size=0x1000_1000
> > 
> > This will get size=0x1000 but actually we can do size=0x1000_0000 as
> > the first.
> 
> Right, we can do:
> 
> /*
>  * If a naturally aligned region starting at "start" ends before "end",
>  * use it.  Otherwise, keep the lowest bit of size.
>  */
> if (size > (start & -start))
>     size = start & -start;

May need to consider start==0, otherwise size will be zero here?

> else
>     size = size & -size;

Should use MSB rather than LSB of size?

> 
> >>
> >> +        trace_vtd_as_unmap_whole(pci_bus_num(as->bus),
> >> +                                 VTD_PCI_SLOT(as->devfn),
> >> +                                 VTD_PCI_FUNC(as->devfn),
> >> +                                 entry.iova, size);
> > 
> > Can move this out because this is a trace only so we don't have
> > restriction on mask?
> > 
> >>
> >> -    map.iova = entry.iova;
> >> -    map.size = entry.addr_mask;
> >> -    iova_tree_remove(as->iova_tree, &map);
> >> +        map.iova = entry.iova;
> >> +        map.size = entry.addr_mask;
> >> +        iova_tree_remove(as->iova_tree, &map);
> > 
> > Same here?
> > 
> 
> Yes, I would move these and the iova_tree_remove outside the loop, while
> keeping entry's initialization inside looks cleaner.

Yeah that's ok to me too.

Thanks,

-- 
Peter Xu


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH] memory: do not do out of bound notification
  2019-06-20 12:59       ` Peter Xu
  2019-06-20 13:04         ` Peter Xu
  2019-06-20 13:14         ` Paolo Bonzini
@ 2019-06-21  7:57         ` Yan Zhao
  2 siblings, 0 replies; 17+ messages in thread
From: Yan Zhao @ 2019-06-21  7:57 UTC (permalink / raw)
  To: Peter Xu; +Cc: Paolo Bonzini, qemu-devel, Auger Eric

On Thu, Jun 20, 2019 at 08:59:55PM +0800, Peter Xu wrote:
> On Thu, Jun 20, 2019 at 10:35:29AM +0200, Paolo Bonzini wrote:
> > On 20/06/19 06:02, Peter Xu wrote:
> > > Seems workable, to be explicit - we can even cut it into chunks with
> > > different size to be efficient.
> > 
> > Yes, this is not hard (completely untested):
> > 
> > diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> > index 44b1231157..541538bc6c 100644
> > --- a/hw/i386/intel_iommu.c
> > +++ b/hw/i386/intel_iommu.c
> > @@ -3388,39 +3388,34 @@ static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n)
> >      }
> >  
> >      assert(start <= end);
> > -    size = end - start;
> > +    while (end > start) {
> > +        size = end - start;
> > +        /* Only keep the lowest bit of either size or start.  */
> > +        size = MIN(size & -size, start & -start);
> 
> I feel like this can be problematic.  I'm imaging:
> 
> start=0x1000_0000, size=0x1000_1000
> 
> This will get size=0x1000 but actually we can do size=0x1000_0000 as
> the first.
> 
> > +        /* Should not happen, but limit to address width too just in case */
> > +        size = MIN(size, 1ULL << s->aw_bits);
> >  
> > -    if (ctpop64(size) != 1) {
> > -        /*
> > -         * This size cannot format a correct mask. Let's enlarge it to
> > -         * suite the minimum available mask.
> > -         */
> > -        int n = 64 - clz64(size);
> > -        if (n > s->aw_bits) {
> > -            /* should not happen, but in case it happens, limit it */
> > -            n = s->aw_bits;
> > -        }
> > -        size = 1ULL << n;
> > -    }
> > +        assert((start & (size - 1)) == 0);
> >  
> > -    entry.target_as = &address_space_memory;
> > -    /* Adjust iova for the size */
> > -    entry.iova = n->start & ~(size - 1);
> > -    /* This field is meaningless for unmap */
> > -    entry.translated_addr = 0;
> > -    entry.perm = IOMMU_NONE;
> > -    entry.addr_mask = size - 1;
> > +        entry.target_as = &address_space_memory;
> > +        entry.iova = start;
> > +        /* This field is meaningless for unmap */
> > +        entry.translated_addr = 0;
> > +        entry.perm = IOMMU_NONE;
> > +        entry.addr_mask = size - 1;
> 
> (some of the fields can be moved out of loop because they are
>  constants)
> 
> >  
> > -    trace_vtd_as_unmap_whole(pci_bus_num(as->bus),
> > -                             VTD_PCI_SLOT(as->devfn),
> > -                             VTD_PCI_FUNC(as->devfn),
> > -                             entry.iova, size);
> > +        trace_vtd_as_unmap_whole(pci_bus_num(as->bus),
> > +                                 VTD_PCI_SLOT(as->devfn),
> > +                                 VTD_PCI_FUNC(as->devfn),
> > +                                 entry.iova, size);
> 
> Can move this out because this is a trace only so we don't have
> restriction on mask?
> 
> >  
> > -    map.iova = entry.iova;
> > -    map.size = entry.addr_mask;
> > -    iova_tree_remove(as->iova_tree, &map);
> > +        map.iova = entry.iova;
> > +        map.size = entry.addr_mask;
> > +        iova_tree_remove(as->iova_tree, &map);
> 
> Same here?
> 
> >  
> > -    memory_region_notify_one(n, &entry);
> > +        memory_region_notify_one(n, &entry);
> > +        start += size;
> > +    }
> >  }
> >  
> >  static void vtd_address_space_unmap_all(IntelIOMMUState *s)
> > 
> > 
> > Yan,
> > 
> > if something like this works for you, let me know and I will submit it
> > as a proper patch.
> > 
> > Paolo
> 
> Since during review I'm thinking how to generate a correct sequence of
> these masks... here's my try below with above issues fixed... :)
> 
> I've tried compile but not tested.  Yan can test it, or I can do it
> too tomorrow after I find some machines.
> 
> Thanks,
>
hi Peter and Paolo,
I can help test it next week. Didn't manage to find a time for it today.

Thanks
Yan

> ------------------------------------------------------------
> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> index 44b1231157..cfbd225f0a 100644
> --- a/hw/i386/intel_iommu.c
> +++ b/hw/i386/intel_iommu.c
> @@ -3363,11 +3363,32 @@ VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus, int devfn)
>      return vtd_dev_as;
>  }
> 
> +static uint64_t vtd_get_next_mask(uint64_t start, uint64_t size, int gaw)
> +{
> +    /* Tries to find smallest mask from start first */
> +    uint64_t rmask = start & -start, max_mask = 1ULL << gaw;
> +
> +    assert(size && gaw > 0 && gaw < 64);
> +
> +    /* Zero start, or too big */
> +    if (!rmask || rmask > max_mask) {
> +        rmask = max_mask;
> +    }
> +
> +    /* If the start mask worked, then use it */
> +    if (rmask <= size) {
> +        return rmask;
> +    }
> +
> +    /* Find the largest page mask from size */
> +    return 1ULL << (63 - clz64(size));
> +}
> +
>  /* Unmap the whole range in the notifier's scope. */
>  static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n)
>  {
>      IOMMUTLBEntry entry;
> -    hwaddr size;
> +    hwaddr size, remain;
>      hwaddr start = n->start;
>      hwaddr end = n->end;
>      IntelIOMMUState *s = as->iommu_state;
> @@ -3388,39 +3409,28 @@ static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n)
>      }
> 
>      assert(start <= end);
> -    size = end - start;
> -
> -    if (ctpop64(size) != 1) {
> -        /*
> -         * This size cannot format a correct mask. Let's enlarge it to
> -         * suite the minimum available mask.
> -         */
> -        int n = 64 - clz64(size);
> -        if (n > s->aw_bits) {
> -            /* should not happen, but in case it happens, limit it */
> -            n = s->aw_bits;
> -        }
> -        size = 1ULL << n;
> -    }
> -
> +    size = remain = end - start;
>      entry.target_as = &address_space_memory;
> -    /* Adjust iova for the size */
> -    entry.iova = n->start & ~(size - 1);
> +    entry.perm = IOMMU_NONE;
>      /* This field is meaningless for unmap */
>      entry.translated_addr = 0;
> -    entry.perm = IOMMU_NONE;
> -    entry.addr_mask = size - 1;
> +
> +    while (remain) {
> +        uint64_t mask = vtd_get_next_mask(start, remain, s->aw_bits);
> +
> +        entry.iova = start;
> +        entry.addr_mask = mask - 1;
> +        memory_region_notify_one(n, &entry);
> +    }
> 
>      trace_vtd_as_unmap_whole(pci_bus_num(as->bus),
>                               VTD_PCI_SLOT(as->devfn),
>                               VTD_PCI_FUNC(as->devfn),
> -                             entry.iova, size);
> +                             n->start, size);
> 
> -    map.iova = entry.iova;
> -    map.size = entry.addr_mask;
> +    map.iova = n->start;
> +    map.size = size;
>      iova_tree_remove(as->iova_tree, &map);
> -
> -    memory_region_notify_one(n, &entry);
>  }
> 
>  static void vtd_address_space_unmap_all(IntelIOMMUState *s)
> ------------------------------------------------------------
> 
> Regards,
> 
> -- 
> Peter Xu


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH] memory: do not do out of bound notification
  2019-06-20 13:04         ` Peter Xu
@ 2019-06-24  5:22           ` Yan Zhao
  2019-06-24  6:14             ` Peter Xu
  0 siblings, 1 reply; 17+ messages in thread
From: Yan Zhao @ 2019-06-24  5:22 UTC (permalink / raw)
  To: Peter Xu; +Cc: Paolo Bonzini, qemu-devel, Auger Eric

On Thu, Jun 20, 2019 at 09:04:43PM +0800, Peter Xu wrote:
> On Thu, Jun 20, 2019 at 08:59:55PM +0800, Peter Xu wrote:
> > On Thu, Jun 20, 2019 at 10:35:29AM +0200, Paolo Bonzini wrote:
> > > On 20/06/19 06:02, Peter Xu wrote:
> > > > Seems workable, to be explicit - we can even cut it into chunks with
> > > > different size to be efficient.
> > > 
> > > Yes, this is not hard (completely untested):
> > > 
> > > diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> > > index 44b1231157..541538bc6c 100644
> > > --- a/hw/i386/intel_iommu.c
> > > +++ b/hw/i386/intel_iommu.c
> > > @@ -3388,39 +3388,34 @@ static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n)
> > >      }
> > >  
> > >      assert(start <= end);
> > > -    size = end - start;
> > > +    while (end > start) {
> > > +        size = end - start;
> > > +        /* Only keep the lowest bit of either size or start.  */
> > > +        size = MIN(size & -size, start & -start);
> > 
> > I feel like this can be problematic.  I'm imaging:
> > 
> > start=0x1000_0000, size=0x1000_1000
> > 
> > This will get size=0x1000 but actually we can do size=0x1000_0000 as
> > the first.
> > 
> > > +        /* Should not happen, but limit to address width too just in case */
> > > +        size = MIN(size, 1ULL << s->aw_bits);
> > >  
> > > -    if (ctpop64(size) != 1) {
> > > -        /*
> > > -         * This size cannot format a correct mask. Let's enlarge it to
> > > -         * suite the minimum available mask.
> > > -         */
> > > -        int n = 64 - clz64(size);
> > > -        if (n > s->aw_bits) {
> > > -            /* should not happen, but in case it happens, limit it */
> > > -            n = s->aw_bits;
> > > -        }
> > > -        size = 1ULL << n;
> > > -    }
> > > +        assert((start & (size - 1)) == 0);
> > >  
> > > -    entry.target_as = &address_space_memory;
> > > -    /* Adjust iova for the size */
> > > -    entry.iova = n->start & ~(size - 1);
> > > -    /* This field is meaningless for unmap */
> > > -    entry.translated_addr = 0;
> > > -    entry.perm = IOMMU_NONE;
> > > -    entry.addr_mask = size - 1;
> > > +        entry.target_as = &address_space_memory;
> > > +        entry.iova = start;
> > > +        /* This field is meaningless for unmap */
> > > +        entry.translated_addr = 0;
> > > +        entry.perm = IOMMU_NONE;
> > > +        entry.addr_mask = size - 1;
> > 
> > (some of the fields can be moved out of loop because they are
> >  constants)
> > 
> > >  
> > > -    trace_vtd_as_unmap_whole(pci_bus_num(as->bus),
> > > -                             VTD_PCI_SLOT(as->devfn),
> > > -                             VTD_PCI_FUNC(as->devfn),
> > > -                             entry.iova, size);
> > > +        trace_vtd_as_unmap_whole(pci_bus_num(as->bus),
> > > +                                 VTD_PCI_SLOT(as->devfn),
> > > +                                 VTD_PCI_FUNC(as->devfn),
> > > +                                 entry.iova, size);
> > 
> > Can move this out because this is a trace only so we don't have
> > restriction on mask?
> > 
> > >  
> > > -    map.iova = entry.iova;
> > > -    map.size = entry.addr_mask;
> > > -    iova_tree_remove(as->iova_tree, &map);
> > > +        map.iova = entry.iova;
> > > +        map.size = entry.addr_mask;
> > > +        iova_tree_remove(as->iova_tree, &map);
> > 
> > Same here?
> > 
> > >  
> > > -    memory_region_notify_one(n, &entry);
> > > +        memory_region_notify_one(n, &entry);
> > > +        start += size;
> > > +    }
> > >  }
> > >  
> > >  static void vtd_address_space_unmap_all(IntelIOMMUState *s)
> > > 
> > > 
> > > Yan,
> > > 
> > > if something like this works for you, let me know and I will submit it
> > > as a proper patch.
> > > 
> > > Paolo
> > 
> > Since during review I'm thinking how to generate a correct sequence of
> > these masks... here's my try below with above issues fixed... :)
> > 
> > I've tried compile but not tested.  Yan can test it, or I can do it
> > too tomorrow after I find some machines.
> > 
> > Thanks,
> > 
> > ------------------------------------------------------------
> > diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> > index 44b1231157..cfbd225f0a 100644
> > --- a/hw/i386/intel_iommu.c
> > +++ b/hw/i386/intel_iommu.c
> > @@ -3363,11 +3363,32 @@ VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus, int devfn)
> >      return vtd_dev_as;
> >  }
> > 
> > +static uint64_t vtd_get_next_mask(uint64_t start, uint64_t size, int gaw)
> > +{
> > +    /* Tries to find smallest mask from start first */
> > +    uint64_t rmask = start & -start, max_mask = 1ULL << gaw;
> > +
> > +    assert(size && gaw > 0 && gaw < 64);
> > +
> > +    /* Zero start, or too big */
> > +    if (!rmask || rmask > max_mask) {
> > +        rmask = max_mask;
> > +    }
> > +
> > +    /* If the start mask worked, then use it */
> > +    if (rmask <= size) {
> > +        return rmask;
> > +    }
> > +
> > +    /* Find the largest page mask from size */
> > +    return 1ULL << (63 - clz64(size));
> > +}
> > +
> >  /* Unmap the whole range in the notifier's scope. */
> >  static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n)
> >  {
> >      IOMMUTLBEntry entry;
> > -    hwaddr size;
> > +    hwaddr size, remain;
> >      hwaddr start = n->start;
> >      hwaddr end = n->end;
> >      IntelIOMMUState *s = as->iommu_state;
> > @@ -3388,39 +3409,28 @@ static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n)
> >      }
> > 
> >      assert(start <= end);
> > -    size = end - start;
> > -
> > -    if (ctpop64(size) != 1) {
> > -        /*
> > -         * This size cannot format a correct mask. Let's enlarge it to
> > -         * suite the minimum available mask.
> > -         */
> > -        int n = 64 - clz64(size);
> > -        if (n > s->aw_bits) {
> > -            /* should not happen, but in case it happens, limit it */
> > -            n = s->aw_bits;
> > -        }
> > -        size = 1ULL << n;
> > -    }
> > -
> > +    size = remain = end - start;
> >      entry.target_as = &address_space_memory;
> > -    /* Adjust iova for the size */
> > -    entry.iova = n->start & ~(size - 1);
> > +    entry.perm = IOMMU_NONE;
> >      /* This field is meaningless for unmap */
> >      entry.translated_addr = 0;
> > -    entry.perm = IOMMU_NONE;
> > -    entry.addr_mask = size - 1;
> > +
> > +    while (remain) {
> > +        uint64_t mask = vtd_get_next_mask(start, remain, s->aw_bits);
> > +
> > +        entry.iova = start;
> > +        entry.addr_mask = mask - 1;
> > +        memory_region_notify_one(n, &entry);
> 
> Sorry, I at least missed these lines:
> 
>            start += mask;
>            remain -= mask;
> 
> > +    }
> > 
> >      trace_vtd_as_unmap_whole(pci_bus_num(as->bus),
> >                               VTD_PCI_SLOT(as->devfn),
> >                               VTD_PCI_FUNC(as->devfn),
> > -                             entry.iova, size);
> > +                             n->start, size);
> > 
> > -    map.iova = entry.iova;
> > -    map.size = entry.addr_mask;
> > +    map.iova = n->start;
> > +    map.size = size;
> >      iova_tree_remove(as->iova_tree, &map);
> > -
> > -    memory_region_notify_one(n, &entry);
> >  }
> > 
> >  static void vtd_address_space_unmap_all(IntelIOMMUState *s)
> > ------------------------------------------------------------
> > 
> > Regards,
> > 
> > -- 
> > Peter Xu
> 
> Regards,
> 
> -- 
> Peter Xu

hi Peter and Paolo,
I tested with code and it's fine in my side.
It's base on your version with some minor modifications, such as size is
now  (end - start + 1) now.
Thanks
Yan

+static uint64_t vtd_get_next_mask(uint64_t start, uint64_t size, int gaw)
+{
+    /* Tries to find smallest mask from start first */
+    uint64_t rmask = start & -start, max_mask = 1ULL << gaw;
+    assert(size && gaw > 0 && gaw < 64);
+    /* Zero start, or too big */
+    if (!rmask || rmask > max_mask) {
+        rmask = max_mask;
+    }
+    /* If the start mask worked, then use it */
+    if (rmask <= size) {
+        return rmask;
+    }
+
+    /* Find the largest page mask from size */
+    return 1ULL << (63 - clz64(size));
+}
+
 /* Unmap the whole range in the notifier's scope. */
 static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n)
 {
     IOMMUTLBEntry entry;
-    hwaddr size;
+    hwaddr size, remain;
     hwaddr start = n->start;
     hwaddr end = n->end;
     IntelIOMMUState *s = as->iommu_state;
@@ -3380,48 +3398,46 @@ static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n)
      * VT-d spec), otherwise we need to consider overflow of 64 bits.
      */

-    if (end > VTD_ADDRESS_SIZE(s->aw_bits)) {
+    if (end > VTD_ADDRESS_SIZE(s->aw_bits) - 1) {
         /*
          * Don't need to unmap regions that is bigger than the whole
          * VT-d supported address space size
          */
-        end = VTD_ADDRESS_SIZE(s->aw_bits);
+        end = VTD_ADDRESS_SIZE(s->aw_bits) - 1;
     }

     assert(start <= end);
-    size = end - start;

-    if (ctpop64(size) != 1) {
-        /*
-         * This size cannot format a correct mask. Let's enlarge it to
-         * suite the minimum available mask.
-         */
-        int n = 64 - clz64(size);
-        if (n > s->aw_bits) {
-            /* should not happen, but in case it happens, limit it */
-            n = s->aw_bits;
-        }
-        size = 1ULL << n;
-    }
+    size = remain = end - start + 1;

     entry.target_as = &address_space_memory;
-    /* Adjust iova for the size */
-    entry.iova = n->start & ~(size - 1);
+
+    entry.perm = IOMMU_NONE;
     /* This field is meaningless for unmap */
     entry.translated_addr = 0;
-    entry.perm = IOMMU_NONE;
-    entry.addr_mask = size - 1;
+
+    while (remain >= VTD_PAGE_SIZE) {
+        uint64_t mask = vtd_get_next_mask(start, remain, s->aw_bits);
+
+        entry.iova = start;
+        entry.addr_mask = mask - 1;
+        memory_region_notify_one(n, &entry);
+        start += mask;
+        remain -= mask;
+    }
+
+    if (remain) {
+        warn_report("Unmapping unaligned range %lx-%lx", start, end);
+    }

     trace_vtd_as_unmap_whole(pci_bus_num(as->bus),
                              VTD_PCI_SLOT(as->devfn),
                              VTD_PCI_FUNC(as->devfn),
-                             entry.iova, size);
-
-    map.iova = entry.iova;
-    map.size = entry.addr_mask;
+                             n->start, size);
+    map.iova = n->start;
+    map.size = size;
     iova_tree_remove(as->iova_tree, &map);

-    memory_region_notify_one(n, &entry);
 }

 static void vtd_address_space_unmap_all(IntelIOMMUState *s)
--
2.7.4





^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH] memory: do not do out of bound notification
  2019-06-24  5:22           ` Yan Zhao
@ 2019-06-24  6:14             ` Peter Xu
  0 siblings, 0 replies; 17+ messages in thread
From: Peter Xu @ 2019-06-24  6:14 UTC (permalink / raw)
  To: Yan Zhao; +Cc: Paolo Bonzini, qemu-devel, Auger Eric

On Mon, Jun 24, 2019 at 01:22:55AM -0400, Yan Zhao wrote:
> On Thu, Jun 20, 2019 at 09:04:43PM +0800, Peter Xu wrote:
> > On Thu, Jun 20, 2019 at 08:59:55PM +0800, Peter Xu wrote:
> > > On Thu, Jun 20, 2019 at 10:35:29AM +0200, Paolo Bonzini wrote:
> > > > On 20/06/19 06:02, Peter Xu wrote:
> > > > > Seems workable, to be explicit - we can even cut it into chunks with
> > > > > different size to be efficient.
> > > > 
> > > > Yes, this is not hard (completely untested):
> > > > 
> > > > diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> > > > index 44b1231157..541538bc6c 100644
> > > > --- a/hw/i386/intel_iommu.c
> > > > +++ b/hw/i386/intel_iommu.c
> > > > @@ -3388,39 +3388,34 @@ static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n)
> > > >      }
> > > >  
> > > >      assert(start <= end);
> > > > -    size = end - start;
> > > > +    while (end > start) {
> > > > +        size = end - start;
> > > > +        /* Only keep the lowest bit of either size or start.  */
> > > > +        size = MIN(size & -size, start & -start);
> > > 
> > > I feel like this can be problematic.  I'm imaging:
> > > 
> > > start=0x1000_0000, size=0x1000_1000
> > > 
> > > This will get size=0x1000 but actually we can do size=0x1000_0000 as
> > > the first.
> > > 
> > > > +        /* Should not happen, but limit to address width too just in case */
> > > > +        size = MIN(size, 1ULL << s->aw_bits);
> > > >  
> > > > -    if (ctpop64(size) != 1) {
> > > > -        /*
> > > > -         * This size cannot format a correct mask. Let's enlarge it to
> > > > -         * suite the minimum available mask.
> > > > -         */
> > > > -        int n = 64 - clz64(size);
> > > > -        if (n > s->aw_bits) {
> > > > -            /* should not happen, but in case it happens, limit it */
> > > > -            n = s->aw_bits;
> > > > -        }
> > > > -        size = 1ULL << n;
> > > > -    }
> > > > +        assert((start & (size - 1)) == 0);
> > > >  
> > > > -    entry.target_as = &address_space_memory;
> > > > -    /* Adjust iova for the size */
> > > > -    entry.iova = n->start & ~(size - 1);
> > > > -    /* This field is meaningless for unmap */
> > > > -    entry.translated_addr = 0;
> > > > -    entry.perm = IOMMU_NONE;
> > > > -    entry.addr_mask = size - 1;
> > > > +        entry.target_as = &address_space_memory;
> > > > +        entry.iova = start;
> > > > +        /* This field is meaningless for unmap */
> > > > +        entry.translated_addr = 0;
> > > > +        entry.perm = IOMMU_NONE;
> > > > +        entry.addr_mask = size - 1;
> > > 
> > > (some of the fields can be moved out of loop because they are
> > >  constants)
> > > 
> > > >  
> > > > -    trace_vtd_as_unmap_whole(pci_bus_num(as->bus),
> > > > -                             VTD_PCI_SLOT(as->devfn),
> > > > -                             VTD_PCI_FUNC(as->devfn),
> > > > -                             entry.iova, size);
> > > > +        trace_vtd_as_unmap_whole(pci_bus_num(as->bus),
> > > > +                                 VTD_PCI_SLOT(as->devfn),
> > > > +                                 VTD_PCI_FUNC(as->devfn),
> > > > +                                 entry.iova, size);
> > > 
> > > Can move this out because this is a trace only so we don't have
> > > restriction on mask?
> > > 
> > > >  
> > > > -    map.iova = entry.iova;
> > > > -    map.size = entry.addr_mask;
> > > > -    iova_tree_remove(as->iova_tree, &map);
> > > > +        map.iova = entry.iova;
> > > > +        map.size = entry.addr_mask;
> > > > +        iova_tree_remove(as->iova_tree, &map);
> > > 
> > > Same here?
> > > 
> > > >  
> > > > -    memory_region_notify_one(n, &entry);
> > > > +        memory_region_notify_one(n, &entry);
> > > > +        start += size;
> > > > +    }
> > > >  }
> > > >  
> > > >  static void vtd_address_space_unmap_all(IntelIOMMUState *s)
> > > > 
> > > > 
> > > > Yan,
> > > > 
> > > > if something like this works for you, let me know and I will submit it
> > > > as a proper patch.
> > > > 
> > > > Paolo
> > > 
> > > Since during review I'm thinking how to generate a correct sequence of
> > > these masks... here's my try below with above issues fixed... :)
> > > 
> > > I've tried compile but not tested.  Yan can test it, or I can do it
> > > too tomorrow after I find some machines.
> > > 
> > > Thanks,
> > > 
> > > ------------------------------------------------------------
> > > diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> > > index 44b1231157..cfbd225f0a 100644
> > > --- a/hw/i386/intel_iommu.c
> > > +++ b/hw/i386/intel_iommu.c
> > > @@ -3363,11 +3363,32 @@ VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus, int devfn)
> > >      return vtd_dev_as;
> > >  }
> > > 
> > > +static uint64_t vtd_get_next_mask(uint64_t start, uint64_t size, int gaw)
> > > +{
> > > +    /* Tries to find smallest mask from start first */
> > > +    uint64_t rmask = start & -start, max_mask = 1ULL << gaw;
> > > +
> > > +    assert(size && gaw > 0 && gaw < 64);
> > > +
> > > +    /* Zero start, or too big */
> > > +    if (!rmask || rmask > max_mask) {
> > > +        rmask = max_mask;
> > > +    }
> > > +
> > > +    /* If the start mask worked, then use it */
> > > +    if (rmask <= size) {
> > > +        return rmask;
> > > +    }
> > > +
> > > +    /* Find the largest page mask from size */
> > > +    return 1ULL << (63 - clz64(size));
> > > +}
> > > +
> > >  /* Unmap the whole range in the notifier's scope. */
> > >  static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n)
> > >  {
> > >      IOMMUTLBEntry entry;
> > > -    hwaddr size;
> > > +    hwaddr size, remain;
> > >      hwaddr start = n->start;
> > >      hwaddr end = n->end;
> > >      IntelIOMMUState *s = as->iommu_state;
> > > @@ -3388,39 +3409,28 @@ static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n)
> > >      }
> > > 
> > >      assert(start <= end);
> > > -    size = end - start;
> > > -
> > > -    if (ctpop64(size) != 1) {
> > > -        /*
> > > -         * This size cannot format a correct mask. Let's enlarge it to
> > > -         * suite the minimum available mask.
> > > -         */
> > > -        int n = 64 - clz64(size);
> > > -        if (n > s->aw_bits) {
> > > -            /* should not happen, but in case it happens, limit it */
> > > -            n = s->aw_bits;
> > > -        }
> > > -        size = 1ULL << n;
> > > -    }
> > > -
> > > +    size = remain = end - start;
> > >      entry.target_as = &address_space_memory;
> > > -    /* Adjust iova for the size */
> > > -    entry.iova = n->start & ~(size - 1);
> > > +    entry.perm = IOMMU_NONE;
> > >      /* This field is meaningless for unmap */
> > >      entry.translated_addr = 0;
> > > -    entry.perm = IOMMU_NONE;
> > > -    entry.addr_mask = size - 1;
> > > +
> > > +    while (remain) {
> > > +        uint64_t mask = vtd_get_next_mask(start, remain, s->aw_bits);
> > > +
> > > +        entry.iova = start;
> > > +        entry.addr_mask = mask - 1;
> > > +        memory_region_notify_one(n, &entry);
> > 
> > Sorry, I at least missed these lines:
> > 
> >            start += mask;
> >            remain -= mask;
> > 
> > > +    }
> > > 
> > >      trace_vtd_as_unmap_whole(pci_bus_num(as->bus),
> > >                               VTD_PCI_SLOT(as->devfn),
> > >                               VTD_PCI_FUNC(as->devfn),
> > > -                             entry.iova, size);
> > > +                             n->start, size);
> > > 
> > > -    map.iova = entry.iova;
> > > -    map.size = entry.addr_mask;
> > > +    map.iova = n->start;
> > > +    map.size = size;
> > >      iova_tree_remove(as->iova_tree, &map);
> > > -
> > > -    memory_region_notify_one(n, &entry);
> > >  }
> > > 
> > >  static void vtd_address_space_unmap_all(IntelIOMMUState *s)
> > > ------------------------------------------------------------
> > > 
> > > Regards,
> > > 
> > > -- 
> > > Peter Xu
> > 
> > Regards,
> > 
> > -- 
> > Peter Xu
> 
> hi Peter and Paolo,
> I tested with code and it's fine in my side.
> It's base on your version with some minor modifications, such as size is
> now  (end - start + 1) now.
> Thanks
> Yan

Hi, Yan,

Thanks for testing the patches.  I think below change [1] is not
related to the problem so I tend to split it out.  For [2] I'll change
to an assertion if you won't disagree.  I'll reorganize the patches
and post a formal version with proper authorships soon.

Thanks,

> 
> +static uint64_t vtd_get_next_mask(uint64_t start, uint64_t size, int gaw)
> +{
> +    /* Tries to find smallest mask from start first */
> +    uint64_t rmask = start & -start, max_mask = 1ULL << gaw;
> +    assert(size && gaw > 0 && gaw < 64);
> +    /* Zero start, or too big */
> +    if (!rmask || rmask > max_mask) {
> +        rmask = max_mask;
> +    }
> +    /* If the start mask worked, then use it */
> +    if (rmask <= size) {
> +        return rmask;
> +    }
> +
> +    /* Find the largest page mask from size */
> +    return 1ULL << (63 - clz64(size));
> +}
> +
>  /* Unmap the whole range in the notifier's scope. */
>  static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n)
>  {
>      IOMMUTLBEntry entry;
> -    hwaddr size;
> +    hwaddr size, remain;
>      hwaddr start = n->start;
>      hwaddr end = n->end;
>      IntelIOMMUState *s = as->iommu_state;
> @@ -3380,48 +3398,46 @@ static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n)
>       * VT-d spec), otherwise we need to consider overflow of 64 bits.
>       */
> 
> -    if (end > VTD_ADDRESS_SIZE(s->aw_bits)) {
> +    if (end > VTD_ADDRESS_SIZE(s->aw_bits) - 1) {
>          /*
>           * Don't need to unmap regions that is bigger than the whole
>           * VT-d supported address space size
>           */
> -        end = VTD_ADDRESS_SIZE(s->aw_bits);
> +        end = VTD_ADDRESS_SIZE(s->aw_bits) - 1;

[1]

>      }
> 
>      assert(start <= end);
> -    size = end - start;
> 
> -    if (ctpop64(size) != 1) {
> -        /*
> -         * This size cannot format a correct mask. Let's enlarge it to
> -         * suite the minimum available mask.
> -         */
> -        int n = 64 - clz64(size);
> -        if (n > s->aw_bits) {
> -            /* should not happen, but in case it happens, limit it */
> -            n = s->aw_bits;
> -        }
> -        size = 1ULL << n;
> -    }
> +    size = remain = end - start + 1;
> 
>      entry.target_as = &address_space_memory;
> -    /* Adjust iova for the size */
> -    entry.iova = n->start & ~(size - 1);
> +
> +    entry.perm = IOMMU_NONE;
>      /* This field is meaningless for unmap */
>      entry.translated_addr = 0;
> -    entry.perm = IOMMU_NONE;
> -    entry.addr_mask = size - 1;
> +
> +    while (remain >= VTD_PAGE_SIZE) {
> +        uint64_t mask = vtd_get_next_mask(start, remain, s->aw_bits);
> +
> +        entry.iova = start;
> +        entry.addr_mask = mask - 1;
> +        memory_region_notify_one(n, &entry);
> +        start += mask;
> +        remain -= mask;
> +    }
> +
> +    if (remain) {
> +        warn_report("Unmapping unaligned range %lx-%lx", start, end);

[2]

> +    }
> 
>      trace_vtd_as_unmap_whole(pci_bus_num(as->bus),
>                               VTD_PCI_SLOT(as->devfn),
>                               VTD_PCI_FUNC(as->devfn),
> -                             entry.iova, size);
> -
> -    map.iova = entry.iova;
> -    map.size = entry.addr_mask;
> +                             n->start, size);
> +    map.iova = n->start;
> +    map.size = size;
>      iova_tree_remove(as->iova_tree, &map);
> 
> -    memory_region_notify_one(n, &entry);
>  }
> 
>  static void vtd_address_space_unmap_all(IntelIOMMUState *s)
> --
> 2.7.4
> 
> 
> 

Regards,

-- 
Peter Xu


^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2019-06-24  6:17 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-06-19  8:49 [Qemu-devel] [PATCH] memory: do not do out of bound notification Yan Zhao
2019-06-19 13:17 ` Auger Eric
2019-06-20  1:46   ` Yan Zhao
2019-06-20  4:02   ` Peter Xu
2019-06-20  4:14     ` Yan Zhao
2019-06-20  8:14       ` Peter Xu
2019-06-20  8:13         ` Yan Zhao
2019-06-20  8:35     ` Paolo Bonzini
2019-06-20 10:57       ` Yan Zhao
2019-06-20 12:04         ` Paolo Bonzini
2019-06-20 12:59       ` Peter Xu
2019-06-20 13:04         ` Peter Xu
2019-06-24  5:22           ` Yan Zhao
2019-06-24  6:14             ` Peter Xu
2019-06-20 13:14         ` Paolo Bonzini
2019-06-21  2:36           ` Peter Xu
2019-06-21  7:57         ` Yan Zhao

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).