All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC 0/1] memory: Delete assertion in memory_region_unregister_iommu_notifier
@ 2020-06-25 19:16 Eugenio Pérez
  2020-06-25 19:16 ` [RFC 1/1] " Eugenio Pérez
                   ` (3 more replies)
  0 siblings, 4 replies; 38+ messages in thread
From: Eugenio Pérez @ 2020-06-25 19:16 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Maydell, Juan Quintela, Jason Wang, Peter Xu, Avi Kivity,
	Paolo Bonzini

I am able to hit this assertion when a Red Hat 7 guest virtio_net device
raises an "Invalidation" of all the TLB entries. This happens in the
guest's startup if 'intel_iommu=on' argument is passed to the guest
kernel and right IOMMU/ATS devices are declared in qemu's command line.

If we examinate *entry in frame 4 of backtrace:
*entry = {target_as = 0x555556f6c050, iova = 0x0, translated_addr = 0x0,
addr_mask = 0xffffffffffffffff, perm = 0x0}

Which (I think) tries to invalidate all the TLB registers of the device.

Just deleting that assert is enough for the VM to start and communicate
using IOMMU, but maybe a better alternative is possible. We could move
it to the caller functions in other cases than IOMMU invalidation, or
make it conditional only if not invalidating.

Any comment would be appreciated. Thanks!

--

Guest kernel version: kernel-3.10.0-1151.el7.x86_64

Command line:
/home/qemu/x86_64-softmmu/qemu-system-x86_64 -name \
guest=rhel7-test,debug-threads=on -machine \
pc-q35-5.1,accel=kvm,usb=off,dump-guest-core=off,kernel_irqchip=split \
-cpu \
Broadwell,vme=on,ss=on,vmx=on,f16c=on,rdrand=on,hypervisor=on,arat=on,tsc-adjust=on,umip=on,arch-capabilities=on,xsaveopt=on,pdpe1gb=on,abm=on,skip-l1dfl-vmentry=on,rtm=on,hle=on \
-m 8096 -realtime mlock=off -smp 2,sockets=2,cores=1,threads=1 -uuid \
d022ecbf-679e-4755-87ce-eb87fc5bbc5d -display none -no-user-config \
-nodefaults -rtc base=utc,driftfix=slew -global \
kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -global \
ICH9-LPC.disable_s3=1 -global ICH9-LPC.disable_s4=1 -boot strict=on \
-device intel-iommu,intremap=on,device-iotlb=on -device \
pcie-root-port,port=0x8,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x1 \
-device \
pcie-root-port,port=0x9,chassis=2,id=pci.2,bus=pcie.0,addr=0x1.0x1 \
-device \
pcie-root-port,port=0xa,chassis=3,id=pci.3,bus=pcie.0,addr=0x1.0x2 \
-device \
pcie-root-port,port=0xb,chassis=4,id=pci.4,bus=pcie.0,addr=0x1.0x3 \
-device \
pcie-root-port,port=0xc,chassis=5,id=pci.5,bus=pcie.0,addr=0x1.0x4 \
-device \
pcie-root-port,port=0xd,chassis=6,id=pci.6,bus=pcie.0,addr=0x1.0x5 \
-device \
pcie-root-port,port=0xe,chassis=7,id=pci.7,bus=pcie.0,addr=0x1.0x6 \
-device qemu-xhci,p2=15,p3=15,id=usb,bus=pci.2,addr=0x0 -device \
virtio-serial-pci,id=virtio-serial0,bus=pci.3,addr=0x0 -drive \
file=/home/virtio-test2.qcow2,format=qcow2,if=none,id=drive-virtio-disk0 \
-device \
virtio-blk-pci,scsi=off,bus=pci.4,addr=0x0,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 \
-netdev tap,id=hostnet0,vhost=on,vhostforce=on -device \
virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:0d:1d:f2,bus=pci.1,addr=0x0,iommu_platform=on,ats=on \
-device virtio-balloon-pci,id=balloon0,bus=pci.5,addr=0x0 -object \
rng-random,id=objrng0,filename=/dev/urandom -device \
virtio-rng-pci,rng=objrng0,id=rng0,bus=pci.6,addr=0x0 -s -msg \
timestamp=on

Full backtrace:

#0 0x00007ffff521370f in raise () at /lib64/libc.so.6
#1 0x00007ffff51fdb25 in abort () at /lib64/libc.so.6
#2 0x00007ffff51fd9f9 in _nl_load_domain.cold.0 () at /lib64/libc.so.6
#3 0x00007ffff520bcc6 in .annobin_assert.c_end () at /lib64/libc.so.6
#4 0x0000555555888171 in memory_region_notify_one (notifier=0x7ffde05dfde8, entry=0x7ffde5dfe200) at /home/qemu/memory.c:1918
#5 0x0000555555888247 in memory_region_notify_iommu (iommu_mr=0x555556f6c0b0, iommu_idx=0, entry=...) at /home/qemu/memory.c:1941
#6 0x0000555555951c8d in vtd_process_device_iotlb_desc (s=0x555557609000, inv_desc=0x7ffde5dfe2d0)
    at /home/qemu/hw/i386/intel_iommu.c:2468
#7 0x0000555555951e6a in vtd_process_inv_desc (s=0x555557609000) at /home/qemu/hw/i386/intel_iommu.c:2531
#8 0x0000555555951fa5 in vtd_fetch_inv_desc (s=0x555557609000) at /home/qemu/hw/i386/intel_iommu.c:2563
#9 0x00005555559520e5 in vtd_handle_iqt_write (s=0x555557609000) at /home/qemu/hw/i386/intel_iommu.c:2590
#10 0x0000555555952b45 in vtd_mem_write (opaque=0x555557609000, addr=136, val=2688, size=4) at /home/qemu/hw/i386/intel_iommu.c:2837
#11 0x0000555555883e17 in memory_region_write_accessor
    (mr=0x555557609330, addr=136, value=0x7ffde5dfe478, size=4, shift=0, mask=4294967295, attrs=...) at /home/qemu/memory.c:483
#12 0x000055555588401d in access_with_adjusted_size
    (addr=136, value=0x7ffde5dfe478, size=4, access_size_min=4, access_size_max=8, access_fn=
    0x555555883d38 <memory_region_write_accessor>, mr=0x555557609330, attrs=...) at /home/qemu/memory.c:544
#13 0x0000555555886f37 in memory_region_dispatch_write (mr=0x555557609330, addr=136, data=2688, op=MO_32, attrs=...)
    at /home/qemu/memory.c:1476
#14 0x0000555555827a03 in flatview_write_continue
    (fv=0x7ffde00935d0, addr=4275634312, attrs=..., ptr=0x7ffff7ff0028, len=4, addr1=136, l=4, mr=0x555557609330) at /home/qemu/exec.c:3146
#15 0x0000555555827b48 in flatview_write (fv=0x7ffde00935d0, addr=4275634312, attrs=..., buf=0x7ffff7ff0028, len=4)
    at /home/qemu/exec.c:3186
#16 0x0000555555827e9d in address_space_write
    (as=0x5555567ca640 <address_space_memory>, addr=4275634312, attrs=..., buf=0x7ffff7ff0028, len=4) at /home/qemu/exec.c:3277
#17 0x0000555555827f0a in address_space_rw
    (as=0x5555567ca640 <address_space_memory>, addr=4275634312, attrs=..., buf=0x7ffff7ff0028, len=4, is_write=true)
    at /home/qemu/exec.c:3287
#18 0x000055555589b633 in kvm_cpu_exec (cpu=0x555556b65640) at /home/qemu/accel/kvm/kvm-all.c:2511
#19 0x0000555555876ba8 in qemu_kvm_cpu_thread_fn (arg=0x555556b65640) at /home/qemu/cpus.c:1284
#20 0x0000555555dafff1 in qemu_thread_start (args=0x555556b8c3b0) at util/qemu-thread-posix.c:521
#21 0x00007ffff55a62de in start_thread () at /lib64/libpthread.so.0
#22 0x00007ffff52d7e83 in clone () at /lib64/libc.so.6

--

Eugenio Pérez (1):
  memory: Delete assertion in memory_region_unregister_iommu_notifier

 memory.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

-- 
2.18.1



^ permalink raw reply	[flat|nested] 38+ messages in thread

* [RFC 1/1] memory: Delete assertion in memory_region_unregister_iommu_notifier
  2020-06-25 19:16 [RFC 0/1] memory: Delete assertion in memory_region_unregister_iommu_notifier Eugenio Pérez
@ 2020-06-25 19:16 ` Eugenio Pérez
  2020-06-25 19:29 ` [RFC 0/1] " no-reply
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 38+ messages in thread
From: Eugenio Pérez @ 2020-06-25 19:16 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Maydell, Juan Quintela, Jason Wang, Peter Xu, Avi Kivity,
	Paolo Bonzini

Bug reference: https://bugs.launchpad.net/qemu/+bug/1885175

It is possible to hit this assertion on rhel7 guests if iommu is
properly enabled.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 memory.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/memory.c b/memory.c
index 2f15a4b250..4db609b843 100644
--- a/memory.c
+++ b/memory.c
@@ -1915,7 +1915,7 @@ void memory_region_notify_one(IOMMUNotifier *notifier,
         return;
     }
 
-    assert(entry->iova >= notifier->start && entry_end <= notifier->end);
+    // assert(entry->iova >= notifier->start && entry_end <= notifier->end);
 
     if (entry->perm & IOMMU_RW) {
         request_flags = IOMMU_NOTIFIER_MAP;
-- 
2.18.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [RFC 0/1] memory: Delete assertion in memory_region_unregister_iommu_notifier
  2020-06-25 19:16 [RFC 0/1] memory: Delete assertion in memory_region_unregister_iommu_notifier Eugenio Pérez
  2020-06-25 19:16 ` [RFC 1/1] " Eugenio Pérez
@ 2020-06-25 19:29 ` no-reply
  2020-08-26 14:36 ` [RFC v6 00/13] " Eugenio Pérez
  2020-09-03 16:14 ` [PATCH 0/5] memory: Skip " Eugenio Pérez
  3 siblings, 0 replies; 38+ messages in thread
From: no-reply @ 2020-06-25 19:29 UTC (permalink / raw)
  To: eperezma
  Cc: peter.maydell, quintela, jasowang, qemu-devel, peterx, avi, pbonzini

Patchew URL: https://patchew.org/QEMU/20200625191651.5817-1-eperezma@redhat.com/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Subject: [RFC 0/1] memory: Delete assertion in memory_region_unregister_iommu_notifier
Type: series
Message-id: 20200625191651.5817-1-eperezma@redhat.com

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

From https://github.com/patchew-project/qemu
 * [new tag]         patchew/20200625191651.5817-1-eperezma@redhat.com -> patchew/20200625191651.5817-1-eperezma@redhat.com
Switched to a new branch 'test'
3dcb120 memory: Delete assertion in memory_region_unregister_iommu_notifier

=== OUTPUT BEGIN ===
ERROR: do not use C99 // comments
#24: FILE: memory.c:1918:
+    // assert(entry->iova >= notifier->start && entry_end <= notifier->end);

total: 1 errors, 0 warnings, 8 lines checked

Commit 3dcb1204223a (memory: Delete assertion in memory_region_unregister_iommu_notifier) has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
=== OUTPUT END ===

Test command exited with code: 1


The full log is available at
http://patchew.org/logs/20200625191651.5817-1-eperezma@redhat.com/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [RFC v6 00/13] memory: Delete assertion in memory_region_unregister_iommu_notifier
  2020-06-25 19:16 [RFC 0/1] memory: Delete assertion in memory_region_unregister_iommu_notifier Eugenio Pérez
  2020-06-25 19:16 ` [RFC 1/1] " Eugenio Pérez
  2020-06-25 19:29 ` [RFC 0/1] " no-reply
@ 2020-08-26 14:36 ` Eugenio Pérez
  2020-08-26 14:36   ` [RFC v6 01/13] memory: Rename memory_region_notify_one to memory_region_notify_iommu_one Eugenio Pérez
                     ` (13 more replies)
  2020-09-03 16:14 ` [PATCH 0/5] memory: Skip " Eugenio Pérez
  3 siblings, 14 replies; 38+ messages in thread
From: Eugenio Pérez @ 2020-08-26 14:36 UTC (permalink / raw)
  To: qemu-devel, Peter Xu
  Cc: Peter Maydell, Aleksandar Rikalo, Yan Zhao, Eduardo Habkost,
	Juan Quintela, Jason Wang, Michael S. Tsirkin, Eric Auger,
	qemu-arm, Hervé Poussineau, Avi Kivity, Paolo Bonzini,
	Richard Henderson

I am able to hit this assertion when a Red Hat 7 guest virtio_net device
raises an "Invalidation" of all the TLB entries. This happens in the
guest's startup if 'intel_iommu=on' argument is passed to the guest
kernel and right IOMMU/ATS devices are declared in qemu's command line.

Command line:
/home/qemu/x86_64-softmmu/qemu-system-x86_64 -name \
guest=rhel7-test,debug-threads=on -machine \
pc-q35-5.1,accel=kvm,usb=off,dump-guest-core=off,kernel_irqchip=split \
-cpu \
Broadwell,vme=on,ss=on,vmx=on,f16c=on,rdrand=on,hypervisor=on,arat=on,tsc-adjust=on,umip=on,arch-capabilities=on,xsaveopt=on,pdpe1gb=on,abm=on,skip-l1dfl-vmentry=on,rtm=on,hle=on \
-m 8096 -realtime mlock=off -smp 2,sockets=2,cores=1,threads=1 -uuid \
d022ecbf-679e-4755-87ce-eb87fc5bbc5d -display none -no-user-config \
-nodefaults -rtc base=utc,driftfix=slew -global \
kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -global \
ICH9-LPC.disable_s3=1 -global ICH9-LPC.disable_s4=1 -boot strict=on \
-device intel-iommu,intremap=on,device-iotlb=on -device \
pcie-root-port,port=0x8,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x1 \
-device \
pcie-root-port,port=0x9,chassis=2,id=pci.2,bus=pcie.0,addr=0x1.0x1 \
-device \
pcie-root-port,port=0xa,chassis=3,id=pci.3,bus=pcie.0,addr=0x1.0x2 \
-device \
pcie-root-port,port=0xb,chassis=4,id=pci.4,bus=pcie.0,addr=0x1.0x3 \
-device \
pcie-root-port,port=0xc,chassis=5,id=pci.5,bus=pcie.0,addr=0x1.0x4 \
-device \
pcie-root-port,port=0xd,chassis=6,id=pci.6,bus=pcie.0,addr=0x1.0x5 \
-device \
pcie-root-port,port=0xe,chassis=7,id=pci.7,bus=pcie.0,addr=0x1.0x6 \
-device qemu-xhci,p2=15,p3=15,id=usb,bus=pci.2,addr=0x0 -device \
virtio-serial-pci,id=virtio-serial0,bus=pci.3,addr=0x0 -drive \
file=/home/virtio-test2.qcow2,format=qcow2,if=none,id=drive-virtio-disk0 \
-device \
virtio-blk-pci,scsi=off,bus=pci.4,addr=0x0,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 \
-netdev tap,id=hostnet0,vhost=on,vhostforce=on -device \
virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:0d:1d:f2,bus=pci.1,addr=0x0,iommu_platform=on,ats=on \
-device virtio-balloon-pci,id=balloon0,bus=pci.5,addr=0x0 -object \
rng-random,id=objrng0,filename=/dev/urandom -device \
virtio-rng-pci,rng=objrng0,id=rng0,bus=pci.6,addr=0x0 -s -msg \
timestamp=on

Full backtrace:

#0  0x00007ffff521370f in raise () at /lib64/libc.so.6
#1  0x00007ffff51fdb25 in abort () at /lib64/libc.so.6
#2  0x00007ffff51fd9f9 in _nl_load_domain.cold.0 () at /lib64/libc.so.6
#3  0x00007ffff520bcc6 in .annobin_assert.c_end () at /lib64/libc.so.6
#4  0x0000555555888171 in memory_region_notify_one (notifier=0x7ffde0487fa8,
                                                    entry=0x7ffde5dfe200)
                          at /home/qemu/memory.c:1918
#5  0x0000555555888247 in memory_region_notify_iommu (iommu_mr=0x555556f6c0b0,
                                                      iommu_idx=0, entry=...)
                          at /home/qemu/memory.c:1941
#6  0x0000555555951c8d in vtd_process_device_iotlb_desc (s=0x555557609000,
                                                       inv_desc=0x7ffde5dfe2d0)
                          at /home/qemu/hw/i386/intel_iommu.c:2468
#7  0x0000555555951e6a in vtd_process_inv_desc (s=0x555557609000)
                          at /home/qemu/hw/i386/intel_iommu.c:2531
#8  0x0000555555951fa5 in vtd_fetch_inv_desc (s=0x555557609000)
                          at /home/qemu/hw/i386/intel_iommu.c:2563
#9  0x00005555559520e5 in vtd_handle_iqt_write (s=0x555557609000)
                          at /home/qemu/hw/i386/intel_iommu.c:2590
#10 0x0000555555952b45 in vtd_mem_write (opaque=0x555557609000, addr=136,
                                         val=2688, size=4)
                          at /home/qemu/hw/i386/intel_iommu.c:2837
#11 0x0000555555883e17 in memory_region_write_accessor (mr=0x555557609330,
                                                        addr=136,
                                                        value=0x7ffde5dfe478,
                                                        size=4,
                                                        shift=0,
                                                        mask=4294967295,
                                                        attrs=...)
                         at /home/qemu/memory.c:483
#12 0x000055555588401d in access_with_adjusted_size (addr=136,
                       value=0x7ffde5dfe478,
                       size=4,
                       access_size_min=4,
                       access_size_max=8,
                       access_fn=0x555555883d38 <memory_region_write_accessor>,
                       mr=0x555557609330,
                       attrs=...)
                       at /home/qemu/memory.c:544
#13 0x0000555555886f37 in memory_region_dispatch_write (mr=0x555557609330,
                                                       addr=136,
                                                       data=2688,
                                                       op=MO_32,
                                                       attrs=...)
                         at /home/qemu/memory.c:1476
#14 0x0000555555827a03 in flatview_write_continue (fv=0x7ffdd8503150,
                                                   addr=4275634312,
                                                   attrs=...,
                                                   ptr=0x7ffff7ff0028,
                                                   len=4,
                                                   addr1=136,
                                                   l=4,
                                                   mr=0x555557609330)
                          at /home/qemu/exec.c:3146
#15 0x0000555555827b48 in flatview_write (fv=0x7ffdd8503150,
                                          addr=4275634312,
                                          attrs=...,
                                          buf=0x7ffff7ff0028,
                                          len=4)
                          at /home/qemu/exec.c:3186
#16 0x0000555555827e9d in address_space_write (
                                      as=0x5555567ca640 <address_space_memory>,
                                      addr=4275634312,
                                      attrs=...,
                                      buf=0x7ffff7ff0028,
                                      len=4)
                          at /home/qemu/exec.c:3277
#17 0x0000555555827f0a in address_space_rw (
                                      as=0x5555567ca640 <address_space_memory>,
                                      addr=4275634312,
                                      attrs=...,
                                      buf=0x7ffff7ff0028,
                                      len=4,
                                      is_write=true)
                          at /home/qemu/exec.c:3287
#18 0x000055555589b633 in kvm_cpu_exec (cpu=0x555556b65640)
                               at /home/qemu/accel/kvm/kvm-all.c:2511
#19 0x0000555555876ba8 in qemu_kvm_cpu_thread_fn (arg=0x555556b65640)
                               at /home/qemu/cpus.c:1284
#20 0x0000555555dafff1 in qemu_thread_start (args=0x555556b8c3b0)
                               at util/qemu-thread-posix.c:521
#21 0x00007ffff55a62de in start_thread () at /lib64/libpthread.so.0
#22 0x00007ffff52d7e83 in clone () at /lib64/libc.so.6

(gdb) frame 4
#4  0x0000555555888171 in memory_region_notify_one
                      (notifier=0x7ffde0487fa8, entry=0x7ffde5dfe200)
                      at /home/qemu/memory.c:1918
1918        assert(entry->iova >= notifier->start && entry_end <=
notifier->end);
(gdb) p *entry
$1 = {target_as = 0x555556f6c050, iova = 0, translated_addr = 0,
addr_mask = 18446744073709551615, perm = IOMMU_NONE}
--

Tested with vhost-net, host<->guest communication. Still more testing
needed, since it touches a few architectures and configurations.

v6: Introduce "type" field for IOMMUTLBEntry. Fill in all uses.
    Update tests reports with more fine-tuning (CPU, RPS/XPS tunning).

v5: Skip regular IOTLB notifications in dev_iotlb notifiers

v4: Rename IOMMU_NOTIFIER_IOTLB -> IOMMU_NOTIFIER_DEVIOTLB.
    Make vhost-net notifier just IOMMU_NOTIFIER_DEVIOTLB, not
    IOMMU_NOTIFIER_UNMAP

v3: Skip the assertion in case notifier is a IOTLB one, since they can manage
    arbitrary ranges. Using a flag in the notifier for now, as Peter suggested.

v2: Actually delete assertion instead of just commenting out using C99

Eugenio Pérez (13):
  memory: Rename memory_region_notify_one to
    memory_region_notify_iommu_one
  memory: Add IOMMUTLBNotificationType to IOMMUTLBEntry
  hw/alpha/typhoon: Mark all IOMMUTLBEntry as IOMMU_IOTLB_NONE type
  amd_iommu: Mark all IOMMUTLBEntry as IOMMU_IOTLB_NONE type
  hw/arm/smmu: Fill IOMMUTLBEntry notifier type
  dma/rc4030: Mark all IOMMUTLBEntry as IOMMU_IOTLB_NONE type
  intel_iommu: Mark IOMMUTLBEntry of page notification as
    IOMMU_IOTLB_UNMAP type
  virtio-iommu: Mark virtio_iommu_translate IOTLB as IOMMU_IOTLB_NONE
    type
  intel_iommu: Set IOMMUTLBEntry type in vtd_page_walk_level
  memory: Notify IOMMU IOTLB based on entry type, not permissions
  memory: Add IOMMU_DEVIOTLB_UNMAP IOMMUTLBNotificationType
  intel_iommu: Do not notify regular iotlb to device-iotlb notifiers
  memory: Skip bad range assertion if notifier is DEVIOTLB type

 hw/alpha/typhoon.c       |  1 +
 hw/arm/smmu-common.c     |  4 +++-
 hw/arm/smmuv3.c          |  4 +++-
 hw/dma/rc4030.c          |  1 +
 hw/i386/amd_iommu.c      |  7 ++++++-
 hw/i386/intel_iommu.c    | 14 +++++++++++--
 hw/virtio/vhost.c        |  2 +-
 hw/virtio/virtio-iommu.c |  1 +
 include/exec/memory.h    | 26 ++++++++++++++++-------
 softmmu/memory.c         | 45 ++++++++++++++++++++++++++++++----------
 10 files changed, 80 insertions(+), 25 deletions(-)

-- 
2.18.1



^ permalink raw reply	[flat|nested] 38+ messages in thread

* [RFC v6 01/13] memory: Rename memory_region_notify_one to memory_region_notify_iommu_one
  2020-08-26 14:36 ` [RFC v6 00/13] " Eugenio Pérez
@ 2020-08-26 14:36   ` Eugenio Pérez
  2020-08-26 14:36   ` [RFC v6 02/13] memory: Add IOMMUTLBNotificationType to IOMMUTLBEntry Eugenio Pérez
                     ` (12 subsequent siblings)
  13 siblings, 0 replies; 38+ messages in thread
From: Eugenio Pérez @ 2020-08-26 14:36 UTC (permalink / raw)
  To: qemu-devel, Peter Xu
  Cc: Peter Maydell, Aleksandar Rikalo, Yan Zhao, Eduardo Habkost,
	Juan Quintela, Jason Wang, Michael S. Tsirkin, Eric Auger,
	qemu-arm, Hervé Poussineau, Avi Kivity, Paolo Bonzini,
	Richard Henderson

Previous name didn't reflect the iommu operation.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 hw/arm/smmu-common.c  | 2 +-
 hw/arm/smmuv3.c       | 2 +-
 hw/i386/intel_iommu.c | 4 ++--
 include/exec/memory.h | 6 +++---
 softmmu/memory.c      | 6 +++---
 5 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
index e13a5f4a7c..b02ffb8822 100644
--- a/hw/arm/smmu-common.c
+++ b/hw/arm/smmu-common.c
@@ -396,7 +396,7 @@ static void smmu_unmap_notifier_range(IOMMUNotifier *n)
     entry.perm = IOMMU_NONE;
     entry.addr_mask = n->end - n->start;
 
-    memory_region_notify_one(n, &entry);
+    memory_region_notify_iommu_one(n, &entry);
 }
 
 /* Unmap all notifiers attached to @mr */
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index 57a79df55b..3bb85ab7e1 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -838,7 +838,7 @@ static void smmuv3_notify_iova(IOMMUMemoryRegion *mr,
     entry.addr_mask = (1 << tt->granule_sz) - 1;
     entry.perm = IOMMU_NONE;
 
-    memory_region_notify_one(n, &entry);
+    memory_region_notify_iommu_one(n, &entry);
 }
 
 /* invalidate an asid/iova tuple in all mr's */
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 5284bb68b6..2ad6b9d796 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -3498,7 +3498,7 @@ static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n)
         /* This field is meaningless for unmap */
         entry.translated_addr = 0;
 
-        memory_region_notify_one(n, &entry);
+        memory_region_notify_iommu_one(n, &entry);
 
         start += mask;
         remain -= mask;
@@ -3536,7 +3536,7 @@ static void vtd_address_space_refresh_all(IntelIOMMUState *s)
 
 static int vtd_replay_hook(IOMMUTLBEntry *entry, void *private)
 {
-    memory_region_notify_one((IOMMUNotifier *)private, entry);
+    memory_region_notify_iommu_one((IOMMUNotifier *)private, entry);
     return 0;
 }
 
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 0cfe987ab4..22c5f564d1 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -226,7 +226,7 @@ enum IOMMUMemoryRegionAttr {
  * The IOMMU implementation must use the IOMMU notifier infrastructure
  * to report whenever mappings are changed, by calling
  * memory_region_notify_iommu() (or, if necessary, by calling
- * memory_region_notify_one() for each registered notifier).
+ * memory_region_notify_iommu_one() for each registered notifier).
  *
  * Conceptually an IOMMU provides a mapping from input address
  * to an output TLB entry. If the IOMMU is aware of memory transaction
@@ -1274,7 +1274,7 @@ void memory_region_notify_iommu(IOMMUMemoryRegion *iommu_mr,
                                 IOMMUTLBEntry entry);
 
 /**
- * memory_region_notify_one: notify a change in an IOMMU translation
+ * memory_region_notify_iommu_one: notify a change in an IOMMU translation
  *                           entry to a single notifier
  *
  * This works just like memory_region_notify_iommu(), but it only
@@ -1285,7 +1285,7 @@ void memory_region_notify_iommu(IOMMUMemoryRegion *iommu_mr,
  *         replaces all old entries for the same virtual I/O address range.
  *         Deleted entries have .@perm == 0.
  */
-void memory_region_notify_one(IOMMUNotifier *notifier,
+void memory_region_notify_iommu_one(IOMMUNotifier *notifier,
                               IOMMUTLBEntry *entry);
 
 /**
diff --git a/softmmu/memory.c b/softmmu/memory.c
index 70b93104e8..961c25b42f 100644
--- a/softmmu/memory.c
+++ b/softmmu/memory.c
@@ -1890,8 +1890,8 @@ void memory_region_unregister_iommu_notifier(MemoryRegion *mr,
     memory_region_update_iommu_notify_flags(iommu_mr, NULL);
 }
 
-void memory_region_notify_one(IOMMUNotifier *notifier,
-                              IOMMUTLBEntry *entry)
+void memory_region_notify_iommu_one(IOMMUNotifier *notifier,
+                                    IOMMUTLBEntry *entry)
 {
     IOMMUNotifierFlag request_flags;
     hwaddr entry_end = entry->iova + entry->addr_mask;
@@ -1927,7 +1927,7 @@ void memory_region_notify_iommu(IOMMUMemoryRegion *iommu_mr,
 
     IOMMU_NOTIFIER_FOREACH(iommu_notifier, iommu_mr) {
         if (iommu_notifier->iommu_idx == iommu_idx) {
-            memory_region_notify_one(iommu_notifier, &entry);
+            memory_region_notify_iommu_one(iommu_notifier, &entry);
         }
     }
 }
-- 
2.18.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [RFC v6 02/13] memory: Add IOMMUTLBNotificationType to IOMMUTLBEntry
  2020-08-26 14:36 ` [RFC v6 00/13] " Eugenio Pérez
  2020-08-26 14:36   ` [RFC v6 01/13] memory: Rename memory_region_notify_one to memory_region_notify_iommu_one Eugenio Pérez
@ 2020-08-26 14:36   ` Eugenio Pérez
  2020-08-26 15:42     ` Peter Xu
  2020-08-26 14:36   ` [RFC v6 03/13] hw/alpha/typhoon: Mark all IOMMUTLBEntry as IOMMU_IOTLB_NONE type Eugenio Pérez
                     ` (11 subsequent siblings)
  13 siblings, 1 reply; 38+ messages in thread
From: Eugenio Pérez @ 2020-08-26 14:36 UTC (permalink / raw)
  To: qemu-devel, Peter Xu
  Cc: Peter Maydell, Aleksandar Rikalo, Yan Zhao, Eduardo Habkost,
	Juan Quintela, Jason Wang, Michael S. Tsirkin, Eric Auger,
	qemu-arm, Hervé Poussineau, Avi Kivity, Paolo Bonzini,
	Richard Henderson

This way we can tell between MAPs and UNMAP, and potentially avoid to
send them to a notifier that does not require them.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 include/exec/memory.h | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index 22c5f564d1..f6d91c54aa 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -59,6 +59,12 @@ struct ReservedRegion {
 
 typedef struct IOMMUTLBEntry IOMMUTLBEntry;
 
+typedef enum {
+    IOMMU_IOTLB_NONE  = 0,
+    IOMMU_IOTLB_UNMAP = 1,
+    IOMMU_IOTLB_MAP   = 2,
+} IOMMUTLBNotificationType;
+
 /* See address_space_translate: bit 0 is read, bit 1 is write.  */
 typedef enum {
     IOMMU_NONE = 0,
@@ -70,11 +76,12 @@ typedef enum {
 #define IOMMU_ACCESS_FLAG(r, w) (((r) ? IOMMU_RO : 0) | ((w) ? IOMMU_WO : 0))
 
 struct IOMMUTLBEntry {
-    AddressSpace    *target_as;
-    hwaddr           iova;
-    hwaddr           translated_addr;
-    hwaddr           addr_mask;  /* 0xfff = 4k translation */
-    IOMMUAccessFlags perm;
+    AddressSpace            *target_as;
+    hwaddr                   iova;
+    hwaddr                   translated_addr;
+    hwaddr                   addr_mask;  /* 0xfff = 4k translation */
+    IOMMUAccessFlags         perm;
+    IOMMUTLBNotificationType type; /* Only valid if it is a notification */
 };
 
 /*
-- 
2.18.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [RFC v6 03/13] hw/alpha/typhoon: Mark all IOMMUTLBEntry as IOMMU_IOTLB_NONE type
  2020-08-26 14:36 ` [RFC v6 00/13] " Eugenio Pérez
  2020-08-26 14:36   ` [RFC v6 01/13] memory: Rename memory_region_notify_one to memory_region_notify_iommu_one Eugenio Pérez
  2020-08-26 14:36   ` [RFC v6 02/13] memory: Add IOMMUTLBNotificationType to IOMMUTLBEntry Eugenio Pérez
@ 2020-08-26 14:36   ` Eugenio Pérez
  2020-08-26 15:50     ` Peter Xu
  2020-08-26 14:36   ` [RFC v6 04/13] amd_iommu: " Eugenio Pérez
                     ` (10 subsequent siblings)
  13 siblings, 1 reply; 38+ messages in thread
From: Eugenio Pérez @ 2020-08-26 14:36 UTC (permalink / raw)
  To: qemu-devel, Peter Xu
  Cc: Peter Maydell, Aleksandar Rikalo, Yan Zhao, Eduardo Habkost,
	Juan Quintela, Jason Wang, Michael S. Tsirkin, Eric Auger,
	qemu-arm, Hervé Poussineau, Avi Kivity, Paolo Bonzini,
	Richard Henderson

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 hw/alpha/typhoon.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/alpha/typhoon.c b/hw/alpha/typhoon.c
index 29d44dfb06..b1e6c4e929 100644
--- a/hw/alpha/typhoon.c
+++ b/hw/alpha/typhoon.c
@@ -615,6 +615,7 @@ static bool make_iommu_tlbe(hwaddr taddr, hwaddr mask, IOMMUTLBEntry *ret)
         .translated_addr = taddr,
         .addr_mask = mask,
         .perm = IOMMU_RW,
+        .type = IOMMU_IOTLB_NONE,
     };
     return true;
 }
-- 
2.18.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [RFC v6 04/13] amd_iommu: Mark all IOMMUTLBEntry as IOMMU_IOTLB_NONE type
  2020-08-26 14:36 ` [RFC v6 00/13] " Eugenio Pérez
                     ` (2 preceding siblings ...)
  2020-08-26 14:36   ` [RFC v6 03/13] hw/alpha/typhoon: Mark all IOMMUTLBEntry as IOMMU_IOTLB_NONE type Eugenio Pérez
@ 2020-08-26 14:36   ` Eugenio Pérez
  2020-08-26 14:36   ` [RFC v6 05/13] hw/arm/smmu: Fill IOMMUTLBEntry notifier type Eugenio Pérez
                     ` (9 subsequent siblings)
  13 siblings, 0 replies; 38+ messages in thread
From: Eugenio Pérez @ 2020-08-26 14:36 UTC (permalink / raw)
  To: qemu-devel, Peter Xu
  Cc: Peter Maydell, Aleksandar Rikalo, Yan Zhao, Eduardo Habkost,
	Juan Quintela, Jason Wang, Michael S. Tsirkin, Eric Auger,
	qemu-arm, Hervé Poussineau, Avi Kivity, Paolo Bonzini,
	Richard Henderson

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 hw/i386/amd_iommu.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/hw/i386/amd_iommu.c b/hw/i386/amd_iommu.c
index 087f601666..c2607e9e91 100644
--- a/hw/i386/amd_iommu.c
+++ b/hw/i386/amd_iommu.c
@@ -946,6 +946,7 @@ static void amdvi_page_walk(AMDVIAddressSpace *as, uint64_t *dte,
         ret->translated_addr = (pte & AMDVI_DEV_PT_ROOT_MASK) & page_mask;
         ret->addr_mask = ~page_mask;
         ret->perm = amdvi_get_perms(pte);
+        ret->type = IOMMU_IOTLB_NONE;
         return;
     }
 no_remap:
@@ -953,6 +954,7 @@ no_remap:
     ret->translated_addr = addr & AMDVI_PAGE_MASK_4K;
     ret->addr_mask = ~AMDVI_PAGE_MASK_4K;
     ret->perm = amdvi_get_perms(pte);
+    ret->type = IOMMU_IOTLB_NONE;
 }
 
 static void amdvi_do_translate(AMDVIAddressSpace *as, hwaddr addr,
@@ -970,6 +972,7 @@ static void amdvi_do_translate(AMDVIAddressSpace *as, hwaddr addr,
         ret->translated_addr = iotlb_entry->translated_addr;
         ret->addr_mask = iotlb_entry->page_mask;
         ret->perm = iotlb_entry->perms;
+        ret->type = IOMMU_IOTLB_NONE;
         return;
     }
 
@@ -994,6 +997,7 @@ out:
     ret->translated_addr = addr & AMDVI_PAGE_MASK_4K;
     ret->addr_mask = ~AMDVI_PAGE_MASK_4K;
     ret->perm = IOMMU_RW;
+    ret->type = IOMMU_IOTLB_NONE;
 }
 
 static inline bool amdvi_is_interrupt_addr(hwaddr addr)
@@ -1011,7 +1015,8 @@ static IOMMUTLBEntry amdvi_translate(IOMMUMemoryRegion *iommu, hwaddr addr,
         .iova = addr,
         .translated_addr = 0,
         .addr_mask = ~(hwaddr)0,
-        .perm = IOMMU_NONE
+        .perm = IOMMU_NONE,
+        .type = IOMMU_IOTLB_NONE
     };
 
     if (!s->enabled) {
-- 
2.18.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [RFC v6 05/13] hw/arm/smmu: Fill IOMMUTLBEntry notifier type
  2020-08-26 14:36 ` [RFC v6 00/13] " Eugenio Pérez
                     ` (3 preceding siblings ...)
  2020-08-26 14:36   ` [RFC v6 04/13] amd_iommu: " Eugenio Pérez
@ 2020-08-26 14:36   ` Eugenio Pérez
  2020-08-26 14:36   ` [RFC v6 06/13] dma/rc4030: Mark all IOMMUTLBEntry as IOMMU_IOTLB_NONE type Eugenio Pérez
                     ` (8 subsequent siblings)
  13 siblings, 0 replies; 38+ messages in thread
From: Eugenio Pérez @ 2020-08-26 14:36 UTC (permalink / raw)
  To: qemu-devel, Peter Xu
  Cc: Peter Maydell, Aleksandar Rikalo, Yan Zhao, Eduardo Habkost,
	Juan Quintela, Jason Wang, Michael S. Tsirkin, Eric Auger,
	qemu-arm, Hervé Poussineau, Avi Kivity, Paolo Bonzini,
	Richard Henderson

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 hw/arm/smmu-common.c | 2 ++
 hw/arm/smmuv3.c      | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
index b02ffb8822..88cf1b86ea 100644
--- a/hw/arm/smmu-common.c
+++ b/hw/arm/smmu-common.c
@@ -181,6 +181,7 @@ static int smmu_ptw_64(SMMUTransCfg *cfg,
 
     tlbe->iova = iova;
     tlbe->addr_mask = (1 << granule_sz) - 1;
+    tlbe->type = IOMMU_IOTLB_NONE;
 
     while (level <= 3) {
         uint64_t subpage_size = 1ULL << level_shift(level, granule_sz);
@@ -395,6 +396,7 @@ static void smmu_unmap_notifier_range(IOMMUNotifier *n)
     entry.iova = n->start;
     entry.perm = IOMMU_NONE;
     entry.addr_mask = n->end - n->start;
+    entry.type = IOMMU_IOTLB_UNMAP,
 
     memory_region_notify_iommu_one(n, &entry);
 }
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index 3bb85ab7e1..dee987b2b1 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -635,6 +635,7 @@ static IOMMUTLBEntry smmuv3_translate(IOMMUMemoryRegion *mr, hwaddr addr,
         .translated_addr = addr,
         .addr_mask = ~(hwaddr)0,
         .perm = IOMMU_NONE,
+        .type = IOMMU_IOTLB_NONE,
     };
     SMMUIOTLBKey key, *new_key;
 
@@ -837,6 +838,7 @@ static void smmuv3_notify_iova(IOMMUMemoryRegion *mr,
     entry.iova = iova;
     entry.addr_mask = (1 << tt->granule_sz) - 1;
     entry.perm = IOMMU_NONE;
+    entry.type = IOMMU_NOTIFIER_UNMAP;
 
     memory_region_notify_iommu_one(n, &entry);
 }
-- 
2.18.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [RFC v6 06/13] dma/rc4030: Mark all IOMMUTLBEntry as IOMMU_IOTLB_NONE type
  2020-08-26 14:36 ` [RFC v6 00/13] " Eugenio Pérez
                     ` (4 preceding siblings ...)
  2020-08-26 14:36   ` [RFC v6 05/13] hw/arm/smmu: Fill IOMMUTLBEntry notifier type Eugenio Pérez
@ 2020-08-26 14:36   ` Eugenio Pérez
  2020-08-26 14:36   ` [RFC v6 07/13] intel_iommu: Mark IOMMUTLBEntry of page notification as IOMMU_IOTLB_UNMAP type Eugenio Pérez
                     ` (7 subsequent siblings)
  13 siblings, 0 replies; 38+ messages in thread
From: Eugenio Pérez @ 2020-08-26 14:36 UTC (permalink / raw)
  To: qemu-devel, Peter Xu
  Cc: Peter Maydell, Aleksandar Rikalo, Yan Zhao, Eduardo Habkost,
	Juan Quintela, Jason Wang, Michael S. Tsirkin, Eric Auger,
	qemu-arm, Hervé Poussineau, Avi Kivity, Paolo Bonzini,
	Richard Henderson

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 hw/dma/rc4030.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/dma/rc4030.c b/hw/dma/rc4030.c
index 7eddc9a776..8eee12b1cb 100644
--- a/hw/dma/rc4030.c
+++ b/hw/dma/rc4030.c
@@ -506,6 +506,7 @@ static IOMMUTLBEntry rc4030_dma_translate(IOMMUMemoryRegion *iommu, hwaddr addr,
         .translated_addr = 0,
         .addr_mask = DMA_PAGESIZE - 1,
         .perm = IOMMU_NONE,
+        .type = DEV_IOTLB_NONE,
     };
     uint64_t i, entry_address;
     dma_pagetable_entry entry;
-- 
2.18.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [RFC v6 07/13] intel_iommu: Mark IOMMUTLBEntry of page notification as IOMMU_IOTLB_UNMAP type
  2020-08-26 14:36 ` [RFC v6 00/13] " Eugenio Pérez
                     ` (5 preceding siblings ...)
  2020-08-26 14:36   ` [RFC v6 06/13] dma/rc4030: Mark all IOMMUTLBEntry as IOMMU_IOTLB_NONE type Eugenio Pérez
@ 2020-08-26 14:36   ` Eugenio Pérez
  2020-08-26 14:36   ` [RFC v6 08/13] virtio-iommu: Mark virtio_iommu_translate IOTLB as IOMMU_IOTLB_NONE type Eugenio Pérez
                     ` (6 subsequent siblings)
  13 siblings, 0 replies; 38+ messages in thread
From: Eugenio Pérez @ 2020-08-26 14:36 UTC (permalink / raw)
  To: qemu-devel, Peter Xu
  Cc: Peter Maydell, Aleksandar Rikalo, Yan Zhao, Eduardo Habkost,
	Juan Quintela, Jason Wang, Michael S. Tsirkin, Eric Auger,
	qemu-arm, Hervé Poussineau, Avi Kivity, Paolo Bonzini,
	Richard Henderson

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 hw/i386/intel_iommu.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 2ad6b9d796..ed83e496b8 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -1999,6 +1999,7 @@ static void vtd_iotlb_page_invalidate_notify(IntelIOMMUState *s,
                     .translated_addr = 0,
                     .addr_mask = size - 1,
                     .perm = IOMMU_NONE,
+                    .type = IOMMU_IOTLB_UNMAP,
                 };
                 memory_region_notify_iommu(&vtd_as->iommu, 0, entry);
             }
@@ -2465,6 +2466,7 @@ static bool vtd_process_device_iotlb_desc(IntelIOMMUState *s,
     entry.iova = addr;
     entry.perm = IOMMU_NONE;
     entry.translated_addr = 0;
+    entry.type = IOMMU_IOTLB_UNMAP;
     memory_region_notify_iommu(&vtd_dev_as->iommu, 0, entry);
 
 done:
@@ -3497,6 +3499,7 @@ static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n)
         entry.perm = IOMMU_NONE;
         /* This field is meaningless for unmap */
         entry.translated_addr = 0;
+        entry.type = IOMMU_NOTIFIER_UNMAP;
 
         memory_region_notify_iommu_one(n, &entry);
 
-- 
2.18.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [RFC v6 08/13] virtio-iommu: Mark virtio_iommu_translate IOTLB as IOMMU_IOTLB_NONE type
  2020-08-26 14:36 ` [RFC v6 00/13] " Eugenio Pérez
                     ` (6 preceding siblings ...)
  2020-08-26 14:36   ` [RFC v6 07/13] intel_iommu: Mark IOMMUTLBEntry of page notification as IOMMU_IOTLB_UNMAP type Eugenio Pérez
@ 2020-08-26 14:36   ` Eugenio Pérez
  2020-08-26 14:36   ` [RFC v6 09/13] intel_iommu: Set IOMMUTLBEntry type in vtd_page_walk_level Eugenio Pérez
                     ` (5 subsequent siblings)
  13 siblings, 0 replies; 38+ messages in thread
From: Eugenio Pérez @ 2020-08-26 14:36 UTC (permalink / raw)
  To: qemu-devel, Peter Xu
  Cc: Peter Maydell, Aleksandar Rikalo, Yan Zhao, Eduardo Habkost,
	Juan Quintela, Jason Wang, Michael S. Tsirkin, Eric Auger,
	qemu-arm, Hervé Poussineau, Avi Kivity, Paolo Bonzini,
	Richard Henderson

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 hw/virtio/virtio-iommu.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index 5d56865e56..16267e8096 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -619,6 +619,7 @@ static IOMMUTLBEntry virtio_iommu_translate(IOMMUMemoryRegion *mr, hwaddr addr,
         .translated_addr = addr,
         .addr_mask = (1 << ctz32(s->config.page_size_mask)) - 1,
         .perm = IOMMU_NONE,
+        .type = IOMMU_IOTLB_NONE,
     };
 
     bypass_allowed = virtio_vdev_has_feature(&s->parent_obj,
-- 
2.18.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [RFC v6 09/13] intel_iommu: Set IOMMUTLBEntry type in vtd_page_walk_level
  2020-08-26 14:36 ` [RFC v6 00/13] " Eugenio Pérez
                     ` (7 preceding siblings ...)
  2020-08-26 14:36   ` [RFC v6 08/13] virtio-iommu: Mark virtio_iommu_translate IOTLB as IOMMU_IOTLB_NONE type Eugenio Pérez
@ 2020-08-26 14:36   ` Eugenio Pérez
  2020-08-26 14:36   ` [RFC v6 10/13] memory: Notify IOMMU IOTLB based on entry type, not permissions Eugenio Pérez
                     ` (4 subsequent siblings)
  13 siblings, 0 replies; 38+ messages in thread
From: Eugenio Pérez @ 2020-08-26 14:36 UTC (permalink / raw)
  To: qemu-devel, Peter Xu
  Cc: Peter Maydell, Aleksandar Rikalo, Yan Zhao, Eduardo Habkost,
	Juan Quintela, Jason Wang, Michael S. Tsirkin, Eric Auger,
	qemu-arm, Hervé Poussineau, Avi Kivity, Paolo Bonzini,
	Richard Henderson

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 hw/i386/intel_iommu.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index ed83e496b8..0b3399874f 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -1251,6 +1251,7 @@ static int vtd_page_walk_level(dma_addr_t addr, uint64_t start,
             entry.addr_mask = ~subpage_mask;
             /* NOTE: this is only meaningful if entry_valid == true */
             entry.translated_addr = vtd_get_slpte_addr(slpte, info->aw);
+            entry.type = entry.perm ? IOMMU_IOTLB_MAP : IOMMU_IOTLB_UNMAP;
             ret = vtd_page_walk_one(&entry, info);
         }
 
-- 
2.18.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [RFC v6 10/13] memory: Notify IOMMU IOTLB based on entry type, not permissions
  2020-08-26 14:36 ` [RFC v6 00/13] " Eugenio Pérez
                     ` (8 preceding siblings ...)
  2020-08-26 14:36   ` [RFC v6 09/13] intel_iommu: Set IOMMUTLBEntry type in vtd_page_walk_level Eugenio Pérez
@ 2020-08-26 14:36   ` Eugenio Pérez
  2020-08-26 14:36   ` [RFC v6 11/13] memory: Add IOMMU_DEVIOTLB_UNMAP IOMMUTLBNotificationType Eugenio Pérez
                     ` (3 subsequent siblings)
  13 siblings, 0 replies; 38+ messages in thread
From: Eugenio Pérez @ 2020-08-26 14:36 UTC (permalink / raw)
  To: qemu-devel, Peter Xu
  Cc: Peter Maydell, Aleksandar Rikalo, Yan Zhao, Eduardo Habkost,
	Juan Quintela, Jason Wang, Michael S. Tsirkin, Eric Auger,
	qemu-arm, Hervé Poussineau, Avi Kivity, Paolo Bonzini,
	Richard Henderson

This way the intention is much clearer.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 softmmu/memory.c | 27 +++++++++++++++++++--------
 1 file changed, 19 insertions(+), 8 deletions(-)

diff --git a/softmmu/memory.c b/softmmu/memory.c
index 961c25b42f..3e68442ca6 100644
--- a/softmmu/memory.c
+++ b/softmmu/memory.c
@@ -1890,10 +1890,27 @@ void memory_region_unregister_iommu_notifier(MemoryRegion *mr,
     memory_region_update_iommu_notify_flags(iommu_mr, NULL);
 }
 
+static IOMMUNotifierFlag notifier_type_iommu(const IOMMUNotifier *notifier)
+{
+    return notifier->notifier_flags & IOMMU_NOTIFIER_ALL;
+}
+
+static bool memory_region_notify(const IOMMUNotifier *notifier,
+                                 const IOMMUTLBEntry *entry)
+{
+    switch(entry->type) {
+    case IOMMU_IOTLB_MAP:
+        return notifier_type_iommu(notifier) == IOMMU_NOTIFIER_MAP;
+    case IOMMU_IOTLB_UNMAP:
+        return notifier_type_iommu(notifier) == IOMMU_NOTIFIER_UNMAP;
+    default:
+        return false;
+    };
+}
+
 void memory_region_notify_iommu_one(IOMMUNotifier *notifier,
                                     IOMMUTLBEntry *entry)
 {
-    IOMMUNotifierFlag request_flags;
     hwaddr entry_end = entry->iova + entry->addr_mask;
 
     /*
@@ -1906,13 +1923,7 @@ void memory_region_notify_iommu_one(IOMMUNotifier *notifier,
 
     assert(entry->iova >= notifier->start && entry_end <= notifier->end);
 
-    if (entry->perm & IOMMU_RW) {
-        request_flags = IOMMU_NOTIFIER_MAP;
-    } else {
-        request_flags = IOMMU_NOTIFIER_UNMAP;
-    }
-
-    if (notifier->notifier_flags & request_flags) {
+    if (memory_region_notify(notifier, entry)) {
         notifier->notify(notifier, entry);
     }
 }
-- 
2.18.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [RFC v6 11/13] memory: Add IOMMU_DEVIOTLB_UNMAP IOMMUTLBNotificationType
  2020-08-26 14:36 ` [RFC v6 00/13] " Eugenio Pérez
                     ` (9 preceding siblings ...)
  2020-08-26 14:36   ` [RFC v6 10/13] memory: Notify IOMMU IOTLB based on entry type, not permissions Eugenio Pérez
@ 2020-08-26 14:36   ` Eugenio Pérez
  2020-08-26 14:36   ` [RFC v6 12/13] intel_iommu: Do not notify regular iotlb to device-iotlb notifiers Eugenio Pérez
                     ` (2 subsequent siblings)
  13 siblings, 0 replies; 38+ messages in thread
From: Eugenio Pérez @ 2020-08-26 14:36 UTC (permalink / raw)
  To: qemu-devel, Peter Xu
  Cc: Peter Maydell, Aleksandar Rikalo, Yan Zhao, Eduardo Habkost,
	Juan Quintela, Jason Wang, Michael S. Tsirkin, Eric Auger,
	qemu-arm, Hervé Poussineau, Avi Kivity, Paolo Bonzini,
	Richard Henderson

Adapt intel and vhost to use this new notification type

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 hw/i386/intel_iommu.c | 2 +-
 hw/virtio/vhost.c     | 2 +-
 include/exec/memory.h | 9 ++++++---
 softmmu/memory.c      | 5 ++++-
 4 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 0b3399874f..ddb828da1f 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -2467,7 +2467,7 @@ static bool vtd_process_device_iotlb_desc(IntelIOMMUState *s,
     entry.iova = addr;
     entry.perm = IOMMU_NONE;
     entry.translated_addr = 0;
-    entry.type = IOMMU_IOTLB_UNMAP;
+    entry.type = IOMMU_DEVIOTLB_UNMAP;
     memory_region_notify_iommu(&vtd_dev_as->iommu, 0, entry);
 
 done:
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 1a1384e7a6..6ca168b47e 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -729,7 +729,7 @@ static void vhost_iommu_region_add(MemoryListener *listener,
     iommu_idx = memory_region_iommu_attrs_to_index(iommu_mr,
                                                    MEMTXATTRS_UNSPECIFIED);
     iommu_notifier_init(&iommu->n, vhost_iommu_unmap_notify,
-                        IOMMU_NOTIFIER_UNMAP,
+                        IOMMU_NOTIFIER_DEVIOTLB,
                         section->offset_within_region,
                         int128_get64(end),
                         iommu_idx);
diff --git a/include/exec/memory.h b/include/exec/memory.h
index f6d91c54aa..477c3af24c 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -60,9 +60,10 @@ struct ReservedRegion {
 typedef struct IOMMUTLBEntry IOMMUTLBEntry;
 
 typedef enum {
-    IOMMU_IOTLB_NONE  = 0,
-    IOMMU_IOTLB_UNMAP = 1,
-    IOMMU_IOTLB_MAP   = 2,
+    IOMMU_IOTLB_NONE     = 0,
+    IOMMU_IOTLB_UNMAP    = 1,
+    IOMMU_IOTLB_MAP      = 2,
+    IOMMU_DEVIOTLB_UNMAP = 3,
 } IOMMUTLBNotificationType;
 
 /* See address_space_translate: bit 0 is read, bit 1 is write.  */
@@ -94,6 +95,8 @@ typedef enum {
     IOMMU_NOTIFIER_UNMAP = 0x1,
     /* Notify entry changes (newly created entries) */
     IOMMU_NOTIFIER_MAP = 0x2,
+    /* Notify changes on device IOTLB entries */
+    IOMMU_NOTIFIER_DEVIOTLB = 0x04,
 } IOMMUNotifierFlag;
 
 #define IOMMU_NOTIFIER_ALL (IOMMU_NOTIFIER_MAP | IOMMU_NOTIFIER_UNMAP)
diff --git a/softmmu/memory.c b/softmmu/memory.c
index 3e68442ca6..4ed63f4d0d 100644
--- a/softmmu/memory.c
+++ b/softmmu/memory.c
@@ -1892,7 +1892,8 @@ void memory_region_unregister_iommu_notifier(MemoryRegion *mr,
 
 static IOMMUNotifierFlag notifier_type_iommu(const IOMMUNotifier *notifier)
 {
-    return notifier->notifier_flags & IOMMU_NOTIFIER_ALL;
+    return notifier->notifier_flags &
+           (IOMMU_NOTIFIER_ALL | IOMMU_NOTIFIER_DEVIOTLB);
 }
 
 static bool memory_region_notify(const IOMMUNotifier *notifier,
@@ -1903,6 +1904,8 @@ static bool memory_region_notify(const IOMMUNotifier *notifier,
         return notifier_type_iommu(notifier) == IOMMU_NOTIFIER_MAP;
     case IOMMU_IOTLB_UNMAP:
         return notifier_type_iommu(notifier) == IOMMU_NOTIFIER_UNMAP;
+    case IOMMU_DEVIOTLB_UNMAP:
+        return notifier_type_iommu(notifier) == IOMMU_NOTIFIER_DEVIOTLB;
     default:
         return false;
     };
-- 
2.18.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [RFC v6 12/13] intel_iommu: Do not notify regular iotlb to device-iotlb notifiers
  2020-08-26 14:36 ` [RFC v6 00/13] " Eugenio Pérez
                     ` (10 preceding siblings ...)
  2020-08-26 14:36   ` [RFC v6 11/13] memory: Add IOMMU_DEVIOTLB_UNMAP IOMMUTLBNotificationType Eugenio Pérez
@ 2020-08-26 14:36   ` Eugenio Pérez
  2020-08-26 16:51     ` 罗勇刚(Yonggang Luo)
  2020-08-26 14:36   ` [RFC v6 13/13] memory: Skip bad range assertion if notifier is DEVIOTLB type Eugenio Pérez
  2020-08-26 15:00   ` [RFC v6 00/13] memory: Delete assertion in memory_region_unregister_iommu_notifier Eugenio Perez Martin
  13 siblings, 1 reply; 38+ messages in thread
From: Eugenio Pérez @ 2020-08-26 14:36 UTC (permalink / raw)
  To: qemu-devel, Peter Xu
  Cc: Peter Maydell, Aleksandar Rikalo, Yan Zhao, Eduardo Habkost,
	Juan Quintela, Jason Wang, Michael S. Tsirkin, Eric Auger,
	qemu-arm, Hervé Poussineau, Avi Kivity, Paolo Bonzini,
	Richard Henderson

This improves performance in case of netperf with vhost-net:
* TCP_STREAM: From 9049.59Mbit/s to 9049.59Mbit/s (13%)
* TCP_RR: From 8464.73 trans/s to 8932.703333 trans/s (5.5%)
* UDP_RR: From 8562.08 trans/s to 9005.62/s (5.1%)
* UDP_STREAM: No change observed (insignificant 0.1% improvement)

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 hw/i386/intel_iommu.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index ddb828da1f..7620a1abbf 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -1960,6 +1960,12 @@ static void vtd_iotlb_domain_invalidate(IntelIOMMUState *s, uint16_t domain_id)
     vtd_iommu_unlock(s);
 
     QLIST_FOREACH(vtd_as, &s->vtd_as_with_notifiers, next) {
+        if (vtd_as->iommu.iommu_notify_flags & IOMMU_NOTIFIER_DEVIOTLB) {
+            /* If IOMMU memory region is DEVICE IOTLB type, it does not make
+             * sense to send regular IOMMU notifications. */
+            continue;
+        }
+
         if (!vtd_dev_to_context_entry(s, pci_bus_num(vtd_as->bus),
                                       vtd_as->devfn, &ce) &&
             domain_id == vtd_get_domain_id(s, &ce)) {
-- 
2.18.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [RFC v6 13/13] memory: Skip bad range assertion if notifier is DEVIOTLB type
  2020-08-26 14:36 ` [RFC v6 00/13] " Eugenio Pérez
                     ` (11 preceding siblings ...)
  2020-08-26 14:36   ` [RFC v6 12/13] intel_iommu: Do not notify regular iotlb to device-iotlb notifiers Eugenio Pérez
@ 2020-08-26 14:36   ` Eugenio Pérez
  2020-08-26 15:00   ` [RFC v6 00/13] memory: Delete assertion in memory_region_unregister_iommu_notifier Eugenio Perez Martin
  13 siblings, 0 replies; 38+ messages in thread
From: Eugenio Pérez @ 2020-08-26 14:36 UTC (permalink / raw)
  To: qemu-devel, Peter Xu
  Cc: Peter Maydell, Aleksandar Rikalo, Yan Zhao, Eduardo Habkost,
	Juan Quintela, Jason Wang, Michael S. Tsirkin, Eric Auger,
	qemu-arm, Hervé Poussineau, Avi Kivity, Paolo Bonzini,
	Richard Henderson

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 softmmu/memory.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/softmmu/memory.c b/softmmu/memory.c
index 4ed63f4d0d..d2797e996a 100644
--- a/softmmu/memory.c
+++ b/softmmu/memory.c
@@ -1915,6 +1915,7 @@ void memory_region_notify_iommu_one(IOMMUNotifier *notifier,
                                     IOMMUTLBEntry *entry)
 {
     hwaddr entry_end = entry->iova + entry->addr_mask;
+    IOMMUTLBEntry tmp = *entry;
 
     /*
      * Skip the notification if the notification does not overlap
@@ -1924,10 +1925,18 @@ void memory_region_notify_iommu_one(IOMMUNotifier *notifier,
         return;
     }
 
-    assert(entry->iova >= notifier->start && entry_end <= notifier->end);
+    if (notifier->notifier_flags & IOMMU_NOTIFIER_DEVIOTLB) {
+        /* Crop (iova, addr_mask) to range */
+        tmp.iova = MAX(tmp.iova, notifier->start);
+        tmp.addr_mask = MIN(entry_end, notifier->end) - tmp.iova;
+        /* Confirm no underflow */
+        assert(MIN(entry_end, notifier->end) >= tmp.iova);
+    } else {
+        assert(entry->iova >= notifier->start && entry_end <= notifier->end);
+    }
 
     if (memory_region_notify(notifier, entry)) {
-        notifier->notify(notifier, entry);
+        notifier->notify(notifier, &tmp);
     }
 }
 
-- 
2.18.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [RFC v6 00/13] memory: Delete assertion in memory_region_unregister_iommu_notifier
  2020-08-26 14:36 ` [RFC v6 00/13] " Eugenio Pérez
                     ` (12 preceding siblings ...)
  2020-08-26 14:36   ` [RFC v6 13/13] memory: Skip bad range assertion if notifier is DEVIOTLB type Eugenio Pérez
@ 2020-08-26 15:00   ` Eugenio Perez Martin
  2020-08-26 15:54     ` Peter Xu
  13 siblings, 1 reply; 38+ messages in thread
From: Eugenio Perez Martin @ 2020-08-26 15:00 UTC (permalink / raw)
  To: qemu-devel, Peter Xu
  Cc: Peter Maydell, Aleksandar Rikalo, Yan Zhao, Eduardo Habkost,
	Juan Quintela, Jason Wang, Michael S. Tsirkin, Eric Auger,
	qemu-arm, Hervé Poussineau, Avi Kivity, Paolo Bonzini,
	Richard Henderson

Hi!

Sending v6 to see if that is on the same page as what you meant.
Making each setting of "type" explicitly IOMMU_IOTLB_NONE if not used
in notifications. This is done in different commits in case this helps
review of different architectures.

I think that this way we have too much freedom between entry flags
(currently only access type, RW) and notification type. Since not all
of them are valid nor used in the same context, I think this adds
complexity. I'm wondering if:

Option a) We could make it private to memory.c, and make it a flag of
memory_region_notify_iommu (like "bool deviotlb_type)". IOW, instead
of making it a member of IOMMUTLBEntry, wrap the "entry" parameter of
memory_region_notify_iommu in a new private structure defined in
memory.c that adds that flag.

Option b) We could keep the IOMMUTLBNotificationType enum (open to
suggestions for a better name :)), but not embed it in the struct,
like:

diff --git a/include/exec/memory.h b/include/exec/memory.h
index 477c3af24c..d9150e7b7e 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -72,7 +72,8 @@ typedef enum {
     IOMMU_RO   = 1,
     IOMMU_WO   = 2,
     IOMMU_RW   = 3,
-} IOMMUAccessFlags;
+    IOMMU_DEVIOTLB = 4,
+} IOMMUEntryFlags;

 #define IOMMU_ACCESS_FLAG(r, w) (((r) ? IOMMU_RO : 0) | ((w) ? IOMMU_WO : 0))

@@ -81,10 +82,18 @@ struct IOMMUTLBEntry {
     hwaddr                   iova;
     hwaddr                   translated_addr;
     hwaddr                   addr_mask;  /* 0xfff = 4k translation */
-    IOMMUAccessFlags         perm;
+    IOMMUEntryFlags          flags;
     IOMMUTLBNotificationType type; /* Only valid if it is a notification */
 };

+IOMMUTLBNotificationType iommu_tlb_entry_type(struct IOMMUTLBEntry *s) {
+    if (s->flags & IOMMU_DEVIOTLB)
+        return IOMMU_DEVIOTLB_UNMAP;
+    else if (s->flags & IOMMU_RW)
+        return IOMMU_IOTLB_MAP;
+    return IOMMU_IOTLB_UNMAP;
+}
+
--

Thanks!

On Wed, Aug 26, 2020 at 4:38 PM Eugenio Pérez <eperezma@redhat.com> wrote:
>
> I am able to hit this assertion when a Red Hat 7 guest virtio_net device
> raises an "Invalidation" of all the TLB entries. This happens in the
> guest's startup if 'intel_iommu=on' argument is passed to the guest
> kernel and right IOMMU/ATS devices are declared in qemu's command line.
>
> Command line:
> /home/qemu/x86_64-softmmu/qemu-system-x86_64 -name \
> guest=rhel7-test,debug-threads=on -machine \
> pc-q35-5.1,accel=kvm,usb=off,dump-guest-core=off,kernel_irqchip=split \
> -cpu \
> Broadwell,vme=on,ss=on,vmx=on,f16c=on,rdrand=on,hypervisor=on,arat=on,tsc-adjust=on,umip=on,arch-capabilities=on,xsaveopt=on,pdpe1gb=on,abm=on,skip-l1dfl-vmentry=on,rtm=on,hle=on \
> -m 8096 -realtime mlock=off -smp 2,sockets=2,cores=1,threads=1 -uuid \
> d022ecbf-679e-4755-87ce-eb87fc5bbc5d -display none -no-user-config \
> -nodefaults -rtc base=utc,driftfix=slew -global \
> kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -global \
> ICH9-LPC.disable_s3=1 -global ICH9-LPC.disable_s4=1 -boot strict=on \
> -device intel-iommu,intremap=on,device-iotlb=on -device \
> pcie-root-port,port=0x8,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x1 \
> -device \
> pcie-root-port,port=0x9,chassis=2,id=pci.2,bus=pcie.0,addr=0x1.0x1 \
> -device \
> pcie-root-port,port=0xa,chassis=3,id=pci.3,bus=pcie.0,addr=0x1.0x2 \
> -device \
> pcie-root-port,port=0xb,chassis=4,id=pci.4,bus=pcie.0,addr=0x1.0x3 \
> -device \
> pcie-root-port,port=0xc,chassis=5,id=pci.5,bus=pcie.0,addr=0x1.0x4 \
> -device \
> pcie-root-port,port=0xd,chassis=6,id=pci.6,bus=pcie.0,addr=0x1.0x5 \
> -device \
> pcie-root-port,port=0xe,chassis=7,id=pci.7,bus=pcie.0,addr=0x1.0x6 \
> -device qemu-xhci,p2=15,p3=15,id=usb,bus=pci.2,addr=0x0 -device \
> virtio-serial-pci,id=virtio-serial0,bus=pci.3,addr=0x0 -drive \
> file=/home/virtio-test2.qcow2,format=qcow2,if=none,id=drive-virtio-disk0 \
> -device \
> virtio-blk-pci,scsi=off,bus=pci.4,addr=0x0,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 \
> -netdev tap,id=hostnet0,vhost=on,vhostforce=on -device \
> virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:0d:1d:f2,bus=pci.1,addr=0x0,iommu_platform=on,ats=on \
> -device virtio-balloon-pci,id=balloon0,bus=pci.5,addr=0x0 -object \
> rng-random,id=objrng0,filename=/dev/urandom -device \
> virtio-rng-pci,rng=objrng0,id=rng0,bus=pci.6,addr=0x0 -s -msg \
> timestamp=on
>
> Full backtrace:
>
> #0  0x00007ffff521370f in raise () at /lib64/libc.so.6
> #1  0x00007ffff51fdb25 in abort () at /lib64/libc.so.6
> #2  0x00007ffff51fd9f9 in _nl_load_domain.cold.0 () at /lib64/libc.so.6
> #3  0x00007ffff520bcc6 in .annobin_assert.c_end () at /lib64/libc.so.6
> #4  0x0000555555888171 in memory_region_notify_one (notifier=0x7ffde0487fa8,
>                                                     entry=0x7ffde5dfe200)
>                           at /home/qemu/memory.c:1918
> #5  0x0000555555888247 in memory_region_notify_iommu (iommu_mr=0x555556f6c0b0,
>                                                       iommu_idx=0, entry=...)
>                           at /home/qemu/memory.c:1941
> #6  0x0000555555951c8d in vtd_process_device_iotlb_desc (s=0x555557609000,
>                                                        inv_desc=0x7ffde5dfe2d0)
>                           at /home/qemu/hw/i386/intel_iommu.c:2468
> #7  0x0000555555951e6a in vtd_process_inv_desc (s=0x555557609000)
>                           at /home/qemu/hw/i386/intel_iommu.c:2531
> #8  0x0000555555951fa5 in vtd_fetch_inv_desc (s=0x555557609000)
>                           at /home/qemu/hw/i386/intel_iommu.c:2563
> #9  0x00005555559520e5 in vtd_handle_iqt_write (s=0x555557609000)
>                           at /home/qemu/hw/i386/intel_iommu.c:2590
> #10 0x0000555555952b45 in vtd_mem_write (opaque=0x555557609000, addr=136,
>                                          val=2688, size=4)
>                           at /home/qemu/hw/i386/intel_iommu.c:2837
> #11 0x0000555555883e17 in memory_region_write_accessor (mr=0x555557609330,
>                                                         addr=136,
>                                                         value=0x7ffde5dfe478,
>                                                         size=4,
>                                                         shift=0,
>                                                         mask=4294967295,
>                                                         attrs=...)
>                          at /home/qemu/memory.c:483
> #12 0x000055555588401d in access_with_adjusted_size (addr=136,
>                        value=0x7ffde5dfe478,
>                        size=4,
>                        access_size_min=4,
>                        access_size_max=8,
>                        access_fn=0x555555883d38 <memory_region_write_accessor>,
>                        mr=0x555557609330,
>                        attrs=...)
>                        at /home/qemu/memory.c:544
> #13 0x0000555555886f37 in memory_region_dispatch_write (mr=0x555557609330,
>                                                        addr=136,
>                                                        data=2688,
>                                                        op=MO_32,
>                                                        attrs=...)
>                          at /home/qemu/memory.c:1476
> #14 0x0000555555827a03 in flatview_write_continue (fv=0x7ffdd8503150,
>                                                    addr=4275634312,
>                                                    attrs=...,
>                                                    ptr=0x7ffff7ff0028,
>                                                    len=4,
>                                                    addr1=136,
>                                                    l=4,
>                                                    mr=0x555557609330)
>                           at /home/qemu/exec.c:3146
> #15 0x0000555555827b48 in flatview_write (fv=0x7ffdd8503150,
>                                           addr=4275634312,
>                                           attrs=...,
>                                           buf=0x7ffff7ff0028,
>                                           len=4)
>                           at /home/qemu/exec.c:3186
> #16 0x0000555555827e9d in address_space_write (
>                                       as=0x5555567ca640 <address_space_memory>,
>                                       addr=4275634312,
>                                       attrs=...,
>                                       buf=0x7ffff7ff0028,
>                                       len=4)
>                           at /home/qemu/exec.c:3277
> #17 0x0000555555827f0a in address_space_rw (
>                                       as=0x5555567ca640 <address_space_memory>,
>                                       addr=4275634312,
>                                       attrs=...,
>                                       buf=0x7ffff7ff0028,
>                                       len=4,
>                                       is_write=true)
>                           at /home/qemu/exec.c:3287
> #18 0x000055555589b633 in kvm_cpu_exec (cpu=0x555556b65640)
>                                at /home/qemu/accel/kvm/kvm-all.c:2511
> #19 0x0000555555876ba8 in qemu_kvm_cpu_thread_fn (arg=0x555556b65640)
>                                at /home/qemu/cpus.c:1284
> #20 0x0000555555dafff1 in qemu_thread_start (args=0x555556b8c3b0)
>                                at util/qemu-thread-posix.c:521
> #21 0x00007ffff55a62de in start_thread () at /lib64/libpthread.so.0
> #22 0x00007ffff52d7e83 in clone () at /lib64/libc.so.6
>
> (gdb) frame 4
> #4  0x0000555555888171 in memory_region_notify_one
>                       (notifier=0x7ffde0487fa8, entry=0x7ffde5dfe200)
>                       at /home/qemu/memory.c:1918
> 1918        assert(entry->iova >= notifier->start && entry_end <=
> notifier->end);
> (gdb) p *entry
> $1 = {target_as = 0x555556f6c050, iova = 0, translated_addr = 0,
> addr_mask = 18446744073709551615, perm = IOMMU_NONE}
> --
>
> Tested with vhost-net, host<->guest communication. Still more testing
> needed, since it touches a few architectures and configurations.
>
> v6: Introduce "type" field for IOMMUTLBEntry. Fill in all uses.
>     Update tests reports with more fine-tuning (CPU, RPS/XPS tunning).
>
> v5: Skip regular IOTLB notifications in dev_iotlb notifiers
>
> v4: Rename IOMMU_NOTIFIER_IOTLB -> IOMMU_NOTIFIER_DEVIOTLB.
>     Make vhost-net notifier just IOMMU_NOTIFIER_DEVIOTLB, not
>     IOMMU_NOTIFIER_UNMAP
>
> v3: Skip the assertion in case notifier is a IOTLB one, since they can manage
>     arbitrary ranges. Using a flag in the notifier for now, as Peter suggested.
>
> v2: Actually delete assertion instead of just commenting out using C99
>
> Eugenio Pérez (13):
>   memory: Rename memory_region_notify_one to
>     memory_region_notify_iommu_one
>   memory: Add IOMMUTLBNotificationType to IOMMUTLBEntry
>   hw/alpha/typhoon: Mark all IOMMUTLBEntry as IOMMU_IOTLB_NONE type
>   amd_iommu: Mark all IOMMUTLBEntry as IOMMU_IOTLB_NONE type
>   hw/arm/smmu: Fill IOMMUTLBEntry notifier type
>   dma/rc4030: Mark all IOMMUTLBEntry as IOMMU_IOTLB_NONE type
>   intel_iommu: Mark IOMMUTLBEntry of page notification as
>     IOMMU_IOTLB_UNMAP type
>   virtio-iommu: Mark virtio_iommu_translate IOTLB as IOMMU_IOTLB_NONE
>     type
>   intel_iommu: Set IOMMUTLBEntry type in vtd_page_walk_level
>   memory: Notify IOMMU IOTLB based on entry type, not permissions
>   memory: Add IOMMU_DEVIOTLB_UNMAP IOMMUTLBNotificationType
>   intel_iommu: Do not notify regular iotlb to device-iotlb notifiers
>   memory: Skip bad range assertion if notifier is DEVIOTLB type
>
>  hw/alpha/typhoon.c       |  1 +
>  hw/arm/smmu-common.c     |  4 +++-
>  hw/arm/smmuv3.c          |  4 +++-
>  hw/dma/rc4030.c          |  1 +
>  hw/i386/amd_iommu.c      |  7 ++++++-
>  hw/i386/intel_iommu.c    | 14 +++++++++++--
>  hw/virtio/vhost.c        |  2 +-
>  hw/virtio/virtio-iommu.c |  1 +
>  include/exec/memory.h    | 26 ++++++++++++++++-------
>  softmmu/memory.c         | 45 ++++++++++++++++++++++++++++++----------
>  10 files changed, 80 insertions(+), 25 deletions(-)
>
> --
> 2.18.1
>
>



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [RFC v6 02/13] memory: Add IOMMUTLBNotificationType to IOMMUTLBEntry
  2020-08-26 14:36   ` [RFC v6 02/13] memory: Add IOMMUTLBNotificationType to IOMMUTLBEntry Eugenio Pérez
@ 2020-08-26 15:42     ` Peter Xu
  2020-08-27  6:11       ` Eugenio Perez Martin
  0 siblings, 1 reply; 38+ messages in thread
From: Peter Xu @ 2020-08-26 15:42 UTC (permalink / raw)
  To: Eugenio Pérez
  Cc: Peter Maydell, Aleksandar Rikalo, Yan Zhao, Eduardo Habkost,
	Juan Quintela, Jason Wang, Michael S. Tsirkin, qemu-devel,
	Eric Auger, qemu-arm, Hervé Poussineau, Avi Kivity,
	Paolo Bonzini, Richard Henderson

On Wed, Aug 26, 2020 at 04:36:40PM +0200, Eugenio Pérez wrote:
> This way we can tell between MAPs and UNMAP, and potentially avoid to
> send them to a notifier that does not require them.
> 
> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> ---
>  include/exec/memory.h | 17 ++++++++++++-----
>  1 file changed, 12 insertions(+), 5 deletions(-)
> 
> diff --git a/include/exec/memory.h b/include/exec/memory.h
> index 22c5f564d1..f6d91c54aa 100644
> --- a/include/exec/memory.h
> +++ b/include/exec/memory.h
> @@ -59,6 +59,12 @@ struct ReservedRegion {
>  
>  typedef struct IOMMUTLBEntry IOMMUTLBEntry;
>  
> +typedef enum {
> +    IOMMU_IOTLB_NONE  = 0,
> +    IOMMU_IOTLB_UNMAP = 1,
> +    IOMMU_IOTLB_MAP   = 2,
> +} IOMMUTLBNotificationType;

Can we directly use IOMMUNotifierFlag as the type rather than introducing a
similar enumeration?  Thanks,

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC v6 03/13] hw/alpha/typhoon: Mark all IOMMUTLBEntry as IOMMU_IOTLB_NONE type
  2020-08-26 14:36   ` [RFC v6 03/13] hw/alpha/typhoon: Mark all IOMMUTLBEntry as IOMMU_IOTLB_NONE type Eugenio Pérez
@ 2020-08-26 15:50     ` Peter Xu
  0 siblings, 0 replies; 38+ messages in thread
From: Peter Xu @ 2020-08-26 15:50 UTC (permalink / raw)
  To: Eugenio Pérez
  Cc: Peter Maydell, Aleksandar Rikalo, Yan Zhao, Eduardo Habkost,
	Juan Quintela, Jason Wang, Michael S. Tsirkin, qemu-devel,
	Eric Auger, qemu-arm, Hervé Poussineau, Avi Kivity,
	Paolo Bonzini, Richard Henderson

On Wed, Aug 26, 2020 at 04:36:41PM +0200, Eugenio Pérez wrote:
> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> ---
>  hw/alpha/typhoon.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/hw/alpha/typhoon.c b/hw/alpha/typhoon.c
> index 29d44dfb06..b1e6c4e929 100644
> --- a/hw/alpha/typhoon.c
> +++ b/hw/alpha/typhoon.c
> @@ -615,6 +615,7 @@ static bool make_iommu_tlbe(hwaddr taddr, hwaddr mask, IOMMUTLBEntry *ret)
>          .translated_addr = taddr,
>          .addr_mask = mask,
>          .perm = IOMMU_RW,
> +        .type = IOMMU_IOTLB_NONE,

IMHO we don't need to touch all the IOMMUTLBEntry users, but only the callers
of memory_region_notify_iommu*().  We should also comment at the type field
that it's meaningless except when used for IOMMU notifications, because these
are really two different things: IOMMUTLBEntry was originally a translated
entry out of IOMMU hardware, so in those case it does not need a "type" field.

To make it clearer (depending on your preference...), we can introduce
IOMMUTLBEvent to be:

  struct IOMMUTLBEvent {
    IOMMUTLBEntry entry;
    IOMMUTLBType type;
  };

Then it'll be clear on which is which.  Though you'll need to touch more things
(all the callers and all the registerered notifiers).

Thanks,

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC v6 00/13] memory: Delete assertion in memory_region_unregister_iommu_notifier
  2020-08-26 15:00   ` [RFC v6 00/13] memory: Delete assertion in memory_region_unregister_iommu_notifier Eugenio Perez Martin
@ 2020-08-26 15:54     ` Peter Xu
  2020-08-27  6:53       ` Eugenio Perez Martin
  0 siblings, 1 reply; 38+ messages in thread
From: Peter Xu @ 2020-08-26 15:54 UTC (permalink / raw)
  To: Eugenio Perez Martin
  Cc: Peter Maydell, Aleksandar Rikalo, Yan Zhao, Eduardo Habkost,
	Juan Quintela, Jason Wang, Michael S. Tsirkin, qemu-devel,
	Eric Auger, qemu-arm, Hervé Poussineau, Avi Kivity,
	Paolo Bonzini, Richard Henderson

On Wed, Aug 26, 2020 at 05:00:30PM +0200, Eugenio Perez Martin wrote:
> Hi!
> 
> Sending v6 to see if that is on the same page as what you meant.
> Making each setting of "type" explicitly IOMMU_IOTLB_NONE if not used
> in notifications. This is done in different commits in case this helps
> review of different architectures.

I've also proposed IOMMUTLBEvent in the other reply, that might help too.

Since at it, there's also another trick to use - we don't need to touch those
"type" as long as the default type is "zero", so as long as we make sure the
default type (IOMMU_NOTIFIER_NONE) is zero, then we don't need to set it
everywhere either.

> 
> I think that this way we have too much freedom between entry flags
> (currently only access type, RW) and notification type. Since not all
> of them are valid nor used in the same context, I think this adds
> complexity. I'm wondering if:
> 
> Option a) We could make it private to memory.c, and make it a flag of
> memory_region_notify_iommu (like "bool deviotlb_type)". IOW, instead
> of making it a member of IOMMUTLBEntry, wrap the "entry" parameter of
> memory_region_notify_iommu in a new private structure defined in
> memory.c that adds that flag.

No strong preference from me.  But since you posted the series before you
provide the options... Maybe continue with what we have can be easier. :)

> 
> Option b) We could keep the IOMMUTLBNotificationType enum (open to
> suggestions for a better name :)), but not embed it in the struct,
> like:
> 
> diff --git a/include/exec/memory.h b/include/exec/memory.h
> index 477c3af24c..d9150e7b7e 100644
> --- a/include/exec/memory.h
> +++ b/include/exec/memory.h
> @@ -72,7 +72,8 @@ typedef enum {
>      IOMMU_RO   = 1,
>      IOMMU_WO   = 2,
>      IOMMU_RW   = 3,
> -} IOMMUAccessFlags;
> +    IOMMU_DEVIOTLB = 4,
> +} IOMMUEntryFlags;

Just in case you didn't notice - IOMMUAccessFlags is actaully a bitmap. :)

IMHO we can keep the IOMMUAccessFlags scemantics, since it's still correct for
a general translated IOMMUTLBEntry object.

Thanks,

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC v6 12/13] intel_iommu: Do not notify regular iotlb to device-iotlb notifiers
  2020-08-26 14:36   ` [RFC v6 12/13] intel_iommu: Do not notify regular iotlb to device-iotlb notifiers Eugenio Pérez
@ 2020-08-26 16:51     ` 罗勇刚(Yonggang Luo)
  2020-08-27  6:56       ` Eugenio Perez Martin
  0 siblings, 1 reply; 38+ messages in thread
From: 罗勇刚(Yonggang Luo) @ 2020-08-26 16:51 UTC (permalink / raw)
  To: Eugenio Pérez
  Cc: Peter Maydell, Aleksandar Rikalo, Yan Zhao, Eduardo Habkost,
	Juan Quintela, Jason Wang, Michael S. Tsirkin, qemu-level,
	Peter Xu, Eric Auger, qemu-arm, Hervé Poussineau,
	Avi Kivity, Paolo Bonzini, Richard Henderson

[-- Attachment #1: Type: text/plain, Size: 1515 bytes --]

On Wed, Aug 26, 2020 at 10:42 PM Eugenio Pérez <eperezma@redhat.com> wrote:

> This improves performance in case of netperf with vhost-net:
> * TCP_STREAM: From 9049.59Mbit/s to 9049.59Mbit/s (13%)
>
What's improvement ? they are the same


> * TCP_RR: From 8464.73 trans/s to 8932.703333 trans/s (5.5%)
> * UDP_RR: From 8562.08 trans/s to 9005.62/s (5.1%)
> * UDP_STREAM: No change observed (insignificant 0.1% improvement)
>
> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> ---
>  hw/i386/intel_iommu.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> index ddb828da1f..7620a1abbf 100644
> --- a/hw/i386/intel_iommu.c
> +++ b/hw/i386/intel_iommu.c
> @@ -1960,6 +1960,12 @@ static void
> vtd_iotlb_domain_invalidate(IntelIOMMUState *s, uint16_t domain_id)
>      vtd_iommu_unlock(s);
>
>      QLIST_FOREACH(vtd_as, &s->vtd_as_with_notifiers, next) {
> +        if (vtd_as->iommu.iommu_notify_flags & IOMMU_NOTIFIER_DEVIOTLB) {
> +            /* If IOMMU memory region is DEVICE IOTLB type, it does not
> make
> +             * sense to send regular IOMMU notifications. */
> +            continue;
> +        }
> +
>          if (!vtd_dev_to_context_entry(s, pci_bus_num(vtd_as->bus),
>                                        vtd_as->devfn, &ce) &&
>              domain_id == vtd_get_domain_id(s, &ce)) {
> --
> 2.18.1
>
>
>

-- 
         此致
礼
罗勇刚
Yours
    sincerely,
Yonggang Luo

[-- Attachment #2: Type: text/html, Size: 2294 bytes --]

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC v6 02/13] memory: Add IOMMUTLBNotificationType to IOMMUTLBEntry
  2020-08-26 15:42     ` Peter Xu
@ 2020-08-27  6:11       ` Eugenio Perez Martin
  0 siblings, 0 replies; 38+ messages in thread
From: Eugenio Perez Martin @ 2020-08-27  6:11 UTC (permalink / raw)
  To: Peter Xu
  Cc: Peter Maydell, Aleksandar Rikalo, Yan Zhao, Eduardo Habkost,
	Juan Quintela, Jason Wang, Michael S. Tsirkin, qemu-devel,
	Eric Auger, qemu-arm, Hervé Poussineau, Avi Kivity,
	Paolo Bonzini, Richard Henderson

On Wed, Aug 26, 2020 at 5:42 PM Peter Xu <peterx@redhat.com> wrote:
>
> On Wed, Aug 26, 2020 at 04:36:40PM +0200, Eugenio Pérez wrote:
> > This way we can tell between MAPs and UNMAP, and potentially avoid to
> > send them to a notifier that does not require them.
> >
> > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > ---
> >  include/exec/memory.h | 17 ++++++++++++-----
> >  1 file changed, 12 insertions(+), 5 deletions(-)
> >
> > diff --git a/include/exec/memory.h b/include/exec/memory.h
> > index 22c5f564d1..f6d91c54aa 100644
> > --- a/include/exec/memory.h
> > +++ b/include/exec/memory.h
> > @@ -59,6 +59,12 @@ struct ReservedRegion {
> >
> >  typedef struct IOMMUTLBEntry IOMMUTLBEntry;
> >
> > +typedef enum {
> > +    IOMMU_IOTLB_NONE  = 0,
> > +    IOMMU_IOTLB_UNMAP = 1,
> > +    IOMMU_IOTLB_MAP   = 2,
> > +} IOMMUTLBNotificationType;
>
> Can we directly use IOMMUNotifierFlag as the type rather than introducing a
> similar enumeration?  Thanks,
>

Right, I think it's simpler your way.

Thanks!

> --
> Peter Xu
>



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC v6 00/13] memory: Delete assertion in memory_region_unregister_iommu_notifier
  2020-08-26 15:54     ` Peter Xu
@ 2020-08-27  6:53       ` Eugenio Perez Martin
  0 siblings, 0 replies; 38+ messages in thread
From: Eugenio Perez Martin @ 2020-08-27  6:53 UTC (permalink / raw)
  To: Peter Xu
  Cc: Peter Maydell, Aleksandar Rikalo, Yan Zhao, Eduardo Habkost,
	Juan Quintela, Jason Wang, Michael S. Tsirkin, qemu-devel,
	Eric Auger, qemu-arm, Hervé Poussineau, Avi Kivity,
	Paolo Bonzini, Richard Henderson

On Wed, Aug 26, 2020 at 5:54 PM Peter Xu <peterx@redhat.com> wrote:
>
> On Wed, Aug 26, 2020 at 05:00:30PM +0200, Eugenio Perez Martin wrote:
> > Hi!
> >
> > Sending v6 to see if that is on the same page as what you meant.
> > Making each setting of "type" explicitly IOMMU_IOTLB_NONE if not used
> > in notifications. This is done in different commits in case this helps
> > review of different architectures.
>
> I've also proposed IOMMUTLBEvent in the other reply, that might help too.
>

I think IOMMUTLBEvent would be way clearer, yes :).

> Since at it, there's also another trick to use - we don't need to touch those
> "type" as long as the default type is "zero", so as long as we make sure the
> default type (IOMMU_NOTIFIER_NONE) is zero, then we don't need to set it
> everywhere either.
>

Right, I was just making it explicit.

> >
> > I think that this way we have too much freedom between entry flags
> > (currently only access type, RW) and notification type. Since not all
> > of them are valid nor used in the same context, I think this adds
> > complexity. I'm wondering if:
> >
> > Option a) We could make it private to memory.c, and make it a flag of
> > memory_region_notify_iommu (like "bool deviotlb_type)". IOW, instead
> > of making it a member of IOMMUTLBEntry, wrap the "entry" parameter of
> > memory_region_notify_iommu in a new private structure defined in
> > memory.c that adds that flag.
>
> No strong preference from me.  But since you posted the series before you
> provide the options... Maybe continue with what we have can be easier. :)
>

I just wanted to be sure I understood your proposal before comparing :)

> >
> > Option b) We could keep the IOMMUTLBNotificationType enum (open to
> > suggestions for a better name :)), but not embed it in the struct,
> > like:
> >
> > diff --git a/include/exec/memory.h b/include/exec/memory.h
> > index 477c3af24c..d9150e7b7e 100644
> > --- a/include/exec/memory.h
> > +++ b/include/exec/memory.h
> > @@ -72,7 +72,8 @@ typedef enum {
> >      IOMMU_RO   = 1,
> >      IOMMU_WO   = 2,
> >      IOMMU_RW   = 3,
> > -} IOMMUAccessFlags;
> > +    IOMMU_DEVIOTLB = 4,
> > +} IOMMUEntryFlags;
>
> Just in case you didn't notice - IOMMUAccessFlags is actaully a bitmap. :)
>

I know I know, that is why I compared with IOMMU_RW in proposed
iommu_tlb_entry_type.

Thanks!

> IMHO we can keep the IOMMUAccessFlags scemantics, since it's still correct for
> a general translated IOMMUTLBEntry object.
>
> Thanks,
>
> --
> Peter Xu
>



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC v6 12/13] intel_iommu: Do not notify regular iotlb to device-iotlb notifiers
  2020-08-26 16:51     ` 罗勇刚(Yonggang Luo)
@ 2020-08-27  6:56       ` Eugenio Perez Martin
  0 siblings, 0 replies; 38+ messages in thread
From: Eugenio Perez Martin @ 2020-08-27  6:56 UTC (permalink / raw)
  To: luoyonggang
  Cc: Peter Maydell, Aleksandar Rikalo, Yan Zhao, Eduardo Habkost,
	Juan Quintela, Jason Wang, Michael S. Tsirkin, qemu-level,
	Peter Xu, Eric Auger, qemu-arm, Hervé Poussineau,
	Avi Kivity, Paolo Bonzini, Richard Henderson

On Wed, Aug 26, 2020 at 6:52 PM 罗勇刚(Yonggang Luo) <luoyonggang@gmail.com> wrote:
>
>
>
> On Wed, Aug 26, 2020 at 10:42 PM Eugenio Pérez <eperezma@redhat.com> wrote:
>>
>> This improves performance in case of netperf with vhost-net:
>> * TCP_STREAM: From 9049.59Mbit/s to 9049.59Mbit/s (13%)
>
> What's improvement ? they are the same
>

Ouch, it was from 1923.6 Mbit/s to 2175.13 Mbit/s. Thanks for notify
it, will fix in next revision!

>>
>> * TCP_RR: From 8464.73 trans/s to 8932.703333 trans/s (5.5%)
>> * UDP_RR: From 8562.08 trans/s to 9005.62/s (5.1%)
>> * UDP_STREAM: No change observed (insignificant 0.1% improvement)
>>
>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
>> ---
>>  hw/i386/intel_iommu.c | 6 ++++++
>>  1 file changed, 6 insertions(+)
>>
>> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
>> index ddb828da1f..7620a1abbf 100644
>> --- a/hw/i386/intel_iommu.c
>> +++ b/hw/i386/intel_iommu.c
>> @@ -1960,6 +1960,12 @@ static void vtd_iotlb_domain_invalidate(IntelIOMMUState *s, uint16_t domain_id)
>>      vtd_iommu_unlock(s);
>>
>>      QLIST_FOREACH(vtd_as, &s->vtd_as_with_notifiers, next) {
>> +        if (vtd_as->iommu.iommu_notify_flags & IOMMU_NOTIFIER_DEVIOTLB) {
>> +            /* If IOMMU memory region is DEVICE IOTLB type, it does not make
>> +             * sense to send regular IOMMU notifications. */
>> +            continue;
>> +        }
>> +
>>          if (!vtd_dev_to_context_entry(s, pci_bus_num(vtd_as->bus),
>>                                        vtd_as->devfn, &ce) &&
>>              domain_id == vtd_get_domain_id(s, &ce)) {
>> --
>> 2.18.1
>>
>>
>
>
> --
>          此致
> 礼
> 罗勇刚
> Yours
>     sincerely,
> Yonggang Luo



^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH 0/5] memory: Skip assertion in memory_region_unregister_iommu_notifier
  2020-06-25 19:16 [RFC 0/1] memory: Delete assertion in memory_region_unregister_iommu_notifier Eugenio Pérez
                   ` (2 preceding siblings ...)
  2020-08-26 14:36 ` [RFC v6 00/13] " Eugenio Pérez
@ 2020-09-03 16:14 ` Eugenio Pérez
  2020-09-03 16:14   ` [PATCH 1/5] memory: Rename memory_region_notify_one to memory_region_notify_iommu_one Eugenio Pérez
                     ` (4 more replies)
  3 siblings, 5 replies; 38+ messages in thread
From: Eugenio Pérez @ 2020-09-03 16:14 UTC (permalink / raw)
  To: Peter Xu, qemu-devel
  Cc: Peter Maydell, Aleksandar Rikalo, Yan Zhao, Eduardo Habkost,
	Michael S. Tsirkin, Jason Wang, Juan Quintela,
	Hervé Poussineau, Eric Auger, qemu-arm, qemu-ppc,
	Avi Kivity, Paolo Bonzini, David Gibson, Richard Henderson

I am able to hit this assertion when a Red Hat 7 guest virtio_net device
raises an "Invalidation" of all the TLB entries. This happens in the
guest's startup if 'intel_iommu=on' argument is passed to the guest
kernel and right IOMMU/ATS devices are declared in qemu's command line.

Command line:
/home/qemu/x86_64-softmmu/qemu-system-x86_64 -name \
guest=rhel7-test,debug-threads=on -machine \
pc-q35-5.1,accel=kvm,usb=off,dump-guest-core=off,kernel_irqchip=split \
-cpu \
Broadwell,vme=on,ss=on,vmx=on,f16c=on,rdrand=on,hypervisor=on,arat=on,tsc-adjust=on,umip=on,arch-capabilities=on,xsaveopt=on,pdpe1gb=on,abm=on,skip-l1dfl-vmentry=on,rtm=on,hle=on \
-m 8096 -realtime mlock=off -smp 2,sockets=2,cores=1,threads=1 -uuid \
d022ecbf-679e-4755-87ce-eb87fc5bbc5d -display none -no-user-config \
-nodefaults -rtc base=utc,driftfix=slew -global \
kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -global \
ICH9-LPC.disable_s3=1 -global ICH9-LPC.disable_s4=1 -boot strict=on \
-device intel-iommu,intremap=on,device-iotlb=on -device \
pcie-root-port,port=0x8,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x1 \
-device \
pcie-root-port,port=0x9,chassis=2,id=pci.2,bus=pcie.0,addr=0x1.0x1 \
-device \
pcie-root-port,port=0xa,chassis=3,id=pci.3,bus=pcie.0,addr=0x1.0x2 \
-device \
pcie-root-port,port=0xb,chassis=4,id=pci.4,bus=pcie.0,addr=0x1.0x3 \
-device \
pcie-root-port,port=0xc,chassis=5,id=pci.5,bus=pcie.0,addr=0x1.0x4 \
-device \
pcie-root-port,port=0xd,chassis=6,id=pci.6,bus=pcie.0,addr=0x1.0x5 \
-device \
pcie-root-port,port=0xe,chassis=7,id=pci.7,bus=pcie.0,addr=0x1.0x6 \
-device qemu-xhci,p2=15,p3=15,id=usb,bus=pci.2,addr=0x0 -device \
virtio-serial-pci,id=virtio-serial0,bus=pci.3,addr=0x0 -drive \
file=/home/virtio-test2.qcow2,format=qcow2,if=none,id=drive-virtio-disk0 \
-device \
virtio-blk-pci,scsi=off,bus=pci.4,addr=0x0,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 \
-netdev tap,id=hostnet0,vhost=on,vhostforce=on -device \
virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:0d:1d:f2,bus=pci.1,addr=0x0,iommu_platform=on,ats=on \
-device virtio-balloon-pci,id=balloon0,bus=pci.5,addr=0x0 -object \
rng-random,id=objrng0,filename=/dev/urandom -device \
virtio-rng-pci,rng=objrng0,id=rng0,bus=pci.6,addr=0x0 -s -msg \
timestamp=on

Full backtrace:
 #0  0x00007ffff521370f in raise () at /lib64/libc.so.6
 #1  0x00007ffff51fdb25 in abort () at /lib64/libc.so.6
 #2  0x00007ffff51fd9f9 in _nl_load_domain.cold.0 () at /lib64/libc.so.6
 #3  0x00007ffff520bcc6 in .annobin_assert.c_end () at /lib64/libc.so.6
 #4  0x0000555555888171 in memory_region_notify_one (notifier=0x7ffde0487fa8,
                                                    entry=0x7ffde5dfe200)
                          at /home/qemu/memory.c:1918
 #5  0x0000555555888247 in memory_region_notify_iommu (iommu_mr=0x555556f6c0b0,
                                                      iommu_idx=0, entry=...)
                          at /home/qemu/memory.c:1941
 #6  0x0000555555951c8d in vtd_process_device_iotlb_desc (s=0x555557609000,
                                                       inv_desc=0x7ffde5dfe2d0)
                          at /home/qemu/hw/i386/intel_iommu.c:2468
 #7  0x0000555555951e6a in vtd_process_inv_desc (s=0x555557609000)
                          at /home/qemu/hw/i386/intel_iommu.c:2531
 #8  0x0000555555951fa5 in vtd_fetch_inv_desc (s=0x555557609000)
                          at /home/qemu/hw/i386/intel_iommu.c:2563
 #9  0x00005555559520e5 in vtd_handle_iqt_write (s=0x555557609000)
                          at /home/qemu/hw/i386/intel_iommu.c:2590
 #10 0x0000555555952b45 in vtd_mem_write (opaque=0x555557609000, addr=136,
                                         val=2688, size=4)
                          at /home/qemu/hw/i386/intel_iommu.c:2837
 #11 0x0000555555883e17 in memory_region_write_accessor (mr=0x555557609330,
                                                        addr=136,
                                                        value=0x7ffde5dfe478,
                                                        size=4,
                                                        shift=0,
                                                        mask=4294967295,
                                                        attrs=...)
                         at /home/qemu/memory.c:483
 #12 0x000055555588401d in access_with_adjusted_size (addr=136,
                       value=0x7ffde5dfe478,
                       size=4,
                       access_size_min=4,
                       access_size_max=8,
                       access_fn=0x555555883d38 <memory_region_write_accessor>,
                       mr=0x555557609330,
                       attrs=...)
                       at /home/qemu/memory.c:544
 #13 0x0000555555886f37 in memory_region_dispatch_write (mr=0x555557609330,
                                                       addr=136,
                                                       data=2688,
                                                       op=MO_32,
                                                       attrs=...)
                         at /home/qemu/memory.c:1476
 #14 0x0000555555827a03 in flatview_write_continue (fv=0x7ffdd8503150,
                                                   addr=4275634312,
                                                   attrs=...,
                                                   ptr=0x7ffff7ff0028,
                                                   len=4,
                                                   addr1=136,
                                                   l=4,
                                                   mr=0x555557609330)
                          at /home/qemu/exec.c:3146
 #15 0x0000555555827b48 in flatview_write (fv=0x7ffdd8503150,
                                          addr=4275634312,
                                          attrs=...,
                                          buf=0x7ffff7ff0028,
                                          len=4)
                          at /home/qemu/exec.c:3186
 #16 0x0000555555827e9d in address_space_write (
                                      as=0x5555567ca640 <address_space_memory>,
                                      addr=4275634312,
                                      attrs=...,
                                      buf=0x7ffff7ff0028,
                                      len=4)
                          at /home/qemu/exec.c:3277
 #17 0x0000555555827f0a in address_space_rw (
                                      as=0x5555567ca640 <address_space_memory>,
                                      addr=4275634312,
                                      attrs=...,
                                      buf=0x7ffff7ff0028,
                                      len=4,
                                      is_write=true)
                          at /home/qemu/exec.c:3287
 #18 0x000055555589b633 in kvm_cpu_exec (cpu=0x555556b65640)
                               at /home/qemu/accel/kvm/kvm-all.c:2511
 #19 0x0000555555876ba8 in qemu_kvm_cpu_thread_fn (arg=0x555556b65640)
                               at /home/qemu/cpus.c:1284
 #20 0x0000555555dafff1 in qemu_thread_start (args=0x555556b8c3b0)
                               at util/qemu-thread-posix.c:521
 #21 0x00007ffff55a62de in start_thread () at /lib64/libpthread.so.0
 #22 0x00007ffff52d7e83 in clone () at /lib64/libc.so.6

(gdb) frame 4
 #4  0x0000555555888171 in memory_region_notify_one
                      (notifier=0x7ffde0487fa8, entry=0x7ffde5dfe200)
                      at /home/qemu/memory.c:1918
1918        assert(entry->iova >= notifier->start && entry_end <=
notifier->end);
(gdb) p *entry
$1 = {target_as = 0x555556f6c050, iova = 0, translated_addr = 0,
addr_mask = 18446744073709551615, perm = IOMMU_NONE}
--

Tested with vhost-net and qemu driver, host<->guest communication.

v1: * IOMMU_NOTIFIER_ALL now includes IOMMU_NOTIFIER_DEVIOTLB_EVENTS
      also. VFIO IOMMU notifier will register for all events (as before
      of the patching)
    * Cosmetic changes, like:
      - Expand commit messages
      - Better naming and checks
      - Fix alignment issues
      - Avoid an already present casting from `void *`

RFC v8: Fix use of "tmp" notification in memory.c:memory_region_notify_iommu_one

v7: Add IOMMUTLBNotification, and move introduced "type" from
    IOMMUTLBEntry to the former.

v6: Introduce "type" field for IOMMUTLBEntry. Fill in all uses.
    Update tests reports with more fine-tuning (CPU, RPS/XPS tunning).

v5: Skip regular IOTLB notifications in dev_iotlb notifiers

v4: Rename IOMMU_NOTIFIER_IOTLB -> IOMMU_NOTIFIER_DEVIOTLB.
    Make vhost-net notifier just IOMMU_NOTIFIER_DEVIOTLB, not
    IOMMU_NOTIFIER_UNMAP

v3: Skip the assertion in case notifier is a IOTLB one, since they can manage
    arbitrary ranges. Using a flag in the notifier for now, as Peter suggested.

v2: Actually delete assertion instead of just commenting out using C99

Eugenio Pérez (5):
  memory: Rename memory_region_notify_one to
    memory_region_notify_iommu_one
  memory: Add IOMMUTLBEvent
  memory: Add IOMMU_NOTIFIER_DEVIOTLB_UNMAP IOMMUTLBNotificationType
  intel_iommu: Skip page walking on device iotlb invalidations
  memory: Skip bad range assertion if notifier is DEVIOTLB_UNMAP type

 include/exec/memory.h | 40 ++++++++++---------
 hw/arm/smmu-common.c  | 13 +++---
 hw/arm/smmuv3.c       | 13 +++---
 hw/i386/intel_iommu.c | 92 +++++++++++++++++++++++++------------------
 hw/misc/tz-mpc.c      | 32 ++++++++-------
 hw/ppc/spapr_iommu.c  | 15 +++----
 hw/virtio/vhost.c     |  2 +-
 softmmu/memory.c      | 31 +++++++++------
 8 files changed, 135 insertions(+), 103 deletions(-)

-- 
2.18.1



^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH 1/5] memory: Rename memory_region_notify_one to memory_region_notify_iommu_one
  2020-09-03 16:14 ` [PATCH 0/5] memory: Skip " Eugenio Pérez
@ 2020-09-03 16:14   ` Eugenio Pérez
  2020-09-03 16:14   ` [PATCH 2/5] memory: Add IOMMUTLBEvent Eugenio Pérez
                     ` (3 subsequent siblings)
  4 siblings, 0 replies; 38+ messages in thread
From: Eugenio Pérez @ 2020-09-03 16:14 UTC (permalink / raw)
  To: Peter Xu, qemu-devel
  Cc: Peter Maydell, Aleksandar Rikalo, Yan Zhao, Eduardo Habkost,
	Michael S. Tsirkin, Jason Wang, Juan Quintela,
	Hervé Poussineau, Eric Auger, qemu-arm, qemu-ppc,
	Avi Kivity, Paolo Bonzini, David Gibson, Richard Henderson

Previous name didn't reflect the iommu operation.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
---
 include/exec/memory.h | 6 +++---
 hw/arm/smmu-common.c  | 2 +-
 hw/arm/smmuv3.c       | 2 +-
 hw/i386/intel_iommu.c | 4 ++--
 softmmu/memory.c      | 6 +++---
 5 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index 0cfe987ab4..22c5f564d1 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -226,7 +226,7 @@ enum IOMMUMemoryRegionAttr {
  * The IOMMU implementation must use the IOMMU notifier infrastructure
  * to report whenever mappings are changed, by calling
  * memory_region_notify_iommu() (or, if necessary, by calling
- * memory_region_notify_one() for each registered notifier).
+ * memory_region_notify_iommu_one() for each registered notifier).
  *
  * Conceptually an IOMMU provides a mapping from input address
  * to an output TLB entry. If the IOMMU is aware of memory transaction
@@ -1274,7 +1274,7 @@ void memory_region_notify_iommu(IOMMUMemoryRegion *iommu_mr,
                                 IOMMUTLBEntry entry);
 
 /**
- * memory_region_notify_one: notify a change in an IOMMU translation
+ * memory_region_notify_iommu_one: notify a change in an IOMMU translation
  *                           entry to a single notifier
  *
  * This works just like memory_region_notify_iommu(), but it only
@@ -1285,7 +1285,7 @@ void memory_region_notify_iommu(IOMMUMemoryRegion *iommu_mr,
  *         replaces all old entries for the same virtual I/O address range.
  *         Deleted entries have .@perm == 0.
  */
-void memory_region_notify_one(IOMMUNotifier *notifier,
+void memory_region_notify_iommu_one(IOMMUNotifier *notifier,
                               IOMMUTLBEntry *entry);
 
 /**
diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
index 3838db1395..88d2c454f0 100644
--- a/hw/arm/smmu-common.c
+++ b/hw/arm/smmu-common.c
@@ -472,7 +472,7 @@ static void smmu_unmap_notifier_range(IOMMUNotifier *n)
     entry.perm = IOMMU_NONE;
     entry.addr_mask = n->end - n->start;
 
-    memory_region_notify_one(n, &entry);
+    memory_region_notify_iommu_one(n, &entry);
 }
 
 /* Unmap all notifiers attached to @mr */
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index 0122700e72..0a893ae918 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -827,7 +827,7 @@ static void smmuv3_notify_iova(IOMMUMemoryRegion *mr,
     entry.addr_mask = num_pages * (1 << granule) - 1;
     entry.perm = IOMMU_NONE;
 
-    memory_region_notify_one(n, &entry);
+    memory_region_notify_iommu_one(n, &entry);
 }
 
 /* invalidate an asid/iova range tuple in all mr's */
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 5284bb68b6..2ad6b9d796 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -3498,7 +3498,7 @@ static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n)
         /* This field is meaningless for unmap */
         entry.translated_addr = 0;
 
-        memory_region_notify_one(n, &entry);
+        memory_region_notify_iommu_one(n, &entry);
 
         start += mask;
         remain -= mask;
@@ -3536,7 +3536,7 @@ static void vtd_address_space_refresh_all(IntelIOMMUState *s)
 
 static int vtd_replay_hook(IOMMUTLBEntry *entry, void *private)
 {
-    memory_region_notify_one((IOMMUNotifier *)private, entry);
+    memory_region_notify_iommu_one((IOMMUNotifier *)private, entry);
     return 0;
 }
 
diff --git a/softmmu/memory.c b/softmmu/memory.c
index 70b93104e8..961c25b42f 100644
--- a/softmmu/memory.c
+++ b/softmmu/memory.c
@@ -1890,8 +1890,8 @@ void memory_region_unregister_iommu_notifier(MemoryRegion *mr,
     memory_region_update_iommu_notify_flags(iommu_mr, NULL);
 }
 
-void memory_region_notify_one(IOMMUNotifier *notifier,
-                              IOMMUTLBEntry *entry)
+void memory_region_notify_iommu_one(IOMMUNotifier *notifier,
+                                    IOMMUTLBEntry *entry)
 {
     IOMMUNotifierFlag request_flags;
     hwaddr entry_end = entry->iova + entry->addr_mask;
@@ -1927,7 +1927,7 @@ void memory_region_notify_iommu(IOMMUMemoryRegion *iommu_mr,
 
     IOMMU_NOTIFIER_FOREACH(iommu_notifier, iommu_mr) {
         if (iommu_notifier->iommu_idx == iommu_idx) {
-            memory_region_notify_one(iommu_notifier, &entry);
+            memory_region_notify_iommu_one(iommu_notifier, &entry);
         }
     }
 }
-- 
2.18.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 2/5] memory: Add IOMMUTLBEvent
  2020-09-03 16:14 ` [PATCH 0/5] memory: Skip " Eugenio Pérez
  2020-09-03 16:14   ` [PATCH 1/5] memory: Rename memory_region_notify_one to memory_region_notify_iommu_one Eugenio Pérez
@ 2020-09-03 16:14   ` Eugenio Pérez
  2020-09-03 16:14   ` [PATCH 3/5] memory: Add IOMMU_NOTIFIER_DEVIOTLB_UNMAP IOMMUTLBNotificationType Eugenio Pérez
                     ` (2 subsequent siblings)
  4 siblings, 0 replies; 38+ messages in thread
From: Eugenio Pérez @ 2020-09-03 16:14 UTC (permalink / raw)
  To: Peter Xu, qemu-devel
  Cc: Peter Maydell, Aleksandar Rikalo, Yan Zhao, Eduardo Habkost,
	Michael S. Tsirkin, Jason Wang, Juan Quintela,
	Hervé Poussineau, Eric Auger, qemu-arm, qemu-ppc,
	Avi Kivity, Paolo Bonzini, David Gibson, Richard Henderson

This way we can tell between regular IOMMUTLBEntry (entry of IOMMU
hardware) and notifications.

In the notifications, we set explicitly if it is a MAPs or an UNMAP,
instead of trusting in entry permissions to differentiate them.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
---
 include/exec/memory.h | 27 +++++++------
 hw/arm/smmu-common.c  | 13 ++++---
 hw/arm/smmuv3.c       | 13 ++++---
 hw/i386/intel_iommu.c | 88 ++++++++++++++++++++++++-------------------
 hw/misc/tz-mpc.c      | 32 +++++++++-------
 hw/ppc/spapr_iommu.c  | 15 ++++----
 softmmu/memory.c      | 20 +++++-----
 7 files changed, 111 insertions(+), 97 deletions(-)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index 22c5f564d1..ab8d1ac9c9 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -106,6 +106,11 @@ struct IOMMUNotifier {
 };
 typedef struct IOMMUNotifier IOMMUNotifier;
 
+typedef struct IOMMUTLBEvent {
+    IOMMUNotifierFlag type;
+    IOMMUTLBEntry entry;
+} IOMMUTLBEvent;
+
 /* RAM is pre-allocated and passed into qemu_ram_alloc_from_ptr */
 #define RAM_PREALLOC   (1 << 0)
 
@@ -1254,24 +1259,18 @@ uint64_t memory_region_iommu_get_min_page_size(IOMMUMemoryRegion *iommu_mr);
 /**
  * memory_region_notify_iommu: notify a change in an IOMMU translation entry.
  *
- * The notification type will be decided by entry.perm bits:
- *
- * - For UNMAP (cache invalidation) notifies: set entry.perm to IOMMU_NONE.
- * - For MAP (newly added entry) notifies: set entry.perm to the
- *   permission of the page (which is definitely !IOMMU_NONE).
- *
  * Note: for any IOMMU implementation, an in-place mapping change
  * should be notified with an UNMAP followed by a MAP.
  *
  * @iommu_mr: the memory region that was changed
  * @iommu_idx: the IOMMU index for the translation table which has changed
- * @entry: the new entry in the IOMMU translation table.  The entry
- *         replaces all old entries for the same virtual I/O address range.
- *         Deleted entries have .@perm == 0.
+ * @event: TLB event with the new entry in the IOMMU translation table.
+ *         The entry replaces all old entries for the same virtual I/O address
+ *         range.
  */
 void memory_region_notify_iommu(IOMMUMemoryRegion *iommu_mr,
                                 int iommu_idx,
-                                IOMMUTLBEntry entry);
+                                IOMMUTLBEvent event);
 
 /**
  * memory_region_notify_iommu_one: notify a change in an IOMMU translation
@@ -1281,12 +1280,12 @@ void memory_region_notify_iommu(IOMMUMemoryRegion *iommu_mr,
  * notifies a specific notifier, not all of them.
  *
  * @notifier: the notifier to be notified
- * @entry: the new entry in the IOMMU translation table.  The entry
- *         replaces all old entries for the same virtual I/O address range.
- *         Deleted entries have .@perm == 0.
+ * @event: TLB event with the new entry in the IOMMU translation table.
+ *         The entry replaces all old entries for the same virtual I/O address
+ *         range.
  */
 void memory_region_notify_iommu_one(IOMMUNotifier *notifier,
-                              IOMMUTLBEntry *entry);
+                                    IOMMUTLBEvent *event);
 
 /**
  * memory_region_register_iommu_notifier: register a notifier for changes to
diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
index 88d2c454f0..405d5c5325 100644
--- a/hw/arm/smmu-common.c
+++ b/hw/arm/smmu-common.c
@@ -465,14 +465,15 @@ IOMMUMemoryRegion *smmu_iommu_mr(SMMUState *s, uint32_t sid)
 /* Unmap the whole notifier's range */
 static void smmu_unmap_notifier_range(IOMMUNotifier *n)
 {
-    IOMMUTLBEntry entry;
+    IOMMUTLBEvent event;
 
-    entry.target_as = &address_space_memory;
-    entry.iova = n->start;
-    entry.perm = IOMMU_NONE;
-    entry.addr_mask = n->end - n->start;
+    event.type = IOMMU_NOTIFIER_UNMAP;
+    event.entry.target_as = &address_space_memory;
+    event.entry.iova = n->start;
+    event.entry.perm = IOMMU_NONE;
+    event.entry.addr_mask = n->end - n->start;
 
-    memory_region_notify_iommu_one(n, &entry);
+    memory_region_notify_iommu_one(n, &event);
 }
 
 /* Unmap all notifiers attached to @mr */
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index 0a893ae918..62b0e289ca 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -799,7 +799,7 @@ static void smmuv3_notify_iova(IOMMUMemoryRegion *mr,
                                uint8_t tg, uint64_t num_pages)
 {
     SMMUDevice *sdev = container_of(mr, SMMUDevice, iommu);
-    IOMMUTLBEntry entry;
+    IOMMUTLBEvent event;
     uint8_t granule = tg;
 
     if (!tg) {
@@ -822,12 +822,13 @@ static void smmuv3_notify_iova(IOMMUMemoryRegion *mr,
         granule = tt->granule_sz;
     }
 
-    entry.target_as = &address_space_memory;
-    entry.iova = iova;
-    entry.addr_mask = num_pages * (1 << granule) - 1;
-    entry.perm = IOMMU_NONE;
+    event.type = IOMMU_NOTIFIER_UNMAP;
+    event.entry.target_as = &address_space_memory;
+    event.entry.iova = iova;
+    event.entry.addr_mask = num_pages * (1 << granule) - 1;
+    event.entry.perm = IOMMU_NONE;
 
-    memory_region_notify_iommu_one(n, &entry);
+    memory_region_notify_iommu_one(n, &event);
 }
 
 /* invalidate an asid/iova range tuple in all mr's */
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 2ad6b9d796..19661bba3e 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -1073,7 +1073,7 @@ static int vtd_iova_to_slpte(IntelIOMMUState *s, VTDContextEntry *ce,
     }
 }
 
-typedef int (*vtd_page_walk_hook)(IOMMUTLBEntry *entry, void *private);
+typedef int (*vtd_page_walk_hook)(IOMMUTLBEvent *event, void *private);
 
 /**
  * Constant information used during page walking
@@ -1094,11 +1094,12 @@ typedef struct {
     uint16_t domain_id;
 } vtd_page_walk_info;
 
-static int vtd_page_walk_one(IOMMUTLBEntry *entry, vtd_page_walk_info *info)
+static int vtd_page_walk_one(IOMMUTLBEvent *event, vtd_page_walk_info *info)
 {
     VTDAddressSpace *as = info->as;
     vtd_page_walk_hook hook_fn = info->hook_fn;
     void *private = info->private;
+    IOMMUTLBEntry *entry = &event->entry;
     DMAMap target = {
         .iova = entry->iova,
         .size = entry->addr_mask,
@@ -1107,7 +1108,7 @@ static int vtd_page_walk_one(IOMMUTLBEntry *entry, vtd_page_walk_info *info)
     };
     DMAMap *mapped = iova_tree_find(as->iova_tree, &target);
 
-    if (entry->perm == IOMMU_NONE && !info->notify_unmap) {
+    if (event->type == IOMMU_NOTIFIER_UNMAP && !info->notify_unmap) {
         trace_vtd_page_walk_one_skip_unmap(entry->iova, entry->addr_mask);
         return 0;
     }
@@ -1115,7 +1116,7 @@ static int vtd_page_walk_one(IOMMUTLBEntry *entry, vtd_page_walk_info *info)
     assert(hook_fn);
 
     /* Update local IOVA mapped ranges */
-    if (entry->perm) {
+    if (event->type == IOMMU_NOTIFIER_MAP) {
         if (mapped) {
             /* If it's exactly the same translation, skip */
             if (!memcmp(mapped, &target, sizeof(target))) {
@@ -1141,19 +1142,21 @@ static int vtd_page_walk_one(IOMMUTLBEntry *entry, vtd_page_walk_info *info)
                 int ret;
 
                 /* Emulate an UNMAP */
+                event->type = IOMMU_NOTIFIER_UNMAP;
                 entry->perm = IOMMU_NONE;
                 trace_vtd_page_walk_one(info->domain_id,
                                         entry->iova,
                                         entry->translated_addr,
                                         entry->addr_mask,
                                         entry->perm);
-                ret = hook_fn(entry, private);
+                ret = hook_fn(event, private);
                 if (ret) {
                     return ret;
                 }
                 /* Drop any existing mapping */
                 iova_tree_remove(as->iova_tree, &target);
-                /* Recover the correct permission */
+                /* Recover the correct type */
+                event->type = IOMMU_NOTIFIER_MAP;
                 entry->perm = cache_perm;
             }
         }
@@ -1170,7 +1173,7 @@ static int vtd_page_walk_one(IOMMUTLBEntry *entry, vtd_page_walk_info *info)
     trace_vtd_page_walk_one(info->domain_id, entry->iova,
                             entry->translated_addr, entry->addr_mask,
                             entry->perm);
-    return hook_fn(entry, private);
+    return hook_fn(event, private);
 }
 
 /**
@@ -1191,7 +1194,7 @@ static int vtd_page_walk_level(dma_addr_t addr, uint64_t start,
     uint32_t offset;
     uint64_t slpte;
     uint64_t subpage_size, subpage_mask;
-    IOMMUTLBEntry entry;
+    IOMMUTLBEvent event;
     uint64_t iova = start;
     uint64_t iova_next;
     int ret = 0;
@@ -1245,13 +1248,15 @@ static int vtd_page_walk_level(dma_addr_t addr, uint64_t start,
              *
              * In either case, we send an IOTLB notification down.
              */
-            entry.target_as = &address_space_memory;
-            entry.iova = iova & subpage_mask;
-            entry.perm = IOMMU_ACCESS_FLAG(read_cur, write_cur);
-            entry.addr_mask = ~subpage_mask;
+            event.entry.target_as = &address_space_memory;
+            event.entry.iova = iova & subpage_mask;
+            event.entry.perm = IOMMU_ACCESS_FLAG(read_cur, write_cur);
+            event.entry.addr_mask = ~subpage_mask;
             /* NOTE: this is only meaningful if entry_valid == true */
-            entry.translated_addr = vtd_get_slpte_addr(slpte, info->aw);
-            ret = vtd_page_walk_one(&entry, info);
+            event.entry.translated_addr = vtd_get_slpte_addr(slpte, info->aw);
+            event.type = event.entry.perm ? IOMMU_NOTIFIER_MAP :
+                                            IOMMU_NOTIFIER_UNMAP;
+            ret = vtd_page_walk_one(&event, info);
         }
 
         if (ret < 0) {
@@ -1430,10 +1435,10 @@ static int vtd_dev_to_context_entry(IntelIOMMUState *s, uint8_t bus_num,
     return 0;
 }
 
-static int vtd_sync_shadow_page_hook(IOMMUTLBEntry *entry,
+static int vtd_sync_shadow_page_hook(IOMMUTLBEvent *event,
                                      void *private)
 {
-    memory_region_notify_iommu((IOMMUMemoryRegion *)private, 0, *entry);
+    memory_region_notify_iommu(private, 0, *event);
     return 0;
 }
 
@@ -1993,14 +1998,17 @@ static void vtd_iotlb_page_invalidate_notify(IntelIOMMUState *s,
                  * page tables.  We just deliver the PSI down to
                  * invalidate caches.
                  */
-                IOMMUTLBEntry entry = {
-                    .target_as = &address_space_memory,
-                    .iova = addr,
-                    .translated_addr = 0,
-                    .addr_mask = size - 1,
-                    .perm = IOMMU_NONE,
+                IOMMUTLBEvent event = {
+                    .type = IOMMU_NOTIFIER_UNMAP,
+                    .entry = {
+                        .target_as = &address_space_memory,
+                        .iova = addr,
+                        .translated_addr = 0,
+                        .addr_mask = size - 1,
+                        .perm = IOMMU_NONE,
+                    },
                 };
-                memory_region_notify_iommu(&vtd_as->iommu, 0, entry);
+                memory_region_notify_iommu(&vtd_as->iommu, 0, event);
             }
         }
     }
@@ -2412,7 +2420,7 @@ static bool vtd_process_device_iotlb_desc(IntelIOMMUState *s,
                                           VTDInvDesc *inv_desc)
 {
     VTDAddressSpace *vtd_dev_as;
-    IOMMUTLBEntry entry;
+    IOMMUTLBEvent event;
     struct VTDBus *vtd_bus;
     hwaddr addr;
     uint64_t sz;
@@ -2460,12 +2468,13 @@ static bool vtd_process_device_iotlb_desc(IntelIOMMUState *s,
         sz = VTD_PAGE_SIZE;
     }
 
-    entry.target_as = &vtd_dev_as->as;
-    entry.addr_mask = sz - 1;
-    entry.iova = addr;
-    entry.perm = IOMMU_NONE;
-    entry.translated_addr = 0;
-    memory_region_notify_iommu(&vtd_dev_as->iommu, 0, entry);
+    event.type = IOMMU_NOTIFIER_UNMAP;
+    event.entry.target_as = &vtd_dev_as->as;
+    event.entry.addr_mask = sz - 1;
+    event.entry.iova = addr;
+    event.entry.perm = IOMMU_NONE;
+    event.entry.translated_addr = 0;
+    memory_region_notify_iommu(&vtd_dev_as->iommu, 0, event);
 
 done:
     return true;
@@ -3486,19 +3495,20 @@ static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n)
     size = remain = end - start + 1;
 
     while (remain >= VTD_PAGE_SIZE) {
-        IOMMUTLBEntry entry;
+        IOMMUTLBEvent event;
         uint64_t mask = get_naturally_aligned_size(start, remain, s->aw_bits);
 
         assert(mask);
 
-        entry.iova = start;
-        entry.addr_mask = mask - 1;
-        entry.target_as = &address_space_memory;
-        entry.perm = IOMMU_NONE;
+        event.type = IOMMU_NOTIFIER_UNMAP;
+        event.entry.iova = start;
+        event.entry.addr_mask = mask - 1;
+        event.entry.target_as = &address_space_memory;
+        event.entry.perm = IOMMU_NONE;
         /* This field is meaningless for unmap */
-        entry.translated_addr = 0;
+        event.entry.translated_addr = 0;
 
-        memory_region_notify_iommu_one(n, &entry);
+        memory_region_notify_iommu_one(n, &event);
 
         start += mask;
         remain -= mask;
@@ -3534,9 +3544,9 @@ static void vtd_address_space_refresh_all(IntelIOMMUState *s)
     vtd_switch_address_space_all(s);
 }
 
-static int vtd_replay_hook(IOMMUTLBEntry *entry, void *private)
+static int vtd_replay_hook(IOMMUTLBEvent *event, void *private)
 {
-    memory_region_notify_iommu_one((IOMMUNotifier *)private, entry);
+    memory_region_notify_iommu_one(private, event);
     return 0;
 }
 
diff --git a/hw/misc/tz-mpc.c b/hw/misc/tz-mpc.c
index 98f151237f..30481e1c90 100644
--- a/hw/misc/tz-mpc.c
+++ b/hw/misc/tz-mpc.c
@@ -82,8 +82,10 @@ static void tz_mpc_iommu_notify(TZMPC *s, uint32_t lutidx,
     /* Called when the LUT word at lutidx has changed from oldlut to newlut;
      * must call the IOMMU notifiers for the changed blocks.
      */
-    IOMMUTLBEntry entry = {
-        .addr_mask = s->blocksize - 1,
+    IOMMUTLBEvent event = {
+        .entry = {
+            .addr_mask = s->blocksize - 1,
+        }
     };
     hwaddr addr = lutidx * s->blocksize * 32;
     int i;
@@ -100,26 +102,28 @@ static void tz_mpc_iommu_notify(TZMPC *s, uint32_t lutidx,
         block_is_ns = newlut & (1 << i);
 
         trace_tz_mpc_iommu_notify(addr);
-        entry.iova = addr;
-        entry.translated_addr = addr;
+        event.entry.iova = addr;
+        event.entry.translated_addr = addr;
 
-        entry.perm = IOMMU_NONE;
-        memory_region_notify_iommu(&s->upstream, IOMMU_IDX_S, entry);
-        memory_region_notify_iommu(&s->upstream, IOMMU_IDX_NS, entry);
+        event.type = IOMMU_NOTIFIER_UNMAP;
+        event.entry.perm = IOMMU_NONE;
+        memory_region_notify_iommu(&s->upstream, IOMMU_IDX_S, event);
+        memory_region_notify_iommu(&s->upstream, IOMMU_IDX_NS, event);
 
-        entry.perm = IOMMU_RW;
+        event.type = IOMMU_NOTIFIER_MAP;
+        event.entry.perm = IOMMU_RW;
         if (block_is_ns) {
-            entry.target_as = &s->blocked_io_as;
+            event.entry.target_as = &s->blocked_io_as;
         } else {
-            entry.target_as = &s->downstream_as;
+            event.entry.target_as = &s->downstream_as;
         }
-        memory_region_notify_iommu(&s->upstream, IOMMU_IDX_S, entry);
+        memory_region_notify_iommu(&s->upstream, IOMMU_IDX_S, event);
         if (block_is_ns) {
-            entry.target_as = &s->downstream_as;
+            event.entry.target_as = &s->downstream_as;
         } else {
-            entry.target_as = &s->blocked_io_as;
+            event.entry.target_as = &s->blocked_io_as;
         }
-        memory_region_notify_iommu(&s->upstream, IOMMU_IDX_NS, entry);
+        memory_region_notify_iommu(&s->upstream, IOMMU_IDX_NS, event);
     }
 }
 
diff --git a/hw/ppc/spapr_iommu.c b/hw/ppc/spapr_iommu.c
index 0fecabc135..8bc3cff05d 100644
--- a/hw/ppc/spapr_iommu.c
+++ b/hw/ppc/spapr_iommu.c
@@ -445,7 +445,7 @@ static void spapr_tce_reset(DeviceState *dev)
 static target_ulong put_tce_emu(SpaprTceTable *tcet, target_ulong ioba,
                                 target_ulong tce)
 {
-    IOMMUTLBEntry entry;
+    IOMMUTLBEvent event;
     hwaddr page_mask = IOMMU_PAGE_MASK(tcet->page_shift);
     unsigned long index = (ioba - tcet->bus_offset) >> tcet->page_shift;
 
@@ -457,12 +457,13 @@ static target_ulong put_tce_emu(SpaprTceTable *tcet, target_ulong ioba,
 
     tcet->table[index] = tce;
 
-    entry.target_as = &address_space_memory,
-    entry.iova = (ioba - tcet->bus_offset) & page_mask;
-    entry.translated_addr = tce & page_mask;
-    entry.addr_mask = ~page_mask;
-    entry.perm = spapr_tce_iommu_access_flags(tce);
-    memory_region_notify_iommu(&tcet->iommu, 0, entry);
+    event.entry.target_as = &address_space_memory,
+    event.entry.iova = (ioba - tcet->bus_offset) & page_mask;
+    event.entry.translated_addr = tce & page_mask;
+    event.entry.addr_mask = ~page_mask;
+    event.entry.perm = spapr_tce_iommu_access_flags(tce);
+    event.type = entry.perm ? IOMMU_NOTIFIER_MAP : IOMMU_NOTIFIER_UNMAP;
+    memory_region_notify_iommu(&tcet->iommu, 0, event);
 
     return H_SUCCESS;
 }
diff --git a/softmmu/memory.c b/softmmu/memory.c
index 961c25b42f..8694fc7cf7 100644
--- a/softmmu/memory.c
+++ b/softmmu/memory.c
@@ -1891,11 +1891,15 @@ void memory_region_unregister_iommu_notifier(MemoryRegion *mr,
 }
 
 void memory_region_notify_iommu_one(IOMMUNotifier *notifier,
-                                    IOMMUTLBEntry *entry)
+                                    IOMMUTLBEvent *event)
 {
-    IOMMUNotifierFlag request_flags;
+    IOMMUTLBEntry *entry = &event->entry;
     hwaddr entry_end = entry->iova + entry->addr_mask;
 
+    if (event->type == IOMMU_NOTIFIER_UNMAP) {
+        assert(entry->perm == IOMMU_NONE);
+    }
+
     /*
      * Skip the notification if the notification does not overlap
      * with registered range.
@@ -1906,20 +1910,14 @@ void memory_region_notify_iommu_one(IOMMUNotifier *notifier,
 
     assert(entry->iova >= notifier->start && entry_end <= notifier->end);
 
-    if (entry->perm & IOMMU_RW) {
-        request_flags = IOMMU_NOTIFIER_MAP;
-    } else {
-        request_flags = IOMMU_NOTIFIER_UNMAP;
-    }
-
-    if (notifier->notifier_flags & request_flags) {
+    if (event->type & notifier->notifier_flags) {
         notifier->notify(notifier, entry);
     }
 }
 
 void memory_region_notify_iommu(IOMMUMemoryRegion *iommu_mr,
                                 int iommu_idx,
-                                IOMMUTLBEntry entry)
+                                IOMMUTLBEvent event)
 {
     IOMMUNotifier *iommu_notifier;
 
@@ -1927,7 +1925,7 @@ void memory_region_notify_iommu(IOMMUMemoryRegion *iommu_mr,
 
     IOMMU_NOTIFIER_FOREACH(iommu_notifier, iommu_mr) {
         if (iommu_notifier->iommu_idx == iommu_idx) {
-            memory_region_notify_iommu_one(iommu_notifier, &entry);
+            memory_region_notify_iommu_one(iommu_notifier, &event);
         }
     }
 }
-- 
2.18.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 3/5] memory: Add IOMMU_NOTIFIER_DEVIOTLB_UNMAP IOMMUTLBNotificationType
  2020-09-03 16:14 ` [PATCH 0/5] memory: Skip " Eugenio Pérez
  2020-09-03 16:14   ` [PATCH 1/5] memory: Rename memory_region_notify_one to memory_region_notify_iommu_one Eugenio Pérez
  2020-09-03 16:14   ` [PATCH 2/5] memory: Add IOMMUTLBEvent Eugenio Pérez
@ 2020-09-03 16:14   ` Eugenio Pérez
  2020-09-03 16:14   ` [PATCH 4/5] intel_iommu: Skip page walking on device iotlb invalidations Eugenio Pérez
  2020-09-03 16:14   ` [PATCH 5/5] memory: Skip bad range assertion if notifier is DEVIOTLB_UNMAP type Eugenio Pérez
  4 siblings, 0 replies; 38+ messages in thread
From: Eugenio Pérez @ 2020-09-03 16:14 UTC (permalink / raw)
  To: Peter Xu, qemu-devel
  Cc: Peter Maydell, Aleksandar Rikalo, Yan Zhao, Eduardo Habkost,
	Michael S. Tsirkin, Jason Wang, Juan Quintela,
	Hervé Poussineau, Eric Auger, qemu-arm, qemu-ppc,
	Avi Kivity, Paolo Bonzini, David Gibson, Richard Henderson

This allows us to differentiate between regular IOMMU map/unmap events
and DEVIOTLB unmap. Doing so, notifiers that only need device IOTLB
invalidations will not receive regular IOMMU unmappings.

Adapt intel and vhost to use it.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
---
 include/exec/memory.h | 7 ++++++-
 hw/i386/intel_iommu.c | 2 +-
 hw/virtio/vhost.c     | 2 +-
 3 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index ab8d1ac9c9..77afeda60c 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -87,9 +87,14 @@ typedef enum {
     IOMMU_NOTIFIER_UNMAP = 0x1,
     /* Notify entry changes (newly created entries) */
     IOMMU_NOTIFIER_MAP = 0x2,
+    /* Notify changes on device IOTLB entries */
+    IOMMU_NOTIFIER_DEVIOTLB_UNMAP = 0x04,
 } IOMMUNotifierFlag;
 
-#define IOMMU_NOTIFIER_ALL (IOMMU_NOTIFIER_MAP | IOMMU_NOTIFIER_UNMAP)
+#define IOMMU_NOTIFIER_IOTLB_EVENTS (IOMMU_NOTIFIER_MAP | IOMMU_NOTIFIER_UNMAP)
+#define IOMMU_NOTIFIER_DEVIOTLB_EVENTS IOMMU_NOTIFIER_DEVIOTLB_UNMAP
+#define IOMMU_NOTIFIER_ALL (IOMMU_NOTIFIER_IOTLB_EVENTS | \
+                            IOMMU_NOTIFIER_DEVIOTLB_EVENTS)
 
 struct IOMMUNotifier;
 typedef void (*IOMMUNotify)(struct IOMMUNotifier *notifier,
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 19661bba3e..ab6833d5a0 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -2468,7 +2468,7 @@ static bool vtd_process_device_iotlb_desc(IntelIOMMUState *s,
         sz = VTD_PAGE_SIZE;
     }
 
-    event.type = IOMMU_NOTIFIER_UNMAP;
+    event.type = IOMMU_NOTIFIER_DEVIOTLB_UNMAP;
     event.entry.target_as = &vtd_dev_as->as;
     event.entry.addr_mask = sz - 1;
     event.entry.iova = addr;
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 1a1384e7a6..1bf5bbc82e 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -729,7 +729,7 @@ static void vhost_iommu_region_add(MemoryListener *listener,
     iommu_idx = memory_region_iommu_attrs_to_index(iommu_mr,
                                                    MEMTXATTRS_UNSPECIFIED);
     iommu_notifier_init(&iommu->n, vhost_iommu_unmap_notify,
-                        IOMMU_NOTIFIER_UNMAP,
+                        IOMMU_NOTIFIER_DEVIOTLB_UNMAP,
                         section->offset_within_region,
                         int128_get64(end),
                         iommu_idx);
-- 
2.18.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 4/5] intel_iommu: Skip page walking on device iotlb invalidations
  2020-09-03 16:14 ` [PATCH 0/5] memory: Skip " Eugenio Pérez
                     ` (2 preceding siblings ...)
  2020-09-03 16:14   ` [PATCH 3/5] memory: Add IOMMU_NOTIFIER_DEVIOTLB_UNMAP IOMMUTLBNotificationType Eugenio Pérez
@ 2020-09-03 16:14   ` Eugenio Pérez
  2020-09-04 18:32     ` Peter Xu
  2020-09-03 16:14   ` [PATCH 5/5] memory: Skip bad range assertion if notifier is DEVIOTLB_UNMAP type Eugenio Pérez
  4 siblings, 1 reply; 38+ messages in thread
From: Eugenio Pérez @ 2020-09-03 16:14 UTC (permalink / raw)
  To: Peter Xu, qemu-devel
  Cc: Peter Maydell, Aleksandar Rikalo, Yan Zhao, Eduardo Habkost,
	Michael S. Tsirkin, Jason Wang, Juan Quintela,
	Hervé Poussineau, Eric Auger, qemu-arm, qemu-ppc,
	Avi Kivity, Paolo Bonzini, David Gibson, Richard Henderson

Although they didn't reach the notifier because of the filtering in
memory_region_notify_iommu_one, the vt-d was still splitting huge
memory invalidations in chunks. Skipping it.

This improves performance in case of netperf with vhost-net:
* TCP_STREAM: From 1923.6Mbit/s to 2175.13Mbit/s (13%)
* TCP_RR: From 8464.73 trans/s to 8932.703333 trans/s (5.5%)
* UDP_RR: From 8562.08 trans/s to 9005.62/s (5.1%)
* UDP_STREAM: No change observed (insignificant 0.1% improvement)

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 hw/i386/intel_iommu.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index ab6833d5a0..fbbda0c87e 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -1478,6 +1478,10 @@ static int vtd_sync_shadow_page_table(VTDAddressSpace *vtd_as)
     VTDContextEntry ce;
     IOMMUNotifier *n;
 
+    if (!(vtd_as->iommu.iommu_notify_flags & IOMMU_NOTIFIER_IOTLB_EVENTS)) {
+        return 0;
+    }
+
     ret = vtd_dev_to_context_entry(vtd_as->iommu_state,
                                    pci_bus_num(vtd_as->bus),
                                    vtd_as->devfn, &ce);
-- 
2.18.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 5/5] memory: Skip bad range assertion if notifier is DEVIOTLB_UNMAP type
  2020-09-03 16:14 ` [PATCH 0/5] memory: Skip " Eugenio Pérez
                     ` (3 preceding siblings ...)
  2020-09-03 16:14   ` [PATCH 4/5] intel_iommu: Skip page walking on device iotlb invalidations Eugenio Pérez
@ 2020-09-03 16:14   ` Eugenio Pérez
  2020-09-04  4:34     ` Jason Wang
  4 siblings, 1 reply; 38+ messages in thread
From: Eugenio Pérez @ 2020-09-03 16:14 UTC (permalink / raw)
  To: Peter Xu, qemu-devel
  Cc: Peter Maydell, Aleksandar Rikalo, Yan Zhao, Eduardo Habkost,
	Michael S. Tsirkin, Jason Wang, Juan Quintela,
	Hervé Poussineau, Eric Auger, qemu-arm, qemu-ppc,
	Avi Kivity, Paolo Bonzini, David Gibson, Richard Henderson

Device IOTLB invalidations can unmap arbitrary ranges, eiter outside of
the memory region or even [0, ~0ULL] for all the space. The assertion
could be hit by a guest, and rhel7 guest effectively hit it.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
---
 softmmu/memory.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/softmmu/memory.c b/softmmu/memory.c
index 8694fc7cf7..e723fcbaa1 100644
--- a/softmmu/memory.c
+++ b/softmmu/memory.c
@@ -1895,6 +1895,7 @@ void memory_region_notify_iommu_one(IOMMUNotifier *notifier,
 {
     IOMMUTLBEntry *entry = &event->entry;
     hwaddr entry_end = entry->iova + entry->addr_mask;
+    IOMMUTLBEntry tmp = *entry;
 
     if (event->type == IOMMU_NOTIFIER_UNMAP) {
         assert(entry->perm == IOMMU_NONE);
@@ -1908,10 +1909,18 @@ void memory_region_notify_iommu_one(IOMMUNotifier *notifier,
         return;
     }
 
-    assert(entry->iova >= notifier->start && entry_end <= notifier->end);
+    if (notifier->notifier_flags & IOMMU_NOTIFIER_DEVIOTLB_UNMAP) {
+        /* Crop (iova, addr_mask) to range */
+        tmp.iova = MAX(tmp.iova, notifier->start);
+        tmp.addr_mask = MIN(entry_end, notifier->end) - tmp.iova;
+        /* Confirm no underflow */
+        assert(MIN(entry_end, notifier->end) >= tmp.iova);
+    } else {
+        assert(entry->iova >= notifier->start && entry_end <= notifier->end);
+    }
 
     if (event->type & notifier->notifier_flags) {
-        notifier->notify(notifier, entry);
+        notifier->notify(notifier, &tmp);
     }
 }
 
-- 
2.18.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [PATCH 5/5] memory: Skip bad range assertion if notifier is DEVIOTLB_UNMAP type
  2020-09-03 16:14   ` [PATCH 5/5] memory: Skip bad range assertion if notifier is DEVIOTLB_UNMAP type Eugenio Pérez
@ 2020-09-04  4:34     ` Jason Wang
  2020-09-28  9:05       ` Eugenio Perez Martin
  0 siblings, 1 reply; 38+ messages in thread
From: Jason Wang @ 2020-09-04  4:34 UTC (permalink / raw)
  To: Eugenio Pérez, Peter Xu, qemu-devel
  Cc: Peter Maydell, Aleksandar Rikalo, Yan Zhao, Eduardo Habkost,
	Michael S. Tsirkin, Juan Quintela, Hervé Poussineau,
	Eric Auger, qemu-arm, qemu-ppc, Avi Kivity, Paolo Bonzini,
	David Gibson, Richard Henderson


On 2020/9/4 上午12:14, Eugenio Pérez wrote:
> Device IOTLB invalidations can unmap arbitrary ranges, eiter outside of
> the memory region or even [0, ~0ULL] for all the space. The assertion
> could be hit by a guest, and rhel7 guest effectively hit it.
>
> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> Reviewed-by: Peter Xu <peterx@redhat.com>
> Reviewed-by: Juan Quintela <quintela@redhat.com>
> ---
>   softmmu/memory.c | 13 +++++++++++--
>   1 file changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/softmmu/memory.c b/softmmu/memory.c
> index 8694fc7cf7..e723fcbaa1 100644
> --- a/softmmu/memory.c
> +++ b/softmmu/memory.c
> @@ -1895,6 +1895,7 @@ void memory_region_notify_iommu_one(IOMMUNotifier *notifier,
>   {
>       IOMMUTLBEntry *entry = &event->entry;
>       hwaddr entry_end = entry->iova + entry->addr_mask;
> +    IOMMUTLBEntry tmp = *entry;
>   
>       if (event->type == IOMMU_NOTIFIER_UNMAP) {
>           assert(entry->perm == IOMMU_NONE);
> @@ -1908,10 +1909,18 @@ void memory_region_notify_iommu_one(IOMMUNotifier *notifier,
>           return;
>       }
>   
> -    assert(entry->iova >= notifier->start && entry_end <= notifier->end);
> +    if (notifier->notifier_flags & IOMMU_NOTIFIER_DEVIOTLB_UNMAP) {
> +        /* Crop (iova, addr_mask) to range */
> +        tmp.iova = MAX(tmp.iova, notifier->start);
> +        tmp.addr_mask = MIN(entry_end, notifier->end) - tmp.iova;
> +        /* Confirm no underflow */
> +        assert(MIN(entry_end, notifier->end) >= tmp.iova);


It's still not clear to me why we need such assert. Consider 
notifier->end is the possible IOVA range but not possible device IOTLB 
invalidation range (e.g it allows [0, ULLONG_MAX]).

Thanks


> +    } else {
> +        assert(entry->iova >= notifier->start && entry_end <= notifier->end);
> +    }
>   
>       if (event->type & notifier->notifier_flags) {
> -        notifier->notify(notifier, entry);
> +        notifier->notify(notifier, &tmp);
>       }
>   }
>   



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 4/5] intel_iommu: Skip page walking on device iotlb invalidations
  2020-09-03 16:14   ` [PATCH 4/5] intel_iommu: Skip page walking on device iotlb invalidations Eugenio Pérez
@ 2020-09-04 18:32     ` Peter Xu
  0 siblings, 0 replies; 38+ messages in thread
From: Peter Xu @ 2020-09-04 18:32 UTC (permalink / raw)
  To: Eugenio Pérez
  Cc: Peter Maydell, Aleksandar Rikalo, Yan Zhao, Eduardo Habkost,
	Michael S. Tsirkin, Jason Wang, Juan Quintela,
	Hervé Poussineau, qemu-devel, Eric Auger, qemu-arm,
	qemu-ppc, Avi Kivity, Paolo Bonzini, David Gibson,
	Richard Henderson

On Thu, Sep 03, 2020 at 06:14:45PM +0200, Eugenio Pérez wrote:
> Although they didn't reach the notifier because of the filtering in
> memory_region_notify_iommu_one, the vt-d was still splitting huge
> memory invalidations in chunks. Skipping it.
> 
> This improves performance in case of netperf with vhost-net:
> * TCP_STREAM: From 1923.6Mbit/s to 2175.13Mbit/s (13%)
> * TCP_RR: From 8464.73 trans/s to 8932.703333 trans/s (5.5%)
> * UDP_RR: From 8562.08 trans/s to 9005.62/s (5.1%)
> * UDP_STREAM: No change observed (insignificant 0.1% improvement)
> 
> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>

Reviewed-by: Peter Xu <peterx@redhat.com>

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 5/5] memory: Skip bad range assertion if notifier is DEVIOTLB_UNMAP type
  2020-09-04  4:34     ` Jason Wang
@ 2020-09-28  9:05       ` Eugenio Perez Martin
  2020-09-28 17:48         ` Peter Xu
  0 siblings, 1 reply; 38+ messages in thread
From: Eugenio Perez Martin @ 2020-09-28  9:05 UTC (permalink / raw)
  To: Jason Wang
  Cc: Peter Maydell, Aleksandar Rikalo, Yan Zhao, Eduardo Habkost,
	Michael S. Tsirkin, Juan Quintela, Hervé Poussineau,
	qemu-level, Peter Xu, Eric Auger, qemu-arm, qemu-ppc, Avi Kivity,
	Paolo Bonzini, David Gibson, Richard Henderson

On Fri, Sep 4, 2020 at 6:34 AM Jason Wang <jasowang@redhat.com> wrote:
>
>
> On 2020/9/4 上午12:14, Eugenio Pérez wrote:
> > Device IOTLB invalidations can unmap arbitrary ranges, eiter outside of
> > the memory region or even [0, ~0ULL] for all the space. The assertion
> > could be hit by a guest, and rhel7 guest effectively hit it.
> >
> > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > Reviewed-by: Peter Xu <peterx@redhat.com>
> > Reviewed-by: Juan Quintela <quintela@redhat.com>
> > ---
> >   softmmu/memory.c | 13 +++++++++++--
> >   1 file changed, 11 insertions(+), 2 deletions(-)
> >
> > diff --git a/softmmu/memory.c b/softmmu/memory.c
> > index 8694fc7cf7..e723fcbaa1 100644
> > --- a/softmmu/memory.c
> > +++ b/softmmu/memory.c
> > @@ -1895,6 +1895,7 @@ void memory_region_notify_iommu_one(IOMMUNotifier *notifier,
> >   {
> >       IOMMUTLBEntry *entry = &event->entry;
> >       hwaddr entry_end = entry->iova + entry->addr_mask;
> > +    IOMMUTLBEntry tmp = *entry;
> >
> >       if (event->type == IOMMU_NOTIFIER_UNMAP) {
> >           assert(entry->perm == IOMMU_NONE);
> > @@ -1908,10 +1909,18 @@ void memory_region_notify_iommu_one(IOMMUNotifier *notifier,
> >           return;
> >       }
> >
> > -    assert(entry->iova >= notifier->start && entry_end <= notifier->end);
> > +    if (notifier->notifier_flags & IOMMU_NOTIFIER_DEVIOTLB_UNMAP) {
> > +        /* Crop (iova, addr_mask) to range */
> > +        tmp.iova = MAX(tmp.iova, notifier->start);
> > +        tmp.addr_mask = MIN(entry_end, notifier->end) - tmp.iova;
> > +        /* Confirm no underflow */
> > +        assert(MIN(entry_end, notifier->end) >= tmp.iova);
>
>
> It's still not clear to me why we need such assert. Consider
> notifier->end is the possible IOVA range but not possible device IOTLB
> invalidation range (e.g it allows [0, ULLONG_MAX]).
>
> Thanks
>

As far as I understood the device should admit that out of bounds
notifications in that case,
and the assert just makes sure that there was no underflow in
tmp.addr_mask, i.e., that something
very wrong that should never happen in production happened.

Peter, would you mind to confirm/correct it?

Is there anything else needed to pull this patch?

Thanks!

>
> > +    } else {
> > +        assert(entry->iova >= notifier->start && entry_end <= notifier->end);
> > +    }
> >
> >       if (event->type & notifier->notifier_flags) {
> > -        notifier->notify(notifier, entry);
> > +        notifier->notify(notifier, &tmp);
> >       }
> >   }
> >
>



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 5/5] memory: Skip bad range assertion if notifier is DEVIOTLB_UNMAP type
  2020-09-28  9:05       ` Eugenio Perez Martin
@ 2020-09-28 17:48         ` Peter Xu
  2020-10-03 17:38           ` Michael S. Tsirkin
  0 siblings, 1 reply; 38+ messages in thread
From: Peter Xu @ 2020-09-28 17:48 UTC (permalink / raw)
  To: Eugenio Perez Martin
  Cc: Peter Maydell, Aleksandar Rikalo, Yan Zhao, Eduardo Habkost,
	Michael S. Tsirkin, Jason Wang, Juan Quintela,
	Hervé Poussineau, qemu-level, Eric Auger, qemu-arm,
	qemu-ppc, Avi Kivity, Paolo Bonzini, David Gibson,
	Richard Henderson

On Mon, Sep 28, 2020 at 11:05:01AM +0200, Eugenio Perez Martin wrote:
> On Fri, Sep 4, 2020 at 6:34 AM Jason Wang <jasowang@redhat.com> wrote:
> >
> >
> > On 2020/9/4 上午12:14, Eugenio Pérez wrote:
> > > Device IOTLB invalidations can unmap arbitrary ranges, eiter outside of
> > > the memory region or even [0, ~0ULL] for all the space. The assertion
> > > could be hit by a guest, and rhel7 guest effectively hit it.
> > >
> > > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > > Reviewed-by: Peter Xu <peterx@redhat.com>
> > > Reviewed-by: Juan Quintela <quintela@redhat.com>
> > > ---
> > >   softmmu/memory.c | 13 +++++++++++--
> > >   1 file changed, 11 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/softmmu/memory.c b/softmmu/memory.c
> > > index 8694fc7cf7..e723fcbaa1 100644
> > > --- a/softmmu/memory.c
> > > +++ b/softmmu/memory.c
> > > @@ -1895,6 +1895,7 @@ void memory_region_notify_iommu_one(IOMMUNotifier *notifier,
> > >   {
> > >       IOMMUTLBEntry *entry = &event->entry;
> > >       hwaddr entry_end = entry->iova + entry->addr_mask;
> > > +    IOMMUTLBEntry tmp = *entry;
> > >
> > >       if (event->type == IOMMU_NOTIFIER_UNMAP) {
> > >           assert(entry->perm == IOMMU_NONE);
> > > @@ -1908,10 +1909,18 @@ void memory_region_notify_iommu_one(IOMMUNotifier *notifier,
> > >           return;
> > >       }
> > >
> > > -    assert(entry->iova >= notifier->start && entry_end <= notifier->end);
> > > +    if (notifier->notifier_flags & IOMMU_NOTIFIER_DEVIOTLB_UNMAP) {
> > > +        /* Crop (iova, addr_mask) to range */
> > > +        tmp.iova = MAX(tmp.iova, notifier->start);
> > > +        tmp.addr_mask = MIN(entry_end, notifier->end) - tmp.iova;
> > > +        /* Confirm no underflow */
> > > +        assert(MIN(entry_end, notifier->end) >= tmp.iova);
> >
> >
> > It's still not clear to me why we need such assert. Consider
> > notifier->end is the possible IOVA range but not possible device IOTLB
> > invalidation range (e.g it allows [0, ULLONG_MAX]).
> >
> > Thanks
> >
> 
> As far as I understood the device should admit that out of bounds
> notifications in that case,
> and the assert just makes sure that there was no underflow in
> tmp.addr_mask, i.e., that something
> very wrong that should never happen in production happened.
> 
> Peter, would you mind to confirm/correct it?

I think Jason is right - since we have checked at the entry that the two
regions cross over each other:

    /*
     * Skip the notification if the notification does not overlap
     * with registered range.
     */
    if (notifier->start > entry_end || notifier->end < entry->iova) {
        return;
    }

Then I don't see how this assertion can fail any more.

But imho not a big problem either, and it shouldn't hurt to even keep the
assertion of above isn't that straightforward.

> 
> Is there anything else needed to pull this patch?

I didn't post a pull for this only because I shouldn't :) - the plan was that
all vt-d patches will still go via Michael's tree, iiuc.  Though at least to me
I think this series is acceptable for merging.

Though it would always be good too if Jason would still like to review it.

Jason, what's your opinion?

Thanks,

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 5/5] memory: Skip bad range assertion if notifier is DEVIOTLB_UNMAP type
  2020-09-28 17:48         ` Peter Xu
@ 2020-10-03 17:38           ` Michael S. Tsirkin
  2020-10-05  6:32             ` Eugenio Perez Martin
  2020-10-15  7:50             ` Jason Wang
  0 siblings, 2 replies; 38+ messages in thread
From: Michael S. Tsirkin @ 2020-10-03 17:38 UTC (permalink / raw)
  To: Peter Xu
  Cc: Peter Maydell, Aleksandar Rikalo, Yan Zhao, Eduardo Habkost,
	Eric Auger, Jason Wang, Juan Quintela, Hervé Poussineau,
	qemu-level, Eugenio Perez Martin, qemu-arm, qemu-ppc, Avi Kivity,
	Paolo Bonzini, David Gibson, Richard Henderson

On Mon, Sep 28, 2020 at 01:48:57PM -0400, Peter Xu wrote:
> On Mon, Sep 28, 2020 at 11:05:01AM +0200, Eugenio Perez Martin wrote:
> > On Fri, Sep 4, 2020 at 6:34 AM Jason Wang <jasowang@redhat.com> wrote:
> > >
> > >
> > > On 2020/9/4 上午12:14, Eugenio Pérez wrote:
> > > > Device IOTLB invalidations can unmap arbitrary ranges, eiter outside of
> > > > the memory region or even [0, ~0ULL] for all the space. The assertion
> > > > could be hit by a guest, and rhel7 guest effectively hit it.
> > > >
> > > > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > > > Reviewed-by: Peter Xu <peterx@redhat.com>
> > > > Reviewed-by: Juan Quintela <quintela@redhat.com>
> > > > ---
> > > >   softmmu/memory.c | 13 +++++++++++--
> > > >   1 file changed, 11 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/softmmu/memory.c b/softmmu/memory.c
> > > > index 8694fc7cf7..e723fcbaa1 100644
> > > > --- a/softmmu/memory.c
> > > > +++ b/softmmu/memory.c
> > > > @@ -1895,6 +1895,7 @@ void memory_region_notify_iommu_one(IOMMUNotifier *notifier,
> > > >   {
> > > >       IOMMUTLBEntry *entry = &event->entry;
> > > >       hwaddr entry_end = entry->iova + entry->addr_mask;
> > > > +    IOMMUTLBEntry tmp = *entry;
> > > >
> > > >       if (event->type == IOMMU_NOTIFIER_UNMAP) {
> > > >           assert(entry->perm == IOMMU_NONE);
> > > > @@ -1908,10 +1909,18 @@ void memory_region_notify_iommu_one(IOMMUNotifier *notifier,
> > > >           return;
> > > >       }
> > > >
> > > > -    assert(entry->iova >= notifier->start && entry_end <= notifier->end);
> > > > +    if (notifier->notifier_flags & IOMMU_NOTIFIER_DEVIOTLB_UNMAP) {
> > > > +        /* Crop (iova, addr_mask) to range */
> > > > +        tmp.iova = MAX(tmp.iova, notifier->start);
> > > > +        tmp.addr_mask = MIN(entry_end, notifier->end) - tmp.iova;
> > > > +        /* Confirm no underflow */
> > > > +        assert(MIN(entry_end, notifier->end) >= tmp.iova);
> > >
> > >
> > > It's still not clear to me why we need such assert. Consider
> > > notifier->end is the possible IOVA range but not possible device IOTLB
> > > invalidation range (e.g it allows [0, ULLONG_MAX]).
> > >
> > > Thanks
> > >
> > 
> > As far as I understood the device should admit that out of bounds
> > notifications in that case,
> > and the assert just makes sure that there was no underflow in
> > tmp.addr_mask, i.e., that something
> > very wrong that should never happen in production happened.
> > 
> > Peter, would you mind to confirm/correct it?
> 
> I think Jason is right - since we have checked at the entry that the two
> regions cross over each other:
> 
>     /*
>      * Skip the notification if the notification does not overlap
>      * with registered range.
>      */
>     if (notifier->start > entry_end || notifier->end < entry->iova) {
>         return;
>     }
> 
> Then I don't see how this assertion can fail any more.
> 
> But imho not a big problem either, and it shouldn't hurt to even keep the
> assertion of above isn't that straightforward.
> 
> > 
> > Is there anything else needed to pull this patch?
> 
> I didn't post a pull for this only because I shouldn't :) - the plan was that
> all vt-d patches will still go via Michael's tree, iiuc.  Though at least to me
> I think this series is acceptable for merging.

Sure, that's ok.

Eugenio, you sent patch 0 as a response to another series, which
made me miss the series. Pls don't do that in the future.

Looks like Jason reviewed the series - Jason, is that right? -
so I'd like his ack if possible.


> Though it would always be good too if Jason would still like to review it.
> 
> Jason, what's your opinion?
> 
> Thanks,
> 
> -- 
> Peter Xu



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 5/5] memory: Skip bad range assertion if notifier is DEVIOTLB_UNMAP type
  2020-10-03 17:38           ` Michael S. Tsirkin
@ 2020-10-05  6:32             ` Eugenio Perez Martin
  2020-10-15  7:50             ` Jason Wang
  1 sibling, 0 replies; 38+ messages in thread
From: Eugenio Perez Martin @ 2020-10-05  6:32 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Peter Maydell, Aleksandar Rikalo, Yan Zhao, Eduardo Habkost,
	Juan Quintela, Jason Wang, Hervé Poussineau, qemu-level,
	Peter Xu, Eric Auger, qemu-arm, qemu-ppc, Avi Kivity,
	Paolo Bonzini, David Gibson, Richard Henderson

On Sat, Oct 3, 2020 at 7:38 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Mon, Sep 28, 2020 at 01:48:57PM -0400, Peter Xu wrote:
> > On Mon, Sep 28, 2020 at 11:05:01AM +0200, Eugenio Perez Martin wrote:
> > > On Fri, Sep 4, 2020 at 6:34 AM Jason Wang <jasowang@redhat.com> wrote:
> > > >
> > > >
> > > > On 2020/9/4 上午12:14, Eugenio Pérez wrote:
> > > > > Device IOTLB invalidations can unmap arbitrary ranges, eiter outside of
> > > > > the memory region or even [0, ~0ULL] for all the space. The assertion
> > > > > could be hit by a guest, and rhel7 guest effectively hit it.
> > > > >
> > > > > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > > > > Reviewed-by: Peter Xu <peterx@redhat.com>
> > > > > Reviewed-by: Juan Quintela <quintela@redhat.com>
> > > > > ---
> > > > >   softmmu/memory.c | 13 +++++++++++--
> > > > >   1 file changed, 11 insertions(+), 2 deletions(-)
> > > > >
> > > > > diff --git a/softmmu/memory.c b/softmmu/memory.c
> > > > > index 8694fc7cf7..e723fcbaa1 100644
> > > > > --- a/softmmu/memory.c
> > > > > +++ b/softmmu/memory.c
> > > > > @@ -1895,6 +1895,7 @@ void memory_region_notify_iommu_one(IOMMUNotifier *notifier,
> > > > >   {
> > > > >       IOMMUTLBEntry *entry = &event->entry;
> > > > >       hwaddr entry_end = entry->iova + entry->addr_mask;
> > > > > +    IOMMUTLBEntry tmp = *entry;
> > > > >
> > > > >       if (event->type == IOMMU_NOTIFIER_UNMAP) {
> > > > >           assert(entry->perm == IOMMU_NONE);
> > > > > @@ -1908,10 +1909,18 @@ void memory_region_notify_iommu_one(IOMMUNotifier *notifier,
> > > > >           return;
> > > > >       }
> > > > >
> > > > > -    assert(entry->iova >= notifier->start && entry_end <= notifier->end);
> > > > > +    if (notifier->notifier_flags & IOMMU_NOTIFIER_DEVIOTLB_UNMAP) {
> > > > > +        /* Crop (iova, addr_mask) to range */
> > > > > +        tmp.iova = MAX(tmp.iova, notifier->start);
> > > > > +        tmp.addr_mask = MIN(entry_end, notifier->end) - tmp.iova;
> > > > > +        /* Confirm no underflow */
> > > > > +        assert(MIN(entry_end, notifier->end) >= tmp.iova);
> > > >
> > > >
> > > > It's still not clear to me why we need such assert. Consider
> > > > notifier->end is the possible IOVA range but not possible device IOTLB
> > > > invalidation range (e.g it allows [0, ULLONG_MAX]).
> > > >
> > > > Thanks
> > > >
> > >
> > > As far as I understood the device should admit that out of bounds
> > > notifications in that case,
> > > and the assert just makes sure that there was no underflow in
> > > tmp.addr_mask, i.e., that something
> > > very wrong that should never happen in production happened.
> > >
> > > Peter, would you mind to confirm/correct it?
> >
> > I think Jason is right - since we have checked at the entry that the two
> > regions cross over each other:
> >
> >     /*
> >      * Skip the notification if the notification does not overlap
> >      * with registered range.
> >      */
> >     if (notifier->start > entry_end || notifier->end < entry->iova) {
> >         return;
> >     }
> >
> > Then I don't see how this assertion can fail any more.
> >
> > But imho not a big problem either, and it shouldn't hurt to even keep the
> > assertion of above isn't that straightforward.
> >
> > >
> > > Is there anything else needed to pull this patch?
> >
> > I didn't post a pull for this only because I shouldn't :) - the plan was that
> > all vt-d patches will still go via Michael's tree, iiuc.  Though at least to me
> > I think this series is acceptable for merging.
>
> Sure, that's ok.
>
> Eugenio, you sent patch 0 as a response to another series, which
> made me miss the series. Pls don't do that in the future.
>

Sorry, noted for the next time.

Thanks!

> Looks like Jason reviewed the series - Jason, is that right? -
> so I'd like his ack if possible.
>
>
> > Though it would always be good too if Jason would still like to review it.
> >
> > Jason, what's your opinion?
> >
> > Thanks,
> >
> > --
> > Peter Xu
>



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 5/5] memory: Skip bad range assertion if notifier is DEVIOTLB_UNMAP type
  2020-10-03 17:38           ` Michael S. Tsirkin
  2020-10-05  6:32             ` Eugenio Perez Martin
@ 2020-10-15  7:50             ` Jason Wang
  1 sibling, 0 replies; 38+ messages in thread
From: Jason Wang @ 2020-10-15  7:50 UTC (permalink / raw)
  To: Michael S. Tsirkin, Peter Xu
  Cc: Peter Maydell, Aleksandar Rikalo, Yan Zhao, Eduardo Habkost,
	Juan Quintela, Richard Henderson, qemu-level, Eric Auger,
	qemu-arm, Hervé Poussineau, Avi Kivity, Paolo Bonzini,
	qemu-ppc, Eugenio Perez Martin, David Gibson


On 2020/10/4 上午1:38, Michael S. Tsirkin wrote:
> On Mon, Sep 28, 2020 at 01:48:57PM -0400, Peter Xu wrote:
>> On Mon, Sep 28, 2020 at 11:05:01AM +0200, Eugenio Perez Martin wrote:
>>> On Fri, Sep 4, 2020 at 6:34 AM Jason Wang <jasowang@redhat.com> wrote:
>>>>
>>>> On 2020/9/4 上午12:14, Eugenio Pérez wrote:
>>>>> Device IOTLB invalidations can unmap arbitrary ranges, eiter outside of
>>>>> the memory region or even [0, ~0ULL] for all the space. The assertion
>>>>> could be hit by a guest, and rhel7 guest effectively hit it.
>>>>>
>>>>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
>>>>> Reviewed-by: Peter Xu <peterx@redhat.com>
>>>>> Reviewed-by: Juan Quintela <quintela@redhat.com>
>>>>> ---
>>>>>    softmmu/memory.c | 13 +++++++++++--
>>>>>    1 file changed, 11 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/softmmu/memory.c b/softmmu/memory.c
>>>>> index 8694fc7cf7..e723fcbaa1 100644
>>>>> --- a/softmmu/memory.c
>>>>> +++ b/softmmu/memory.c
>>>>> @@ -1895,6 +1895,7 @@ void memory_region_notify_iommu_one(IOMMUNotifier *notifier,
>>>>>    {
>>>>>        IOMMUTLBEntry *entry = &event->entry;
>>>>>        hwaddr entry_end = entry->iova + entry->addr_mask;
>>>>> +    IOMMUTLBEntry tmp = *entry;
>>>>>
>>>>>        if (event->type == IOMMU_NOTIFIER_UNMAP) {
>>>>>            assert(entry->perm == IOMMU_NONE);
>>>>> @@ -1908,10 +1909,18 @@ void memory_region_notify_iommu_one(IOMMUNotifier *notifier,
>>>>>            return;
>>>>>        }
>>>>>
>>>>> -    assert(entry->iova >= notifier->start && entry_end <= notifier->end);
>>>>> +    if (notifier->notifier_flags & IOMMU_NOTIFIER_DEVIOTLB_UNMAP) {
>>>>> +        /* Crop (iova, addr_mask) to range */
>>>>> +        tmp.iova = MAX(tmp.iova, notifier->start);
>>>>> +        tmp.addr_mask = MIN(entry_end, notifier->end) - tmp.iova;
>>>>> +        /* Confirm no underflow */
>>>>> +        assert(MIN(entry_end, notifier->end) >= tmp.iova);
>>>>
>>>> It's still not clear to me why we need such assert. Consider
>>>> notifier->end is the possible IOVA range but not possible device IOTLB
>>>> invalidation range (e.g it allows [0, ULLONG_MAX]).
>>>>
>>>> Thanks
>>>>
>>> As far as I understood the device should admit that out of bounds
>>> notifications in that case,
>>> and the assert just makes sure that there was no underflow in
>>> tmp.addr_mask, i.e., that something
>>> very wrong that should never happen in production happened.
>>>
>>> Peter, would you mind to confirm/correct it?
>> I think Jason is right - since we have checked at the entry that the two
>> regions cross over each other:
>>
>>      /*
>>       * Skip the notification if the notification does not overlap
>>       * with registered range.
>>       */
>>      if (notifier->start > entry_end || notifier->end < entry->iova) {
>>          return;
>>      }
>>
>> Then I don't see how this assertion can fail any more.
>>
>> But imho not a big problem either, and it shouldn't hurt to even keep the
>> assertion of above isn't that straightforward.
>>
>>> Is there anything else needed to pull this patch?
>> I didn't post a pull for this only because I shouldn't :) - the plan was that
>> all vt-d patches will still go via Michael's tree, iiuc.  Though at least to me
>> I think this series is acceptable for merging.
> Sure, that's ok.
>
> Eugenio, you sent patch 0 as a response to another series, which
> made me miss the series. Pls don't do that in the future.
>
> Looks like Jason reviewed the series - Jason, is that right? -
> so I'd like his ack if possible.


Right.

Euenio. If it's possible I would prefer to remove the assertion but it's 
ok it you leave it.

And please repost the series.

Thanks


>
>
>> Though it would always be good too if Jason would still like to review it.
>>
>> Jason, what's your opinion?
>>
>> Thanks,
>>
>> -- 
>> Peter Xu
>



^ permalink raw reply	[flat|nested] 38+ messages in thread

end of thread, other threads:[~2020-10-15  7:52 UTC | newest]

Thread overview: 38+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-25 19:16 [RFC 0/1] memory: Delete assertion in memory_region_unregister_iommu_notifier Eugenio Pérez
2020-06-25 19:16 ` [RFC 1/1] " Eugenio Pérez
2020-06-25 19:29 ` [RFC 0/1] " no-reply
2020-08-26 14:36 ` [RFC v6 00/13] " Eugenio Pérez
2020-08-26 14:36   ` [RFC v6 01/13] memory: Rename memory_region_notify_one to memory_region_notify_iommu_one Eugenio Pérez
2020-08-26 14:36   ` [RFC v6 02/13] memory: Add IOMMUTLBNotificationType to IOMMUTLBEntry Eugenio Pérez
2020-08-26 15:42     ` Peter Xu
2020-08-27  6:11       ` Eugenio Perez Martin
2020-08-26 14:36   ` [RFC v6 03/13] hw/alpha/typhoon: Mark all IOMMUTLBEntry as IOMMU_IOTLB_NONE type Eugenio Pérez
2020-08-26 15:50     ` Peter Xu
2020-08-26 14:36   ` [RFC v6 04/13] amd_iommu: " Eugenio Pérez
2020-08-26 14:36   ` [RFC v6 05/13] hw/arm/smmu: Fill IOMMUTLBEntry notifier type Eugenio Pérez
2020-08-26 14:36   ` [RFC v6 06/13] dma/rc4030: Mark all IOMMUTLBEntry as IOMMU_IOTLB_NONE type Eugenio Pérez
2020-08-26 14:36   ` [RFC v6 07/13] intel_iommu: Mark IOMMUTLBEntry of page notification as IOMMU_IOTLB_UNMAP type Eugenio Pérez
2020-08-26 14:36   ` [RFC v6 08/13] virtio-iommu: Mark virtio_iommu_translate IOTLB as IOMMU_IOTLB_NONE type Eugenio Pérez
2020-08-26 14:36   ` [RFC v6 09/13] intel_iommu: Set IOMMUTLBEntry type in vtd_page_walk_level Eugenio Pérez
2020-08-26 14:36   ` [RFC v6 10/13] memory: Notify IOMMU IOTLB based on entry type, not permissions Eugenio Pérez
2020-08-26 14:36   ` [RFC v6 11/13] memory: Add IOMMU_DEVIOTLB_UNMAP IOMMUTLBNotificationType Eugenio Pérez
2020-08-26 14:36   ` [RFC v6 12/13] intel_iommu: Do not notify regular iotlb to device-iotlb notifiers Eugenio Pérez
2020-08-26 16:51     ` 罗勇刚(Yonggang Luo)
2020-08-27  6:56       ` Eugenio Perez Martin
2020-08-26 14:36   ` [RFC v6 13/13] memory: Skip bad range assertion if notifier is DEVIOTLB type Eugenio Pérez
2020-08-26 15:00   ` [RFC v6 00/13] memory: Delete assertion in memory_region_unregister_iommu_notifier Eugenio Perez Martin
2020-08-26 15:54     ` Peter Xu
2020-08-27  6:53       ` Eugenio Perez Martin
2020-09-03 16:14 ` [PATCH 0/5] memory: Skip " Eugenio Pérez
2020-09-03 16:14   ` [PATCH 1/5] memory: Rename memory_region_notify_one to memory_region_notify_iommu_one Eugenio Pérez
2020-09-03 16:14   ` [PATCH 2/5] memory: Add IOMMUTLBEvent Eugenio Pérez
2020-09-03 16:14   ` [PATCH 3/5] memory: Add IOMMU_NOTIFIER_DEVIOTLB_UNMAP IOMMUTLBNotificationType Eugenio Pérez
2020-09-03 16:14   ` [PATCH 4/5] intel_iommu: Skip page walking on device iotlb invalidations Eugenio Pérez
2020-09-04 18:32     ` Peter Xu
2020-09-03 16:14   ` [PATCH 5/5] memory: Skip bad range assertion if notifier is DEVIOTLB_UNMAP type Eugenio Pérez
2020-09-04  4:34     ` Jason Wang
2020-09-28  9:05       ` Eugenio Perez Martin
2020-09-28 17:48         ` Peter Xu
2020-10-03 17:38           ` Michael S. Tsirkin
2020-10-05  6:32             ` Eugenio Perez Martin
2020-10-15  7:50             ` Jason Wang

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.