* [Qemu-devel] [RFC v6 0/9] ARM SMMUv3 Emulation Support
@ 2017-08-11 14:22 Eric Auger
  2017-08-11 14:22 ` [Qemu-devel] [RFC v6 1/9] hw/arm/smmu-common: smmu base class Eric Auger
                   ` (9 more replies)
  0 siblings, 10 replies; 15+ messages in thread
From: Eric Auger @ 2017-08-11 14:22 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, qemu-arm, qemu-devel,
	alex.williamson, prem.mallappa
  Cc: drjones, christoffer.dall, Radha.Chintakuntla, Sunil.Goutham,
	mohun106, tcain, bharat.bhushan, tn, mst, will.deacon,
	jean-philippe.brucker, robin.murphy, peterx, edgar.iglesias

This series implements the emulation code for ARM SMMUv3.
This is the continuation of Prem's work [1].

This v6 fixes some VFIO integration issues:
- Block replay was corrected.
- Range unmap is done before the replay (vhost issue).
- Introduction of a new CMDQ_OP_TLBI_NH_VA_AM command to
  handle invalidation of hugepages. DPDK was tested with
  a single assigned VF*.
Also 64b MMIO accesses should be fixed now.

For VFIO integration, a quirk is needed on guest side in the arm-smmu-v3
driver [2]. This quirk can now be used in both dt and ACPI modes.

The smmu is instantiated when passing the smmu option to machvirt:
"-M virt-2.10,smmu". Most probably I will change this to instantiate it
using -device option in next version and add an option to select caching
mode.

The series needs to be further split to allow decent reviews and I
don't expect any at this stage. However testing really is welcome.

I tested the following use cases:
- booted a guest in dt and acpi mode with an iommu_platform
  virtio-net-pci device (using dma ops). Tested with the following
  guest combinations: 4K page - 39 bit VA, 4K - 48b, 64K - 39b,
  64K - 48b.
- booted a guest [2] with assigned PCIe device virtual functions:
  - AMD Overdrive and igbvf passthrough (using gsi direct mapping)
  - Cavium ThunderX and ixgbevf passthrough (using KVM MSI routing)
  - ran DPDK testpmd on guest using a single passthroughed igbvf.
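
For readers who want to relate the page size / VA size combinations
listed above to the page table walk, here is a minimal sketch. It
assumes the standard VMSAv8-64 layout (8-byte descriptors, levels
numbered 0 to 3, walk ending at level 3). The sketch_* helpers are
hypothetical names used for illustration only, they are not part of
this series, and the series itself may derive the initial level
differently.

```c
#include <assert.h>

/*
 * Hypothetical helper (not part of this series): number of translation
 * levels needed to resolve va_bits of input address with a granule of
 * 2^granule_sz bytes. Each level resolves (granule_sz - 3) bits because
 * a table holds 2^granule_sz / 8 descriptors of 8 bytes each.
 */
static int sketch_nb_levels(int va_bits, int granule_sz)
{
    int bits_per_level = granule_sz - 3;

    assert(granule_sz >= 12 && va_bits > granule_sz);
    /* round up: a partially used top level still counts as a level */
    return (va_bits - granule_sz + bits_per_level - 1) / bits_per_level;
}

/* Levels are numbered 0..3 and the walk always ends at level 3. */
static int sketch_initial_level(int va_bits, int granule_sz)
{
    return 4 - sketch_nb_levels(va_bits, granule_sz);
}
```

Under these assumptions, 4K/39b starts the walk at level 1, 4K/48b at
level 0, 64K/39b at level 2 and 64K/48b at level 1.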

* DPDK testpmd using 2 igbvf's does not work at the moment:
  - the issue is that, on guest side, the 2 devices are put in the same
    domain by vfio_iommu_type1_attach_group() and mappings are replayed
    on the physical smmu only for the 1st device. This causes an smmu
    fault. I will address this in next revision.

Known limitations:
- no VMSAv8-32 support
- no nested stage support (S1 + S2)
- no support for HYP mappings
- register fine emulation, commands, interrupts and errors were
  not accurately tested. Handling is sufficient to run use cases
  described above though.

Best Regards

Eric

This series can be found at:
v6: https://github.com/eauger/qemu/tree/v2.10.0-rc2-SMMU-v6
v5: https://github.com/eauger/qemu/tree/v2.9-SMMU-v5

References:
[1] Prem's last iteration:
[2] [RFC v2 0/4] arm-smmu-v3 tlbi-on-map option

History:
v5 -> v6:
- Rebase on 2.10 and IOMMUMemoryRegion
- add ACPI TLBI_ON_MAP support (VFIO integration also works in
  ACPI mode)
- fix block replay
- handle implementation defined SMMU_CMD_TLBI_NH_VA_AM cmd
  (goes along with TLBI_ON_MAP FW quirk)
- replay systematically unmaps the whole range first
- smmuv3_map_hook does not unmap anymore and the unmap is done
  before the replay
- add and use smmuv3_context_device_invalidate instead of
  blindly replaying everything

v4 -> v5:
- initial_level now part of SMMUTransCfg
- smmu_page_walk_64 takes into account the max input size
- implement sys->iommu_ops.replay and sys->iommu_ops.notify_flag_changed
- smmuv3_translate: bug fix: don't walk on bypass
- smmu_update_qreg: fix PROD index update
- I did not yet address Peter's comments as the code is not mature enough
  to be split into sub patches.
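
The "max input size" item above refers to clamping the end of the
walked range to the largest input address allowed by the translation
config. A minimal sketch of that clamp, assuming tsz has the usual
TnSZ meaning (input range of 2^(64 - tsz) bytes); sketch_walk_roof is
a hypothetical name, and the tsz == 0 guard is an addition of this
sketch to avoid an undefined 64-bit shift:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: return the last IOVA the walk may reach, i.e.
 * the smaller of the requested end and the top of the input range.
 */
static uint64_t sketch_walk_roof(uint64_t end, unsigned tsz)
{
    uint64_t max_input;

    assert(tsz < 64);
    /* shifting a 64-bit value by 64 is undefined, handle tsz == 0 apart */
    max_input = tsz ? (1ULL << (64 - tsz)) - 1 : UINT64_MAX;
    return end < max_input ? end : max_input;
}
```

With tsz = 25 (39-bit input range), a request to walk up to the top of
the 64-bit space is clamped to 2^39 - 1, while a smaller end is left
untouched.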

v3 -> v4 [Eric]:
- page table walk rewritten to allow scan of the page table within a
  range of IOVA. This prepares for VFIO integration and replay.
- configuration parsing partially reworked.
- do not advertise unsupported/untested features: S2, S1 + S2, HYP,
  PRI, ATS, ..
- added ACPI table generation
- migrated to dynamic traces
- mingw compilation fix

v2 -> v3 [Eric]:
- rebased on 2.9
- mostly code and patch reorganization to ease the review process
- optional patches removed. They may be handled separately. I am currently
  working on ACPI enablement.
- optional instantiation of the smmu in mach-virt
- removed [2/9] (fdt functions) since not mandated
- start splitting main patch into base and derived object
- no new function feature added

v1 -> v2 [Prem]:
- Adopted review comments from Eric Auger
        - Make SMMU_DPRINTF to internally call qemu_log
            (since translation requests are too many, we need control
             on the type of log we want)
        - SMMUTransCfg modified to suit simplicity
        - Change RegInfo to uint64 register array
        - Code cleanup
        - Test cleanups
- Reshuffled patches

v0 -> v1 [Prem]:
- As per SMMUv3 spec 16.0 (only is_ste_consistant() is noticeable)
- Reworked register access/update logic
- Factored out translation code for
        - single point bug fix
        - sharing/removal in future
- (optional) Unit tests added, with PCI test device
        - S1 with 4k/64k, S1+S2 with 4k/64k
        - (S1 or S2) only can be verified by Linux 4.7 driver
        - (optional) Preliminary ACPI support

v0 [Prem]:
- Implements SMMUv3 spec 11.0
- Supported for PCIe devices,
- Command Queue and Event Queue supported
- LPAE only, S1 is supported and Tested, S2 not tested
- BE mode Translation not supported
- IRQ support (legacy, no MSI)
- Tested with DPDK and e1000


Eric Auger (6):
  hw/arm/smmu-common: smmu base class
  hw/arm/virt: Add 2.11 machine type
  hw/arm/virt: Add tlbi-on-map property to the smmuv3 node
  target/arm/kvm: Translate the MSI doorbell in kvm_arch_fixup_msi_route
  hw/arm/smmuv3: VFIO integration
  hw/arm/virt-acpi-build: Use the ACPI_IORT_SMMU_V3_CACHING_MODE model

Prem Mallappa (3):
  hw/arm/smmuv3: smmuv3 emulation model
  hw/arm/virt: Add SMMUv3 to the virt board
  hw/arm/virt-acpi-build: Add smmuv3 node in IORT table

 default-configs/aarch64-softmmu.mak |    1 +
 hw/arm/Makefile.objs                |    1 +
 hw/arm/smmu-common.c                |  493 ++++++++++++
 hw/arm/smmu-internal.h              |   89 +++
 hw/arm/smmuv3-internal.h            |  652 ++++++++++++++++
 hw/arm/smmuv3.c                     | 1412 +++++++++++++++++++++++++++++++++++
 hw/arm/trace-events                 |   62 ++
 hw/arm/virt-acpi-build.c            |   58 +-
 hw/arm/virt.c                       |  110 ++-
 include/hw/acpi/acpi-defs.h         |   15 +
 include/hw/arm/smmu-common.h        |  126 ++++
 include/hw/arm/smmuv3.h             |   89 +++
 include/hw/arm/virt.h               |    5 +
 include/hw/compat.h                 |    3 +
 target/arm/kvm.c                    |   27 +
 target/arm/trace-events             |    3 +
 16 files changed, 3137 insertions(+), 9 deletions(-)
 create mode 100644 hw/arm/smmu-common.c
 create mode 100644 hw/arm/smmu-internal.h
 create mode 100644 hw/arm/smmuv3-internal.h
 create mode 100644 hw/arm/smmuv3.c
 create mode 100644 include/hw/arm/smmu-common.h
 create mode 100644 include/hw/arm/smmuv3.h

-- 
2.5.5


* [Qemu-devel] [RFC v6 1/9] hw/arm/smmu-common: smmu base class
  2017-08-11 14:22 [Qemu-devel] [RFC v6 0/9] ARM SMMUv3 Emulation Support Eric Auger
@ 2017-08-11 14:22 ` Eric Auger
  2017-08-11 14:22 ` [Qemu-devel] [RFC v6 2/9] hw/arm/smmuv3: smmuv3 emulation model Eric Auger
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 15+ messages in thread
From: Eric Auger @ 2017-08-11 14:22 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, qemu-arm, qemu-devel,
	alex.williamson, prem.mallappa
  Cc: drjones, christoffer.dall, Radha.Chintakuntla, Sunil.Goutham,
	mohun106, tcain, bharat.bhushan, tn, mst, will.deacon,
	jean-philippe.brucker, robin.murphy, peterx, edgar.iglesias

Introduces the base device and class for the ARM smmu.
Implements VMSAv8-64 table lookup and translation. VMSAv8-32
is not implemented.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Prem Mallappa <prem.mallappa@broadcom.com>

---
v5 -> v6:
- use IOMMUMemoryRegion
- remove initial_lookup_level()
- fix block replay

v4 -> v5:
- add initial level in translation config
- implement block pte
- rename must_translate into nofail
- introduce call_entry_hook
- small changes to dynamic traces
- smmu_page_walk code moved from smmuv3.c to this file
- remove smmu_translate*
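
As a reading aid for the block pte support: the block sizes hardcoded
in get_block_pte_address() (n = 30, 21, 25, 29) match what the
level_shift() formula from smmu-internal.h gives for the same
(granule, level) pairs. A minimal standalone sketch, where the
sketch_* names are hypothetical and only the cases handled by
get_block_pte_address() are accepted:

```c
#include <assert.h>
#include <stdint.h>

/* Same formula as level_shift() in smmu-internal.h */
static int sketch_level_shift(int level, int granule_sz)
{
    return granule_sz + (3 - level) * (granule_sz - 3);
}

/*
 * Returns the block size in bytes, or 0 when the (granule, level) pair
 * is not one of the cases handled by get_block_pte_address().
 */
static uint64_t sketch_block_size(int level, int granule_sz)
{
    int ok = (granule_sz == 12 && (level == 1 || level == 2)) ||
             ((granule_sz == 14 || granule_sz == 16) && level == 2);

    return ok ? 1ULL << sketch_level_shift(level, granule_sz) : 0;
}
```

This gives 1 GiB and 2 MiB blocks for the 4K granule, 32 MiB for 16K
and 512 MiB for 64K.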

v3 -> v4:
- reworked page table walk to prepare for VFIO integration
  (capability to scan a range of IOVA). Same function is used
  for translate for a single iova. This is largely inspired
  from intel_iommu.c
- as the translate function was not straightforward to me,
  I tried to stick more closely to the VMSA spec.
- remove support of nested stage (kernel driver does not
  support it anyway)
- introduce smmu-internal.h to put page table definitions
- added smmu_find_as_from_bus_num
- SMMU_PCI_BUS_MAX and SMMU_PCI_DEVFN_MAX in smmu-common header
- new fields in SMMUState:
  - iommu_ops, smmu_as_by_busptr, smmu_as_by_bus_num
- use error_report and trace events
- add aa64[] field in SMMUTransCfg

v3:
- moved the base code in a separate patch to ease the review.
- clearer separation between base class and smmuv3 class
- translate_* only implemented as class methods
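
To help review the table lookup, here is a minimal standalone sketch
of how an IOVA is split into per-level table indexes. It mirrors
level_shift() and iova_level_offset() from smmu-internal.h, and the
descriptor address computation done in get_pte(). The sketch_* names
are hypothetical, and like the original code it does not handle a top
level that resolves fewer than granule_sz - 3 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Position of the lowest IOVA bit resolved at this level */
static int sketch_level_shift(int level, int granule_sz)
{
    return granule_sz + (3 - level) * (granule_sz - 3);
}

/* Index of the descriptor to read in the table of the given level */
static uint64_t sketch_level_offset(uint64_t iova, int level, int granule_sz)
{
    assert(level >= 0 && level <= 3);
    return (iova >> sketch_level_shift(level, granule_sz)) &
           ((1ULL << (granule_sz - 3)) - 1);
}

/* Address of that descriptor, descriptors being 8 bytes wide */
static uint64_t sketch_pte_addr(uint64_t baseaddr, uint64_t iova,
                                int level, int granule_sz)
{
    return baseaddr +
           sketch_level_offset(iova, level, granule_sz) * sizeof(uint64_t);
}
```

For a 4K granule the shifts are 39, 30, 21 and 12 for levels 0 to 3,
each level consuming 9 bits of the IOVA.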
---
 default-configs/aarch64-softmmu.mak |   1 +
 hw/arm/Makefile.objs                |   1 +
 hw/arm/smmu-common.c                | 493 ++++++++++++++++++++++++++++++++++++
 hw/arm/smmu-internal.h              |  89 +++++++
 hw/arm/trace-events                 |  14 +
 include/hw/arm/smmu-common.h        | 126 +++++++++
 6 files changed, 724 insertions(+)
 create mode 100644 hw/arm/smmu-common.c
 create mode 100644 hw/arm/smmu-internal.h
 create mode 100644 include/hw/arm/smmu-common.h

diff --git a/default-configs/aarch64-softmmu.mak b/default-configs/aarch64-softmmu.mak
index 2449483..83a2932 100644
--- a/default-configs/aarch64-softmmu.mak
+++ b/default-configs/aarch64-softmmu.mak
@@ -7,3 +7,4 @@ CONFIG_AUX=y
 CONFIG_DDC=y
 CONFIG_DPCD=y
 CONFIG_XLNX_ZYNQMP=y
+CONFIG_ARM_SMMUV3=y
diff --git a/hw/arm/Makefile.objs b/hw/arm/Makefile.objs
index a2e56ec..5b2d38d 100644
--- a/hw/arm/Makefile.objs
+++ b/hw/arm/Makefile.objs
@@ -19,3 +19,4 @@ obj-$(CONFIG_FSL_IMX31) += fsl-imx31.o kzm.o
 obj-$(CONFIG_FSL_IMX6) += fsl-imx6.o sabrelite.o
 obj-$(CONFIG_ASPEED_SOC) += aspeed_soc.o aspeed.o
 obj-$(CONFIG_MPS2) += mps2.o
+obj-$(CONFIG_ARM_SMMUV3) += smmu-common.o
diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
new file mode 100644
index 0000000..02741c2
--- /dev/null
+++ b/hw/arm/smmu-common.c
@@ -0,0 +1,493 @@
+/*
+ * Copyright (C) 2014-2016 Broadcom Corporation
+ * Copyright (c) 2017 Red Hat, Inc.
+ * Written by Prem Mallappa, Eric Auger
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Author: Prem Mallappa <pmallapp@broadcom.com>
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "sysemu/sysemu.h"
+#include "exec/address-spaces.h"
+#include "trace.h"
+#include "exec/target_page.h"
+#include "qom/cpu.h"
+
+#include "qemu/error-report.h"
+#include "hw/arm/smmu-common.h"
+#include "smmu-internal.h"
+
+inline MemTxResult smmu_read_sysmem(dma_addr_t addr, void *buf, dma_addr_t len,
+                                    bool secure)
+{
+    MemTxAttrs attrs = {.unspecified = 1, .secure = secure};
+
+    switch (len) {
+    case 4:
+        *(uint32_t *)buf = ldl_le_phys(&address_space_memory, addr);
+        break;
+    case 8:
+        *(uint64_t *)buf = ldq_le_phys(&address_space_memory, addr);
+        break;
+    default:
+        return address_space_rw(&address_space_memory, addr,
+                                attrs, buf, len, false);
+    }
+    return MEMTX_OK;
+}
+
+inline void
+smmu_write_sysmem(dma_addr_t addr, void *buf, dma_addr_t len, bool secure)
+{
+    MemTxAttrs attrs = {.unspecified = 1, .secure = secure};
+
+    switch (len) {
+    case 4:
+        stl_le_phys(&address_space_memory, addr, *(uint32_t *)buf);
+        break;
+    case 8:
+        stq_le_phys(&address_space_memory, addr, *(uint64_t *)buf);
+        break;
+    default:
+        address_space_rw(&address_space_memory, addr,
+                         attrs, buf, len, true);
+    }
+}
+
+/*************************/
+/* VMSAv8-64 Translation */
+/*************************/
+
+/**
+ * get_pte - Get the content of a page table entry located in
+ * @base_addr[@index]
+ */
+static uint64_t get_pte(dma_addr_t baseaddr, uint32_t index)
+{
+    uint64_t pte;
+
+    if (smmu_read_sysmem(baseaddr + index * sizeof(pte),
+                         &pte, sizeof(pte), false)) {
+        error_report("can't read pte at address=0x%"PRIx64,
+                     baseaddr + index * sizeof(pte));
+        pte = (uint64_t)-1;
+        return pte;
+    }
+    trace_smmu_get_pte(baseaddr, index, baseaddr + index * sizeof(pte), pte);
+    /* TODO: handle endianness */
+    return pte;
+}
+
+/* VMSAv8-64 Translation Table Format Descriptor Decoding */
+
+#define PTE_ADDRESS(pte, shift) (extract64(pte, shift, 48 - shift) << shift)
+
+/**
+ * get_page_pte_address - returns the L3 descriptor output address,
+ * ie. the page frame
+ * ARM ARM spec: Figure D4-17 VMSAv8-64 level 3 descriptor format
+ */
+static inline hwaddr get_page_pte_address(uint64_t pte, int granule_sz)
+{
+    return PTE_ADDRESS(pte, granule_sz);
+}
+
+/**
+ * get_table_pte_address - return table descriptor output address,
+ * ie. address of next level table
+ * ARM ARM Figure D4-16 VMSAv8-64 level0, level1, and level 2 descriptor formats
+ */
+static inline hwaddr get_table_pte_address(uint64_t pte, int granule_sz)
+{
+    return PTE_ADDRESS(pte, granule_sz);
+}
+
+/**
+ * get_block_pte_address - return block descriptor output address and block size
+ * ARM ARM Figure D4-16 VMSAv8-64 level0, level1, and level 2 descriptor formats
+ */
+static hwaddr get_block_pte_address(uint64_t pte, int level, int granule_sz,
+                                    uint64_t *bsz)
+{
+    int n;
+
+    switch (granule_sz) {
+    case 12:
+        if (level == 1) {
+            n = 30;
+        } else if (level == 2) {
+            n = 21;
+        } else {
+            goto error_out;
+        }
+        break;
+    case 14:
+        if (level == 2) {
+            n = 25;
+        } else {
+            goto error_out;
+        }
+        break;
+    case 16:
+        if (level == 2) {
+            n = 29;
+        } else {
+            goto error_out;
+        }
+        break;
+    default:
+            goto error_out;
+    }
+    *bsz = 1ULL << n;
+    return PTE_ADDRESS(pte, n);
+
+error_out:
+
+    error_report("unexpected granule_sz=%d/level=%d for block pte",
+                 granule_sz, level);
+    *bsz = 0;
+    return (hwaddr)-1;
+}
+
+static int call_entry_hook(uint64_t iova, uint64_t mask, uint64_t gpa,
+                           int perm, smmu_page_walk_hook hook_fn, void *private)
+{
+    IOMMUTLBEntry entry;
+    int ret;
+
+    entry.target_as = &address_space_memory;
+    entry.iova = iova & mask;
+    entry.translated_addr = gpa;
+    entry.addr_mask = ~mask;
+    entry.perm = perm;
+
+    ret = hook_fn(&entry, private);
+    if (ret) {
+        error_report("%s hook returned %d", __func__, ret);
+    }
+    return ret;
+}
+
+/**
+ * smmu_page_walk_level_64 - Walk an IOVA range from a specific level
+ * @baseaddr: table base address corresponding to @level
+ * @level: level
+ * @cfg: translation config
+ * @start: start of the IOVA range
+ * @end: end of the IOVA range
+ * @hook_fn: the hook to be called for each detected area
+ * @private: private data for the hook function
+ * @read: whether parent level has read permission
+ * @write: whether parent level has write permission
+ * @nofail: indicates whether each iova of the range
+ *  must be translated or whether failure is allowed
+ * @notify_unmap: whether we should notify invalid entries
+ *
+ * Return 0 on success, < 0 on errors not related to translation
+ * process, > 0 on errors related to translation process (only
+ * if nofail is set)
+ */
+static int
+smmu_page_walk_level_64(dma_addr_t baseaddr, int level,
+                        SMMUTransCfg *cfg, uint64_t start, uint64_t end,
+                        smmu_page_walk_hook hook_fn, void *private,
+                        bool read, bool write, bool nofail,
+                        bool notify_unmap)
+{
+    uint64_t subpage_size, subpage_mask, pte, iova = start;
+    bool read_cur, write_cur, entry_valid;
+    int ret, granule_sz, stage;
+
+    granule_sz = cfg->granule_sz;
+    stage = cfg->stage;
+    subpage_size = 1ULL << level_shift(level, granule_sz);
+    subpage_mask = level_page_mask(level, granule_sz);
+
+    trace_smmu_page_walk_level_in(level, baseaddr, granule_sz,
+                                  start, end, subpage_size);
+
+    while (iova < end) {
+        dma_addr_t next_table_baseaddr;
+        uint64_t iova_next, pte_addr;
+        uint32_t offset;
+
+        iova_next = (iova & subpage_mask) + subpage_size;
+        offset = iova_level_offset(iova, level, granule_sz);
+        pte_addr = baseaddr + offset * sizeof(pte);
+        pte = get_pte(baseaddr, offset);
+
+        trace_smmu_page_walk_level(level, iova, subpage_size,
+                                   baseaddr, offset, pte);
+
+        if (pte == (uint64_t)-1) {
+            if (nofail) {
+                return SMMU_TRANS_ERR_WALK_EXT_ABRT;
+            }
+            goto next;
+        }
+        if (is_invalid_pte(pte) || is_reserved_pte(pte, level)) {
+            trace_smmu_page_walk_level_res_invalid_pte(stage, level, baseaddr,
+                                                       pte_addr, offset, pte);
+            if (nofail) {
+                return SMMU_TRANS_ERR_WALK_EXT_ABRT;
+            }
+            goto next;
+        }
+
+        read_cur = read; /* TODO */
+        write_cur = write; /* TODO */
+        entry_valid = read_cur | write_cur; /* TODO */
+
+        if (is_page_pte(pte, level)) {
+            uint64_t gpa = get_page_pte_address(pte, granule_sz);
+            int perm = IOMMU_ACCESS_FLAG(read_cur, write_cur);
+
+            trace_smmu_page_walk_level_page_pte(stage, level, iova,
+                                                baseaddr, pte_addr, pte, gpa);
+            if (!entry_valid && !notify_unmap) {
+                printf("%s entry_valid=%d notify_unmap=%d\n", __func__,
+                       entry_valid, notify_unmap);
+                goto next;
+            }
+            ret = call_entry_hook(iova, subpage_mask, gpa, perm,
+                                  hook_fn, private);
+            if (ret) {
+                return ret;
+            }
+            goto next;
+        }
+        if (is_block_pte(pte, level)) {
+            size_t target_page_size = qemu_target_page_size();
+            int perm = IOMMU_ACCESS_FLAG(read_cur, write_cur);
+            uint64_t block_size, top_iova;
+            hwaddr gpa, block_gpa;
+
+            block_gpa = get_block_pte_address(pte, level, granule_sz,
+                                              &block_size);
+
+            if (block_gpa == -1) {
+                if (nofail) {
+                    return SMMU_TRANS_ERR_WALK_EXT_ABRT;
+                } else {
+                    goto next;
+                }
+            }
+            trace_smmu_page_walk_level_block_pte(stage, level, baseaddr,
+                                                 pte_addr, pte, iova, block_gpa,
+                                                 (int)(block_size >> 20));
+
+            gpa = block_gpa + (iova & (block_size - 1));
+            if ((block_gpa == gpa) && (end >= iova_next - 1)) {
+                ret = call_entry_hook(iova, ~(block_size - 1), block_gpa,
+                                      perm, hook_fn, private);
+                if (ret) {
+                    return ret;
+                }
+                goto next;
+            } else {
+                top_iova = MIN(end, iova_next);
+                while (iova < top_iova) {
+                    gpa = block_gpa + (iova & (block_size - 1));
+                    ret = call_entry_hook(iova, ~(target_page_size - 1),
+                                          gpa, perm, hook_fn, private);
+                    if (ret) {
+                        return ret;
+                    }
+                    iova += target_page_size;
+                }
+            }
+        }
+        if (level == 3 || is_block_pte(pte, level)) {
+            goto next;
+        }
+        /* table pte */
+        next_table_baseaddr = get_table_pte_address(pte, granule_sz);
+        trace_smmu_page_walk_level_table_pte(stage, level, baseaddr, pte_addr,
+                                             pte, next_table_baseaddr);
+        ret = smmu_page_walk_level_64(next_table_baseaddr, level + 1, cfg,
+                                      iova, MIN(iova_next, end),
+                                      hook_fn, private, read_cur, write_cur,
+                                      nofail, notify_unmap);
+        if (ret) {
+            return ret;
+        }
+
+next:
+        iova = iova_next;
+    }
+
+    return SMMU_TRANS_ERR_NONE;
+}
+
+/**
+ * smmu_page_walk_64 - walk a specific IOVA range from the initial
+ * lookup level, and call the hook for each valid entry
+ *
+ * @cfg: translation config
+ * @start: start of the IOVA range
+ * @end: end of the IOVA range
+ * @nofail: indicates whether each iova of the range
+ *  must be translated or whether failure is allowed
+ * @hook_fn: the hook to be called for each detected area
+ * @private: private data for the hook function
+ */
+static int
+smmu_page_walk_64(SMMUTransCfg *cfg, uint64_t start, uint64_t end,
+                  bool nofail, smmu_page_walk_hook hook_fn,
+                  void *private)
+{
+    dma_addr_t ttbr;
+    int stage = cfg->stage;
+    uint64_t roof = MIN(end, (1ULL << (64 - cfg->tsz)) - 1);
+
+    if (!hook_fn) {
+        return 0;
+    }
+
+    ttbr = extract64(cfg->ttbr, 0, 48);
+
+    trace_smmu_page_walk_64(stage, cfg->ttbr, cfg->initial_level, start, roof);
+
+    return smmu_page_walk_level_64(ttbr, cfg->initial_level, cfg, start, roof,
+                                   hook_fn, private,
+                                   true /* read */, true /* write */,
+                                   nofail, false /* notify_unmap */);
+}
+
+static int set_translated_address(IOMMUTLBEntry *entry, void *private)
+{
+    SMMUTransCfg *cfg = (SMMUTransCfg *)private;
+    size_t offset = cfg->input - entry->iova;
+
+    cfg->output = entry->translated_addr + offset;
+
+    trace_smmu_set_translated_address(cfg->input, cfg->output);
+    return 0;
+}
+
+/**
+ * smmu_page_walk - Walk the page table for a given
+ * config and a given entry
+ *
+ * tlbe->iova must have been populated
+ */
+int smmu_page_walk(SMMUState *sys, SMMUTransCfg *cfg,
+                   IOMMUTLBEntry *tlbe, bool is_write)
+{
+    uint32_t page_size = 0, perm = 0;
+    int ret = 0;
+
+    trace_smmu_walk_pgtable(tlbe->iova, is_write);
+
+    if (cfg->bypassed || cfg->disabled) {
+        return 0;
+    }
+
+    cfg->input = tlbe->iova;
+
+    if (cfg->aa64) {
+        ret = smmu_page_walk_64(cfg, cfg->input, cfg->input + 1,
+                            true /* nofail */,
+                            set_translated_address, cfg);
+        page_size = 1 << cfg->granule_sz;
+    } else {
+        error_report("VMSAv8-32 translation is not yet implemented");
+        abort();
+    }
+
+    if (ret) {
+        error_report("PTW failed for iova=0x%"PRIx64" is_write=%d (%d)",
+                     cfg->input, is_write, ret);
+        goto exit;
+    }
+    tlbe->translated_addr = cfg->output;
+    tlbe->addr_mask = page_size - 1;
+    tlbe->perm = perm;
+
+    trace_smmu_walk_pgtable_out(tlbe->translated_addr,
+                                tlbe->addr_mask, tlbe->perm);
+exit:
+    return ret;
+}
+
+/*************************/
+/* VMSAv8-32 Translation */
+/*************************/
+
+static int
+smmu_page_walk_32(SMMUTransCfg *cfg, uint64_t start, uint64_t end,
+                  bool nofail, smmu_page_walk_hook hook_fn,
+                  void *private)
+{
+    error_report("VMSAv8-32 translation is not yet implemented");
+    abort();
+}
+
+/******************/
+/* Infrastructure */
+/******************/
+
+SMMUPciBus *smmu_find_as_from_bus_num(SMMUState *s, uint8_t bus_num)
+{
+    SMMUPciBus *smmu_pci_bus = s->smmu_as_by_bus_num[bus_num];
+
+    if (!smmu_pci_bus) {
+        GHashTableIter iter;
+
+        g_hash_table_iter_init(&iter, s->smmu_as_by_busptr);
+        while (g_hash_table_iter_next(&iter, NULL, (void **)&smmu_pci_bus)) {
+            if (pci_bus_num(smmu_pci_bus->bus) == bus_num) {
+                s->smmu_as_by_bus_num[bus_num] = smmu_pci_bus;
+                return smmu_pci_bus;
+            }
+        }
+    }
+    return smmu_pci_bus;
+}
+
+static void smmu_base_instance_init(Object *obj)
+{
+     /* Nothing much to do here as of now */
+}
+
+static void smmu_base_class_init(ObjectClass *klass, void *data)
+{
+    SMMUBaseClass *sbc = SMMU_DEVICE_CLASS(klass);
+
+    sbc->page_walk_64 = smmu_page_walk_64;
+
+    sbc->page_walk_32 = smmu_page_walk_32;
+}
+
+static const TypeInfo smmu_base_info = {
+    .name          = TYPE_SMMU_DEV_BASE,
+    .parent        = TYPE_SYS_BUS_DEVICE,
+    .instance_size = sizeof(SMMUState),
+    .instance_init = smmu_base_instance_init,
+    .class_data    = NULL,
+    .class_size    = sizeof(SMMUBaseClass),
+    .class_init    = smmu_base_class_init,
+    .abstract      = true,
+};
+
+static void smmu_base_register_types(void)
+{
+    type_register_static(&smmu_base_info);
+}
+
+type_init(smmu_base_register_types)
+
diff --git a/hw/arm/smmu-internal.h b/hw/arm/smmu-internal.h
new file mode 100644
index 0000000..3b1e222
--- /dev/null
+++ b/hw/arm/smmu-internal.h
@@ -0,0 +1,89 @@
+/*
+ * ARM SMMU support - Internal API
+ *
+ * Copyright (c) 2017 Red Hat, Inc.
+ * Written by Eric Auger
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation, either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef HW_ARM_SMMU_INTERNAL_H
+#define HW_ARM_SMMU_INTERNAL_H
+
+#define ARM_LPAE_MAX_ADDR_BITS          48
+#define ARM_LPAE_MAX_LEVELS             4
+
+/* Page table bits */
+
+#define ARM_LPAE_PTE_TYPE_SHIFT         0
+#define ARM_LPAE_PTE_TYPE_MASK          0x3
+
+#define ARM_LPAE_PTE_TYPE_BLOCK         1
+#define ARM_LPAE_PTE_TYPE_RESERVED      1
+#define ARM_LPAE_PTE_TYPE_TABLE         3
+#define ARM_LPAE_PTE_TYPE_PAGE          3
+
+#define ARM_LPAE_PTE_VALID              (1 << 0)
+
+static inline bool is_invalid_pte(uint64_t pte)
+{
+    return !(pte & ARM_LPAE_PTE_VALID);
+}
+
+static inline bool is_reserved_pte(uint64_t pte, int level)
+{
+    return ((level == 3) &&
+            ((pte & ARM_LPAE_PTE_TYPE_MASK) == ARM_LPAE_PTE_TYPE_RESERVED));
+}
+
+static inline bool is_block_pte(uint64_t pte, int level)
+{
+    return ((level < 3) &&
+            ((pte & ARM_LPAE_PTE_TYPE_MASK) == ARM_LPAE_PTE_TYPE_BLOCK));
+}
+
+static inline bool is_table_pte(uint64_t pte, int level)
+{
+    return ((level < 3) &&
+            ((pte & ARM_LPAE_PTE_TYPE_MASK) == ARM_LPAE_PTE_TYPE_TABLE));
+}
+
+static inline bool is_page_pte(uint64_t pte, int level)
+{
+    return ((level == 3) &&
+            ((pte & ARM_LPAE_PTE_TYPE_MASK) == ARM_LPAE_PTE_TYPE_PAGE));
+}
+
+static inline int level_shift(int level, int granule_sz)
+{
+    return granule_sz + (3 - level) * (granule_sz - 3);
+}
+
+static inline uint64_t level_page_mask(int level, int granule_sz)
+{
+    return ~((1ULL << level_shift(level, granule_sz)) - 1);
+}
+
+/**
+ * TODO: handle the case where the level resolves less than
+ * granule_sz -3 IA bits.
+ */
+static inline
+uint64_t iova_level_offset(uint64_t iova, int level, int granule_sz)
+{
+    return (iova >> level_shift(level, granule_sz)) &
+            ((1ULL << (granule_sz - 3)) - 1);
+}
+
+#endif
diff --git a/hw/arm/trace-events b/hw/arm/trace-events
index 193063e..b371b4d 100644
--- a/hw/arm/trace-events
+++ b/hw/arm/trace-events
@@ -2,3 +2,17 @@
 
 # hw/arm/virt-acpi-build.c
 virt_acpi_setup(void) "No fw cfg or ACPI disabled. Bailing out."
+
+# hw/arm/smmu-common.c
+
+smmu_page_walk_64(int stage, uint64_t baseaddr, int first_level, uint64_t start, uint64_t end) "stage=%d, baseaddr=0x%"PRIx64", first level=%d, start=0x%"PRIx64", end=0x%"PRIx64
+smmu_page_walk_level_in(int level, uint64_t baseaddr, int granule_sz, uint64_t start, uint64_t end, uint64_t subpage_size) "level=%d baseaddr=0x%"PRIx64" granule=%d, start=0x%"PRIx64" end=0x%"PRIx64", subpage_size=0x%"PRIx64
+smmu_page_walk_level(int level, uint64_t iova, size_t subpage_size, uint64_t baseaddr, uint32_t offset, uint64_t pte) "level=%d iova=0x%"PRIx64" subpage_sz=0x%zx baseaddr=0x%"PRIx64" offset=%d => pte=0x%"PRIx64
+smmu_page_walk_level_res_invalid_pte(int stage, int level, uint64_t baseaddr, uint64_t pteaddr, uint32_t offset, uint64_t pte) "stage=%d level=%d base@=0x%"PRIx64" pte@=0x%"PRIx64" offset=%d pte=0x%"PRIx64
+smmu_page_walk_level_page_pte(int stage, int level,  uint64_t iova, uint64_t baseaddr, uint64_t pteaddr, uint64_t pte, uint64_t address) "stage=%d level=%d iova=0x%"PRIx64" base@=0x%"PRIx64" pte@=0x%"PRIx64" pte=0x%"PRIx64" page address = 0x%"PRIx64
+smmu_page_walk_level_block_pte(int stage, int level, uint64_t baseaddr, uint64_t pteaddr, uint64_t pte, uint64_t iova, uint64_t gpa, int bsize_mb) "stage=%d level=%d base@=0x%"PRIx64" pte@=0x%"PRIx64" pte=0x%"PRIx64" iova=0x%"PRIx64" block address = 0x%"PRIx64" block size = %d MiB"
+smmu_page_walk_level_table_pte(int stage, int level, uint64_t baseaddr, uint64_t pteaddr, uint64_t pte, uint64_t address) "stage=%d, level=%d base@=0x%"PRIx64" pte@=0x%"PRIx64" pte=0x%"PRIx64" next table address = 0x%"PRIx64
+smmu_get_pte(uint64_t baseaddr, int index, uint64_t pteaddr, uint64_t pte) "baseaddr=0x%"PRIx64" index=0x%x, pteaddr=0x%"PRIx64", pte=0x%"PRIx64
+smmu_set_translated_address(hwaddr iova, hwaddr pa) "iova = 0x%"PRIx64" -> pa = 0x%"PRIx64
+smmu_walk_pgtable(hwaddr iova, bool is_write) "Input addr: 0x%"PRIx64", is_write=%d"
+smmu_walk_pgtable_out(hwaddr addr, uint32_t mask, int perm) "DONE: o/p addr:0x%"PRIx64" mask:0x%x perm:%d"
diff --git a/include/hw/arm/smmu-common.h b/include/hw/arm/smmu-common.h
new file mode 100644
index 0000000..ea20a78
--- /dev/null
+++ b/include/hw/arm/smmu-common.h
@@ -0,0 +1,126 @@
+/*
+ * ARM SMMU Support
+ *
+ * Copyright (C) 2015-2016 Broadcom Corporation
+ * Copyright (c) 2017 Red Hat, Inc.
+ * Written by Prem Mallappa, Eric Auger
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation, either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef HW_ARM_SMMU_COMMON_H
+#define HW_ARM_SMMU_COMMON_H
+
+#include "hw/sysbus.h"
+#include "hw/pci/pci.h"
+
+#define SMMU_PCI_BUS_MAX      256
+#define SMMU_PCI_DEVFN_MAX    256
+
+typedef enum {
+    SMMU_TRANS_ERR_NONE          = 0x0,
+    SMMU_TRANS_ERR_WALK_EXT_ABRT = 0x1,  /* Translation walk external abort */
+    SMMU_TRANS_ERR_TRANS         = 0x10, /* Translation fault */
+    SMMU_TRANS_ERR_ADDR_SZ,              /* Address Size fault */
+    SMMU_TRANS_ERR_ACCESS,               /* Access fault */
+    SMMU_TRANS_ERR_PERM,                 /* Permission fault */
+    SMMU_TRANS_ERR_TLB_CONFLICT  = 0x20, /* TLB Conflict */
+} SMMUTransErr;
+
+/*
+ * Generic structure populated by derived SMMU devices
+ * after decoding the configuration information and used as
+ * input to the page table walk
+ */
+typedef struct SMMUTransCfg {
+    hwaddr   input;            /* input address */
+    hwaddr   output;           /* Output address */
+    int      stage;            /* translation stage */
+    uint32_t oas;              /* output address width */
+    uint32_t tsz;              /* input range, i.e. 2^(64 - TnSZ) */
+    uint64_t ttbr;             /* TTBR address */
+    uint32_t granule_sz;       /* granule page shift */
+    bool     aa64;             /* aarch64 or aarch32 translation table */
+    int      initial_level;    /* initial lookup level */
+    bool     disabled;         /* smmu is disabled */
+    bool     bypassed;         /* stage is bypassed */
+} SMMUTransCfg;
+
+typedef struct SMMUDevice {
+    void               *smmu;
+    PCIBus             *bus;
+    int                devfn;
+    IOMMUMemoryRegion  iommu;
+    AddressSpace       as;
+} SMMUDevice;
+
+typedef struct SMMUNotifierNode {
+    SMMUDevice *sdev;
+    QLIST_ENTRY(SMMUNotifierNode) next;
+} SMMUNotifierNode;
+
+typedef struct SMMUPciBus {
+    PCIBus       *bus;
+    SMMUDevice   *pbdev[0]; /* Parent array is sparse, so dynamically alloc */
+} SMMUPciBus;
+
+typedef struct SMMUState {
+    /* <private> */
+    SysBusDevice  dev;
+
+    MemoryRegion iomem;
+
+    GHashTable *smmu_as_by_busptr;
+    SMMUPciBus *smmu_as_by_bus_num[SMMU_PCI_BUS_MAX];
+    QLIST_HEAD(, SMMUNotifierNode) notifiers_list;
+
+} SMMUState;
+
+typedef int (*smmu_page_walk_hook)(IOMMUTLBEntry *entry, void *private);
+
+typedef struct {
+    /* <private> */
+    SysBusDeviceClass parent_class;
+
+    /* public */
+    int (*page_walk_32)(SMMUTransCfg *cfg, uint64_t start, uint64_t end,
+                        bool nofail, smmu_page_walk_hook hook_fn,
+                        void *private);
+    int (*page_walk_64)(SMMUTransCfg *cfg, uint64_t start, uint64_t end,
+                        bool nofail, smmu_page_walk_hook hook_fn,
+                        void *private);
+} SMMUBaseClass;
+
+#define TYPE_SMMU_DEV_BASE "smmu-base"
+#define SMMU_SYS_DEV(obj) OBJECT_CHECK(SMMUState, (obj), TYPE_SMMU_DEV_BASE)
+#define SMMU_DEVICE_GET_CLASS(obj)                              \
+    OBJECT_GET_CLASS(SMMUBaseClass, (obj), TYPE_SMMU_DEV_BASE)
+#define SMMU_DEVICE_CLASS(klass)                                    \
+    OBJECT_CLASS_CHECK(SMMUBaseClass, (klass), TYPE_SMMU_DEV_BASE)
+
+MemTxResult smmu_read_sysmem(dma_addr_t addr, void *buf,
+                             dma_addr_t len, bool secure);
+void smmu_write_sysmem(dma_addr_t addr, void *buf, dma_addr_t len, bool secure);
+
+SMMUPciBus *smmu_find_as_from_bus_num(SMMUState *s, uint8_t bus_num);
+
+static inline uint16_t smmu_get_sid(SMMUDevice *sdev)
+{
+    return ((pci_bus_num(sdev->bus) & 0xff) << 8) | sdev->devfn;
+}
+
+int smmu_page_walk(SMMUState *s, SMMUTransCfg *cfg,
+                   IOMMUTLBEntry *tlbe, bool is_write);
+
+#endif  /* HW_ARM_SMMU_COMMON_H */
-- 
2.5.5
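
Note on the StreamID helper in the header above: smmu_get_sid() places the
PCI bus number in bits [15:8] and devfn in bits [7:0]. The stand-alone
sketch below shows that packing. sid_from_bdf() and make_devfn() are
hypothetical names used only for this illustration; they are not part of
the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: packs bus number and devfn the way smmu_get_sid()
 * does, bus in bits [15:8] and devfn in bits [7:0]. */
static inline uint16_t sid_from_bdf(uint8_t bus_num, uint8_t devfn)
{
    return (uint16_t)((bus_num << 8) | devfn);
}

/* Hypothetical helper: a PCI devfn is a 5-bit device (slot) number
 * followed by a 3-bit function number. */
static inline uint8_t make_devfn(unsigned slot, unsigned fn)
{
    assert(slot < 32 && fn < 8);
    return (uint8_t)((slot << 3) | fn);
}
```

With this packing, device 02.3 on bus 1 would get StreamID 0x0113.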


* [Qemu-devel] [RFC v6 2/9] hw/arm/smmuv3: smmuv3 emulation model
  2017-08-11 14:22 [Qemu-devel] [RFC v6 0/9] ARM SMMUv3 Emulation Support Eric Auger
  2017-08-11 14:22 ` [Qemu-devel] [RFC v6 1/9] hw/arm/smmu-common: smmu base class Eric Auger
@ 2017-08-11 14:22 ` Eric Auger
  2017-08-11 14:22 ` [Qemu-devel] [RFC v6 3/9] hw/arm/virt: Add SMMUv3 to the virt board Eric Auger
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 15+ messages in thread
From: Eric Auger @ 2017-08-11 14:22 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, qemu-arm, qemu-devel,
	alex.williamson, prem.mallappa
  Cc: drjones, christoffer.dall, Radha.Chintakuntla, Sunil.Goutham,
	mohun106, tcain, bharat.bhushan, tn, mst, will.deacon,
	jean-philippe.brucker, robin.murphy, peterx, edgar.iglesias

From: Prem Mallappa <prem.mallappa@broadcom.com>

Introduces the SMMUv3 derived model. This is based on the
System MMUv3 specification (v17).

Signed-off-by: Prem Mallappa <prem.mallappa@broadcom.com>
Signed-off-by: Eric Auger <eric.auger@redhat.com>

---
v5 -> v6:
- Use IOMMUMemoryRegion
- regs become uint32_t and fix 64b MMIO access (.impl)
- trace_smmuv3_write/read_mmio take the size param
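
To illustrate the "regs become uint32_t" item: a 64-bit register can be
kept as two consecutive 32-bit words, low word first, which is the layout
smmu_write64_reg()/smmu_read64_reg() in this patch rely on. The sketch
below is only an approximation under stated assumptions: DemoRegs,
DEMO_NUM_REGS and the demo_* names are made up for the example, and the
alignment and bounds asserts are additions that the patch does not have.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_NUM_REGS 64   /* hypothetical size, not the model's */

typedef struct {
    uint32_t regs[DEMO_NUM_REGS];
} DemoRegs;

/* Store a 64-bit value as two 32-bit words, low word at the lower offset. */
static inline void demo_write64(DemoRegs *r, uint32_t offset, uint64_t val)
{
    uint32_t idx = offset >> 2;

    assert((offset & 0x7) == 0 && idx + 1 < DEMO_NUM_REGS);
    r->regs[idx] = (uint32_t)val;
    r->regs[idx + 1] = (uint32_t)(val >> 32);
}

/* Rebuild the 64-bit value from the two 32-bit words. */
static inline uint64_t demo_read64(const DemoRegs *r, uint32_t offset)
{
    uint32_t idx = offset >> 2;

    assert((offset & 0x7) == 0 && idx + 1 < DEMO_NUM_REGS);
    return r->regs[idx] | ((uint64_t)r->regs[idx + 1] << 32);
}
```

A 32-bit access to either half then simply indexes regs[offset >> 2], so
both access sizes see a consistent view of the register.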

v4 -> v5:
- change smmuv3_translate proto (IOMMUAccessFlags flag)
- has_stagex replaced by is_ste_stagex
- smmu_cfg_populate removed
- added smmuv3_decode_config and reworked error management
- rework the naming of IOMMU mrs
- fix SMMU_CMDQ_CONS offset

v3 -> v4
- smmu_irq_update
- fix hash key allocation
- set smmu_iommu_ops
- set SMMU_REG_CR0,
- smmuv3_translate: ret.perm not set in bypass mode
- use trace events
- renamed STM2U64 into L1STD_L2PTR and STMSPAN into L1STD_SPAN
- rework smmu_find_ste
- fix tg2granule: in TT0, 0b10 corresponds to 16kB
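
For reference on the tg2granule item: the TG0 and TG1 fields use different
encodings for the same granule sizes (TG0: 0b00 = 4kB, 0b01 = 64kB,
0b10 = 16kB; TG1: 0b01 = 16kB, 0b10 = 4kB, 0b11 = 64kB). A minimal sketch
of the decode follows. demo_tg_to_shift() is a hypothetical name, and
unlike tg2granule() in this patch, which falls back to a 4kB shift, the
sketch returns -1 for reserved encodings.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical decoder: returns log2 of the granule size for a 2-bit
 * TG0/TG1 field, or -1 when the encoding is reserved. */
static inline int demo_tg_to_shift(unsigned tg, bool is_tg1)
{
    assert(tg < 4);

    if (is_tg1) {
        switch (tg) {
        case 1: return 14;   /* 16kB */
        case 2: return 12;   /* 4kB */
        case 3: return 16;   /* 64kB */
        default: return -1;  /* 0b00 is reserved for TG1 */
        }
    }

    switch (tg) {
    case 0: return 12;       /* 4kB */
    case 1: return 16;       /* 64kB */
    case 2: return 14;       /* 16kB */
    default: return -1;      /* 0b11 is reserved for TG0 */
    }
}
```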

v2 -> v3:
- move creation of include/hw/arm/smmuv3.h to this patch to fix compilation issue
- compilation allowed
- fix sbus allocation in smmu_init_pci_iommu
- restructure code into headers
- misc cleanups
---
 hw/arm/Makefile.objs     |    2 +-
 hw/arm/smmuv3-internal.h |  651 ++++++++++++++++++++++++++
 hw/arm/smmuv3.c          | 1152 ++++++++++++++++++++++++++++++++++++++++++++++
 hw/arm/trace-events      |   34 ++
 include/hw/arm/smmuv3.h  |   89 ++++
 5 files changed, 1927 insertions(+), 1 deletion(-)
 create mode 100644 hw/arm/smmuv3-internal.h
 create mode 100644 hw/arm/smmuv3.c
 create mode 100644 include/hw/arm/smmuv3.h
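
Note on the queue helpers introduced in smmuv3-internal.h below: the
queues are circular buffers whose full and empty states are told apart by
a wrap flag. Equal indexes with equal wrap flags mean empty; equal indexes
with different wrap flags mean full. The sketch below shows that rule on a
combined pointer value (index in the low `shift` bits, wrap flag in the
bit just above), which is the layout the Q_IDX/Q_WRAP macros assume. All
demo_* names are hypothetical, and the patch itself keeps the wrap flags
in separate fields.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { DEMO_Q_EMPTY, DEMO_Q_FULL, DEMO_Q_INUSE } DemoQStatus;

/* Index part of a queue pointer: the low `shift` bits. */
static inline uint32_t demo_q_idx(uint32_t ptr, unsigned shift)
{
    return ptr & ((1u << shift) - 1);
}

/* Wrap flag of a queue pointer: the bit just above the index. */
static inline uint32_t demo_q_wrap(uint32_t ptr, unsigned shift)
{
    return (ptr >> shift) & 1;
}

static inline DemoQStatus demo_q_status(uint32_t prod, uint32_t cons,
                                        unsigned shift)
{
    assert(shift < 31);

    if (demo_q_idx(prod, shift) != demo_q_idx(cons, shift)) {
        return DEMO_Q_INUSE;
    }
    return demo_q_wrap(prod, shift) == demo_q_wrap(cons, shift) ?
           DEMO_Q_EMPTY : DEMO_Q_FULL;
}

/* Advance a pointer by one entry; the carry out of the index bits
 * toggles the wrap flag, and the result is kept to shift + 1 bits. */
static inline uint32_t demo_q_advance(uint32_t ptr, unsigned shift)
{
    assert(shift < 31);
    return (ptr + 1) & ((1u << (shift + 1)) - 1);
}
```

With shift = 2 (four entries), four producer advances from an empty queue
leave the indexes equal but the wrap flags different, which reads as full.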

diff --git a/hw/arm/Makefile.objs b/hw/arm/Makefile.objs
index 5b2d38d..a7c808b 100644
--- a/hw/arm/Makefile.objs
+++ b/hw/arm/Makefile.objs
@@ -19,4 +19,4 @@ obj-$(CONFIG_FSL_IMX31) += fsl-imx31.o kzm.o
 obj-$(CONFIG_FSL_IMX6) += fsl-imx6.o sabrelite.o
 obj-$(CONFIG_ASPEED_SOC) += aspeed_soc.o aspeed.o
 obj-$(CONFIG_MPS2) += mps2.o
-obj-$(CONFIG_ARM_SMMUV3) += smmu-common.o
+obj-$(CONFIG_ARM_SMMUV3) += smmu-common.o smmuv3.o
diff --git a/hw/arm/smmuv3-internal.h b/hw/arm/smmuv3-internal.h
new file mode 100644
index 0000000..e255df1
--- /dev/null
+++ b/hw/arm/smmuv3-internal.h
@@ -0,0 +1,651 @@
+/*
+ * ARM SMMUv3 support - Internal API
+ *
+ * Copyright (C) 2014-2016 Broadcom Corporation
+ * Copyright (c) 2017 Red Hat, Inc.
+ * Written by Prem Mallappa, Eric Auger
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation, either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef HW_ARM_SMMU_V3_INTERNAL_H
+#define HW_ARM_SMMU_V3_INTERNAL_H
+
+#include "trace.h"
+#include "qemu/error-report.h"
+#include "hw/arm/smmu-common.h"
+
+/*****************************
+ * MMIO Register
+ *****************************/
+enum {
+    SMMU_REG_IDR0            = 0x0,
+
+/* IDR0 Field Values and supported features */
+
+#define SMMU_IDR0_S2P      1  /* stage 2 */
+#define SMMU_IDR0_S1P      1  /* stage 1 */
+#define SMMU_IDR0_TTF      2  /* Aarch64 only - not Aarch32 (LPAE) */
+#define SMMU_IDR0_COHACC   1  /* IO coherent access */
+#define SMMU_IDR0_HTTU     2  /* Access and Dirty flag update */
+#define SMMU_IDR0_HYP      0  /* Hypervisor Stage 1 contexts */
+#define SMMU_IDR0_ATS      0  /* PCIe RC ATS */
+#define SMMU_IDR0_ASID16   1  /* 16-bit ASID */
+#define SMMU_IDR0_PRI      0  /* Page Request Interface */
+#define SMMU_IDR0_VMID16   0  /* 16-bit VMID */
+#define SMMU_IDR0_CD2L     0  /* 2-level Context Descriptor table */
+#define SMMU_IDR0_STALL    1  /* Stalling fault model */
+#define SMMU_IDR0_TERM     1  /* Termination model behaviour */
+#define SMMU_IDR0_STLEVEL  1  /* Multi-level Stream Table */
+
+#define SMMU_IDR0_S2P_SHIFT      0
+#define SMMU_IDR0_S1P_SHIFT      1
+#define SMMU_IDR0_TTF_SHIFT      2
+#define SMMU_IDR0_COHACC_SHIFT   4
+#define SMMU_IDR0_HTTU_SHIFT     6
+#define SMMU_IDR0_HYP_SHIFT      9
+#define SMMU_IDR0_ATS_SHIFT      10
+#define SMMU_IDR0_ASID16_SHIFT   12
+#define SMMU_IDR0_PRI_SHIFT      16
+#define SMMU_IDR0_VMID16_SHIFT   18
+#define SMMU_IDR0_CD2L_SHIFT     19
+#define SMMU_IDR0_STALL_SHIFT    24
+#define SMMU_IDR0_TERM_SHIFT     26
+#define SMMU_IDR0_STLEVEL_SHIFT  27
+
+    SMMU_REG_IDR1            = 0x4,
+#define SMMU_IDR1_SIDSIZE 16
+    SMMU_REG_IDR2            = 0x8,
+    SMMU_REG_IDR3            = 0xc,
+    SMMU_REG_IDR4            = 0x10,
+    SMMU_REG_IDR5            = 0x14,
+#define SMMU_IDR5_GRAN_SHIFT 4
+#define SMMU_IDR5_GRAN       0b101 /* GRAN4K, GRAN64K */
+#define SMMU_IDR5_OAS        4     /* 44 bits */
+    SMMU_REG_IIDR            = 0x1c,
+    SMMU_REG_CR0             = 0x20,
+
+#define SMMU_CR0_SMMU_ENABLE (1 << 0)
+#define SMMU_CR0_PRIQ_ENABLE (1 << 1)
+#define SMMU_CR0_EVTQ_ENABLE (1 << 2)
+#define SMMU_CR0_CMDQ_ENABLE (1 << 3)
+#define SMMU_CR0_ATS_CHECK   (1 << 4)
+
+    SMMU_REG_CR0_ACK         = 0x24,
+    SMMU_REG_CR1             = 0x28,
+    SMMU_REG_CR2             = 0x2c,
+
+    SMMU_REG_STATUSR         = 0x40,
+
+    SMMU_REG_IRQ_CTRL        = 0x50,
+    SMMU_REG_IRQ_CTRL_ACK    = 0x54,
+
+#define SMMU_IRQ_CTRL_GERROR_EN (1 << 0)
+#define SMMU_IRQ_CTRL_EVENT_EN  (1 << 1)
+#define SMMU_IRQ_CTRL_PRI_EN    (1 << 2)
+
+    SMMU_REG_GERROR          = 0x60,
+
+#define SMMU_GERROR_CMDQ       (1 << 0)
+#define SMMU_GERROR_EVENTQ     (1 << 2)
+#define SMMU_GERROR_PRIQ       (1 << 3)
+#define SMMU_GERROR_MSI_CMDQ   (1 << 4)
+#define SMMU_GERROR_MSI_EVENTQ (1 << 5)
+#define SMMU_GERROR_MSI_PRIQ   (1 << 6)
+#define SMMU_GERROR_MSI_GERROR (1 << 7)
+#define SMMU_GERROR_SFM_ERR    (1 << 8)
+
+    SMMU_REG_GERRORN         = 0x64,
+    SMMU_REG_GERROR_IRQ_CFG0 = 0x68,
+    SMMU_REG_GERROR_IRQ_CFG1 = 0x70,
+    SMMU_REG_GERROR_IRQ_CFG2 = 0x74,
+
+    /* SMMU_BASE_RA Applies to STRTAB_BASE, CMDQ_BASE and EVTQ_BASE */
+#define SMMU_BASE_RA        (1ULL << 62)
+    SMMU_REG_STRTAB_BASE     = 0x80,
+    SMMU_REG_STRTAB_BASE_CFG = 0x88,
+
+    SMMU_REG_CMDQ_BASE       = 0x90,
+    SMMU_REG_CMDQ_PROD       = 0x98,
+    SMMU_REG_CMDQ_CONS       = 0x9c,
+    /* CMD Consumer (CONS) */
+#define SMMU_CMD_CONS_ERR_SHIFT        24
+#define SMMU_CMD_CONS_ERR_BITS         7
+
+    SMMU_REG_EVTQ_BASE       = 0xa0,
+    SMMU_REG_EVTQ_PROD       = 0xa8,
+    SMMU_REG_EVTQ_CONS       = 0xac,
+    SMMU_REG_EVTQ_IRQ_CFG0   = 0xb0,
+    SMMU_REG_EVTQ_IRQ_CFG1   = 0xb8,
+    SMMU_REG_EVTQ_IRQ_CFG2   = 0xbc,
+
+    SMMU_REG_PRIQ_BASE       = 0xc0,
+    SMMU_REG_PRIQ_PROD       = 0xc8,
+    SMMU_REG_PRIQ_CONS       = 0xcc,
+    SMMU_REG_PRIQ_IRQ_CFG0   = 0xd0,
+    SMMU_REG_PRIQ_IRQ_CFG1   = 0xd8,
+    SMMU_REG_PRIQ_IRQ_CFG2   = 0xdc,
+
+    SMMU_ID_REGS_OFFSET      = 0xfd0,
+
+    /* Secure registers are not used for now */
+    SMMU_SECURE_OFFSET       = 0x8000,
+};
+
+/**********************
+ * Data Structures
+ **********************/
+
+struct __smmu_data2 {
+    uint32_t word[2];
+};
+
+struct __smmu_data8 {
+    uint32_t word[8];
+};
+
+struct __smmu_data16 {
+    uint32_t word[16];
+};
+
+struct __smmu_data4 {
+    uint32_t word[4];
+};
+
+typedef struct __smmu_data2  STEDesc; /* STE Level 1 Descriptor */
+typedef struct __smmu_data16 Ste;     /* Stream Table Entry(STE) */
+typedef struct __smmu_data2  CDDesc;  /* CD Level 1 Descriptor */
+typedef struct __smmu_data16 Cd;      /* Context Descriptor(CD) */
+
+typedef struct __smmu_data4  Cmd; /* Command Entry */
+typedef struct __smmu_data8  Evt; /* Event Entry */
+typedef struct __smmu_data4  Pri; /* PRI entry */
+
+/*****************************
+ * STE fields
+ *****************************/
+
+#define STE_VALID(x)   extract32((x)->word[0], 0, 1) /* 0 */
+#define STE_CONFIG(x)  extract32((x)->word[0], 1, 3)
+enum {
+    STE_CONFIG_NONE      = 0,
+    STE_CONFIG_BYPASS    = 4,       /* S1 Bypass    , S2 Bypass */
+    STE_CONFIG_S1        = 5,       /* S1 Translate , S2 Bypass */
+    STE_CONFIG_S2        = 6,       /* S1 Bypass    , S2 Translate */
+    STE_CONFIG_NESTED    = 7,       /* S1 Translate , S2 Translate */
+};
+#define STE_S1FMT(x)   extract32((x)->word[0], 4, 2)
+#define STE_S1CDMAX(x) extract32((x)->word[1], 27, 5)
+#define STE_EATS(x)    extract32((x)->word[2], 28, 2)
+#define STE_STRW(x)    extract32((x)->word[2], 30, 2)
+#define STE_S2VMID(x)  extract32((x)->word[4], 0, 16)
+#define STE_S2T0SZ(x)  extract32((x)->word[5], 0, 6)
+#define STE_S2SL0(x)   extract32((x)->word[5], 6, 2)
+#define STE_S2TG(x)    extract32((x)->word[5], 14, 2)
+#define STE_S2PS(x)    extract32((x)->word[5], 16, 3)
+#define STE_S2AA64(x)  extract32((x)->word[5], 19, 1)
+#define STE_S2HD(x)    extract32((x)->word[5], 24, 1)
+#define STE_S2HA(x)    extract32((x)->word[5], 25, 1)
+#define STE_S2S(x)     extract32((x)->word[5], 26, 1)
+#define STE_CTXPTR(x)                                           \
+    ({                                                          \
+        uint64_t addr;                                          \
+        addr = (uint64_t)extract32((x)->word[1], 0, 16) << 32;  \
+        addr |= (uint64_t)((x)->word[0] & 0xffffffc0);          \
+        addr;                                                   \
+    })
+
+#define STE_S2TTB(x)                                            \
+    ({                                                          \
+        uint64_t addr;                                          \
+        addr = (uint64_t)extract32((x)->word[7], 0, 16) << 32;  \
+        addr |= (uint64_t)((x)->word[6] & 0xfffffff0);          \
+        addr;                                                   \
+    })
+
+static inline bool is_ste_bypass(Ste *ste)
+{
+    return STE_CONFIG(ste) == STE_CONFIG_BYPASS;
+}
+
+static inline bool is_ste_stage1(Ste *ste)
+{
+    return STE_CONFIG(ste) == STE_CONFIG_S1;
+}
+
+static inline bool is_ste_stage2(Ste *ste)
+{
+    return STE_CONFIG(ste) == STE_CONFIG_S2;
+}
+
+/**
+ * is_s2granule_valid - Check the stage 2 translation granule size
+ * advertised in the STE matches any IDR5 supported value
+ */
+static inline bool is_s2granule_valid(Ste *ste)
+{
+    int idr5_format = 0;
+
+    switch (STE_S2TG(ste)) {
+    case 0: /* 4kB */
+        idr5_format = 0x1;
+        break;
+    case 1: /* 64 kB */
+        idr5_format = 0x4;
+        break;
+    case 2: /* 16 kB */
+        idr5_format = 0x2;
+        break;
+    case 3: /* reserved */
+        break;
+    }
+    idr5_format &= SMMU_IDR5_GRAN;
+    return idr5_format;
+}
+
+static inline int oas2bits(int oas_field)
+{
+    switch (oas_field) {
+    case 0b011:
+        return 42;
+    case 0b100:
+        return 44;
+    default: /* 0b000, 0b001, 0b010 encode 32, 36, 40 bits */
+        return 32 + 4 * oas_field;
+    }
+}
+
+static inline int pa_range(Ste *ste)
+{
+    int oas_field = MIN(STE_S2PS(ste), SMMU_IDR5_OAS);
+
+    if (!STE_S2AA64(ste)) {
+        return 40;
+    }
+
+    return oas2bits(oas_field);
+}
+
+#define MAX_PA(ste) ((1ULL << pa_range(ste)) - 1)
+
+/*****************************
+ * CD fields
+ *****************************/
+#define CD_VALID(x)   extract32((x)->word[0], 30, 1)
+#define CD_ASID(x)    extract32((x)->word[1], 16, 16)
+#define CD_TTB(x, sel)                                      \
+    ({                                                      \
+        uint64_t hi, lo;                                    \
+        hi = extract32((x)->word[(sel) * 2 + 3], 0, 16);    \
+        hi <<= 32;                                          \
+        lo = (x)->word[(sel) * 2 + 2] & ~0xf;               \
+        hi | lo;                                            \
+    })
+
+#define CD_TSZ(x, sel)   extract32((x)->word[0], (16 * (sel)) + 0, 6)
+#define CD_TG(x, sel)    extract32((x)->word[0], (16 * (sel)) + 6, 2)
+#define CD_EPD(x, sel)   extract32((x)->word[0], (16 * (sel)) + 14, 1)
+
+#define CD_T0SZ(x)    CD_TSZ((x), 0)
+#define CD_T1SZ(x)    CD_TSZ((x), 1)
+#define CD_TG0(x)     CD_TG((x), 0)
+#define CD_TG1(x)     CD_TG((x), 1)
+#define CD_EPD0(x)    CD_EPD((x), 0)
+#define CD_EPD1(x)    CD_EPD((x), 1)
+#define CD_IPS(x)     extract32((x)->word[1], 0, 3)
+#define CD_AARCH64(x) extract32((x)->word[1], 9, 1)
+#define CD_TTB0(x)    CD_TTB((x), 0)
+#define CD_TTB1(x)    CD_TTB((x), 1)
+
+#define CDM_VALID(x)    ((x)->word[0] & 0x1)
+
+static inline int is_cd_valid(SMMUV3State *s, Ste *ste, Cd *cd)
+{
+    return CD_VALID(cd);
+}
+
+/*****************************
+ * Commands
+ *****************************/
+enum {
+    SMMU_CMD_PREFETCH_CONFIG = 0x01,
+    SMMU_CMD_PREFETCH_ADDR,
+    SMMU_CMD_CFGI_STE,
+    SMMU_CMD_CFGI_STE_RANGE,
+    SMMU_CMD_CFGI_CD,
+    SMMU_CMD_CFGI_CD_ALL,
+    SMMU_CMD_CFGI_ALL,
+    SMMU_CMD_TLBI_NH_ALL     = 0x10,
+    SMMU_CMD_TLBI_NH_ASID,
+    SMMU_CMD_TLBI_NH_VA,
+    SMMU_CMD_TLBI_NH_VAA,
+    SMMU_CMD_TLBI_EL3_ALL    = 0x18,
+    SMMU_CMD_TLBI_EL3_VA     = 0x1a,
+    SMMU_CMD_TLBI_EL2_ALL    = 0x20,
+    SMMU_CMD_TLBI_EL2_ASID,
+    SMMU_CMD_TLBI_EL2_VA,
+    SMMU_CMD_TLBI_EL2_VAA,  /* 0x23 */
+    SMMU_CMD_TLBI_S12_VMALL  = 0x28,
+    SMMU_CMD_TLBI_S2_IPA     = 0x2a,
+    SMMU_CMD_TLBI_NSNH_ALL   = 0x30,
+    SMMU_CMD_ATC_INV         = 0x40,
+    SMMU_CMD_PRI_RESP,
+    SMMU_CMD_RESUME          = 0x44,
+    SMMU_CMD_STALL_TERM,
+    SMMU_CMD_SYNC,          /* 0x46 */
+};
+
+static const char *cmd_stringify[] = {
+    [SMMU_CMD_PREFETCH_CONFIG] = "SMMU_CMD_PREFETCH_CONFIG",
+    [SMMU_CMD_PREFETCH_ADDR]   = "SMMU_CMD_PREFETCH_ADDR",
+    [SMMU_CMD_CFGI_STE]        = "SMMU_CMD_CFGI_STE",
+    [SMMU_CMD_CFGI_STE_RANGE]  = "SMMU_CMD_CFGI_STE_RANGE",
+    [SMMU_CMD_CFGI_CD]         = "SMMU_CMD_CFGI_CD",
+    [SMMU_CMD_CFGI_CD_ALL]     = "SMMU_CMD_CFGI_CD_ALL",
+    [SMMU_CMD_CFGI_ALL]        = "SMMU_CMD_CFGI_ALL",
+    [SMMU_CMD_TLBI_NH_ALL]     = "SMMU_CMD_TLBI_NH_ALL",
+    [SMMU_CMD_TLBI_NH_ASID]    = "SMMU_CMD_TLBI_NH_ASID",
+    [SMMU_CMD_TLBI_NH_VA]      = "SMMU_CMD_TLBI_NH_VA",
+    [SMMU_CMD_TLBI_NH_VAA]     = "SMMU_CMD_TLBI_NH_VAA",
+    [SMMU_CMD_TLBI_EL3_ALL]    = "SMMU_CMD_TLBI_EL3_ALL",
+    [SMMU_CMD_TLBI_EL3_VA]     = "SMMU_CMD_TLBI_EL3_VA",
+    [SMMU_CMD_TLBI_EL2_ALL]    = "SMMU_CMD_TLBI_EL2_ALL",
+    [SMMU_CMD_TLBI_EL2_ASID]   = "SMMU_CMD_TLBI_EL2_ASID",
+    [SMMU_CMD_TLBI_EL2_VA]     = "SMMU_CMD_TLBI_EL2_VA",
+    [SMMU_CMD_TLBI_EL2_VAA]    = "SMMU_CMD_TLBI_EL2_VAA",
+    [SMMU_CMD_TLBI_S12_VMALL]  = "SMMU_CMD_TLBI_S12_VMALL",
+    [SMMU_CMD_TLBI_S2_IPA]     = "SMMU_CMD_TLBI_S2_IPA",
+    [SMMU_CMD_TLBI_NSNH_ALL]   = "SMMU_CMD_TLBI_NSNH_ALL",
+    [SMMU_CMD_ATC_INV]         = "SMMU_CMD_ATC_INV",
+    [SMMU_CMD_PRI_RESP]        = "SMMU_CMD_PRI_RESP",
+    [SMMU_CMD_RESUME]          = "SMMU_CMD_RESUME",
+    [SMMU_CMD_STALL_TERM]      = "SMMU_CMD_STALL_TERM",
+    [SMMU_CMD_SYNC]            = "SMMU_CMD_SYNC",
+};
+
+/*****************************
+ *  Register Access Primitives
+ *****************************/
+
+static inline void smmu_write64_reg(SMMUV3State *s, uint32_t addr, uint64_t val)
+{
+    addr >>= 2;
+    s->regs[addr] = extract64(val, 0, 32);
+    s->regs[addr + 1] = extract64(val, 32, 32);
+}
+
+static inline void smmu_write_reg(SMMUV3State *s, uint32_t addr, uint64_t val)
+{
+    s->regs[addr >> 2] = val;
+}
+
+static inline uint32_t smmu_read_reg(SMMUV3State *s, uint32_t addr)
+{
+    return s->regs[addr >> 2];
+}
+
+static inline uint64_t smmu_read64_reg(SMMUV3State *s, uint32_t addr)
+{
+    addr >>= 2;
+    return s->regs[addr] | ((uint64_t)(s->regs[addr + 1]) << 32);
+}
+
+#define smmu_read32_reg smmu_read_reg
+#define smmu_write32_reg smmu_write_reg
+
+/*****************************
+ * CMDQ fields
+ *****************************/
+
+enum { /* Command Errors */
+    SMMU_CMD_ERR_NONE = 0,
+    SMMU_CMD_ERR_ILLEGAL,
+    SMMU_CMD_ERR_ABORT
+};
+
+enum { /* Command completion notification */
+    CMD_SYNC_SIG_NONE,
+    CMD_SYNC_SIG_IRQ,
+    CMD_SYNC_SIG_SEV,
+};
+
+#define CMD_TYPE(x)  extract32((x)->word[0], 0, 8)
+#define CMD_SEC(x)   extract32((x)->word[0], 9, 1)
+#define CMD_SEV(x)   extract32((x)->word[0], 10, 1)
+#define CMD_AC(x)    extract32((x)->word[0], 12, 1)
+#define CMD_AB(x)    extract32((x)->word[0], 13, 1)
+#define CMD_CS(x)    extract32((x)->word[0], 12, 2)
+#define CMD_SSID(x)  extract32((x)->word[0], 16, 16)
+#define CMD_SID(x)   ((x)->word[1])
+#define CMD_VMID(x)  extract32((x)->word[1], 0, 16)
+#define CMD_ASID(x)  extract32((x)->word[1], 16, 16)
+#define CMD_STAG(x)  extract32((x)->word[2], 0, 16)
+#define CMD_RESP(x)  extract32((x)->word[2], 11, 2)
+#define CMD_GRPID(x) extract32((x)->word[3], 0, 8)
+#define CMD_SIZE(x)  extract32((x)->word[3], 0, 16)
+#define CMD_LEAF(x)  extract32((x)->word[3], 0, 1)
+#define CMD_SPAN(x)  extract32((x)->word[3], 0, 5)
+#define CMD_ADDR(x) ({                                  \
+            uint64_t addr = (uint64_t)(x)->word[3];     \
+            addr <<= 32;                                \
+            addr |= (x)->word[2] & 0xfffff000;          \
+            addr;                                       \
+        })
+
+/***************************
+ * Queue Handling
+ ***************************/
+
+typedef enum {
+    CMD_Q_EMPTY,
+    CMD_Q_FULL,
+    CMD_Q_INUSE,
+} SMMUQStatus;
+
+#define Q_ENTRY(q, idx)  ((q)->base + (q)->ent_size * (idx))
+#define Q_WRAP(q, pc)    ((pc) >> (q)->shift)
+#define Q_IDX(q, pc)     ((pc) & ((1 << (q)->shift) - 1))
+
+static inline SMMUQStatus __smmu_queue_status(SMMUV3State *s, SMMUQueue *q)
+{
+    uint32_t prod = Q_IDX(q, q->prod);
+    uint32_t cons = Q_IDX(q, q->cons);
+
+    if ((prod == cons) && (q->wrap.prod != q->wrap.cons)) {
+        return CMD_Q_FULL;
+    } else if ((prod == cons) && (q->wrap.prod == q->wrap.cons)) {
+        return CMD_Q_EMPTY;
+    }
+    return CMD_Q_INUSE;
+}
+#define smmu_is_q_full(s, q) (__smmu_queue_status(s, q) == CMD_Q_FULL)
+#define smmu_is_q_empty(s, q) (__smmu_queue_status(s, q) == CMD_Q_EMPTY)
+
+static inline int __smmu_q_enabled(SMMUV3State *s, uint32_t q)
+{
+    return smmu_read32_reg(s, SMMU_REG_CR0) & q;
+}
+#define smmu_cmd_q_enabled(s) __smmu_q_enabled(s, SMMU_CR0_CMDQ_ENABLE)
+#define smmu_evt_q_enabled(s) __smmu_q_enabled(s, SMMU_CR0_EVTQ_ENABLE)
+
+#define SMMU_CMDQ_ERR(s) ((smmu_read32_reg(s, SMMU_REG_GERROR) ^    \
+                           smmu_read32_reg(s, SMMU_REG_GERRORN)) &  \
+                          SMMU_GERROR_CMDQ)
+
+static inline void smmuv3_init_queues(SMMUV3State *s)
+{
+    s->cmdq.prod = 0;
+    s->cmdq.cons = 0;
+    s->cmdq.wrap.prod = 0;
+    s->cmdq.wrap.cons = 0;
+
+    s->evtq.prod = 0;
+    s->evtq.cons = 0;
+    s->evtq.wrap.prod = 0;
+    s->evtq.wrap.cons = 0;
+
+    s->priq.prod = 0;
+    s->priq.cons = 0;
+    s->priq.wrap.prod = 0;
+    s->priq.wrap.cons = 0;
+}
+
+/*****************************
+ * EVTQ fields
+ *****************************/
+
+#define EVT_Q_OVERFLOW        (1 << 31)
+
+#define EVT_SET_TYPE(x, t)    ((x)->word[0] = deposit32((x)->word[0], 0, 8, t))
+#define EVT_SET_SID(x, s)     ((x)->word[1] =  s)
+#define EVT_SET_INPUT_ADDR(x, addr) ({                    \
+            (x)->word[5] = (uint32_t)(addr >> 32);        \
+            (x)->word[4] = (uint32_t)(addr & 0xffffffff); \
+            addr;                                         \
+        })
+
+/*****************************
+ * Events
+ *****************************/
+
+enum evt_err {
+    SMMU_EVT_F_UUT    = 0x1,
+    SMMU_EVT_C_BAD_SID,
+    SMMU_EVT_F_STE_FETCH,
+    SMMU_EVT_C_BAD_STE,
+    SMMU_EVT_F_BAD_ATS_REQ,
+    SMMU_EVT_F_STREAM_DISABLED,
+    SMMU_EVT_F_TRANS_FORBIDDEN,
+    SMMU_EVT_C_BAD_SSID,
+    SMMU_EVT_F_CD_FETCH,
+    SMMU_EVT_C_BAD_CD,
+    SMMU_EVT_F_WALK_EXT_ABRT,
+    SMMU_EVT_F_TRANS        = 0x10,
+    SMMU_EVT_F_ADDR_SZ,
+    SMMU_EVT_F_ACCESS,
+    SMMU_EVT_F_PERM,
+    SMMU_EVT_F_TLB_CONFLICT = 0x20,
+    SMMU_EVT_F_CFG_CONFLICT = 0x21,
+    SMMU_EVT_E_PAGE_REQ     = 0x24,
+};
+
+typedef enum evt_err SMMUEvtErr;
+
+/*****************************
+ * Interrupts
+ *****************************/
+
+static inline int __smmu_irq_enabled(SMMUV3State *s, uint32_t q)
+{
+    return smmu_read32_reg(s, SMMU_REG_IRQ_CTRL) & q;
+}
+#define smmu_evt_irq_enabled(s)                   \
+    __smmu_irq_enabled(s, SMMU_IRQ_CTRL_EVENT_EN)
+#define smmu_gerror_irq_enabled(s)                  \
+    __smmu_irq_enabled(s, SMMU_IRQ_CTRL_GERROR_EN)
+#define smmu_pri_irq_enabled(s)                 \
+    __smmu_irq_enabled(s, SMMU_IRQ_CTRL_PRI_EN)
+
+static inline bool
+smmu_is_irq_pending(SMMUV3State *s, int irq)
+{
+    return smmu_read32_reg(s, SMMU_REG_GERROR) ^
+        smmu_read32_reg(s, SMMU_REG_GERRORN);
+}
+
+/*****************************
+ * Hash Table
+ *****************************/
+
+static inline gboolean smmu_uint64_equal(gconstpointer v1, gconstpointer v2)
+{
+    return *((const uint64_t *)v1) == *((const uint64_t *)v2);
+}
+
+static inline guint smmu_uint64_hash(gconstpointer v)
+{
+    return (guint)*(const uint64_t *)v;
+}
+
+/*****************************
+ * Misc
+ *****************************/
+
+/**
+ * tg2granule - Decodes the CD translation granule size field according
+ * to the TT in use
+ * @bits: TG0/1 field
+ * @tg1: if set, @bits belong to TG1, otherwise belong to TG0
+ */
+static inline int tg2granule(int bits, bool tg1)
+{
+    switch (bits) {
+    case 1:
+        return tg1 ? 14 : 16;
+    case 2:
+        return tg1 ? 12 : 14;
+    case 3:
+        return tg1 ? 16 : 12;
+    default:
+        return 12;
+    }
+}
+
+#define L1STD_L2PTR(stm) ({                                 \
+            uint64_t hi, lo;                            \
+            hi = (stm)->word[1];                        \
+            lo = (stm)->word[0] & ~(uint64_t)0x1f;      \
+            hi << 32 | lo;                              \
+        })
+
+#define L1STD_SPAN(stm) (extract32((stm)->word[0], 0, 4))
+
+/*****************************
+ * Debug
+ *****************************/
+#define ARM_SMMU_DEBUG
+
+#ifdef ARM_SMMU_DEBUG
+static inline void dump_ste(Ste *ste)
+{
+    int i;
+
+    for (i = 0; i < ARRAY_SIZE(ste->word); i += 2) {
+        trace_smmuv3_dump_ste(i, ste->word[i], i + 1, ste->word[i + 1]);
+    }
+}
+
+static inline void dump_cd(Cd *cd)
+{
+    int i;
+    for (i = 0; i < ARRAY_SIZE(cd->word); i += 2) {
+        trace_smmuv3_dump_cd(i, cd->word[i], i + 1, cd->word[i + 1]);
+    }
+}
+
+static inline void dump_cmd(Cmd *cmd)
+{
+    int i;
+    for (i = 0; i < ARRAY_SIZE(cmd->word); i += 2) {
+        trace_smmuv3_dump_cmd(i, cmd->word[i], i + 1, cmd->word[i + 1]);
+    }
+}
+
+#else
+#define dump_ste(...) do {} while (0)
+#define dump_cd(...) do {} while (0)
+#define dump_cmd(...) do {} while (0)
+#endif /* ARM_SMMU_DEBUG */
+
+#endif
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
new file mode 100644
index 0000000..a3199f1
--- /dev/null
+++ b/hw/arm/smmuv3.c
@@ -0,0 +1,1152 @@
+/*
+ * Copyright (C) 2014-2016 Broadcom Corporation
+ * Copyright (c) 2017 Red Hat, Inc.
+ * Written by Prem Mallappa, Eric Auger
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation, either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+#include "hw/boards.h"
+#include "sysemu/sysemu.h"
+#include "hw/sysbus.h"
+#include "hw/pci/pci.h"
+#include "exec/address-spaces.h"
+#include "trace.h"
+#include "qemu/error-report.h"
+
+#include "hw/arm/smmuv3.h"
+#include "smmuv3-internal.h"
+
+static inline int smmu_enabled(SMMUV3State *s)
+{
+    return smmu_read32_reg(s, SMMU_REG_CR0) & SMMU_CR0_SMMU_ENABLE;
+}
+
+/**
+ * smmu_irq_update - update the GERROR register according to
+ * the IRQ and the enable state
+ *
+ * return > 0 when IRQ is supposed to be raised
+ */
+static int smmu_irq_update(SMMUV3State *s, int irq, uint64_t data)
+{
+    uint32_t error = 0;
+
+    if (!smmu_gerror_irq_enabled(s)) {
+        return 0;
+    }
+
+    switch (irq) {
+    case SMMU_IRQ_EVTQ:
+        if (smmu_evt_irq_enabled(s)) {
+            error = SMMU_GERROR_EVENTQ;
+        }
+        break;
+    case SMMU_IRQ_CMD_SYNC:
+        if (smmu_gerror_irq_enabled(s)) {
+            uint32_t err_type = (uint32_t)data;
+
+            if (err_type) {
+                uint32_t regval = smmu_read32_reg(s, SMMU_REG_CMDQ_CONS);
+                smmu_write32_reg(s, SMMU_REG_CMDQ_CONS,
+                                 regval | err_type << SMMU_CMD_CONS_ERR_SHIFT);
+            }
+            error = SMMU_GERROR_CMDQ;
+        }
+        break;
+    case SMMU_IRQ_PRIQ:
+        if (smmu_pri_irq_enabled(s)) {
+            error = SMMU_GERROR_PRIQ;
+        }
+        break;
+    }
+
+    if (error) {
+        uint32_t gerror = smmu_read32_reg(s, SMMU_REG_GERROR);
+        uint32_t gerrorn = smmu_read32_reg(s, SMMU_REG_GERRORN);
+
+        trace_smmuv3_irq_update(error, gerror, gerrorn);
+
+        /* only toggle GERROR if the interrupt is not active */
+        if (!((gerror ^ gerrorn) & error)) {
+            smmu_write32_reg(s, SMMU_REG_GERROR, gerror ^ error);
+        }
+    }
+
+    return error;
+}
+
+static void smmu_irq_raise(SMMUV3State *s, int irq, uint64_t data)
+{
+    trace_smmuv3_irq_raise(irq);
+    if (smmu_irq_update(s, irq, data)) {
+        qemu_irq_raise(s->irq[irq]);
+    }
+}
+
+static MemTxResult smmu_q_read(SMMUQueue *q, void *data)
+{
+    uint64_t addr = Q_ENTRY(q, Q_IDX(q, q->cons));
+    MemTxResult ret;
+
+    ret = smmu_read_sysmem(addr, data, q->ent_size, false);
+    /* TODO if (ret != MEMTX_OK ) handle error */
+
+    q->cons++;
+    if (q->cons == q->entries) {
+        q->cons = 0;
+        q->wrap.cons++;     /* this will toggle */
+    }
+
+    return ret;
+}
+
+static MemTxResult smmu_q_write(SMMUQueue *q, void *data)
+{
+    uint64_t addr = Q_ENTRY(q, Q_IDX(q, q->prod));
+
+    smmu_write_sysmem(addr, data, q->ent_size, false);
+
+    /* advance the producer index the same way smmu_q_read() advances cons */
+    q->prod++;
+    if (q->prod == q->entries) {
+        q->prod = 0;
+        q->wrap.prod++;     /* this will toggle */
+    }
+
+    return MEMTX_OK;
+}
+
+static MemTxResult smmu_read_cmdq(SMMUV3State *s, Cmd *cmd)
+{
+    SMMUQueue *q = &s->cmdq;
+    MemTxResult ret = smmu_q_read(q, cmd);
+    uint32_t val = 0;
+
+    val |= (q->wrap.cons << q->shift) | q->cons;
+
+    /* Update consumer pointer */
+    smmu_write32_reg(s, SMMU_REG_CMDQ_CONS, val);
+
+    return ret;
+}
+
+static int smmu_cmdq_consume(SMMUV3State *s)
+{
+    uint32_t error = SMMU_CMD_ERR_NONE;
+
+    trace_smmuv3_cmdq_consume(SMMU_CMDQ_ERR(s), smmu_cmd_q_enabled(s),
+                              s->cmdq.prod, s->cmdq.cons,
+                              s->cmdq.wrap.prod, s->cmdq.wrap.cons);
+
+    if (!smmu_cmd_q_enabled(s)) {
+        return 0;
+    }
+
+    while (!SMMU_CMDQ_ERR(s) && !smmu_is_q_empty(s, &s->cmdq)) {
+        uint32_t type;
+        Cmd cmd;
+
+        if (smmu_read_cmdq(s, &cmd) != MEMTX_OK) {
+            error = SMMU_CMD_ERR_ABORT;
+            break;
+        }
+
+        type = CMD_TYPE(&cmd);
+
+        trace_smmuv3_cmdq_opcode(cmd_stringify[type]);
+
+        switch (CMD_TYPE(&cmd)) {
+        case SMMU_CMD_SYNC:
+            if (CMD_CS(&cmd) & CMD_SYNC_SIG_IRQ) {
+                smmu_irq_raise(s, SMMU_IRQ_CMD_SYNC, SMMU_CMD_ERR_NONE);
+            } else if (CMD_CS(&cmd) & CMD_SYNC_SIG_SEV) {
+                trace_smmuv3_cmdq_consume_sev();
+            }
+            break;
+        case SMMU_CMD_PREFETCH_CONFIG:
+        case SMMU_CMD_PREFETCH_ADDR:
+        case SMMU_CMD_CFGI_STE:
+        {
+            uint32_t streamid = cmd.word[1];
+
+            trace_smmuv3_cmdq_cfgi_ste(streamid);
+            break;
+        }
+        case SMMU_CMD_CFGI_STE_RANGE: /* same as SMMU_CMD_CFGI_ALL */
+        {
+            uint32_t start = cmd.word[1], range, end;
+
+            range = extract32(cmd.word[2], 0, 5);
+            end = start + (1 << (range + 1)) - 1;
+            trace_smmuv3_cmdq_cfgi_ste_range(start, end);
+            break;
+        }
+        case SMMU_CMD_CFGI_CD:
+        case SMMU_CMD_CFGI_CD_ALL:
+            break;
+        case SMMU_CMD_TLBI_NH_ALL:
+        case SMMU_CMD_TLBI_NH_ASID:
+            printf("%s TLBI* replay\n", __func__);
+            break;
+        case SMMU_CMD_TLBI_NH_VA:
+        {
+            int asid = extract32(cmd.word[1], 16, 16);
+            int vmid = extract32(cmd.word[1], 0, 16);
+            uint64_t low = extract32(cmd.word[2], 12, 20);
+            uint64_t high = cmd.word[3];
+            uint64_t addr = high << 32 | (low << 12);
+
+            trace_smmuv3_cmdq_tlbi_nh_va(asid, vmid, addr);
+            break;
+        }
+        case SMMU_CMD_TLBI_NH_VAA:
+        case SMMU_CMD_TLBI_EL3_ALL:
+        case SMMU_CMD_TLBI_EL3_VA:
+        case SMMU_CMD_TLBI_EL2_ALL:
+        case SMMU_CMD_TLBI_EL2_ASID:
+        case SMMU_CMD_TLBI_EL2_VA:
+        case SMMU_CMD_TLBI_EL2_VAA:
+        case SMMU_CMD_TLBI_S12_VMALL:
+        case SMMU_CMD_TLBI_S2_IPA:
+        case SMMU_CMD_TLBI_NSNH_ALL:
+            break;
+        case SMMU_CMD_ATC_INV:
+        case SMMU_CMD_PRI_RESP:
+        case SMMU_CMD_RESUME:
+        case SMMU_CMD_STALL_TERM:
+            trace_smmuv3_unhandled_cmd(type);
+            break;
+        default:
+            error = SMMU_CMD_ERR_ILLEGAL;
+            error_report("Illegal command type: %d, ignoring", CMD_TYPE(&cmd));
+            dump_cmd(&cmd);
+            break;
+        }
+
+        if (error != SMMU_CMD_ERR_NONE) {
+            error_report("CMD Error");
+            break;
+        }
+    }
+
+    if (error) {
+        smmu_irq_raise(s, SMMU_IRQ_GERROR, error);
+    }
+
+    trace_smmuv3_cmdq_consume_out(s->cmdq.wrap.prod, s->cmdq.prod,
+                                  s->cmdq.wrap.cons, s->cmdq.cons);
+
+    return 0;
+}
+
+/**
+ * GERROR is updated when raising an interrupt, GERRORN will be updated
+ * by SW and should match GERROR before normal operation resumes.
+ */
+static void smmu_irq_clear(SMMUV3State *s, uint64_t gerrorn)
+{
+    int irq = SMMU_IRQ_GERROR;
+    uint32_t toggled;
+
+    toggled = smmu_read32_reg(s, SMMU_REG_GERRORN) ^ gerrorn;
+
+    while (toggled) {
+        irq = ctz32(toggled);
+
+        qemu_irq_lower(s->irq[irq]);
+
+        toggled &= toggled - 1;
+    }
+}
+
+static int smmu_evtq_update(SMMUV3State *s)
+{
+    if (!smmu_enabled(s)) {
+        return 0;
+    }
+
+    if (!smmu_is_q_empty(s, &s->evtq)) {
+        if (smmu_evt_irq_enabled(s)) {
+            smmu_irq_raise(s, SMMU_IRQ_EVTQ, 0);
+        }
+    }
+
+    if (smmu_is_q_empty(s, &s->evtq)) {
+        smmu_irq_clear(s, SMMU_GERROR_EVENTQ);
+    }
+
+    return 1;
+}
+
+static void smmu_create_event(SMMUV3State *s, hwaddr iova,
+                              uint32_t sid, bool is_write, int error);
+
+static void smmu_update(SMMUV3State *s)
+{
+    int error = 0;
+
+    /* SMMU starts processing commands even when not enabled */
+    if (!smmu_enabled(s)) {
+        goto check_cmdq;
+    }
+
+    /* Event queue updates take priority over command queue processing */
+    if ((smmu_evt_q_enabled(s)) && !smmu_is_q_empty(s, &s->evtq)) {
+        trace_smmuv3_update(smmu_is_q_empty(s, &s->evtq), s->evtq.prod,
+                            s->evtq.cons, s->evtq.wrap.prod, s->evtq.wrap.cons);
+        error = smmu_evtq_update(s);
+    }
+
+    if (error) {
+        /* TODO: maybe create a proper event queue entry in the future */
+        /* an error condition is not a recoverable event, like other devices */
+        error_report("An unfavourable condition");
+        smmu_create_event(s, 0, 0, 0, error);
+    }
+
+check_cmdq:
+    if (smmu_cmd_q_enabled(s) && !SMMU_CMDQ_ERR(s)) {
+        smmu_cmdq_consume(s);
+    } else {
+        trace_smmuv3_update_check_cmd(SMMU_CMDQ_ERR(s));
+    }
+
+}
+
+static void smmu_update_irq(SMMUV3State *s, uint64_t addr, uint64_t val)
+{
+    smmu_irq_clear(s, val);
+
+    smmu_write32_reg(s, SMMU_REG_GERRORN, val);
+
+    trace_smmuv3_update_irq(smmu_is_irq_pending(s, 0),
+                          smmu_read32_reg(s, SMMU_REG_GERROR),
+                          smmu_read32_reg(s, SMMU_REG_GERRORN));
+
+    /* Clear only when no more left */
+    if (!smmu_is_irq_pending(s, 0)) {
+        qemu_irq_lower(s->irq[0]);
+    }
+}
+
+#define SMMU_ID_REG_INIT(s, reg, d) do {        \
+    s->regs[reg >> 2] = d;                      \
+    } while (0)
+
+static void smmuv3_id_reg_init(SMMUV3State *s)
+{
+    uint32_t data =
+        SMMU_IDR0_STLEVEL << SMMU_IDR0_STLEVEL_SHIFT |
+        SMMU_IDR0_TERM    << SMMU_IDR0_TERM_SHIFT    |
+        SMMU_IDR0_STALL   << SMMU_IDR0_STALL_SHIFT   |
+        SMMU_IDR0_VMID16  << SMMU_IDR0_VMID16_SHIFT  |
+        SMMU_IDR0_PRI     << SMMU_IDR0_PRI_SHIFT     |
+        SMMU_IDR0_ASID16  << SMMU_IDR0_ASID16_SHIFT  |
+        SMMU_IDR0_ATS     << SMMU_IDR0_ATS_SHIFT     |
+        SMMU_IDR0_HYP     << SMMU_IDR0_HYP_SHIFT     |
+        SMMU_IDR0_HTTU    << SMMU_IDR0_HTTU_SHIFT    |
+        SMMU_IDR0_COHACC  << SMMU_IDR0_COHACC_SHIFT  |
+        SMMU_IDR0_TTF     << SMMU_IDR0_TTF_SHIFT     |
+        SMMU_IDR0_S1P     << SMMU_IDR0_S1P_SHIFT     |
+        SMMU_IDR0_S2P     << SMMU_IDR0_S2P_SHIFT;
+
+    SMMU_ID_REG_INIT(s, SMMU_REG_IDR0, data);
+
+#define SMMU_QUEUE_SIZE_LOG2  19
+    data =
+        1 << 27 |                    /* Attr Types override */
+        SMMU_QUEUE_SIZE_LOG2 << 21 | /* Cmd Q size */
+        SMMU_QUEUE_SIZE_LOG2 << 16 | /* Event Q size */
+        SMMU_QUEUE_SIZE_LOG2 << 11 | /* PRI Q size */
+        0  << 6 |                    /* SSID not supported */
+        SMMU_IDR1_SIDSIZE;
+
+    SMMU_ID_REG_INIT(s, SMMU_REG_IDR1, data);
+
+    data =
+        SMMU_IDR5_GRAN << SMMU_IDR5_GRAN_SHIFT | SMMU_IDR5_OAS;
+
+    SMMU_ID_REG_INIT(s, SMMU_REG_IDR5, data);
+
+}
+
+static void smmuv3_init(SMMUV3State *s)
+{
+    smmuv3_id_reg_init(s);      /* Update ID regs alone */
+
+    s->sid_size = SMMU_IDR1_SIDSIZE;
+
+    s->cmdq.entries = (smmu_read32_reg(s, SMMU_REG_IDR1) >> 21) & 0x1f;
+    s->cmdq.ent_size = sizeof(Cmd);
+    s->evtq.entries = (smmu_read32_reg(s, SMMU_REG_IDR1) >> 16) & 0x1f;
+    s->evtq.ent_size = sizeof(Evt);
+}
+
+/*
+ * All SMMU data structures are little endian, and are aligned to 8 bytes
+ * L1STE/STE/L1CD/CD, Queue entries in CMDQ/EVTQ/PRIQ
+ */
+static inline int smmu_get_ste(SMMUV3State *s, hwaddr addr, Ste *buf)
+{
+    int ret;
+
+    trace_smmuv3_get_ste(addr);
+    ret = dma_memory_read(&address_space_memory, addr, buf, sizeof(*buf));
+    dump_ste(buf);
+    return ret;
+}
+
+/*
+ * For now we only support a CD table with a single entry; 'ssid' would
+ * select the entry otherwise
+ */
+static inline int smmu_get_cd(SMMUV3State *s, Ste *ste, uint32_t ssid, Cd *buf)
+{
+    hwaddr addr = STE_CTXPTR(ste);
+    int ret;
+
+    if (STE_S1CDMAX(ste) != 0) {
+        error_report("Multilevel Ctx Descriptor not supported yet");
+    }
+
+    ret = dma_memory_read(&address_space_memory, addr, buf, sizeof(*buf));
+
+    trace_smmuv3_get_cd(addr);
+    dump_cd(buf);
+
+    return ret;
+}
+
+/**
+ * is_ste_consistent - Check validity of STE
+ * according to 6.2.1 Validity of STE
+ * TODO: check the relevance of each check and compliance
+ * with this spec chapter
+ */
+static int is_ste_consistent(SMMUV3State *s, Ste *ste)
+{
+    uint32_t _config = STE_CONFIG(ste);
+    uint32_t ste_vmid, ste_eats, ste_s2s, ste_s1fmt, ste_s2aa64, ste_s1cdmax;
+    uint32_t ste_strw;
+    bool strw_unused, addr_out_of_range, granule_supported;
+    bool config[] = {_config & 0x1, _config & 0x2, _config & 0x3};
+
+    ste_vmid = STE_S2VMID(ste);
+    ste_eats = STE_EATS(ste); /* Enable PCIe ATS trans */
+    ste_s2s = STE_S2S(ste);
+    ste_s1fmt = STE_S1FMT(ste);
+    ste_s2aa64 = STE_S2AA64(ste);
+    ste_s1cdmax = STE_S1CDMAX(ste); /* CD bit # S1ContextPtr */
+    ste_strw = STE_STRW(ste); /* stream world control */
+
+    if (!STE_VALID(ste)) {
+        error_report("STE NOT valid");
+        return false;
+    }
+
+    granule_supported = is_s2granule_valid(ste);
+
+    /* As S1/S2 combinations are supported do not check
+     * corresponding STE config values */
+
+    if (!config[2]) {
+        /* Report abort to device, no event recorded */
+        error_report("STE config 0b000 not implemented");
+        return false;
+    }
+
+    if (!SMMU_IDR1_SIDSIZE && ste_s1cdmax && config[0] &&
+        !SMMU_IDR0_CD2L && (ste_s1fmt == 1 || ste_s1fmt == 2)) {
+        error_report("STE inconsistent, CD mismatch");
+        return false;
+    }
+    if (SMMU_IDR0_ATS && ((_config & 0x3) == 0) &&
+        ((ste_eats == 2 && (_config != 0x7 || ste_s2s)) ||
+        (ste_eats == 1 && !ste_s2s))) {
+        error_report("STE inconsistent, EATS/S2S mismatch");
+        return false;
+    }
+    if (config[0] && (SMMU_IDR1_SIDSIZE &&
+        (ste_s1cdmax > SMMU_IDR1_SIDSIZE))) {
+        error_report("STE inconsistent, SSID out of range");
+        return false;
+    }
+
+    strw_unused = (!SMMU_IDR0_S1P || !SMMU_IDR0_HYP || (_config == 4));
+
+    addr_out_of_range = STE_S2TTB(ste) > MAX_PA(ste);
+
+    if (is_ste_stage2(ste)) {
+        if ((ste_s2aa64 && !is_s2granule_valid(ste)) ||
+            (!ste_s2aa64 && !(SMMU_IDR0_TTF & 0x1)) ||
+            (ste_s2aa64 && !(SMMU_IDR0_TTF & 0x2))  ||
+            ((STE_S2HA(ste) || STE_S2HD(ste)) && !ste_s2aa64) ||
+            ((STE_S2HA(ste) || STE_S2HD(ste)) && !SMMU_IDR0_HTTU) ||
+            (STE_S2HD(ste) && (SMMU_IDR0_HTTU == 1)) || addr_out_of_range) {
+            error_report("STE inconsistent");
+            trace_smmuv3_is_ste_consistent(config[1], granule_supported,
+                                           addr_out_of_range, ste_s2aa64,
+                                           STE_S2HA(ste), STE_S2HD(ste),
+                                           STE_S2TTB(ste));
+            return false;
+        }
+    }
+    if (SMMU_IDR0_S2P && (config[0] == 0 && config[1]) &&
+        (strw_unused || !ste_strw) && !SMMU_IDR0_VMID16 && !(ste_vmid >> 8)) {
+        error_report("STE inconsistent, VMID out of range");
+        return false;
+    }
+
+    return true;
+}
+
+/**
+ * smmu_find_ste - Return the stream table entry associated
+ * to the sid
+ *
+ * @s: smmuv3 handle
+ * @sid: stream ID
+ * @ste: returned stream table entry
+ * Supports linear and 2-level stream table
+ */
+static int smmu_find_ste(SMMUV3State *s, uint16_t sid, Ste *ste)
+{
+    hwaddr addr;
+
+    trace_smmuv3_find_ste(sid, s->features, s->sid_split);
+    /* Check SID range */
+    if (sid >= (1 << s->sid_size)) {
+        return SMMU_EVT_C_BAD_SID;
+    }
+    if (s->features & SMMU_FEATURE_2LVL_STE) {
+        int l1_ste_offset, l2_ste_offset, max_l2_ste, span;
+        hwaddr l1ptr, l2ptr;
+        STEDesc l1std;
+
+        l1_ste_offset = sid >> s->sid_split;
+        l2_ste_offset = sid & ((1 << s->sid_split) - 1);
+        l1ptr = (hwaddr)(s->strtab_base + l1_ste_offset * sizeof(l1std));
+        smmu_read_sysmem(l1ptr, &l1std, sizeof(l1std), false);
+        span = L1STD_SPAN(&l1std);
+
+        if (!span) {
+            /* l2ptr is not valid */
+            error_report("invalid sid=%d (L1STD span=0)", sid);
+            return SMMU_EVT_C_BAD_SID;
+        }
+        max_l2_ste = (1 << span) - 1;
+        l2ptr = L1STD_L2PTR(&l1std);
+        trace_smmuv3_find_ste_2lvl(s->strtab_base, l1ptr, l1_ste_offset,
+                                   l2ptr, l2_ste_offset, max_l2_ste);
+        if (l2_ste_offset > max_l2_ste) {
+            error_report("l2_ste_offset=%d > max_l2_ste=%d",
+                         l2_ste_offset, max_l2_ste);
+            return SMMU_EVT_C_BAD_STE;
+        }
+        addr = L1STD_L2PTR(&l1std) + l2_ste_offset * sizeof(*ste);
+    } else {
+        addr = s->strtab_base + sid * sizeof(*ste);
+    }
+
+    if (smmu_get_ste(s, addr, ste)) {
+        error_report("Unable to Fetch STE");
+        return SMMU_EVT_F_UUT;
+    }
+
+    return 0;
+}
+
+/**
+ * smmu_cfg_populate_s1 - Populate the stage 1 translation config
+ * from the context descriptor
+ */
+static int smmu_cfg_populate_s1(SMMUTransCfg *cfg, Cd *cd)
+{
+    bool s1a64 = CD_AARCH64(cd);
+    int epd0 = CD_EPD0(cd);
+    int tg;
+
+    cfg->stage   = 1;
+    tg           = epd0 ? CD_TG1(cd) : CD_TG0(cd);
+    cfg->tsz     = epd0 ? CD_T1SZ(cd) : CD_T0SZ(cd);
+    cfg->ttbr    = epd0 ? CD_TTB1(cd) : CD_TTB0(cd);
+    cfg->oas     = oas2bits(CD_IPS(cd));
+
+    if (s1a64) {
+        cfg->tsz = MIN(cfg->tsz, 39);
+        cfg->tsz = MAX(cfg->tsz, 16);
+    }
+    cfg->granule_sz = tg2granule(tg, epd0);
+
+    cfg->oas = MIN(oas2bits(SMMU_IDR5_OAS), cfg->oas);
+    /* fix ttbr - make top bits zero */
+    cfg->ttbr = extract64(cfg->ttbr, 0, cfg->oas);
+    cfg->aa64 = s1a64;
+    cfg->initial_level  = 4 - (64 - cfg->tsz - 4) / (cfg->granule_sz - 3);
+
+    trace_smmuv3_cfg_stage(cfg->stage, cfg->oas, cfg->tsz, cfg->ttbr,
+                           cfg->aa64, cfg->granule_sz, cfg->initial_level);
+
+    return 0;
+}
+
+/**
+ * smmu_cfg_populate_s2 - Populate the stage 2 translation config
+ * from the Stream Table Entry
+ */
+static int smmu_cfg_populate_s2(SMMUTransCfg *cfg, Ste *ste)
+{
+    bool s2a64 = STE_S2AA64(ste);
+    int default_initial_level;
+    int tg;
+
+    cfg->stage = 2;
+
+    tg           = STE_S2TG(ste);
+    cfg->tsz     = STE_S2T0SZ(ste);
+    cfg->ttbr    = STE_S2TTB(ste);
+    cfg->oas     = pa_range(ste);
+
+    cfg->aa64    = s2a64;
+
+    if (s2a64) {
+        cfg->tsz = MIN(cfg->tsz, 39);
+        cfg->tsz = MAX(cfg->tsz, 16);
+    }
+    cfg->granule_sz = tg2granule(tg, 0);
+
+    cfg->oas = MIN(oas2bits(SMMU_IDR5_OAS), cfg->oas);
+    /* fix ttbr - make top bits zero */
+    cfg->ttbr = extract64(cfg->ttbr, 0, cfg->oas);
+
+    default_initial_level = 4 - (64 - cfg->tsz - 4) / (cfg->granule_sz - 3);
+    cfg->initial_level = ~STE_S2SL0(ste);
+    if (cfg->initial_level  != default_initial_level) {
+        error_report("%s concatenated translation tables at initial S2 lookup"
+                     " not supported", __func__);
+        return -1;
+    }
+
+    trace_smmuv3_cfg_stage(cfg->stage, cfg->oas, cfg->tsz, cfg->ttbr,
+                           cfg->aa64, cfg->granule_sz, cfg->initial_level);
+
+    return 0;
+}
+
+static MemTxResult smmu_write_evtq(SMMUV3State *s, Evt *evt)
+{
+    SMMUQueue *q = &s->evtq;
+    MemTxResult ret = smmu_q_write(q, evt);
+    uint32_t val = 0;
+
+    val |= (q->wrap.prod << q->shift) | q->prod;
+
+    smmu_write32_reg(s, SMMU_REG_EVTQ_PROD, val);
+
+    return ret;
+}
+
+/*
+ * Events created on the EventQ
+ */
+static void smmu_create_event(SMMUV3State *s, hwaddr iova,
+                              uint32_t sid, bool is_write, int error)
+{
+    SMMUQueue *q = &s->evtq;
+    uint64_t head;
+    Evt evt = {};
+
+    if (!smmu_evt_q_enabled(s)) {
+        return;
+    }
+
+    EVT_SET_TYPE(&evt, error);
+    EVT_SET_SID(&evt, sid);
+
+    switch (error) {
+    case SMMU_EVT_F_UUT:
+    case SMMU_EVT_C_BAD_STE:
+        break;
+    case SMMU_EVT_C_BAD_CD:
+    case SMMU_EVT_F_CD_FETCH:
+        break;
+    case SMMU_EVT_F_TRANS_FORBIDDEN:
+    case SMMU_EVT_F_WALK_EXT_ABRT:
+        EVT_SET_INPUT_ADDR(&evt, iova);
+    default:
+        break;
+    }
+
+    smmu_write_evtq(s, &evt);
+
+    head = Q_IDX(q, q->prod);
+
+    if (smmu_is_q_full(s, &s->evtq)) {
+        head = q->prod ^ (1U << 31);    /* Set overflow */
+    }
+
+    smmu_write32_reg(s, SMMU_REG_EVTQ_PROD, head);
+
+    smmu_irq_raise(s, SMMU_IRQ_EVTQ, 0);
+}
+
+/**
+ * smmuv3_decode_config - Prepare the translation configuration
+ * for the @mr iommu region
+ * @mr: iommu memory region the translation config must be prepared for
+ * @cfg: output translation configuration
+ *
+ * return 0 on success or error code on failure
+ */
+static int smmuv3_decode_config(IOMMUMemoryRegion *mr, SMMUTransCfg *cfg)
+{
+    SMMUDevice *sdev = container_of(mr, SMMUDevice, iommu);
+    int sid = smmu_get_sid(sdev);
+    SMMUV3State *s = sdev->smmu;
+    Ste ste;
+    Cd cd;
+    int ret = 0;
+
+    if (!smmu_enabled(s)) {
+        cfg->disabled = true;
+        return 0;
+    }
+    ret = smmu_find_ste(s, sid, &ste);
+    if (ret) {
+        return ret;
+    }
+
+    if (!STE_VALID(&ste)) {
+        return SMMU_EVT_C_BAD_STE;
+    }
+
+    switch (STE_CONFIG(&ste)) {
+    case STE_CONFIG_BYPASS:
+        cfg->bypassed = true;
+        return 0;
+    case STE_CONFIG_S1:
+        break;
+    case STE_CONFIG_S2:
+        break;
+    default: /* reserved, abort, nested */
+        return -1;
+    }
+
+    /* S1 or S2 */
+
+    if (!is_ste_consistent(s, &ste)) {
+        return SMMU_EVT_C_BAD_STE;
+    }
+
+    if (is_ste_stage1(&ste)) {
+        ret = smmu_get_cd(s, &ste, 0, &cd); /* We don't have SSID yet */
+        if (ret) {
+            return ret;
+        }
+
+        if (!is_cd_valid(s, &ste, &cd)) {
+            return SMMU_EVT_C_BAD_CD;
+        }
+        return smmu_cfg_populate_s1(cfg, &cd);
+    }
+
+    return smmu_cfg_populate_s2(cfg, &ste);
+}
+
+static IOMMUTLBEntry smmuv3_translate(IOMMUMemoryRegion *mr, hwaddr addr,
+                                      IOMMUAccessFlags flag)
+{
+    SMMUDevice *sdev = container_of(mr, SMMUDevice, iommu);
+    SMMUV3State *s = sdev->smmu;
+    SMMUState *sys = SMMU_SYS_DEV(s);
+    bool is_write = flag & IOMMU_WO;
+    uint16_t sid = smmu_get_sid(sdev);
+    SMMUEvtErr ret;
+    SMMUTransCfg cfg = {};
+    IOMMUTLBEntry entry = {
+        .target_as = &address_space_memory,
+        .iova = addr,
+        .translated_addr = addr,
+        .addr_mask = ~(hwaddr)0,
+        .perm = IOMMU_NONE,
+    };
+
+    ret = smmuv3_decode_config(mr, &cfg);
+    if (ret || cfg.disabled || cfg.bypassed) {
+        goto out;
+    }
+
+    ret = smmu_page_walk(sys, &cfg, &entry, is_write);
+
+    entry.perm = is_write ? IOMMU_RW : IOMMU_RO;
+
+    trace_smmuv3_translate_ok(mr->parent_obj.name, sid, addr,
+                              entry.translated_addr, entry.perm);
+out:
+    if (ret) {
+        error_report("%s translation failed for iova=0x%"PRIx64,
+                     mr->parent_obj.name, addr);
+        smmu_create_event(s, entry.iova, sid, is_write, ret);
+    }
+    return entry;
+}
+
+
+static inline void smmu_update_base_reg(SMMUV3State *s, uint64_t *base,
+                                        uint64_t val)
+{
+    *base = val & ~(SMMU_BASE_RA | 0x3fULL);
+}
+
+static void smmu_update_qreg(SMMUV3State *s, SMMUQueue *q, hwaddr reg,
+                             uint32_t off, uint64_t val, unsigned size)
+{
+    if (size == 8 && off == 0) {
+        smmu_write64_reg(s, reg, val);
+    } else {
+        smmu_write_reg(s, reg, val);
+    }
+
+    switch (off) {
+    case 0:                             /* BASE register */
+        val = smmu_read64_reg(s, reg);
+        q->shift = val & 0x1f;
+        q->entries = 1 << (q->shift);
+        smmu_update_base_reg(s, &q->base, val);
+        break;
+
+    case 8:                             /* PROD */
+        q->prod = Q_IDX(q, val);
+        q->wrap.prod = val >> q->shift;
+        break;
+
+    case 12:                             /* CONS */
+        q->cons = Q_IDX(q, val);
+        q->wrap.cons = val >> q->shift;
+        trace_smmuv3_update_qreg(q->cons, val);
+        break;
+
+    }
+
+    switch (reg) {
+    case SMMU_REG_CMDQ_PROD:            /* should be only for CMDQ_PROD */
+    case SMMU_REG_CMDQ_CONS:            /* but we do it anyway */
+        smmu_update(s);
+        break;
+    }
+}
+
+static void smmu_write_mmio_fixup(SMMUV3State *s, hwaddr *addr)
+{
+    switch (*addr) {
+    case 0x100a8: case 0x100ac:         /* Aliasing => page0 registers */
+    case 0x100c8: case 0x100cc:
+        *addr ^= (hwaddr)0x10000;
+    }
+}
+
+static void smmu_write_mmio(void *opaque, hwaddr addr,
+                            uint64_t val, unsigned size)
+{
+    SMMUState *sys = opaque;
+    SMMUV3State *s = SMMU_V3_DEV(sys);
+    bool update = false;
+
+    smmu_write_mmio_fixup(s, &addr);
+
+    trace_smmuv3_write_mmio(addr, val, size);
+
+    switch (addr) {
+    case 0xFDC ... 0xFFC:
+    case SMMU_REG_IDR0 ... SMMU_REG_IDR5:
+        trace_smmuv3_write_mmio_idr(addr, val);
+        return;
+
+    case SMMU_REG_GERRORN:
+        smmu_update_irq(s, addr, val);
+        return;
+
+    case SMMU_REG_CR0:
+        smmu_write32_reg(s, SMMU_REG_CR0, val);
+        smmu_write32_reg(s, SMMU_REG_CR0_ACK, val);
+        update = true;
+        break;
+
+    case SMMU_REG_IRQ_CTRL:
+        smmu_write32_reg(s, SMMU_REG_IRQ_CTRL_ACK, val);
+        update = true;
+        break;
+
+    case SMMU_REG_STRTAB_BASE:
+        smmu_update_base_reg(s, &s->strtab_base, val);
+        return;
+
+    case SMMU_REG_STRTAB_BASE_CFG:
+        if (((val >> 16) & 0x3) == 0x1) {
+            s->sid_split = (val >> 6) & 0x1f;
+            s->features |= SMMU_FEATURE_2LVL_STE;
+        }
+        break;
+
+    case SMMU_REG_CMDQ_PROD:
+    case SMMU_REG_CMDQ_CONS:
+    case SMMU_REG_CMDQ_BASE:
+    case SMMU_REG_CMDQ_BASE + 4:
+        smmu_update_qreg(s, &s->cmdq, addr, addr - SMMU_REG_CMDQ_BASE,
+                         val, size);
+        return;
+
+    case SMMU_REG_EVTQ_CONS:
+    {
+        SMMUQueue *evtq = &s->evtq;
+        evtq->cons = Q_IDX(evtq, val);
+        evtq->wrap.cons = Q_WRAP(evtq, val);
+
+        trace_smmuv3_write_mmio_evtq_cons_bef_clear(evtq->prod, evtq->cons,
+                                                    evtq->wrap.prod,
+                                                    evtq->wrap.cons);
+        if (smmu_is_q_empty(s, &s->evtq)) {
+            trace_smmuv3_write_mmio_evtq_cons_after_clear(evtq->prod,
+                                                          evtq->cons,
+                                                          evtq->wrap.prod,
+                                                          evtq->wrap.cons);
+            qemu_irq_lower(s->irq[SMMU_IRQ_EVTQ]);
+        }
+    }   /* fallthrough */
+    case SMMU_REG_EVTQ_BASE:
+    case SMMU_REG_EVTQ_BASE + 4:
+    case SMMU_REG_EVTQ_PROD:
+        smmu_update_qreg(s, &s->evtq, addr, addr - SMMU_REG_EVTQ_BASE,
+                         val, size);
+        return;
+
+    case SMMU_REG_PRIQ_CONS:
+    case SMMU_REG_PRIQ_BASE:
+    case SMMU_REG_PRIQ_BASE + 4:
+    case SMMU_REG_PRIQ_PROD:
+        smmu_update_qreg(s, &s->priq, addr, addr - SMMU_REG_PRIQ_BASE,
+                         val, size);
+        return;
+    }
+
+    if (size == 8) {
+        smmu_write_reg(s, addr, val);
+    } else {
+        smmu_write32_reg(s, addr, (uint32_t)val);
+    }
+
+    if (update) {
+        smmu_update(s);
+    }
+}
+
+static uint64_t smmu_read_mmio(void *opaque, hwaddr addr, unsigned size)
+{
+    SMMUState *sys = opaque;
+    SMMUV3State *s = SMMU_V3_DEV(sys);
+    uint64_t val;
+
+    smmu_write_mmio_fixup(s, &addr);
+
+    /* Primecell/Corelink ID registers */
+    switch (addr) {
+    case 0xFF0 ... 0xFFC:
+    case 0xFDC ... 0xFE4:
+        val = 0;
+        error_report("addr:0x%"PRIx64" val:0x%"PRIx64, addr, val);
+        break;
+
+    default:
+        val = (uint64_t)smmu_read32_reg(s, addr);
+        break;
+
+    case SMMU_REG_STRTAB_BASE ... SMMU_REG_CMDQ_BASE:
+    case SMMU_REG_EVTQ_BASE:
+    case SMMU_REG_PRIQ_BASE ... SMMU_REG_PRIQ_IRQ_CFG1:
+        val = smmu_read64_reg(s, addr);
+        break;
+    }
+
+    trace_smmuv3_read_mmio(addr, val, size);
+    return val;
+}
+
+static const MemoryRegionOps smmu_mem_ops = {
+    .read = smmu_read_mmio,
+    .write = smmu_write_mmio,
+    .endianness = DEVICE_LITTLE_ENDIAN,
+    .valid = {
+        .min_access_size = 4,
+        .max_access_size = 8,
+    },
+    .impl = {
+        .min_access_size = 4,
+        .max_access_size = 8,
+    },
+};
+
+static void smmu_init_irq(SMMUV3State *s, SysBusDevice *dev)
+{
+    int i;
+
+    for (i = 0; i < ARRAY_SIZE(s->irq); i++) {
+        sysbus_init_irq(dev, &s->irq[i]);
+    }
+}
+
+static AddressSpace *smmu_find_add_as(PCIBus *bus, void *opaque, int devfn)
+{
+    SMMUState *s = opaque;
+    uintptr_t key = (uintptr_t)bus;
+    SMMUPciBus *sbus = g_hash_table_lookup(s->smmu_as_by_busptr, &key);
+    SMMUDevice *sdev;
+
+    if (!sbus) {
+        uintptr_t *new_key = g_malloc(sizeof(*new_key));
+
+        *new_key = (uintptr_t)bus;
+        sbus = g_malloc0(sizeof(SMMUPciBus) +
+                         sizeof(SMMUDevice *) * SMMU_PCI_DEVFN_MAX);
+        sbus->bus = bus;
+        g_hash_table_insert(s->smmu_as_by_busptr, new_key, sbus);
+    }
+
+    sdev = sbus->pbdev[devfn];
+    if (!sdev) {
+        char *name = g_strdup_printf("%s-%d-%d", TYPE_SMMU_V3_DEV,
+                                      pci_bus_num(bus), devfn);
+        sdev = sbus->pbdev[devfn] = g_malloc0(sizeof(SMMUDevice));
+
+        sdev->smmu = s;
+        sdev->bus = bus;
+        sdev->devfn = devfn;
+
+        memory_region_init_iommu(&sdev->iommu, sizeof(sdev->iommu),
+                                 TYPE_SMMUV3_IOMMU_MEMORY_REGION,
+                                 OBJECT(s), name, 1ULL << 48);
+        address_space_init(&sdev->as,
+                           MEMORY_REGION(&sdev->iommu), TYPE_SMMU_V3_DEV);
+    }
+
+    return &sdev->as;
+
+}
+
+static void smmu_init_iommu_as(SMMUV3State *sys)
+{
+    SMMUState *s = SMMU_SYS_DEV(sys);
+    PCIBus *pcibus = pci_find_primary_bus();
+
+    if (pcibus) {
+        pci_setup_iommu(pcibus, smmu_find_add_as, s);
+    } else {
+        error_report("No PCI bus, SMMU is not registered");
+    }
+}
+
+static void smmu_reset(DeviceState *dev)
+{
+    SMMUV3State *s = SMMU_V3_DEV(dev);
+    smmuv3_init(s);
+}
+
+static int smmu_populate_internal_state(void *opaque, int version_id)
+{
+    SMMUV3State *s = opaque;
+
+    smmu_update(s);
+    return 0;
+}
+
+static void smmu_realize(DeviceState *d, Error **errp)
+{
+    SMMUState *sys = SMMU_SYS_DEV(d);
+    SMMUV3State *s = SMMU_V3_DEV(sys);
+    SysBusDevice *dev = SYS_BUS_DEVICE(d);
+
+    memset(sys->smmu_as_by_bus_num, 0, sizeof(sys->smmu_as_by_bus_num));
+    memory_region_init_io(&sys->iomem, OBJECT(s),
+                          &smmu_mem_ops, sys, TYPE_SMMU_V3_DEV, 0x20000);
+
+    sys->smmu_as_by_busptr = g_hash_table_new_full(smmu_uint64_hash,
+                                                   smmu_uint64_equal,
+                                                   g_free, g_free);
+    sysbus_init_mmio(dev, &sys->iomem);
+
+    smmuv3_init_queues(s);
+
+    smmu_init_irq(s, dev);
+
+    smmu_init_iommu_as(s);
+}
+
+static const VMStateDescription vmstate_smmuv3 = {
+    .name = "smmuv3",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .post_load = smmu_populate_internal_state,
+    .fields = (VMStateField[]) {
+        VMSTATE_UINT32_ARRAY(regs, SMMUV3State, SMMU_NREGS),
+        VMSTATE_END_OF_LIST(),
+    },
+};
+
+static void smmuv3_instance_init(Object *obj)
+{
+    /* Nothing much to do here as of now */
+}
+
+static void smmuv3_class_init(ObjectClass *klass, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(klass);
+
+    dc->reset   = smmu_reset;
+    dc->vmsd    = &vmstate_smmuv3;
+    dc->realize = smmu_realize;
+}
+
+static void smmuv3_iommu_memory_region_class_init(ObjectClass *klass,
+                                                  void *data)
+{
+    IOMMUMemoryRegionClass *imrc = IOMMU_MEMORY_REGION_CLASS(klass);
+
+    imrc->translate = smmuv3_translate;
+}
+
+static const TypeInfo smmuv3_type_info = {
+    .name          = TYPE_SMMU_V3_DEV,
+    .parent        = TYPE_SMMU_DEV_BASE,
+    .instance_size = sizeof(SMMUV3State),
+    .instance_init = smmuv3_instance_init,
+    .class_data    = NULL,
+    .class_size    = sizeof(SMMUV3Class),
+    .class_init    = smmuv3_class_init,
+};
+
+static const TypeInfo smmuv3_iommu_memory_region_info = {
+    .parent = TYPE_IOMMU_MEMORY_REGION,
+    .name = TYPE_SMMUV3_IOMMU_MEMORY_REGION,
+    .class_init = smmuv3_iommu_memory_region_class_init,
+};
+
+static void smmuv3_register_types(void)
+{
+    type_register(&smmuv3_type_info);
+    type_register(&smmuv3_iommu_memory_region_info);
+}
+
+type_init(smmuv3_register_types)
+
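Not part of the patch: a minimal standalone sketch of the index and wrap-bit bookkeeping that smmu_q_read(), smmu_q_write() and smmu_read_cmdq() above depend on. SketchQueue and the sq_* helpers are hypothetical names used only for illustration, not QEMU code. The empty/full rules are an assumption based on the usual SMMUv3 convention (equal indexes with equal wrap bits means empty, equal indexes with different wrap bits means full), since smmu_is_q_empty() and smmu_is_q_full() are not shown in this hunk.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for SMMUQueue: indexes plus one wrap bit each. */
typedef struct {
    uint32_t shift;     /* log2 of the number of entries */
    uint32_t prod;      /* producer index, 0 .. entries - 1 */
    uint32_t cons;      /* consumer index, 0 .. entries - 1 */
    bool wrap_prod;     /* toggles each time prod wraps to 0 */
    bool wrap_cons;     /* toggles each time cons wraps to 0 */
} SketchQueue;

static inline uint32_t sq_entries(const SketchQueue *q)
{
    return 1u << q->shift;
}

/* Empty: same index and same wrap bit. */
static inline bool sq_empty(const SketchQueue *q)
{
    return q->prod == q->cons && q->wrap_prod == q->wrap_cons;
}

/* Full: same index but the producer is one lap ahead. */
static inline bool sq_full(const SketchQueue *q)
{
    return q->prod == q->cons && q->wrap_prod != q->wrap_cons;
}

/* Advance after writing an entry: increment first, then wrap. */
static inline void sq_advance_prod(SketchQueue *q)
{
    if (++q->prod == sq_entries(q)) {
        q->prod = 0;
        q->wrap_prod = !q->wrap_prod;
    }
}

/* Advance after reading an entry: increment first, then wrap. */
static inline void sq_advance_cons(SketchQueue *q)
{
    if (++q->cons == sq_entries(q)) {
        q->cons = 0;
        q->wrap_cons = !q->wrap_cons;
    }
}

/* Register encoding: wrap bit sits just above the index bits. */
static inline uint32_t sq_prod_reg(const SketchQueue *q)
{
    return ((uint32_t)q->wrap_prod << q->shift) | q->prod;
}
```

With a 4-entry queue (shift = 2), four calls to sq_advance_prod() leave the index at 0 with the wrap bit set, so the queue reads as full rather than empty. Incrementing before the wrap check keeps the index strictly below the entry count at all times.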
diff --git a/hw/arm/trace-events b/hw/arm/trace-events
index b371b4d..f9b9cbe 100644
--- a/hw/arm/trace-events
+++ b/hw/arm/trace-events
@@ -16,3 +16,37 @@ smmu_get_pte(uint64_t baseaddr, int index, uint64_t pteaddr, uint64_t pte) "base
 smmu_set_translated_address(hwaddr iova, hwaddr pa) "iova = 0x%"PRIx64" -> pa = 0x%"PRIx64
 smmu_walk_pgtable(hwaddr iova, bool is_write) "Input addr: 0x%"PRIx64", is_write=%d"
 smmu_walk_pgtable_out(hwaddr addr, uint32_t mask, int perm) "DONE: o/p addr:0x%"PRIx64" mask:0x%x perm:%d"
+
+#hw/arm/smmuv3.c
+smmuv3_irq_update(uint32_t error, uint32_t gerror, uint32_t gerrorn) "<<<< error:0x%x gerror:0x%x gerrorn:0x%x"
+smmuv3_irq_raise(int irq) "irq:%d"
+smmuv3_unhandled_cmd(uint32_t type) "Unhandled command type=%d"
+smmuv3_cmdq_consume(int error, bool enabled, uint32_t prod, uint32_t cons, uint8_t wrap_prod, uint8_t wrap_cons) "error=%d, enabled=%d prod=%d cons=%d wrap.prod=%d wrap.cons=%d"
+smmuv3_cmdq_consume_details(hwaddr base, uint32_t cons, uint32_t prod, uint32_t word, uint8_t wrap_cons) "CMDQ base: 0x%"PRIx64" cons:%d prod:%d val:0x%x wrap:%d"
+smmuv3_cmdq_opcode(const char *opcode) "<--- %s"
+smmuv3_cmdq_cfgi_ste(int streamid) "     |_ streamid =%d"
+smmuv3_cmdq_cfgi_ste_range(int start, int end) "     |_ start=0x%x - end=0x%x"
+smmuv3_cmdq_tlbi_nh_va(int asid, int vmid, uint64_t addr) "     |_ asid =%d vmid =%d addr=0x%"PRIx64
+smmuv3_cmdq_consume_sev(void) "CMD_SYNC CS=SEV not supported, ignoring"
+smmuv3_cmdq_consume_out(uint8_t prod_wrap, uint32_t prod, uint8_t cons_wrap, uint32_t cons) "prod_wrap:%d, prod:0x%x cons_wrap:%d cons:0x%x"
+smmuv3_update(bool is_empty, uint32_t prod, uint32_t cons, uint8_t prod_wrap, uint8_t cons_wrap) "q empty:%d prod:%d cons:%d p.wrap:%d c.wrap:%d"
+smmuv3_update_check_cmd(int error) "cmdq not enabled or error :0x%x"
+smmuv3_update_irq(bool is_pending, uint32_t gerror, uint32_t gerrorn) "irq pend: %d gerror:0x%x gerrorn:0x%x"
+smmuv3_is_ste_consistent(bool cfg, bool granule_supported, bool addr_oor, uint32_t aa64, int s2ha, int s2hd, uint64_t s2ttb) "config[1]:%d gran:%d addr:%d aa64:%d s2ha:%d s2hd:%d s2ttb:0x%"PRIx64
+smmuv3_find_ste(uint16_t sid, uint32_t features, uint16_t sid_split) "SID:0x%x features:0x%x, sid_split:0x%x"
+smmuv3_find_ste_2lvl(uint64_t strtab_base, hwaddr l1ptr, int l1_ste_offset, hwaddr l2ptr, int l2_ste_offset, int max_l2_ste) "strtab_base:0x%"PRIx64" l1ptr:0x%"PRIx64" l1_off:0x%x, l2ptr:0x%"PRIx64" l2_off:0x%x max_l2_ste:%d"
+smmuv3_get_ste(hwaddr addr) "STE addr: 0x%"PRIx64
+smmuv3_translate_bypass(const char *n, uint16_t sid, hwaddr addr, bool is_write) "%s sid=%d bypass iova:0x%"PRIx64" is_write=%d"
+smmuv3_translate_in(uint16_t sid, int pci_bus_num, hwaddr strtab_base) "SID:0x%x bus:%d strtab_base:0x%"PRIx64
+smmuv3_get_cd(hwaddr addr) "CD addr: 0x%"PRIx64
+smmuv3_translate_ok(const char *n, uint16_t sid, hwaddr iova, hwaddr translated, int perm) "%s sid=%d iova=0x%"PRIx64" translated=0x%"PRIx64" perm=0x%x"
+smmuv3_update_qreg(uint32_t cons, uint64_t val) "cons written : %d val:0x%"PRIx64
+smmuv3_write_mmio(hwaddr addr, uint64_t val, unsigned size) "addr: 0x%"PRIx64" val:0x%"PRIx64" size: 0x%x"
+smmuv3_write_mmio_idr(hwaddr addr, uint64_t val) "write to RO/Unimpl reg 0x%"PRIx64" val64:0x%"PRIx64
+smmuv3_write_mmio_evtq_cons_bef_clear(uint32_t prod, uint32_t cons, uint8_t prod_wrap, uint8_t cons_wrap) "Before clearing interrupt prod:0x%x cons:0x%x prod.w:%d cons.w:%d"
+smmuv3_write_mmio_evtq_cons_after_clear(uint32_t prod, uint32_t cons, uint8_t prod_wrap, uint8_t cons_wrap) "after clearing interrupt prod:0x%x cons:0x%x prod.w:%d cons.w:%d"
+smmuv3_read_mmio(hwaddr addr, uint64_t val, unsigned size) "addr: 0x%"PRIx64" val:0x%"PRIx64" size: 0x%x"
+smmuv3_dump_ste(int i, uint32_t word0, int j,  uint32_t word1) "STE[%2d]: 0x%x\t STE[%2d]: 0x%x"
+smmuv3_dump_cd(int i, uint32_t word0, int j,  uint32_t word1) "CD[%2d]: 0x%x\t CD[%2d]: 0x%x"
+smmuv3_dump_cmd(int i, uint32_t word0, int j,  uint32_t word1) "CMD[%2d]: 0x%x\t CMD[%2d]: 0x%x"
+smmuv3_cfg_stage(int s, uint32_t oas, uint32_t tsz, uint64_t ttbr, bool aa64, uint32_t granule_sz, int initial_level) "TransCFG stage:%d oas:%d tsz:%d ttbr:0x%"PRIx64"  aa64:%d granule_sz:%d, initial_level = %d"
diff --git a/include/hw/arm/smmuv3.h b/include/hw/arm/smmuv3.h
new file mode 100644
index 0000000..bdeea1b
--- /dev/null
+++ b/include/hw/arm/smmuv3.h
@@ -0,0 +1,89 @@
+/*
+ * Copyright (C) 2014-2016 Broadcom Corporation
+ * Copyright (c) 2017 Red Hat, Inc.
+ * Written by Prem Mallappa, Eric Auger
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation, either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef HW_ARM_SMMUV3_H
+#define HW_ARM_SMMUV3_H
+
+#include "hw/arm/smmu-common.h"
+
+#define TYPE_SMMUV3_IOMMU_MEMORY_REGION "smmuv3-iommu-iommu-memory-region"
+
+#define SMMU_NREGS            0x200
+
+typedef struct SMMUQueue {
+     hwaddr base;
+     uint32_t prod;
+     uint32_t cons;
+     union {
+          struct {
+               uint8_t prod:1;
+               uint8_t cons:1;
+          };
+          uint8_t unused;
+     } wrap;
+
+     uint16_t entries;           /* Number of entries */
+     uint8_t  ent_size;          /* Size of entry in bytes */
+     uint8_t  shift;             /* Size in log2 */
+} SMMUQueue;
+
+typedef struct SMMUV3State {
+    SMMUState     smmu_state;
+
+#define SMMU_FEATURE_2LVL_STE (1 << 0)
+    /* Local cache of most-frequently used register */
+    uint32_t     features;
+    uint16_t     sid_size;
+    uint16_t     sid_split;
+    uint64_t     strtab_base;
+
+    uint32_t    regs[SMMU_NREGS];
+
+    qemu_irq     irq[4];
+
+    SMMUQueue    cmdq, evtq, priq;
+
+    /* IOMMU Address space */
+    MemoryRegion iommu;
+    AddressSpace iommu_as;
+    /*
+     * Bus number is not populated in the beginning, hence we need
+     * a mechanism to retrieve the corresponding address space for each
+     * pci device.
+    */
+    GHashTable   *smmu_as_by_busptr;
+} SMMUV3State;
+
+typedef enum {
+    SMMU_IRQ_GERROR,
+    SMMU_IRQ_PRIQ,
+    SMMU_IRQ_EVTQ,
+    SMMU_IRQ_CMD_SYNC,
+} SMMUIrq;
+
+typedef struct {
+    SMMUBaseClass smmu_base_class;
+} SMMUV3Class;
+
+#define TYPE_SMMU_V3_DEV   "smmuv3"
+#define SMMU_V3_DEV(obj) OBJECT_CHECK(SMMUV3State, (obj), TYPE_SMMU_V3_DEV)
+#define SMMU_V3_DEVICE_GET_CLASS(obj)                              \
+    OBJECT_GET_CLASS(SMMUBaseClass, (obj), TYPE_SMMU_V3_DEV)
+
+#endif
-- 
2.5.5


* [Qemu-devel] [RFC v6 3/9] hw/arm/virt: Add SMMUv3 to the virt board
  2017-08-11 14:22 [Qemu-devel] [RFC v6 0/9] ARM SMMUv3 Emulation Support Eric Auger
  2017-08-11 14:22 ` [Qemu-devel] [RFC v6 1/9] hw/arm/smmu-common: smmu base class Eric Auger
  2017-08-11 14:22 ` [Qemu-devel] [RFC v6 2/9] hw/arm/smmuv3: smmuv3 emulation model Eric Auger
@ 2017-08-11 14:22 ` Eric Auger
  2017-08-11 14:22 ` [Qemu-devel] [RFC v6 4/9] hw/arm/virt: Add 2.11 machine type Eric Auger
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 15+ messages in thread
From: Eric Auger @ 2017-08-11 14:22 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, qemu-arm, qemu-devel,
	alex.williamson, prem.mallappa
  Cc: drjones, christoffer.dall, Radha.Chintakuntla, Sunil.Goutham,
	mohun106, tcain, bharat.bhushan, tn, mst, will.deacon,
	jean-philippe.brucker, robin.murphy, peterx, edgar.iglesias

From: Prem Mallappa <prem.mallappa@broadcom.com>

Add code to instantiate an smmu-v3 in mach-virt. A new boolean flag
is introduced in VirtMachineState to guard this instantiation. Nothing
sets it yet, so it is currently always false.

Signed-off-by: Prem Mallappa <prem.mallappa@broadcom.com>
Signed-off-by: Eric Auger <eric.auger@redhat.com>

---
v4 -> v5:
- add dma-coherent property

v2 -> v3:
- vbi was removed. Use vms instead
- migrate to new smmu binding format (iommu-map)
- don't use appendprop anymore
- add vms->smmu and guard instantiation with it
- interrupts type changed to edge
---
 hw/arm/smmuv3.c         |  5 +++--
 hw/arm/virt.c           | 59 +++++++++++++++++++++++++++++++++++++++++++++++++
 include/hw/arm/smmuv3.h |  2 +-
 include/hw/arm/virt.h   |  4 ++++
 4 files changed, 67 insertions(+), 3 deletions(-)

diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index a3199f1..e195a0e 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -1026,8 +1026,9 @@ static AddressSpace *smmu_find_add_as(PCIBus *bus, void *opaque, int devfn)
 
     sdev = sbus->pbdev[devfn];
     if (!sdev) {
-        char *name = g_strdup_printf("%s-%d-%d", TYPE_SMMU_V3_DEV,
-                                      pci_bus_num(bus), devfn);
+        char *name = g_strdup_printf("%s-%d-%d",
+                                     TYPE_SMMUV3_IOMMU_MEMORY_REGION,
+                                     pci_bus_num(bus), devfn);
         sdev = sbus->pbdev[devfn] = g_malloc0(sizeof(SMMUDevice));
 
         sdev->smmu = s;
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 6b7a0fe..b9246b9 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -56,6 +56,7 @@
 #include "hw/smbios/smbios.h"
 #include "qapi/visitor.h"
 #include "standard-headers/linux/input.h"
+#include "hw/arm/smmuv3.h"
 
 #define DEFINE_VIRT_MACHINE_LATEST(major, minor, latest) \
     static void virt_##major##_##minor##_class_init(ObjectClass *oc, \
@@ -139,6 +140,7 @@ static const MemMapEntry a15memmap[] = {
     [VIRT_FW_CFG] =             { 0x09020000, 0x00000018 },
     [VIRT_GPIO] =               { 0x09030000, 0x00001000 },
     [VIRT_SECURE_UART] =        { 0x09040000, 0x00001000 },
+    [VIRT_SMMU] =               { 0x09050000, 0x00020000 }, /* 128K, needed */
     [VIRT_MMIO] =               { 0x0a000000, 0x00000200 },
     /* ...repeating for a total of NUM_VIRTIO_TRANSPORTS, each of that size */
     [VIRT_PLATFORM_BUS] =       { 0x0c000000, 0x02000000 },
@@ -159,6 +161,7 @@ static const int a15irqmap[] = {
     [VIRT_SECURE_UART] = 8,
     [VIRT_MMIO] = 16, /* ...to 16 + NUM_VIRTIO_TRANSPORTS - 1 */
     [VIRT_GIC_V2M] = 48, /* ...to 48 + NUM_GICV2M_SPIS - 1 */
+    [VIRT_SMMU] = 74,    /* ...to 74 + NUM_SMMU_IRQS - 1 */
     [VIRT_PLATFORM_BUS] = 112, /* ...to 112 + PLATFORM_BUS_NUM_IRQS -1 */
 };
 
@@ -991,6 +994,53 @@ static void create_pcie_irq_map(const VirtMachineState *vms,
                            0x7           /* PCI irq */);
 }
 
+static void alloc_smmu_phandle(VirtMachineState *vms)
+{
+    if (vms->smmu && !vms->smmu_phandle) {
+        vms->smmu_phandle = qemu_fdt_alloc_phandle(vms->fdt);
+    }
+}
+
+static void create_smmu(const VirtMachineState *vms, qemu_irq *pic)
+{
+    char *smmu;
+    const char compat[] = "arm,smmu-v3";
+    int irq =  vms->irqmap[VIRT_SMMU];
+    hwaddr base = vms->memmap[VIRT_SMMU].base;
+    hwaddr size = vms->memmap[VIRT_SMMU].size;
+    const char irq_names[] = "eventq\0priq\0cmdq-sync\0gerror";
+
+    if (!vms->smmu) {
+        return;
+    }
+
+    sysbus_create_varargs("smmuv3", base, pic[irq], pic[irq + 1],
+                          pic[irq + 2], pic[irq + 3], NULL);
+
+    smmu = g_strdup_printf("/smmuv3@%" PRIx64, base);
+    qemu_fdt_add_subnode(vms->fdt, smmu);
+    qemu_fdt_setprop(vms->fdt, smmu, "compatible", compat, sizeof(compat));
+    qemu_fdt_setprop_sized_cells(vms->fdt, smmu, "reg", 2, base, 2, size);
+
+    qemu_fdt_setprop_cells(vms->fdt, smmu, "interrupts",
+            GIC_FDT_IRQ_TYPE_SPI, irq    , GIC_FDT_IRQ_FLAGS_EDGE_LO_HI,
+            GIC_FDT_IRQ_TYPE_SPI, irq + 1, GIC_FDT_IRQ_FLAGS_EDGE_LO_HI,
+            GIC_FDT_IRQ_TYPE_SPI, irq + 2, GIC_FDT_IRQ_FLAGS_EDGE_LO_HI,
+            GIC_FDT_IRQ_TYPE_SPI, irq + 3, GIC_FDT_IRQ_FLAGS_EDGE_LO_HI);
+
+    qemu_fdt_setprop(vms->fdt, smmu, "interrupt-names", irq_names,
+                     sizeof(irq_names));
+
+    qemu_fdt_setprop_cell(vms->fdt, smmu, "clocks", vms->clock_phandle);
+    qemu_fdt_setprop_string(vms->fdt, smmu, "clock-names", "apb_pclk");
+    qemu_fdt_setprop(vms->fdt, smmu, "dma-coherent", NULL, 0);
+
+    qemu_fdt_setprop_cell(vms->fdt, smmu, "#iommu-cells", 1);
+
+    qemu_fdt_setprop_cell(vms->fdt, smmu, "phandle", vms->smmu_phandle);
+    g_free(smmu);
+}
+
 static void create_pcie(const VirtMachineState *vms, qemu_irq *pic)
 {
     hwaddr base_mmio = vms->memmap[VIRT_PCIE_MMIO].base;
@@ -1103,6 +1153,11 @@ static void create_pcie(const VirtMachineState *vms, qemu_irq *pic)
     qemu_fdt_setprop_cell(vms->fdt, nodename, "#interrupt-cells", 1);
     create_pcie_irq_map(vms, vms->gic_phandle, irq, nodename);
 
+    if (vms->smmu) {
+        qemu_fdt_setprop_cells(vms->fdt, nodename, "iommu-map",
+                               0x0, vms->smmu_phandle, 0x0, 0x10000);
+    }
+
     g_free(nodename);
 }
 
@@ -1448,8 +1503,12 @@ static void machvirt_init(MachineState *machine)
 
     create_rtc(vms, pic);
 
+    alloc_smmu_phandle(vms);
+
     create_pcie(vms, pic);
 
+    create_smmu(vms, pic);
+
     create_gpio(vms, pic);
 
     /* Create mmio transports, so the user can create virtio backends
diff --git a/include/hw/arm/smmuv3.h b/include/hw/arm/smmuv3.h
index bdeea1b..dbc5a57 100644
--- a/include/hw/arm/smmuv3.h
+++ b/include/hw/arm/smmuv3.h
@@ -22,7 +22,7 @@
 
 #include "hw/arm/smmu-common.h"
 
-#define TYPE_SMMUV3_IOMMU_MEMORY_REGION "smmuv3-iommu-iommu-memory-region"
+#define TYPE_SMMUV3_IOMMU_MEMORY_REGION "smmuv3-iommu-memory-region"
 
 #define SMMU_NREGS            0x200
 
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index 33b0ff3..164a531 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -38,6 +38,7 @@
 
 #define NUM_GICV2M_SPIS       64
 #define NUM_VIRTIO_TRANSPORTS 32
+#define NUM_SMMU_IRQS          4
 
 #define ARCH_GICV3_MAINT_IRQ  9
 
@@ -59,6 +60,7 @@ enum {
     VIRT_GIC_V2M,
     VIRT_GIC_ITS,
     VIRT_GIC_REDIST,
+    VIRT_SMMU,
     VIRT_UART,
     VIRT_MMIO,
     VIRT_RTC,
@@ -95,6 +97,7 @@ typedef struct {
     bool highmem;
     bool its;
     bool virt;
+    bool smmu;
     int32_t gic_version;
     struct arm_boot_info bootinfo;
     const MemMapEntry *memmap;
@@ -105,6 +108,7 @@ typedef struct {
     uint32_t clock_phandle;
     uint32_t gic_phandle;
     uint32_t msi_phandle;
+    uint32_t smmu_phandle;
     int psci_conduit;
 } VirtMachineState;
 
-- 
2.5.5


* [Qemu-devel] [RFC v6 4/9] hw/arm/virt: Add 2.11 machine type
  2017-08-11 14:22 [Qemu-devel] [RFC v6 0/9] ARM SMMUv3 Emulation Support Eric Auger
                   ` (2 preceding siblings ...)
  2017-08-11 14:22 ` [Qemu-devel] [RFC v6 3/9] hw/arm/virt: Add SMMUv3 to the virt board Eric Auger
@ 2017-08-11 14:22 ` Eric Auger
  2017-08-11 14:22 ` [Qemu-devel] [RFC v6 5/9] hw/arm/virt-acpi-build: Add smmuv3 node in IORT table Eric Auger
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 15+ messages in thread
From: Eric Auger @ 2017-08-11 14:22 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, qemu-arm, qemu-devel,
	alex.williamson, prem.mallappa
  Cc: drjones, christoffer.dall, Radha.Chintakuntla, Sunil.Goutham,
	mohun106, tcain, bharat.bhushan, tn, mst, will.deacon,
	jean-philippe.brucker, robin.murphy, peterx, edgar.iglesias

The new machine type allows smmuv3 instantiation. A new option
is introduced to turn the feature on/off (off by default).

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---

v5 -> v6: machine 2_11

An alternative would be to use the -device option, as is
done on x86. As the smmu is a sysbus device, we would need to
use the platform bus framework.
---
 hw/arm/virt.c         | 50 ++++++++++++++++++++++++++++++++++++++++++++++++--
 include/hw/arm/virt.h |  1 +
 include/hw/compat.h   |  3 +++
 3 files changed, 52 insertions(+), 2 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index b9246b9..b758173 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -1543,6 +1543,20 @@ static void machvirt_init(MachineState *machine)
     create_platform_bus(vms, pic);
 }
 
+static bool virt_get_smmu(Object *obj, Error **errp)
+{
+    VirtMachineState *vms = VIRT_MACHINE(obj);
+
+    return vms->smmu;
+}
+
+static void virt_set_smmu(Object *obj, bool value, Error **errp)
+{
+    VirtMachineState *vms = VIRT_MACHINE(obj);
+
+    vms->smmu = value;
+}
+
 static bool virt_get_secure(Object *obj, Error **errp)
 {
     VirtMachineState *vms = VIRT_MACHINE(obj);
@@ -1698,7 +1712,7 @@ static void machvirt_machine_init(void)
 }
 type_init(machvirt_machine_init);
 
-static void virt_2_10_instance_init(Object *obj)
+static void virt_2_11_instance_init(Object *obj)
 {
     VirtMachineState *vms = VIRT_MACHINE(obj);
     VirtMachineClass *vmc = VIRT_MACHINE_GET_CLASS(vms);
@@ -1754,14 +1768,46 @@ static void virt_2_10_instance_init(Object *obj)
                                         NULL);
     }
 
+    if (vmc->no_smmu) {
+        vms->smmu = false;
+    } else {
+        /* Default disallows smmu instantiation */
+        vms->smmu = false;
+        object_property_add_bool(obj, "smmu", virt_get_smmu,
+                                 virt_set_smmu, NULL);
+        object_property_set_description(obj, "smmu",
+                                        "Set on/off to enable/disable "
+                                        "smmu instantiation (default off)",
+                                        NULL);
+    }
+
     vms->memmap = a15memmap;
     vms->irqmap = a15irqmap;
 }
 
+static void virt_machine_2_11_options(MachineClass *mc)
+{
+}
+DEFINE_VIRT_MACHINE_AS_LATEST(2, 11)
+
+#define VIRT_COMPAT_2_10 \
+    HW_COMPAT_2_10
+
+static void virt_2_10_instance_init(Object *obj)
+{
+    virt_2_11_instance_init(obj);
+}
+
 static void virt_machine_2_10_options(MachineClass *mc)
 {
+    VirtMachineClass *vmc = VIRT_MACHINE_CLASS(OBJECT_CLASS(mc));
+
+    virt_machine_2_11_options(mc);
+    SET_MACHINE_COMPAT(mc, VIRT_COMPAT_2_10);
+
+    vmc->no_smmu = true;
 }
-DEFINE_VIRT_MACHINE_AS_LATEST(2, 10)
+DEFINE_VIRT_MACHINE(2, 10)
 
 #define VIRT_COMPAT_2_9 \
     HW_COMPAT_2_9
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index 164a531..cd2c82e 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -86,6 +86,7 @@ typedef struct {
     bool disallow_affinity_adjustment;
     bool no_its;
     bool no_pmu;
+    bool no_smmu;
     bool claim_edge_triggered_timers;
 } VirtMachineClass;
 
diff --git a/include/hw/compat.h b/include/hw/compat.h
index 08f3600..3e101f8 100644
--- a/include/hw/compat.h
+++ b/include/hw/compat.h
@@ -1,6 +1,9 @@
 #ifndef HW_COMPAT_H
 #define HW_COMPAT_H
 
+#define HW_COMPAT_2_10 \
+    /* empty */
+
 #define HW_COMPAT_2_9 \
     {\
         .driver   = "pci-bridge",\
-- 
2.5.5


* [Qemu-devel] [RFC v6 5/9] hw/arm/virt-acpi-build: Add smmuv3 node in IORT table
  2017-08-11 14:22 [Qemu-devel] [RFC v6 0/9] ARM SMMUv3 Emulation Support Eric Auger
                   ` (3 preceding siblings ...)
  2017-08-11 14:22 ` [Qemu-devel] [RFC v6 4/9] hw/arm/virt: Add 2.11 machine type Eric Auger
@ 2017-08-11 14:22 ` Eric Auger
  2017-08-11 14:22 ` [Qemu-devel] [RFC v6 6/9] hw/arm/virt: Add tlbi-on-map property to the smmuv3 node Eric Auger
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 15+ messages in thread
From: Eric Auger @ 2017-08-11 14:22 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, qemu-arm, qemu-devel,
	alex.williamson, prem.mallappa
  Cc: drjones, christoffer.dall, Radha.Chintakuntla, Sunil.Goutham,
	mohun106, tcain, bharat.bhushan, tn, mst, will.deacon,
	jean-philippe.brucker, robin.murphy, peterx, edgar.iglesias

From: Prem Mallappa <prem.mallappa@broadcom.com>

This patch builds the smmuv3 node in the ACPI IORT table.

The RID space of the root complex, which spans 0x0-0x10000
maps to streamid space 0x0-0x10000 in smmuv3, which in turn
maps to deviceid space 0x0-0x10000 in the ITS group.
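
The chain of identity mappings described above can be sketched as
follows. This is only an illustration: IdMapping, id_map() and
rid_to_deviceid() are hypothetical names, not QEMU's AcpiIortIdMapping
or any QEMU helper, and the sketch assumes id_count holds the number of
IDs in the range minus one, as the 0xFFFF value in the patch suggests.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified ID mapping (not QEMU's AcpiIortIdMapping). */
typedef struct {
    uint32_t input_base;
    uint32_t id_count;    /* assumed: number of IDs in the range minus one */
    uint32_t output_base;
} IdMapping;

/* Map one input ID through a single mapping; false if out of range. */
static bool id_map(const IdMapping *m, uint32_t in, uint32_t *out)
{
    if (in < m->input_base || in - m->input_base > m->id_count) {
        return false;
    }
    *out = m->output_base + (in - m->input_base);
    return true;
}

/* RC node -> SMMUv3 node -> ITS group node, both identity mappings. */
static bool rid_to_deviceid(uint32_t rid, uint32_t *deviceid)
{
    const IdMapping rc_to_smmu  = { 0, 0xFFFF, 0 };
    const IdMapping smmu_to_its = { 0, 0xFFFF, 0 };
    uint32_t streamid;

    return id_map(&rc_to_smmu, rid, &streamid) &&
           id_map(&smmu_to_its, streamid, deviceid);
}
```

With both mappings being identity mappings, a RID, its streamid and its
deviceid are all the same number; anything above 0xFFFF is rejected.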

The guest must feature the IOMMU probe deferral series
(https://lkml.org/lkml/2017/4/10/214) which fixes streamid
multiple lookup. This bug is not related to the SMMU emulation.

Signed-off-by: Prem Mallappa <prem.mallappa@broadcom.com>
Signed-off-by: Eric Auger <eric.auger@redhat.com>

---

v2 -> v3:
- integrate into the existing IORT table made up of ITS, RC nodes
- take into account vms->smmu
- match linux actbl2.h acpi_iort_smmu_v3 field names
---
 hw/arm/virt-acpi-build.c    | 56 +++++++++++++++++++++++++++++++++++++++------
 include/hw/acpi/acpi-defs.h | 15 ++++++++++++
 2 files changed, 64 insertions(+), 7 deletions(-)

diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 3d78ff6..ac2cd3e 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -393,19 +393,26 @@ build_rsdp(GArray *rsdp_table, BIOSLinker *linker, unsigned xsdt_tbl_offset)
 }
 
 static void
-build_iort(GArray *table_data, BIOSLinker *linker)
+build_iort(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
 {
-    int iort_start = table_data->len;
+    int nb_nodes, iort_start = table_data->len;
     AcpiIortIdMapping *idmap;
     AcpiIortItsGroup *its;
     AcpiIortTable *iort;
-    size_t node_size, iort_length;
+    AcpiIortSmmu3 *smmu;
+    size_t node_size, iort_length, smmu_offset = 0;
     AcpiIortRC *rc;
 
     iort = acpi_data_push(table_data, sizeof(*iort));
 
+    if (vms->smmu) {
+        nb_nodes = 3; /* RC, ITS, SMMUv3 */
+    } else {
+        nb_nodes = 2; /* RC, ITS */
+    }
+
     iort_length = sizeof(*iort);
-    iort->node_count = cpu_to_le32(2); /* RC and ITS nodes */
+    iort->node_count = cpu_to_le32(nb_nodes);
     iort->node_offset = cpu_to_le32(sizeof(*iort));
 
     /* ITS group node */
@@ -418,6 +425,35 @@ build_iort(GArray *table_data, BIOSLinker *linker)
     its->its_count = cpu_to_le32(1);
     its->identifiers[0] = 0; /* MADT translation_id */
 
+    if (vms->smmu) {
+        int irq =  vms->irqmap[VIRT_SMMU];
+
+        /* SMMUv3 node */
+        smmu_offset = sizeof(*iort) + node_size; /* host-endian offset */
+        node_size = sizeof(*smmu) + sizeof(*idmap);
+        iort_length += node_size;
+        smmu = acpi_data_push(table_data, node_size);
+
+
+        smmu->type = ACPI_IORT_NODE_SMMU_V3;
+        smmu->length = cpu_to_le16(node_size);
+        smmu->mapping_count = cpu_to_le32(1);
+        smmu->mapping_offset = cpu_to_le32(sizeof(*smmu));
+        smmu->base_address = cpu_to_le64(vms->memmap[VIRT_SMMU].base);
+        smmu->event_gsiv = cpu_to_le32(irq);
+        smmu->pri_gsiv = cpu_to_le32(irq + 1);
+        smmu->gerr_gsiv = cpu_to_le32(irq + 2);
+        smmu->sync_gsiv = cpu_to_le32(irq + 3);
+
+        /* Identity RID mapping covering the whole input RID range */
+        idmap = &smmu->id_mapping_array[0];
+        idmap->input_base = 0;
+        idmap->id_count = cpu_to_le32(0xFFFF);
+        idmap->output_base = 0;
+        /* output IORT node is the ITS group node (the first node) */
+        idmap->output_reference = cpu_to_le32(iort->node_offset);
+    }
+
     /* Root Complex Node */
     node_size = sizeof(*rc) + sizeof(*idmap);
     iort_length += node_size;
@@ -438,8 +474,14 @@ build_iort(GArray *table_data, BIOSLinker *linker)
     idmap->input_base = 0;
     idmap->id_count = cpu_to_le32(0xFFFF);
     idmap->output_base = 0;
-    /* output IORT node is the ITS group node (the first node) */
-    idmap->output_reference = cpu_to_le32(iort->node_offset);
+
+    if (vms->smmu) {
+        /* output IORT node is the smmuv3 node */
+        idmap->output_reference = cpu_to_le32(smmu_offset);
+    } else {
+        /* output IORT node is the ITS group node (the first node) */
+        idmap->output_reference = cpu_to_le32(iort->node_offset);
+    }
 
     iort->length = cpu_to_le32(iort_length);
 
@@ -782,7 +824,7 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
 
     if (its_class_name() && !vmc->no_its) {
         acpi_add_table(table_offsets, tables_blob);
-        build_iort(tables_blob, tables->linker);
+        build_iort(tables_blob, tables->linker, vms);
     }
 
     /* XSDT is pointed to by RSDP */
diff --git a/include/hw/acpi/acpi-defs.h b/include/hw/acpi/acpi-defs.h
index 72be675..69307b7 100644
--- a/include/hw/acpi/acpi-defs.h
+++ b/include/hw/acpi/acpi-defs.h
@@ -697,6 +697,21 @@ struct AcpiIortItsGroup {
 } QEMU_PACKED;
 typedef struct AcpiIortItsGroup AcpiIortItsGroup;
 
+struct AcpiIortSmmu3 {
+    ACPI_IORT_NODE_HEADER_DEF
+    uint64_t base_address;
+    uint32_t flags;
+    uint32_t reserved2;
+    uint64_t vatos_address;
+    uint32_t model;
+    uint32_t event_gsiv;
+    uint32_t pri_gsiv;
+    uint32_t gerr_gsiv;
+    uint32_t sync_gsiv;
+    AcpiIortIdMapping id_mapping_array[0];
+} QEMU_PACKED;
+typedef struct AcpiIortSmmu3 AcpiIortSmmu3;
+
 struct AcpiIortRC {
     ACPI_IORT_NODE_HEADER_DEF
     AcpiIortMemoryAccess memory_properties;
-- 
2.5.5


* [Qemu-devel] [RFC v6 6/9] hw/arm/virt: Add tlbi-on-map property to the smmuv3 node
  2017-08-11 14:22 [Qemu-devel] [RFC v6 0/9] ARM SMMUv3 Emulation Support Eric Auger
                   ` (4 preceding siblings ...)
  2017-08-11 14:22 ` [Qemu-devel] [RFC v6 5/9] hw/arm/virt-acpi-build: Add smmuv3 node in IORT table Eric Auger
@ 2017-08-11 14:22 ` Eric Auger
  2017-08-11 14:22 ` [Qemu-devel] [RFC v6 7/9] target/arm/kvm: Translate the MSI doorbell in kvm_arch_fixup_msi_route Eric Auger
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 15+ messages in thread
From: Eric Auger @ 2017-08-11 14:22 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, qemu-arm, qemu-devel,
	alex.williamson, prem.mallappa
  Cc: drjones, christoffer.dall, Radha.Chintakuntla, Sunil.Goutham,
	mohun106, tcain, bharat.bhushan, tn, mst, will.deacon,
	jean-philippe.brucker, robin.murphy, peterx, edgar.iglesias

For VFIO integration we need to update physical IOMMU mappings
each time the guest updates the vIOMMU translation structures.
For that, we rely on a special smmuv3 option, "tlbi-on-map"
which forces TLB invalidations on map (this mode is similar to
the Intel VT-d Caching Mode). The smmuv3 driver then sends
SMMU_CMD_TLBI_NH_VA commands, upon which we will update the physical
mappings.
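
As a rough illustration of the idea, here is a toy model in which an
invalidation issued on map is what propagates a guest mapping to the
physical side. All names (PageTable, guest_pt, shadow_pt, on_tlbi_va,
guest_map, NPAGES) are hypothetical and the model is deliberately much
simpler than the real SMMUv3 or VFIO code.

```c
#include <stdbool.h>
#include <stdint.h>

#define NPAGES 16  /* hypothetical toy IOVA space, in pages */

typedef struct {
    bool     valid[NPAGES];
    uint64_t pa[NPAGES];
} PageTable;

static PageTable guest_pt;   /* what the guest driver edits */
static PageTable shadow_pt;  /* what the physical IOMMU would hold */

/* Emulated SMMU sees an invalidation for one page: unmap, then replay. */
static void on_tlbi_va(unsigned page)
{
    if (page >= NPAGES) {
        return;
    }
    shadow_pt.valid[page] = false;
    if (guest_pt.valid[page]) {
        shadow_pt.pa[page] = guest_pt.pa[page];
        shadow_pt.valid[page] = true;
    }
}

/* Guest driver maps a page; with the quirk it also invalidates on map. */
static void guest_map(unsigned page, uint64_t pa, bool tlbi_on_map)
{
    if (page >= NPAGES) {
        return;
    }
    guest_pt.pa[page] = pa;
    guest_pt.valid[page] = true;
    if (tlbi_on_map) {
        on_tlbi_va(page);
    }
}
```

Without the quirk the emulation never hears about the new guest mapping,
so the shadow (physical) table stays stale; with it, every map is
followed by an invalidation that triggers the update.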

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 hw/arm/virt.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index b758173..c2ac8c6 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -1034,6 +1034,7 @@ static void create_smmu(const VirtMachineState *vms, qemu_irq *pic)
     qemu_fdt_setprop_cell(vms->fdt, smmu, "clocks", vms->clock_phandle);
     qemu_fdt_setprop_string(vms->fdt, smmu, "clock-names", "apb_pclk");
     qemu_fdt_setprop(vms->fdt, smmu, "dma-coherent", NULL, 0);
+    qemu_fdt_setprop(vms->fdt, smmu, "tlbi-on-map", NULL, 0);
 
     qemu_fdt_setprop_cell(vms->fdt, smmu, "#iommu-cells", 1);
 
-- 
2.5.5


* [Qemu-devel] [RFC v6 7/9] target/arm/kvm: Translate the MSI doorbell in kvm_arch_fixup_msi_route
  2017-08-11 14:22 [Qemu-devel] [RFC v6 0/9] ARM SMMUv3 Emulation Support Eric Auger
                   ` (5 preceding siblings ...)
  2017-08-11 14:22 ` [Qemu-devel] [RFC v6 6/9] hw/arm/virt: Add tlbi-on-map property to the smmuv3 node Eric Auger
@ 2017-08-11 14:22 ` Eric Auger
  2017-08-11 14:22 ` [Qemu-devel] [RFC v6 8/9] hw/arm/smmuv3: VFIO integration Eric Auger
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 15+ messages in thread
From: Eric Auger @ 2017-08-11 14:22 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, qemu-arm, qemu-devel,
	alex.williamson, prem.mallappa
  Cc: drjones, christoffer.dall, Radha.Chintakuntla, Sunil.Goutham,
	mohun106, tcain, bharat.bhushan, tn, mst, will.deacon,
	jean-philippe.brucker, robin.murphy, peterx, edgar.iglesias

In case the MSI is translated by an IOMMU we need to fixup the
MSI route with the translated address.
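
A minimal sketch of the fixup, assuming a translation callback that
turns the doorbell IOVA into a guest physical address. MsiRouteAddr,
TranslateFn, fixup_msi_addr() and example_translate() are hypothetical
stand-ins, not the KVM routing entry or the IOMMUMemoryRegionClass API.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the address fields of an MSI route. */
typedef struct {
    uint32_t address_lo;
    uint32_t address_hi;
} MsiRouteAddr;

/* Hypothetical translation callback: doorbell IOVA in, guest PA out. */
typedef uint64_t (*TranslateFn)(uint64_t iova);

/*
 * If the device sits behind an IOMMU (translate != NULL), store the
 * translated doorbell address in the route; otherwise use it as is.
 */
static void fixup_msi_addr(MsiRouteAddr *r, uint64_t doorbell_iova,
                           TranslateFn translate)
{
    uint64_t gpa = translate ? translate(doorbell_iova) : doorbell_iova;

    r->address_lo = (uint32_t)gpa;
    r->address_hi = (uint32_t)(gpa >> 32);
}

/* Toy translation used for illustration only: fixed 4GiB offset. */
static uint64_t example_translate(uint64_t iova)
{
    return iova + 0x100000000ULL;
}
```

The 64-bit translated address is split into its low and high 32-bit
halves, mirroring the address_lo/address_hi fields of the route.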

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---

v5 -> v6:
- use IOMMUMemoryRegionClass API

It is still unclear to me whether we need to register an IOMMUNotifier
to handle any change in the MSI doorbell which would occur behind
the scenes and would not lead to any call to kvm_arch_fixup_msi_route().
---
 target/arm/kvm.c        | 27 +++++++++++++++++++++++++++
 target/arm/trace-events |  3 +++
 2 files changed, 30 insertions(+)

diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index 7c17f0d..a2fa948 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -20,8 +20,13 @@
 #include "sysemu/kvm.h"
 #include "kvm_arm.h"
 #include "cpu.h"
+#include "trace.h"
 #include "internals.h"
 #include "hw/arm/arm.h"
+#include "hw/pci/pci.h"
+#include "hw/pci/msi.h"
+#include "hw/arm/smmu-common.h"
+#include "hw/arm/smmuv3.h"
 #include "exec/memattrs.h"
 #include "exec/address-spaces.h"
 #include "hw/boards.h"
@@ -662,6 +667,28 @@ int kvm_arm_vgic_probe(void)
 int kvm_arch_fixup_msi_route(struct kvm_irq_routing_entry *route,
                              uint64_t address, uint32_t data, PCIDevice *dev)
 {
+    AddressSpace *as = pci_device_iommu_address_space(dev);
+    IOMMUMemoryRegionClass *imrc;
+    IOMMUTLBEntry entry;
+    SMMUDevice *sdev;
+
+    if (as == &address_space_memory) {
+        return 0;
+    }
+
+    /* MSI doorbell address is translated by an IOMMU */
+    sdev = container_of(as, SMMUDevice, as);
+    imrc = IOMMU_MEMORY_REGION_GET_CLASS(&sdev->iommu);
+
+    entry = imrc->translate(&sdev->iommu, address, IOMMU_WO);
+
+    route->u.msi.address_lo = entry.translated_addr;
+    route->u.msi.address_hi = entry.translated_addr >> 32;
+
+    trace_kvm_arm_fixup_msi_route(address, sdev->devfn,
+                                  sdev->iommu.parent_obj.name,
+                                  entry.translated_addr);
+
     return 0;
 }
 
diff --git a/target/arm/trace-events b/target/arm/trace-events
index 9e37131..8b3c220 100644
--- a/target/arm/trace-events
+++ b/target/arm/trace-events
@@ -8,3 +8,6 @@ arm_gt_tval_write(int timer, uint64_t value) "gt_tval_write: timer %d value 0x%"
 arm_gt_ctl_write(int timer, uint64_t value) "gt_ctl_write: timer %d value 0x%" PRIx64
 arm_gt_imask_toggle(int timer, int irqstate) "gt_ctl_write: timer %d IMASK toggle, new irqstate %d"
 arm_gt_cntvoff_write(uint64_t value) "gt_cntvoff_write: value 0x%" PRIx64
+
+# target/arm/kvm.c
+kvm_arm_fixup_msi_route(uint64_t iova, uint32_t devid, const char *name, uint64_t gpa) "MSI addr = 0x%"PRIx64" is translated for devfn=%d through %s into 0x%"PRIx64
-- 
2.5.5


* [Qemu-devel] [RFC v6 8/9] hw/arm/smmuv3: VFIO integration
  2017-08-11 14:22 [Qemu-devel] [RFC v6 0/9] ARM SMMUv3 Emulation Support Eric Auger
                   ` (6 preceding siblings ...)
  2017-08-11 14:22 ` [Qemu-devel] [RFC v6 7/9] target/arm/kvm: Translate the MSI doorbell in kvm_arch_fixup_msi_route Eric Auger
@ 2017-08-11 14:22 ` Eric Auger
  2017-08-21  1:21   ` [Qemu-devel] [Qemu-arm] " Linu Cherian
  2017-08-23  4:24   ` Linu Cherian
  2017-08-11 14:22 ` [Qemu-devel] [RFC v6 9/9] hw/arm/virt-acpi-build: Use the ACPI_IORT_SMMU_V3_CACHING_MODE model Eric Auger
  2017-08-11 15:38 ` [Qemu-devel] [RFC v6 0/9] ARM SMMUv3 Emulation Support no-reply
  9 siblings, 2 replies; 15+ messages in thread
From: Eric Auger @ 2017-08-11 14:22 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, qemu-arm, qemu-devel,
	alex.williamson, prem.mallappa
  Cc: drjones, christoffer.dall, Radha.Chintakuntla, Sunil.Goutham,
	mohun106, tcain, bharat.bhushan, tn, mst, will.deacon,
	jean-philippe.brucker, robin.murphy, peterx, edgar.iglesias

This patch allows PCIe passthrough to a guest to which a vSMMUv3
is exposed. It implements the replay and notify_flag_changed
iommu ops. Also, on TLB and data structure invalidation commands,
we replay the mappings so that the physical IOMMU implements the
updated stage 1 settings (Guest IOVA -> Guest PA) + stage 2 settings.

This works only if the guest smmuv3 driver implements the
"tlbi-on-map" option.

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---

v5 -> v6:
- use IOMMUMemoryRegion
- handle implementation defined SMMU_CMD_TLBI_NH_VA_AM cmd
  (goes along with TLBI_ON_MAP FW quirk)
- replay systematically unmaps the whole range first
- smmuv3_map_hook does not unmap anymore and the unmap is done
  before the replay
- add and use smmuv3_context_device_invalidate instead of
  blindly replaying everything
---
 hw/arm/smmuv3-internal.h |   1 +
 hw/arm/smmuv3.c          | 265 ++++++++++++++++++++++++++++++++++++++++++++++-
 hw/arm/trace-events      |  14 +++
 3 files changed, 277 insertions(+), 3 deletions(-)

diff --git a/hw/arm/smmuv3-internal.h b/hw/arm/smmuv3-internal.h
index e255df1..ac4628f 100644
--- a/hw/arm/smmuv3-internal.h
+++ b/hw/arm/smmuv3-internal.h
@@ -344,6 +344,7 @@ enum {
     SMMU_CMD_RESUME          = 0x44,
     SMMU_CMD_STALL_TERM,
     SMMU_CMD_SYNC,          /* 0x46 */
+    SMMU_CMD_TLBI_NH_VA_AM   = 0x8F, /* VIOMMU Impl Defined */
 };
 
 static const char *cmd_stringify[] = {
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index e195a0e..89fb116 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -25,6 +25,7 @@
 #include "exec/address-spaces.h"
 #include "trace.h"
 #include "qemu/error-report.h"
+#include "exec/target_page.h"
 
 #include "hw/arm/smmuv3.h"
 #include "smmuv3-internal.h"
@@ -143,6 +144,71 @@ static MemTxResult smmu_read_cmdq(SMMUV3State *s, Cmd *cmd)
     return ret;
 }
 
+static void smmuv3_replay_all(SMMUState *s)
+{
+    SMMUNotifierNode *node;
+
+    QLIST_FOREACH(node, &s->notifiers_list, next) {
+        trace_smmuv3_replay_all(node->sdev->iommu.parent_obj.name);
+        memory_region_iommu_replay_all(&node->sdev->iommu);
+    }
+}
+
+/* Replay the mappings for a given streamid */
+static void smmuv3_context_device_invalidate(SMMUState *s, uint16_t sid)
+{
+    uint8_t bus_n, devfn;
+    SMMUPciBus *smmu_bus;
+    SMMUDevice *smmu;
+
+    trace_smmuv3_context_device_invalidate(sid);
+    bus_n = PCI_BUS_NUM(sid);
+    smmu_bus = smmu_find_as_from_bus_num(s, bus_n);
+    if (smmu_bus) {
+        devfn = sid & 0xff; /* devfn is the low byte of the streamid */
+        smmu = smmu_bus->pbdev[devfn];
+        if (smmu) {
+            memory_region_iommu_replay_all(&smmu->iommu);
+        }
+    }
+}
+
+static void smmuv3_replay_single(IOMMUMemoryRegion *mr, IOMMUNotifier *n,
+                                 uint64_t iova);
+
+static void smmuv3_replay_range(IOMMUMemoryRegion *mr, IOMMUNotifier *n,
+                                 uint64_t iova, size_t size);
+
+static void smmuv3_notify_single(SMMUState *s, uint64_t iova)
+{
+    SMMUNotifierNode *node;
+
+    QLIST_FOREACH(node, &s->notifiers_list, next) {
+        IOMMUMemoryRegion *mr = &node->sdev->iommu;
+        IOMMUNotifier *n;
+
+        trace_smmuv3_notify_all(node->sdev->iommu.parent_obj.name, iova);
+        IOMMU_NOTIFIER_FOREACH(n, mr) {
+            smmuv3_replay_single(mr, n, iova);
+        }
+    }
+}
+
+static void smmuv3_notify_range(SMMUState *s, uint64_t iova, size_t size)
+{
+    SMMUNotifierNode *node;
+
+    QLIST_FOREACH(node, &s->notifiers_list, next) {
+        IOMMUMemoryRegion *mr = &node->sdev->iommu;
+        IOMMUNotifier *n;
+
+        trace_smmuv3_notify_all(node->sdev->iommu.parent_obj.name, iova);
+        IOMMU_NOTIFIER_FOREACH(n, mr) {
+            smmuv3_replay_range(mr, n, iova, size);
+        }
+    }
+}
+
 static int smmu_cmdq_consume(SMMUV3State *s)
 {
     uint32_t error = SMMU_CMD_ERR_NONE;
@@ -178,28 +244,38 @@ static int smmu_cmdq_consume(SMMUV3State *s)
             break;
         case SMMU_CMD_PREFETCH_CONFIG:
         case SMMU_CMD_PREFETCH_ADDR:
+            break;
         case SMMU_CMD_CFGI_STE:
         {
              uint32_t streamid = cmd.word[1];
 
              trace_smmuv3_cmdq_cfgi_ste(streamid);
-            break;
+             smmuv3_context_device_invalidate(&s->smmu_state, streamid);
+             break;
         }
         case SMMU_CMD_CFGI_STE_RANGE: /* same as SMMU_CMD_CFGI_ALL */
         {
-            uint32_t start = cmd.word[1], range, end;
+            uint32_t start = cmd.word[1], range, end, i;
 
             range = extract32(cmd.word[2], 0, 5);
             end = start + (1 << (range + 1)) - 1;
             trace_smmuv3_cmdq_cfgi_ste_range(start, end);
+            for (i = start; i <= end; i++) {
+                smmuv3_context_device_invalidate(&s->smmu_state, i);
+            }
             break;
         }
         case SMMU_CMD_CFGI_CD:
         case SMMU_CMD_CFGI_CD_ALL:
+        {
+             uint32_t streamid = cmd.word[1];
+
+            smmuv3_context_device_invalidate(&s->smmu_state, streamid);
             break;
+        }
         case SMMU_CMD_TLBI_NH_ALL:
         case SMMU_CMD_TLBI_NH_ASID:
-            printf("%s TLBI* replay\n", __func__);
+            smmuv3_replay_all(&s->smmu_state);
             break;
         case SMMU_CMD_TLBI_NH_VA:
         {
@@ -210,6 +286,20 @@ static int smmu_cmdq_consume(SMMUV3State *s)
             uint64_t addr = high << 32 | (low << 12);
 
             trace_smmuv3_cmdq_tlbi_nh_va(asid, vmid, addr);
+            smmuv3_notify_single(&s->smmu_state, addr);
+            break;
+        }
+        case SMMU_CMD_TLBI_NH_VA_AM:
+        {
+            int asid = extract32(cmd.word[1], 16, 16);
+            int am = extract32(cmd.word[1], 0, 16);
+            uint64_t low = extract32(cmd.word[2], 12, 20);
+            uint64_t high = cmd.word[3];
+            uint64_t addr = high << 32 | (low << 12);
+            size_t size = am << 12;
+
+            trace_smmuv3_cmdq_tlbi_nh_va_am(asid, am, size, addr);
+            smmuv3_notify_range(&s->smmu_state, addr, size);
             break;
         }
         case SMMU_CMD_TLBI_NH_VAA:
@@ -222,6 +312,7 @@ static int smmu_cmdq_consume(SMMUV3State *s)
         case SMMU_CMD_TLBI_S12_VMALL:
         case SMMU_CMD_TLBI_S2_IPA:
         case SMMU_CMD_TLBI_NSNH_ALL:
+            smmuv3_replay_all(&s->smmu_state);
             break;
         case SMMU_CMD_ATC_INV:
         case SMMU_CMD_PRI_RESP:
@@ -804,6 +895,172 @@ out:
     return entry;
 }
 
+static int smmuv3_replay_hook(IOMMUTLBEntry *entry, void *private)
+{
+    trace_smmuv3_replay_hook(entry->iova, entry->translated_addr,
+                             entry->addr_mask, entry->perm);
+    memory_region_notify_one((IOMMUNotifier *)private, entry);
+    return 0;
+}
+
+static int smmuv3_map_hook(IOMMUTLBEntry *entry, void *private)
+{
+    trace_smmuv3_map_hook(entry->iova, entry->translated_addr,
+                          entry->addr_mask, entry->perm);
+    memory_region_notify_one((IOMMUNotifier *)private, entry);
+    return 0;
+}
+
+/* Unmap the whole range in the notifier's scope. */
+static void smmuv3_unmap_notifier(SMMUDevice *sdev, IOMMUNotifier *n)
+{
+    IOMMUTLBEntry entry;
+    hwaddr size;
+    hwaddr start = n->start;
+    hwaddr end = n->end;
+
+    size = end - start + 1;
+
+    entry.target_as = &address_space_memory;
+    /* Adjust iova for the size */
+    entry.iova = n->start & ~(size - 1);
+    /* This field is meaningless for unmap */
+    entry.translated_addr = 0;
+    entry.perm = IOMMU_NONE;
+    entry.addr_mask = size - 1;
+
+    /* TODO: check start/end/size/mask */
+
+    trace_smmuv3_unmap_notifier(pci_bus_num(sdev->bus),
+                                PCI_SLOT(sdev->devfn),
+                                PCI_FUNC(sdev->devfn),
+                                entry.iova, size);
+
+    memory_region_notify_one(n, &entry);
+}
+
+static void smmuv3_replay(IOMMUMemoryRegion *mr, IOMMUNotifier *n)
+{
+    SMMUDevice *sdev = container_of(mr, SMMUDevice, iommu);
+    SMMUV3State *s = sdev->smmu;
+    SMMUBaseClass *sbc = SMMU_DEVICE_GET_CLASS(s);
+    SMMUTransCfg cfg = {};
+    int ret;
+
+    smmuv3_unmap_notifier(sdev, n);
+
+    ret = smmuv3_decode_config(mr, &cfg);
+    if (ret) {
+        error_report("%s error decoding the configuration for iommu mr=%s",
+                     __func__, mr->parent_obj.name);
+    }
+
+    if (cfg.disabled || cfg.bypassed) {
+        return;
+    }
+    /* is the smmu enabled */
+    sbc->page_walk_64(&cfg, 0, (1ULL << (64 - cfg.tsz)) - 1, false,
+                      smmuv3_replay_hook, n);
+}
+static void smmuv3_replay_range(IOMMUMemoryRegion *mr, IOMMUNotifier *n,
+                                 uint64_t iova, size_t size)
+{
+    SMMUDevice *sdev = container_of(mr, SMMUDevice, iommu);
+    SMMUV3State *s = sdev->smmu;
+    SMMUBaseClass *sbc = SMMU_DEVICE_GET_CLASS(s);
+    SMMUTransCfg cfg = {};
+    IOMMUTLBEntry entry;
+    int ret;
+
+    trace_smmuv3_replay_range(mr->parent_obj.name, iova, size, n);
+    ret = smmuv3_decode_config(mr, &cfg);
+    if (ret) {
+        error_report("%s error decoding the configuration for iommu mr=%s",
+                     __func__, mr->parent_obj.name);
+    }
+
+    if (cfg.disabled || cfg.bypassed) {
+        return;
+    }
+
+    /* first unmap */
+    entry.target_as = &address_space_memory;
+    entry.iova = iova & ~(size - 1);
+    entry.addr_mask = size - 1;
+    entry.perm = IOMMU_NONE;
+
+    memory_region_notify_one(n, &entry);
+
+    /* then figure out if a new mapping needs to be applied */
+    sbc->page_walk_64(&cfg, iova, iova + entry.addr_mask , false,
+                      smmuv3_map_hook, n);
+}
+
+static void smmuv3_replay_single(IOMMUMemoryRegion *mr, IOMMUNotifier *n,
+                                 uint64_t iova)
+{
+    SMMUDevice *sdev = container_of(mr, SMMUDevice, iommu);
+    SMMUV3State *s = sdev->smmu;
+    size_t target_page_size = qemu_target_page_size();
+    SMMUBaseClass *sbc = SMMU_DEVICE_GET_CLASS(s);
+    SMMUTransCfg cfg = {};
+    IOMMUTLBEntry entry;
+    int ret;
+
+    trace_smmuv3_replay_single(mr->parent_obj.name, iova, n);
+    ret = smmuv3_decode_config(mr, &cfg);
+    if (ret) {
+        error_report("%s error decoding the configuration for iommu mr=%s",
+                     __func__, mr->parent_obj.name);
+    }
+
+    if (cfg.disabled || cfg.bypassed) {
+        return;
+    }
+
+    /* first unmap */
+    entry.target_as = &address_space_memory;
+    entry.iova = iova & ~(target_page_size - 1);
+    entry.addr_mask = target_page_size - 1;
+    entry.perm = IOMMU_NONE;
+
+    memory_region_notify_one(n, &entry);
+
+    /* then figure out if a new mapping needs to be applied */
+    sbc->page_walk_64(&cfg, iova, iova + 1, false,
+                      smmuv3_map_hook, n);
+}
+
+static void smmuv3_notify_flag_changed(IOMMUMemoryRegion *iommu,
+                                       IOMMUNotifierFlag old,
+                                       IOMMUNotifierFlag new)
+{
+    SMMUDevice *sdev = container_of(iommu, SMMUDevice, iommu);
+    SMMUV3State *s3 = sdev->smmu;
+    SMMUState *s = &(s3->smmu_state);
+    SMMUNotifierNode *node = NULL;
+    SMMUNotifierNode *next_node = NULL;
+
+    if (old == IOMMU_NOTIFIER_NONE) {
+        trace_smmuv3_notify_flag_add(iommu->parent_obj.name);
+        node = g_malloc0(sizeof(*node));
+        node->sdev = sdev;
+        QLIST_INSERT_HEAD(&s->notifiers_list, node, next);
+        return;
+    }
+
+    /* update notifier node with new flags */
+    QLIST_FOREACH_SAFE(node, &s->notifiers_list, next, next_node) {
+        if (node->sdev == sdev) {
+            if (new == IOMMU_NOTIFIER_NONE) {
+                trace_smmuv3_notify_flag_del(iommu->parent_obj.name);
+                QLIST_REMOVE(node, next);
+                g_free(node);
+            }
+            return;
+        }
+    }
+}
 
 static inline void smmu_update_base_reg(SMMUV3State *s, uint64_t *base,
                                         uint64_t val)
@@ -1125,6 +1382,8 @@ static void smmuv3_iommu_memory_region_class_init(ObjectClass *klass,
     IOMMUMemoryRegionClass *imrc = IOMMU_MEMORY_REGION_CLASS(klass);
 
     imrc->translate = smmuv3_translate;
+    imrc->notify_flag_changed = smmuv3_notify_flag_changed;
+    imrc->replay = smmuv3_replay;
 }
 
 static const TypeInfo smmuv3_type_info = {
diff --git a/hw/arm/trace-events b/hw/arm/trace-events
index f9b9cbe..8228e26 100644
--- a/hw/arm/trace-events
+++ b/hw/arm/trace-events
@@ -27,6 +27,7 @@ smmuv3_cmdq_opcode(const char *opcode) "<--- %s"
 smmuv3_cmdq_cfgi_ste(int streamid) "     |_ streamid =%d"
 smmuv3_cmdq_cfgi_ste_range(int start, int end) "     |_ start=0x%d - end=0x%d"
 smmuv3_cmdq_tlbi_nh_va(int asid, int vmid, uint64_t addr) "     |_ asid =%d vmid =%d addr=0x%"PRIx64
+smmuv3_cmdq_tlbi_nh_va_am(int asid, int am, size_t size, uint64_t addr) "     |_ asid =%d am =%d size=0x%lx addr=0x%"PRIx64
 smmuv3_cmdq_consume_sev(void) "CMD_SYNC CS=SEV not supported, ignoring"
 smmuv3_cmdq_consume_out(uint8_t prod_wrap, uint32_t prod, uint8_t cons_wrap, uint32_t cons) "prod_wrap:%d, prod:0x%x cons_wrap:%d cons:0x%x"
 smmuv3_update(bool is_empty, uint32_t prod, uint32_t cons, uint8_t prod_wrap, uint8_t cons_wrap) "q empty:%d prod:%d cons:%d p.wrap:%d p.cons:%d"
@@ -50,3 +51,16 @@ smmuv3_dump_ste(int i, uint32_t word0, int j,  uint32_t word1) "STE[%2d]: 0x%x\t
 smmuv3_dump_cd(int i, uint32_t word0, int j,  uint32_t word1) "CD[%2d]: 0x%x\t CD[%2d]: 0x%x"
 smmuv3_dump_cmd(int i, uint32_t word0, int j,  uint32_t word1) "CMD[%2d]: 0x%x\t CMD[%2d]: 0x%x"
 smmuv3_cfg_stage(int s, uint32_t oas, uint32_t tsz, uint64_t ttbr, bool aa64, uint32_t granule_sz, int initial_level) "TransCFG stage:%d oas:%d tsz:%d ttbr:0x%"PRIx64"  aa64:%d granule_sz:%d, initial_level = %d"
+
+smmuv3_replay(uint16_t sid, bool enabled) "sid=%d, enabled=%d"
+smmuv3_replay_hook(hwaddr iova, hwaddr pa, hwaddr mask, int perm) "iova=0x%"PRIx64" pa=0x%" PRIx64" mask=0x%"PRIx64" perm=%d"
+smmuv3_map_hook(hwaddr iova, hwaddr pa, hwaddr mask, int perm) "iova=0x%"PRIx64" pa=0x%" PRIx64" mask=0x%"PRIx64" perm=%d"
+smmuv3_notify_flag_add(const char *iommu) "ADD SMMUNotifier node for iommu mr=%s"
+smmuv3_notify_flag_del(const char *iommu) "DEL SMMUNotifier node for iommu mr=%s"
+smmuv3_replay_single(const char *name, uint64_t iova, void *n) "iommu mr=%s iova=0x%"PRIx64" n=%p"
+smmuv3_replay_range(const char *name, uint64_t iova, size_t size, void *n) "iommu mr=%s iova=0x%"PRIx64" size=0x%lx n=%p"
+smmuv3_replay_all(const char *name) "iommu mr=%s"
+smmuv3_notify_all(const char *name, uint64_t iova) "iommu mr=%s iova=0x%"PRIx64
+smmuv3_unmap_notifier(uint8_t bus, uint8_t slot, uint8_t fn, uint64_t iova, uint64_t size) "Device %02x:%02x.%x start 0x%"PRIx64" size 0x%"PRIx64
+smmuv3_context_device_invalidate(uint32_t sid) "sid=%d"
+
-- 
2.5.5


* [Qemu-devel] [RFC v6 9/9] hw/arm/virt-acpi-build: Use the ACPI_IORT_SMMU_V3_CACHING_MODE model
  2017-08-11 14:22 [Qemu-devel] [RFC v6 0/9] ARM SMMUv3 Emulation Support Eric Auger
                   ` (7 preceding siblings ...)
  2017-08-11 14:22 ` [Qemu-devel] [RFC v6 8/9] hw/arm/smmuv3: VFIO integration Eric Auger
@ 2017-08-11 14:22 ` Eric Auger
  2017-08-11 15:38 ` [Qemu-devel] [RFC v6 0/9] ARM SMMUv3 Emulation Support no-reply
  9 siblings, 0 replies; 15+ messages in thread
From: Eric Auger @ 2017-08-11 14:22 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, qemu-arm, qemu-devel,
	alex.williamson, prem.mallappa
  Cc: drjones, christoffer.dall, Radha.Chintakuntla, Sunil.Goutham,
	mohun106, tcain, bharat.bhushan, tn, mst, will.deacon,
	jean-philippe.brucker, robin.murphy, peterx, edgar.iglesias

To allow the VFIO use case, set the SMMU model to
ACPI_IORT_SMMU_V3_CACHING_MODE.

Note that this model is not standardized in the ACPI IORT
specification; this work is a proof of concept.

We also set the COHACC override flag, which seems to be mandated.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 hw/arm/virt-acpi-build.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index ac2cd3e..9103117 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -437,6 +437,7 @@ build_iort(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
 
         smmu->type = ACPI_IORT_NODE_SMMU_V3;
         smmu->length = cpu_to_le16(node_size);
+		smmu->model = 0x3; /* ACPI_IORT_SMMU_V3_CACHING_MODE */
         smmu->mapping_count = cpu_to_le32(1);
         smmu->mapping_offset = cpu_to_le32(sizeof(*smmu));
         smmu->base_address = cpu_to_le64(vms->memmap[VIRT_SMMU].base);
@@ -444,6 +445,7 @@ build_iort(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
         smmu->pri_gsiv = cpu_to_le32(irq + 1);
         smmu->gerr_gsiv = cpu_to_le32(irq + 2);
         smmu->sync_gsiv = cpu_to_le32(irq + 3);
+        smmu->flags = 0x1; /* COHACC Override */
 
         /* Identity RID mapping covering the whole input RID range */
         idmap = &smmu->id_mapping_array[0];
-- 
2.5.5


* Re: [Qemu-devel] [RFC v6 0/9] ARM SMMUv3 Emulation Support
  2017-08-11 14:22 [Qemu-devel] [RFC v6 0/9] ARM SMMUv3 Emulation Support Eric Auger
                   ` (8 preceding siblings ...)
  2017-08-11 14:22 ` [Qemu-devel] [RFC v6 9/9] hw/arm/virt-acpi-build: Use the ACPI_IORT_SMMU_V3_CACHING_MODE model Eric Auger
@ 2017-08-11 15:38 ` no-reply
  9 siblings, 0 replies; 15+ messages in thread
From: no-reply @ 2017-08-11 15:38 UTC (permalink / raw)
  To: eric.auger
  Cc: famz, eric.auger.pro, peter.maydell, qemu-arm, qemu-devel,
	alex.williamson, prem.mallappa, mohun106, drjones, tcain,
	Radha.Chintakuntla, Sunil.Goutham, mst, jean-philippe.brucker,
	tn, will.deacon, robin.murphy, peterx, edgar.iglesias,
	bharat.bhushan, christoffer.dall

Hi,

This series seems to have some coding style problems. See output below for
more information:

Message-id: 1502461354-11327-1-git-send-email-eric.auger@redhat.com
Subject: [Qemu-devel] [RFC v6 0/9] ARM SMMUv3 Emulation Support
Type: series

=== TEST SCRIPT BEGIN ===
#!/bin/bash

BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0

git config --local diff.renamelimit 0
git config --local diff.renames True

commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
    echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
    if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
        failed=1
        echo
    fi
    n=$((n+1))
done

exit $failed
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
Switched to a new branch 'test'
3e62cedd73 hw/arm/virt-acpi-build: Use the ACPI_IORT_SMMU_V3_CACHING_MODE model
3a51094ee9 hw/arm/smmuv3: VFIO integration
815575515f target/arm/kvm: Translate the MSI doorbell in kvm_arch_fixup_msi_route
f72bc6ae67 hw/arm/virt: Add tlbi-on-map property to the smmuv3 node
52a2569fe7 hw/arm/virt-acpi-build: Add smmuv3 node in IORT table
62c3c2fcae hw/arm/virt: Add 2.11 machine type
80671b4e90 hw/arm/virt: Add SMMUv3 to the virt board
ae1c203d73 hw/arm/smmuv3: smmuv3 emulation model
bcb7d6e45d hw/arm/smmu-common: smmu base class

=== OUTPUT BEGIN ===
Checking PATCH 1/9: hw/arm/smmu-common: smmu base class...
Checking PATCH 2/9: hw/arm/smmuv3: smmuv3 emulation model...
Checking PATCH 3/9: hw/arm/virt: Add SMMUv3 to the virt board...
Checking PATCH 4/9: hw/arm/virt: Add 2.11 machine type...
Checking PATCH 5/9: hw/arm/virt-acpi-build: Add smmuv3 node in IORT table...
Checking PATCH 6/9: hw/arm/virt: Add tlbi-on-map property to the smmuv3 node...
Checking PATCH 7/9: target/arm/kvm: Translate the MSI doorbell in kvm_arch_fixup_msi_route...
Checking PATCH 8/9: hw/arm/smmuv3: VFIO integration...
Checking PATCH 9/9: hw/arm/virt-acpi-build: Use the ACPI_IORT_SMMU_V3_CACHING_MODE model...
ERROR: code indent should never use tabs
#26: FILE: hw/arm/virt-acpi-build.c:440:
+^I^Ismmu->model = 0x3; /* ACPI_IORT_SMMU_V3_CACHING_MODE */$

total: 1 errors, 0 warnings, 14 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

=== OUTPUT END ===

Test command exited with code: 1


---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-devel@freelists.org


* Re: [Qemu-devel] [Qemu-arm] [RFC v6 8/9] hw/arm/smmuv3: VFIO integration
  2017-08-11 14:22 ` [Qemu-devel] [RFC v6 8/9] hw/arm/smmuv3: VFIO integration Eric Auger
@ 2017-08-21  1:21   ` Linu Cherian
  2017-08-23  4:24   ` Linu Cherian
  1 sibling, 0 replies; 15+ messages in thread
From: Linu Cherian @ 2017-08-21  1:21 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, peter.maydell, qemu-arm, qemu-devel,
	alex.williamson, prem.mallappa, mohun106, drjones, tcain,
	Radha.Chintakuntla, Sunil.Goutham, mst, jean-philippe.brucker,
	tn, will.deacon, robin.murphy, peterx, bharat.bhushan,
	christoffer.dall

Hi Eric,


On Fri Aug 11, 2017 at 04:22:33PM +0200, Eric Auger wrote:
> This patch enables PCIe passthrough for a guest that is exposed
> to a vSMMUv3. It implements the replay and notify_flag_changed
> IOMMU ops. In addition, on TLB and data structure invalidation
> commands, the mappings are replayed so that the physical IOMMU is
> programmed with the updated stage 1 settings (guest IOVA -> guest PA)
> combined with the stage 2 settings.
> 
> This works only if the guest SMMUv3 driver implements the
> "tlbi-on-map" option, because QEMU only learns about new guest
> mappings through the invalidation commands it traps.
> 
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>

I would appreciate it if you could point to some documentation that
explains the attach, detach, map and unmap flows for a passthrough
device with an emulated IOMMU.
Thanks.
> ---
> 
> v5 -> v6:
> - use IOMMUMemoryRegion
> - handle implementation defined SMMU_CMD_TLBI_NH_VA_AM cmd
>   (goes along with TLBI_ON_MAP FW quirk)
> - replay systematically unmap the whole range first
> - smmuv3_map_hook does not unmap anymore and the unmap is done
>   before the replay
> - add and use smmuv3_context_device_invalidate instead of
>   blindly replaying everything
> ---
>  hw/arm/smmuv3-internal.h |   1 +
>  hw/arm/smmuv3.c          | 265 ++++++++++++++++++++++++++++++++++++++++++++++-
>  hw/arm/trace-events      |  14 +++
>  3 files changed, 277 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/arm/smmuv3-internal.h b/hw/arm/smmuv3-internal.h
> index e255df1..ac4628f 100644
> --- a/hw/arm/smmuv3-internal.h
> +++ b/hw/arm/smmuv3-internal.h
> @@ -344,6 +344,7 @@ enum {
>      SMMU_CMD_RESUME          = 0x44,
>      SMMU_CMD_STALL_TERM,
>      SMMU_CMD_SYNC,          /* 0x46 */
> +    SMMU_CMD_TLBI_NH_VA_AM   = 0x8F, /* VIOMMU Impl Defined */
>  };
>  
>  static const char *cmd_stringify[] = {
> diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
> index e195a0e..89fb116 100644
> --- a/hw/arm/smmuv3.c
> +++ b/hw/arm/smmuv3.c
> @@ -25,6 +25,7 @@
>  #include "exec/address-spaces.h"
>  #include "trace.h"
>  #include "qemu/error-report.h"
> +#include "exec/target_page.h"
>  
>  #include "hw/arm/smmuv3.h"
>  #include "smmuv3-internal.h"
> @@ -143,6 +144,71 @@ static MemTxResult smmu_read_cmdq(SMMUV3State *s, Cmd *cmd)
>      return ret;
>  }
>  
> +static void smmuv3_replay_all(SMMUState *s)
> +{
> +    SMMUNotifierNode *node;
> +
> +    QLIST_FOREACH(node, &s->notifiers_list, next) {
> +        trace_smmuv3_replay_all(node->sdev->iommu.parent_obj.name);
> +        memory_region_iommu_replay_all(&node->sdev->iommu);
> +    }
> +}
> +
> +/* Replay the mappings for a given streamid */
> +static void smmuv3_context_device_invalidate(SMMUState *s, uint16_t sid)
> +{
> +    uint8_t bus_n, devfn;
> +    SMMUPciBus *smmu_bus;
> +    SMMUDevice *smmu;
> +
> +    trace_smmuv3_context_device_invalidate(sid);
> +    bus_n = PCI_BUS_NUM(sid);
> +    smmu_bus = smmu_find_as_from_bus_num(s, bus_n);
> +    if (smmu_bus) {
> +        devfn = sid & 0xff; /* devfn is the low byte of the streamid */
> +        smmu = smmu_bus->pbdev[devfn];
> +        if (smmu) {
> +            memory_region_iommu_replay_all(&smmu->iommu);
> +        }
> +    }
> +}
> +
> +static void smmuv3_replay_single(IOMMUMemoryRegion *mr, IOMMUNotifier *n,
> +                                 uint64_t iova);
> +
> +static void smmuv3_replay_range(IOMMUMemoryRegion *mr, IOMMUNotifier *n,
> +                                 uint64_t iova, size_t size);
> +
> +static void smmuv3_notify_single(SMMUState *s, uint64_t iova)
> +{
> +    SMMUNotifierNode *node;
> +
> +    QLIST_FOREACH(node, &s->notifiers_list, next) {
> +        IOMMUMemoryRegion *mr = &node->sdev->iommu;
> +        IOMMUNotifier *n;
> +
> +        trace_smmuv3_notify_all(node->sdev->iommu.parent_obj.name, iova);
> +        IOMMU_NOTIFIER_FOREACH(n, mr) {
> +            smmuv3_replay_single(mr, n, iova);
> +        }
> +    }
> +}
> +
> +static void smmuv3_notify_range(SMMUState *s, uint64_t iova, size_t size)
> +{
> +    SMMUNotifierNode *node;
> +
> +    QLIST_FOREACH(node, &s->notifiers_list, next) {
> +        IOMMUMemoryRegion *mr = &node->sdev->iommu;
> +        IOMMUNotifier *n;
> +
> +        trace_smmuv3_notify_all(node->sdev->iommu.parent_obj.name, iova);
> +        IOMMU_NOTIFIER_FOREACH(n, mr) {
> +            smmuv3_replay_range(mr, n, iova, size);
> +        }
> +    }
> +}
> +
>  static int smmu_cmdq_consume(SMMUV3State *s)
>  {
>      uint32_t error = SMMU_CMD_ERR_NONE;
> @@ -178,28 +244,38 @@ static int smmu_cmdq_consume(SMMUV3State *s)
>              break;
>          case SMMU_CMD_PREFETCH_CONFIG:
>          case SMMU_CMD_PREFETCH_ADDR:
> +            break;
>          case SMMU_CMD_CFGI_STE:
>          {
>               uint32_t streamid = cmd.word[1];
>  
>               trace_smmuv3_cmdq_cfgi_ste(streamid);
> -            break;
> +             smmuv3_context_device_invalidate(&s->smmu_state, streamid);
> +             break;
>          }
>          case SMMU_CMD_CFGI_STE_RANGE: /* same as SMMU_CMD_CFGI_ALL */
>          {
> -            uint32_t start = cmd.word[1], range, end;
> +            uint32_t start = cmd.word[1], range, end, i;
>  
>              range = extract32(cmd.word[2], 0, 5);
>              end = start + (1 << (range + 1)) - 1;
>              trace_smmuv3_cmdq_cfgi_ste_range(start, end);
> +            for (i = start; i <= end; i++) {
> +                smmuv3_context_device_invalidate(&s->smmu_state, i);
> +            }
>              break;
>          }
>          case SMMU_CMD_CFGI_CD:
>          case SMMU_CMD_CFGI_CD_ALL:
> +        {
> +             uint32_t streamid = cmd.word[1];
> +
> +            smmuv3_context_device_invalidate(&s->smmu_state, streamid);
>              break;
> +        }
>          case SMMU_CMD_TLBI_NH_ALL:
>          case SMMU_CMD_TLBI_NH_ASID:
> -            printf("%s TLBI* replay\n", __func__);
> +            smmuv3_replay_all(&s->smmu_state);
>              break;
>          case SMMU_CMD_TLBI_NH_VA:
>          {
> @@ -210,6 +286,20 @@ static int smmu_cmdq_consume(SMMUV3State *s)
>              uint64_t addr = high << 32 | (low << 12);
>  
>              trace_smmuv3_cmdq_tlbi_nh_va(asid, vmid, addr);
> +            smmuv3_notify_single(&s->smmu_state, addr);
> +            break;
> +        }
> +        case SMMU_CMD_TLBI_NH_VA_AM:
> +        {
> +            int asid = extract32(cmd.word[1], 16, 16);
> +            int am = extract32(cmd.word[1], 0, 16);
> +            uint64_t low = extract32(cmd.word[2], 12, 20);
> +            uint64_t high = cmd.word[3];
> +            uint64_t addr = high << 32 | (low << 12);
> +            size_t size = am << 12;
> +
> +            trace_smmuv3_cmdq_tlbi_nh_va_am(asid, am, size, addr);
> +            smmuv3_notify_range(&s->smmu_state, addr, size);
>              break;
>          }
>          case SMMU_CMD_TLBI_NH_VAA:
> @@ -222,6 +312,7 @@ static int smmu_cmdq_consume(SMMUV3State *s)
>          case SMMU_CMD_TLBI_S12_VMALL:
>          case SMMU_CMD_TLBI_S2_IPA:
>          case SMMU_CMD_TLBI_NSNH_ALL:
> +            smmuv3_replay_all(&s->smmu_state);
>              break;
>          case SMMU_CMD_ATC_INV:
>          case SMMU_CMD_PRI_RESP:
> @@ -804,6 +895,172 @@ out:
>      return entry;
>  }
>  
> +static int smmuv3_replay_hook(IOMMUTLBEntry *entry, void *private)
> +{
> +    trace_smmuv3_replay_hook(entry->iova, entry->translated_addr,
> +                             entry->addr_mask, entry->perm);
> +    memory_region_notify_one((IOMMUNotifier *)private, entry);
> +    return 0;
> +}
> +
> +static int smmuv3_map_hook(IOMMUTLBEntry *entry, void *private)
> +{
> +    trace_smmuv3_map_hook(entry->iova, entry->translated_addr,
> +                          entry->addr_mask, entry->perm);
> +    memory_region_notify_one((IOMMUNotifier *)private, entry);
> +    return 0;
> +}
> +
> +/* Unmap the whole range in the notifier's scope. */
> +static void smmuv3_unmap_notifier(SMMUDevice *sdev, IOMMUNotifier *n)
> +{
> +    IOMMUTLBEntry entry;
> +    hwaddr size;
> +    hwaddr start = n->start;
> +    hwaddr end = n->end;
> +
> +    size = end - start + 1;
> +
> +    entry.target_as = &address_space_memory;
> +    /* Adjust iova for the size */
> +    entry.iova = n->start & ~(size - 1);
> +    /* This field is meaningless for unmap */
> +    entry.translated_addr = 0;
> +    entry.perm = IOMMU_NONE;
> +    entry.addr_mask = size - 1;
> +
> +    /* TODO: check start/end/size/mask */
> +
> +    trace_smmuv3_unmap_notifier(pci_bus_num(sdev->bus),
> +                                PCI_SLOT(sdev->devfn),
> +                                PCI_FUNC(sdev->devfn),
> +                                entry.iova, size);
> +
> +    memory_region_notify_one(n, &entry);
> +}
> +
> +static void smmuv3_replay(IOMMUMemoryRegion *mr, IOMMUNotifier *n)
> +{
> +    SMMUDevice *sdev = container_of(mr, SMMUDevice, iommu);
> +    SMMUV3State *s = sdev->smmu;
> +    SMMUBaseClass *sbc = SMMU_DEVICE_GET_CLASS(s);
> +    SMMUTransCfg cfg = {};
> +    int ret;
> +
> +    smmuv3_unmap_notifier(sdev, n);
> +
> +    ret = smmuv3_decode_config(mr, &cfg);
> +    if (ret) {
> +        error_report("%s error decoding the configuration for iommu mr=%s",
> +                     __func__, mr->parent_obj.name);
> +    }
> +
> +    if (cfg.disabled || cfg.bypassed) {
> +        return;
> +    }
> +    /* is the smmu enabled */
> +    sbc->page_walk_64(&cfg, 0, (1ULL << (64 - cfg.tsz)) - 1, false,
> +                      smmuv3_replay_hook, n);
> +}
> +static void smmuv3_replay_range(IOMMUMemoryRegion *mr, IOMMUNotifier *n,
> +                                 uint64_t iova, size_t size)
> +{
> +    SMMUDevice *sdev = container_of(mr, SMMUDevice, iommu);
> +    SMMUV3State *s = sdev->smmu;
> +    SMMUBaseClass *sbc = SMMU_DEVICE_GET_CLASS(s);
> +    SMMUTransCfg cfg = {};
> +    IOMMUTLBEntry entry;
> +    int ret;
> +
> +    trace_smmuv3_replay_range(mr->parent_obj.name, iova, size, n);
> +    ret = smmuv3_decode_config(mr, &cfg);
> +    if (ret) {
> +        error_report("%s error decoding the configuration for iommu mr=%s",
> +                     __func__, mr->parent_obj.name);
> +    }
> +
> +    if (cfg.disabled || cfg.bypassed) {
> +        return;
> +    }
> +
> +    /* first unmap */
> +    entry.target_as = &address_space_memory;
> +    entry.iova = iova & ~(size - 1);
> +    entry.addr_mask = size - 1;
> +    entry.perm = IOMMU_NONE;
> +
> +    memory_region_notify_one(n, &entry);
> +
> +    /* then figure out if a new mapping needs to be applied */
> +    sbc->page_walk_64(&cfg, iova, iova + entry.addr_mask , false,
> +                      smmuv3_map_hook, n);
> +}
> +
> +static void smmuv3_replay_single(IOMMUMemoryRegion *mr, IOMMUNotifier *n,
> +                                 uint64_t iova)
> +{
> +    SMMUDevice *sdev = container_of(mr, SMMUDevice, iommu);
> +    SMMUV3State *s = sdev->smmu;
> +    size_t target_page_size = qemu_target_page_size();
> +    SMMUBaseClass *sbc = SMMU_DEVICE_GET_CLASS(s);
> +    SMMUTransCfg cfg = {};
> +    IOMMUTLBEntry entry;
> +    int ret;
> +
> +    trace_smmuv3_replay_single(mr->parent_obj.name, iova, n);
> +    ret = smmuv3_decode_config(mr, &cfg);
> +    if (ret) {
> +        error_report("%s error decoding the configuration for iommu mr=%s",
> +                     __func__, mr->parent_obj.name);
> +    }
> +
> +    if (cfg.disabled || cfg.bypassed) {
> +        return;
> +    }
> +
> +    /* first unmap */
> +    entry.target_as = &address_space_memory;
> +    entry.iova = iova & ~(target_page_size - 1);
> +    entry.addr_mask = target_page_size - 1;
> +    entry.perm = IOMMU_NONE;
> +
> +    memory_region_notify_one(n, &entry);
> +
> +    /* then figure out if a new mapping needs to be applied */
> +    sbc->page_walk_64(&cfg, iova, iova + 1, false,
> +                      smmuv3_map_hook, n);
> +}
> +
> +static void smmuv3_notify_flag_changed(IOMMUMemoryRegion *iommu,
> +                                       IOMMUNotifierFlag old,
> +                                       IOMMUNotifierFlag new)
> +{
> +    SMMUDevice *sdev = container_of(iommu, SMMUDevice, iommu);
> +    SMMUV3State *s3 = sdev->smmu;
> +    SMMUState *s = &(s3->smmu_state);
> +    SMMUNotifierNode *node = NULL;
> +    SMMUNotifierNode *next_node = NULL;
> +
> +    if (old == IOMMU_NOTIFIER_NONE) {
> +        trace_smmuv3_notify_flag_add(iommu->parent_obj.name);
> +        node = g_malloc0(sizeof(*node));
> +        node->sdev = sdev;
> +        QLIST_INSERT_HEAD(&s->notifiers_list, node, next);
> +        return;
> +    }
> +
> +    /* update notifier node with new flags */
> +    QLIST_FOREACH_SAFE(node, &s->notifiers_list, next, next_node) {
> +        if (node->sdev == sdev) {
> +            if (new == IOMMU_NOTIFIER_NONE) {
> +                trace_smmuv3_notify_flag_del(iommu->parent_obj.name);
> +                QLIST_REMOVE(node, next);
> +                g_free(node);
> +            }
> +            return;
> +        }
> +    }
> +}
>  
>  static inline void smmu_update_base_reg(SMMUV3State *s, uint64_t *base,
>                                          uint64_t val)
> @@ -1125,6 +1382,8 @@ static void smmuv3_iommu_memory_region_class_init(ObjectClass *klass,
>      IOMMUMemoryRegionClass *imrc = IOMMU_MEMORY_REGION_CLASS(klass);
>  
>      imrc->translate = smmuv3_translate;
> +    imrc->notify_flag_changed = smmuv3_notify_flag_changed;
> +    imrc->replay = smmuv3_replay;
>  }
>  
>  static const TypeInfo smmuv3_type_info = {
> diff --git a/hw/arm/trace-events b/hw/arm/trace-events
> index f9b9cbe..8228e26 100644
> --- a/hw/arm/trace-events
> +++ b/hw/arm/trace-events
> @@ -27,6 +27,7 @@ smmuv3_cmdq_opcode(const char *opcode) "<--- %s"
>  smmuv3_cmdq_cfgi_ste(int streamid) "     |_ streamid =%d"
>  smmuv3_cmdq_cfgi_ste_range(int start, int end) "     |_ start=0x%d - end=0x%d"
>  smmuv3_cmdq_tlbi_nh_va(int asid, int vmid, uint64_t addr) "     |_ asid =%d vmid =%d addr=0x%"PRIx64
> +smmuv3_cmdq_tlbi_nh_va_am(int asid, int am, size_t size, uint64_t addr) "     |_ asid =%d am =%d size=0x%lx addr=0x%"PRIx64
>  smmuv3_cmdq_consume_sev(void) "CMD_SYNC CS=SEV not supported, ignoring"
>  smmuv3_cmdq_consume_out(uint8_t prod_wrap, uint32_t prod, uint8_t cons_wrap, uint32_t cons) "prod_wrap:%d, prod:0x%x cons_wrap:%d cons:0x%x"
>  smmuv3_update(bool is_empty, uint32_t prod, uint32_t cons, uint8_t prod_wrap, uint8_t cons_wrap) "q empty:%d prod:%d cons:%d p.wrap:%d p.cons:%d"
> @@ -50,3 +51,16 @@ smmuv3_dump_ste(int i, uint32_t word0, int j,  uint32_t word1) "STE[%2d]: 0x%x\t
>  smmuv3_dump_cd(int i, uint32_t word0, int j,  uint32_t word1) "CD[%2d]: 0x%x\t CD[%2d]: 0x%x"
>  smmuv3_dump_cmd(int i, uint32_t word0, int j,  uint32_t word1) "CMD[%2d]: 0x%x\t CMD[%2d]: 0x%x"
>  smmuv3_cfg_stage(int s, uint32_t oas, uint32_t tsz, uint64_t ttbr, bool aa64, uint32_t granule_sz, int initial_level) "TransCFG stage:%d oas:%d tsz:%d ttbr:0x%"PRIx64"  aa64:%d granule_sz:%d, initial_level = %d"
> +
> +smmuv3_replay(uint16_t sid, bool enabled) "sid=%d, enabled=%d"
> +smmuv3_replay_hook(hwaddr iova, hwaddr pa, hwaddr mask, int perm) "iova=0x%"PRIx64" pa=0x%" PRIx64" mask=0x%"PRIx64" perm=%d"
> +smmuv3_map_hook(hwaddr iova, hwaddr pa, hwaddr mask, int perm) "iova=0x%"PRIx64" pa=0x%" PRIx64" mask=0x%"PRIx64" perm=%d"
> +smmuv3_notify_flag_add(const char *iommu) "ADD SMMUNotifier node for iommu mr=%s"
> +smmuv3_notify_flag_del(const char *iommu) "DEL SMMUNotifier node for iommu mr=%s"
> +smmuv3_replay_single(const char *name, uint64_t iova, void *n) "iommu mr=%s iova=0x%"PRIx64" n=%p"
> +smmuv3_replay_range(const char *name, uint64_t iova, size_t size, void *n) "iommu mr=%s iova=0x%"PRIx64" size=0x%lx n=%p"
> +smmuv3_replay_all(const char *name) "iommu mr=%s"
> +smmuv3_notify_all(const char *name, uint64_t iova) "iommu mr=%s iova=0x%"PRIx64
> +smmuv3_unmap_notifier(uint8_t bus, uint8_t slot, uint8_t fn, uint64_t iova, uint64_t size) "Device %02x:%02x.%x start 0x%"PRIx64" size 0x%"PRIx64
> +smmuv3_context_device_invalidate(uint32_t sid) "sid=%d"
> +
> -- 
> 2.5.5
> 
> 

-- 
Linu cherian

* Re: [Qemu-devel] [Qemu-arm] [RFC v6 8/9] hw/arm/smmuv3: VFIO integration
  2017-08-11 14:22 ` [Qemu-devel] [RFC v6 8/9] hw/arm/smmuv3: VFIO integration Eric Auger
  2017-08-21  1:21   ` [Qemu-devel] [Qemu-arm] " Linu Cherian
@ 2017-08-23  4:24   ` Linu Cherian
  2017-08-23  6:39     ` Auger Eric
  1 sibling, 1 reply; 15+ messages in thread
From: Linu Cherian @ 2017-08-23  4:24 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, peter.maydell, qemu-arm, qemu-devel,
	alex.williamson, prem.mallappa, mohun106, drjones, tcain,
	Radha.Chintakuntla, Sunil.Goutham, mst, jean-philippe.brucker,
	tn, will.deacon, robin.murphy, peterx, bharat.bhushan,
	christoffer.dall, linu.cherian

Hi Eric,


On Fri Aug 11, 2017 at 04:22:33PM +0200, Eric Auger wrote:
> This patch allows doing PCIe passthrough with a guest exposed
> with a vSMMUv3. It implements the replay and notify_flag_changed
> iommu ops. Also on TLB and data structure invalidation commands,
> we replay the mappings so that the physical IOMMU implements
> updated stage 1 settings (Guest IOVA -> Guest PA) + stage 2 settings.
> 
> This works only if the guest smmuv3 driver implements the
> "tlbi-on-map" option.
> 
> Signed-off-by: Eric Auger <eric.auger@redhat.com>

I tried launching a guest with the QEMU option "-machine virt-2.10,smmu"
and a 1G Ethernet controller as a vfio-pci device. It works fine for me.

QEMU source: https://github.com/eauger/qemu.git Branch: v2.10.0-rc2-SMMU-v6

However, I had to make this change:

--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -1806,7 +1806,7 @@ static void virt_machine_2_10_options(MachineClass *mc)
     virt_machine_2_11_options(mc);
     SET_MACHINE_COMPAT(mc, VIRT_COMPAT_2_10);

-    vmc->no_smmu = true;
+    vmc->no_smmu = false;
 }
 DEFINE_VIRT_MACHINE(2, 10)

so that QEMU doesn't complain about "Property .smmu not found".
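
For anyone following along, the effect of the one-line change can be
modelled outside QEMU. This is only a sketch under stated assumptions:
VirtClassModel, model_2_10_options(), model_2_11_options() and
model_has_smmu_property() are hypothetical names, and the idea that the
"smmu" machine property is only available when no_smmu is false is
inferred from the error message, not taken from the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the virt machine class; only the flag
 * touched by the diff above is modelled. */
typedef struct {
    bool no_smmu;
} VirtClassModel;

/* Newest machine version: assumed to leave the SMMU option available. */
static void model_2_11_options(VirtClassModel *vmc)
{
    vmc->no_smmu = false;
}

/* Older version: inherit the newer defaults first, then apply the
 * compat override. 'patched' selects the local change shown above. */
static void model_2_10_options(VirtClassModel *vmc, bool patched)
{
    model_2_11_options(vmc);
    vmc->no_smmu = !patched;
}

/* Assumption: the "smmu" property exists only when no_smmu is false,
 * which would explain "Property .smmu not found" on virt-2.10. */
static bool model_has_smmu_property(const VirtClassModel *vmc)
{
    assert(vmc != NULL);
    return !vmc->no_smmu;
}
```

In this model the unpatched 2.10 options hide the property, while both
the patched 2.10 options and the 2.11 options expose it.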

I will let you know if I have updates from further testing.

Thanks.

> 
> ---
> 
> v5 -> v6:
> - use IOMMUMemoryRegion
> - handle implementation defined SMMU_CMD_TLBI_NH_VA_AM cmd
>   (goes along with TLBI_ON_MAP FW quirk)
> - replay systematically unmap the whole range first
> - smmuv3_map_hook does not unmap anymore and the unmap is done
>   before the replay
> - add and use smmuv3_context_device_invalidate instead of
>   blindly replaying everything
> ---
>  hw/arm/smmuv3-internal.h |   1 +
>  hw/arm/smmuv3.c          | 265 ++++++++++++++++++++++++++++++++++++++++++++++-
>  hw/arm/trace-events      |  14 +++
>  3 files changed, 277 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/arm/smmuv3-internal.h b/hw/arm/smmuv3-internal.h
> index e255df1..ac4628f 100644
> --- a/hw/arm/smmuv3-internal.h
> +++ b/hw/arm/smmuv3-internal.h
> @@ -344,6 +344,7 @@ enum {
>      SMMU_CMD_RESUME          = 0x44,
>      SMMU_CMD_STALL_TERM,
>      SMMU_CMD_SYNC,          /* 0x46 */
> +    SMMU_CMD_TLBI_NH_VA_AM   = 0x8F, /* VIOMMU Impl Defined */
>  };
>  
>  static const char *cmd_stringify[] = {
> diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
> index e195a0e..89fb116 100644
> --- a/hw/arm/smmuv3.c
> +++ b/hw/arm/smmuv3.c
> @@ -25,6 +25,7 @@
>  #include "exec/address-spaces.h"
>  #include "trace.h"
>  #include "qemu/error-report.h"
> +#include "exec/target_page.h"
>  
>  #include "hw/arm/smmuv3.h"
>  #include "smmuv3-internal.h"
> @@ -143,6 +144,71 @@ static MemTxResult smmu_read_cmdq(SMMUV3State *s, Cmd *cmd)
>      return ret;
>  }
>  
> +static void smmuv3_replay_all(SMMUState *s)
> +{
> +    SMMUNotifierNode *node;
> +
> +    QLIST_FOREACH(node, &s->notifiers_list, next) {
> +        trace_smmuv3_replay_all(node->sdev->iommu.parent_obj.name);
> +        memory_region_iommu_replay_all(&node->sdev->iommu);
> +    }
> +}
> +
> +/* Replay the mappings for a given streamid */
> +static void smmuv3_context_device_invalidate(SMMUState *s, uint16_t sid)
> +{
> +    uint8_t bus_n, devfn;
> +    SMMUPciBus *smmu_bus;
> +    SMMUDevice *smmu;
> +
> +    trace_smmuv3_context_device_invalidate(sid);
> +    bus_n = PCI_BUS_NUM(sid);
> +    smmu_bus = smmu_find_as_from_bus_num(s, bus_n);
> +    if (smmu_bus) {
> +        devfn = PCI_FUNC(sid);
> +        smmu = smmu_bus->pbdev[devfn];
> +        if (smmu) {
> +            memory_region_iommu_replay_all(&smmu->iommu);
> +        }
> +    }
> +}
> +
> +static void smmuv3_replay_single(IOMMUMemoryRegion *mr, IOMMUNotifier *n,
> +                                 uint64_t iova);
> +
> +static void smmuv3_replay_range(IOMMUMemoryRegion *mr, IOMMUNotifier *n,
> +                                 uint64_t iova, size_t nb_pages);
> +
> +static void smmuv3_notify_single(SMMUState *s, uint64_t iova)
> +{
> +    SMMUNotifierNode *node;
> +
> +    QLIST_FOREACH(node, &s->notifiers_list, next) {
> +        IOMMUMemoryRegion *mr = &node->sdev->iommu;
> +        IOMMUNotifier *n;
> +
> +        trace_smmuv3_notify_all(node->sdev->iommu.parent_obj.name, iova);
> +        IOMMU_NOTIFIER_FOREACH(n, mr) {
> +            smmuv3_replay_single(mr, n, iova);
> +        }
> +    }
> +}
> +
> +static void smmuv3_notify_range(SMMUState *s, uint64_t iova, size_t size)
> +{
> +    SMMUNotifierNode *node;
> +
> +    QLIST_FOREACH(node, &s->notifiers_list, next) {
> +        IOMMUMemoryRegion *mr = &node->sdev->iommu;
> +        IOMMUNotifier *n;
> +
> +        trace_smmuv3_notify_all(node->sdev->iommu.parent_obj.name, iova);
> +        IOMMU_NOTIFIER_FOREACH(n, mr) {
> +            smmuv3_replay_range(mr, n, iova, size);
> +        }
> +    }
> +}
> +
>  static int smmu_cmdq_consume(SMMUV3State *s)
>  {
>      uint32_t error = SMMU_CMD_ERR_NONE;
> @@ -178,28 +244,38 @@ static int smmu_cmdq_consume(SMMUV3State *s)
>              break;
>          case SMMU_CMD_PREFETCH_CONFIG:
>          case SMMU_CMD_PREFETCH_ADDR:
> +            break;
>          case SMMU_CMD_CFGI_STE:
>          {
>               uint32_t streamid = cmd.word[1];
>  
>               trace_smmuv3_cmdq_cfgi_ste(streamid);
> -            break;
> +             smmuv3_context_device_invalidate(&s->smmu_state, streamid);
> +             break;
>          }
>          case SMMU_CMD_CFGI_STE_RANGE: /* same as SMMU_CMD_CFGI_ALL */
>          {
> -            uint32_t start = cmd.word[1], range, end;
> +            uint32_t start = cmd.word[1], range, end, i;
>  
>              range = extract32(cmd.word[2], 0, 5);
>              end = start + (1 << (range + 1)) - 1;
>              trace_smmuv3_cmdq_cfgi_ste_range(start, end);
> +            for (i = start; i <= end; i++) {
> +                smmuv3_context_device_invalidate(&s->smmu_state, i);
> +            }
>              break;
>          }
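
As a worked example of the range arithmetic in the hunk above, the
sketch below computes the first and last stream ID covered by a given
start and range value. SidSpan and sid_span() are hypothetical names
used only for illustration. The sketch deliberately differs from the
patch in one respect: it shifts an unsigned constant and asserts
range < 31, so that 2^(range + 1) stays representable in 32 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: inclusive span of stream IDs. */
typedef struct {
    uint32_t first;
    uint32_t last;
} SidSpan;

/* Mirror "end = start + (1 << (range + 1)) - 1" from the hunk above,
 * using an unsigned shift to avoid signed overflow. */
static SidSpan sid_span(uint32_t start, uint32_t range)
{
    SidSpan s;

    assert(range < 31); /* keep 2^(range + 1) within 32 bits */
    s.first = start;
    s.last = start + (UINT32_C(1) << (range + 1)) - 1;
    return s;
}
```

For instance, start=0x100 with range=3 covers 16 stream IDs, from 0x100
to 0x10f inclusive.
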
>          case SMMU_CMD_CFGI_CD:
>          case SMMU_CMD_CFGI_CD_ALL:
> +        {
> +             uint32_t streamid = cmd.word[1];
> +
> +            smmuv3_context_device_invalidate(&s->smmu_state, streamid);
>              break;
> +        }
>          case SMMU_CMD_TLBI_NH_ALL:
>          case SMMU_CMD_TLBI_NH_ASID:
> -            printf("%s TLBI* replay\n", __func__);
> +            smmuv3_replay_all(&s->smmu_state);
>              break;
>          case SMMU_CMD_TLBI_NH_VA:
>          {
> @@ -210,6 +286,20 @@ static int smmu_cmdq_consume(SMMUV3State *s)
>              uint64_t addr = high << 32 | (low << 12);
>  
>              trace_smmuv3_cmdq_tlbi_nh_va(asid, vmid, addr);
> +            smmuv3_notify_single(&s->smmu_state, addr);
> +            break;
> +        }
> +        case SMMU_CMD_TLBI_NH_VA_AM:
> +        {
> +            int asid = extract32(cmd.word[1], 16, 16);
> +            int am = extract32(cmd.word[1], 0, 16);
> +            uint64_t low = extract32(cmd.word[2], 12, 20);
> +            uint64_t high = cmd.word[3];
> +            uint64_t addr = high << 32 | (low << 12);
> +            size_t size = am << 12;
> +
> +            trace_smmuv3_cmdq_tlbi_nh_va_am(asid, am, addr, size);
> +            smmuv3_notify_range(&s->smmu_state, addr, size);
>              break;
>          }
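
To make the field layout used by this case easier to check, here is a
standalone sketch of the same decoding. field32() is a local stand-in
for the bit-field extraction, and TlbiVaAm and decode_tlbi_nh_va_am()
are hypothetical names. Treating "am" as a count of 4K pages is an
assumption drawn from the "am << 12" in the hunk, not from any
specification.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in: return 'length' bits of 'value' starting at 'start'. */
static uint32_t field32(uint32_t value, unsigned start, unsigned length)
{
    assert(start < 32 && length > 0 && length <= 32 - start);
    return (value >> start) & (UINT32_MAX >> (32 - length));
}

/* Hypothetical container for the decoded command fields. */
typedef struct {
    uint32_t asid;
    uint32_t am;
    uint64_t addr;
    uint64_t size;
} TlbiVaAm;

/* Decode the four 32-bit command words the same way the hunk does. */
static TlbiVaAm decode_tlbi_nh_va_am(const uint32_t word[4])
{
    TlbiVaAm d;
    uint64_t low = field32(word[2], 12, 20);  /* address bits [31:12] */
    uint64_t high = word[3];                  /* address bits [63:32] */

    d.asid = field32(word[1], 16, 16);
    d.am = field32(word[1], 0, 16);
    d.addr = (high << 32) | (low << 12);
    d.size = (uint64_t)d.am << 12;            /* assumed: am in 4K units */
    return d;
}
```

With word[1]=0x00050200, word[2]=0x12345000 and word[3]=0x1, this gives
asid=5, am=0x200, addr=0x112345000 and size=0x200000 (2 MiB).
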
>          case SMMU_CMD_TLBI_NH_VAA:
> @@ -222,6 +312,7 @@ static int smmu_cmdq_consume(SMMUV3State *s)
>          case SMMU_CMD_TLBI_S12_VMALL:
>          case SMMU_CMD_TLBI_S2_IPA:
>          case SMMU_CMD_TLBI_NSNH_ALL:
> +            smmuv3_replay_all(&s->smmu_state);
>              break;
>          case SMMU_CMD_ATC_INV:
>          case SMMU_CMD_PRI_RESP:
> @@ -804,6 +895,172 @@ out:
>      return entry;
>  }
>  
> +static int smmuv3_replay_hook(IOMMUTLBEntry *entry, void *private)
> +{
> +    trace_smmuv3_replay_hook(entry->iova, entry->translated_addr,
> +                             entry->addr_mask, entry->perm);
> +    memory_region_notify_one((IOMMUNotifier *)private, entry);
> +    return 0;
> +}
> +
> +static int smmuv3_map_hook(IOMMUTLBEntry *entry, void *private)
> +{
> +    trace_smmuv3_map_hook(entry->iova, entry->translated_addr,
> +                          entry->addr_mask, entry->perm);
> +    memory_region_notify_one((IOMMUNotifier *)private, entry);
> +    return 0;
> +}
> +
> +/* Unmap the whole range in the notifier's scope. */
> +static void smmuv3_unmap_notifier(SMMUDevice *sdev, IOMMUNotifier *n)
> +{
> +    IOMMUTLBEntry entry;
> +    hwaddr size;
> +    hwaddr start = n->start;
> +    hwaddr end = n->end;
> +
> +    size = end - start + 1;
> +
> +    entry.target_as = &address_space_memory;
> +    /* Adjust iova for the size */
> +    entry.iova = n->start & ~(size - 1);
> +    /* This field is meaningless for unmap */
> +    entry.translated_addr = 0;
> +    entry.perm = IOMMU_NONE;
> +    entry.addr_mask = size - 1;
> +
> +    /* TODO: check start/end/size/mask */
> +
> +    trace_smmuv3_unmap_notifier(pci_bus_num(sdev->bus),
> +                                PCI_SLOT(sdev->devfn),
> +                                PCI_FUNC(sdev->devfn),
> +                                entry.iova, size);
> +
> +    memory_region_notify_one(n, &entry);
> +}
> +
> +static void smmuv3_replay(IOMMUMemoryRegion *mr, IOMMUNotifier *n)
> +{
> +    SMMUDevice *sdev = container_of(mr, SMMUDevice, iommu);
> +    SMMUV3State *s = sdev->smmu;
> +    SMMUBaseClass *sbc = SMMU_DEVICE_GET_CLASS(s);
> +    SMMUTransCfg cfg = {};
> +    int ret;
> +
> +    smmuv3_unmap_notifier(sdev, n);
> +
> +    ret = smmuv3_decode_config(mr, &cfg);
> +    if (ret) {
> +        error_report("%s error decoding the configuration for iommu mr=%s",
> +                     __func__, mr->parent_obj.name);
> +    }
> +
> +    if (cfg.disabled || cfg.bypassed) {
> +        return;
> +    }
> +    /* is the smmu enabled */
> +    sbc->page_walk_64(&cfg, 0, (1ULL << (64 - cfg.tsz)) - 1, false,
> +                      smmuv3_replay_hook, n);
> +}
> +static void smmuv3_replay_range(IOMMUMemoryRegion *mr, IOMMUNotifier *n,
> +                                 uint64_t iova, size_t size)
> +{
> +    SMMUDevice *sdev = container_of(mr, SMMUDevice, iommu);
> +    SMMUV3State *s = sdev->smmu;
> +    SMMUBaseClass *sbc = SMMU_DEVICE_GET_CLASS(s);
> +    SMMUTransCfg cfg = {};
> +    IOMMUTLBEntry entry;
> +    int ret;
> +
> +    trace_smmuv3_replay_range(mr->parent_obj.name, iova, size, n);
> +    ret = smmuv3_decode_config(mr, &cfg);
> +    if (ret) {
> +        error_report("%s error decoding the configuration for iommu mr=%s",
> +                     __func__, mr->parent_obj.name);
> +    }
> +
> +    if (cfg.disabled || cfg.bypassed) {
> +        return;
> +    }
> +
> +    /* first unmap */
> +    entry.target_as = &address_space_memory;
> +    entry.iova = iova & ~(size - 1);
> +    entry.addr_mask = size - 1;
> +    entry.perm = IOMMU_NONE;
> +
> +    memory_region_notify_one(n, &entry);
> +
> +    /* then figure out if a new mapping needs to be applied */
> +    sbc->page_walk_64(&cfg, iova, iova + entry.addr_mask , false,
> +                      smmuv3_map_hook, n);
> +}
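
The unmap entry built above relies on mask arithmetic that only yields
a single aligned block when size is a power of two. The sketch below
isolates that arithmetic. UnmapSpan, is_pow2() and unmap_span() are
hypothetical names, and the assertion is an addition of the sketch; the
patch itself does not check the size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical pair mirroring the iova/addr_mask fields set above. */
typedef struct {
    uint64_t iova;
    uint64_t addr_mask;
} UnmapSpan;

/* True when exactly one bit of v is set. */
static bool is_pow2(uint64_t v)
{
    return v != 0 && (v & (v - 1)) == 0;
}

/* Align iova down to a multiple of size and derive the mask, as in
 * "entry.iova = iova & ~(size - 1); entry.addr_mask = size - 1;". */
static UnmapSpan unmap_span(uint64_t iova, uint64_t size)
{
    UnmapSpan u;

    assert(is_pow2(size)); /* added by the sketch, not in the patch */
    u.iova = iova & ~(size - 1);
    u.addr_mask = size - 1;
    return u;
}
```

For example, iova=0x112345000 with size=0x200000 gives an unmap entry
starting at 0x112200000 with addr_mask=0x1fffff.
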
> +
> +static void smmuv3_replay_single(IOMMUMemoryRegion *mr, IOMMUNotifier *n,
> +                                 uint64_t iova)
> +{
> +    SMMUDevice *sdev = container_of(mr, SMMUDevice, iommu);
> +    SMMUV3State *s = sdev->smmu;
> +    size_t target_page_size = qemu_target_page_size();
> +    SMMUBaseClass *sbc = SMMU_DEVICE_GET_CLASS(s);
> +    SMMUTransCfg cfg = {};
> +    IOMMUTLBEntry entry;
> +    int ret;
> +
> +    trace_smmuv3_replay_single(mr->parent_obj.name, iova, n);
> +    ret = smmuv3_decode_config(mr, &cfg);
> +    if (ret) {
> +        error_report("%s error decoding the configuration for iommu mr=%s",
> +                     __func__, mr->parent_obj.name);
> +    }
> +
> +    if (cfg.disabled || cfg.bypassed) {
> +        return;
> +    }
> +
> +    /* first unmap */
> +    entry.target_as = &address_space_memory;
> +    entry.iova = iova & ~(target_page_size - 1);
> +    entry.addr_mask = target_page_size - 1;
> +    entry.perm = IOMMU_NONE;
> +
> +    memory_region_notify_one(n, &entry);
> +
> +    /* then figure out if a new mapping needs to be applied */
> +    sbc->page_walk_64(&cfg, iova, iova + 1, false,
> +                      smmuv3_map_hook, n);
> +}
> +
> +static void smmuv3_notify_flag_changed(IOMMUMemoryRegion *iommu,
> +                                       IOMMUNotifierFlag old,
> +                                       IOMMUNotifierFlag new)
> +{
> +    SMMUDevice *sdev = container_of(iommu, SMMUDevice, iommu);
> +    SMMUV3State *s3 = sdev->smmu;
> +    SMMUState *s = &(s3->smmu_state);
> +    SMMUNotifierNode *node = NULL;
> +    SMMUNotifierNode *next_node = NULL;
> +
> +    if (old == IOMMU_NOTIFIER_NONE) {
> +        trace_smmuv3_notify_flag_add(iommu->parent_obj.name);
> +        node = g_malloc0(sizeof(*node));
> +        node->sdev = sdev;
> +        QLIST_INSERT_HEAD(&s->notifiers_list, node, next);
> +        return;
> +    }
> +
> +    /* update notifier node with new flags */
> +    QLIST_FOREACH_SAFE(node, &s->notifiers_list, next, next_node) {
> +        if (node->sdev == sdev) {
> +            if (new == IOMMU_NOTIFIER_NONE) {
> +                trace_smmuv3_notify_flag_del(iommu->parent_obj.name);
> +                QLIST_REMOVE(node, next);
> +                g_free(node);
> +            }
> +            return;
> +        }
> +    }
> +}
>  
>  static inline void smmu_update_base_reg(SMMUV3State *s, uint64_t *base,
>                                          uint64_t val)
> @@ -1125,6 +1382,8 @@ static void smmuv3_iommu_memory_region_class_init(ObjectClass *klass,
>      IOMMUMemoryRegionClass *imrc = IOMMU_MEMORY_REGION_CLASS(klass);
>  
>      imrc->translate = smmuv3_translate;
> +    imrc->notify_flag_changed = smmuv3_notify_flag_changed;
> +    imrc->replay = smmuv3_replay;
>  }
>  
>  static const TypeInfo smmuv3_type_info = {
> diff --git a/hw/arm/trace-events b/hw/arm/trace-events
> index f9b9cbe..8228e26 100644
> --- a/hw/arm/trace-events
> +++ b/hw/arm/trace-events
> @@ -27,6 +27,7 @@ smmuv3_cmdq_opcode(const char *opcode) "<--- %s"
>  smmuv3_cmdq_cfgi_ste(int streamid) "     |_ streamid =%d"
>  smmuv3_cmdq_cfgi_ste_range(int start, int end) "     |_ start=0x%d - end=0x%d"
>  smmuv3_cmdq_tlbi_nh_va(int asid, int vmid, uint64_t addr) "     |_ asid =%d vmid =%d addr=0x%"PRIx64
> +smmuv3_cmdq_tlbi_nh_va_am(int asid, int am, size_t size, uint64_t addr) "     |_ asid =%d am =%d size=0x%lx addr=0x%"PRIx64
>  smmuv3_cmdq_consume_sev(void) "CMD_SYNC CS=SEV not supported, ignoring"
>  smmuv3_cmdq_consume_out(uint8_t prod_wrap, uint32_t prod, uint8_t cons_wrap, uint32_t cons) "prod_wrap:%d, prod:0x%x cons_wrap:%d cons:0x%x"
>  smmuv3_update(bool is_empty, uint32_t prod, uint32_t cons, uint8_t prod_wrap, uint8_t cons_wrap) "q empty:%d prod:%d cons:%d p.wrap:%d p.cons:%d"
> @@ -50,3 +51,16 @@ smmuv3_dump_ste(int i, uint32_t word0, int j,  uint32_t word1) "STE[%2d]: 0x%x\t
>  smmuv3_dump_cd(int i, uint32_t word0, int j,  uint32_t word1) "CD[%2d]: 0x%x\t CD[%2d]: 0x%x"
>  smmuv3_dump_cmd(int i, uint32_t word0, int j,  uint32_t word1) "CMD[%2d]: 0x%x\t CMD[%2d]: 0x%x"
>  smmuv3_cfg_stage(int s, uint32_t oas, uint32_t tsz, uint64_t ttbr, bool aa64, uint32_t granule_sz, int initial_level) "TransCFG stage:%d oas:%d tsz:%d ttbr:0x%"PRIx64"  aa64:%d granule_sz:%d, initial_level = %d"
> +
> +smmuv3_replay(uint16_t sid, bool enabled) "sid=%d, enabled=%d"
> +smmuv3_replay_hook(hwaddr iova, hwaddr pa, hwaddr mask, int perm) "iova=0x%"PRIx64" pa=0x%" PRIx64" mask=0x%"PRIx64" perm=%d"
> +smmuv3_map_hook(hwaddr iova, hwaddr pa, hwaddr mask, int perm) "iova=0x%"PRIx64" pa=0x%" PRIx64" mask=0x%"PRIx64" perm=%d"
> +smmuv3_notify_flag_add(const char *iommu) "ADD SMMUNotifier node for iommu mr=%s"
> +smmuv3_notify_flag_del(const char *iommu) "DEL SMMUNotifier node for iommu mr=%s"
> +smmuv3_replay_single(const char *name, uint64_t iova, void *n) "iommu mr=%s iova=0x%"PRIx64" n=%p"
> +smmuv3_replay_range(const char *name, uint64_t iova, size_t size, void *n) "iommu mr=%s iova=0x%"PRIx64" size=0x%lx n=%p"
> +smmuv3_replay_all(const char *name) "iommu mr=%s"
> +smmuv3_notify_all(const char *name, uint64_t iova) "iommu mr=%s iova=0x%"PRIx64
> +smmuv3_unmap_notifier(uint8_t bus, uint8_t slot, uint8_t fn, uint64_t iova, uint64_t size) "Device %02x:%02x.%x start 0x%"PRIx64" size 0x%"PRIx64
> +smmuv3_context_device_invalidate(uint32_t sid) "sid=%d"
> +
> -- 
> 2.5.5
> 
> 

-- 
Linu cherian

* Re: [Qemu-devel] [Qemu-arm] [RFC v6 8/9] hw/arm/smmuv3: VFIO integration
  2017-08-23  4:24   ` Linu Cherian
@ 2017-08-23  6:39     ` Auger Eric
  2017-08-25 17:13       ` Michael S. Tsirkin
  0 siblings, 1 reply; 15+ messages in thread
From: Auger Eric @ 2017-08-23  6:39 UTC (permalink / raw)
  To: Linu Cherian
  Cc: peter.maydell, drjones, tcain, Radha.Chintakuntla, Sunil.Goutham,
	mohun106, jean-philippe.brucker, tn, bharat.bhushan, mst,
	will.deacon, qemu-devel, peterx, alex.williamson, qemu-arm,
	christoffer.dall, linu.cherian, robin.murphy, prem.mallappa,
	eric.auger.pro

Hi Linu,

On 23/08/2017 06:24, Linu Cherian wrote:
> Hi Eric,
> 
> 
> On Fri Aug 11, 2017 at 04:22:33PM +0200, Eric Auger wrote:
>> This patch allows doing PCIe passthrough with a guest exposed
>> with a vSMMUv3. It implements the replay and notify_flag_changed
>> iommu ops. Also on TLB and data structure invalidation commands,
>> we replay the mappings so that the physical IOMMU implements
>> updated stage 1 settings (Guest IOVA -> Guest PA) + stage 2 settings.
>>
>> This works only if the guest smmuv3 driver implements the
>> "tlbi-on-map" option.
>>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> 
> I tried launching a guest with the QEMU option "-machine virt-2.10,smmu"
> and a 1G Ethernet controller as a vfio-pci device. It works fine for me.

Sorry, I forgot to update the cover letter. You need to use -machine
virt-2.11,smmu for the instantiation: the 2.10 machine type was released
as part of QEMU 2.10, and those changes only apply to the 2.11 virt
machine introduced in this series. Apologies for the pain.

I am going to release a new version this week fixing my last DPDK bug
(at least the last one I am aware of). In this new version, the
instantiation method will change to -device smmuv3, which is closer to
what is done on Intel.

By the way, I will also take time to provide some more information about
the VFIO integration as it is implemented in this series, although the
latter may evolve because of the NAK on the kernel FW quirk.

Thank you for testing!

Best Regards

Eric
> 
> QEMU source: https://github.com/eauger/qemu.git Branch: v2.10.0-rc2-SMMU-v6
> 
> However, I had to make this change:
> 
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -1806,7 +1806,7 @@ static void virt_machine_2_10_options(MachineClass *mc)
>      virt_machine_2_11_options(mc);
>      SET_MACHINE_COMPAT(mc, VIRT_COMPAT_2_10);
> 
> -    vmc->no_smmu = true;
> +    vmc->no_smmu = false;
>  }
>  DEFINE_VIRT_MACHINE(2, 10)
> 
> so that QEMU doesn't complain about "Property .smmu not found".
> 
> I will let you know if I have updates from further testing.
> 
> Thanks.
> 
>>
>> ---
>>
>> v5 -> v6:
>> - use IOMMUMemoryRegion
>> - handle implementation defined SMMU_CMD_TLBI_NH_VA_AM cmd
>>   (goes along with TLBI_ON_MAP FW quirk)
>> - replay systematically unmap the whole range first
>> - smmuv3_map_hook does not unmap anymore and the unmap is done
>>   before the replay
>> - add and use smmuv3_context_device_invalidate instead of
>>   blindly replaying everything
>> ---
>>  hw/arm/smmuv3-internal.h |   1 +
>>  hw/arm/smmuv3.c          | 265 ++++++++++++++++++++++++++++++++++++++++++++++-
>>  hw/arm/trace-events      |  14 +++
>>  3 files changed, 277 insertions(+), 3 deletions(-)
>>
>> diff --git a/hw/arm/smmuv3-internal.h b/hw/arm/smmuv3-internal.h
>> index e255df1..ac4628f 100644
>> --- a/hw/arm/smmuv3-internal.h
>> +++ b/hw/arm/smmuv3-internal.h
>> @@ -344,6 +344,7 @@ enum {
>>      SMMU_CMD_RESUME          = 0x44,
>>      SMMU_CMD_STALL_TERM,
>>      SMMU_CMD_SYNC,          /* 0x46 */
>> +    SMMU_CMD_TLBI_NH_VA_AM   = 0x8F, /* VIOMMU Impl Defined */
>>  };
>>  
>>  static const char *cmd_stringify[] = {
>> diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
>> index e195a0e..89fb116 100644
>> --- a/hw/arm/smmuv3.c
>> +++ b/hw/arm/smmuv3.c
>> @@ -25,6 +25,7 @@
>>  #include "exec/address-spaces.h"
>>  #include "trace.h"
>>  #include "qemu/error-report.h"
>> +#include "exec/target_page.h"
>>  
>>  #include "hw/arm/smmuv3.h"
>>  #include "smmuv3-internal.h"
>> @@ -143,6 +144,71 @@ static MemTxResult smmu_read_cmdq(SMMUV3State *s, Cmd *cmd)
>>      return ret;
>>  }
>>  
>> +static void smmuv3_replay_all(SMMUState *s)
>> +{
>> +    SMMUNotifierNode *node;
>> +
>> +    QLIST_FOREACH(node, &s->notifiers_list, next) {
>> +        trace_smmuv3_replay_all(node->sdev->iommu.parent_obj.name);
>> +        memory_region_iommu_replay_all(&node->sdev->iommu);
>> +    }
>> +}
>> +
>> +/* Replay the mappings for a given streamid */
>> +static void smmuv3_context_device_invalidate(SMMUState *s, uint16_t sid)
>> +{
>> +    uint8_t bus_n, devfn;
>> +    SMMUPciBus *smmu_bus;
>> +    SMMUDevice *smmu;
>> +
>> +    trace_smmuv3_context_device_invalidate(sid);
>> +    bus_n = PCI_BUS_NUM(sid);
>> +    smmu_bus = smmu_find_as_from_bus_num(s, bus_n);
>> +    if (smmu_bus) {
>> +        devfn = PCI_FUNC(sid);
>> +        smmu = smmu_bus->pbdev[devfn];
>> +        if (smmu) {
>> +            memory_region_iommu_replay_all(&smmu->iommu);
>> +        }
>> +    }
>> +}
>> +
>> +static void smmuv3_replay_single(IOMMUMemoryRegion *mr, IOMMUNotifier *n,
>> +                                 uint64_t iova);
>> +
>> +static void smmuv3_replay_range(IOMMUMemoryRegion *mr, IOMMUNotifier *n,
>> +                                 uint64_t iova, size_t nb_pages);
>> +
>> +static void smmuv3_notify_single(SMMUState *s, uint64_t iova)
>> +{
>> +    SMMUNotifierNode *node;
>> +
>> +    QLIST_FOREACH(node, &s->notifiers_list, next) {
>> +        IOMMUMemoryRegion *mr = &node->sdev->iommu;
>> +        IOMMUNotifier *n;
>> +
>> +        trace_smmuv3_notify_all(node->sdev->iommu.parent_obj.name, iova);
>> +        IOMMU_NOTIFIER_FOREACH(n, mr) {
>> +            smmuv3_replay_single(mr, n, iova);
>> +        }
>> +    }
>> +}
>> +
>> +static void smmuv3_notify_range(SMMUState *s, uint64_t iova, size_t size)
>> +{
>> +    SMMUNotifierNode *node;
>> +
>> +    QLIST_FOREACH(node, &s->notifiers_list, next) {
>> +        IOMMUMemoryRegion *mr = &node->sdev->iommu;
>> +        IOMMUNotifier *n;
>> +
>> +        trace_smmuv3_notify_all(node->sdev->iommu.parent_obj.name, iova);
>> +        IOMMU_NOTIFIER_FOREACH(n, mr) {
>> +            smmuv3_replay_range(mr, n, iova, size);
>> +        }
>> +    }
>> +}
>> +
>>  static int smmu_cmdq_consume(SMMUV3State *s)
>>  {
>>      uint32_t error = SMMU_CMD_ERR_NONE;
>> @@ -178,28 +244,38 @@ static int smmu_cmdq_consume(SMMUV3State *s)
>>              break;
>>          case SMMU_CMD_PREFETCH_CONFIG:
>>          case SMMU_CMD_PREFETCH_ADDR:
>> +            break;
>>          case SMMU_CMD_CFGI_STE:
>>          {
>>               uint32_t streamid = cmd.word[1];
>>  
>>               trace_smmuv3_cmdq_cfgi_ste(streamid);
>> -            break;
>> +             smmuv3_context_device_invalidate(&s->smmu_state, streamid);
>> +             break;
>>          }
>>          case SMMU_CMD_CFGI_STE_RANGE: /* same as SMMU_CMD_CFGI_ALL */
>>          {
>> -            uint32_t start = cmd.word[1], range, end;
>> +            uint32_t start = cmd.word[1], range, end, i;
>>  
>>              range = extract32(cmd.word[2], 0, 5);
>>              end = start + (1 << (range + 1)) - 1;
>>              trace_smmuv3_cmdq_cfgi_ste_range(start, end);
>> +            for (i = start; i <= end; i++) {
>> +                smmuv3_context_device_invalidate(&s->smmu_state, i);
>> +            }
>>              break;
>>          }
>>          case SMMU_CMD_CFGI_CD:
>>          case SMMU_CMD_CFGI_CD_ALL:
>> +        {
>> +             uint32_t streamid = cmd.word[1];
>> +
>> +            smmuv3_context_device_invalidate(&s->smmu_state, streamid);
>>              break;
>> +        }
>>          case SMMU_CMD_TLBI_NH_ALL:
>>          case SMMU_CMD_TLBI_NH_ASID:
>> -            printf("%s TLBI* replay\n", __func__);
>> +            smmuv3_replay_all(&s->smmu_state);
>>              break;
>>          case SMMU_CMD_TLBI_NH_VA:
>>          {
>> @@ -210,6 +286,20 @@ static int smmu_cmdq_consume(SMMUV3State *s)
>>              uint64_t addr = high << 32 | (low << 12);
>>  
>>              trace_smmuv3_cmdq_tlbi_nh_va(asid, vmid, addr);
>> +            smmuv3_notify_single(&s->smmu_state, addr);
>> +            break;
>> +        }
>> +        case SMMU_CMD_TLBI_NH_VA_AM:
>> +        {
>> +            int asid = extract32(cmd.word[1], 16, 16);
>> +            int am = extract32(cmd.word[1], 0, 16);
>> +            uint64_t low = extract32(cmd.word[2], 12, 20);
>> +            uint64_t high = cmd.word[3];
>> +            uint64_t addr = high << 32 | (low << 12);
>> +            size_t size = am << 12;
>> +
>> +            trace_smmuv3_cmdq_tlbi_nh_va_am(asid, am, addr, size);
>> +            smmuv3_notify_range(&s->smmu_state, addr, size);
>>              break;
>>          }
>>          case SMMU_CMD_TLBI_NH_VAA:
>> @@ -222,6 +312,7 @@ static int smmu_cmdq_consume(SMMUV3State *s)
>>          case SMMU_CMD_TLBI_S12_VMALL:
>>          case SMMU_CMD_TLBI_S2_IPA:
>>          case SMMU_CMD_TLBI_NSNH_ALL:
>> +            smmuv3_replay_all(&s->smmu_state);
>>              break;
>>          case SMMU_CMD_ATC_INV:
>>          case SMMU_CMD_PRI_RESP:
>> @@ -804,6 +895,172 @@ out:
>>      return entry;
>>  }
>>  
>> +static int smmuv3_replay_hook(IOMMUTLBEntry *entry, void *private)
>> +{
>> +    trace_smmuv3_replay_hook(entry->iova, entry->translated_addr,
>> +                             entry->addr_mask, entry->perm);
>> +    memory_region_notify_one((IOMMUNotifier *)private, entry);
>> +    return 0;
>> +}
>> +
>> +static int smmuv3_map_hook(IOMMUTLBEntry *entry, void *private)
>> +{
>> +    trace_smmuv3_map_hook(entry->iova, entry->translated_addr,
>> +                          entry->addr_mask, entry->perm);
>> +    memory_region_notify_one((IOMMUNotifier *)private, entry);
>> +    return 0;
>> +}
>> +
>> +/* Unmap the whole range in the notifier's scope. */
>> +static void smmuv3_unmap_notifier(SMMUDevice *sdev, IOMMUNotifier *n)
>> +{
>> +    IOMMUTLBEntry entry;
>> +    hwaddr size;
>> +    hwaddr start = n->start;
>> +    hwaddr end = n->end;
>> +
>> +    size = end - start + 1;
>> +
>> +    entry.target_as = &address_space_memory;
>> +    /* Adjust iova for the size */
>> +    entry.iova = n->start & ~(size - 1);
>> +    /* This field is meaningless for unmap */
>> +    entry.translated_addr = 0;
>> +    entry.perm = IOMMU_NONE;
>> +    entry.addr_mask = size - 1;
>> +
>> +    /* TODO: check start/end/size/mask */
>> +
>> +    trace_smmuv3_unmap_notifier(pci_bus_num(sdev->bus),
>> +                                PCI_SLOT(sdev->devfn),
>> +                                PCI_FUNC(sdev->devfn),
>> +                                entry.iova, size);
>> +
>> +    memory_region_notify_one(n, &entry);
>> +}
>> +
>> +static void smmuv3_replay(IOMMUMemoryRegion *mr, IOMMUNotifier *n)
>> +{
>> +    SMMUDevice *sdev = container_of(mr, SMMUDevice, iommu);
>> +    SMMUV3State *s = sdev->smmu;
>> +    SMMUBaseClass *sbc = SMMU_DEVICE_GET_CLASS(s);
>> +    SMMUTransCfg cfg = {};
>> +    int ret;
>> +
>> +    smmuv3_unmap_notifier(sdev, n);
>> +
>> +    ret = smmuv3_decode_config(mr, &cfg);
>> +    if (ret) {
>> +        error_report("%s error decoding the configuration for iommu mr=%s",
>> +                     __func__, mr->parent_obj.name);
>> +    }
>> +
>> +    if (cfg.disabled || cfg.bypassed) {
>> +        return;
>> +    }
>> +    /* is the smmu enabled */
>> +    sbc->page_walk_64(&cfg, 0, (1ULL << (64 - cfg.tsz)) - 1, false,
>> +                      smmuv3_replay_hook, n);
>> +}
>> +static void smmuv3_replay_range(IOMMUMemoryRegion *mr, IOMMUNotifier *n,
>> +                                 uint64_t iova, size_t size)
>> +{
>> +    SMMUDevice *sdev = container_of(mr, SMMUDevice, iommu);
>> +    SMMUV3State *s = sdev->smmu;
>> +    SMMUBaseClass *sbc = SMMU_DEVICE_GET_CLASS(s);
>> +    SMMUTransCfg cfg = {};
>> +    IOMMUTLBEntry entry;
>> +    int ret;
>> +
>> +    trace_smmuv3_replay_range(mr->parent_obj.name, iova, size, n);
>> +    ret = smmuv3_decode_config(mr, &cfg);
>> +    if (ret) {
>> +        error_report("%s error decoding the configuration for iommu mr=%s",
>> +                     __func__, mr->parent_obj.name);
>> +    }
>> +
>> +    if (cfg.disabled || cfg.bypassed) {
>> +        return;
>> +    }
>> +
>> +    /* first unmap */
>> +    entry.target_as = &address_space_memory;
>> +    entry.iova = iova & ~(size - 1);
>> +    entry.addr_mask = size - 1;
>> +    entry.perm = IOMMU_NONE;
>> +
>> +    memory_region_notify_one(n, &entry);
>> +
>> +    /* then figure out if a new mapping needs to be applied */
>> +    sbc->page_walk_64(&cfg, iova, iova + entry.addr_mask , false,
>> +                      smmuv3_map_hook, n);
>> +}
>> +
>> +static void smmuv3_replay_single(IOMMUMemoryRegion *mr, IOMMUNotifier *n,
>> +                                 uint64_t iova)
>> +{
>> +    SMMUDevice *sdev = container_of(mr, SMMUDevice, iommu);
>> +    SMMUV3State *s = sdev->smmu;
>> +    size_t target_page_size = qemu_target_page_size();
>> +    SMMUBaseClass *sbc = SMMU_DEVICE_GET_CLASS(s);
>> +    SMMUTransCfg cfg = {};
>> +    IOMMUTLBEntry entry;
>> +    int ret;
>> +
>> +    trace_smmuv3_replay_single(mr->parent_obj.name, iova, n);
>> +    ret = smmuv3_decode_config(mr, &cfg);
>> +    if (ret) {
>> +        error_report("%s error decoding the configuration for iommu mr=%s",
>> +                     __func__, mr->parent_obj.name);
>> +    }
>> +
>> +    if (cfg.disabled || cfg.bypassed) {
>> +        return;
>> +    }
>> +
>> +    /* first unmap */
>> +    entry.target_as = &address_space_memory;
>> +    entry.iova = iova & ~(target_page_size - 1);
>> +    entry.addr_mask = target_page_size - 1;
>> +    entry.perm = IOMMU_NONE;
>> +
>> +    memory_region_notify_one(n, &entry);
>> +
>> +    /* then figure out if a new mapping needs to be applied */
>> +    sbc->page_walk_64(&cfg, iova, iova + 1, false,
>> +                      smmuv3_map_hook, n);
>> +}
>> +
>> +static void smmuv3_notify_flag_changed(IOMMUMemoryRegion *iommu,
>> +                                       IOMMUNotifierFlag old,
>> +                                       IOMMUNotifierFlag new)
>> +{
>> +    SMMUDevice *sdev = container_of(iommu, SMMUDevice, iommu);
>> +    SMMUV3State *s3 = sdev->smmu;
>> +    SMMUState *s = &(s3->smmu_state);
>> +    SMMUNotifierNode *node = NULL;
>> +    SMMUNotifierNode *next_node = NULL;
>> +
>> +    if (old == IOMMU_NOTIFIER_NONE) {
>> +        trace_smmuv3_notify_flag_add(iommu->parent_obj.name);
>> +        node = g_malloc0(sizeof(*node));
>> +        node->sdev = sdev;
>> +        QLIST_INSERT_HEAD(&s->notifiers_list, node, next);
>> +        return;
>> +    }
>> +
>> +    /* update notifier node with new flags */
>> +    QLIST_FOREACH_SAFE(node, &s->notifiers_list, next, next_node) {
>> +        if (node->sdev == sdev) {
>> +            if (new == IOMMU_NOTIFIER_NONE) {
>> +                trace_smmuv3_notify_flag_del(iommu->parent_obj.name);
>> +                QLIST_REMOVE(node, next);
>> +                g_free(node);
>> +            }
>> +            return;
>> +        }
>> +    }
>> +}
>>  
>>  static inline void smmu_update_base_reg(SMMUV3State *s, uint64_t *base,
>>                                          uint64_t val)
>> @@ -1125,6 +1382,8 @@ static void smmuv3_iommu_memory_region_class_init(ObjectClass *klass,
>>      IOMMUMemoryRegionClass *imrc = IOMMU_MEMORY_REGION_CLASS(klass);
>>  
>>      imrc->translate = smmuv3_translate;
>> +    imrc->notify_flag_changed = smmuv3_notify_flag_changed;
>> +    imrc->replay = smmuv3_replay;
>>  }
>>  
>>  static const TypeInfo smmuv3_type_info = {
>> diff --git a/hw/arm/trace-events b/hw/arm/trace-events
>> index f9b9cbe..8228e26 100644
>> --- a/hw/arm/trace-events
>> +++ b/hw/arm/trace-events
>> @@ -27,6 +27,7 @@ smmuv3_cmdq_opcode(const char *opcode) "<--- %s"
>>  smmuv3_cmdq_cfgi_ste(int streamid) "     |_ streamid =%d"
>>  smmuv3_cmdq_cfgi_ste_range(int start, int end) "     |_ start=0x%d - end=0x%d"
>>  smmuv3_cmdq_tlbi_nh_va(int asid, int vmid, uint64_t addr) "     |_ asid =%d vmid =%d addr=0x%"PRIx64
>> +smmuv3_cmdq_tlbi_nh_va_am(int asid, int am, size_t size, uint64_t addr) "     |_ asid =%d am =%d size=0x%lx addr=0x%"PRIx64
>>  smmuv3_cmdq_consume_sev(void) "CMD_SYNC CS=SEV not supported, ignoring"
>>  smmuv3_cmdq_consume_out(uint8_t prod_wrap, uint32_t prod, uint8_t cons_wrap, uint32_t cons) "prod_wrap:%d, prod:0x%x cons_wrap:%d cons:0x%x"
>>  smmuv3_update(bool is_empty, uint32_t prod, uint32_t cons, uint8_t prod_wrap, uint8_t cons_wrap) "q empty:%d prod:%d cons:%d p.wrap:%d p.cons:%d"
>> @@ -50,3 +51,16 @@ smmuv3_dump_ste(int i, uint32_t word0, int j,  uint32_t word1) "STE[%2d]: 0x%x\t
>>  smmuv3_dump_cd(int i, uint32_t word0, int j,  uint32_t word1) "CD[%2d]: 0x%x\t CD[%2d]: 0x%x"
>>  smmuv3_dump_cmd(int i, uint32_t word0, int j,  uint32_t word1) "CMD[%2d]: 0x%x\t CMD[%2d]: 0x%x"
>>  smmuv3_cfg_stage(int s, uint32_t oas, uint32_t tsz, uint64_t ttbr, bool aa64, uint32_t granule_sz, int initial_level) "TransCFG stage:%d oas:%d tsz:%d ttbr:0x%"PRIx64"  aa64:%d granule_sz:%d, initial_level = %d"
>> +
>> +smmuv3_replay(uint16_t sid, bool enabled) "sid=%d, enabled=%d"
>> +smmuv3_replay_hook(hwaddr iova, hwaddr pa, hwaddr mask, int perm) "iova=0x%"PRIx64" pa=0x%" PRIx64" mask=0x%"PRIx64" perm=%d"
>> +smmuv3_map_hook(hwaddr iova, hwaddr pa, hwaddr mask, int perm) "iova=0x%"PRIx64" pa=0x%" PRIx64" mask=0x%"PRIx64" perm=%d"
>> +smmuv3_notify_flag_add(const char *iommu) "ADD SMMUNotifier node for iommu mr=%s"
>> +smmuv3_notify_flag_del(const char *iommu) "DEL SMMUNotifier node for iommu mr=%s"
>> +smmuv3_replay_single(const char *name, uint64_t iova, void *n) "iommu mr=%s iova=0x%"PRIx64" n=%p"
>> +smmuv3_replay_range(const char *name, uint64_t iova, size_t size, void *n) "iommu mr=%s iova=0x%"PRIx64" size=0x%lx n=%p"
>> +smmuv3_replay_all(const char *name) "iommu mr=%s"
>> +smmuv3_notify_all(const char *name, uint64_t iova) "iommu mr=%s iova=0x%"PRIx64
>> +smmuv3_unmap_notifier(uint8_t bus, uint8_t slot, uint8_t fn, uint64_t iova, uint64_t size) "Device %02x:%02x.%x start 0x%"PRIx64" size 0x%"PRIx64
>> +smmuv3_context_device_invalidate(uint32_t sid) "sid=%d"
>> +
>> -- 
>> 2.5.5
>>
>>
> 
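
For reference, the field decoding done by the SMMU_CMD_TLBI_NH_VA_AM case
quoted above can be sketched as standalone C. This is only an illustration
under stated assumptions, not the code from the series: extract32_sketch(),
struct tlbi_va_am and decode_tlbi_nh_va_am() are hypothetical names, and the
field layout (ASID in word[1] bits [31:16], AM in word[1] bits [15:0],
address bits [31:12] in word[2], address bits [63:32] in word[3]) is taken
solely from the quoted hunk.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for QEMU's extract32(): 'length' bits from bit 'start'. */
static uint32_t extract32_sketch(uint32_t value, int start, int length)
{
    assert(start >= 0 && length > 0 && length <= 32 - start);
    return (value >> start) & (~0U >> (32 - length));
}

/* Decoded view of the implementation-defined command (names are assumptions). */
struct tlbi_va_am {
    uint16_t asid;  /* address space ID */
    uint16_t am;    /* number of 4K pages covered by the invalidation */
    uint64_t addr;  /* page-aligned start address */
    uint64_t size;  /* am converted to bytes */
};

static struct tlbi_va_am decode_tlbi_nh_va_am(const uint32_t word[4])
{
    struct tlbi_va_am d;
    uint64_t low = extract32_sketch(word[2], 12, 20);
    uint64_t high = word[3];

    d.asid = (uint16_t)extract32_sketch(word[1], 16, 16);
    d.am = (uint16_t)extract32_sketch(word[1], 0, 16);
    /* Rebuild the 64-bit address from its two halves. */
    d.addr = (high << 32) | (low << 12);
    /* Widen before shifting so the byte count is computed in 64 bits. */
    d.size = (uint64_t)d.am << 12;
    return d;
}
```

With word[1] = 0x00050200, word[2] = 0x40000abc and word[3] = 0x1, this
yields asid 5, am 0x200, addr 0x140000000 and a size of 0x200000 bytes,
i.e. one 2M hugepage.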


* Re: [Qemu-devel] [Qemu-arm] [RFC v6 8/9] hw/arm/smmuv3: VFIO integration
  2017-08-23  6:39     ` Auger Eric
@ 2017-08-25 17:13       ` Michael S. Tsirkin
  0 siblings, 0 replies; 15+ messages in thread
From: Michael S. Tsirkin @ 2017-08-25 17:13 UTC (permalink / raw)
  To: Auger Eric
  Cc: Linu Cherian, peter.maydell, drjones, tcain, Radha.Chintakuntla,
	Sunil.Goutham, mohun106, jean-philippe.brucker, tn,
	bharat.bhushan, will.deacon, qemu-devel, peterx, alex.williamson,
	qemu-arm, christoffer.dall, linu.cherian, robin.murphy,
	prem.mallappa, eric.auger.pro

On Wed, Aug 23, 2017 at 08:39:18AM +0200, Auger Eric wrote:
> Hi Linu,
> 
> On 23/08/2017 06:24, Linu Cherian wrote:
> > Hi Eric,
> > 
> > 
> > On Fri Aug 11, 2017 at 04:22:33PM +0200, Eric Auger wrote:
> >> This patch allows doing PCIe passthrough with a guest exposed
> >> with a vSMMUv3. It implements the replay and notify_flag_changed
> >> iommu ops. Also on TLB and data structure invalidation commands,
> >> we replay the mappings so that the physical IOMMU implements
> >> updated stage 1 settings (Guest IOVA -> Guest PA) + stage 2 settings.
> >>
> >> This works only if the guest smmuv3 driver implements the
> >> "tlbi-on-map" option.
> >>
> >> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> > 
> > I tried launching a guest with the QEMU option "-machine virt-2.10,smmu"
> > and a 1G Ethernet controller as a vfio-pci device. It works fine for me.
> 
> Hum sorry, I forgot to update the cover letter. You need to use -machine
> virt-2.11,smmu for the instantiation, as the 2.10 machine was released as
> part of 2.10 and those changes only apply to the 2.11 mach-virt introduced
> in this series. Apologies for the pain.
> 
> I am going to release a new version this week fixing my last DPDK bug
> (at least the last one I am aware of). In this new version, the
> instantiation method will change to -device smmuv3, which is closer to
> what is done on Intel.
>
> By the way, I will also take time to provide some more info about the
> VFIO integration as it is implemented in this series, although the
> latter may evolve due to the NAK of the kernel FW quirk.

Yes, I think changing the compat string to avoid pretending this is smmuv3
was a hard requirement. So you might need to do -device qemu-smmuv3
if you want VFIO support with this PV quirk.

For now, how about a subset of the functionality with VFIO disabled?
People can use it with virtio for now.


> Thank you for testing!
> 
> Best Regards
> 
> Eric
> > 
> > Qemu source: https://github.com/eauger/qemu.git Branch: v2.10.0-rc2-SMMU-v6
> > 
> > But I had to make this change:
> > 
> > --- a/hw/arm/virt.c
> > +++ b/hw/arm/virt.c
> > @@ -1806,7 +1806,7 @@ static void virt_machine_2_10_options(MachineClass *mc)
> >      virt_machine_2_11_options(mc);
> >      SET_MACHINE_COMPAT(mc, VIRT_COMPAT_2_10);
> >  
> > -    vmc->no_smmu = true;
> > +    vmc->no_smmu = false;
> >  }
> >  DEFINE_VIRT_MACHINE(2, 10)
> > 
> > so that QEMU doesn't complain about "Property .smmu not found".
> > 
> > I will let you know if I have updates from further testing.
> > 
> > Thanks.
> > 
> >>
> >> ---
> >>
> >> v5 -> v6:
> >> - use IOMMUMemoryRegion
> >> - handle implementation defined SMMU_CMD_TLBI_NH_VA_AM cmd
> >>   (goes along with TLBI_ON_MAP FW quirk)
> >> - replay systematically unmap the whole range first
> >> - smmuv3_map_hook does not unmap anymore and the unmap is done
> >>   before the replay
> >> - add and use smmuv3_context_device_invalidate instead of
> >>   blindly replaying everything
> >> ---
> >>  hw/arm/smmuv3-internal.h |   1 +
> >>  hw/arm/smmuv3.c          | 265 ++++++++++++++++++++++++++++++++++++++++++++++-
> >>  hw/arm/trace-events      |  14 +++
> >>  3 files changed, 277 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/hw/arm/smmuv3-internal.h b/hw/arm/smmuv3-internal.h
> >> index e255df1..ac4628f 100644
> >> --- a/hw/arm/smmuv3-internal.h
> >> +++ b/hw/arm/smmuv3-internal.h
> >> @@ -344,6 +344,7 @@ enum {
> >>      SMMU_CMD_RESUME          = 0x44,
> >>      SMMU_CMD_STALL_TERM,
> >>      SMMU_CMD_SYNC,          /* 0x46 */
> >> +    SMMU_CMD_TLBI_NH_VA_AM   = 0x8F, /* VIOMMU Impl Defined */
> >>  };
> >>  
> >>  static const char *cmd_stringify[] = {
> >> diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
> >> index e195a0e..89fb116 100644
> >> --- a/hw/arm/smmuv3.c
> >> +++ b/hw/arm/smmuv3.c
> >> @@ -25,6 +25,7 @@
> >>  #include "exec/address-spaces.h"
> >>  #include "trace.h"
> >>  #include "qemu/error-report.h"
> >> +#include "exec/target_page.h"
> >>  
> >>  #include "hw/arm/smmuv3.h"
> >>  #include "smmuv3-internal.h"
> >> @@ -143,6 +144,71 @@ static MemTxResult smmu_read_cmdq(SMMUV3State *s, Cmd *cmd)
> >>      return ret;
> >>  }
> >>  
> >> +static void smmuv3_replay_all(SMMUState *s)
> >> +{
> >> +    SMMUNotifierNode *node;
> >> +
> >> +    QLIST_FOREACH(node, &s->notifiers_list, next) {
> >> +        trace_smmuv3_replay_all(node->sdev->iommu.parent_obj.name);
> >> +        memory_region_iommu_replay_all(&node->sdev->iommu);
> >> +    }
> >> +}
> >> +
> >> +/* Replay the mappings for a given streamid */
> >> +static void smmuv3_context_device_invalidate(SMMUState *s, uint16_t sid)
> >> +{
> >> +    uint8_t bus_n, devfn;
> >> +    SMMUPciBus *smmu_bus;
> >> +    SMMUDevice *smmu;
> >> +
> >> +    trace_smmuv3_context_device_invalidate(sid);
> >> +    bus_n = PCI_BUS_NUM(sid);
> >> +    smmu_bus = smmu_find_as_from_bus_num(s, bus_n);
> >> +    if (smmu_bus) {
> >> +        devfn = PCI_FUNC(sid);
> >> +        smmu = smmu_bus->pbdev[devfn];
> >> +        if (smmu) {
> >> +            memory_region_iommu_replay_all(&smmu->iommu);
> >> +        }
> >> +    }
> >> +}
> >> +
> >> +static void smmuv3_replay_single(IOMMUMemoryRegion *mr, IOMMUNotifier *n,
> >> +                                 uint64_t iova);
> >> +
> >> +static void smmuv3_replay_range(IOMMUMemoryRegion *mr, IOMMUNotifier *n,
> >> +                                 uint64_t iova, size_t nb_pages);
> >> +
> >> +static void smmuv3_notify_single(SMMUState *s, uint64_t iova)
> >> +{
> >> +    SMMUNotifierNode *node;
> >> +
> >> +    QLIST_FOREACH(node, &s->notifiers_list, next) {
> >> +        IOMMUMemoryRegion *mr = &node->sdev->iommu;
> >> +        IOMMUNotifier *n;
> >> +
> >> +        trace_smmuv3_notify_all(node->sdev->iommu.parent_obj.name, iova);
> >> +        IOMMU_NOTIFIER_FOREACH(n, mr) {
> >> +            smmuv3_replay_single(mr, n, iova);
> >> +        }
> >> +    }
> >> +}
> >> +
> >> +static void smmuv3_notify_range(SMMUState *s, uint64_t iova, size_t size)
> >> +{
> >> +    SMMUNotifierNode *node;
> >> +
> >> +    QLIST_FOREACH(node, &s->notifiers_list, next) {
> >> +        IOMMUMemoryRegion *mr = &node->sdev->iommu;
> >> +        IOMMUNotifier *n;
> >> +
> >> +        trace_smmuv3_notify_all(node->sdev->iommu.parent_obj.name, iova);
> >> +        IOMMU_NOTIFIER_FOREACH(n, mr) {
> >> +            smmuv3_replay_range(mr, n, iova, size);
> >> +        }
> >> +    }
> >> +}
> >> +
> >>  static int smmu_cmdq_consume(SMMUV3State *s)
> >>  {
> >>      uint32_t error = SMMU_CMD_ERR_NONE;
> >> @@ -178,28 +244,38 @@ static int smmu_cmdq_consume(SMMUV3State *s)
> >>              break;
> >>          case SMMU_CMD_PREFETCH_CONFIG:
> >>          case SMMU_CMD_PREFETCH_ADDR:
> >> +            break;
> >>          case SMMU_CMD_CFGI_STE:
> >>          {
> >>               uint32_t streamid = cmd.word[1];
> >>  
> >>               trace_smmuv3_cmdq_cfgi_ste(streamid);
> >> -            break;
> >> +             smmuv3_context_device_invalidate(&s->smmu_state, streamid);
> >> +             break;
> >>          }
> >>          case SMMU_CMD_CFGI_STE_RANGE: /* same as SMMU_CMD_CFGI_ALL */
> >>          {
> >> -            uint32_t start = cmd.word[1], range, end;
> >> +            uint32_t start = cmd.word[1], range, end, i;
> >>  
> >>              range = extract32(cmd.word[2], 0, 5);
> >>              end = start + (1 << (range + 1)) - 1;
> >>              trace_smmuv3_cmdq_cfgi_ste_range(start, end);
> >> +            for (i = start; i <= end; i++) {
> >> +                smmuv3_context_device_invalidate(&s->smmu_state, i);
> >> +            }
> >>              break;
> >>          }
> >>          case SMMU_CMD_CFGI_CD:
> >>          case SMMU_CMD_CFGI_CD_ALL:
> >> +        {
> >> +             uint32_t streamid = cmd.word[1];
> >> +
> >> +            smmuv3_context_device_invalidate(&s->smmu_state, streamid);
> >>              break;
> >> +        }
> >>          case SMMU_CMD_TLBI_NH_ALL:
> >>          case SMMU_CMD_TLBI_NH_ASID:
> >> -            printf("%s TLBI* replay\n", __func__);
> >> +            smmuv3_replay_all(&s->smmu_state);
> >>              break;
> >>          case SMMU_CMD_TLBI_NH_VA:
> >>          {
> >> @@ -210,6 +286,20 @@ static int smmu_cmdq_consume(SMMUV3State *s)
> >>              uint64_t addr = high << 32 | (low << 12);
> >>  
> >>              trace_smmuv3_cmdq_tlbi_nh_va(asid, vmid, addr);
> >> +            smmuv3_notify_single(&s->smmu_state, addr);
> >> +            break;
> >> +        }
> >> +        case SMMU_CMD_TLBI_NH_VA_AM:
> >> +        {
> >> +            int asid = extract32(cmd.word[1], 16, 16);
> >> +            int am = extract32(cmd.word[1], 0, 16);
> >> +            uint64_t low = extract32(cmd.word[2], 12, 20);
> >> +            uint64_t high = cmd.word[3];
> >> +            uint64_t addr = high << 32 | (low << 12);
> >> +            size_t size = am << 12;
> >> +
> >> +            trace_smmuv3_cmdq_tlbi_nh_va_am(asid, am, addr, size);
> >> +            smmuv3_notify_range(&s->smmu_state, addr, size);
> >>              break;
> >>          }
> >>          case SMMU_CMD_TLBI_NH_VAA:
> >> @@ -222,6 +312,7 @@ static int smmu_cmdq_consume(SMMUV3State *s)
> >>          case SMMU_CMD_TLBI_S12_VMALL:
> >>          case SMMU_CMD_TLBI_S2_IPA:
> >>          case SMMU_CMD_TLBI_NSNH_ALL:
> >> +            smmuv3_replay_all(&s->smmu_state);
> >>              break;
> >>          case SMMU_CMD_ATC_INV:
> >>          case SMMU_CMD_PRI_RESP:
> >> @@ -804,6 +895,172 @@ out:
> >>      return entry;
> >>  }
> >>  
> >> +static int smmuv3_replay_hook(IOMMUTLBEntry *entry, void *private)
> >> +{
> >> +    trace_smmuv3_replay_hook(entry->iova, entry->translated_addr,
> >> +                             entry->addr_mask, entry->perm);
> >> +    memory_region_notify_one((IOMMUNotifier *)private, entry);
> >> +    return 0;
> >> +}
> >> +
> >> +static int smmuv3_map_hook(IOMMUTLBEntry *entry, void *private)
> >> +{
> >> +    trace_smmuv3_map_hook(entry->iova, entry->translated_addr,
> >> +                          entry->addr_mask, entry->perm);
> >> +    memory_region_notify_one((IOMMUNotifier *)private, entry);
> >> +    return 0;
> >> +}
> >> +
> >> +/* Unmap the whole range in the notifier's scope. */
> >> +static void smmuv3_unmap_notifier(SMMUDevice *sdev, IOMMUNotifier *n)
> >> +{
> >> +    IOMMUTLBEntry entry;
> >> +    hwaddr size;
> >> +    hwaddr start = n->start;
> >> +    hwaddr end = n->end;
> >> +
> >> +    size = end - start + 1;
> >> +
> >> +    entry.target_as = &address_space_memory;
> >> +    /* Adjust iova for the size */
> >> +    entry.iova = n->start & ~(size - 1);
> >> +    /* This field is meaningless for unmap */
> >> +    entry.translated_addr = 0;
> >> +    entry.perm = IOMMU_NONE;
> >> +    entry.addr_mask = size - 1;
> >> +
> >> +    /* TODO: check start/end/size/mask */
> >> +
> >> +    trace_smmuv3_unmap_notifier(pci_bus_num(sdev->bus),
> >> +                                PCI_SLOT(sdev->devfn),
> >> +                                PCI_FUNC(sdev->devfn),
> >> +                                entry.iova, size);
> >> +
> >> +    memory_region_notify_one(n, &entry);
> >> +}
> >> +
> >> +static void smmuv3_replay(IOMMUMemoryRegion *mr, IOMMUNotifier *n)
> >> +{
> >> +    SMMUDevice *sdev = container_of(mr, SMMUDevice, iommu);
> >> +    SMMUV3State *s = sdev->smmu;
> >> +    SMMUBaseClass *sbc = SMMU_DEVICE_GET_CLASS(s);
> >> +    SMMUTransCfg cfg = {};
> >> +    int ret;
> >> +
> >> +    smmuv3_unmap_notifier(sdev, n);
> >> +
> >> +    ret = smmuv3_decode_config(mr, &cfg);
> >> +    if (ret) {
> >> +        error_report("%s error decoding the configuration for iommu mr=%s",
> >> +                     __func__, mr->parent_obj.name);
> >> +    }
> >> +
> >> +    if (cfg.disabled || cfg.bypassed) {
> >> +        return;
> >> +    }
> >> +    /* is the smmu enabled */
> >> +    sbc->page_walk_64(&cfg, 0, (1ULL << (64 - cfg.tsz)) - 1, false,
> >> +                      smmuv3_replay_hook, n);
> >> +}
> >> +static void smmuv3_replay_range(IOMMUMemoryRegion *mr, IOMMUNotifier *n,
> >> +                                 uint64_t iova, size_t size)
> >> +{
> >> +    SMMUDevice *sdev = container_of(mr, SMMUDevice, iommu);
> >> +    SMMUV3State *s = sdev->smmu;
> >> +    SMMUBaseClass *sbc = SMMU_DEVICE_GET_CLASS(s);
> >> +    SMMUTransCfg cfg = {};
> >> +    IOMMUTLBEntry entry;
> >> +    int ret;
> >> +
> >> +    trace_smmuv3_replay_range(mr->parent_obj.name, iova, size, n);
> >> +    ret = smmuv3_decode_config(mr, &cfg);
> >> +    if (ret) {
> >> +        error_report("%s error decoding the configuration for iommu mr=%s",
> >> +                     __func__, mr->parent_obj.name);
> >> +    }
> >> +
> >> +    if (cfg.disabled || cfg.bypassed) {
> >> +        return;
> >> +    }
> >> +
> >> +    /* first unmap */
> >> +    entry.target_as = &address_space_memory;
> >> +    entry.iova = iova & ~(size - 1);
> >> +    entry.addr_mask = size - 1;
> >> +    entry.perm = IOMMU_NONE;
> >> +
> >> +    memory_region_notify_one(n, &entry);
> >> +
> >> +    /* then figure out if a new mapping needs to be applied */
> >> +    sbc->page_walk_64(&cfg, iova, iova + entry.addr_mask , false,
> >> +                      smmuv3_map_hook, n);
> >> +}
> >> +
> >> +static void smmuv3_replay_single(IOMMUMemoryRegion *mr, IOMMUNotifier *n,
> >> +                                 uint64_t iova)
> >> +{
> >> +    SMMUDevice *sdev = container_of(mr, SMMUDevice, iommu);
> >> +    SMMUV3State *s = sdev->smmu;
> >> +    size_t target_page_size = qemu_target_page_size();
> >> +    SMMUBaseClass *sbc = SMMU_DEVICE_GET_CLASS(s);
> >> +    SMMUTransCfg cfg = {};
> >> +    IOMMUTLBEntry entry;
> >> +    int ret;
> >> +
> >> +    trace_smmuv3_replay_single(mr->parent_obj.name, iova, n);
> >> +    ret = smmuv3_decode_config(mr, &cfg);
> >> +    if (ret) {
> >> +        error_report("%s error decoding the configuration for iommu mr=%s",
> >> +                     __func__, mr->parent_obj.name);
> >> +    }
> >> +
> >> +    if (cfg.disabled || cfg.bypassed) {
> >> +        return;
> >> +    }
> >> +
> >> +    /* first unmap */
> >> +    entry.target_as = &address_space_memory;
> >> +    entry.iova = iova & ~(target_page_size - 1);
> >> +    entry.addr_mask = target_page_size - 1;
> >> +    entry.perm = IOMMU_NONE;
> >> +
> >> +    memory_region_notify_one(n, &entry);
> >> +
> >> +    /* then figure out if a new mapping needs to be applied */
> >> +    sbc->page_walk_64(&cfg, iova, iova + 1, false,
> >> +                      smmuv3_map_hook, n);
> >> +}
> >> +
> >> +static void smmuv3_notify_flag_changed(IOMMUMemoryRegion *iommu,
> >> +                                       IOMMUNotifierFlag old,
> >> +                                       IOMMUNotifierFlag new)
> >> +{
> >> +    SMMUDevice *sdev = container_of(iommu, SMMUDevice, iommu);
> >> +    SMMUV3State *s3 = sdev->smmu;
> >> +    SMMUState *s = &(s3->smmu_state);
> >> +    SMMUNotifierNode *node = NULL;
> >> +    SMMUNotifierNode *next_node = NULL;
> >> +
> >> +    if (old == IOMMU_NOTIFIER_NONE) {
> >> +        trace_smmuv3_notify_flag_add(iommu->parent_obj.name);
> >> +        node = g_malloc0(sizeof(*node));
> >> +        node->sdev = sdev;
> >> +        QLIST_INSERT_HEAD(&s->notifiers_list, node, next);
> >> +        return;
> >> +    }
> >> +
> >> +    /* update notifier node with new flags */
> >> +    QLIST_FOREACH_SAFE(node, &s->notifiers_list, next, next_node) {
> >> +        if (node->sdev == sdev) {
> >> +            if (new == IOMMU_NOTIFIER_NONE) {
> >> +                trace_smmuv3_notify_flag_del(iommu->parent_obj.name);
> >> +                QLIST_REMOVE(node, next);
> >> +                g_free(node);
> >> +            }
> >> +            return;
> >> +        }
> >> +    }
> >> +}
> >>  
> >>  static inline void smmu_update_base_reg(SMMUV3State *s, uint64_t *base,
> >>                                          uint64_t val)
> >> @@ -1125,6 +1382,8 @@ static void smmuv3_iommu_memory_region_class_init(ObjectClass *klass,
> >>      IOMMUMemoryRegionClass *imrc = IOMMU_MEMORY_REGION_CLASS(klass);
> >>  
> >>      imrc->translate = smmuv3_translate;
> >> +    imrc->notify_flag_changed = smmuv3_notify_flag_changed;
> >> +    imrc->replay = smmuv3_replay;
> >>  }
> >>  
> >>  static const TypeInfo smmuv3_type_info = {
> >> diff --git a/hw/arm/trace-events b/hw/arm/trace-events
> >> index f9b9cbe..8228e26 100644
> >> --- a/hw/arm/trace-events
> >> +++ b/hw/arm/trace-events
> >> @@ -27,6 +27,7 @@ smmuv3_cmdq_opcode(const char *opcode) "<--- %s"
> >>  smmuv3_cmdq_cfgi_ste(int streamid) "     |_ streamid =%d"
> >>  smmuv3_cmdq_cfgi_ste_range(int start, int end) "     |_ start=0x%d - end=0x%d"
> >>  smmuv3_cmdq_tlbi_nh_va(int asid, int vmid, uint64_t addr) "     |_ asid =%d vmid =%d addr=0x%"PRIx64
> >> +smmuv3_cmdq_tlbi_nh_va_am(int asid, int am, size_t size, uint64_t addr) "     |_ asid =%d am =%d size=0x%lx addr=0x%"PRIx64
> >>  smmuv3_cmdq_consume_sev(void) "CMD_SYNC CS=SEV not supported, ignoring"
> >>  smmuv3_cmdq_consume_out(uint8_t prod_wrap, uint32_t prod, uint8_t cons_wrap, uint32_t cons) "prod_wrap:%d, prod:0x%x cons_wrap:%d cons:0x%x"
> >>  smmuv3_update(bool is_empty, uint32_t prod, uint32_t cons, uint8_t prod_wrap, uint8_t cons_wrap) "q empty:%d prod:%d cons:%d p.wrap:%d p.cons:%d"
> >> @@ -50,3 +51,16 @@ smmuv3_dump_ste(int i, uint32_t word0, int j,  uint32_t word1) "STE[%2d]: 0x%x\t
> >>  smmuv3_dump_cd(int i, uint32_t word0, int j,  uint32_t word1) "CD[%2d]: 0x%x\t CD[%2d]: 0x%x"
> >>  smmuv3_dump_cmd(int i, uint32_t word0, int j,  uint32_t word1) "CMD[%2d]: 0x%x\t CMD[%2d]: 0x%x"
> >>  smmuv3_cfg_stage(int s, uint32_t oas, uint32_t tsz, uint64_t ttbr, bool aa64, uint32_t granule_sz, int initial_level) "TransCFG stage:%d oas:%d tsz:%d ttbr:0x%"PRIx64"  aa64:%d granule_sz:%d, initial_level = %d"
> >> +
> >> +smmuv3_replay(uint16_t sid, bool enabled) "sid=%d, enabled=%d"
> >> +smmuv3_replay_hook(hwaddr iova, hwaddr pa, hwaddr mask, int perm) "iova=0x%"PRIx64" pa=0x%" PRIx64" mask=0x%"PRIx64" perm=%d"
> >> +smmuv3_map_hook(hwaddr iova, hwaddr pa, hwaddr mask, int perm) "iova=0x%"PRIx64" pa=0x%" PRIx64" mask=0x%"PRIx64" perm=%d"
> >> +smmuv3_notify_flag_add(const char *iommu) "ADD SMMUNotifier node for iommu mr=%s"
> >> +smmuv3_notify_flag_del(const char *iommu) "DEL SMMUNotifier node for iommu mr=%s"
> >> +smmuv3_replay_single(const char *name, uint64_t iova, void *n) "iommu mr=%s iova=0x%"PRIx64" n=%p"
> >> +smmuv3_replay_range(const char *name, uint64_t iova, size_t size, void *n) "iommu mr=%s iova=0x%"PRIx64" size=0x%lx n=%p"
> >> +smmuv3_replay_all(const char *name) "iommu mr=%s"
> >> +smmuv3_notify_all(const char *name, uint64_t iova) "iommu mr=%s iova=0x%"PRIx64
> >> +smmuv3_unmap_notifier(uint8_t bus, uint8_t slot, uint8_t fn, uint64_t iova, uint64_t size) "Device %02x:%02x.%x start 0x%"PRIx64" size 0x%"PRIx64
> >> +smmuv3_context_device_invalidate(uint32_t sid) "sid=%d"
> >> +
> >> -- 
> >> 2.5.5
> >>
> >>
> > 


end of thread, other threads:[~2017-08-25 17:13 UTC | newest]

Thread overview: 15+ messages
2017-08-11 14:22 [Qemu-devel] [RFC v6 0/9] ARM SMMUv3 Emulation Support Eric Auger
2017-08-11 14:22 ` [Qemu-devel] [RFC v6 1/9] hw/arm/smmu-common: smmu base class Eric Auger
2017-08-11 14:22 ` [Qemu-devel] [RFC v6 2/9] hw/arm/smmuv3: smmuv3 emulation model Eric Auger
2017-08-11 14:22 ` [Qemu-devel] [RFC v6 3/9] hw/arm/virt: Add SMMUv3 to the virt board Eric Auger
2017-08-11 14:22 ` [Qemu-devel] [RFC v6 4/9] hw/arm/virt: Add 2.11 machine type Eric Auger
2017-08-11 14:22 ` [Qemu-devel] [RFC v6 5/9] hw/arm/virt-acpi-build: Add smmuv3 node in IORT table Eric Auger
2017-08-11 14:22 ` [Qemu-devel] [RFC v6 6/9] hw/arm/virt: Add tlbi-on-map property to the smmuv3 node Eric Auger
2017-08-11 14:22 ` [Qemu-devel] [RFC v6 7/9] target/arm/kvm: Translate the MSI doorbell in kvm_arch_fixup_msi_route Eric Auger
2017-08-11 14:22 ` [Qemu-devel] [RFC v6 8/9] hw/arm/smmuv3: VFIO integration Eric Auger
2017-08-21  1:21   ` [Qemu-devel] [Qemu-arm] " Linu Cherian
2017-08-23  4:24   ` Linu Cherian
2017-08-23  6:39     ` Auger Eric
2017-08-25 17:13       ` Michael S. Tsirkin
2017-08-11 14:22 ` [Qemu-devel] [RFC v6 9/9] hw/arm/virt-acpi-build: Use the ACPI_IORT_SMMU_V3_CACHING_MODE model Eric Auger
2017-08-11 15:38 ` [Qemu-devel] [RFC v6 0/9] ARM SMMUv3 Emulation Support no-reply
