* [Qemu-devel] [PATCH v7 00/20] ARM SMMUv3 Emulation Support
@ 2017-09-01 17:21 Eric Auger
  2017-09-01 17:21 ` [Qemu-devel] [PATCH v7 01/20] hw/arm/smmu-common: smmu base device and datatypes Eric Auger
                   ` (24 more replies)
  0 siblings, 25 replies; 72+ messages in thread
From: Eric Auger @ 2017-09-01 17:21 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, qemu-arm, qemu-devel,
	prem.mallappa, alex.williamson
  Cc: drjones, christoffer.dall, Radha.Chintakuntla, Sunil.Goutham,
	mohun106, tcain, bharat.bhushan, tn, mst, will.deacon,
	jean-philippe.brucker, robin.murphy, peterx, edgar.iglesias,
	wtownsen

This series implements the emulation code for ARM SMMUv3.

Changes since v6:
- DPDK testpmd now running on guest with 2 assigned VFs
- Changed the instantiation method: add the following option to
  the QEMU command line
  -device smmuv3 # for virtio/vhost use cases
  -device smmuv3,caching-mode # for vfio use cases (based on [1])
- split the series into smaller patches to ease the review
- the VFIO integration based on "tlbi-on-map" smmuv3 driver
  is isolated from the rest: last 2 patches, not for upstream.
  This is shipped for testing/bench until a better solution is found.
- Reworked permission flag checks and event generation

testing:
- in dt and ACPI modes
- virtio-net-pci and vhost-net devices using dma ops with various
  guest page sizes [2]
- assigned VFs using dma ops [3]:
  - AMD Overdrive and igbvf passthrough (using gsi direct mapping)
  - Cavium ThunderX and ixgbevf passthrough (using KVM MSI routing)
- DPDK testpmd on guest running with VFIO user space drivers (2 igbvf) [3]
  with guest and host page size equal (4kB)

Known limitations:
- no VMSAv8-32 support
- no nested stage support (S1 + S2)
- no support for HYP mappings
- fine-grained register emulation, commands, interrupts and errors were
  not thoroughly tested. Handling is sufficient to run the use cases
  described above, though.
- interrupts and event generation not observed yet.

Best Regards

Eric

This series can be found at:
v7: https://github.com/eauger/qemu/tree/v2.10.0-SMMU-v7
Previous version at:
v6: https://github.com/eauger/qemu/tree/v2.10.0-rc2-SMMU-v6

References:
[1] [RFC v2 0/4] arm-smmu-v3 tlbi-on-map option
    https://lkml.org/lkml/2017/8/11/426

[2] qemu cmd line excerpt:
-device smmuv3 \
-netdev tap,id=tap0,script=no,downscript=no,ifname=tap0,vhost=off \
-device virtio-net-pci,netdev=tap0,mac=6a:f5:10:b1:3d:d2,iommu_platform,disable-modern=off,disable-legacy=on \
[3] use -device smmuv3,caching-mode


History:
v6 -> v7:
- see above

v5 -> v6:
- Rebase on 2.10 and IOMMUMemoryRegion
- add ACPI TLBI_ON_MAP support (VFIO integration also works in
  ACPI mode)
- fix block replay
- handle implementation defined SMMU_CMD_TLBI_NH_VA_AM cmd
  (goes along with TLBI_ON_MAP FW quirk)
- replay systematically unmaps the whole range first
- smmuv3_map_hook does not unmap anymore and the unmap is done
  before the replay
- add and use smmuv3_context_device_invalidate instead of
  blindly replaying everything

v4 -> v5:
- initial_level now part of SMMUTransCfg
- smmu_page_walk_64 takes into account the max input size
- implement sys->iommu_ops.replay and sys->iommu_ops.notify_flag_changed
- smmuv3_translate: bug fix: don't walk on bypass
- smmu_update_qreg: fix PROD index update
- I did not yet address Peter's comments as the code is not mature enough
  to be split into sub patches.

v3 -> v4 [Eric]:
- page table walk rewritten to allow scan of the page table within a
  range of IOVA. This prepares for VFIO integration and replay.
- configuration parsing partially reworked.
- do not advertise unsupported/untested features: S2, S1 + S2, HYP,
  PRI, ATS, ..
- added ACPI table generation
- migrated to dynamic traces
- mingw compilation fix

v2 -> v3 [Eric]:
- rebased on 2.9
- mostly code and patch reorganization to ease the review process
- optional patches removed. They may be handled separately. I am currently
  working on ACPI enablement.
- optional instantiation of the smmu in mach-virt
- removed [2/9] (fdt functions) since not mandated
- start splitting main patch into base and derived object
- no new functional feature added

v1 -> v2 [Prem]:
- Adopted review comments from Eric Auger
        - Make SMMU_DPRINTF to internally call qemu_log
            (since translation requests are too many, we need control
             on the type of log we want)
        - SMMUTransCfg modified to suit simplicity
        - Change RegInfo to uint64 register array
        - Code cleanup
        - Test cleanups
- Reshuffled patches

v0 -> v1 [Prem]:
- As per SMMUv3 spec 16.0 (only is_ste_consistant() is noticeable)
- Reworked register access/update logic
- Factored out translation code for
        - single point bug fix
        - sharing/removal in future
- (optional) Unit tests added, with PCI test device
        - S1 with 4k/64k, S1+S2 with 4k/64k
        - (S1 or S2) only can be verified by Linux 4.7 driver
        - (optional) Preliminary ACPI support

v0 [Prem]:
- Implements SMMUv3 spec 11.0
- Supported for PCIe devices,
- Command Queue and Event Queue supported
- LPAE only, S1 is supported and Tested, S2 not tested
- BE mode Translation not supported
- IRQ support (legacy, no MSI)

Eric Auger (18):
  hw/arm/smmu-common: smmu base device and datatypes
  hw/arm/smmu-common: IOMMU memory region and address space setup
  hw/arm/smmu-common: smmu_read/write_sysmem
  hw/arm/smmu-common: VMSAv8-64 page table walk
  hw/arm/smmuv3: Wired IRQ and GERROR helpers
  hw/arm/smmuv3: Queue helpers
  hw/arm/smmuv3: Implement MMIO write operations
  hw/arm/smmuv3: Event queue recording helper
  hw/arm/smmuv3: Implement translate callback
  target/arm/kvm: Translate the MSI doorbell in kvm_arch_fixup_msi_route
  hw/arm/smmuv3: Implement data structure and TLB invalidation
    notifications
  hw/arm/smmuv3: Implement IOMMU memory region replay callback
  hw/arm/virt: Store the PCI host controller dt phandle
  hw/arm/sysbus-fdt: Pass the VirtMachineState to the node creation
    functions
  hw/arm/sysbus-fdt: Pass the platform bus base address in
    PlatformBusFDTData
  hw/arm/sysbus-fdt: Allow smmuv3 dynamic instantiation
  hw/arm/smmuv3: [not for upstream] add SMMU_CMD_TLBI_NH_VA_AM handling
  hw/arm/smmuv3: [not for upstream] Add caching-mode option

Prem Mallappa (2):
  hw/arm/smmuv3: Skeleton
  hw/arm/virt-acpi-build: Add smmuv3 node in IORT table

 default-configs/aarch64-softmmu.mak |    1 +
 hw/arm/Makefile.objs                |    1 +
 hw/arm/smmu-common.c                |  527 ++++++++++++++++
 hw/arm/smmu-internal.h              |  105 ++++
 hw/arm/smmuv3-internal.h            |  584 +++++++++++++++++
 hw/arm/smmuv3.c                     | 1181 +++++++++++++++++++++++++++++++++++
 hw/arm/sysbus-fdt.c                 |  129 +++-
 hw/arm/trace-events                 |   48 ++
 hw/arm/virt-acpi-build.c            |   63 +-
 hw/arm/virt.c                       |    6 +-
 include/hw/acpi/acpi-defs.h         |   15 +
 include/hw/arm/smmu-common.h        |  123 ++++
 include/hw/arm/smmuv3.h             |   80 +++
 include/hw/arm/sysbus-fdt.h         |    2 +
 include/hw/arm/virt.h               |   15 +
 target/arm/kvm.c                    |   27 +
 target/arm/trace-events             |    3 +
 17 files changed, 2886 insertions(+), 24 deletions(-)
 create mode 100644 hw/arm/smmu-common.c
 create mode 100644 hw/arm/smmu-internal.h
 create mode 100644 hw/arm/smmuv3-internal.h
 create mode 100644 hw/arm/smmuv3.c
 create mode 100644 include/hw/arm/smmu-common.h
 create mode 100644 include/hw/arm/smmuv3.h

-- 
2.5.5


* [Qemu-devel] [PATCH v7 01/20] hw/arm/smmu-common: smmu base device and datatypes
  2017-09-01 17:21 [Qemu-devel] [PATCH v7 00/20] ARM SMMUv3 Emulation Support Eric Auger
@ 2017-09-01 17:21 ` Eric Auger
  2017-09-27 17:38   ` Peter Maydell
  2017-09-01 17:21 ` [Qemu-devel] [PATCH v7 02/20] hw/arm/smmu-common: IOMMU memory region and address space setup Eric Auger
                   ` (23 subsequent siblings)
  24 siblings, 1 reply; 72+ messages in thread
From: Eric Auger @ 2017-09-01 17:21 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, qemu-arm, qemu-devel,
	prem.mallappa, alex.williamson
  Cc: drjones, christoffer.dall, Radha.Chintakuntla, Sunil.Goutham,
	mohun106, tcain, bharat.bhushan, tn, mst, will.deacon,
	jean-philippe.brucker, robin.murphy, peterx, edgar.iglesias,
	wtownsen

The patch introduces the smmu base device and class for the ARM
smmu. Devices for specific versions will be derived from this
base device.

We also introduce some important datatypes.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Prem Mallappa <prem.mallappa@broadcom.com>

---

v3 -> v4:
- added smmu_find_as_from_bus_num
- SMMU_PCI_BUS_MAX and SMMU_PCI_DEVFN_MAX in smmu-common header
- new fields in SMMUState:
  - iommu_ops, smmu_as_by_busptr, smmu_as_by_bus_num
- add aa64[] field in SMMUTransCfg

v3:
- moved the base code in a separate patch to ease the review.
- clearer separation between base class and smmuv3 class
- translate_* only implemented as class methods
---
 default-configs/aarch64-softmmu.mak |   1 +
 hw/arm/Makefile.objs                |   1 +
 hw/arm/smmu-common.c                |  58 +++++++++++++++++++
 include/hw/arm/smmu-common.h        | 108 ++++++++++++++++++++++++++++++++++++
 4 files changed, 168 insertions(+)
 create mode 100644 hw/arm/smmu-common.c
 create mode 100644 include/hw/arm/smmu-common.h

diff --git a/default-configs/aarch64-softmmu.mak b/default-configs/aarch64-softmmu.mak
index 2449483..83a2932 100644
--- a/default-configs/aarch64-softmmu.mak
+++ b/default-configs/aarch64-softmmu.mak
@@ -7,3 +7,4 @@ CONFIG_AUX=y
 CONFIG_DDC=y
 CONFIG_DPCD=y
 CONFIG_XLNX_ZYNQMP=y
+CONFIG_ARM_SMMUV3=y
diff --git a/hw/arm/Makefile.objs b/hw/arm/Makefile.objs
index a2e56ec..5b2d38d 100644
--- a/hw/arm/Makefile.objs
+++ b/hw/arm/Makefile.objs
@@ -19,3 +19,4 @@ obj-$(CONFIG_FSL_IMX31) += fsl-imx31.o kzm.o
 obj-$(CONFIG_FSL_IMX6) += fsl-imx6.o sabrelite.o
 obj-$(CONFIG_ASPEED_SOC) += aspeed_soc.o aspeed.o
 obj-$(CONFIG_MPS2) += mps2.o
+obj-$(CONFIG_ARM_SMMUV3) += smmu-common.o
diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
new file mode 100644
index 0000000..56608f1
--- /dev/null
+++ b/hw/arm/smmu-common.c
@@ -0,0 +1,58 @@
+/*
+ * Copyright (C) 2014-2016 Broadcom Corporation
+ * Copyright (c) 2017 Red Hat, Inc.
+ * Written by Prem Mallappa, Eric Auger
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Author: Prem Mallappa <pmallapp@broadcom.com>
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "sysemu/sysemu.h"
+#include "exec/address-spaces.h"
+#include "trace.h"
+#include "exec/target_page.h"
+#include "qom/cpu.h"
+
+#include "qemu/error-report.h"
+#include "hw/arm/smmu-common.h"
+
+static void smmu_base_instance_init(Object *obj)
+{
+}
+
+static void smmu_base_class_init(ObjectClass *klass, void *data)
+{
+}
+
+static const TypeInfo smmu_base_info = {
+    .name          = TYPE_SMMU_DEV_BASE,
+    .parent        = TYPE_SYS_BUS_DEVICE,
+    .instance_size = sizeof(SMMUState),
+    .instance_init = smmu_base_instance_init,
+    .class_data    = NULL,
+    .class_size    = sizeof(SMMUBaseClass),
+    .class_init    = smmu_base_class_init,
+    .abstract      = true,
+};
+
+static void smmu_base_register_types(void)
+{
+    type_register_static(&smmu_base_info);
+}
+
+type_init(smmu_base_register_types)
+
diff --git a/include/hw/arm/smmu-common.h b/include/hw/arm/smmu-common.h
new file mode 100644
index 0000000..38cd18f
--- /dev/null
+++ b/include/hw/arm/smmu-common.h
@@ -0,0 +1,108 @@
+/*
+ * ARM SMMU Support
+ *
+ * Copyright (C) 2015-2016 Broadcom Corporation
+ * Copyright (c) 2017 Red Hat, Inc.
+ * Written by Prem Mallappa, Eric Auger
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation, either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef HW_ARM_SMMU_COMMON_H
+#define HW_ARM_SMMU_COMMON_H
+
+#include <hw/sysbus.h>
+#include "hw/pci/pci.h"
+
+#define SMMU_PCI_BUS_MAX      256
+#define SMMU_PCI_DEVFN_MAX    256
+
+/*
+ * Page table walk generic errors
+ * At the moment values match SMMUv3 event numbers though
+ */
+typedef enum {
+    SMMU_TRANS_ERR_NONE          = 0x0,
+    SMMU_TRANS_ERR_WALK_EXT_ABRT = 0x1,  /* Translation walk external abort */
+    SMMU_TRANS_ERR_TRANS         = 0x10, /* Translation fault */
+    SMMU_TRANS_ERR_ADDR_SZ,              /* Address Size fault */
+    SMMU_TRANS_ERR_ACCESS,               /* Access fault */
+    SMMU_TRANS_ERR_PERM,                 /* Permission fault */
+    SMMU_TRANS_ERR_TLB_CONFLICT  = 0x20, /* TLB Conflict */
+} SMMUTransErr;
+
+/*
+ * Generic structure populated by derived SMMU devices
+ * after decoding the configuration information and used as
+ * input to the page table walk
+ */
+typedef struct SMMUTransCfg {
+    hwaddr   input;            /* input address */
+    hwaddr   output;           /* Output address */
+    int      stage;            /* translation stage */
+    uint32_t oas;              /* output address width */
+    uint32_t tsz;              /* input range, ie. 2^(64 -tnsz)*/
+    uint64_t ttbr;             /* TTBR address */
+    uint32_t granule_sz;       /* granule page shift */
+    bool     aa64;             /* aarch64 or aarch32 translation table */
+    int      initial_level;    /* initial lookup level */
+    bool     disabled;         /* smmu is disabled */
+    bool     bypassed;         /* stage is bypassed */
+} SMMUTransCfg;
+
+typedef struct SMMUDevice {
+    void               *smmu;
+    PCIBus             *bus;
+    int                devfn;
+    IOMMUMemoryRegion  iommu;
+    AddressSpace       as;
+} SMMUDevice;
+
+typedef struct SMMUNotifierNode {
+    SMMUDevice *sdev;
+    QLIST_ENTRY(SMMUNotifierNode) next;
+} SMMUNotifierNode;
+
+typedef struct SMMUPciBus {
+    PCIBus       *bus;
+    SMMUDevice   *pbdev[0]; /* Parent array is sparse, so dynamically alloc */
+} SMMUPciBus;
+
+typedef struct SMMUState {
+    /* <private> */
+    SysBusDevice  dev;
+    char *mrtypename;
+    MemoryRegion iomem;
+
+    GHashTable *smmu_as_by_busptr;
+    SMMUPciBus *smmu_as_by_bus_num[SMMU_PCI_BUS_MAX];
+    QLIST_HEAD(, SMMUNotifierNode) notifiers_list;
+
+} SMMUState;
+
+typedef int (*smmu_page_walk_hook)(IOMMUTLBEntry *entry, void *private);
+
+typedef struct {
+    /* <private> */
+    SysBusDeviceClass parent_class;
+} SMMUBaseClass;
+
+#define TYPE_SMMU_DEV_BASE "smmu-base"
+#define SMMU_SYS_DEV(obj) OBJECT_CHECK(SMMUState, (obj), TYPE_SMMU_DEV_BASE)
+#define SMMU_DEVICE_GET_CLASS(obj)                              \
+    OBJECT_GET_CLASS(SMMUBaseClass, (obj), TYPE_SMMU_DEV_BASE)
+#define SMMU_DEVICE_CLASS(klass)                                    \
+    OBJECT_CLASS_CHECK(SMMUBaseClass, (klass), TYPE_SMMU_DEV_BASE)
+
+#endif  /* HW_ARM_SMMU_COMMON */
-- 
2.5.5


* [Qemu-devel] [PATCH v7 02/20] hw/arm/smmu-common: IOMMU memory region and address space setup
  2017-09-01 17:21 [Qemu-devel] [PATCH v7 00/20] ARM SMMUv3 Emulation Support Eric Auger
  2017-09-01 17:21 ` [Qemu-devel] [PATCH v7 01/20] hw/arm/smmu-common: smmu base device and datatypes Eric Auger
@ 2017-09-01 17:21 ` Eric Auger
  2017-10-09 14:39   ` Peter Maydell
  2017-09-01 17:21 ` [Qemu-devel] [PATCH v7 03/20] hw/arm/smmu-common: smmu_read/write_sysmem Eric Auger
                   ` (22 subsequent siblings)
  24 siblings, 1 reply; 72+ messages in thread
From: Eric Auger @ 2017-09-01 17:21 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, qemu-arm, qemu-devel,
	prem.mallappa, alex.williamson
  Cc: drjones, christoffer.dall, Radha.Chintakuntla, Sunil.Goutham,
	mohun106, tcain, bharat.bhushan, tn, mst, will.deacon,
	jean-philippe.brucker, robin.murphy, peterx, edgar.iglesias,
	wtownsen

We enumerate all the PCI devices attached to the SMMU and
initialize an associated IOMMU memory region and address space.
This happens on SMMU base instance init.

This information is stored in SMMUDevice objects. The devices are
grouped according to the PCIBus they belong to. A hash table
indexed by the PCIBus pointer is used. In addition, an array indexed
by the bus number allows the SMMUDevices of a bus to be found.
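
The two lookup paths described above can be sketched as follows. This
is only a minimal illustration, not the code of this patch: the Demo*
types, NB_BUSES and demo_find_by_bus_num() are hypothetical stand-ins,
and a small fixed array replaces the GHashTable keyed by the PCIBus
pointer. The sketch returns NULL when no bus matches.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BUS_MAX  256  /* same value as SMMU_PCI_BUS_MAX */
#define NB_BUSES 4    /* hypothetical size of the stand-in "hash table" */

/* Hypothetical stand-in for PCIBus: only the bus number matters here. */
typedef struct DemoBus {
    uint8_t num;
} DemoBus;

/* Hypothetical stand-in for SMMUPciBus. */
typedef struct DemoSmmuBus {
    DemoBus *bus;
} DemoSmmuBus;

typedef struct DemoState {
    DemoSmmuBus *by_ptr[NB_BUSES];  /* stands in for the pointer-keyed table */
    DemoSmmuBus *by_num[BUS_MAX];   /* cache indexed by the bus number */
} DemoState;

/* Return the entry for @bus_num, filling the by_num cache on a miss. */
static DemoSmmuBus *demo_find_by_bus_num(DemoState *s, uint8_t bus_num)
{
    DemoSmmuBus *sbus;
    size_t i;

    assert(s != NULL);

    sbus = s->by_num[bus_num];
    if (sbus) {
        return sbus;                /* fast path: already cached */
    }
    /* Slow path: scan every known bus and compare bus numbers. */
    for (i = 0; i < NB_BUSES; i++) {
        sbus = s->by_ptr[i];
        if (sbus && sbus->bus->num == bus_num) {
            s->by_num[bus_num] = sbus;
            return sbus;
        }
    }
    return NULL;                    /* no bus with that number */
}
```

The array lookup is cheap once the cache is filled; the scan is only
needed on the first lookup for a bus number, since the bus number may
not be known yet when the pointer-keyed entry is created.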

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 hw/arm/smmu-common.c         | 89 ++++++++++++++++++++++++++++++++++++++++++++
 include/hw/arm/smmu-common.h |  6 +++
 2 files changed, 95 insertions(+)

diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
index 56608f1..3e67992 100644
--- a/hw/arm/smmu-common.c
+++ b/hw/arm/smmu-common.c
@@ -30,8 +30,97 @@
 #include "qemu/error-report.h"
 #include "hw/arm/smmu-common.h"
 
+/******************/
+/* Infrastructure */
+/******************/
+
+static inline gboolean smmu_uint64_equal(gconstpointer v1, gconstpointer v2)
+{
+    return *((const uint64_t *)v1) == *((const uint64_t *)v2);
+}
+
+static inline guint smmu_uint64_hash(gconstpointer v)
+{
+    return (guint)*(const uint64_t *)v;
+}
+
+SMMUPciBus *smmu_find_as_from_bus_num(SMMUState *s, uint8_t bus_num)
+{
+    SMMUPciBus *smmu_pci_bus = s->smmu_as_by_bus_num[bus_num];
+
+    if (!smmu_pci_bus) {
+        GHashTableIter iter;
+
+        g_hash_table_iter_init(&iter, s->smmu_as_by_busptr);
+        while (g_hash_table_iter_next(&iter, NULL, (void **)&smmu_pci_bus)) {
+            if (pci_bus_num(smmu_pci_bus->bus) == bus_num) {
+                s->smmu_as_by_bus_num[bus_num] = smmu_pci_bus;
+                return smmu_pci_bus;
+            }
+        }
+    }
+    return smmu_pci_bus;
+}
+
+static AddressSpace *smmu_find_add_as(PCIBus *bus, void *opaque, int devfn)
+{
+    SMMUState *s = opaque;
+    uintptr_t key = (uintptr_t)bus;
+    SMMUPciBus *sbus = g_hash_table_lookup(s->smmu_as_by_busptr, &key);
+    SMMUDevice *sdev;
+
+    if (!sbus) {
+        uintptr_t *new_key = g_malloc(sizeof(*new_key));
+
+        *new_key = (uintptr_t)bus;
+        sbus = g_malloc0(sizeof(SMMUPciBus) +
+                         sizeof(SMMUDevice *) * SMMU_PCI_DEVFN_MAX);
+        sbus->bus = bus;
+        g_hash_table_insert(s->smmu_as_by_busptr, new_key, sbus);
+    }
+
+    sdev = sbus->pbdev[devfn];
+    if (!sdev) {
+        char *name = g_strdup_printf("%s-%d-%d",
+                                     s->mrtypename,
+                                     pci_bus_num(bus), devfn);
+        sdev = sbus->pbdev[devfn] = g_malloc0(sizeof(SMMUDevice));
+
+        sdev->smmu = s;
+        sdev->bus = bus;
+        sdev->devfn = devfn;
+
+        memory_region_init_iommu(&sdev->iommu, sizeof(sdev->iommu),
+                                 s->mrtypename,
+                                 OBJECT(s), name, 1ULL << 48);
+        address_space_init(&sdev->as,
+                           MEMORY_REGION(&sdev->iommu), name);
+    }
+
+    return &sdev->as;
+}
+
+static void smmu_init_iommu_as(SMMUState *s)
+{
+    PCIBus *pcibus = pci_find_primary_bus();
+
+    if (pcibus) {
+        pci_setup_iommu(pcibus, smmu_find_add_as, s);
+    } else {
+        error_report("No PCI bus, SMMU is not registered");
+    }
+}
+
 static void smmu_base_instance_init(Object *obj)
 {
+    SMMUState *s = SMMU_SYS_DEV(obj);
+
+    memset(s->smmu_as_by_bus_num, 0, sizeof(s->smmu_as_by_bus_num));
+
+    s->smmu_as_by_busptr = g_hash_table_new_full(smmu_uint64_hash,
+                                                 smmu_uint64_equal,
+                                                 g_free, g_free);
+    smmu_init_iommu_as(s);
 }
 
 static void smmu_base_class_init(ObjectClass *klass, void *data)
diff --git a/include/hw/arm/smmu-common.h b/include/hw/arm/smmu-common.h
index 38cd18f..20f3fe6 100644
--- a/include/hw/arm/smmu-common.h
+++ b/include/hw/arm/smmu-common.h
@@ -105,4 +105,10 @@ typedef struct {
 #define SMMU_DEVICE_CLASS(klass)                                    \
     OBJECT_CLASS_CHECK(SMMUBaseClass, (klass), TYPE_SMMU_DEV_BASE)
 
+SMMUPciBus *smmu_find_as_from_bus_num(SMMUState *s, uint8_t bus_num);
+
+static inline uint16_t smmu_get_sid(SMMUDevice *sdev)
+{
+    return  ((pci_bus_num(sdev->bus) & 0xff) << 8) | sdev->devfn;
+}
 #endif  /* HW_ARM_SMMU_COMMON */
-- 
2.5.5


* [Qemu-devel] [PATCH v7 03/20] hw/arm/smmu-common: smmu_read/write_sysmem
  2017-09-01 17:21 [Qemu-devel] [PATCH v7 00/20] ARM SMMUv3 Emulation Support Eric Auger
  2017-09-01 17:21 ` [Qemu-devel] [PATCH v7 01/20] hw/arm/smmu-common: smmu base device and datatypes Eric Auger
  2017-09-01 17:21 ` [Qemu-devel] [PATCH v7 02/20] hw/arm/smmu-common: IOMMU memory region and address space setup Eric Auger
@ 2017-09-01 17:21 ` Eric Auger
  2017-10-09 14:46   ` Peter Maydell
  2017-09-01 17:21 ` [Qemu-devel] [PATCH v7 04/20] hw/arm/smmu-common: VMSAv8-64 page table walk Eric Auger
                   ` (21 subsequent siblings)
  24 siblings, 1 reply; 72+ messages in thread
From: Eric Auger @ 2017-09-01 17:21 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, qemu-arm, qemu-devel,
	prem.mallappa, alex.williamson
  Cc: drjones, christoffer.dall, Radha.Chintakuntla, Sunil.Goutham,
	mohun106, tcain, bharat.bhushan, tn, mst, will.deacon,
	jean-philippe.brucker, robin.murphy, peterx, edgar.iglesias,
	wtownsen

These two functions will be used to access configuration
data (STE, CD) and page table entries in guest RAM.
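
The size-based dispatch of the read helper can be sketched as below.
This is a hedged illustration only: DemoRam, demo_load_le() and
demo_read_sysmem() are hypothetical names, and a flat byte buffer
stands in for guest RAM instead of QEMU's address space API. 4-byte
and 8-byte accesses are decoded as little-endian values, other sizes
are copied as raw bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flat buffer standing in for guest RAM. */
typedef struct DemoRam {
    const uint8_t *bytes;
    size_t size;
} DemoRam;

/* Assemble an @n byte little-endian value, whatever the host endianness. */
static uint64_t demo_load_le(const uint8_t *p, size_t n)
{
    uint64_t v = 0;
    size_t i;

    assert(n <= 8);
    for (i = 0; i < n; i++) {
        v |= (uint64_t)p[i] << (8 * i);
    }
    return v;
}

/* Return 0 on success, -1 if the access falls outside the buffer. */
static int demo_read_sysmem(const DemoRam *ram, size_t addr,
                            void *buf, size_t len)
{
    if (addr > ram->size || len > ram->size - addr) {
        return -1;
    }
    switch (len) {
    case 4: {
        uint32_t v = (uint32_t)demo_load_le(ram->bytes + addr, 4);

        memcpy(buf, &v, sizeof(v));
        break;
    }
    case 8: {
        uint64_t v = demo_load_le(ram->bytes + addr, 8);

        memcpy(buf, &v, sizeof(v));
        break;
    }
    default:
        /* Other sizes are copied as raw bytes, without byte swapping. */
        memcpy(buf, ram->bytes + addr, len);
    }
    return 0;
}
```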

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 hw/arm/smmu-common.c         | 37 +++++++++++++++++++++++++++++++++++++
 include/hw/arm/smmu-common.h |  5 +++++
 2 files changed, 42 insertions(+)

diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
index 3e67992..2a94547 100644
--- a/hw/arm/smmu-common.c
+++ b/hw/arm/smmu-common.c
@@ -30,6 +30,43 @@
 #include "qemu/error-report.h"
 #include "hw/arm/smmu-common.h"
 
+inline MemTxResult smmu_read_sysmem(dma_addr_t addr, void *buf, dma_addr_t len,
+                                    bool secure)
+{
+    MemTxAttrs attrs = {.unspecified = 1, .secure = secure};
+
+    switch (len) {
+    case 4:
+        *(uint32_t *)buf = ldl_le_phys(&address_space_memory, addr);
+        break;
+    case 8:
+        *(uint64_t *)buf = ldq_le_phys(&address_space_memory, addr);
+        break;
+    default:
+        return address_space_rw(&address_space_memory, addr,
+                                attrs, buf, len, false);
+    }
+    return MEMTX_OK;
+}
+
+inline void
+smmu_write_sysmem(dma_addr_t addr, void *buf, dma_addr_t len, bool secure)
+{
+    MemTxAttrs attrs = {.unspecified = 1, .secure = secure};
+
+    switch (len) {
+    case 4:
+        stl_le_phys(&address_space_memory, addr, *(uint32_t *)buf);
+        break;
+    case 8:
+        stq_le_phys(&address_space_memory, addr, *(uint64_t *)buf);
+        break;
+    default:
+        address_space_rw(&address_space_memory, addr,
+                         attrs, buf, len, true);
+    }
+}
+
 /******************/
 /* Infrastructure */
 /******************/
diff --git a/include/hw/arm/smmu-common.h b/include/hw/arm/smmu-common.h
index 20f3fe6..a5999b0 100644
--- a/include/hw/arm/smmu-common.h
+++ b/include/hw/arm/smmu-common.h
@@ -111,4 +111,9 @@ static inline uint16_t smmu_get_sid(SMMUDevice *sdev)
 {
     return  ((pci_bus_num(sdev->bus) & 0xff) << 8) | sdev->devfn;
 }
+
+MemTxResult smmu_read_sysmem(dma_addr_t addr, void *buf,
+                             dma_addr_t len, bool secure);
+void smmu_write_sysmem(dma_addr_t addr, void *buf, dma_addr_t len, bool secure);
+
 #endif  /* HW_ARM_SMMU_COMMON */
-- 
2.5.5


* [Qemu-devel] [PATCH v7 04/20] hw/arm/smmu-common: VMSAv8-64 page table walk
  2017-09-01 17:21 [Qemu-devel] [PATCH v7 00/20] ARM SMMUv3 Emulation Support Eric Auger
                   ` (2 preceding siblings ...)
  2017-09-01 17:21 ` [Qemu-devel] [PATCH v7 03/20] hw/arm/smmu-common: smmu_read/write_sysmem Eric Auger
@ 2017-09-01 17:21 ` Eric Auger
  2017-10-09 15:36   ` Peter Maydell
  2017-09-01 17:21 ` [Qemu-devel] [PATCH v7 05/20] hw/arm/smmuv3: Skeleton Eric Auger
                   ` (20 subsequent siblings)
  24 siblings, 1 reply; 72+ messages in thread
From: Eric Auger @ 2017-09-01 17:21 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, qemu-arm, qemu-devel,
	prem.mallappa, alex.williamson
  Cc: drjones, christoffer.dall, Radha.Chintakuntla, Sunil.Goutham,
	mohun106, tcain, bharat.bhushan, tn, mst, will.deacon,
	jean-philippe.brucker, robin.murphy, peterx, edgar.iglesias,
	wtownsen

This patch implements the page table walk for VMSAv8-64.

The page table walk function is devised to walk the tables
for a range of IOVAs and to call a callback for each valid
leaf entry (frame or block).

smmu_page_walk_level_64() handles the walk from a specific level.
The advantage of recursion is that invalid entries are easily
skipped at any level: level n+1 is walked only if the level n entry
is valid; otherwise we jump to the next index of level n.

The walk over an IOVA range will be used for the SMMU memory region
custom replay. The translation function uses the same function for
a single granule.
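
The index arithmetic behind "jump to the next index of level n" can
be sketched as below. This is a minimal sketch under the usual
VMSAv8-64 assumptions (8-byte descriptors, lookup levels 0 to 3,
granule_sz given as a page shift of 12, 14 or 16). The demo_* helpers
are hypothetical names and are not claimed to match the helpers of
smmu-internal.h.

```c
#include <assert.h>
#include <stdint.h>

/* Bit position of the IOVA field that indexes the table at @level. */
static inline int demo_level_shift(int level, int granule_sz)
{
    assert(level >= 0 && level <= 3);
    /* Each level resolves (granule_sz - 3) bits: 2^granule_sz / 8 entries. */
    return granule_sz + (3 - level) * (granule_sz - 3);
}

/* Index of @iova in the table at @level. */
static inline uint64_t demo_level_index(uint64_t iova, int level,
                                        int granule_sz)
{
    uint64_t entries = 1ULL << (granule_sz - 3);

    return (iova >> demo_level_shift(level, granule_sz)) & (entries - 1);
}

/* First IOVA covered by the entry following the one that maps @iova. */
static inline uint64_t demo_iova_next(uint64_t iova, int level,
                                      int granule_sz)
{
    uint64_t size = 1ULL << demo_level_shift(level, granule_sz);

    return (iova & ~(size - 1)) + size;
}
```

With a 4kB granule, an invalid level 2 entry lets the walk advance by
2MB at once (shift 21) instead of probing 512 level 3 entries.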

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---
v6 -> v7:
- fix wrong error handling in walk_page_table
- check perm in smmu_translate

v5 -> v6:
- use IOMMUMemoryRegion
- remove initial_lookup_level()
- fix block replay

v4 -> v5:
- add initial level in translation config
- implement block pte
- rename must_translate into nofail
- introduce call_entry_hook
- small changes to dynamic traces
- smmu_page_walk code moved from smmuv3.c to this file
- remove smmu_translate*

v3 -> v4:
- reworked page table walk to prepare for VFIO integration
  (capability to scan a range of IOVA). Same function is used
  for translate for a single iova. This is largely inspired
  from intel_iommu.c
- as the translate function was not straightforward to me,
  I tried to stick more closely to the VMSA spec.
- remove support of nested stage (kernel driver does not
  support it anyway)
- use error_report and trace events
- add aa64[] field in SMMUTransCfg
---
 hw/arm/smmu-common.c         | 343 +++++++++++++++++++++++++++++++++++++++++++
 hw/arm/smmu-internal.h       | 105 +++++++++++++
 hw/arm/trace-events          |  12 ++
 include/hw/arm/smmu-common.h |   4 +
 4 files changed, 464 insertions(+)
 create mode 100644 hw/arm/smmu-internal.h

diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
index 2a94547..f476120 100644
--- a/hw/arm/smmu-common.c
+++ b/hw/arm/smmu-common.c
@@ -29,6 +29,349 @@
 
 #include "qemu/error-report.h"
 #include "hw/arm/smmu-common.h"
+#include "smmu-internal.h"
+
+/*************************/
+/* VMSAv8-64 Translation */
+/*************************/
+
+/**
+ * get_pte - Get the content of a page table entry located in
+ * @base_addr[@index]
+ */
+static uint64_t get_pte(dma_addr_t baseaddr, uint32_t index)
+{
+    uint64_t pte;
+
+    if (smmu_read_sysmem(baseaddr + index * sizeof(pte),
+                         &pte, sizeof(pte), false)) {
+        error_report("can't read pte at address=0x%"PRIx64,
+                     baseaddr + index * sizeof(pte));
+        pte = (uint64_t)-1;
+        return pte;
+    }
+    trace_smmu_get_pte(baseaddr, index, baseaddr + index * sizeof(pte), pte);
+    /* TODO: handle endianness */
+    return pte;
+}
+
+/* VMSAv8-64 Translation Table Format Descriptor Decoding */
+
+#define PTE_ADDRESS(pte, shift) (extract64(pte, shift, 47 - shift) << shift)
+
+/**
+ * get_page_pte_address - returns the L3 descriptor output address,
+ * ie. the page frame
+ * ARM ARM spec: Figure D4-17 VMSAv8-64 level 3 descriptor format
+ */
+static inline hwaddr get_page_pte_address(uint64_t pte, int granule_sz)
+{
+    return PTE_ADDRESS(pte, granule_sz);
+}
+
+/**
+ * get_table_pte_address - return table descriptor output address,
+ * ie. address of next level table
+ * ARM ARM Figure D4-16 VMSAv8-64 level0, level1, and level 2 descriptor formats
+ */
+static inline hwaddr get_table_pte_address(uint64_t pte, int granule_sz)
+{
+    return PTE_ADDRESS(pte, granule_sz);
+}
+
+/**
+ * get_block_pte_address - return block descriptor output address and block size
+ * ARM ARM Figure D4-16 VMSAv8-64 level0, level1, and level 2 descriptor formats
+ */
+static hwaddr get_block_pte_address(uint64_t pte, int level, int granule_sz,
+                                    uint64_t *bsz)
+{
+    int n;
+
+    switch (granule_sz) {
+    case 12:
+        if (level == 1) {
+            n = 30;
+        } else if (level == 2) {
+            n = 21;
+        } else {
+            goto error_out;
+        }
+        break;
+    case 14:
+        if (level == 2) {
+            n = 25;
+        } else {
+            goto error_out;
+        }
+        break;
+    case 16:
+        if (level == 2) {
+            n = 29;
+        } else {
+            goto error_out;
+        }
+        break;
+    default:
+            goto error_out;
+    }
+    *bsz = 1 << n;
+    return PTE_ADDRESS(pte, n);
+
+error_out:
+
+    error_report("unexpected granule_sz=%d/level=%d for block pte",
+                 granule_sz, level);
+    *bsz = 0;
+    return (hwaddr)-1;
+}
+
+static int call_entry_hook(uint64_t iova, uint64_t mask, uint64_t gpa,
+                           int perm, smmu_page_walk_hook hook_fn, void *private)
+{
+    IOMMUTLBEntry entry;
+    int ret;
+
+    entry.target_as = &address_space_memory;
+    entry.iova = iova & mask;
+    entry.translated_addr = gpa;
+    entry.addr_mask = ~mask;
+    entry.perm = perm;
+
+    ret = hook_fn(&entry, private);
+    if (ret) {
+        error_report("%s hook returned %d", __func__, ret);
+    }
+    return ret;
+}
+
+/**
+ * smmu_page_walk_level_64 - Walk an IOVA range from a specific level
+ * @baseaddr: table base address corresponding to @level
+ * @level: level
+ * @cfg: translation config
+ * @start: start of the IOVA range
+ * @end: end of the IOVA range
+ * @hook_fn: the hook to be called for each detected area
+ * @private: private data for the hook function
+ * @flags: access flags of the parent
+ * @nofail: indicates whether each iova of the range
+ *  must be translated or whether failure is allowed
+ *
+ * Return 0 on success, < 0 on errors not related to translation
+ * process, > 0 on errors related to translation process (only
+ * if nofail is set)
+ */
+static int
+smmu_page_walk_level_64(dma_addr_t baseaddr, int level,
+                        SMMUTransCfg *cfg, uint64_t start, uint64_t end,
+                        smmu_page_walk_hook hook_fn, void *private,
+                        IOMMUAccessFlags flags, bool nofail)
+{
+    uint64_t subpage_size, subpage_mask, pte, iova = start;
+    int ret, granule_sz, stage, perm;
+
+    granule_sz = cfg->granule_sz;
+    stage = cfg->stage;
+    subpage_size = 1ULL << level_shift(level, granule_sz);
+    subpage_mask = level_page_mask(level, granule_sz);
+
+    trace_smmu_page_walk_level_in(level, baseaddr, granule_sz,
+                                  start, end, flags, subpage_size);
+
+    while (iova < end) {
+        dma_addr_t next_table_baseaddr;
+        uint64_t iova_next, pte_addr;
+        uint32_t offset;
+
+        iova_next = (iova & subpage_mask) + subpage_size;
+        offset = iova_level_offset(iova, level, granule_sz);
+        pte_addr = baseaddr + offset * sizeof(pte);
+        pte = get_pte(baseaddr, offset);
+
+        trace_smmu_page_walk_level(level, iova, subpage_size,
+                                   baseaddr, offset, pte);
+
+        if (pte == (uint64_t)-1) {
+            if (nofail) {
+                return SMMU_TRANS_ERR_WALK_EXT_ABRT;
+            }
+            goto next;
+        }
+        if (is_invalid_pte(pte) || is_reserved_pte(pte, level)) {
+            trace_smmu_page_walk_level_res_invalid_pte(stage, level, baseaddr,
+                                                       pte_addr, offset, pte);
+            if (nofail) {
+                return SMMU_TRANS_ERR_TRANS;
+            }
+            goto next;
+        }
+
+        if (is_page_pte(pte, level)) {
+            uint64_t gpa = get_page_pte_address(pte, granule_sz);
+
+            perm = flags & pte_ap_to_perm(pte, true);
+
+            trace_smmu_page_walk_level_page_pte(stage, level, iova,
+                                                baseaddr, pte_addr, pte, gpa);
+            ret = call_entry_hook(iova, subpage_mask, gpa, perm,
+                                  hook_fn, private);
+            if (ret) {
+                return ret;
+            }
+            goto next;
+        }
+        if (is_block_pte(pte, level)) {
+            size_t target_page_size = qemu_target_page_size();
+            uint64_t block_size, top_iova;
+            hwaddr gpa, block_gpa;
+
+            block_gpa = get_block_pte_address(pte, level, granule_sz,
+                                              &block_size);
+            perm = flags & pte_ap_to_perm(pte, true);
+
+            if (block_gpa == -1) {
+                if (nofail) {
+                    return SMMU_TRANS_ERR_WALK_EXT_ABRT;
+                } else {
+                    goto next;
+                }
+            }
+            trace_smmu_page_walk_level_block_pte(stage, level, baseaddr,
+                                                 pte_addr, pte, iova, block_gpa,
+                                                 (int)(block_size >> 20));
+
+            gpa = block_gpa + (iova & (block_size - 1));
+            if ((block_gpa == gpa) && (end >= iova_next - 1)) {
+                ret = call_entry_hook(iova, ~(block_size - 1), block_gpa,
+                                      perm, hook_fn, private);
+                if (ret) {
+                    return ret;
+                }
+                goto next;
+            } else {
+                top_iova = MIN(end, iova_next);
+                while (iova < top_iova) {
+                    gpa = block_gpa + (iova & (block_size - 1));
+                    ret = call_entry_hook(iova, ~(target_page_size - 1),
+                                          gpa, perm, hook_fn, private);
+                    if (ret) {
+                        return ret;
+                    }
+                    iova += target_page_size;
+                }
+            }
+        }
+        if (level == 3) {
+            goto next;
+        }
+        /* table pte */
+        next_table_baseaddr = get_table_pte_address(pte, granule_sz);
+        trace_smmu_page_walk_level_table_pte(stage, level, baseaddr, pte_addr,
+                                             pte, next_table_baseaddr);
+        perm = flags & pte_ap_to_perm(pte, false);
+        ret = smmu_page_walk_level_64(next_table_baseaddr, level + 1, cfg,
+                                      iova, MIN(iova_next, end),
+                                      hook_fn, private, perm, nofail);
+        if (ret) {
+            return ret;
+        }
+
+next:
+        iova = iova_next;
+    }
+
+    return SMMU_TRANS_ERR_NONE;
+}
+
+/**
+ * smmu_page_walk - walk a specific IOVA range from the initial
+ * lookup level, and call the hook for each valid entry
+ *
+ * @cfg: translation config
+ * @start: start of the IOVA range
+ * @end: end of the IOVA range
+ * @nofail: if true, each IOVA within the range must have a translation
+ * @hook_fn: the hook to be called for each detected area
+ * @private: private data for the hook function
+ */
+int smmu_page_walk(SMMUTransCfg *cfg, uint64_t start, uint64_t end,
+                   bool nofail, smmu_page_walk_hook hook_fn, void *private)
+{
+    uint64_t roof = MIN(end, (1ULL << (64 - cfg->tsz)) - 1);
+    IOMMUAccessFlags perm = IOMMU_ACCESS_FLAG(true, true);
+    int stage = cfg->stage;
+    dma_addr_t ttbr;
+
+    if (!hook_fn) {
+        return 0;
+    }
+
+    if (!cfg->aa64) {
+        error_report("VMSAv8-32 page walk is not yet implemented");
+        abort();
+    }
+
+    ttbr = extract64(cfg->ttbr, 0, 48);
+    trace_smmu_page_walk(stage, cfg->ttbr, cfg->initial_level, start, roof);
+
+    return smmu_page_walk_level_64(ttbr, cfg->initial_level, cfg, start, roof,
+                                   hook_fn, private, perm, nofail);
+}
+
+/**
+ * set_translated_address: page table walk callback for smmu_translate
+ *
+ * Once a leaf entry is found, apply the offset to the translated address
+ * and check the permissions.
+ *
+ * @entry: entry filled by the page table walk function, i.e. contains the
+ * leaf entry iova/translated addr and permission flags
+ * @private: pointer to the original entry that must be translated
+ */
+static int set_translated_address(IOMMUTLBEntry *entry, void *private)
+{
+    IOMMUTLBEntry *tlbe_in = (IOMMUTLBEntry *)private;
+    size_t offset = tlbe_in->iova - entry->iova;
+
+    if (((tlbe_in->perm & IOMMU_RO) && !(entry->perm & IOMMU_RO)) ||
+        ((tlbe_in->perm & IOMMU_WO) && !(entry->perm & IOMMU_WO))) {
+        return SMMU_TRANS_ERR_PERM;
+    }
+    tlbe_in->translated_addr = entry->translated_addr + offset;
+    trace_smmu_set_translated_address(tlbe_in->iova, tlbe_in->translated_addr);
+    return 0;
+}
+
+/**
+ * smmu_translate - Attempt to translate a given entry according to @cfg
+ *
+ * @cfg: translation configuration
+ * @tlbe: entry pre-filled with the input iova, mask
+ *
+ * Return: non-zero if no mapping is found for tlbe->iova or if the access
+ * permission does not match
+ */
+int smmu_translate(SMMUTransCfg *cfg, IOMMUTLBEntry *tlbe)
+{
+    int ret = 0;
+
+    if (cfg->bypassed || cfg->disabled) {
+        return 0;
+    }
+
+    ret = smmu_page_walk(cfg, tlbe->iova, tlbe->iova + 1, true /* nofail */,
+                         set_translated_address, tlbe);
+
+    if (ret) {
+        error_report("translation failed for iova=0x%"PRIx64" perm=%d (%d)",
+                     tlbe->iova, tlbe->perm, ret);
+        goto exit;
+    }
+
+exit:
+    return ret;
+}
 
 inline MemTxResult smmu_read_sysmem(dma_addr_t addr, void *buf, dma_addr_t len,
                                     bool secure)
diff --git a/hw/arm/smmu-internal.h b/hw/arm/smmu-internal.h
new file mode 100644
index 0000000..aeeadd4
--- /dev/null
+++ b/hw/arm/smmu-internal.h
@@ -0,0 +1,105 @@
+/*
+ * ARM SMMU support - Internal API
+ *
+ * Copyright (c) 2017 Red Hat, Inc.
+ * Written by Eric Auger
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation, either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef HW_ARM_SMMU_INTERNAL_H
+#define HW_ARM_SMMU_INTERNAL_H
+
+#define ARM_LPAE_MAX_ADDR_BITS          48
+#define ARM_LPAE_MAX_LEVELS             4
+
+/* PTE Manipulation */
+
+#define ARM_LPAE_PTE_TYPE_SHIFT         0
+#define ARM_LPAE_PTE_TYPE_MASK          0x3
+
+#define ARM_LPAE_PTE_TYPE_BLOCK         1
+#define ARM_LPAE_PTE_TYPE_RESERVED      1
+#define ARM_LPAE_PTE_TYPE_TABLE         3
+#define ARM_LPAE_PTE_TYPE_PAGE          3
+
+#define ARM_LPAE_PTE_VALID              (1 << 0)
+
+static inline bool is_invalid_pte(uint64_t pte)
+{
+    return !(pte & ARM_LPAE_PTE_VALID);
+}
+
+static inline bool is_reserved_pte(uint64_t pte, int level)
+{
+    return ((level == 3) &&
+            ((pte & ARM_LPAE_PTE_TYPE_MASK) == ARM_LPAE_PTE_TYPE_RESERVED));
+}
+
+static inline bool is_block_pte(uint64_t pte, int level)
+{
+    return ((level < 3) &&
+            ((pte & ARM_LPAE_PTE_TYPE_MASK) == ARM_LPAE_PTE_TYPE_BLOCK));
+}
+
+static inline bool is_table_pte(uint64_t pte, int level)
+{
+    return ((level < 3) &&
+            ((pte & ARM_LPAE_PTE_TYPE_MASK) == ARM_LPAE_PTE_TYPE_TABLE));
+}
+
+static inline bool is_page_pte(uint64_t pte, int level)
+{
+    return ((level == 3) &&
+            ((pte & ARM_LPAE_PTE_TYPE_MASK) == ARM_LPAE_PTE_TYPE_PAGE));
+}
+
+static inline IOMMUAccessFlags pte_ap_to_perm(uint64_t pte, bool is_leaf)
+{
+    int ap;
+    IOMMUAccessFlags flags;
+
+    if (is_leaf) {
+        ap = extract64(pte, 6, 2);
+    } else {
+        ap = extract64(pte, 61, 2);
+    }
+    flags = IOMMU_ACCESS_FLAG(true, !(ap & 0x2));
+    return flags;
+}
+
+/* Level Indexing */
+
+static inline int level_shift(int level, int granule_sz)
+{
+    return granule_sz + (3 - level) * (granule_sz - 3);
+}
+
+static inline uint64_t level_page_mask(int level, int granule_sz)
+{
+    return ~((1ULL << level_shift(level, granule_sz)) - 1);
+}
+
+/**
+ * TODO: handle the case where the level resolves fewer than
+ * granule_sz - 3 IA bits.
+ */
+static inline
+uint64_t iova_level_offset(uint64_t iova, int level, int granule_sz)
+{
+    return (iova >> level_shift(level, granule_sz)) &
+            ((1ULL << (granule_sz - 3)) - 1);
+}
+
+#endif
diff --git a/hw/arm/trace-events b/hw/arm/trace-events
index 193063e..c67cd39 100644
--- a/hw/arm/trace-events
+++ b/hw/arm/trace-events
@@ -2,3 +2,15 @@
 
 # hw/arm/virt-acpi-build.c
 virt_acpi_setup(void) "No fw cfg or ACPI disabled. Bailing out."
+
+# hw/arm/smmu-common.c
+
+smmu_page_walk(int stage, uint64_t baseaddr, int first_level, uint64_t start, uint64_t end) "stage=%d, baseaddr=0x%"PRIx64", first level=%d, start=0x%"PRIx64", end=0x%"PRIx64
+smmu_page_walk_level_in(int level, uint64_t baseaddr, int granule_sz, uint64_t start, uint64_t end, int flags, uint64_t subpage_size) "level=%d baseaddr=0x%"PRIx64" granule=%d, start=0x%"PRIx64" end=0x%"PRIx64" flags=%d subpage_size=0x%"PRIx64
+smmu_page_walk_level(int level, uint64_t iova, size_t subpage_size, uint64_t baseaddr, uint32_t offset, uint64_t pte) "level=%d iova=0x%"PRIx64" subpage_sz=0x%zx baseaddr=0x%"PRIx64" offset=%d => pte=0x%"PRIx64
+smmu_page_walk_level_res_invalid_pte(int stage, int level, uint64_t baseaddr, uint64_t pteaddr, uint32_t offset, uint64_t pte) "stage=%d level=%d base@=0x%"PRIx64" pte@=0x%"PRIx64" offset=%d pte=0x%"PRIx64
+smmu_page_walk_level_page_pte(int stage, int level,  uint64_t iova, uint64_t baseaddr, uint64_t pteaddr, uint64_t pte, uint64_t address) "stage=%d level=%d iova=0x%"PRIx64" base@=0x%"PRIx64" pte@=0x%"PRIx64" pte=0x%"PRIx64" page address = 0x%"PRIx64
+smmu_page_walk_level_block_pte(int stage, int level, uint64_t baseaddr, uint64_t pteaddr, uint64_t pte, uint64_t iova, uint64_t gpa, int bsize_mb) "stage=%d level=%d base@=0x%"PRIx64" pte@=0x%"PRIx64" pte=0x%"PRIx64" iova=0x%"PRIx64" block address = 0x%"PRIx64" block size = %d MiB"
+smmu_page_walk_level_table_pte(int stage, int level, uint64_t baseaddr, uint64_t pteaddr, uint64_t pte, uint64_t address) "stage=%d, level=%d base@=0x%"PRIx64" pte@=0x%"PRIx64" pte=0x%"PRIx64" next table address = 0x%"PRIx64
+smmu_get_pte(uint64_t baseaddr, int index, uint64_t pteaddr, uint64_t pte) "baseaddr=0x%"PRIx64" index=0x%x, pteaddr=0x%"PRIx64", pte=0x%"PRIx64
+smmu_set_translated_address(hwaddr iova, hwaddr pa) "iova = 0x%"PRIx64" -> pa = 0x%"PRIx64
diff --git a/include/hw/arm/smmu-common.h b/include/hw/arm/smmu-common.h
index a5999b0..112a11c 100644
--- a/include/hw/arm/smmu-common.h
+++ b/include/hw/arm/smmu-common.h
@@ -116,4 +116,8 @@ MemTxResult smmu_read_sysmem(dma_addr_t addr, void *buf,
                              dma_addr_t len, bool secure);
 void smmu_write_sysmem(dma_addr_t addr, void *buf, dma_addr_t len, bool secure);
 
+int smmu_translate(SMMUTransCfg *cfg, IOMMUTLBEntry *tlbe);
+int smmu_page_walk(SMMUTransCfg *cfg, uint64_t start, uint64_t end,
+                   bool nofail, smmu_page_walk_hook hook_fn, void *private);
+
 #endif  /* HW_ARM_SMMU_COMMON */
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [Qemu-devel] [PATCH v7 05/20] hw/arm/smmuv3: Skeleton
  2017-09-01 17:21 [Qemu-devel] [PATCH v7 00/20] ARM SMMUv3 Emulation Support Eric Auger
                   ` (3 preceding siblings ...)
  2017-09-01 17:21 ` [Qemu-devel] [PATCH v7 04/20] hw/arm/smmu-common: VMSAv8-64 page table walk Eric Auger
@ 2017-09-01 17:21 ` Eric Auger
  2017-09-08 10:52   ` [Qemu-devel] [Qemu-arm] " Linu Cherian
  2017-10-09 16:17   ` [Qemu-devel] " Peter Maydell
  2017-09-01 17:21 ` [Qemu-devel] [PATCH v7 06/20] hw/arm/smmuv3: Wired IRQ and GERROR helpers Eric Auger
                   ` (19 subsequent siblings)
  24 siblings, 2 replies; 72+ messages in thread
From: Eric Auger @ 2017-09-01 17:21 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, qemu-arm, qemu-devel,
	prem.mallappa, alex.williamson
  Cc: drjones, christoffer.dall, Radha.Chintakuntla, Sunil.Goutham,
	mohun106, tcain, bharat.bhushan, tn, mst, will.deacon,
	jean-philippe.brucker, robin.murphy, peterx, edgar.iglesias,
	wtownsen

From: Prem Mallappa <prem.mallappa@broadcom.com>

This patch implements a skeleton for the smmuv3 device.
Datatypes and register definitions are introduced. The MMIO
region, the interrupts and the queues are initialized (PRI is
not supported).

Only the MMIO read operation is implemented here.

Signed-off-by: Prem Mallappa <prem.mallappa@broadcom.com>
Signed-off-by: Eric Auger <eric.auger@redhat.com>

---
v6 -> v7:
- split into several patches

v5 -> v6:
- Use IOMMUMemoryRegion
- regs become uint32_t and fix 64b MMIO access (.impl)
- trace_smmuv3_write/read_mmio take the size param

v4 -> v5:
- change smmuv3_translate proto (IOMMUAccessFlags flag)
- has_stagex replaced by is_ste_stagex
- smmu_cfg_populate removed
- added smmuv3_decode_config and reworked error management
- rework the naming of IOMMU mrs
- fix SMMU_CMDQ_CONS offset

v3 -> v4:
- smmu_irq_update
- fix hash key allocation
- set smmu_iommu_ops
- set SMMU_REG_CR0
- smmuv3_translate: ret.perm not set in bypass mode
- use trace events
- renamed STM2U64 into L1STD_L2PTR and STMSPAN into L1STD_SPAN
- rework smmu_find_ste
- fix tg2granule: in TT0, 0b10 corresponds to 16kB

v2 -> v3:
- move creation of include/hw/arm/smmuv3.h to this patch to fix compilation issue
- compilation allowed
- fix sbus allocation in smmu_init_pci_iommu
- restructure code into headers
- misc cleanups
---
 hw/arm/Makefile.objs     |   2 +-
 hw/arm/smmuv3-internal.h | 201 +++++++++++++++++++++++++++++++++++++++
 hw/arm/smmuv3.c          | 239 +++++++++++++++++++++++++++++++++++++++++++++++
 hw/arm/trace-events      |   3 +
 include/hw/arm/smmuv3.h  |  79 ++++++++++++++++
 5 files changed, 523 insertions(+), 1 deletion(-)
 create mode 100644 hw/arm/smmuv3-internal.h
 create mode 100644 hw/arm/smmuv3.c
 create mode 100644 include/hw/arm/smmuv3.h

diff --git a/hw/arm/Makefile.objs b/hw/arm/Makefile.objs
index 5b2d38d..a7c808b 100644
--- a/hw/arm/Makefile.objs
+++ b/hw/arm/Makefile.objs
@@ -19,4 +19,4 @@ obj-$(CONFIG_FSL_IMX31) += fsl-imx31.o kzm.o
 obj-$(CONFIG_FSL_IMX6) += fsl-imx6.o sabrelite.o
 obj-$(CONFIG_ASPEED_SOC) += aspeed_soc.o aspeed.o
 obj-$(CONFIG_MPS2) += mps2.o
-obj-$(CONFIG_ARM_SMMUV3) += smmu-common.o
+obj-$(CONFIG_ARM_SMMUV3) += smmu-common.o smmuv3.o
diff --git a/hw/arm/smmuv3-internal.h b/hw/arm/smmuv3-internal.h
new file mode 100644
index 0000000..488acc8
--- /dev/null
+++ b/hw/arm/smmuv3-internal.h
@@ -0,0 +1,201 @@
+/*
+ * ARM SMMUv3 support - Internal API
+ *
+ * Copyright (C) 2014-2016 Broadcom Corporation
+ * Copyright (c) 2017 Red Hat, Inc.
+ * Written by Prem Mallappa, Eric Auger
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation, either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef HW_ARM_SMMU_V3_INTERNAL_H
+#define HW_ARM_SMMU_V3_INTERNAL_H
+
+#include "trace.h"
+#include "qemu/error-report.h"
+#include "hw/arm/smmu-common.h"
+
+/*****************************
+ * MMIO Register
+ *****************************/
+enum {
+    SMMU_REG_IDR0            = 0x0,
+
+/* IDR0 Field Values and supported features */
+
+#define SMMU_IDR0_S2P      1  /* stage 2 */
+#define SMMU_IDR0_S1P      1  /* stage 1 */
+#define SMMU_IDR0_TTF      2  /* AArch64 only - not AArch32 (LPAE) */
+#define SMMU_IDR0_COHACC   1  /* IO coherent access */
+#define SMMU_IDR0_HTTU     2  /* Access and Dirty flag update */
+#define SMMU_IDR0_HYP      0  /* Hypervisor Stage 1 contexts */
+#define SMMU_IDR0_ATS      0  /* PCIe RC ATS */
+#define SMMU_IDR0_ASID16   1  /* 16-bit ASID */
+#define SMMU_IDR0_PRI      0  /* Page Request Interface */
+#define SMMU_IDR0_VMID16   0  /* 16-bit VMID */
+#define SMMU_IDR0_CD2L     0  /* 2-level Context Descriptor table */
+#define SMMU_IDR0_STALL    1  /* Stalling fault model */
+#define SMMU_IDR0_TERM     1  /* Termination model behaviour */
+#define SMMU_IDR0_STLEVEL  1  /* Multi-level Stream Table */
+
+#define SMMU_IDR0_S2P_SHIFT      0
+#define SMMU_IDR0_S1P_SHIFT      1
+#define SMMU_IDR0_TTF_SHIFT      2
+#define SMMU_IDR0_COHACC_SHIFT   4
+#define SMMU_IDR0_HTTU_SHIFT     6
+#define SMMU_IDR0_HYP_SHIFT      9
+#define SMMU_IDR0_ATS_SHIFT      10
+#define SMMU_IDR0_ASID16_SHIFT   12
+#define SMMU_IDR0_PRI_SHIFT      16
+#define SMMU_IDR0_VMID16_SHIFT   18
+#define SMMU_IDR0_CD2L_SHIFT     19
+#define SMMU_IDR0_STALL_SHIFT    24
+#define SMMU_IDR0_TERM_SHIFT     26
+#define SMMU_IDR0_STLEVEL_SHIFT  27
+
+    SMMU_REG_IDR1            = 0x4,
+#define SMMU_IDR1_SIDSIZE 16
+    SMMU_REG_IDR2            = 0x8,
+    SMMU_REG_IDR3            = 0xc,
+    SMMU_REG_IDR4            = 0x10,
+    SMMU_REG_IDR5            = 0x14,
+#define SMMU_IDR5_GRAN_SHIFT 4
+#define SMMU_IDR5_GRAN       0b101 /* GRAN4K, GRAN64K */
+#define SMMU_IDR5_OAS        4     /* 44 bits */
+    SMMU_REG_IIDR            = 0x1c,
+    SMMU_REG_CR0             = 0x20,
+
+#define SMMU_CR0_SMMU_ENABLE (1 << 0)
+#define SMMU_CR0_PRIQ_ENABLE (1 << 1)
+#define SMMU_CR0_EVTQ_ENABLE (1 << 2)
+#define SMMU_CR0_CMDQ_ENABLE (1 << 3)
+#define SMMU_CR0_ATS_CHECK   (1 << 4)
+
+    SMMU_REG_CR0_ACK         = 0x24,
+    SMMU_REG_CR1             = 0x28,
+    SMMU_REG_CR2             = 0x2c,
+
+    SMMU_REG_STATUSR         = 0x40,
+
+    SMMU_REG_IRQ_CTRL        = 0x50,
+    SMMU_REG_IRQ_CTRL_ACK    = 0x54,
+
+#define SMMU_IRQ_CTRL_GERROR_EN (1 << 0)
+#define SMMU_IRQ_CTRL_EVENT_EN  (1 << 1)
+#define SMMU_IRQ_CTRL_PRI_EN    (1 << 2)
+
+    SMMU_REG_GERROR          = 0x60,
+
+#define SMMU_GERROR_CMDQ           (1 << 0)
+#define SMMU_GERROR_EVENTQ_ABT     (1 << 2)
+#define SMMU_GERROR_PRIQ_ABT       (1 << 3)
+#define SMMU_GERROR_MSI_CMDQ_ABT   (1 << 4)
+#define SMMU_GERROR_MSI_EVENTQ_ABT (1 << 5)
+#define SMMU_GERROR_MSI_PRIQ_ABT   (1 << 6)
+#define SMMU_GERROR_MSI_GERROR_ABT (1 << 7)
+#define SMMU_GERROR_SFM_ERR        (1 << 8)
+
+    SMMU_REG_GERRORN         = 0x64,
+    SMMU_REG_GERROR_IRQ_CFG0 = 0x68,
+    SMMU_REG_GERROR_IRQ_CFG1 = 0x70,
+    SMMU_REG_GERROR_IRQ_CFG2 = 0x74,
+
+    /* SMMU_BASE_RA Applies to STRTAB_BASE, CMDQ_BASE and EVTQ_BASE */
+#define SMMU_BASE_RA        (1ULL << 62)
+    SMMU_REG_STRTAB_BASE     = 0x80,
+    SMMU_REG_STRTAB_BASE_CFG = 0x88,
+
+    SMMU_REG_CMDQ_BASE       = 0x90,
+    SMMU_REG_CMDQ_PROD       = 0x98,
+    SMMU_REG_CMDQ_CONS       = 0x9c,
+    /* CMD Consumer (CONS) */
+#define SMMU_CMD_CONS_ERR_SHIFT        24
+#define SMMU_CMD_CONS_ERR_BITS         7
+
+    SMMU_REG_EVTQ_BASE       = 0xa0,
+    SMMU_REG_EVTQ_PROD       = 0xa8,
+    SMMU_REG_EVTQ_CONS       = 0xac,
+    SMMU_REG_EVTQ_IRQ_CFG0   = 0xb0,
+    SMMU_REG_EVTQ_IRQ_CFG1   = 0xb8,
+    SMMU_REG_EVTQ_IRQ_CFG2   = 0xbc,
+
+    SMMU_REG_PRIQ_BASE       = 0xc0,
+    SMMU_REG_PRIQ_PROD       = 0xc8,
+    SMMU_REG_PRIQ_CONS       = 0xcc,
+    SMMU_REG_PRIQ_IRQ_CFG0   = 0xd0,
+    SMMU_REG_PRIQ_IRQ_CFG1   = 0xd8,
+    SMMU_REG_PRIQ_IRQ_CFG2   = 0xdc,
+
+    SMMU_ID_REGS_OFFSET      = 0xfd0,
+
+    /* Secure registers are not used for now */
+    SMMU_SECURE_OFFSET       = 0x8000,
+};
+
+/**********************
+ * Data Structures
+ **********************/
+
+struct __smmu_data2 {
+    uint32_t word[2];
+};
+
+struct __smmu_data8 {
+    uint32_t word[8];
+};
+
+struct __smmu_data16 {
+    uint32_t word[16];
+};
+
+struct __smmu_data4 {
+    uint32_t word[4];
+};
+
+typedef struct __smmu_data4  Cmd; /* Command Entry */
+typedef struct __smmu_data8  Evt; /* Event Entry */
+
+/*****************************
+ *  Register Access Primitives
+ *****************************/
+
+static inline void smmu_write32_reg(SMMUV3State *s, uint32_t addr, uint32_t val)
+{
+    s->regs[addr >> 2] = val;
+}
+
+static inline void smmu_write64_reg(SMMUV3State *s, uint32_t addr, uint64_t val)
+{
+    addr >>= 2;
+    s->regs[addr] = extract64(val, 0, 32);
+    s->regs[addr + 1] = extract64(val, 32, 32);
+}
+
+static inline uint32_t smmu_read32_reg(SMMUV3State *s, uint32_t addr)
+{
+    return s->regs[addr >> 2];
+}
+
+static inline uint64_t smmu_read64_reg(SMMUV3State *s, uint32_t addr)
+{
+    addr >>= 2;
+    return s->regs[addr] | ((uint64_t)(s->regs[addr + 1]) << 32);
+}
+
+static inline int smmu_enabled(SMMUV3State *s)
+{
+    return smmu_read32_reg(s, SMMU_REG_CR0) & SMMU_CR0_SMMU_ENABLE;
+}
+
+#endif
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
new file mode 100644
index 0000000..0a7cd1c
--- /dev/null
+++ b/hw/arm/smmuv3.c
@@ -0,0 +1,239 @@
+/*
+ * Copyright (C) 2014-2016 Broadcom Corporation
+ * Copyright (c) 2017 Red Hat, Inc.
+ * Written by Prem Mallappa, Eric Auger
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation, either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+#include "hw/boards.h"
+#include "sysemu/sysemu.h"
+#include "hw/sysbus.h"
+#include "hw/pci/pci.h"
+#include "exec/address-spaces.h"
+#include "trace.h"
+#include "qemu/error-report.h"
+
+#include "hw/arm/smmuv3.h"
+#include "smmuv3-internal.h"
+
+static void smmuv3_init_regs(SMMUV3State *s)
+{
+    uint32_t data =
+        SMMU_IDR0_STLEVEL << SMMU_IDR0_STLEVEL_SHIFT |
+        SMMU_IDR0_TERM    << SMMU_IDR0_TERM_SHIFT    |
+        SMMU_IDR0_STALL   << SMMU_IDR0_STALL_SHIFT   |
+        SMMU_IDR0_VMID16  << SMMU_IDR0_VMID16_SHIFT  |
+        SMMU_IDR0_PRI     << SMMU_IDR0_PRI_SHIFT     |
+        SMMU_IDR0_ASID16  << SMMU_IDR0_ASID16_SHIFT  |
+        SMMU_IDR0_ATS     << SMMU_IDR0_ATS_SHIFT     |
+        SMMU_IDR0_HYP     << SMMU_IDR0_HYP_SHIFT     |
+        SMMU_IDR0_HTTU    << SMMU_IDR0_HTTU_SHIFT    |
+        SMMU_IDR0_COHACC  << SMMU_IDR0_COHACC_SHIFT  |
+        SMMU_IDR0_TTF     << SMMU_IDR0_TTF_SHIFT     |
+        SMMU_IDR0_S1P     << SMMU_IDR0_S1P_SHIFT     |
+        SMMU_IDR0_S2P     << SMMU_IDR0_S2P_SHIFT;
+
+    smmu_write32_reg(s, SMMU_REG_IDR0, data);
+
+#define SMMU_QUEUE_SIZE_LOG2  19
+    data =
+        1 << 27 |                    /* Attr Types override */
+        SMMU_QUEUE_SIZE_LOG2 << 21 | /* Cmd Q size */
+        SMMU_QUEUE_SIZE_LOG2 << 16 | /* Event Q size */
+        SMMU_QUEUE_SIZE_LOG2 << 11 | /* PRI Q size */
+        0  << 6 |                    /* SSID not supported */
+        SMMU_IDR1_SIDSIZE;
+
+    smmu_write32_reg(s, SMMU_REG_IDR1, data);
+
+    s->sid_size = SMMU_IDR1_SIDSIZE;
+
+    data = SMMU_IDR5_GRAN << SMMU_IDR5_GRAN_SHIFT | SMMU_IDR5_OAS;
+
+    smmu_write32_reg(s, SMMU_REG_IDR5, data);
+}
+
+static void smmuv3_init_queues(SMMUV3State *s)
+{
+    s->cmdq.prod = 0;
+    s->cmdq.cons = 0;
+    s->cmdq.wrap.prod = 0;
+    s->cmdq.wrap.cons = 0;
+
+    s->evtq.prod = 0;
+    s->evtq.cons = 0;
+    s->evtq.wrap.prod = 0;
+    s->evtq.wrap.cons = 0;
+
+    s->cmdq.entries = SMMU_QUEUE_SIZE_LOG2;
+    s->cmdq.ent_size = sizeof(Cmd);
+    s->evtq.entries = SMMU_QUEUE_SIZE_LOG2;
+    s->evtq.ent_size = sizeof(Evt);
+}
+
+static void smmuv3_init(SMMUV3State *s)
+{
+    smmuv3_init_regs(s);
+    smmuv3_init_queues(s);
+}
+
+static inline void smmu_update_base_reg(SMMUV3State *s, uint64_t *base,
+                                        uint64_t val)
+{
+    *base = val & ~(SMMU_BASE_RA | 0x3fULL);
+}
+
+static void smmu_write_mmio_fixup(SMMUV3State *s, hwaddr *addr)
+{
+    switch (*addr) {
+    case 0x100a8: case 0x100ac:         /* Aliasing => page0 registers */
+    case 0x100c8: case 0x100cc:
+        *addr ^= (hwaddr)0x10000;
+    }
+}
+
+static void smmu_write_mmio(void *opaque, hwaddr addr,
+                            uint64_t val, unsigned size)
+{
+}
+
+static uint64_t smmu_read_mmio(void *opaque, hwaddr addr, unsigned size)
+{
+    SMMUState *sys = opaque;
+    SMMUV3State *s = SMMU_V3_DEV(sys);
+    uint64_t val;
+
+    smmu_write_mmio_fixup(s, &addr);
+
+    /* Primecell/Corelink ID registers */
+    switch (addr) {
+    case 0xFF0 ... 0xFFC:
+    case 0xFDC ... 0xFE4:
+        val = 0;
+        error_report("addr:0x%"PRIx64" val:0x%"PRIx64, addr, val);
+        break;
+    case SMMU_REG_STRTAB_BASE ... SMMU_REG_CMDQ_BASE:
+    case SMMU_REG_EVTQ_BASE:
+    case SMMU_REG_PRIQ_BASE ... SMMU_REG_PRIQ_IRQ_CFG1:
+        val = smmu_read64_reg(s, addr);
+        break;
+    default:
+        val = (uint64_t)smmu_read32_reg(s, addr);
+        break;
+    }
+
+    trace_smmuv3_read_mmio(addr, val, size);
+    return val;
+}
+
+static const MemoryRegionOps smmu_mem_ops = {
+    .read = smmu_read_mmio,
+    .write = smmu_write_mmio,
+    .endianness = DEVICE_LITTLE_ENDIAN,
+    .valid = {
+        .min_access_size = 4,
+        .max_access_size = 8,
+    },
+    .impl = {
+        .min_access_size = 4,
+        .max_access_size = 8,
+    },
+};
+
+static void smmu_init_irq(SMMUV3State *s, SysBusDevice *dev)
+{
+    int i;
+
+    for (i = 0; i < ARRAY_SIZE(s->irq); i++) {
+        sysbus_init_irq(dev, &s->irq[i]);
+    }
+}
+
+static void smmu_reset(DeviceState *dev)
+{
+    SMMUV3State *s = SMMU_V3_DEV(dev);
+    smmuv3_init(s);
+}
+
+static void smmu_realize(DeviceState *d, Error **errp)
+{
+    SMMUState *sys = SMMU_SYS_DEV(d);
+    SMMUV3State *s = SMMU_V3_DEV(sys);
+    SysBusDevice *dev = SYS_BUS_DEVICE(d);
+
+    memory_region_init_io(&sys->iomem, OBJECT(s),
+                          &smmu_mem_ops, sys, TYPE_SMMU_V3_DEV, 0x20000);
+
+    sys->mrtypename = g_strdup(TYPE_SMMUV3_IOMMU_MEMORY_REGION);
+
+    sysbus_init_mmio(dev, &sys->iomem);
+
+    smmu_init_irq(s, dev);
+}
+
+static const VMStateDescription vmstate_smmuv3 = {
+    .name = "smmuv3",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .fields = (VMStateField[]) {
+        VMSTATE_UINT32_ARRAY(regs, SMMUV3State, SMMU_NREGS),
+        VMSTATE_END_OF_LIST(),
+    },
+};
+
+static void smmuv3_instance_init(Object *obj)
+{
+    /* Nothing much to do here as of now */
+}
+
+static void smmuv3_class_init(ObjectClass *klass, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(klass);
+
+    dc->reset   = smmu_reset;
+    dc->vmsd    = &vmstate_smmuv3;
+    dc->realize = smmu_realize;
+}
+
+static void smmuv3_iommu_memory_region_class_init(ObjectClass *klass,
+                                                  void *data)
+{
+}
+
+static const TypeInfo smmuv3_type_info = {
+    .name          = TYPE_SMMU_V3_DEV,
+    .parent        = TYPE_SMMU_DEV_BASE,
+    .instance_size = sizeof(SMMUV3State),
+    .instance_init = smmuv3_instance_init,
+    .class_data    = NULL,
+    .class_size    = sizeof(SMMUV3Class),
+    .class_init    = smmuv3_class_init,
+};
+
+static const TypeInfo smmuv3_iommu_memory_region_info = {
+    .parent = TYPE_IOMMU_MEMORY_REGION,
+    .name = TYPE_SMMUV3_IOMMU_MEMORY_REGION,
+    .class_init = smmuv3_iommu_memory_region_class_init,
+};
+
+static void smmuv3_register_types(void)
+{
+    type_register(&smmuv3_type_info);
+    type_register(&smmuv3_iommu_memory_region_info);
+}
+
+type_init(smmuv3_register_types)
+
diff --git a/hw/arm/trace-events b/hw/arm/trace-events
index c67cd39..8affbf7 100644
--- a/hw/arm/trace-events
+++ b/hw/arm/trace-events
@@ -14,3 +14,6 @@ smmu_page_walk_level_block_pte(int stage, int level, uint64_t baseaddr, uint64_t
 smmu_page_walk_level_table_pte(int stage, int level, uint64_t baseaddr, uint64_t pteaddr, uint64_t pte, uint64_t address) "stage=%d, level=%d base@=0x%"PRIx64" pte@=0x%"PRIx64" pte=0x%"PRIx64" next table address = 0x%"PRIx64
 smmu_get_pte(uint64_t baseaddr, int index, uint64_t pteaddr, uint64_t pte) "baseaddr=0x%"PRIx64" index=0x%x, pteaddr=0x%"PRIx64", pte=0x%"PRIx64
 smmu_set_translated_address(hwaddr iova, hwaddr pa) "iova = 0x%"PRIx64" -> pa = 0x%"PRIx64
+
+# hw/arm/smmuv3.c
+smmuv3_read_mmio(hwaddr addr, uint64_t val, unsigned size) "addr: 0x%"PRIx64" val:0x%"PRIx64" size: 0x%x"
diff --git a/include/hw/arm/smmuv3.h b/include/hw/arm/smmuv3.h
new file mode 100644
index 0000000..0c8973d
--- /dev/null
+++ b/include/hw/arm/smmuv3.h
@@ -0,0 +1,79 @@
+/*
+ * Copyright (C) 2014-2016 Broadcom Corporation
+ * Copyright (c) 2017 Red Hat, Inc.
+ * Written by Prem Mallappa, Eric Auger
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation, either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef HW_ARM_SMMUV3_H
+#define HW_ARM_SMMUV3_H
+
+#include "hw/arm/smmu-common.h"
+
+#define TYPE_SMMUV3_IOMMU_MEMORY_REGION "smmuv3-iommu-memory-region"
+
+#define SMMU_NREGS            0x200
+
+typedef struct SMMUQueue {
+    hwaddr base;
+    uint32_t prod;
+    uint32_t cons;
+    union {
+        struct {
+            uint8_t prod:1;
+            uint8_t cons:1;
+        };
+        uint8_t unused;
+    } wrap;
+
+    uint16_t entries;           /* Number of entries */
+    uint8_t  ent_size;          /* Size of entry in bytes */
+    uint8_t  shift;             /* Size in log2 */
+} SMMUQueue;
+
+typedef struct SMMUV3State {
+    SMMUState     smmu_state;
+
+    /* Local cache of most-frequently used registers */
+#define SMMU_FEATURE_2LVL_STE (1 << 0)
+    uint32_t     features;
+    uint16_t     sid_size;
+    uint16_t     sid_split;
+    uint64_t     strtab_base;
+
+    uint32_t    regs[SMMU_NREGS];
+
+    qemu_irq     irq[4];
+    SMMUQueue    cmdq, evtq;
+
+} SMMUV3State;
+
+typedef enum {
+    SMMU_IRQ_EVTQ,
+    SMMU_IRQ_PRIQ,
+    SMMU_IRQ_CMD_SYNC,
+    SMMU_IRQ_GERROR,
+} SMMUIrq;
+
+typedef struct {
+    SMMUBaseClass smmu_base_class;
+} SMMUV3Class;
+
+#define TYPE_SMMU_V3_DEV   "smmuv3"
+#define SMMU_V3_DEV(obj) OBJECT_CHECK(SMMUV3State, (obj), TYPE_SMMU_V3_DEV)
+#define SMMU_V3_DEVICE_GET_CLASS(obj)                              \
+    OBJECT_GET_CLASS(SMMUBaseClass, (obj), TYPE_SMMU_V3_DEV)
+
+#endif
-- 
2.5.5


* [Qemu-devel] [PATCH v7 06/20] hw/arm/smmuv3: Wired IRQ and GERROR helpers
  2017-09-01 17:21 [Qemu-devel] [PATCH v7 00/20] ARM SMMUv3 Emulation Support Eric Auger
                   ` (4 preceding siblings ...)
  2017-09-01 17:21 ` [Qemu-devel] [PATCH v7 05/20] hw/arm/smmuv3: Skeleton Eric Auger
@ 2017-09-01 17:21 ` Eric Auger
  2017-10-09 17:01   ` Peter Maydell
  2017-09-01 17:21 ` [Qemu-devel] [PATCH v7 07/20] hw/arm/smmuv3: Queue helpers Eric Auger
                   ` (18 subsequent siblings)
  24 siblings, 1 reply; 72+ messages in thread
From: Eric Auger @ 2017-09-01 17:21 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, qemu-arm, qemu-devel,
	prem.mallappa, alex.williamson
  Cc: drjones, christoffer.dall, Radha.Chintakuntla, Sunil.Goutham,
	mohun106, tcain, bharat.bhushan, tn, mst, will.deacon,
	jean-philippe.brucker, robin.murphy, peterx, edgar.iglesias,
	wtownsen

We introduce some helpers to handle wired IRQs, in particular the
GERROR interrupt. The SMMU updates the GERROR register when a global
error occurs and SW acknowledges GERROR interrupts by writing GERRORN.

The wired interrupts are edge-sensitive.
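
As a rough illustration of the handshake described above, here is a
standalone toy model in C (not QEMU code). It assumes, as the
SMMU_PENDING_GERRORS() helper in this patch does, that an error is
pending while the corresponding GERROR and GERRORN bits differ. The
names GerrorModel, gerror_pending, gerror_raise and gerror_ack are
hypothetical, and raising/acknowledging by toggling bits is one reading
of the protocol rather than a copy of the helpers in this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model of the GERROR/GERRORN pair; hypothetical, not the QEMU state. */
typedef struct {
    uint32_t gerror;   /* written by the (emulated) SMMU */
    uint32_t gerrorn;  /* written by software to acknowledge */
} GerrorModel;

/* An error is pending while the two registers disagree on its bit. */
static uint32_t gerror_pending(const GerrorModel *m)
{
    return m->gerror ^ m->gerrorn;
}

/*
 * Device side: activate the errors in @mask that are not already pending.
 * Returns true if at least one new error became active, i.e. when an
 * edge-sensitive interrupt would be worth pulsing.
 */
static bool gerror_raise(GerrorModel *m, uint32_t mask)
{
    uint32_t fresh = mask & ~gerror_pending(m);

    m->gerror ^= fresh;
    return fresh != 0;
}

/* Software side: acknowledge only the errors that are actually pending. */
static void gerror_ack(GerrorModel *m, uint32_t mask)
{
    uint32_t acked = mask & gerror_pending(m);

    m->gerrorn ^= acked;
    assert((gerror_pending(m) & acked) == 0);
}
```

In this model, raising an error that is still pending changes nothing
and does not ask for a second pulse, which is the behaviour one would
expect from an edge-sensitive line.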

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---

Is the CMD_SYNC interrupt enabled somewhere?
---
 hw/arm/smmuv3-internal.h | 20 ++++++++++++++++++
 hw/arm/smmuv3.c          | 55 ++++++++++++++++++++++++++++++++++++++++++++++++
 hw/arm/trace-events      |  2 ++
 3 files changed, 77 insertions(+)

diff --git a/hw/arm/smmuv3-internal.h b/hw/arm/smmuv3-internal.h
index 488acc8..2b44ee2 100644
--- a/hw/arm/smmuv3-internal.h
+++ b/hw/arm/smmuv3-internal.h
@@ -198,4 +198,24 @@ static inline int smmu_enabled(SMMUV3State *s)
     return smmu_read32_reg(s, SMMU_REG_CR0) & SMMU_CR0_SMMU_ENABLE;
 }
 
+/*****************************
+ * Interrupts
+ *****************************/
+
+#define smmu_evt_irq_enabled(s)                   \
+    (smmu_read64_reg(s, SMMU_REG_IRQ_CTRL) & SMMU_IRQ_CTRL_EVENT_EN)
+#define smmu_gerror_irq_enabled(s)                  \
+    (smmu_read64_reg(s, SMMU_REG_IRQ_CTRL) & SMMU_IRQ_CTRL_GERROR_EN)
+#define smmu_pri_irq_enabled(s)                 \
+    (smmu_read64_reg(s, SMMU_REG_IRQ_CTRL) & SMMU_IRQ_CTRL_PRI_EN)
+
+#define SMMU_PENDING_GERRORS(s) \
+    (smmu_read32_reg(s, SMMU_REG_GERROR) ^ \
+     smmu_read32_reg(s, SMMU_REG_GERRORN))
+
+#define SMMU_CMDQ_ERR(s) (SMMU_PENDING_GERRORS(s) & SMMU_GERROR_CMDQ)
+
+void smmuv3_irq_trigger(SMMUV3State *s, SMMUIrq irq, uint32_t gerror_val);
+void smmuv3_write_gerrorn(SMMUV3State *s, uint32_t gerrorn);
+
 #endif
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index 0a7cd1c..468134f 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -29,6 +29,61 @@
 #include "hw/arm/smmuv3.h"
 #include "smmuv3-internal.h"
 
+/**
+ * smmuv3_irq_trigger - pulse @irq if enabled and update
+ * GERROR register in case of GERROR interrupt
+ *
+ * @irq: irq type
+ * @gerror: gerror new value, only relevant if @irq is GERROR
+ */
+void smmuv3_irq_trigger(SMMUV3State *s, SMMUIrq irq, uint32_t gerror_val)
+{
+    uint32_t pending_gerrors = SMMU_PENDING_GERRORS(s);
+    bool pulse = false;
+
+    switch (irq) {
+    case SMMU_IRQ_EVTQ:
+        pulse = smmu_evt_irq_enabled(s);
+        break;
+    case SMMU_IRQ_PRIQ:
+        pulse = smmu_pri_irq_enabled(s);
+        break;
+    case SMMU_IRQ_CMD_SYNC:
+        pulse = true;
+        break;
+    case SMMU_IRQ_GERROR:
+    {
+        /* don't toggle an already pending error */
+        uint32_t new_gerrors = ~pending_gerrors & gerror_val;
+        uint32_t gerror = smmu_read32_reg(s, SMMU_REG_GERROR);
+
+        smmu_write32_reg(s, SMMU_REG_GERROR, gerror | new_gerrors);
+
+        /* pulse the GERROR irq only if all fields were acked */
+        pulse = smmu_gerror_irq_enabled(s) && !pending_gerrors;
+        break;
+    }
+    }
+    if (pulse) {
+        trace_smmuv3_irq_trigger(irq,
+                                 smmu_read32_reg(s, SMMU_REG_GERROR),
+                                 SMMU_PENDING_GERRORS(s));
+        qemu_irq_pulse(s->irq[irq]);
+    }
+}
+
+void smmuv3_write_gerrorn(SMMUV3State *s, uint32_t gerrorn)
+{
+    uint32_t pending_gerrors = SMMU_PENDING_GERRORS(s);
+    uint32_t sanitized;
+
+    /* Make sure SW does not toggle irqs that are not active */
+    sanitized = gerrorn & pending_gerrors;
+
+    smmu_write32_reg(s, SMMU_REG_GERRORN, sanitized);
+    trace_smmuv3_write_gerrorn(gerrorn, sanitized, SMMU_PENDING_GERRORS(s));
+}
+
 static void smmuv3_init_regs(SMMUV3State *s)
 {
     uint32_t data =
diff --git a/hw/arm/trace-events b/hw/arm/trace-events
index 8affbf7..c1ce8eb 100644
--- a/hw/arm/trace-events
+++ b/hw/arm/trace-events
@@ -17,3 +17,5 @@ smmu_set_translated_address(hwaddr iova, hwaddr pa) "iova = 0x%"PRIx64" -> pa =
 
 #hw/arm/smmuv3.c
 smmuv3_read_mmio(hwaddr addr, uint64_t val, unsigned size) "addr: 0x%"PRIx64" val:0x%"PRIx64" size: 0x%x"
+smmuv3_irq_trigger(int irq, uint32_t gerror, uint32_t pending) "irq=%d gerror=0x%x pending gerrors=0x%x"
+smmuv3_write_gerrorn(uint32_t gerrorn, uint32_t sanitized, uint32_t pending) "gerrorn=0x%x sanitized=0x%x pending=0x%x"
-- 
2.5.5


* [Qemu-devel] [PATCH v7 07/20] hw/arm/smmuv3: Queue helpers
  2017-09-01 17:21 [Qemu-devel] [PATCH v7 00/20] ARM SMMUv3 Emulation Support Eric Auger
                   ` (5 preceding siblings ...)
  2017-09-01 17:21 ` [Qemu-devel] [PATCH v7 06/20] hw/arm/smmuv3: Wired IRQ and GERROR helpers Eric Auger
@ 2017-09-01 17:21 ` Eric Auger
  2017-10-09 17:12   ` Peter Maydell
  2017-09-01 17:21 ` [Qemu-devel] [PATCH v7 08/20] hw/arm/smmuv3: Implement MMIO write operations Eric Auger
                   ` (17 subsequent siblings)
  24 siblings, 1 reply; 72+ messages in thread
From: Eric Auger @ 2017-09-01 17:21 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, qemu-arm, qemu-devel,
	prem.mallappa, alex.williamson
  Cc: drjones, christoffer.dall, Radha.Chintakuntla, Sunil.Goutham,
	mohun106, tcain, bharat.bhushan, tn, mst, will.deacon,
	jean-philippe.brucker, robin.murphy, peterx, edgar.iglesias,
	wtownsen

We introduce helpers to read from and write to the circular queues.
smmuv3_read_cmdq and smmuv3_write_evtq will become static
later on.
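
To make the queue bookkeeping concrete, here is a minimal standalone
sketch in C (not the QEMU implementation). It assumes a queue of
2^shift entries whose PROD and CONS values each carry an index in the
low "shift" bits and a single wrap bit just above it, which is how the
Q_IDX/Q_WRAP macros in this patch split them. The RingPtrs type and the
ring_* helpers are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical producer/consumer pair for a queue of (1 << shift) entries. */
typedef struct {
    uint32_t prod;   /* wrap bit at bit 'shift', entry index below it */
    uint32_t cons;   /* same layout as prod */
    uint8_t  shift;  /* log2 of the number of entries */
} RingPtrs;

static uint32_t ring_idx(const RingPtrs *r, uint32_t pc)
{
    return pc & ((1u << r->shift) - 1);
}

static uint32_t ring_wrap(const RingPtrs *r, uint32_t pc)
{
    return (pc >> r->shift) & 1;
}

/* Empty: same index and same wrap bit. */
static bool ring_empty(const RingPtrs *r)
{
    return ring_idx(r, r->prod) == ring_idx(r, r->cons) &&
           ring_wrap(r, r->prod) == ring_wrap(r, r->cons);
}

/* Full: same index but the producer is one lap ahead. */
static bool ring_full(const RingPtrs *r)
{
    return ring_idx(r, r->prod) == ring_idx(r, r->cons) &&
           ring_wrap(r, r->prod) != ring_wrap(r, r->cons);
}

/*
 * Advance a pointer by one entry. The carry out of the index field
 * flips the wrap bit; anything above the wrap bit is masked off.
 */
static uint32_t ring_inc(const RingPtrs *r, uint32_t pc)
{
    assert(r->shift < 31);
    return (pc + 1) & ((1u << (r->shift + 1)) - 1);
}
```

Keeping index and wrap bit in one integer means a single increment
handles the wrap-around, and the value can be written to the PROD or
CONS register as is.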

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 hw/arm/smmuv3-internal.h | 48 ++++++++++++++++++++++++++++++-
 hw/arm/smmuv3.c          | 75 +++++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 121 insertions(+), 2 deletions(-)

diff --git a/hw/arm/smmuv3-internal.h b/hw/arm/smmuv3-internal.h
index 2b44ee2..d88f141 100644
--- a/hw/arm/smmuv3-internal.h
+++ b/hw/arm/smmuv3-internal.h
@@ -215,7 +215,53 @@ static inline int smmu_enabled(SMMUV3State *s)
 
 #define SMMU_CMDQ_ERR(s) (SMMU_PENDING_GERRORS(s) & SMMU_GERROR_CMDQ)
 
-void smmuv3_irq_trigger(SMMUV3State *s, SMMUIrq irq, uint32_t gerror_val);
 void smmuv3_write_gerrorn(SMMUV3State *s, uint32_t gerrorn);
 
+/***************************
+ * Queue Handling
+ ***************************/
+
+typedef enum {
+    CMD_Q_EMPTY,
+    CMD_Q_FULL,
+    CMD_Q_PARTIALLY_FILLED,
+} SMMUQStatus;
+
+#define Q_ENTRY(q, idx)  ((q)->base + (q)->ent_size * (idx))
+#define Q_WRAP(q, pc)    ((pc) >> (q)->shift)
+#define Q_IDX(q, pc)     ((pc) & ((1 << (q)->shift) - 1))
+
+static inline SMMUQStatus __smmu_queue_status(SMMUV3State *s, SMMUQueue *q)
+{
+    uint32_t prod = Q_IDX(q, q->prod);
+    uint32_t cons = Q_IDX(q, q->cons);
+
+    if ((prod == cons) && (q->wrap.prod != q->wrap.cons)) {
+        return CMD_Q_FULL;
+    } else if ((prod == cons) && (q->wrap.prod == q->wrap.cons)) {
+        return CMD_Q_EMPTY;
+    }
+    return CMD_Q_PARTIALLY_FILLED;
+}
+#define smmu_is_q_full(s, q) (__smmu_queue_status(s, q) == CMD_Q_FULL)
+#define smmu_is_q_empty(s, q) (__smmu_queue_status(s, q) == CMD_Q_EMPTY)
+
+static inline int __smmu_q_enabled(SMMUV3State *s, uint32_t q)
+{
+    return smmu_read32_reg(s, SMMU_REG_CR0) & q;
+}
+#define smmu_cmd_q_enabled(s) __smmu_q_enabled(s, SMMU_CR0_CMDQ_ENABLE)
+#define smmu_evt_q_enabled(s) __smmu_q_enabled(s, SMMU_CR0_EVTQ_ENABLE)
+
+static inline void smmu_write_cmdq_err(SMMUV3State *s, uint32_t err_type)
+{
+    uint32_t regval = smmu_read32_reg(s, SMMU_REG_CMDQ_CONS);
+
+    smmu_write32_reg(s, SMMU_REG_CMDQ_CONS,
+                        regval | err_type << SMMU_CMD_CONS_ERR_SHIFT);
+}
+
+MemTxResult smmuv3_read_cmdq(SMMUV3State *s, Cmd *cmd);
+void smmuv3_write_evtq(SMMUV3State *s, Evt *evt);
+
 #endif
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index 468134f..2f96463 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -36,7 +36,7 @@
  * @irq: irq type
  * @gerror: gerror new value, only relevant if @irq is GERROR
  */
-void smmuv3_irq_trigger(SMMUV3State *s, SMMUIrq irq, uint32_t gerror_val)
+static void smmuv3_irq_trigger(SMMUV3State *s, SMMUIrq irq, uint32_t gerror_val)
 {
     uint32_t pending_gerrors = SMMU_PENDING_GERRORS(s);
     bool pulse = false;
@@ -84,6 +84,79 @@ void smmuv3_write_gerrorn(SMMUV3State *s, uint32_t gerrorn)
     trace_smmuv3_write_gerrorn(gerrorn, sanitized, SMMU_PENDING_GERRORS(s));
 }
 
+static MemTxResult smmu_q_read(SMMUQueue *q, void *data)
+{
+    uint64_t addr = Q_ENTRY(q, Q_IDX(q, q->cons));
+    MemTxResult ret;
+
+    ret = smmu_read_sysmem(addr, data, q->ent_size, false);
+    if (ret != MEMTX_OK) {
+        return ret;
+    }
+
+    q->cons++;
+    if (q->cons == q->entries) {
+        q->cons = 0;
+        q->wrap.cons++;
+    }
+
+    return ret;
+}
+
+static void smmu_q_write(SMMUQueue *q, void *data)
+{
+    uint64_t addr = Q_ENTRY(q, Q_IDX(q, q->prod));
+
+    smmu_write_sysmem(addr, data, q->ent_size, false);
+
+    q->prod++;
+    if (q->prod == q->entries) {
+        q->prod = 0;
+        q->wrap.prod++;
+    }
+}
+
+MemTxResult smmuv3_read_cmdq(SMMUV3State *s, Cmd *cmd)
+{
+    SMMUQueue *q = &s->cmdq;
+    MemTxResult ret = smmu_q_read(q, cmd);
+    uint32_t val = 0;
+
+    if (ret != MEMTX_OK) {
+        return ret;
+    }
+
+    val |= (q->wrap.cons << q->shift) | q->cons;
+    smmu_write32_reg(s, SMMU_REG_CMDQ_CONS, val);
+
+    return ret;
+}
+
+void smmuv3_write_evtq(SMMUV3State *s, Evt *evt)
+{
+    SMMUQueue *q = &s->evtq;
+    bool was_empty = smmu_is_q_empty(s, q);
+    bool was_full = smmu_is_q_full(s, q);
+    uint32_t val;
+
+    if (!smmu_evt_q_enabled(s)) {
+        return;
+    }
+
+    if (was_full) {
+        return;
+    }
+
+    smmu_q_write(q, evt);
+
+    val = (q->wrap.prod << q->shift) | q->prod;
+    smmu_write32_reg(s, SMMU_REG_EVTQ_PROD, val);
+
+    if (was_empty) {
+        smmuv3_irq_trigger(s, SMMU_IRQ_EVTQ, 0);
+    }
+}
+
 static void smmuv3_init_regs(SMMUV3State *s)
 {
     uint32_t data =
-- 
2.5.5


* [Qemu-devel] [PATCH v7 08/20] hw/arm/smmuv3: Implement MMIO write operations
  2017-09-01 17:21 [Qemu-devel] [PATCH v7 00/20] ARM SMMUv3 Emulation Support Eric Auger
                   ` (6 preceding siblings ...)
  2017-09-01 17:21 ` [Qemu-devel] [PATCH v7 07/20] hw/arm/smmuv3: Queue helpers Eric Auger
@ 2017-09-01 17:21 ` Eric Auger
  2017-10-09 17:17   ` Peter Maydell
  2017-09-01 17:21 ` [Qemu-devel] [PATCH v7 09/20] hw/arm/smmuv3: Event queue recording helper Eric Auger
                   ` (16 subsequent siblings)
  24 siblings, 1 reply; 72+ messages in thread
From: Eric Auger @ 2017-09-01 17:21 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, qemu-arm, qemu-devel,
	prem.mallappa, alex.williamson
  Cc: drjones, christoffer.dall, Radha.Chintakuntla, Sunil.Goutham,
	mohun106, tcain, bharat.bhushan, tn, mst, will.deacon,
	jean-philippe.brucker, robin.murphy, peterx, edgar.iglesias,
	wtownsen

Now that we have the relevant helpers for queue and IRQ
management, let's implement the MMIO write operations.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 hw/arm/smmuv3-internal.h | 103 +++++++++++++++++++++++-
 hw/arm/smmuv3.c          | 204 ++++++++++++++++++++++++++++++++++++++++++++++-
 hw/arm/trace-events      |  15 ++++
 3 files changed, 317 insertions(+), 5 deletions(-)

diff --git a/hw/arm/smmuv3-internal.h b/hw/arm/smmuv3-internal.h
index d88f141..a5d60b4 100644
--- a/hw/arm/smmuv3-internal.h
+++ b/hw/arm/smmuv3-internal.h
@@ -215,8 +215,6 @@ static inline int smmu_enabled(SMMUV3State *s)
 
 #define SMMU_CMDQ_ERR(s) (SMMU_PENDING_GERRORS(s) & SMMU_GERROR_CMDQ)
 
-void smmuv3_write_gerrorn(SMMUV3State *s, uint32_t gerrorn);
-
 /***************************
  * Queue Handling
  ***************************/
@@ -261,7 +259,106 @@ static inline void smmu_write_cmdq_err(SMMUV3State *s, uint32_t err_type)
                         regval | err_type << SMMU_CMD_CONS_ERR_SHIFT);
 }
 
-MemTxResult smmuv3_read_cmdq(SMMUV3State *s, Cmd *cmd);
 void smmuv3_write_evtq(SMMUV3State *s, Evt *evt);
 
+/*****************************
+ * Commands
+ *****************************/
+
+enum {
+    SMMU_CMD_PREFETCH_CONFIG = 0x01,
+    SMMU_CMD_PREFETCH_ADDR,
+    SMMU_CMD_CFGI_STE,
+    SMMU_CMD_CFGI_STE_RANGE,
+    SMMU_CMD_CFGI_CD,
+    SMMU_CMD_CFGI_CD_ALL,
+    SMMU_CMD_CFGI_ALL,
+    SMMU_CMD_TLBI_NH_ALL     = 0x10,
+    SMMU_CMD_TLBI_NH_ASID,
+    SMMU_CMD_TLBI_NH_VA,
+    SMMU_CMD_TLBI_NH_VAA,
+    SMMU_CMD_TLBI_EL3_ALL    = 0x18,
+    SMMU_CMD_TLBI_EL3_VA     = 0x1a,
+    SMMU_CMD_TLBI_EL2_ALL    = 0x20,
+    SMMU_CMD_TLBI_EL2_ASID,
+    SMMU_CMD_TLBI_EL2_VA,
+    SMMU_CMD_TLBI_EL2_VAA,  /* 0x23 */
+    SMMU_CMD_TLBI_S12_VMALL  = 0x28,
+    SMMU_CMD_TLBI_S2_IPA     = 0x2a,
+    SMMU_CMD_TLBI_NSNH_ALL   = 0x30,
+    SMMU_CMD_ATC_INV         = 0x40,
+    SMMU_CMD_PRI_RESP,
+    SMMU_CMD_RESUME          = 0x44,
+    SMMU_CMD_STALL_TERM,
+    SMMU_CMD_SYNC,          /* 0x46 */
+};
+
+static const char *cmd_stringify[] = {
+    [SMMU_CMD_PREFETCH_CONFIG] = "SMMU_CMD_PREFETCH_CONFIG",
+    [SMMU_CMD_PREFETCH_ADDR]   = "SMMU_CMD_PREFETCH_ADDR",
+    [SMMU_CMD_CFGI_STE]        = "SMMU_CMD_CFGI_STE",
+    [SMMU_CMD_CFGI_STE_RANGE]  = "SMMU_CMD_CFGI_STE_RANGE",
+    [SMMU_CMD_CFGI_CD]         = "SMMU_CMD_CFGI_CD",
+    [SMMU_CMD_CFGI_CD_ALL]     = "SMMU_CMD_CFGI_CD_ALL",
+    [SMMU_CMD_CFGI_ALL]        = "SMMU_CMD_CFGI_ALL",
+    [SMMU_CMD_TLBI_NH_ALL]     = "SMMU_CMD_TLBI_NH_ALL",
+    [SMMU_CMD_TLBI_NH_ASID]    = "SMMU_CMD_TLBI_NH_ASID",
+    [SMMU_CMD_TLBI_NH_VA]      = "SMMU_CMD_TLBI_NH_VA",
+    [SMMU_CMD_TLBI_NH_VAA]     = "SMMU_CMD_TLBI_NH_VAA",
+    [SMMU_CMD_TLBI_EL3_ALL]    = "SMMU_CMD_TLBI_EL3_ALL",
+    [SMMU_CMD_TLBI_EL3_VA]     = "SMMU_CMD_TLBI_EL3_VA",
+    [SMMU_CMD_TLBI_EL2_ALL]    = "SMMU_CMD_TLBI_EL2_ALL",
+    [SMMU_CMD_TLBI_EL2_ASID]   = "SMMU_CMD_TLBI_EL2_ASID",
+    [SMMU_CMD_TLBI_EL2_VA]     = "SMMU_CMD_TLBI_EL2_VA",
+    [SMMU_CMD_TLBI_EL2_VAA]    = "SMMU_CMD_TLBI_EL2_VAA",
+    [SMMU_CMD_TLBI_S12_VMALL]  = "SMMU_CMD_TLBI_S12_VMALL",
+    [SMMU_CMD_TLBI_S2_IPA]     = "SMMU_CMD_TLBI_S2_IPA",
+    [SMMU_CMD_TLBI_NSNH_ALL]   = "SMMU_CMD_TLBI_NSNH_ALL",
+    [SMMU_CMD_ATC_INV]         = "SMMU_CMD_ATC_INV",
+    [SMMU_CMD_PRI_RESP]        = "SMMU_CMD_PRI_RESP",
+    [SMMU_CMD_RESUME]          = "SMMU_CMD_RESUME",
+    [SMMU_CMD_STALL_TERM]      = "SMMU_CMD_STALL_TERM",
+    [SMMU_CMD_SYNC]            = "SMMU_CMD_SYNC",
+};
+
+/*****************************
+ * CMDQ fields
+ *****************************/
+
+typedef enum {
+    SMMU_CERROR_NONE = 0,
+    SMMU_CERROR_ILL,
+    SMMU_CERROR_ABT,
+    SMMU_CERROR_ATC_INV_SYNC,
+} SMMUCmdError;
+
+enum { /* Command completion notification */
+    CMD_SYNC_SIG_NONE,
+    CMD_SYNC_SIG_IRQ,
+    CMD_SYNC_SIG_SEV,
+};
+
+#define CMD_TYPE(x)  extract32((x)->word[0], 0, 8)
+#define CMD_SEC(x)   extract32((x)->word[0], 9, 1)
+#define CMD_SEV(x)   extract32((x)->word[0], 10, 1)
+#define CMD_AC(x)    extract32((x)->word[0], 12, 1)
+#define CMD_AB(x)    extract32((x)->word[0], 13, 1)
+#define CMD_CS(x)    extract32((x)->word[0], 12, 2)
+#define CMD_SSID(x)  extract32((x)->word[0], 16, 16)
+#define CMD_SID(x)   ((x)->word[1])
+#define CMD_VMID(x)  extract32((x)->word[1], 0, 16)
+#define CMD_ASID(x)  extract32((x)->word[1], 16, 16)
+#define CMD_STAG(x)  extract32((x)->word[2], 0, 16)
+#define CMD_RESP(x)  extract32((x)->word[2], 11, 2)
+#define CMD_GRPID(x) extract32((x)->word[3], 0, 8)
+#define CMD_SIZE(x)  extract32((x)->word[3], 0, 16)
+#define CMD_LEAF(x)  extract32((x)->word[3], 0, 1)
+#define CMD_SPAN(x)  extract32((x)->word[3], 0, 5)
+#define CMD_ADDR(x) ({                                  \
+            uint64_t addr = (uint64_t)(x)->word[3];     \
+            addr <<= 32;                                \
+            addr |= (uint64_t)extract32((x)->word[2], 12, 20) << 12; \
+            addr;                                       \
+        })
+
 #endif
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index 2f96463..f35fadc 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -72,7 +72,7 @@ static void smmuv3_irq_trigger(SMMUV3State *s, SMMUIrq irq, uint32_t gerror_val)
     }
 }
 
-void smmuv3_write_gerrorn(SMMUV3State *s, uint32_t gerrorn)
+static void smmuv3_write_gerrorn(SMMUV3State *s, uint32_t gerrorn)
 {
     uint32_t pending_gerrors = SMMU_PENDING_GERRORS(s);
     uint32_t sanitized;
@@ -116,7 +116,7 @@ static void smmu_q_write(SMMUQueue *q, void *data)
     }
 }
 
-MemTxResult smmuv3_read_cmdq(SMMUV3State *s, Cmd *cmd)
+static MemTxResult smmuv3_read_cmdq(SMMUV3State *s, Cmd *cmd)
 {
     SMMUQueue *q = &s->cmdq;
     MemTxResult ret = smmu_q_read(q, cmd);
@@ -224,6 +224,147 @@ static inline void smmu_update_base_reg(SMMUV3State *s, uint64_t *base,
     *base = val & ~(SMMU_BASE_RA | 0x3fULL);
 }
 
+static int smmuv3_cmdq_consume(SMMUV3State *s)
+{
+    SMMUCmdError cmd_error = SMMU_CERROR_NONE;
+
+    trace_smmuv3_cmdq_consume(SMMU_CMDQ_ERR(s), smmu_cmd_q_enabled(s),
+                              s->cmdq.prod, s->cmdq.cons,
+                              s->cmdq.wrap.prod, s->cmdq.wrap.cons);
+
+    if (!smmu_cmd_q_enabled(s)) {
+        return 0;
+    }
+
+    while (!SMMU_CMDQ_ERR(s) && !smmu_is_q_empty(s, &s->cmdq)) {
+        uint32_t type;
+        Cmd cmd;
+
+        if (smmuv3_read_cmdq(s, &cmd) != MEMTX_OK) {
+            cmd_error = SMMU_CERROR_ABT;
+            break;
+        }
+
+        type = CMD_TYPE(&cmd);
+
+        trace_smmuv3_cmdq_opcode(cmd_stringify[type]);
+
+        switch (CMD_TYPE(&cmd)) {
+        case SMMU_CMD_SYNC:
+            if (CMD_CS(&cmd) & CMD_SYNC_SIG_IRQ) {
+                smmuv3_irq_trigger(s, SMMU_IRQ_CMD_SYNC, 0);
+            }
+            break;
+        case SMMU_CMD_PREFETCH_CONFIG:
+        case SMMU_CMD_PREFETCH_ADDR:
+            break;
+        case SMMU_CMD_CFGI_STE:
+        {
+            uint32_t streamid = cmd.word[1];
+
+            trace_smmuv3_cmdq_cfgi_ste(streamid);
+            break;
+        }
+        case SMMU_CMD_CFGI_STE_RANGE: /* same as SMMU_CMD_CFGI_ALL */
+        {
+            uint32_t start = cmd.word[1], range, end;
+
+            range = extract32(cmd.word[2], 0, 5);
+            end = start + (1 << (range + 1)) - 1;
+            trace_smmuv3_cmdq_cfgi_ste_range(start, end);
+            break;
+        }
+        case SMMU_CMD_CFGI_CD:
+        case SMMU_CMD_CFGI_CD_ALL:
+            trace_smmuv3_unhandled_cmd(type);
+            break;
+        case SMMU_CMD_TLBI_NH_ALL:
+        case SMMU_CMD_TLBI_NH_ASID:
+            trace_smmuv3_unhandled_cmd(type);
+            break;
+        case SMMU_CMD_TLBI_NH_VA:
+        {
+            int asid = extract32(cmd.word[1], 16, 16);
+            int vmid = extract32(cmd.word[1], 0, 16);
+            uint64_t low = extract32(cmd.word[2], 12, 20);
+            uint64_t high = cmd.word[3];
+            uint64_t addr = high << 32 | (low << 12);
+
+            trace_smmuv3_cmdq_tlbi_nh_va(asid, vmid, addr);
+            break;
+        }
+        case SMMU_CMD_TLBI_NH_VAA:
+        case SMMU_CMD_TLBI_EL3_ALL:
+        case SMMU_CMD_TLBI_EL3_VA:
+        case SMMU_CMD_TLBI_EL2_ALL:
+        case SMMU_CMD_TLBI_EL2_ASID:
+        case SMMU_CMD_TLBI_EL2_VA:
+        case SMMU_CMD_TLBI_EL2_VAA:
+        case SMMU_CMD_TLBI_S12_VMALL:
+        case SMMU_CMD_TLBI_S2_IPA:
+        case SMMU_CMD_TLBI_NSNH_ALL:
+            trace_smmuv3_unhandled_cmd(type);
+            break;
+        case SMMU_CMD_ATC_INV:
+        case SMMU_CMD_PRI_RESP:
+        case SMMU_CMD_RESUME:
+        case SMMU_CMD_STALL_TERM:
+            trace_smmuv3_unhandled_cmd(type);
+            break;
+        default:
+            cmd_error = SMMU_CERROR_ILL;
+            error_report("Illegal command type: %d", CMD_TYPE(&cmd));
+            break;
+        }
+    }
+
+    if (cmd_error) {
+        error_report("GERROR_CMDQ: CONS.ERR=%d", cmd_error);
+        smmu_write_cmdq_err(s, cmd_error);
+        smmuv3_irq_trigger(s, SMMU_IRQ_GERROR, SMMU_GERROR_CMDQ);
+    }
+
+    trace_smmuv3_cmdq_consume_out(s->cmdq.wrap.prod, s->cmdq.prod,
+                                  s->cmdq.wrap.cons, s->cmdq.cons);
+
+    return 0;
+}
+
+static void smmu_update_qreg(SMMUV3State *s, SMMUQueue *q, hwaddr reg,
+                             uint32_t off, uint64_t val, unsigned size)
+{
+    if (size == 8 && off == 0) {
+        smmu_write64_reg(s, reg, val);
+    } else {
+        smmu_write32_reg(s, reg, val);
+    }
+
+    switch (off) {
+    case 0:                             /* BASE register */
+        val = smmu_read64_reg(s, reg);
+        q->shift = val & 0x1f;
+        q->entries = 1 << (q->shift);
+        smmu_update_base_reg(s, &q->base, val);
+        break;
+
+    case 8:                             /* PROD */
+        q->prod = Q_IDX(q, val);
+        q->wrap.prod = val >> q->shift;
+        break;
+
+    case 12:                             /* CONS */
+        q->cons = Q_IDX(q, val);
+        q->wrap.cons = val >> q->shift;
+        trace_smmuv3_update_qreg(q->cons, val);
+        break;
+
+    }
+
+    if (reg == SMMU_REG_CMDQ_PROD) {
+        smmuv3_cmdq_consume(s);
+    }
+}
+
 static void smmu_write_mmio_fixup(SMMUV3State *s, hwaddr *addr)
 {
     switch (*addr) {
@@ -236,6 +377,65 @@ static void smmu_write_mmio_fixup(SMMUV3State *s, hwaddr *addr)
 static void smmu_write_mmio(void *opaque, hwaddr addr,
                             uint64_t val, unsigned size)
 {
+    SMMUState *sys = opaque;
+    SMMUV3State *s = SMMU_V3_DEV(sys);
+
+    smmu_write_mmio_fixup(s, &addr);
+
+    trace_smmuv3_write_mmio(addr, val, size);
+
+    switch (addr) {
+    case 0xFDC ... 0xFFC:
+    case SMMU_REG_IDR0 ... SMMU_REG_IDR5:
+        trace_smmuv3_write_mmio_idr(addr, val);
+        return;
+    case SMMU_REG_GERRORN:
+        smmuv3_write_gerrorn(s, val);
+        /*
+         * By acknowledging the CMDQ_ERR, SW signals that commands can
+         * be processed again
+         */
+        smmuv3_cmdq_consume(s);
+        return;
+    case SMMU_REG_CR0:
+        smmu_write32_reg(s, SMMU_REG_CR0, val);
+        /* immediately reflect the changes in CR0_ACK */
+        smmu_write32_reg(s, SMMU_REG_CR0_ACK, val);
+        /* in case the command queue has been enabled */
+        smmuv3_cmdq_consume(s);
+        return;
+    case SMMU_REG_IRQ_CTRL:
+        smmu_write32_reg(s, SMMU_REG_IRQ_CTRL_ACK, val);
+        return;
+    case SMMU_REG_STRTAB_BASE:
+        smmu_update_base_reg(s, &s->strtab_base, val);
+        return;
+    case SMMU_REG_STRTAB_BASE_CFG:
+        if (((val >> 16) & 0x3) == 0x1) {
+            s->sid_split = (val >> 6) & 0x1f;
+            s->features |= SMMU_FEATURE_2LVL_STE;
+        }
+        return;
+    case SMMU_REG_CMDQ_BASE ... SMMU_REG_CMDQ_CONS:
+        smmu_update_qreg(s, &s->cmdq, addr, addr - SMMU_REG_CMDQ_BASE,
+                         val, size);
+        return;
+
+    case SMMU_REG_EVTQ_BASE ... SMMU_REG_EVTQ_CONS:
+        smmu_update_qreg(s, &s->evtq, addr, addr - SMMU_REG_EVTQ_BASE,
+                         val, size);
+        return;
+
+    case SMMU_REG_PRIQ_BASE ... SMMU_REG_PRIQ_CONS:
+        error_report("%s PRI queue is not supported", __func__);
+        abort();
+    }
+
+    if (size == 8) {
+        smmu_write64_reg(s, addr, val);
+    } else {
+        smmu_write32_reg(s, addr, (uint32_t)val);
+    }
 }
 
 static uint64_t smmu_read_mmio(void *opaque, hwaddr addr, unsigned size)
diff --git a/hw/arm/trace-events b/hw/arm/trace-events
index c1ce8eb..40f2057 100644
--- a/hw/arm/trace-events
+++ b/hw/arm/trace-events
@@ -19,3 +19,18 @@ smmu_set_translated_address(hwaddr iova, hwaddr pa) "iova = 0x%"PRIx64" -> pa =
 smmuv3_read_mmio(hwaddr addr, uint64_t val, unsigned size) "addr: 0x%"PRIx64" val:0x%"PRIx64" size: 0x%x"
 smmuv3_irq_trigger(int irq, uint32_t gerror, uint32_t pending) "irq=%d gerror=0x%x pending gerrors=0x%x"
 smmuv3_write_gerrorn(uint32_t gerrorn, uint32_t sanitized, uint32_t pending) "gerrorn=0x%x sanitized=0x%x pending=0x%x"
+smmuv3_unhandled_cmd(uint32_t type) "Unhandled command type=%d"
+smmuv3_cmdq_consume(int error, bool enabled, uint32_t prod, uint32_t cons, uint8_t wrap_prod, uint8_t wrap_cons) "error=%d, enabled=%d prod=%d cons=%d wrap.prod=%d wrap.cons=%d"
+smmuv3_cmdq_consume_details(hwaddr base, uint32_t cons, uint32_t prod, uint32_t word, uint8_t wrap_cons) "CMDQ base: 0x%"PRIx64" cons:%d prod:%d val:0x%x wrap:%d"
+smmuv3_cmdq_opcode(const char *opcode) "<--- %s"
+smmuv3_cmdq_cfgi_ste(int streamid) "     |_ streamid =%d"
+smmuv3_cmdq_cfgi_ste_range(int start, int end) "     |_ start=0x%x - end=0x%x"
+smmuv3_cmdq_tlbi_nh_va(int asid, int vmid, uint64_t addr) "     |_ asid =%d vmid =%d addr=0x%"PRIx64
+smmuv3_cmdq_consume_out(uint8_t prod_wrap, uint32_t prod, uint8_t cons_wrap, uint32_t cons) "prod_wrap:%d, prod:0x%x cons_wrap:%d cons:0x%x"
+smmuv3_update(bool is_empty, uint32_t prod, uint32_t cons, uint8_t prod_wrap, uint8_t cons_wrap) "q empty:%d prod:%d cons:%d p.wrap:%d c.wrap:%d"
+smmuv3_update_check_cmd(int error) "cmdq not enabled or error :0x%x"
+smmuv3_update_qreg(uint32_t cons, uint64_t val) "cons written : %d val:0x%"PRIx64
+smmuv3_write_mmio(hwaddr addr, uint64_t val, unsigned size) "addr: 0x%"PRIx64" val:0x%"PRIx64" size: 0x%x"
+smmuv3_write_mmio_idr(hwaddr addr, uint64_t val) "write to RO/Unimpl reg 0x%"PRIx64" val64:0x%"PRIx64
+smmuv3_write_mmio_evtq_cons_bef_clear(uint32_t prod, uint32_t cons, uint8_t prod_wrap, uint8_t cons_wrap) "Before clearing interrupt prod:0x%x cons:0x%x prod.w:%d cons.w:%d"
+smmuv3_write_mmio_evtq_cons_after_clear(uint32_t prod, uint32_t cons, uint8_t prod_wrap, uint8_t cons_wrap) "after clearing interrupt prod:0x%x cons:0x%x prod.w:%d cons.w:%d"
-- 
2.5.5


* [Qemu-devel] [PATCH v7 09/20] hw/arm/smmuv3: Event queue recording helper
  2017-09-01 17:21 [Qemu-devel] [PATCH v7 00/20] ARM SMMUv3 Emulation Support Eric Auger
                   ` (7 preceding siblings ...)
  2017-09-01 17:21 ` [Qemu-devel] [PATCH v7 08/20] hw/arm/smmuv3: Implement MMIO write operations Eric Auger
@ 2017-09-01 17:21 ` Eric Auger
  2017-10-09 17:34   ` Peter Maydell
  2017-09-01 17:21 ` [Qemu-devel] [PATCH v7 10/20] hw/arm/smmuv3: Implement translate callback Eric Auger
                   ` (15 subsequent siblings)
  24 siblings, 1 reply; 72+ messages in thread
From: Eric Auger @ 2017-09-01 17:21 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, qemu-arm, qemu-devel,
	prem.mallappa, alex.williamson
  Cc: drjones, christoffer.dall, Radha.Chintakuntla, Sunil.Goutham,
	mohun106, tcain, bharat.bhushan, tn, mst, will.deacon,
	jean-philippe.brucker, robin.murphy, peterx, edgar.iglesias,
	wtownsen

Let's introduce a helper function that records an
event in the event queue.
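
A small standalone sketch of what filling an event record could look
like, assuming the field positions used by the EVT_SET_* macros in this
patch (type in bits [7:0] of word 0, StreamID in word 1, RnW in bit 3
of word 3, input address split across words 4 and 5). The record size
of eight 32-bit words, the ToyEvt type and the helper names are
assumptions made for illustration; QEMU's deposit32() is replaced by a
local stand-in.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical event record; the size is assumed to be 8 x 32 bits. */
typedef struct {
    uint32_t word[8];
} ToyEvt;

/*
 * Local stand-in for a deposit helper: return @value with the
 * @length-bit field starting at bit @start replaced by @field.
 */
static uint32_t toy_deposit32(uint32_t value, int start, int length,
                              uint32_t field)
{
    uint32_t mask;

    assert(start >= 0 && length > 0 && length <= 32 - start);
    mask = (~0u >> (32 - length)) << start;
    return (value & ~mask) | ((field << start) & mask);
}

/* Fill the fields that the patch sets for a translation-related event. */
static void toy_evt_fill(ToyEvt *evt, uint8_t type, uint32_t sid,
                         uint64_t iova, int rnw)
{
    evt->word[0] = toy_deposit32(evt->word[0], 0, 8, type);
    evt->word[1] = sid;
    evt->word[3] = toy_deposit32(evt->word[3], 3, 1, rnw ? 1 : 0);
    evt->word[4] = (uint32_t)(iova & 0xffffffffu);
    evt->word[5] = (uint32_t)(iova >> 32);
}
```

Note that a deposit-style helper returns the updated word, so the
result has to be stored back, and the record should start out zeroed so
that unfilled fields do not carry stale data.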

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---

At the moment, we do not fill all the fields for some events.
Typically, filling the FetchAddr field after an abort during the
page table walk would require the walk to return more information
on error.

However, with the enabled use cases, I have not seen any event
recorded yet.
---
 hw/arm/smmuv3-internal.h | 45 ++++++++++++++++++++++++--
 hw/arm/smmuv3.c          | 84 +++++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 126 insertions(+), 3 deletions(-)

diff --git a/hw/arm/smmuv3-internal.h b/hw/arm/smmuv3-internal.h
index a5d60b4..e3e9828 100644
--- a/hw/arm/smmuv3-internal.h
+++ b/hw/arm/smmuv3-internal.h
@@ -259,8 +259,6 @@ static inline void smmu_write_cmdq_err(SMMUV3State *s, uint32_t err_type)
                         regval | err_type << SMMU_CMD_CONS_ERR_SHIFT);
 }
 
-void smmuv3_write_evtq(SMMUV3State *s, Evt *evt);
-
 /*****************************
  * Commands
  *****************************/
@@ -361,4 +359,47 @@ enum { /* Command completion notification */
             addr;                                       \
         })
 
+/*****************************
+ * EVTQ fields
+ *****************************/
+
+#define EVT_Q_OVERFLOW        (1 << 31)
+
+#define EVT_SET_TYPE(x, t) ((x)->word[0] = deposit32((x)->word[0], 0, 8, t))
+#define EVT_SET_SID(x, s)  ((x)->word[1] = (s))
+#define EVT_SET_INPUT_ADDR(x, addr) ({                      \
+            (x)->word[5] = (uint32_t)((addr) >> 32);        \
+            (x)->word[4] = (uint32_t)((addr) & 0xffffffff); \
+        })
+#define EVT_SET_RNW(x, rnw) ((x)->word[3] = deposit32((x)->word[3], 3, 1, rnw))
+
+/*****************************
+ * Events
+ *****************************/
+
+typedef enum evt_err {
+    SMMU_EVT_OK,
+    SMMU_EVT_F_UUT,
+    SMMU_EVT_C_BAD_SID,
+    SMMU_EVT_F_STE_FETCH,
+    SMMU_EVT_C_BAD_STE,
+    SMMU_EVT_F_BAD_ATS_REQ,
+    SMMU_EVT_F_STREAM_DISABLED,
+    SMMU_EVT_F_TRANS_FORBIDDEN,
+    SMMU_EVT_C_BAD_SSID,
+    SMMU_EVT_F_CD_FETCH,
+    SMMU_EVT_C_BAD_CD,
+    SMMU_EVT_F_WALK_EXT_ABRT,
+    SMMU_EVT_F_TRANS        = 0x10,
+    SMMU_EVT_F_ADDR_SZ,
+    SMMU_EVT_F_ACCESS,
+    SMMU_EVT_F_PERM,
+    SMMU_EVT_F_TLB_CONFLICT = 0x20,
+    SMMU_EVT_F_CFG_CONFLICT = 0x21,
+    SMMU_EVT_E_PAGE_REQ     = 0x24,
+} SMMUEvtErr;
+
+void smmuv3_record_event(SMMUV3State *s, hwaddr iova, uint32_t sid,
+                         IOMMUAccessFlags perm, SMMUEvtErr type);
+
 #endif
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index f35fadc..7470576 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -132,7 +132,7 @@ static MemTxResult smmuv3_read_cmdq(SMMUV3State *s, Cmd *cmd)
     return ret;
 }
 
-void smmuv3_write_evtq(SMMUV3State *s, Evt *evt)
+static void smmuv3_write_evtq(SMMUV3State *s, Evt *evt)
 {
     SMMUQueue *q = &s->evtq;
     bool was_empty = smmu_is_q_empty(s, q);
@@ -157,6 +157,88 @@ void smmuv3_write_evtq(SMMUV3State *s, Evt *evt)
     }
 }
 
+/*
+ * smmuv3_record_event - Record an event
+ */
+void smmuv3_record_event(SMMUV3State *s, hwaddr iova,
+                         uint32_t sid, IOMMUAccessFlags perm,
+                         SMMUEvtErr type)
+{
+    Evt evt = {};
+    bool rnw = perm & IOMMU_RO;
+
+    if (!smmu_evt_q_enabled(s)) {
+        return;
+    }
+
+    EVT_SET_TYPE(&evt, type);
+    EVT_SET_SID(&evt, sid);
+    /* SSV=0 (substream invalid) and substreamID= 0 */
+
+    switch (type) {
+    case SMMU_EVT_OK:
+        return;
+    case SMMU_EVT_F_UUT:
+        EVT_SET_INPUT_ADDR(&evt, iova);
+        EVT_SET_RNW(&evt, rnw);
+        /* PnU and Ind not filled */
+        break;
+    case SMMU_EVT_C_BAD_SID:
+        break;
+    case SMMU_EVT_F_STE_FETCH:
+        /* Implementation defined and FetchAddr not filled yet */
+        break;
+    case SMMU_EVT_C_BAD_STE:
+        break;
+    case SMMU_EVT_F_BAD_ATS_REQ:
+        /* ATS not yet implemented */
+        break;
+    case SMMU_EVT_F_STREAM_DISABLED:
+        break;
+    case SMMU_EVT_F_TRANS_FORBIDDEN:
+        EVT_SET_INPUT_ADDR(&evt, iova);
+        EVT_SET_RNW(&evt, rnw);
+        break;
+    case SMMU_EVT_C_BAD_SSID:
+        break;
+    case SMMU_EVT_F_CD_FETCH:
+        /* Implementation defined and FetchAddr not filled yet */
+        break;
+    case SMMU_EVT_C_BAD_CD:
+        break;
+    case SMMU_EVT_F_WALK_EXT_ABRT:
+        EVT_SET_INPUT_ADDR(&evt, iova);
+        EVT_SET_RNW(&evt, rnw);
+        /* Reason, Class, S2, Ind, PnU, FetchAddr not filled yet */
+        break;
+    case SMMU_EVT_F_TRANS:
+    case SMMU_EVT_F_ADDR_SZ:
+    case SMMU_EVT_F_ACCESS:
+        EVT_SET_INPUT_ADDR(&evt, iova);
+        EVT_SET_RNW(&evt, rnw);
+        /* STAG, Class, S2, InD, PnU, IPA not filled yet */
+        break;
+    case SMMU_EVT_F_PERM:
+        EVT_SET_INPUT_ADDR(&evt, iova);
+        EVT_SET_RNW(&evt, rnw);
+        /* STAG, TTRnW, Class, S2, InD, PnU, IPA not filled yet */
+        break;
+    case SMMU_EVT_F_TLB_CONFLICT:
+        EVT_SET_INPUT_ADDR(&evt, iova);
+        EVT_SET_RNW(&evt, rnw);
+        /* Reason, S2, InD, PnU, IPA not filled yet */
+        break;
+    case SMMU_EVT_F_CFG_CONFLICT:
+        /* Implementation defined reason not filled yet */
+        break;
+    case SMMU_EVT_E_PAGE_REQ:
+        /* PRI not supported */
+        break;
+    }
+
+    smmuv3_write_evtq(s, &evt);
+}
+
 static void smmuv3_init_regs(SMMUV3State *s)
 {
     uint32_t data =
-- 
2.5.5


* [Qemu-devel] [PATCH v7 10/20] hw/arm/smmuv3: Implement translate callback
  2017-09-01 17:21 [Qemu-devel] [PATCH v7 00/20] ARM SMMUv3 Emulation Support Eric Auger
                   ` (8 preceding siblings ...)
  2017-09-01 17:21 ` [Qemu-devel] [PATCH v7 09/20] hw/arm/smmuv3: Event queue recording helper Eric Auger
@ 2017-09-01 17:21 ` Eric Auger
  2017-10-09 17:45   ` Peter Maydell
  2017-09-01 17:21 ` [Qemu-devel] [PATCH v7 11/20] target/arm/kvm: Translate the MSI doorbell in kvm_arch_fixup_msi_route Eric Auger
                   ` (14 subsequent siblings)
  24 siblings, 1 reply; 72+ messages in thread
From: Eric Auger @ 2017-09-01 17:21 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, qemu-arm, qemu-devel,
	prem.mallappa, alex.williamson
  Cc: drjones, christoffer.dall, Radha.Chintakuntla, Sunil.Goutham,
	mohun106, tcain, bharat.bhushan, tn, mst, will.deacon,
	jean-philippe.brucker, robin.murphy, peterx, edgar.iglesias,
	wtownsen

This patch implements the IOMMU Memory Region translate()
callback. Most of the code relates to the translation
configuration decoding and check (STE, CD).

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
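For review purposes, the top of the decode path can be summarized with a
standalone sketch. This is not the QEMU code: SteSketch, extract32_sketch()
and ste_decode_action() are hypothetical names used only for illustration.
Only the field layout (V in word 0 bit 0, Config in word 0 bits [3:1]) and
the Config encodings are taken from this patch.

```c
#include <stdint.h>

/* Hypothetical stand-in for the 16-word STE used by the patch. */
typedef struct {
    uint32_t word[16];
} SteSketch;

/* Config encodings, as listed in smmuv3-internal.h in this patch. */
enum {
    CFG_BYPASS = 4,
    CFG_S1     = 5,
    CFG_S2     = 6,
    CFG_NESTED = 7,
};

/* Outcome of the decode, mirroring the branches of smmuv3_decode_config(). */
typedef enum {
    ACT_BAD_STE,      /* V bit clear: reported as C_BAD_STE */
    ACT_BYPASS,
    ACT_STAGE1,
    ACT_STAGE2,
    ACT_UNSUPPORTED,  /* reserved, abort or nested: F_UUT in this series */
} SteAction;

/* Extract 'length' bits starting at bit 'start' (length in 1..31 here). */
static inline uint32_t extract32_sketch(uint32_t value, int start, int length)
{
    return (value >> start) & ((1u << length) - 1);
}

static SteAction ste_decode_action(const SteSketch *ste)
{
    if (!extract32_sketch(ste->word[0], 0, 1)) {
        return ACT_BAD_STE;
    }
    switch (extract32_sketch(ste->word[0], 1, 3)) {
    case CFG_BYPASS:
        return ACT_BYPASS;
    case CFG_S1:
        return ACT_STAGE1;
    case CFG_S2:
        return ACT_STAGE2;
    default:
        return ACT_UNSUPPORTED;
    }
}
```

In the patch, the stage 1 branch then fetches and checks the CD, while the
stage 2 branch takes its parameters from the STE itself.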
 hw/arm/smmuv3-internal.h | 182 +++++++++++++++++++++++-
 hw/arm/smmuv3.c          | 351 ++++++++++++++++++++++++++++++++++++++++++++++-
 hw/arm/trace-events      |   9 ++
 3 files changed, 537 insertions(+), 5 deletions(-)

diff --git a/hw/arm/smmuv3-internal.h b/hw/arm/smmuv3-internal.h
index e3e9828..f9f95ae 100644
--- a/hw/arm/smmuv3-internal.h
+++ b/hw/arm/smmuv3-internal.h
@@ -399,7 +399,185 @@ typedef enum evt_err {
     SMMU_EVT_E_PAGE_REQ     = 0x24,
 } SMMUEvtErr;
 
-void smmuv3_record_event(SMMUV3State *s, hwaddr iova,
-                         uint32_t sid, bool is_write, SMMUEvtErr type);
+/*****************************
+ * Configuration Data
+ *****************************/
+
+typedef struct __smmu_data2  STEDesc; /* STE Level 1 Descriptor */
+typedef struct __smmu_data16 Ste;     /* Stream Table Entry(STE) */
+typedef struct __smmu_data2  CDDesc;  /* CD Level 1 Descriptor */
+typedef struct __smmu_data16 Cd;      /* Context Descriptor(CD) */
+
+/*****************************
+ * STE fields
+ *****************************/
+
+#define STE_VALID(x)   extract32((x)->word[0], 0, 1) /* 0 */
+#define STE_CONFIG(x)  extract32((x)->word[0], 1, 3)
+enum {
+    STE_CONFIG_NONE      = 0,
+    STE_CONFIG_BYPASS    = 4,       /* S1 Bypass    , S2 Bypass */
+    STE_CONFIG_S1        = 5,       /* S1 Translate , S2 Bypass */
+    STE_CONFIG_S2        = 6,       /* S1 Bypass    , S2 Translate */
+    STE_CONFIG_NESTED    = 7,       /* S1 Translate , S2 Translate */
+};
+#define STE_S1FMT(x)   extract32((x)->word[0], 4, 2)
+#define STE_S1CDMAX(x) extract32((x)->word[1], 27, 5)
+#define STE_EATS(x)    extract32((x)->word[2], 28, 2)
+#define STE_STRW(x)    extract32((x)->word[2], 30, 2)
+#define STE_S2VMID(x)  extract32((x)->word[4], 0, 16)
+#define STE_S2T0SZ(x)  extract32((x)->word[5], 0, 6)
+#define STE_S2SL0(x)   extract32((x)->word[5], 6, 2)
+#define STE_S2TG(x)    extract32((x)->word[5], 14, 2)
+#define STE_S2PS(x)    extract32((x)->word[5], 16, 3)
+#define STE_S2AA64(x)  extract32((x)->word[5], 19, 1)
+#define STE_S2HD(x)    extract32((x)->word[5], 24, 1)
+#define STE_S2HA(x)    extract32((x)->word[5], 25, 1)
+#define STE_S2S(x)     extract32((x)->word[5], 26, 1)
+#define STE_CTXPTR(x)                                           \
+    ({                                                          \
+        uint64_t addr;                                          \
+        addr = (uint64_t)extract32((x)->word[1], 0, 16) << 32;  \
+        addr |= (uint64_t)((x)->word[0] & 0xffffffc0);          \
+        addr;                                                   \
+    })
+
+#define STE_S2TTB(x)                                            \
+    ({                                                          \
+        uint64_t addr;                                          \
+        addr = (uint64_t)extract32((x)->word[7], 0, 16) << 32;  \
+        addr |= (uint64_t)((x)->word[6] & 0xfffffff0);          \
+        addr;                                                   \
+    })
+
+static inline int is_ste_bypass(Ste *ste)
+{
+    return STE_CONFIG(ste) == STE_CONFIG_BYPASS;
+}
+
+static inline bool is_ste_stage1(Ste *ste)
+{
+    return STE_CONFIG(ste) == STE_CONFIG_S1;
+}
+
+static inline bool is_ste_stage2(Ste *ste)
+{
+    return STE_CONFIG(ste) == STE_CONFIG_S2;
+}
+
+/**
+ * is_s2granule_valid - Check the stage 2 translation granule size
+ * advertised in the STE matches any IDR5 supported value
+ */
+static inline bool is_s2granule_valid(Ste *ste)
+{
+    int idr5_format = 0;
+
+    switch (STE_S2TG(ste)) {
+    case 0: /* 4kB */
+        idr5_format = 0x1;
+        break;
+    case 1: /* 64 kB */
+        idr5_format = 0x4;
+        break;
+    case 2: /* 16 kB */
+        idr5_format = 0x2;
+        break;
+    case 3: /* reserved */
+        break;
+    }
+    idr5_format &= SMMU_IDR5_GRAN;
+    return idr5_format;
+}
+
+static inline int oas2bits(int oas_field)
+{
+    switch (oas_field) {
+    case 0b011:
+        return 42;
+    case 0b100:
+        return 44;
+    default:
+        return 32 + (1 << oas_field);
+    }
+}
+
+static inline int pa_range(Ste *ste)
+{
+    int oas_field = MIN(STE_S2PS(ste), SMMU_IDR5_OAS);
+
+    if (!STE_S2AA64(ste)) {
+        return 40;
+    }
+
+    return oas2bits(oas_field);
+}
+
+#define MAX_PA(ste) ((1ULL << pa_range(ste)) - 1)
+
+/*****************************
+ * CD fields
+ *****************************/
+#define CD_VALID(x)   extract32((x)->word[0], 30, 1)
+#define CD_ASID(x)    extract32((x)->word[1], 16, 16)
+#define CD_TTB(x, sel)                                      \
+    ({                                                      \
+        uint64_t hi, lo;                                    \
+        hi = extract32((x)->word[(sel) * 2 + 3], 0, 16);    \
+        hi <<= 32;                                          \
+        lo = (x)->word[(sel) * 2 + 2] & ~0xf;               \
+        hi | lo;                                            \
+    })
+
+#define CD_TSZ(x, sel)   extract32((x)->word[0], (16 * (sel)) + 0, 6)
+#define CD_TG(x, sel)    extract32((x)->word[0], (16 * (sel)) + 6, 2)
+#define CD_EPD(x, sel)   extract32((x)->word[0], (16 * (sel)) + 14, 1)
+
+#define CD_T0SZ(x)    CD_TSZ((x), 0)
+#define CD_T1SZ(x)    CD_TSZ((x), 1)
+#define CD_TG0(x)     CD_TG((x), 0)
+#define CD_TG1(x)     CD_TG((x), 1)
+#define CD_EPD0(x)    CD_EPD((x), 0)
+#define CD_EPD1(x)    CD_EPD((x), 1)
+#define CD_IPS(x)     extract32((x)->word[1], 0, 3)
+#define CD_AARCH64(x) extract32((x)->word[1], 9, 1)
+#define CD_TTB0(x)    CD_TTB((x), 0)
+#define CD_TTB1(x)    CD_TTB((x), 1)
+
+#define CDM_VALID(x)    ((x)->word[0] & 0x1)
+
+static inline int is_cd_valid(SMMUV3State *s, Ste *ste, Cd *cd)
+{
+    return CD_VALID(cd);
+}
+
+/**
+ * tg2granule - Decodes the CD translation granule size field according
+ * to the TT in use
+ * @bits: TG0/1 field
+ * @tg1: if set, @bits belong to TG1, otherwise belong to TG0
+ */
+static inline int tg2granule(int bits, bool tg1)
+{
+    switch (bits) {
+    case 1:
+        return tg1 ? 14 : 16;
+    case 2:
+        return tg1 ? 12 : 14;
+    case 3:
+        return tg1 ? 16 : 12;
+    default:
+        return 12;
+    }
+}
+
+#define L1STD_L2PTR(stm) ({                                 \
+            uint64_t hi, lo;                            \
+            hi = (stm)->word[1];                        \
+            lo = (stm)->word[0] & ~(uint64_t)0x1f;      \
+            hi << 32 | lo;                              \
+        })
+
+#define L1STD_SPAN(stm) (extract32((stm)->word[0], 0, 5))
 
 #endif
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index 7470576..20fbce6 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -160,9 +160,9 @@ static void smmuv3_write_evtq(SMMUV3State *s, Evt *evt)
 /*
  * smmuv3_record_event - Record an event
  */
-void smmuv3_record_event(SMMUV3State *s, hwaddr iova,
-                         uint32_t sid, IOMMUAccessFlags perm,
-                         SMMUEvtErr type)
+static void smmuv3_record_event(SMMUV3State *s, hwaddr iova,
+                                uint32_t sid, IOMMUAccessFlags perm,
+                                SMMUEvtErr type)
 {
     Evt evt;
     bool rnw = perm & IOMMU_RO;
@@ -306,6 +306,348 @@ static inline void smmu_update_base_reg(SMMUV3State *s, uint64_t *base,
     *base = val & ~(SMMU_BASE_RA | 0x3fULL);
 }
 
+/*
+ * All SMMU data structures (L1STE/STE/L1CD/CD and the queue entries in
+ * CMDQ/EVTQ/PRIQ) are little endian and aligned to 8 bytes
+ */
+static inline int smmu_get_ste(SMMUV3State *s, hwaddr addr, Ste *buf)
+{
+    trace_smmuv3_get_ste(addr);
+    return dma_memory_read(&address_space_memory, addr, buf, sizeof(*buf));
+}
+
+/*
+ * For now we only support a single CD; otherwise 'ssid' would be used
+ * to select the entry
+ */
+static inline int smmu_get_cd(SMMUV3State *s, Ste *ste, uint32_t ssid, Cd *buf)
+{
+    hwaddr addr = STE_CTXPTR(ste);
+
+    if (STE_S1CDMAX(ste) != 0) {
+        error_report("Multilevel Ctx Descriptor not supported yet");
+    }
+
+    trace_smmuv3_get_cd(addr);
+    return dma_memory_read(&address_space_memory, addr, buf, sizeof(*buf));
+}
+
+/**
+ * is_ste_consistent - Check validity of STE
+ * according to 6.2.1 Validity of STE
+ * TODO: check the relevance of each check and compliance
+ * with this spec chapter
+ */
+static bool is_ste_consistent(SMMUV3State *s, Ste *ste)
+{
+    uint32_t _config = STE_CONFIG(ste);
+    uint32_t ste_vmid, ste_eats, ste_s2s, ste_s1fmt, ste_s2aa64, ste_s1cdmax;
+    uint32_t ste_strw;
+    bool strw_unused, addr_out_of_range, granule_supported;
+    bool config[] = {_config & 0x1, _config & 0x2, _config & 0x4};
+
+    ste_vmid = STE_S2VMID(ste);
+    ste_eats = STE_EATS(ste); /* Enable PCIe ATS trans */
+    ste_s2s = STE_S2S(ste);
+    ste_s1fmt = STE_S1FMT(ste);
+    ste_s2aa64 = STE_S2AA64(ste);
+    ste_s1cdmax = STE_S1CDMAX(ste); /*CD bit # S1ContextPtr */
+    ste_strw = STE_STRW(ste); /* stream world control */
+
+    if (!STE_VALID(ste)) {
+        error_report("STE NOT valid");
+        return false;
+    }
+
+    granule_supported = is_s2granule_valid(ste);
+
+    /* As S1/S2 combinations are supported do not check
+     * corresponding STE config values */
+
+    if (!config[2]) {
+        /* Report abort to device, no event recorded */
+        error_report("STE config 0b000 not implemented");
+        return false;
+    }
+
+    if (!SMMU_IDR1_SIDSIZE && ste_s1cdmax && config[0] &&
+        !SMMU_IDR0_CD2L && (ste_s1fmt == 1 || ste_s1fmt == 2)) {
+        error_report("STE inconsistent, CD mismatch");
+        return false;
+    }
+    if (SMMU_IDR0_ATS && ((_config & 0x3) == 0) &&
+        ((ste_eats == 2 && (_config != 0x7 || ste_s2s)) ||
+        (ste_eats == 1 && !ste_s2s))) {
+        error_report("STE inconsistent, EATS/S2S mismatch");
+        return false;
+    }
+    if (config[0] && (SMMU_IDR1_SIDSIZE &&
+        (ste_s1cdmax > SMMU_IDR1_SIDSIZE))) {
+        error_report("STE inconsistent, SSID out of range");
+        return false;
+    }
+
+    strw_unused = (!SMMU_IDR0_S1P || !SMMU_IDR0_HYP || (_config == 4));
+
+    addr_out_of_range = STE_S2TTB(ste) > MAX_PA(ste);
+
+    if (is_ste_stage2(ste)) {
+        if ((ste_s2aa64 && !is_s2granule_valid(ste)) ||
+            (!ste_s2aa64 && !(SMMU_IDR0_TTF & 0x1)) ||
+            (ste_s2aa64 && !(SMMU_IDR0_TTF & 0x2))  ||
+            ((STE_S2HA(ste) || STE_S2HD(ste)) && !ste_s2aa64) ||
+            ((STE_S2HA(ste) || STE_S2HD(ste)) && !SMMU_IDR0_HTTU) ||
+            (STE_S2HD(ste) && (SMMU_IDR0_HTTU == 1)) || addr_out_of_range) {
+            error_report("STE inconsistent");
+            trace_smmuv3_is_ste_consistent(config[1], granule_supported,
+                                           addr_out_of_range, ste_s2aa64,
+                                           STE_S2HA(ste), STE_S2HD(ste),
+                                           STE_S2TTB(ste));
+            return false;
+        }
+    }
+    if (SMMU_IDR0_S2P && (config[0] == 0 && config[1]) &&
+        (strw_unused || !ste_strw) && !SMMU_IDR0_VMID16 && !(ste_vmid >> 8)) {
+        error_report("STE inconsistent, VMID out of range");
+        return false;
+    }
+    return true;
+}
+
+/**
+ * smmu_find_ste - Return the stream table entry associated
+ * to the sid
+ *
+ * @s: smmuv3 handle
+ * @sid: stream ID
+ * @ste: returned stream table entry
+ *
+ * Supports linear and 2-level stream table
+ * Return 0 on success or an SMMUEvtErr enum value otherwise
+ */
+static int smmu_find_ste(SMMUV3State *s, uint16_t sid, Ste *ste)
+{
+    hwaddr addr;
+
+    trace_smmuv3_find_ste(sid, s->features, s->sid_split);
+    /* Check SID range */
+    if (sid >= (1 << s->sid_size)) {
+        return SMMU_EVT_C_BAD_SID;
+    }
+    if (s->features & SMMU_FEATURE_2LVL_STE) {
+        int l1_ste_offset, l2_ste_offset, max_l2_ste, span;
+        hwaddr l1ptr, l2ptr;
+        STEDesc l1std;
+
+        l1_ste_offset = sid >> s->sid_split;
+        l2_ste_offset = sid & ((1 << s->sid_split) - 1);
+        l1ptr = (hwaddr)(s->strtab_base + l1_ste_offset * sizeof(l1std));
+        smmu_read_sysmem(l1ptr, &l1std, sizeof(l1std), false);
+        span = L1STD_SPAN(&l1std);
+
+        if (!span) {
+            /* l2ptr is not valid */
+            error_report("invalid sid=%d (L1STD span=0)", sid);
+            return SMMU_EVT_C_BAD_SID;
+        }
+        max_l2_ste = (1 << span) - 1;
+        l2ptr = L1STD_L2PTR(&l1std);
+        trace_smmuv3_find_ste_2lvl(s->strtab_base, l1ptr, l1_ste_offset,
+                                   l2ptr, l2_ste_offset, max_l2_ste);
+        if (l2_ste_offset > max_l2_ste) {
+            error_report("l2_ste_offset=%d > max_l2_ste=%d",
+                         l2_ste_offset, max_l2_ste);
+            return SMMU_EVT_C_BAD_STE;
+        }
+        addr = L1STD_L2PTR(&l1std) + l2_ste_offset * sizeof(*ste);
+    } else {
+        addr = s->strtab_base + sid * sizeof(*ste);
+    }
+
+    if (smmu_get_ste(s, addr, ste)) {
+        error_report("Unable to Fetch STE");
+        return SMMU_EVT_F_STE_FETCH;
+    }
+
+    return 0;
+}
+
+/**
+ * smmu_cfg_populate_s1 - Populate the stage 1 translation config
+ * from the context descriptor
+ */
+static int smmu_cfg_populate_s1(SMMUTransCfg *cfg, Cd *cd)
+{
+    bool s1a64 = CD_AARCH64(cd);
+    int epd0 = CD_EPD0(cd);
+    int tg;
+
+    cfg->stage   = 1;
+    tg           = epd0 ? CD_TG1(cd) : CD_TG0(cd);
+    cfg->tsz     = epd0 ? CD_T1SZ(cd) : CD_T0SZ(cd);
+    cfg->ttbr    = epd0 ? CD_TTB1(cd) : CD_TTB0(cd);
+    cfg->oas     = oas2bits(CD_IPS(cd));
+
+    if (s1a64) {
+        cfg->tsz = MIN(cfg->tsz, 39);
+        cfg->tsz = MAX(cfg->tsz, 16);
+    }
+    cfg->granule_sz = tg2granule(tg, epd0);
+
+    cfg->oas = MIN(oas2bits(SMMU_IDR5_OAS), cfg->oas);
+    /* fix ttbr - make top bits zero*/
+    cfg->ttbr = extract64(cfg->ttbr, 0, cfg->oas);
+    cfg->aa64 = s1a64;
+    cfg->initial_level  = 4 - (64 - cfg->tsz - 4) / (cfg->granule_sz - 3);
+
+    trace_smmuv3_cfg_stage(cfg->stage, cfg->oas, cfg->tsz, cfg->ttbr,
+                           cfg->aa64, cfg->granule_sz, cfg->initial_level);
+
+    return 0;
+}
+
+/**
+ * smmu_cfg_populate_s2 - Populate the stage 2 translation config
+ * from the Stream Table Entry
+ */
+static int smmu_cfg_populate_s2(SMMUTransCfg *cfg, Ste *ste)
+{
+    bool s2a64 = STE_S2AA64(ste);
+    int default_initial_level;
+    int tg;
+
+    cfg->stage = 2;
+
+    tg           = STE_S2TG(ste);
+    cfg->tsz     = STE_S2T0SZ(ste);
+    cfg->ttbr    = STE_S2TTB(ste);
+    cfg->oas     = pa_range(ste);
+
+    cfg->aa64    = s2a64;
+
+    if (s2a64) {
+        cfg->tsz = MIN(cfg->tsz, 39);
+        cfg->tsz = MAX(cfg->tsz, 16);
+    }
+    cfg->granule_sz = tg2granule(tg, 0);
+
+    cfg->oas = MIN(oas2bits(SMMU_IDR5_OAS), cfg->oas);
+    /* fix ttbr - make top bits zero*/
+    cfg->ttbr = extract64(cfg->ttbr, 0, cfg->oas);
+
+    default_initial_level = 4 - (64 - cfg->tsz - 4) / (cfg->granule_sz - 3);
+    cfg->initial_level = ~STE_S2SL0(ste);
+    if (cfg->initial_level  != default_initial_level) {
+        error_report("%s concatenated translation tables at initial S2 lookup"
+                     " not supported", __func__);
+        return SMMU_EVT_C_BAD_STE;
+    }
+
+    trace_smmuv3_cfg_stage(cfg->stage, cfg->oas, cfg->tsz, cfg->ttbr,
+                           cfg->aa64, cfg->granule_sz, cfg->initial_level);
+
+    return 0;
+}
+
+/**
+ * smmuv3_decode_config - Prepare the translation configuration
+ * for the @mr iommu region
+ * @mr: iommu memory region the translation config must be prepared for
+ * @cfg: output translation configuration
+ *
+ * return 0 on success or an SMMUEvtErr enum value otherwise
+ */
+static int smmuv3_decode_config(IOMMUMemoryRegion *mr, SMMUTransCfg *cfg)
+{
+    SMMUDevice *sdev = container_of(mr, SMMUDevice, iommu);
+    int sid = smmu_get_sid(sdev);
+    SMMUV3State *s = sdev->smmu;
+    Ste ste;
+    Cd cd;
+    int ret = 0;
+
+    if (!smmu_enabled(s)) {
+        cfg->disabled = true;
+        return 0;
+    }
+    ret = smmu_find_ste(s, sid, &ste);
+    if (ret) {
+        return ret;
+    }
+
+    if (!STE_VALID(&ste)) {
+        return SMMU_EVT_C_BAD_STE;
+    }
+
+    switch (STE_CONFIG(&ste)) {
+    case STE_CONFIG_BYPASS:
+        cfg->bypassed = true;
+        return 0;
+    case STE_CONFIG_S1:
+        break;
+    case STE_CONFIG_S2:
+        break;
+    default: /* reserved, abort, nested */
+        return SMMU_EVT_F_UUT;
+    }
+
+    /* S1 or S2 */
+
+    if (!is_ste_consistent(s, &ste)) {
+        return SMMU_EVT_C_BAD_STE;
+    }
+
+    if (is_ste_stage1(&ste)) {
+        ret = smmu_get_cd(s, &ste, 0, &cd); /* We don't have SSID yet */
+        if (ret) {
+            return SMMU_EVT_F_CD_FETCH;
+        }
+
+        if (!is_cd_valid(s, &ste, &cd)) {
+            return SMMU_EVT_C_BAD_CD;
+        }
+        return smmu_cfg_populate_s1(cfg, &cd);
+    }
+
+    return smmu_cfg_populate_s2(cfg, &ste);
+}
+
+static IOMMUTLBEntry smmuv3_translate(IOMMUMemoryRegion *mr, hwaddr addr,
+                                      IOMMUAccessFlags flag)
+{
+    SMMUDevice *sdev = container_of(mr, SMMUDevice, iommu);
+    SMMUV3State *s = sdev->smmu;
+    uint16_t sid = smmu_get_sid(sdev);
+    SMMUEvtErr ret;
+    SMMUTransCfg cfg = {};
+    IOMMUTLBEntry entry = {
+        .target_as = &address_space_memory,
+        .iova = addr,
+        .translated_addr = addr,
+        .addr_mask = ~(hwaddr)0,
+        .perm = flag,
+    };
+
+    ret = smmuv3_decode_config(mr, &cfg);
+    if (ret || cfg.disabled || cfg.bypassed) {
+        goto out;
+    }
+
+    entry.addr_mask = (1 << cfg.granule_sz) - 1;
+
+    ret = smmu_translate(&cfg, &entry);
+
+    trace_smmuv3_translate(mr->parent_obj.name, sid, addr,
+                           entry.translated_addr, entry.perm, ret);
+out:
+    if (ret) {
+        error_report("%s translation failed for iova=0x%"PRIx64,
+                     mr->parent_obj.name, addr);
+        smmuv3_record_event(s, entry.iova, sid, flag, ret);
+    }
+    return entry;
+}
+
 static int smmuv3_cmdq_consume(SMMUV3State *s)
 {
     SMMUCmdError cmd_error = SMMU_CERROR_NONE;
@@ -621,6 +963,9 @@ static void smmuv3_class_init(ObjectClass *klass, void *data)
 static void smmuv3_iommu_memory_region_class_init(ObjectClass *klass,
                                                   void *data)
 {
+    IOMMUMemoryRegionClass *imrc = IOMMU_MEMORY_REGION_CLASS(klass);
+
+    imrc->translate = smmuv3_translate;
 }
 
 static const TypeInfo smmuv3_type_info = {
diff --git a/hw/arm/trace-events b/hw/arm/trace-events
index 40f2057..e643fc3 100644
--- a/hw/arm/trace-events
+++ b/hw/arm/trace-events
@@ -34,3 +34,12 @@ smmuv3_write_mmio(hwaddr addr, uint64_t val, unsigned size) "addr: 0x%"PRIx64" v
 smmuv3_write_mmio_idr(hwaddr addr, uint64_t val) "write to RO/Unimpl reg 0x%lx val64:0x%lx"
 smmuv3_write_mmio_evtq_cons_bef_clear(uint32_t prod, uint32_t cons, uint8_t prod_wrap, uint8_t cons_wrap) "Before clearing interrupt prod:0x%x cons:0x%x prod.w:%d cons.w:%d"
 smmuv3_write_mmio_evtq_cons_after_clear(uint32_t prod, uint32_t cons, uint8_t prod_wrap, uint8_t cons_wrap) "after clearing interrupt prod:0x%x cons:0x%x prod.w:%d cons.w:%d"
+smmuv3_is_ste_consistent(bool cfg, bool granule_supported, bool addr_oor, uint32_t aa64, int s2ha, int s2hd, uint64_t s2ttb) "config[1]:%d gran:%d addr:%d aa64:%d s2ha:%d s2hd:%d s2ttb:0x%"PRIx64
+smmuv3_find_ste(uint16_t sid, uint32_t features, uint16_t sid_split) "SID:0x%x features:0x%x, sid_split:0x%x"
+smmuv3_find_ste_2lvl(uint64_t strtab_base, hwaddr l1ptr, int l1_ste_offset, hwaddr l2ptr, int l2_ste_offset, int max_l2_ste) "strtab_base:0x%"PRIx64" l1ptr:0x%"PRIx64" l1_off:0x%x, l2ptr:0x%"PRIx64" l2_off:0x%x max_l2_ste:%d"
+smmuv3_get_ste(hwaddr addr) "STE addr: 0x%"PRIx64
+smmuv3_translate_bypass(const char *n, uint16_t sid, hwaddr addr, bool is_write) "%s sid=%d bypass iova:0x%"PRIx64" is_write=%d"
+smmuv3_translate_in(uint16_t sid, int pci_bus_num, hwaddr strtab_base) "SID:0x%x bus:%d strtab_base:0x%"PRIx64
+smmuv3_get_cd(hwaddr addr) "CD addr: 0x%"PRIx64
+smmuv3_translate(const char *n, uint16_t sid, hwaddr iova, hwaddr translated, int perm, int ret) "%s sid=%d iova=0x%"PRIx64" translated=0x%"PRIx64" perm=0x%x (%d)"
+smmuv3_cfg_stage(int s, uint32_t oas, uint32_t tsz, uint64_t ttbr, bool aa64, uint32_t granule_sz, int initial_level) "TransCFG stage:%d oas:%d tsz:%d ttbr:0x%"PRIx64"  aa64:%d granule_sz:%d, initial_level = %d"
-- 
2.5.5


* [Qemu-devel] [PATCH v7 11/20] target/arm/kvm: Translate the MSI doorbell in kvm_arch_fixup_msi_route
  2017-09-01 17:21 [Qemu-devel] [PATCH v7 00/20] ARM SMMUv3 Emulation Support Eric Auger
                   ` (9 preceding siblings ...)
  2017-09-01 17:21 ` [Qemu-devel] [PATCH v7 10/20] hw/arm/smmuv3: Implement translate callback Eric Auger
@ 2017-09-01 17:21 ` Eric Auger
  2017-09-01 17:21 ` [Qemu-devel] [PATCH v7 12/20] hw/arm/smmuv3: Implement data structure and TLB invalidation notifications Eric Auger
                   ` (13 subsequent siblings)
  24 siblings, 0 replies; 72+ messages in thread
From: Eric Auger @ 2017-09-01 17:21 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, qemu-arm, qemu-devel,
	prem.mallappa, alex.williamson
  Cc: drjones, christoffer.dall, Radha.Chintakuntla, Sunil.Goutham,
	mohun106, tcain, bharat.bhushan, tn, mst, will.deacon,
	jean-philippe.brucker, robin.murphy, peterx, edgar.iglesias,
	wtownsen

If the MSI doorbell address is translated by an IOMMU, we need to fix up
the MSI route with the translated address.

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---

v5 -> v6:
- use IOMMUMemoryRegionClass API

It is still unclear to me whether we need to register an IOMMUNotifier
to handle any change in the MSI doorbell which would occur behind
the scenes and would not lead to any call to kvm_arch_fixup_msi_route().
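
The fixup itself amounts to splitting the translated doorbell address into
the two 32-bit halves of the MSI route. A minimal sketch, in which
MsiRouteSketch and msi_route_set_address() are hypothetical stand-ins for
struct kvm_irq_routing_entry and the assignments done in this patch:

```c
#include <stdint.h>

/* Hypothetical stand-in for the u.msi part of a KVM routing entry. */
typedef struct {
    uint32_t address_lo;
    uint32_t address_hi;
} MsiRouteSketch;

/* Store a 64-bit translated doorbell address as low and high words. */
static void msi_route_set_address(MsiRouteSketch *route, uint64_t translated)
{
    route->address_lo = (uint32_t)(translated & 0xffffffffu);
    route->address_hi = (uint32_t)(translated >> 32);
}
```

The sketch does not cover the open question above: it only shows what is
written into the route once a translation is available.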
---
 target/arm/kvm.c        | 27 +++++++++++++++++++++++++++
 target/arm/trace-events |  3 +++
 2 files changed, 30 insertions(+)

diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index 7c17f0d..a2fa948 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -20,8 +20,13 @@
 #include "sysemu/kvm.h"
 #include "kvm_arm.h"
 #include "cpu.h"
+#include "trace.h"
 #include "internals.h"
 #include "hw/arm/arm.h"
+#include "hw/pci/pci.h"
+#include "hw/pci/msi.h"
+#include "hw/arm/smmu-common.h"
+#include "hw/arm/smmuv3.h"
 #include "exec/memattrs.h"
 #include "exec/address-spaces.h"
 #include "hw/boards.h"
@@ -662,6 +667,28 @@ int kvm_arm_vgic_probe(void)
 int kvm_arch_fixup_msi_route(struct kvm_irq_routing_entry *route,
                              uint64_t address, uint32_t data, PCIDevice *dev)
 {
+    AddressSpace *as = pci_device_iommu_address_space(dev);
+    IOMMUMemoryRegionClass *imrc;
+    IOMMUTLBEntry entry;
+    SMMUDevice *sdev;
+
+    if (as == &address_space_memory) {
+        return 0;
+    }
+
+    /* MSI doorbell address is translated by an IOMMU */
+    sdev = container_of(as, SMMUDevice, as);
+    imrc = IOMMU_MEMORY_REGION_GET_CLASS(&sdev->iommu);
+
+    entry = imrc->translate(&sdev->iommu, address, IOMMU_WO);
+
+    route->u.msi.address_lo = entry.translated_addr;
+    route->u.msi.address_hi = entry.translated_addr >> 32;
+
+    trace_kvm_arm_fixup_msi_route(address, sdev->devfn,
+                                  sdev->iommu.parent_obj.name,
+                                  entry.translated_addr);
+
     return 0;
 }
 
diff --git a/target/arm/trace-events b/target/arm/trace-events
index 9e37131..8b3c220 100644
--- a/target/arm/trace-events
+++ b/target/arm/trace-events
@@ -8,3 +8,6 @@ arm_gt_tval_write(int timer, uint64_t value) "gt_tval_write: timer %d value 0x%"
 arm_gt_ctl_write(int timer, uint64_t value) "gt_ctl_write: timer %d value 0x%" PRIx64
 arm_gt_imask_toggle(int timer, int irqstate) "gt_ctl_write: timer %d IMASK toggle, new irqstate %d"
 arm_gt_cntvoff_write(uint64_t value) "gt_cntvoff_write: value 0x%" PRIx64
+
+# target/arm/kvm.c
+kvm_arm_fixup_msi_route(uint64_t iova, uint32_t devid, const char *name, uint64_t gpa) "MSI addr = 0x%"PRIx64" is translated for devfn=%d through %s into 0x%"PRIx64
-- 
2.5.5


* [Qemu-devel] [PATCH v7 12/20] hw/arm/smmuv3: Implement data structure and TLB invalidation notifications
  2017-09-01 17:21 [Qemu-devel] [PATCH v7 00/20] ARM SMMUv3 Emulation Support Eric Auger
                   ` (10 preceding siblings ...)
  2017-09-01 17:21 ` [Qemu-devel] [PATCH v7 11/20] target/arm/kvm: Translate the MSI doorbell in kvm_arch_fixup_msi_route Eric Auger
@ 2017-09-01 17:21 ` Eric Auger
  2017-09-01 17:21 ` [Qemu-devel] [PATCH v7 13/20] hw/arm/smmuv3: Implement IOMMU memory region replay callback Eric Auger
                   ` (12 subsequent siblings)
  24 siblings, 0 replies; 72+ messages in thread
From: Eric Auger @ 2017-09-01 17:21 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, qemu-arm, qemu-devel,
	prem.mallappa, alex.williamson
  Cc: drjones, christoffer.dall, Radha.Chintakuntla, Sunil.Goutham,
	mohun106, tcain, bharat.bhushan, tn, mst, will.deacon,
	jean-philippe.brucker, robin.murphy, peterx, edgar.iglesias,
	wtownsen

When the guest invalidates a data structure (STE, CD) or TLB entries,
we need to notify the IOMMU region notifiers. This enables
vhost integration and also prepares for VFIO integration.

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---
v6 -> v7:
- move SMMU_CMD_TLBI_NH_VA_AM in a separate patch
- rationalize names and add some comments
- fix devfn computation in smmuv3_replay_sid
- directly use smmuv3_notify_iova_range
- move smmuv3_replay (used for VFIO) in a separate patch

v5 -> v6:
- use IOMMUMemoryRegion
- handle implementation defined SMMU_CMD_TLBI_NH_VA_AM cmd
  (goes along with TLBI_ON_MAP FW quirk)
- replay systematically unmaps the whole range first
- smmuv3_map_hook does not unmap anymore and the unmap is done
  before the replay
- add and use smmuv3_context_device_invalidate instead of
  blindly replaying everything
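
To illustrate the "unmap the whole range first" step: the unmap notification
covers a naturally aligned, power-of-two sized window around the invalidated
IOVA. A minimal sketch, in which UnmapRangeSketch and unmap_range_init() are
hypothetical names and the power-of-two check is an assumption added for the
example (the patch does not perform it):

```c
#include <stdint.h>

/* Hypothetical subset of the IOMMUTLBEntry fields used for the unmap. */
typedef struct {
    uint64_t iova;
    uint64_t addr_mask;
} UnmapRangeSketch;

/*
 * Compute the aligned window covering 'iova' for a range of 'size' bytes.
 * Returns 0 on success, -1 if 'size' is zero or not a power of two.
 */
static int unmap_range_init(UnmapRangeSketch *r, uint64_t iova, uint64_t size)
{
    if (size == 0 || (size & (size - 1)) != 0) {
        return -1;
    }
    r->iova = iova & ~(size - 1);   /* align down to the window start */
    r->addr_mask = size - 1;        /* window covers iova..iova+mask */
    return 0;
}
```

The page walk that follows the unmap then re-notifies any mapping still
present inside that window.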
---
 hw/arm/smmuv3.c     | 137 ++++++++++++++++++++++++++++++++++++++++++++++++++--
 hw/arm/trace-events |   5 ++
 2 files changed, 138 insertions(+), 4 deletions(-)

diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index 20fbce6..8e7d10d 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -25,6 +25,7 @@
 #include "exec/address-spaces.h"
 #include "trace.h"
 #include "qemu/error-report.h"
+#include "exec/target_page.h"
 
 #include "hw/arm/smmuv3.h"
 #include "smmuv3-internal.h"
@@ -648,6 +649,123 @@ out:
     return entry;
 }
 
+static int smmuv3_notify_entry(IOMMUTLBEntry *entry, void *private)
+{
+    trace_smmuv3_notify_entry(entry->iova, entry->translated_addr,
+                              entry->addr_mask, entry->perm);
+    memory_region_notify_one((IOMMUNotifier *)private, entry);
+    return 0;
+}
+
+static void smmuv3_notify_iova_range(IOMMUMemoryRegion *mr, IOMMUNotifier *n,
+                                     uint64_t iova, size_t size)
+{
+    SMMUTransCfg cfg = {};
+    IOMMUTLBEntry entry;
+    int ret;
+
+    trace_smmuv3_notify_iova_range(mr->parent_obj.name, iova, size, n);
+    ret = smmuv3_decode_config(mr, &cfg);
+    if (ret) {
+        error_report("%s error decoding the configuration for iommu mr=%s",
+                     __func__, mr->parent_obj.name);
+    }
+
+    if (cfg.disabled || cfg.bypassed) {
+        return;
+    }
+
+    /* first unmap */
+    entry.target_as = &address_space_memory;
+    entry.iova = iova & ~(size - 1);
+    entry.addr_mask = size - 1;
+    entry.perm = IOMMU_NONE;
+
+    memory_region_notify_one(n, &entry);
+
+    /* then figure out if a new mapping needs to be applied */
+    smmu_page_walk(&cfg, iova, iova + entry.addr_mask, false,
+                   smmuv3_notify_entry, n);
+}
+
+static void smmuv3_notify_flag_changed(IOMMUMemoryRegion *iommu,
+                                       IOMMUNotifierFlag old,
+                                       IOMMUNotifierFlag new)
+{
+    SMMUDevice *sdev = container_of(iommu, SMMUDevice, iommu);
+    SMMUV3State *s3 = sdev->smmu;
+    SMMUState *s = &(s3->smmu_state);
+    SMMUNotifierNode *node = NULL;
+    SMMUNotifierNode *next_node = NULL;
+
+    if (old == IOMMU_NOTIFIER_NONE) {
+        trace_smmuv3_notify_flag_add(iommu->parent_obj.name);
+        node = g_malloc0(sizeof(*node));
+        node->sdev = sdev;
+        QLIST_INSERT_HEAD(&s->notifiers_list, node, next);
+        return;
+    }
+
+    /* update notifier node with new flags */
+    QLIST_FOREACH_SAFE(node, &s->notifiers_list, next, next_node) {
+        if (node->sdev == sdev) {
+            if (new == IOMMU_NOTIFIER_NONE) {
+                trace_smmuv3_notify_flag_del(iommu->parent_obj.name);
+                QLIST_REMOVE(node, next);
+                g_free(node);
+            }
+            return;
+        }
+    }
+}
+/*
+ * Replay all iommu memory regions attached to the smmu
+ */
+static void smmuv3_replay_all(SMMUState *s)
+{
+    SMMUNotifierNode *node;
+
+    QLIST_FOREACH(node, &s->notifiers_list, next) {
+        trace_smmuv3_replay_mr(node->sdev->iommu.parent_obj.name);
+        memory_region_iommu_replay_all(&node->sdev->iommu);
+    }
+}
+
+/*
+ * Replay the iommu memory region corresponding to a given streamid
+ */
+static void smmuv3_replay_sid(SMMUState *s, uint16_t sid)
+{
+    uint8_t bus_n, devfn;
+    SMMUPciBus *smmu_bus;
+    SMMUDevice *smmu;
+
+    bus_n = PCI_BUS_NUM(sid);
+    smmu_bus = smmu_find_as_from_bus_num(s, bus_n);
+    if (smmu_bus) {
+        devfn = sid & 0xFF;
+        smmu = smmu_bus->pbdev[devfn];
+        if (smmu) {
+            trace_smmuv3_replay_mr(smmu->iommu.parent_obj.name);
+            memory_region_iommu_replay_all(&smmu->iommu);
+        }
+    }
+}
+
+static void smmuv3_replay_iova_range(SMMUState *s, uint64_t iova, size_t size)
+{
+    SMMUNotifierNode *node;
+
+    QLIST_FOREACH(node, &s->notifiers_list, next) {
+        IOMMUMemoryRegion *mr = &node->sdev->iommu;
+        IOMMUNotifier *n;
+
+        IOMMU_NOTIFIER_FOREACH(n, mr) {
+            smmuv3_notify_iova_range(mr, n, iova, size);
+        }
+    }
+}
+
 static int smmuv3_cmdq_consume(SMMUV3State *s)
 {
     SMMUCmdError cmd_error = SMMU_CERROR_NONE;
@@ -687,24 +805,32 @@ static int smmuv3_cmdq_consume(SMMUV3State *s)
              uint32_t streamid = cmd.word[1];
 
              trace_smmuv3_cmdq_cfgi_ste(streamid);
+             smmuv3_replay_sid(&s->smmu_state, streamid);
             break;
         }
         case SMMU_CMD_CFGI_STE_RANGE: /* same as SMMU_CMD_CFGI_ALL */
         {
-            uint32_t start = cmd.word[1], range, end;
+            uint32_t start = cmd.word[1], range, end, i;
 
             range = extract32(cmd.word[2], 0, 5);
             end = start + (1 << (range + 1)) - 1;
             trace_smmuv3_cmdq_cfgi_ste_range(start, end);
+            for (i = start; i <= end; i++) {
+                smmuv3_replay_sid(&s->smmu_state, i);
+            }
             break;
         }
         case SMMU_CMD_CFGI_CD:
         case SMMU_CMD_CFGI_CD_ALL:
-            trace_smmuv3_unhandled_cmd(type);
+        {
+            uint32_t streamid = cmd.word[1];
+
+            smmuv3_replay_sid(&s->smmu_state, streamid);
             break;
+        }
         case SMMU_CMD_TLBI_NH_ALL:
         case SMMU_CMD_TLBI_NH_ASID:
-            trace_smmuv3_unhandled_cmd(type);
+            smmuv3_replay_all(&s->smmu_state);
             break;
         case SMMU_CMD_TLBI_NH_VA:
         {
@@ -713,8 +839,10 @@ static int smmuv3_cmdq_consume(SMMUV3State *s)
             uint64_t low = extract32(cmd.word[2], 12, 20);
             uint64_t high = cmd.word[3];
             uint64_t addr = high << 32 | (low << 12);
+            size_t size = qemu_target_page_size();
 
             trace_smmuv3_cmdq_tlbi_nh_va(asid, vmid, addr);
+            smmuv3_replay_iova_range(&s->smmu_state, addr, size);
             break;
         }
         case SMMU_CMD_TLBI_NH_VAA:
@@ -727,7 +855,7 @@ static int smmuv3_cmdq_consume(SMMUV3State *s)
         case SMMU_CMD_TLBI_S12_VMALL:
         case SMMU_CMD_TLBI_S2_IPA:
         case SMMU_CMD_TLBI_NSNH_ALL:
-            trace_smmuv3_unhandled_cmd(type);
+            smmuv3_replay_all(&s->smmu_state);
             break;
         case SMMU_CMD_ATC_INV:
         case SMMU_CMD_PRI_RESP:
@@ -966,6 +1094,7 @@ static void smmuv3_iommu_memory_region_class_init(ObjectClass *klass,
     IOMMUMemoryRegionClass *imrc = IOMMU_MEMORY_REGION_CLASS(klass);
 
     imrc->translate = smmuv3_translate;
+    imrc->notify_flag_changed = smmuv3_notify_flag_changed;
 }
 
 static const TypeInfo smmuv3_type_info = {
diff --git a/hw/arm/trace-events b/hw/arm/trace-events
index e643fc3..4ac264d 100644
--- a/hw/arm/trace-events
+++ b/hw/arm/trace-events
@@ -43,3 +43,8 @@ smmuv3_translate_in(uint16_t sid, int pci_bus_num, hwaddr strtab_base) "SID:0x%x
 smmuv3_get_cd(hwaddr addr) "CD addr: 0x%"PRIx64
 smmuv3_translate(const char *n, uint16_t sid, hwaddr iova, hwaddr translated, int perm, int ret) "%s sid=%d iova=0x%"PRIx64" translated=0x%"PRIx64" perm=0x%x (%d)"
 smmuv3_cfg_stage(int s, uint32_t oas, uint32_t tsz, uint64_t ttbr, bool aa64, uint32_t granule_sz, int initial_level) "TransCFG stage:%d oas:%d tsz:%d ttbr:0x%"PRIx64"  aa64:%d granule_sz:%d, initial_level = %d"
+smmuv3_notify_flag_add(const char *iommu) "ADD SMMUNotifier node for iommu mr=%s"
+smmuv3_notify_flag_del(const char *iommu) "DEL SMMUNotifier node for iommu mr=%s"
+smmuv3_replay_mr(const char *name) "iommu mr=%s"
+smmuv3_notify_entry(hwaddr iova, hwaddr pa, hwaddr mask, int perm) "iova=0x%"PRIx64" pa=0x%" PRIx64" mask=0x%"PRIx64" perm=%d"
+smmuv3_notify_iova_range(const char *name, uint64_t iova, size_t size, void *n) "iommu mr=%s iova=0x%"PRIx64" size=0x%lx n=%p"
-- 
2.5.5


* [Qemu-devel] [PATCH v7 13/20] hw/arm/smmuv3: Implement IOMMU memory region replay callback
  2017-09-01 17:21 [Qemu-devel] [PATCH v7 00/20] ARM SMMUv3 Emulation Support Eric Auger
                   ` (11 preceding siblings ...)
  2017-09-01 17:21 ` [Qemu-devel] [PATCH v7 12/20] hw/arm/smmuv3: Implement data structure and TLB invalidation notifications Eric Auger
@ 2017-09-01 17:21 ` Eric Auger
  2017-09-14  9:27   ` [Qemu-devel] [Qemu-arm] " Linu Cherian
  2017-09-01 17:21 ` [Qemu-devel] [PATCH v7 14/20] hw/arm/virt: Store the PCI host controller dt phandle Eric Auger
                   ` (11 subsequent siblings)
  24 siblings, 1 reply; 72+ messages in thread
From: Eric Auger @ 2017-09-01 17:21 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, qemu-arm, qemu-devel,
	prem.mallappa, alex.williamson
  Cc: drjones, christoffer.dall, Radha.Chintakuntla, Sunil.Goutham,
	mohun106, tcain, bharat.bhushan, tn, mst, will.deacon,
	jean-philippe.brucker, robin.murphy, peterx, edgar.iglesias,
	wtownsen

memory_region_iommu_replay() is used for VFIO integration.

However, its default implementation is not suited to the SMMUv3
IOMMU memory region: the input address range is huge and the
execution is slow, since it calls the translate() callback on
each granule.

Let's implement the replay callback, which hierarchically walks
the page table structure and notifies only the segments
that are populated with valid entries.
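
To illustrate why a hierarchical walk is cheaper than probing every
granule, here is a minimal sketch on a toy two-level table. All names
(ToyL1, ToyL2, toy_page_walk, toy_record_last) are hypothetical and are
not part of this series; the real walker is smmu_page_walk(), whose
table format and levels differ. The only point shown is that an empty
top-level slot lets the walk skip the whole range below it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy geometry (assumption): 4 top-level slots, each covering 4 pages of 4 KiB. */
#define TOY_PAGE_SIZE   0x1000u
#define TOY_L1_ENTRIES  4
#define TOY_L2_ENTRIES  4

typedef struct ToyL2 {
    uint64_t pa[TOY_L2_ENTRIES];        /* 0 means "not mapped" */
} ToyL2;

typedef struct ToyL1 {
    const ToyL2 *next[TOY_L1_ENTRIES];  /* NULL means "nothing mapped below" */
} ToyL1;

typedef void (*toy_hook)(uint64_t iova, uint64_t pa, void *opaque);

/* Example hook: remember the last IOVA that was reported. */
static void toy_record_last(uint64_t iova, uint64_t pa, void *opaque)
{
    (void)pa;
    *(uint64_t *)opaque = iova;
}

/* Visit only populated subtrees; return the number of leaf entries reported. */
static size_t toy_page_walk(const ToyL1 *l1, toy_hook hook, void *opaque)
{
    size_t reported = 0;

    assert(l1 != NULL);
    for (size_t i = 0; i < TOY_L1_ENTRIES; i++) {
        const ToyL2 *l2 = l1->next[i];

        if (!l2) {
            continue;   /* skip the whole range covered by this slot */
        }
        for (size_t j = 0; j < TOY_L2_ENTRIES; j++) {
            if (l2->pa[j]) {
                uint64_t iova = (uint64_t)(i * TOY_L2_ENTRIES + j) * TOY_PAGE_SIZE;

                if (hook) {
                    hook(iova, l2->pa[j], opaque);
                }
                reported++;
            }
        }
    }
    return reported;
}
```

With a single populated second-level table, the walk touches one
subtree and reports only its valid leaves, instead of issuing one
lookup per page of the whole input range.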

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 hw/arm/smmuv3.c     | 36 ++++++++++++++++++++++++++++++++++++
 hw/arm/trace-events |  1 +
 2 files changed, 37 insertions(+)

diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index 8e7d10d..c43bd93 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -657,6 +657,41 @@ static int smmuv3_notify_entry(IOMMUTLBEntry *entry, void *private)
     return 0;
 }
 
+/* Unmap the whole notifier's range */
+static void smmuv3_unmap_notifier_range(IOMMUNotifier *n)
+{
+    IOMMUTLBEntry entry;
+    hwaddr size = n->end - n->start + 1;
+
+    entry.target_as = &address_space_memory;
+    entry.iova = n->start & ~(size - 1);
+    entry.perm = IOMMU_NONE;
+    entry.addr_mask = size - 1;
+
+    memory_region_notify_one(n, &entry);
+}
+
+static void smmuv3_replay(IOMMUMemoryRegion *mr, IOMMUNotifier *n)
+{
+    SMMUTransCfg cfg = {};
+    int ret;
+
+    trace_smmuv3_replay(mr->parent_obj.name, n, n->start, n->end);
+    smmuv3_unmap_notifier_range(n);
+
+    ret = smmuv3_decode_config(mr, &cfg);
+    if (ret) {
+        error_report("%s error decoding the configuration for iommu mr=%s",
+                     __func__, mr->parent_obj.name);
+    }
+
+    if (cfg.disabled || cfg.bypassed) {
+        return;
+    }
+    /* walk the page tables and replay valid entries */
+    smmu_page_walk(&cfg, 0, (1ULL << (64 - cfg.tsz)) - 1, false,
+                   smmuv3_notify_entry, n);
+}
 static void smmuv3_notify_iova_range(IOMMUMemoryRegion *mr, IOMMUNotifier *n,
                                      uint64_t iova, size_t size)
 {
@@ -1095,6 +1130,7 @@ static void smmuv3_iommu_memory_region_class_init(ObjectClass *klass,
 
     imrc->translate = smmuv3_translate;
     imrc->notify_flag_changed = smmuv3_notify_flag_changed;
+    imrc->replay = smmuv3_replay;
 }
 
 static const TypeInfo smmuv3_type_info = {
diff --git a/hw/arm/trace-events b/hw/arm/trace-events
index 4ac264d..15f84d6 100644
--- a/hw/arm/trace-events
+++ b/hw/arm/trace-events
@@ -46,5 +46,6 @@ smmuv3_cfg_stage(int s, uint32_t oas, uint32_t tsz, uint64_t ttbr, bool aa64, ui
 smmuv3_notify_flag_add(const char *iommu) "ADD SMMUNotifier node for iommu mr=%s"
 smmuv3_notify_flag_del(const char *iommu) "DEL SMMUNotifier node for iommu mr=%s"
 smmuv3_replay_mr(const char *name) "iommu mr=%s"
+smmuv3_replay(const char *name, void *n, hwaddr start, hwaddr end) "iommu mr=%s notifier=%p [0x%"PRIx64",0x%"PRIx64"]"
 smmuv3_notify_entry(hwaddr iova, hwaddr pa, hwaddr mask, int perm) "iova=0x%"PRIx64" pa=0x%" PRIx64" mask=0x%"PRIx64" perm=%d"
 smmuv3_notify_iova_range(const char *name, uint64_t iova, size_t size, void *n) "iommu mr=%s iova=0x%"PRIx64" size=0x%lx n=%p"
-- 
2.5.5


* [Qemu-devel] [PATCH v7 14/20] hw/arm/virt: Store the PCI host controller dt phandle
  2017-09-01 17:21 [Qemu-devel] [PATCH v7 00/20] ARM SMMUv3 Emulation Support Eric Auger
                   ` (12 preceding siblings ...)
  2017-09-01 17:21 ` [Qemu-devel] [PATCH v7 13/20] hw/arm/smmuv3: Implement IOMMU memory region replay callback Eric Auger
@ 2017-09-01 17:21 ` Eric Auger
  2017-09-01 17:21 ` [Qemu-devel] [PATCH v7 15/20] hw/arm/sysbus-fdt: Pass the VirtMachineState to the node creation functions Eric Auger
                   ` (10 subsequent siblings)
  24 siblings, 0 replies; 72+ messages in thread
From: Eric Auger @ 2017-09-01 17:21 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, qemu-arm, qemu-devel,
	prem.mallappa, alex.williamson
  Cc: drjones, christoffer.dall, Radha.Chintakuntla, Sunil.Goutham,
	mohun106, tcain, bharat.bhushan, tn, mst, will.deacon,
	jean-philippe.brucker, robin.murphy, peterx, edgar.iglesias,
	wtownsen

Let's allocate a phandle for the PCI host controller dt
node and store it in the VirtMachineState. This will
simplify fdt operations when we bind the smmu and the PCI
host controller.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 hw/arm/virt.c         | 5 ++++-
 include/hw/arm/virt.h | 1 +
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 6b7a0fe..39886c1 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -991,7 +991,7 @@ static void create_pcie_irq_map(const VirtMachineState *vms,
                            0x7           /* PCI irq */);
 }
 
-static void create_pcie(const VirtMachineState *vms, qemu_irq *pic)
+static void create_pcie(VirtMachineState *vms, qemu_irq *pic)
 {
     hwaddr base_mmio = vms->memmap[VIRT_PCIE_MMIO].base;
     hwaddr size_mmio = vms->memmap[VIRT_PCIE_MMIO].size;
@@ -1100,8 +1100,11 @@ static void create_pcie(const VirtMachineState *vms, qemu_irq *pic)
                                      2, base_mmio, 2, size_mmio);
     }
 
+    vms->pcihost_phandle = qemu_fdt_alloc_phandle(vms->fdt);
+
     qemu_fdt_setprop_cell(vms->fdt, nodename, "#interrupt-cells", 1);
     create_pcie_irq_map(vms, vms->gic_phandle, irq, nodename);
+    qemu_fdt_setprop_cell(vms->fdt, nodename, "phandle", vms->pcihost_phandle);
 
     g_free(nodename);
 }
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index 33b0ff3..ae2bf2c 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -105,6 +105,7 @@ typedef struct {
     uint32_t clock_phandle;
     uint32_t gic_phandle;
     uint32_t msi_phandle;
+    uint32_t pcihost_phandle;
     int psci_conduit;
 } VirtMachineState;
 
-- 
2.5.5


* [Qemu-devel] [PATCH v7 15/20] hw/arm/sysbus-fdt: Pass the VirtMachineState to the node creation functions
  2017-09-01 17:21 [Qemu-devel] [PATCH v7 00/20] ARM SMMUv3 Emulation Support Eric Auger
                   ` (13 preceding siblings ...)
  2017-09-01 17:21 ` [Qemu-devel] [PATCH v7 14/20] hw/arm/virt: Store the PCI host controller dt phandle Eric Auger
@ 2017-09-01 17:21 ` Eric Auger
  2017-10-09 17:47   ` Peter Maydell
  2017-09-01 17:21 ` [Qemu-devel] [PATCH v7 16/20] hw/arm/sysbus-fdt: Pass the platform bus base address in PlatformBusFDTData Eric Auger
                   ` (9 subsequent siblings)
  24 siblings, 1 reply; 72+ messages in thread
From: Eric Auger @ 2017-09-01 17:21 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, qemu-arm, qemu-devel,
	prem.mallappa, alex.williamson
  Cc: drjones, christoffer.dall, Radha.Chintakuntla, Sunil.Goutham,
	mohun106, tcain, bharat.bhushan, tn, mst, will.deacon,
	jean-philippe.brucker, robin.murphy, peterx, edgar.iglesias,
	wtownsen

The VirtMachineState contains some dt phandles that will be used
in some node creation functions. For instance we plan to use the
PCI host controller phandle in the smmu node creation function. So
let's pass the VirtMachineState handle down to the node creation
functions by enhancing the involved datatypes.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 hw/arm/sysbus-fdt.c         | 3 +++
 hw/arm/virt.c               | 1 +
 include/hw/arm/sysbus-fdt.h | 2 ++
 3 files changed, 6 insertions(+)

diff --git a/hw/arm/sysbus-fdt.c b/hw/arm/sysbus-fdt.c
index d68e3dc..d92a983 100644
--- a/hw/arm/sysbus-fdt.c
+++ b/hw/arm/sysbus-fdt.c
@@ -36,6 +36,7 @@
 #include "hw/vfio/vfio-platform.h"
 #include "hw/vfio/vfio-calxeda-xgmac.h"
 #include "hw/vfio/vfio-amd-xgbe.h"
+#include "hw/arm/virt.h"
 #include "hw/arm/fdt.h"
 
 /*
@@ -47,6 +48,7 @@ typedef struct PlatformBusFDTData {
     int irq_start; /* index of the first IRQ usable by platform bus devices */
     const char *pbus_node_name; /* name of the platform bus node */
     PlatformBusDevice *pbus;
+    VirtMachineState *vms;
 } PlatformBusFDTData;
 
 /*
@@ -514,6 +516,7 @@ static void add_all_platform_bus_fdt_nodes(ARMPlatformBusFDTParams *fdt_params)
         .irq_start = irq_start,
         .pbus_node_name = node,
         .pbus = pbus,
+        .vms = fdt_params->vms,
     };
 
     /* Loop through all dynamic sysbus devices and create their node */
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 39886c1..d7c28b0 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -1125,6 +1125,7 @@ static void create_platform_bus(VirtMachineState *vms, qemu_irq *pic)
     fdt_params->system_params = &platform_bus_params;
     fdt_params->binfo = &vms->bootinfo;
     fdt_params->intc = "/intc";
+    fdt_params->vms = vms;
     /*
      * register a machine init done notifier that creates the device tree
      * nodes of the platform bus and its children dynamic sysbus devices
diff --git a/include/hw/arm/sysbus-fdt.h b/include/hw/arm/sysbus-fdt.h
index e15bb81..f5feabc 100644
--- a/include/hw/arm/sysbus-fdt.h
+++ b/include/hw/arm/sysbus-fdt.h
@@ -25,6 +25,7 @@
 #define HW_ARM_SYSBUS_FDT_H
 
 #include "hw/arm/arm.h"
+#include "hw/arm/virt.h"
 #include "qemu-common.h"
 #include "hw/sysbus.h"
 
@@ -48,6 +49,7 @@ typedef struct {
     const ARMPlatformBusSystemParams *system_params;
     struct arm_boot_info *binfo;
     const char *intc; /* parent interrupt controller name */
+    VirtMachineState *vms;
 } ARMPlatformBusFDTParams;
 
 /**
-- 
2.5.5


* [Qemu-devel] [PATCH v7 16/20] hw/arm/sysbus-fdt: Pass the platform bus base address in PlatformBusFDTData
  2017-09-01 17:21 [Qemu-devel] [PATCH v7 00/20] ARM SMMUv3 Emulation Support Eric Auger
                   ` (14 preceding siblings ...)
  2017-09-01 17:21 ` [Qemu-devel] [PATCH v7 15/20] hw/arm/sysbus-fdt: Pass the VirtMachineState to the node creation functions Eric Auger
@ 2017-09-01 17:21 ` Eric Auger
  2017-09-01 17:21 ` [Qemu-devel] [PATCH v7 17/20] hw/arm/sysbus-fdt: Allow smmuv3 dynamic instantiation Eric Auger
                   ` (8 subsequent siblings)
  24 siblings, 0 replies; 72+ messages in thread
From: Eric Auger @ 2017-09-01 17:21 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, qemu-arm, qemu-devel,
	prem.mallappa, alex.williamson
  Cc: drjones, christoffer.dall, Radha.Chintakuntla, Sunil.Goutham,
	mohun106, tcain, bharat.bhushan, tn, mst, will.deacon,
	jean-philippe.brucker, robin.murphy, peterx, edgar.iglesias,
	wtownsen

The base address of the platform bus may be useful to device node
creation functions. This is typically the case if the node creation
function also prepares data for ACPI table generation.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 hw/arm/sysbus-fdt.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/hw/arm/sysbus-fdt.c b/hw/arm/sysbus-fdt.c
index d92a983..f8c4909 100644
--- a/hw/arm/sysbus-fdt.c
+++ b/hw/arm/sysbus-fdt.c
@@ -46,6 +46,7 @@
 typedef struct PlatformBusFDTData {
     void *fdt; /* device tree handle */
     int irq_start; /* index of the first IRQ usable by platform bus devices */
+    hwaddr base; /* base address of the platform bus */
     const char *pbus_node_name; /* name of the platform bus node */
     PlatformBusDevice *pbus;
     VirtMachineState *vms;
@@ -514,6 +515,7 @@ static void add_all_platform_bus_fdt_nodes(ARMPlatformBusFDTParams *fdt_params)
     PlatformBusFDTData data = {
         .fdt = fdt,
         .irq_start = irq_start,
+        .base = addr,
         .pbus_node_name = node,
         .pbus = pbus,
         .vms = fdt_params->vms,
-- 
2.5.5


* [Qemu-devel] [PATCH v7 17/20] hw/arm/sysbus-fdt: Allow smmuv3 dynamic instantiation
  2017-09-01 17:21 [Qemu-devel] [PATCH v7 00/20] ARM SMMUv3 Emulation Support Eric Auger
                   ` (15 preceding siblings ...)
  2017-09-01 17:21 ` [Qemu-devel] [PATCH v7 16/20] hw/arm/sysbus-fdt: Pass the platform bus base address in PlatformBusFDTData Eric Auger
@ 2017-09-01 17:21 ` Eric Auger
  2017-09-01 17:21 ` [Qemu-devel] [PATCH v7 18/20] hw/arm/virt-acpi-build: Add smmuv3 node in IORT table Eric Auger
                   ` (7 subsequent siblings)
  24 siblings, 0 replies; 72+ messages in thread
From: Eric Auger @ 2017-09-01 17:21 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, qemu-arm, qemu-devel,
	prem.mallappa, alex.williamson
  Cc: drjones, christoffer.dall, Radha.Chintakuntla, Sunil.Goutham,
	mohun106, tcain, bharat.bhushan, tn, mst, will.deacon,
	jean-philippe.brucker, robin.murphy, peterx, edgar.iglesias,
	wtownsen

This patch adds a node creation function for the smmuv3. Using
"-device smmuv3", it is now possible to instantiate the IOMMU on
the ARM virt machine. The node creation function creates the
smmuv3 node and also adds the iommu-map property to the generic
PCI host controller node.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 hw/arm/smmuv3.c     |   2 +
 hw/arm/sysbus-fdt.c | 110 ++++++++++++++++++++++++++++++++++++++++++++--------
 2 files changed, 96 insertions(+), 16 deletions(-)

diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index c43bd93..9c8640f 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -1121,6 +1121,8 @@ static void smmuv3_class_init(ObjectClass *klass, void *data)
     dc->reset   = smmu_reset;
     dc->vmsd    = &vmstate_smmuv3;
     dc->realize = smmu_realize;
+    /* Supported by TYPE_VIRT_MACHINE */
+    dc->user_creatable = true;
 }
 
 static void smmuv3_iommu_memory_region_class_init(ObjectClass *klass,
diff --git a/hw/arm/sysbus-fdt.c b/hw/arm/sysbus-fdt.c
index f8c4909..9bbfbde 100644
--- a/hw/arm/sysbus-fdt.c
+++ b/hw/arm/sysbus-fdt.c
@@ -36,6 +36,7 @@
 #include "hw/vfio/vfio-platform.h"
 #include "hw/vfio/vfio-calxeda-xgmac.h"
 #include "hw/vfio/vfio-amd-xgbe.h"
+#include "hw/arm/smmuv3.h"
 #include "hw/arm/virt.h"
 #include "hw/arm/fdt.h"
 
@@ -126,6 +127,31 @@ static HostProperty clock_copied_properties[] = {
     {"clock-output-names", true},
 };
 
+static char *fdt_get_node_path(void *fdt, int phandle)
+{
+    char *node_path = NULL;
+    int ret, node_offset, path_len = 16;
+
+    node_offset = fdt_node_offset_by_phandle(fdt, phandle);
+    if (node_offset <= 0) {
+        error_setg(&error_fatal,
+                   "not able to locate clock handle %d in device tree",
+                   phandle);
+    }
+
+    node_path = g_malloc(path_len);
+    while ((ret = fdt_get_path(fdt, node_offset, node_path, path_len))
+            == -FDT_ERR_NOSPACE) {
+        path_len += 16;
+        node_path = g_realloc(node_path, path_len);
+    }
+    if (ret < 0) {
+        g_free(node_path);
+        node_path = NULL;
+    }
+    return node_path;
+}
+
 /**
  * fdt_build_clock_node
  *
@@ -142,24 +168,12 @@ static void fdt_build_clock_node(void *host_fdt, void *guest_fdt,
                                 uint32_t host_phandle,
                                 uint32_t guest_phandle)
 {
-    char *node_path = NULL;
-    char *nodename;
+    char *node_path, *nodename;
     const void *r;
-    int ret, node_offset, prop_len, path_len = 16;
+    int prop_len;
 
-    node_offset = fdt_node_offset_by_phandle(host_fdt, host_phandle);
-    if (node_offset <= 0) {
-        error_setg(&error_fatal,
-                   "not able to locate clock handle %d in host device tree",
-                   host_phandle);
-    }
-    node_path = g_malloc(path_len);
-    while ((ret = fdt_get_path(host_fdt, node_offset, node_path, path_len))
-            == -FDT_ERR_NOSPACE) {
-        path_len += 16;
-        node_path = g_realloc(node_path, path_len);
-    }
-    if (ret < 0) {
+    node_path = fdt_get_node_path(host_fdt, host_phandle);
+    if (!node_path) {
         error_setg(&error_fatal,
                    "not able to retrieve node path for clock handle %d",
                    host_phandle);
@@ -416,6 +430,69 @@ static int add_amd_xgbe_fdt_node(SysBusDevice *sbdev, void *opaque)
     return 0;
 }
 
+static int add_smmuv3_fdt_node(SysBusDevice *sbdev, void *opaque)
+{
+    const char irq_names[] = "eventq\0priq\0cmdq-sync\0gerror";
+    const char compat[] = "arm,smmu-v3";
+    uint32_t reg_attr[2], irq_attr[12], smmu_phandle;
+    uint64_t mmio_base, irq_number;
+    PlatformBusFDTData *data = opaque;
+    const char *parent_node = data->pbus_node_name;
+    PlatformBusDevice *pbus = data->pbus;
+    VirtMachineState *vms = data->vms;
+    void *guest_fdt = data->fdt;
+    char *nodename, *node_path;
+    int i;
+
+    mmio_base = platform_bus_get_mmio_addr(pbus, sbdev, 0);
+    nodename = g_strdup_printf("%s/%s@%" PRIx64, parent_node,
+                               "smmuv3", mmio_base);
+    qemu_fdt_add_subnode(guest_fdt, nodename);
+
+    qemu_fdt_setprop(guest_fdt, nodename, "compatible", compat, sizeof(compat));
+
+    reg_attr[0] = cpu_to_be32(mmio_base);
+    reg_attr[1] = cpu_to_be32(0x20000);
+    qemu_fdt_setprop(guest_fdt, nodename, "reg",
+                     reg_attr, 2 * sizeof(uint32_t));
+
+    for (i = 0; i < 4; i++) {
+        irq_number = platform_bus_get_irqn(pbus, sbdev , i) + data->irq_start;
+        irq_attr[3 * i] = cpu_to_be32(GIC_FDT_IRQ_TYPE_SPI);
+        irq_attr[3 * i + 1] = cpu_to_be32(irq_number);
+        irq_attr[3 * i + 2] = cpu_to_be32(GIC_FDT_IRQ_FLAGS_EDGE_LO_HI);
+    }
+    qemu_fdt_setprop(guest_fdt, nodename, "interrupts",
+                     irq_attr, 4 * 3 * sizeof(uint32_t));
+    qemu_fdt_setprop(guest_fdt, nodename, "interrupt-names", irq_names,
+                     sizeof(irq_names));
+
+    qemu_fdt_setprop_cell(guest_fdt, nodename, "clocks", vms->clock_phandle);
+    qemu_fdt_setprop_string(guest_fdt, nodename, "clock-names", "apb_pclk");
+    qemu_fdt_setprop(guest_fdt, nodename, "dma-coherent", NULL, 0);
+
+    qemu_fdt_setprop_cell(guest_fdt, nodename, "#iommu-cells", 1);
+
+    smmu_phandle = qemu_fdt_alloc_phandle(vms->fdt);
+
+    qemu_fdt_setprop_cell(guest_fdt, nodename, "phandle", smmu_phandle);
+
+    node_path = fdt_get_node_path(guest_fdt, vms->pcihost_phandle);
+    if (!node_path) {
+        error_setg(&error_fatal,
+                   "not able to retrieve node path for pci ctlr phandle %d",
+                   vms->pcihost_phandle);
+    }
+
+    qemu_fdt_setprop_cells(guest_fdt, node_path, "iommu-map",
+                     0x0, smmu_phandle, 0x0, 0x10000);
+
+    g_free(nodename);
+    g_free(node_path);
+
+    return 0;
+}
+
 #endif /* CONFIG_LINUX */
 
 /* list of supported dynamic sysbus devices */
@@ -423,6 +500,7 @@ static const NodeCreationPair add_fdt_node_functions[] = {
 #ifdef CONFIG_LINUX
     {TYPE_VFIO_CALXEDA_XGMAC, add_calxeda_midway_xgmac_fdt_node},
     {TYPE_VFIO_AMD_XGBE, add_amd_xgbe_fdt_node},
+    {TYPE_SMMU_V3_DEV, add_smmuv3_fdt_node},
 #endif
     {"", NULL}, /* last element */
 };
-- 
2.5.5


* [Qemu-devel] [PATCH v7 18/20] hw/arm/virt-acpi-build: Add smmuv3 node in IORT table
  2017-09-01 17:21 [Qemu-devel] [PATCH v7 00/20] ARM SMMUv3 Emulation Support Eric Auger
                   ` (16 preceding siblings ...)
  2017-09-01 17:21 ` [Qemu-devel] [PATCH v7 17/20] hw/arm/sysbus-fdt: Allow smmuv3 dynamic instantiation Eric Auger
@ 2017-09-01 17:21 ` Eric Auger
  2017-09-01 17:21 ` [Qemu-devel] [PATCH v7 19/20] hw/arm/smmuv3: [not for upstream] add SMMU_CMD_TLBI_NH_VA_AM handling Eric Auger
                   ` (6 subsequent siblings)
  24 siblings, 0 replies; 72+ messages in thread
From: Eric Auger @ 2017-09-01 17:21 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, qemu-arm, qemu-devel,
	prem.mallappa, alex.williamson
  Cc: drjones, christoffer.dall, Radha.Chintakuntla, Sunil.Goutham,
	mohun106, tcain, bharat.bhushan, tn, mst, will.deacon,
	jean-philippe.brucker, robin.murphy, peterx, edgar.iglesias,
	wtownsen

From: Prem Mallappa <prem.mallappa@broadcom.com>

This patch builds the smmuv3 node in the ACPI IORT table. As
the smmu is dynamically instantiated using the platform bus,
the dt node creation function fills in the information used by
the IORT table generation function (base address, base irq,
type of the smmu).

The RID space of the root complex, which spans 0x0-0x10000,
maps to streamid space 0x0-0x10000 in smmuv3, which in turn
maps to deviceid space 0x0-0x10000 in the ITS group.

Signed-off-by: Prem Mallappa <prem.mallappa@broadcom.com>
Signed-off-by: Eric Auger <eric.auger@redhat.com>

---
v6 -> v7:
- adapt to the fact the smmuv3 now is dynamically instantiated.
- swap sync and gerror interrupts

v2 -> v3:
- integrate into the existing IORT table made up of ITS, RC nodes
- take into account vms->smmu
- match linux actbl2.h acpi_iort_smmu_v3 field names
---
 hw/arm/sysbus-fdt.c         |  5 ++++
 hw/arm/virt-acpi-build.c    | 58 +++++++++++++++++++++++++++++++++++++++------
 include/hw/acpi/acpi-defs.h | 15 ++++++++++++
 include/hw/arm/virt.h       | 13 ++++++++++
 4 files changed, 84 insertions(+), 7 deletions(-)

diff --git a/hw/arm/sysbus-fdt.c b/hw/arm/sysbus-fdt.c
index 9bbfbde..4583acf 100644
--- a/hw/arm/sysbus-fdt.c
+++ b/hw/arm/sysbus-fdt.c
@@ -487,6 +487,11 @@ static int add_smmuv3_fdt_node(SysBusDevice *sbdev, void *opaque)
     qemu_fdt_setprop_cells(guest_fdt, node_path, "iommu-map",
                      0x0, smmu_phandle, 0x0, 0x10000);
 
+    vms->smmu_info.type = VIRT_IOMMU_SMMUV3;
+    vms->smmu_info.reg.base = data->base + mmio_base;
+    vms->smmu_info.reg.size = 0x20000;
+    vms->smmu_info.irq_base = irq_number;
+
     g_free(nodename);
     g_free(node_path);
 
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 3d78ff6..8395898 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -43,6 +43,7 @@
 #include "hw/pci/pcie_host.h"
 #include "hw/pci/pci.h"
 #include "hw/arm/virt.h"
+#include "hw/arm/smmuv3.h"
 #include "sysemu/numa.h"
 #include "kvm_arm.h"
 
@@ -393,19 +394,26 @@ build_rsdp(GArray *rsdp_table, BIOSLinker *linker, unsigned xsdt_tbl_offset)
 }
 
 static void
-build_iort(GArray *table_data, BIOSLinker *linker)
+build_iort(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
 {
-    int iort_start = table_data->len;
+    int nb_nodes, iort_start = table_data->len;
     AcpiIortIdMapping *idmap;
     AcpiIortItsGroup *its;
     AcpiIortTable *iort;
-    size_t node_size, iort_length;
+    AcpiIortSmmu3 *smmu;
+    size_t node_size, iort_length, smmu_offset = 0;
     AcpiIortRC *rc;
 
     iort = acpi_data_push(table_data, sizeof(*iort));
 
+    if (vms->smmu_info.type) {
+        nb_nodes = 3; /* RC, ITS, SMMUv3 */
+    } else {
+        nb_nodes = 2; /* RC, ITS */
+    }
+
     iort_length = sizeof(*iort);
-    iort->node_count = cpu_to_le32(2); /* RC and ITS nodes */
+    iort->node_count = cpu_to_le32(nb_nodes);
     iort->node_offset = cpu_to_le32(sizeof(*iort));
 
     /* ITS group node */
@@ -418,6 +426,36 @@ build_iort(GArray *table_data, BIOSLinker *linker)
     its->its_count = cpu_to_le32(1);
     its->identifiers[0] = 0; /* MADT translation_id */
 
+    if (vms->smmu_info.type == VIRT_IOMMU_SMMUV3) {
+        int irq =  vms->smmu_info.irq_base;
+
+        /* SMMUv3 node */
+        smmu_offset = cpu_to_le32(iort->node_offset + node_size);
+        node_size = sizeof(*smmu) + sizeof(*idmap);
+        iort_length += node_size;
+        smmu = acpi_data_push(table_data, node_size);
+
+
+        smmu->type = ACPI_IORT_NODE_SMMU_V3;
+        smmu->length = cpu_to_le16(node_size);
+        smmu->mapping_count = cpu_to_le32(1);
+        smmu->mapping_offset = cpu_to_le32(sizeof(*smmu));
+        smmu->base_address = cpu_to_le64(vms->smmu_info.reg.base);
+        smmu->event_gsiv = cpu_to_le32(irq);
+        smmu->pri_gsiv = cpu_to_le32(irq + 1);
+        smmu->sync_gsiv = cpu_to_le32(irq + 2);
+        smmu->gerr_gsiv = cpu_to_le32(irq + 3);
+        smmu->flags = 0x1; /* COHACC Override */
+
+        /* Identity RID mapping covering the whole input RID range */
+        idmap = &smmu->id_mapping_array[0];
+        idmap->input_base = 0;
+        idmap->id_count = cpu_to_le32(0xFFFF);
+        idmap->output_base = 0;
+        /* output IORT node is the ITS group node (the first node) */
+        idmap->output_reference = cpu_to_le32(iort->node_offset);
+    }
+
     /* Root Complex Node */
     node_size = sizeof(*rc) + sizeof(*idmap);
     iort_length += node_size;
@@ -438,8 +476,14 @@ build_iort(GArray *table_data, BIOSLinker *linker)
     idmap->input_base = 0;
     idmap->id_count = cpu_to_le32(0xFFFF);
     idmap->output_base = 0;
-    /* output IORT node is the ITS group node (the first node) */
-    idmap->output_reference = cpu_to_le32(iort->node_offset);
+
+    if (vms->smmu_info.type) {
+        /* output IORT node is the smmuv3 node */
+        idmap->output_reference = cpu_to_le32(smmu_offset);
+    } else {
+        /* output IORT node is the ITS group node (the first node) */
+        idmap->output_reference = cpu_to_le32(iort->node_offset);
+    }
 
     iort->length = cpu_to_le32(iort_length);
 
@@ -782,7 +826,7 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
 
     if (its_class_name() && !vmc->no_its) {
         acpi_add_table(table_offsets, tables_blob);
-        build_iort(tables_blob, tables->linker);
+        build_iort(tables_blob, tables->linker, vms);
     }
 
     /* XSDT is pointed to by RSDP */
diff --git a/include/hw/acpi/acpi-defs.h b/include/hw/acpi/acpi-defs.h
index 72be675..69307b7 100644
--- a/include/hw/acpi/acpi-defs.h
+++ b/include/hw/acpi/acpi-defs.h
@@ -697,6 +697,21 @@ struct AcpiIortItsGroup {
 } QEMU_PACKED;
 typedef struct AcpiIortItsGroup AcpiIortItsGroup;
 
+struct AcpiIortSmmu3 {
+    ACPI_IORT_NODE_HEADER_DEF
+    uint64_t base_address;
+    uint32_t flags;
+    uint32_t reserved2;
+    uint64_t vatos_address;
+    uint32_t model;
+    uint32_t event_gsiv;
+    uint32_t pri_gsiv;
+    uint32_t gerr_gsiv;
+    uint32_t sync_gsiv;
+    AcpiIortIdMapping id_mapping_array[0];
+} QEMU_PACKED;
+typedef struct AcpiIortSmmu3 AcpiIortSmmu3;
+
 struct AcpiIortRC {
     ACPI_IORT_NODE_HEADER_DEF
     AcpiIortMemoryAccess memory_properties;
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index ae2bf2c..fd6f34f 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -87,6 +87,18 @@ typedef struct {
     bool claim_edge_triggered_timers;
 } VirtMachineClass;
 
+typedef enum VirtIOMMUType {
+    VIRT_IOMMU_NONE,
+    VIRT_IOMMU_SMMUV3,
+    VIRT_IOMMU_VIRTIO,
+} VirtIOMMUType;
+
+typedef struct VirtIOMMUInfo {
+    VirtIOMMUType type;
+    MemMapEntry reg;
+    int irq_base;
+} VirtIOMMUInfo;
+
 typedef struct {
     MachineState parent;
     Notifier machine_done;
@@ -95,6 +107,7 @@ typedef struct {
     bool highmem;
     bool its;
     bool virt;
+    VirtIOMMUInfo smmu_info;
     int32_t gic_version;
     struct arm_boot_info bootinfo;
     const MemMapEntry *memmap;
-- 
2.5.5


* [Qemu-devel] [PATCH v7 19/20] hw/arm/smmuv3: [not for upstream] add SMMU_CMD_TLBI_NH_VA_AM handling
  2017-09-01 17:21 [Qemu-devel] [PATCH v7 00/20] ARM SMMUv3 Emulation Support Eric Auger
                   ` (17 preceding siblings ...)
  2017-09-01 17:21 ` [Qemu-devel] [PATCH v7 18/20] hw/arm/virt-acpi-build: Add smmuv3 node in IORT table Eric Auger
@ 2017-09-01 17:21 ` Eric Auger
  2017-10-09 17:48   ` Peter Maydell
  2017-10-17 15:06   ` [Qemu-devel] [Qemu-arm] " Linu Cherian
  2017-09-01 17:21 ` [Qemu-devel] [PATCH v7 20/20] hw/arm/smmuv3: [not for upstream] Add caching-mode option Eric Auger
                   ` (5 subsequent siblings)
  24 siblings, 2 replies; 72+ messages in thread
From: Eric Auger @ 2017-09-01 17:21 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, qemu-arm, qemu-devel,
	prem.mallappa, alex.williamson
  Cc: drjones, christoffer.dall, Radha.Chintakuntla, Sunil.Goutham,
	mohun106, tcain, bharat.bhushan, tn, mst, will.deacon,
	jean-philippe.brucker, robin.murphy, peterx, edgar.iglesias,
	wtownsen

SMMUv3 does not support any IOVA range TLBI command:
SMMU_CMD_TLBI_NH_VA invalidates TLB entries page by page.
That's an issue when running DPDK on the guest. DPDK uses
hugepages, but each time a hugepage is mapped on the guest side,
a storm of SMMU_CMD_TLBI_NH_VA commands gets sent by the
guest smmuv3 driver and trapped by QEMU for VFIO replay.

Let's get prepared to handle an implementation defined command,
SMMU_CMD_TLBI_NH_VA_AM, which invalidates a range of IOVAs.

Upon this command, we notify the whole range in one shot.
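
For illustration, the field extraction done by the new handler can be
sketched as a standalone program. This is a minimal sketch, not the QEMU
code: extract_bits32(), Cmd and TlbiRange are hypothetical stand-ins
(QEMU uses extract32() and its own command type), and the field layout
simply mirrors the handler added by this patch.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for QEMU's extract32(): return `length` bits of
 * `value` starting at bit `start` (valid for 1 <= length <= 32).
 */
static inline uint32_t extract_bits32(uint32_t value, int start, int length)
{
    return (value >> start) & (~0U >> (32 - length));
}

/* Hypothetical 4-word command, mirroring cmd.word[] in the handler. */
typedef struct {
    uint32_t word[4];
} Cmd;

/* Decoded view of a range invalidation (names are illustrative). */
typedef struct {
    uint16_t asid;
    uint16_t am;     /* number of 4kB pages covered by the command */
    uint64_t addr;   /* page-aligned start IOVA */
    uint64_t size;   /* length of the range in bytes */
} TlbiRange;

static TlbiRange decode_tlbi_nh_va_am(const Cmd *cmd)
{
    TlbiRange r;
    uint64_t low = extract_bits32(cmd->word[2], 12, 20);
    uint64_t high = cmd->word[3];

    r.asid = (uint16_t)extract_bits32(cmd->word[1], 16, 16);
    r.am = (uint16_t)extract_bits32(cmd->word[1], 0, 16);
    r.addr = (high << 32) | (low << 12);
    /* Widen before shifting so large page counts do not overflow an int. */
    r.size = (uint64_t)r.am << 12;
    return r;
}
```

With am = 512 the decoded range is 2MB, so a single command covers what
would otherwise take 512 per-page invalidations.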

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 hw/arm/smmuv3-internal.h |  1 +
 hw/arm/smmuv3.c          | 13 +++++++++++++
 hw/arm/trace-events      |  1 +
 3 files changed, 15 insertions(+)

diff --git a/hw/arm/smmuv3-internal.h b/hw/arm/smmuv3-internal.h
index f9f95ae..e70cf76 100644
--- a/hw/arm/smmuv3-internal.h
+++ b/hw/arm/smmuv3-internal.h
@@ -289,6 +289,7 @@ enum {
     SMMU_CMD_RESUME          = 0x44,
     SMMU_CMD_STALL_TERM,
     SMMU_CMD_SYNC,          /* 0x46 */
+    SMMU_CMD_TLBI_NH_VA_AM   = 0x8F, /* VIOMMU Impl Defined */
 };
 
 static const char *cmd_stringify[] = {
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index 9c8640f..55dc80b 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -880,6 +880,19 @@ static int smmuv3_cmdq_consume(SMMUV3State *s)
             smmuv3_replay_iova_range(&s->smmu_state, addr, size);
             break;
         }
+        case SMMU_CMD_TLBI_NH_VA_AM:
+        {
+            int asid = extract32(cmd.word[1], 16, 16);
+            int am = extract32(cmd.word[1], 0, 16);
+            uint64_t low = extract32(cmd.word[2], 12, 20);
+            uint64_t high = cmd.word[3];
+            uint64_t addr = high << 32 | (low << 12);
+            size_t size = am << 12;
+
+            trace_smmuv3_cmdq_tlbi_nh_va_am(asid, am, addr, size);
+            smmuv3_replay_iova_range(&s->smmu_state, addr, size);
+            break;
+        }
         case SMMU_CMD_TLBI_NH_VAA:
         case SMMU_CMD_TLBI_EL3_ALL:
         case SMMU_CMD_TLBI_EL3_VA:
diff --git a/hw/arm/trace-events b/hw/arm/trace-events
index 15f84d6..fba33ac 100644
--- a/hw/arm/trace-events
+++ b/hw/arm/trace-events
@@ -26,6 +26,7 @@ smmuv3_cmdq_opcode(const char *opcode) "<--- %s"
 smmuv3_cmdq_cfgi_ste(int streamid) "     |_ streamid =%d"
 smmuv3_cmdq_cfgi_ste_range(int start, int end) "     |_ start=0x%d - end=0x%d"
 smmuv3_cmdq_tlbi_nh_va(int asid, int vmid, uint64_t addr) "     |_ asid =%d vmid =%d addr=0x%"PRIx64
+smmuv3_cmdq_tlbi_nh_va_am(int asid, int am, uint64_t addr, size_t size) "     |_ asid =%d am =%d addr=0x%"PRIx64" size=0x%zx"
 smmuv3_cmdq_consume_out(uint8_t prod_wrap, uint32_t prod, uint8_t cons_wrap, uint32_t cons) "prod_wrap:%d, prod:0x%x cons_wrap:%d cons:0x%x"
 smmuv3_update(bool is_empty, uint32_t prod, uint32_t cons, uint8_t prod_wrap, uint8_t cons_wrap) "q empty:%d prod:%d cons:%d p.wrap:%d p.cons:%d"
 smmuv3_update_check_cmd(int error) "cmdq not enabled or error :0x%x"
-- 
2.5.5


* [Qemu-devel] [PATCH v7 20/20] hw/arm/smmuv3: [not for upstream] Add caching-mode option
  2017-09-01 17:21 [Qemu-devel] [PATCH v7 00/20] ARM SMMUv3 Emulation Support Eric Auger
                   ` (18 preceding siblings ...)
  2017-09-01 17:21 ` [Qemu-devel] [PATCH v7 19/20] hw/arm/smmuv3: [not for upstream] add SMMU_CMD_TLBI_NH_VA_AM handling Eric Auger
@ 2017-09-01 17:21 ` Eric Auger
  2017-10-09 17:49   ` Peter Maydell
  2017-09-07 12:39 ` [Qemu-devel] [PATCH v7 00/20] ARM SMMUv3 Emulation Support Peter Maydell
                   ` (4 subsequent siblings)
  24 siblings, 1 reply; 72+ messages in thread
From: Eric Auger @ 2017-09-01 17:21 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, peter.maydell, qemu-arm, qemu-devel,
	prem.mallappa, alex.williamson
  Cc: drjones, christoffer.dall, Radha.Chintakuntla, Sunil.Goutham,
	mohun106, tcain, bharat.bhushan, tn, mst, will.deacon,
	jean-philippe.brucker, robin.murphy, peterx, edgar.iglesias,
	wtownsen

In VFIO use cases, the virtual SMMU translates IOVA -> IPA (stage 1)
whereas the physical SMMU translates IPA -> host PA (stage 2).

The 2 stages of the physical SMMU are currently not used.
Instead, both stage 1 and stage 2 mappings are combined
and programmed in a single stage (S1) in the physical SMMU.

The drawback of this approach is that each time the IOVA -> IPA
mapping is changed by the guest, the host must be notified to
re-program the physical SMMU with the combined stages.
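
The combined-stage scheme can be pictured with the following sketch. It
is an illustration only, assuming flat arrays of 4kB page mappings;
PageMap, lookup_page() and combine_stages() are hypothetical names and
are not part of QEMU or VFIO.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_MASK  ((1ULL << SKETCH_PAGE_SHIFT) - 1)

/* One 4kB page mapping: page-aligned input address -> output address. */
typedef struct {
    uint64_t in_page;
    uint64_t out_page;
} PageMap;

/* Translate `in` through `tbl`; returns false when the page is unmapped. */
static bool lookup_page(const PageMap *tbl, size_t n, uint64_t in,
                        uint64_t *out)
{
    for (size_t i = 0; i < n; i++) {
        if (tbl[i].in_page == (in & ~SKETCH_PAGE_MASK)) {
            *out = tbl[i].out_page | (in & SKETCH_PAGE_MASK);
            return true;
        }
    }
    return false;
}

/*
 * Compose guest stage 1 (IOVA -> IPA) with host stage 2 (IPA -> PA) into
 * a single IOVA -> PA table, as would be programmed into one physical
 * stage. `out` must have room for n1 entries. Returns the entry count.
 */
static size_t combine_stages(const PageMap *s1, size_t n1,
                             const PageMap *s2, size_t n2, PageMap *out)
{
    size_t n = 0;

    for (size_t i = 0; i < n1; i++) {
        uint64_t pa;

        /* Skip stage 1 entries whose IPA has no stage 2 mapping. */
        if (lookup_page(s2, n2, s1[i].out_page, &pa)) {
            out[n].in_page = s1[i].in_page;
            out[n].out_page = pa;
            n++;
        }
    }
    return n;
}
```

Any guest change to a stage 1 entry makes the combined table stale, which
is why the host has to be told about every such update.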

So we need to trap into the QEMU device each time the guest alters
the configuration or TLB data. Unfortunately, unlike the Intel
IOMMU, the SMMU does not expose any caching mode. On Intel, this
caching mode HW bit informs the OS that each time it updates the
remapping structures (even on map) it must invalidate the caches.
Those invalidate commands are used to notify the host that it must
recompute S1+S2 mappings and reprogram the HW.

As we don't have the HW bit on ARM, we currently rely on a
FW quirk on the guest smmuv3 driver side. When this FW quirk is
applied, the driver performs TLB invalidations on map and
sends SMMU_CMD_TLBI_NH_VA_AM commands.

Those TLB invalidations are used to trap changes in the
translation tables.

We introduced a new implementation defined SMMU_CMD_TLBI_NH_VA_AM
command since it allows invalidating a whole range instead
of a single page (native SMMU_CMD_TLBI_NH_VA command).
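
A rough count shows why a range command helps. The sketch below assumes
a 4kB granule and a 16-bit page count per range command (as the handler
in patch 19 decodes it). The helper names are hypothetical, and the
hugepage sizes in the usage note are assumptions, not values taken from
this series.

```c
#include <stdint.h>

/* Number of per-page SMMU_CMD_TLBI_NH_VA commands needed for a range. */
static uint64_t tlbi_per_page_cmds(uint64_t range_size, unsigned page_shift)
{
    return (range_size + (1ULL << page_shift) - 1) >> page_shift;
}

/*
 * Number of range commands needed for the same range, assuming each
 * command carries at most 0xFFFF pages (16-bit page count field).
 */
static uint64_t tlbi_range_cmds(uint64_t range_size, unsigned page_shift)
{
    const uint64_t max_pages = 0xFFFF;
    uint64_t pages = tlbi_per_page_cmds(range_size, page_shift);

    return (pages + max_pages - 1) / max_pages;
}
```

For an assumed 2MB hugepage this gives 512 per-page commands against a
single range command; an assumed 1GB hugepage would need 262144 per-page
commands against 5 range commands.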

As a consequence, anybody wanting to use the virtual smmuv3 in a VFIO
use case must add
-device smmuv3,caching-mode
to the QEMU command line.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 hw/arm/smmuv3.c          |  7 +++++++
 hw/arm/sysbus-fdt.c      | 11 ++++++++++-
 hw/arm/virt-acpi-build.c |  7 ++++++-
 include/hw/arm/smmuv3.h  |  1 +
 include/hw/arm/virt.h    |  1 +
 5 files changed, 25 insertions(+), 2 deletions(-)

diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index 55dc80b..bb35e50 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -1122,6 +1122,12 @@ static const VMStateDescription vmstate_smmuv3 = {
     },
 };
 
+static Property smmuv3_dev_properties[] = {
+    DEFINE_PROP_BOOL("caching-mode", SMMUV3State, cm, false),
+    DEFINE_PROP_END_OF_LIST(),
+};
+
+
 static void smmuv3_instance_init(Object *obj)
 {
     /* Nothing much to do here as of now */
@@ -1131,6 +1137,7 @@ static void smmuv3_class_init(ObjectClass *klass, void *data)
 {
     DeviceClass *dc = DEVICE_CLASS(klass);
 
+    dc->props   = smmuv3_dev_properties;
     dc->reset   = smmu_reset;
     dc->vmsd    = &vmstate_smmuv3;
     dc->realize = smmu_realize;
diff --git a/hw/arm/sysbus-fdt.c b/hw/arm/sysbus-fdt.c
index 4583acf..cfb40a4 100644
--- a/hw/arm/sysbus-fdt.c
+++ b/hw/arm/sysbus-fdt.c
@@ -440,6 +440,7 @@ static int add_smmuv3_fdt_node(SysBusDevice *sbdev, void *opaque)
     const char *parent_node = data->pbus_node_name;
     PlatformBusDevice *pbus = data->pbus;
     VirtMachineState *vms = data->vms;
+    SMMUV3State *smmu = SMMU_V3_DEV(sbdev);
     void *guest_fdt = data->fdt;
     char *nodename, *node_path;
     int i;
@@ -471,6 +472,10 @@ static int add_smmuv3_fdt_node(SysBusDevice *sbdev, void *opaque)
     qemu_fdt_setprop_string(guest_fdt, nodename, "clock-names", "apb_pclk");
     qemu_fdt_setprop(guest_fdt, nodename, "dma-coherent", NULL, 0);
 
+    if (smmu->cm) {
+        qemu_fdt_setprop(guest_fdt, nodename, "tlbi-on-map", NULL, 0);
+    }
+
     qemu_fdt_setprop_cell(guest_fdt, nodename, "#iommu-cells", 1);
 
     smmu_phandle = qemu_fdt_alloc_phandle(vms->fdt);
@@ -487,7 +492,11 @@ static int add_smmuv3_fdt_node(SysBusDevice *sbdev, void *opaque)
     qemu_fdt_setprop_cells(guest_fdt, node_path, "iommu-map",
                      0x0, smmu_phandle, 0x0, 0x10000);
 
-    vms->smmu_info.type = VIRT_IOMMU_SMMUV3;
+    if (smmu->cm) {
+        vms->smmu_info.type = VIRT_IOMMU_SMMUV3_CACHING_MODE;
+    } else {
+        vms->smmu_info.type = VIRT_IOMMU_SMMUV3;
+    }
     vms->smmu_info.reg.base = data->base + mmio_base;
     vms->smmu_info.reg.size = 0x20000;
     vms->smmu_info.irq_base = irq_number;
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 8395898..cc10e79 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -426,7 +426,8 @@ build_iort(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
     its->its_count = cpu_to_le32(1);
     its->identifiers[0] = 0; /* MADT translation_id */
 
-    if (vms->smmu_info.type == VIRT_IOMMU_SMMUV3) {
+    if (vms->smmu_info.type == VIRT_IOMMU_SMMUV3 ||
+        vms->smmu_info.type == VIRT_IOMMU_SMMUV3_CACHING_MODE) {
         int irq =  vms->smmu_info.irq_base;
 
         /* SMMUv3 node */
@@ -438,6 +439,10 @@ build_iort(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
 
         smmu->type = ACPI_IORT_NODE_SMMU_V3;
         smmu->length = cpu_to_le16(node_size);
+
+        if (vms->smmu_info.type == VIRT_IOMMU_SMMUV3_CACHING_MODE) {
+            smmu->model = cpu_to_le32(0x3); /* ACPI_IORT_SMMU_V3_CACHING_MODE */
+        }
         smmu->mapping_count = cpu_to_le32(1);
         smmu->mapping_offset = cpu_to_le32(sizeof(*smmu));
         smmu->base_address = cpu_to_le64(vms->smmu_info.reg.base);
diff --git a/include/hw/arm/smmuv3.h b/include/hw/arm/smmuv3.h
index 0c8973d..16ae5c6 100644
--- a/include/hw/arm/smmuv3.h
+++ b/include/hw/arm/smmuv3.h
@@ -58,6 +58,7 @@ typedef struct SMMUV3State {
     qemu_irq     irq[4];
     SMMUQueue    cmdq, evtq;
 
+    bool cm; /* caching mode */
 } SMMUV3State;
 
 typedef enum {
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index fd6f34f..7669a7c 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -90,6 +90,7 @@ typedef struct {
 typedef enum VirtIOMMUType {
     VIRT_IOMMU_NONE,
     VIRT_IOMMU_SMMUV3,
+    VIRT_IOMMU_SMMUV3_CACHING_MODE,
     VIRT_IOMMU_VIRTIO,
 } VirtIOMMUType;
 
-- 
2.5.5


* Re: [Qemu-devel] [PATCH v7 00/20] ARM SMMUv3 Emulation Support
  2017-09-01 17:21 [Qemu-devel] [PATCH v7 00/20] ARM SMMUv3 Emulation Support Eric Auger
                   ` (19 preceding siblings ...)
  2017-09-01 17:21 ` [Qemu-devel] [PATCH v7 20/20] hw/arm/smmuv3: [not for upstream] Add caching-mode option Eric Auger
@ 2017-09-07 12:39 ` Peter Maydell
  2017-09-08  8:35   ` Auger Eric
  2017-09-08  5:47 ` Michael S. Tsirkin
                   ` (3 subsequent siblings)
  24 siblings, 1 reply; 72+ messages in thread
From: Peter Maydell @ 2017-09-07 12:39 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, qemu-arm, QEMU Developers, Prem Mallappa,
	Alex Williamson, Andrew Jones, Christoffer Dall,
	Radha.Chintakuntla, Sunil.Goutham, Radha Mohan, Trey Cain,
	Bharat Bhushan, Tomasz Nowicki, Michael S. Tsirkin, Will Deacon,
	jean-philippe.brucker, robin.murphy, Peter Xu, Edgar E. Iglesias,
	wtownsen

On 1 September 2017 at 18:21, Eric Auger <eric.auger@redhat.com> wrote:
> This series implements the emulation code for ARM SMMUv3.
>
> Changes since v6:
> - DPDK testpmd now running on guest with 2 assigned VFs
> - Changed the instantiation method: add the following option to
>   the QEMU command line
>   -device smmuv3 # for virtio/vhost use cases
>   -device smmuv3,caching-mode # for vfio use cases (based on [1])
> - splitted the series into smaller patches to allow the review
> - the VFIO integration based on "tlbi-on-map" smmuv3 driver
>   is isolated from the rest: last 2 patches, not for upstream.
>   This is shipped for testing/bench until a better solution is found.
> - Reworked permission flag checks and event generation

Hi Eric -- I see you've upgraded this from an RFC to a PATCH set.
Do you want the patches reviewed and (eventually) taken into git
now?

thanks
-- PMM


* Re: [Qemu-devel] [PATCH v7 00/20] ARM SMMUv3 Emulation Support
  2017-09-01 17:21 [Qemu-devel] [PATCH v7 00/20] ARM SMMUv3 Emulation Support Eric Auger
                   ` (20 preceding siblings ...)
  2017-09-07 12:39 ` [Qemu-devel] [PATCH v7 00/20] ARM SMMUv3 Emulation Support Peter Maydell
@ 2017-09-08  5:47 ` Michael S. Tsirkin
  2017-09-08  8:36   ` Auger Eric
  2017-09-12  6:18 ` [Qemu-devel] [Qemu-arm] " Linu Cherian
                   ` (2 subsequent siblings)
  24 siblings, 1 reply; 72+ messages in thread
From: Michael S. Tsirkin @ 2017-09-08  5:47 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, peter.maydell, qemu-arm, qemu-devel,
	prem.mallappa, alex.williamson, drjones, christoffer.dall,
	Radha.Chintakuntla, Sunil.Goutham, mohun106, tcain,
	bharat.bhushan, tn, will.deacon, jean-philippe.brucker,
	robin.murphy, peterx, edgar.iglesias, wtownsen

On Fri, Sep 01, 2017 at 07:21:03PM +0200, Eric Auger wrote:
> This series implements the emulation code for ARM SMMUv3.

Can you add some code to block using vfio with this
until patches 19+20 are ready?
Then 1-18 could be applied.

> Changes since v6:
> - DPDK testpmd now running on guest with 2 assigned VFs
> - Changed the instantiation method: add the following option to
>   the QEMU command line
>   -device smmuv3 # for virtio/vhost use cases
>   -device smmuv3,caching-mode # for vfio use cases (based on [1])
> - splitted the series into smaller patches to allow the review
> - the VFIO integration based on "tlbi-on-map" smmuv3 driver
>   is isolated from the rest: last 2 patches, not for upstream.
>   This is shipped for testing/bench until a better solution is found.
> - Reworked permission flag checks and event generation
> 
> testing:
> - in dt and ACPI modes
> - virtio-net-pci and vhost-net devices using dma ops with various
>   guest page sizes [2]
> - assigned VFs using dma ops [3]:
>   - AMD Overdrive and igbvf passthrough (using gsi direct mapping)
>   - Cavium ThunderX and ixgbevf passthrough (using KVM MSI routing)
> - DPDK testpmd on guest running with VFIO user space drivers (2 igbvf) [3]
>   with guest and host page size equal (4kB)
> 
> Known limitations:
> - no VMSAv8-32 suport
> - no nested stage support (S1 + S2)
> - no support for HYP mappings
> - register fine emulation, commands, interrupts and errors were
>   not accurately tested. Handling is sufficient to run use cases
>   described above though.
> - interrupts and event generation not observed yet.
> 
> Best Regards
> 
> Eric
> 
> This series can be found at:
> v7: https://github.com/eauger/qemu/tree/v2.10.0-SMMU-v7
> Previous version at:
> v6: https://github.com/eauger/qemu/tree/v2.10.0-rc2-SMMU-v6
> 
> References:
> [1] [RFC v2 0/4] arm-smmu-v3 tlbi-on-map option
>     https://lkml.org/lkml/2017/8/11/426
> 
> [2] qemu cmd line excerpt:
> -device smmuv3 \
> -netdev tap,id=tap0,script=no,downscript=no,ifname=tap0,vhost=off \
> -device virtio-net-pci,netdev=tap0,mac=6a:f5:10:b1:3d:d2,iommu_platform,disable-modern=off,disable-legacy=on \
> [3] use -device smmuv3,caching-mode
> 
> 
> History:
> v6 -> v7:
> - see above
> 
> v5 -> v6:
> - Rebase on 2.10 and IOMMUMemoryRegion
> - add ACPI TLBI_ON_MAP support (VFIO integration also works in
>   ACPI mode)
> - fix block replay
> - handle implementation defined SMMU_CMD_TLBI_NH_VA_AM cmd
>   (goes along with TLBI_ON_MAP FW quirk)
> - replay systematically unmap the whole range first
> - smmuv3_map_hook does not unmap anymore and the unmap is done
>   before the replay
> - add and use smmuv3_context_device_invalidate instead of
>   blindly replaying everything
> 
> v4 -> v5:
> - initial_level now part of SMMUTransCfg
> - smmu_page_walk_64 takes into account the max input size
> - implement sys->iommu_ops.replay and sys->iommu_ops.notify_flag_changed
> - smmuv3_translate: bug fix: don't walk on bypass
> - smmu_update_qreg: fix PROD index update
> - I did not yet address Peter's comments as the code is not mature enough
>   to be split into sub patches.
> 
> v3 -> v4 [Eric]:
> - page table walk rewritten to allow scan of the page table within a
>   range of IOVA. This prepares for VFIO integration and replay.
> - configuration parsing partially reworked.
> - do not advertise unsupported/untested features: S2, S1 + S2, HYP,
>   PRI, ATS, ..
> - added ACPI table generation
> - migrated to dynamic traces
> - mingw compilation fix
> 
> v2 -> v3 [Eric]:
> - rebased on 2.9
> - mostly code and patch reorganization to ease the review process
> - optional patches removed. They may be handled separately. I am currently
>   working on ACPI enablement.
> - optional instantiation of the smmu in mach-virt
> - removed [2/9] (fdt functions) since not mandated
> - start splitting main patch into base and derived object
> - no new function feature added
> 
> v1 -> v2 [Prem]:
> - Adopted review comments from Eric Auger
>         - Make SMMU_DPRINTF to internally call qemu_log
>             (since translation requests are too many, we need control
>              on the type of log we want)
>         - SMMUTransCfg modified to suite simplicity
>         - Change RegInfo to uint64 register array
>         - Code cleanup
>         - Test cleanups
> - Reshuffled patches
> 
> v0 -> v1 [Prem]:
> - As per SMMUv3 spec 16.0 (only is_ste_consistant() is noticeable)
> - Reworked register access/update logic
> - Factored out translation code for
>         - single point bug fix
>         - sharing/removal in future
> - (optional) Unit tests added, with PCI test device
>         - S1 with 4k/64k, S1+S2 with 4k/64k
>         - (S1 or S2) only can be verified by Linux 4.7 driver
>         - (optional) Priliminary ACPI support
> 
> v0 [Prem]:
> - Implements SMMUv3 spec 11.0
> - Supported for PCIe devices,
> - Command Queue and Event Queue supported
> - LPAE only, S1 is supported and Tested, S2 not tested
> - BE mode Translation not supported
> - IRQ support (legacy, no MSI)
> 
> Eric Auger (18):
>   hw/arm/smmu-common: smmu base device and datatypes
>   hw/arm/smmu-common: IOMMU memory region and address space setup
>   hw/arm/smmu-common: smmu_read/write_sysmem
>   hw/arm/smmu-common: VMSAv8-64 page table walk
>   hw/arm/smmuv3: Wired IRQ and GERROR helpers
>   hw/arm/smmuv3: Queue helpers
>   hw/arm/smmuv3: Implement MMIO write operations
>   hw/arm/smmuv3: Event queue recording helper
>   hw/arm/smmuv3: Implement translate callback
>   target/arm/kvm: Translate the MSI doorbell in kvm_arch_fixup_msi_route
>   hw/arm/smmuv3: Implement data structure and TLB invalidation
>     notifications
>   hw/arm/smmuv3: Implement IOMMU memory region replay callback
>   hw/arm/virt: Store the PCI host controller dt phandle
>   hw/arm/sysbus-fdt: Pass the VirtMachineState to the node creation
>     functions
>   hw/arm/sysbus-fdt: Pass the platform bus base address in
>     PlatformBusFDTData
>   hw/arm/sysbus-fdt: Allow smmuv3 dynamic instantiation
>   hw/arm/smmuv3: [not for upstream] add SMMU_CMD_TLBI_NH_VA_AM handling
>   hw/arm/smmuv3: [not for upstream] Add caching-mode option
> 
> Prem Mallappa (2):
>   hw/arm/smmuv3: Skeleton
>   hw/arm/virt-acpi-build: Add smmuv3 node in IORT table
> 
>  default-configs/aarch64-softmmu.mak |    1 +
>  hw/arm/Makefile.objs                |    1 +
>  hw/arm/smmu-common.c                |  527 ++++++++++++++++
>  hw/arm/smmu-internal.h              |  105 ++++
>  hw/arm/smmuv3-internal.h            |  584 +++++++++++++++++
>  hw/arm/smmuv3.c                     | 1181 +++++++++++++++++++++++++++++++++++
>  hw/arm/sysbus-fdt.c                 |  129 +++-
>  hw/arm/trace-events                 |   48 ++
>  hw/arm/virt-acpi-build.c            |   63 +-
>  hw/arm/virt.c                       |    6 +-
>  include/hw/acpi/acpi-defs.h         |   15 +
>  include/hw/arm/smmu-common.h        |  123 ++++
>  include/hw/arm/smmuv3.h             |   80 +++
>  include/hw/arm/sysbus-fdt.h         |    2 +
>  include/hw/arm/virt.h               |   15 +
>  target/arm/kvm.c                    |   27 +
>  target/arm/trace-events             |    3 +
>  17 files changed, 2886 insertions(+), 24 deletions(-)
>  create mode 100644 hw/arm/smmu-common.c
>  create mode 100644 hw/arm/smmu-internal.h
>  create mode 100644 hw/arm/smmuv3-internal.h
>  create mode 100644 hw/arm/smmuv3.c
>  create mode 100644 include/hw/arm/smmu-common.h
>  create mode 100644 include/hw/arm/smmuv3.h
> 
> -- 
> 2.5.5


* Re: [Qemu-devel] [PATCH v7 00/20] ARM SMMUv3 Emulation Support
  2017-09-07 12:39 ` [Qemu-devel] [PATCH v7 00/20] ARM SMMUv3 Emulation Support Peter Maydell
@ 2017-09-08  8:35   ` Auger Eric
  0 siblings, 0 replies; 72+ messages in thread
From: Auger Eric @ 2017-09-08  8:35 UTC (permalink / raw)
  To: Peter Maydell
  Cc: eric.auger.pro, qemu-arm, QEMU Developers, Prem Mallappa,
	Alex Williamson, Andrew Jones, Christoffer Dall,
	Radha.Chintakuntla, Sunil.Goutham, Radha Mohan, Trey Cain,
	Bharat Bhushan, Tomasz Nowicki, Michael S. Tsirkin, Will Deacon,
	jean-philippe.brucker, robin.murphy, Peter Xu, Edgar E. Iglesias,
	wtownsen

Hi Peter,

On 07/09/2017 14:39, Peter Maydell wrote:
> On 1 September 2017 at 18:21, Eric Auger <eric.auger@redhat.com> wrote:
>> This series implements the emulation code for ARM SMMUv3.
>>
>> Changes since v6:
>> - DPDK testpmd now running on guest with 2 assigned VFs
>> - Changed the instantiation method: add the following option to
>>   the QEMU command line
>>   -device smmuv3 # for virtio/vhost use cases
>>   -device smmuv3,caching-mode # for vfio use cases (based on [1])
>> - splitted the series into smaller patches to allow the review
>> - the VFIO integration based on "tlbi-on-map" smmuv3 driver
>>   is isolated from the rest: last 2 patches, not for upstream.
>>   This is shipped for testing/bench until a better solution is found.
>> - Reworked permission flag checks and event generation
> 
> Hi Eric -- I see you've upgraded this from an RFC to a PATCH set.
> Do you want the patches reviewed and (eventually) taken into git
> now?

Yes. I split the series to make it more reviewable and, from a functional
point of view, I have run all major use cases. So now I would encourage
people to start reviewing the series (focusing on patches 1-18).

Thanks

Eric
> 
> thanks
> -- PMM
> 


* Re: [Qemu-devel] [PATCH v7 00/20] ARM SMMUv3 Emulation Support
  2017-09-08  5:47 ` Michael S. Tsirkin
@ 2017-09-08  8:36   ` Auger Eric
  0 siblings, 0 replies; 72+ messages in thread
From: Auger Eric @ 2017-09-08  8:36 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: peter.maydell, drjones, tcain, Radha.Chintakuntla, Sunil.Goutham,
	mohun106, jean-philippe.brucker, tn, will.deacon, qemu-devel,
	peterx, alex.williamson, qemu-arm, christoffer.dall,
	edgar.iglesias, robin.murphy, wtownsen, bharat.bhushan,
	prem.mallappa, eric.auger.pro

Hi Michael,

On 08/09/2017 07:47, Michael S. Tsirkin wrote:
> On Fri, Sep 01, 2017 at 07:21:03PM +0200, Eric Auger wrote:
>> This series implements the emulation code for ARM SMMUv3.
> 
> Can you add some code to block using vfio with this
> until patches 19+20 are ready?
Sure.

Thanks

Eric
> Then 1-18 could be applied.
> 
>> Changes since v6:
>> - DPDK testpmd now running on guest with 2 assigned VFs
>> - Changed the instantiation method: add the following option to
>>   the QEMU command line
>>   -device smmuv3 # for virtio/vhost use cases
>>   -device smmuv3,caching-mode # for vfio use cases (based on [1])
>> - splitted the series into smaller patches to allow the review
>> - the VFIO integration based on "tlbi-on-map" smmuv3 driver
>>   is isolated from the rest: last 2 patches, not for upstream.
>>   This is shipped for testing/bench until a better solution is found.
>> - Reworked permission flag checks and event generation
>>
>> testing:
>> - in dt and ACPI modes
>> - virtio-net-pci and vhost-net devices using dma ops with various
>>   guest page sizes [2]
>> - assigned VFs using dma ops [3]:
>>   - AMD Overdrive and igbvf passthrough (using gsi direct mapping)
>>   - Cavium ThunderX and ixgbevf passthrough (using KVM MSI routing)
>> - DPDK testpmd on guest running with VFIO user space drivers (2 igbvf) [3]
>>   with guest and host page size equal (4kB)
>>
>> Known limitations:
>> - no VMSAv8-32 suport
>> - no nested stage support (S1 + S2)
>> - no support for HYP mappings
>> - register fine emulation, commands, interrupts and errors were
>>   not accurately tested. Handling is sufficient to run use cases
>>   described above though.
>> - interrupts and event generation not observed yet.
>>
>> Best Regards
>>
>> Eric
>>
>> This series can be found at:
>> v7: https://github.com/eauger/qemu/tree/v2.10.0-SMMU-v7
>> Previous version at:
>> v6: https://github.com/eauger/qemu/tree/v2.10.0-rc2-SMMU-v6
>>
>> References:
>> [1] [RFC v2 0/4] arm-smmu-v3 tlbi-on-map option
>>     https://lkml.org/lkml/2017/8/11/426
>>
>> [2] qemu cmd line excerpt:
>> -device smmuv3 \
>> -netdev tap,id=tap0,script=no,downscript=no,ifname=tap0,vhost=off \
>> -device virtio-net-pci,netdev=tap0,mac=6a:f5:10:b1:3d:d2,iommu_platform,disable-modern=off,disable-legacy=on \
>> [3] use -device smmuv3,caching-mode
>>
>>
>> History:
>> v6 -> v7:
>> - see above
>>
>> v5 -> v6:
>> - Rebase on 2.10 and IOMMUMemoryRegion
>> - add ACPI TLBI_ON_MAP support (VFIO integration also works in
>>   ACPI mode)
>> - fix block replay
>> - handle implementation defined SMMU_CMD_TLBI_NH_VA_AM cmd
>>   (goes along with TLBI_ON_MAP FW quirk)
>> - replay systematically unmap the whole range first
>> - smmuv3_map_hook does not unmap anymore and the unmap is done
>>   before the replay
>> - add and use smmuv3_context_device_invalidate instead of
>>   blindly replaying everything
>>
>> v4 -> v5:
>> - initial_level now part of SMMUTransCfg
>> - smmu_page_walk_64 takes into account the max input size
>> - implement sys->iommu_ops.replay and sys->iommu_ops.notify_flag_changed
>> - smmuv3_translate: bug fix: don't walk on bypass
>> - smmu_update_qreg: fix PROD index update
>> - I did not yet address Peter's comments as the code is not mature enough
>>   to be split into sub patches.
>>
>> v3 -> v4 [Eric]:
>> - page table walk rewritten to allow scan of the page table within a
>>   range of IOVA. This prepares for VFIO integration and replay.
>> - configuration parsing partially reworked.
>> - do not advertise unsupported/untested features: S2, S1 + S2, HYP,
>>   PRI, ATS, ..
>> - added ACPI table generation
>> - migrated to dynamic traces
>> - mingw compilation fix
>>
>> v2 -> v3 [Eric]:
>> - rebased on 2.9
>> - mostly code and patch reorganization to ease the review process
>> - optional patches removed. They may be handled separately. I am currently
>>   working on ACPI enablement.
>> - optional instantiation of the smmu in mach-virt
>> - removed [2/9] (fdt functions) since not mandated
>> - start splitting main patch into base and derived object
>> - no new function feature added
>>
>> v1 -> v2 [Prem]:
>> - Adopted review comments from Eric Auger
>>         - Make SMMU_DPRINTF to internally call qemu_log
>>             (since translation requests are too many, we need control
>>              on the type of log we want)
>>         - SMMUTransCfg modified to suite simplicity
>>         - Change RegInfo to uint64 register array
>>         - Code cleanup
>>         - Test cleanups
>> - Reshuffled patches
>>
>> v0 -> v1 [Prem]:
>> - As per SMMUv3 spec 16.0 (only is_ste_consistant() is noticeable)
>> - Reworked register access/update logic
>> - Factored out translation code for
>>         - single point bug fix
>>         - sharing/removal in future
>> - (optional) Unit tests added, with PCI test device
>>         - S1 with 4k/64k, S1+S2 with 4k/64k
>>         - (S1 or S2) only can be verified by Linux 4.7 driver
>>         - (optional) Priliminary ACPI support
>>
>> v0 [Prem]:
>> - Implements SMMUv3 spec 11.0
>> - Supported for PCIe devices,
>> - Command Queue and Event Queue supported
>> - LPAE only, S1 is supported and Tested, S2 not tested
>> - BE mode Translation not supported
>> - IRQ support (legacy, no MSI)
>>
>> Eric Auger (18):
>>   hw/arm/smmu-common: smmu base device and datatypes
>>   hw/arm/smmu-common: IOMMU memory region and address space setup
>>   hw/arm/smmu-common: smmu_read/write_sysmem
>>   hw/arm/smmu-common: VMSAv8-64 page table walk
>>   hw/arm/smmuv3: Wired IRQ and GERROR helpers
>>   hw/arm/smmuv3: Queue helpers
>>   hw/arm/smmuv3: Implement MMIO write operations
>>   hw/arm/smmuv3: Event queue recording helper
>>   hw/arm/smmuv3: Implement translate callback
>>   target/arm/kvm: Translate the MSI doorbell in kvm_arch_fixup_msi_route
>>   hw/arm/smmuv3: Implement data structure and TLB invalidation
>>     notifications
>>   hw/arm/smmuv3: Implement IOMMU memory region replay callback
>>   hw/arm/virt: Store the PCI host controller dt phandle
>>   hw/arm/sysbus-fdt: Pass the VirtMachineState to the node creation
>>     functions
>>   hw/arm/sysbus-fdt: Pass the platform bus base address in
>>     PlatformBusFDTData
>>   hw/arm/sysbus-fdt: Allow smmuv3 dynamic instantiation
>>   hw/arm/smmuv3: [not for upstream] add SMMU_CMD_TLBI_NH_VA_AM handling
>>   hw/arm/smmuv3: [not for upstream] Add caching-mode option
>>
>> Prem Mallappa (2):
>>   hw/arm/smmuv3: Skeleton
>>   hw/arm/virt-acpi-build: Add smmuv3 node in IORT table
>>
>>  default-configs/aarch64-softmmu.mak |    1 +
>>  hw/arm/Makefile.objs                |    1 +
>>  hw/arm/smmu-common.c                |  527 ++++++++++++++++
>>  hw/arm/smmu-internal.h              |  105 ++++
>>  hw/arm/smmuv3-internal.h            |  584 +++++++++++++++++
>>  hw/arm/smmuv3.c                     | 1181 +++++++++++++++++++++++++++++++++++
>>  hw/arm/sysbus-fdt.c                 |  129 +++-
>>  hw/arm/trace-events                 |   48 ++
>>  hw/arm/virt-acpi-build.c            |   63 +-
>>  hw/arm/virt.c                       |    6 +-
>>  include/hw/acpi/acpi-defs.h         |   15 +
>>  include/hw/arm/smmu-common.h        |  123 ++++
>>  include/hw/arm/smmuv3.h             |   80 +++
>>  include/hw/arm/sysbus-fdt.h         |    2 +
>>  include/hw/arm/virt.h               |   15 +
>>  target/arm/kvm.c                    |   27 +
>>  target/arm/trace-events             |    3 +
>>  17 files changed, 2886 insertions(+), 24 deletions(-)
>>  create mode 100644 hw/arm/smmu-common.c
>>  create mode 100644 hw/arm/smmu-internal.h
>>  create mode 100644 hw/arm/smmuv3-internal.h
>>  create mode 100644 hw/arm/smmuv3.c
>>  create mode 100644 include/hw/arm/smmu-common.h
>>  create mode 100644 include/hw/arm/smmuv3.h
>>
>> -- 
>> 2.5.5
> 

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [Qemu-devel] [Qemu-arm] [PATCH v7 05/20] hw/arm/smmuv3: Skeleton
  2017-09-01 17:21 ` [Qemu-devel] [PATCH v7 05/20] hw/arm/smmuv3: Skeleton Eric Auger
@ 2017-09-08 10:52   ` Linu Cherian
  2017-09-08 15:18     ` Auger Eric
  2017-10-09 16:17   ` [Qemu-devel] " Peter Maydell
  1 sibling, 1 reply; 72+ messages in thread
From: Linu Cherian @ 2017-09-08 10:52 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, peter.maydell, qemu-arm, qemu-devel,
	prem.mallappa, alex.williamson, mohun106, drjones, tcain,
	Radha.Chintakuntla, Sunil.Goutham, mst, jean-philippe.brucker,
	tn, will.deacon, robin.murphy, peterx, bharat.bhushan,
	christoffer.dall, wtownsen, linu.cherian

Hi Eric,

On Fri Sep 01, 2017 at 07:21:08PM +0200, Eric Auger wrote:
> From: Prem Mallappa <prem.mallappa@broadcom.com>
> 
> This patch implements a skeleton for the smmuv3 device.
> Datatypes and register definitions are introduced. The MMIO
> region, the interrupts and the queue are initialized (PRI is
> not supported).
> 
> Only the MMIO read operation is implemented here.
> 
> Signed-off-by: Prem Mallappa <prem.mallappa@broadcom.com>
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> 
> ---
> v6 -> v7:
> - split into several patches
> 
> v5 -> v6:
> - Use IOMMUMemoryregion
> - regs become uint32_t and fix 64b MMIO access (.impl)
> - trace_smmuv3_write/read_mmio take the size param
> 
> v4 -> v5:
> - change smmuv3_translate proto (IOMMUAccessFlags flag)
> - has_stagex replaced by is_ste_stagex
> - smmu_cfg_populate removed
> - added smmuv3_decode_config and reworked error management
> - rework the naming of IOMMU mrs
> - fix SMMU_CMDQ_CONS offset
> 
> v3 -> v4
> - smmu_irq_update
> - fix hash key allocation
> - set smmu_iommu_ops
> - set SMMU_REG_CR0,
> - smmuv3_translate: ret.perm not set in bypass mode
> - use trace events
> - renamed STM2U64 into L1STD_L2PTR and STMSPAN into L1STD_SPAN
> - rework smmu_find_ste
> - fix tg2granule in TT0/0b10 corresponds to 16kB
> 
> v2 -> v3:
> - move creation of include/hw/arm/smmuv3.h to this patch to fix compil issue
> - compilation allowed
> - fix sbus allocation in smmu_init_pci_iommu
> - restructure code into headers
> - misc cleanups
> ---
>  hw/arm/Makefile.objs     |   2 +-
>  hw/arm/smmuv3-internal.h | 201 +++++++++++++++++++++++++++++++++++++++
>  hw/arm/smmuv3.c          | 239 +++++++++++++++++++++++++++++++++++++++++++++++
>  hw/arm/trace-events      |   3 +
>  include/hw/arm/smmuv3.h  |  79 ++++++++++++++++
>  5 files changed, 523 insertions(+), 1 deletion(-)
>  create mode 100644 hw/arm/smmuv3-internal.h
>  create mode 100644 hw/arm/smmuv3.c
>  create mode 100644 include/hw/arm/smmuv3.h
> 
> diff --git a/hw/arm/Makefile.objs b/hw/arm/Makefile.objs
> index 5b2d38d..a7c808b 100644
> --- a/hw/arm/Makefile.objs
> +++ b/hw/arm/Makefile.objs
> @@ -19,4 +19,4 @@ obj-$(CONFIG_FSL_IMX31) += fsl-imx31.o kzm.o
>  obj-$(CONFIG_FSL_IMX6) += fsl-imx6.o sabrelite.o
>  obj-$(CONFIG_ASPEED_SOC) += aspeed_soc.o aspeed.o
>  obj-$(CONFIG_MPS2) += mps2.o
> -obj-$(CONFIG_ARM_SMMUV3) += smmu-common.o
> +obj-$(CONFIG_ARM_SMMUV3) += smmu-common.o smmuv3.o
> diff --git a/hw/arm/smmuv3-internal.h b/hw/arm/smmuv3-internal.h
> new file mode 100644
> index 0000000..488acc8
> --- /dev/null
> +++ b/hw/arm/smmuv3-internal.h
> @@ -0,0 +1,201 @@
> +/*
> + * ARM SMMUv3 support - Internal API
> + *
> + * Copyright (C) 2014-2016 Broadcom Corporation
> + * Copyright (c) 2017 Red Hat, Inc.
> + * Written by Prem Mallappa, Eric Auger
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation, either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License along
> + * with this program; if not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#ifndef HW_ARM_SMMU_V3_INTERNAL_H
> +#define HW_ARM_SMMU_V3_INTERNAL_H
> +
> +#include "trace.h"
> +#include "qemu/error-report.h"
> +#include "hw/arm/smmu-common.h"
> +
> +/*****************************
> + * MMIO Register
> + *****************************/
> +enum {
> +    SMMU_REG_IDR0            = 0x0,
> +
> +/* IDR0 Field Values and supported features */
> +
> +#define SMMU_IDR0_S2P      1  /* stage 2 */
> +#define SMMU_IDR0_S1P      1  /* stage 1 */
> +#define SMMU_IDR0_TTF      2  /* Aarch64 only - not Aarch32 (LPAE) */
> +#define SMMU_IDR0_COHACC   1  /* IO coherent access */
> +#define SMMU_IDR0_HTTU     2  /* Access and Dirty flag update */
> +#define SMMU_IDR0_HYP      0  /* Hypervisor Stage 1 contexts */
> +#define SMMU_IDR0_ATS      0  /* PCIe RC ATS */
> +#define SMMU_IDR0_ASID16   1  /* 16-bit ASID */
> +#define SMMU_IDR0_PRI      0  /* Page Request Interface */
> +#define SMMU_IDR0_VMID16   0  /* 16-bit VMID */
> +#define SMMU_IDR0_CD2L     0  /* 2-level Context Descriptor table */
> +#define SMMU_IDR0_STALL    1  /* Stalling fault model */
> +#define SMMU_IDR0_TERM     1  /* Termination model behaviour */
> +#define SMMU_IDR0_STLEVEL  1  /* Multi-level Stream Table */
> +
> +#define SMMU_IDR0_S2P_SHIFT      0
> +#define SMMU_IDR0_S1P_SHIFT      1
> +#define SMMU_IDR0_TTF_SHIFT      2
> +#define SMMU_IDR0_COHACC_SHIFT   4
> +#define SMMU_IDR0_HTTU_SHIFT     6
> +#define SMMU_IDR0_HYP_SHIFT      9
> +#define SMMU_IDR0_ATS_SHIFT      10
> +#define SMMU_IDR0_ASID16_SHIFT   12
> +#define SMMU_IDR0_PRI_SHIFT      16
> +#define SMMU_IDR0_VMID16_SHIFT   18
> +#define SMMU_IDR0_CD2L_SHIFT     19
> +#define SMMU_IDR0_STALL_SHIFT    24
> +#define SMMU_IDR0_TERM_SHIFT     26
> +#define SMMU_IDR0_STLEVEL_SHIFT  27
> +
> +    SMMU_REG_IDR1            = 0x4,
> +#define SMMU_IDR1_SIDSIZE 16
> +    SMMU_REG_IDR2            = 0x8,
> +    SMMU_REG_IDR3            = 0xc,
> +    SMMU_REG_IDR4            = 0x10,
> +    SMMU_REG_IDR5            = 0x14,
> +#define SMMU_IDR5_GRAN_SHIFT 4
> +#define SMMU_IDR5_GRAN       0b101 /* GRAN4K, GRAN64K */
> +#define SMMU_IDR5_OAS        4     /* 44 bits */
> +    SMMU_REG_IIDR            = 0x1c,
> +    SMMU_REG_CR0             = 0x20,
> +
> +#define SMMU_CR0_SMMU_ENABLE (1 << 0)
> +#define SMMU_CR0_PRIQ_ENABLE (1 << 1)
> +#define SMMU_CR0_EVTQ_ENABLE (1 << 2)
> +#define SMMU_CR0_CMDQ_ENABLE (1 << 3)
> +#define SMMU_CR0_ATS_CHECK   (1 << 4)
> +
> +    SMMU_REG_CR0_ACK         = 0x24,
> +    SMMU_REG_CR1             = 0x28,
> +    SMMU_REG_CR2             = 0x2c,
> +
> +    SMMU_REG_STATUSR         = 0x40,
> +
> +    SMMU_REG_IRQ_CTRL        = 0x50,
> +    SMMU_REG_IRQ_CTRL_ACK    = 0x54,
> +
> +#define SMMU_IRQ_CTRL_GERROR_EN (1 << 0)
> +#define SMMU_IRQ_CTRL_EVENT_EN  (1 << 1)
> +#define SMMU_IRQ_CTRL_PRI_EN    (1 << 2)
> +
> +    SMMU_REG_GERROR          = 0x60,
> +
> +#define SMMU_GERROR_CMDQ           (1 << 0)
> +#define SMMU_GERROR_EVENTQ_ABT     (1 << 2)
> +#define SMMU_GERROR_PRIQ_ABT       (1 << 3)
> +#define SMMU_GERROR_MSI_CMDQ_ABT   (1 << 4)
> +#define SMMU_GERROR_MSI_EVENTQ_ABT (1 << 5)
> +#define SMMU_GERROR_MSI_PRIQ_ABT   (1 << 6)
> +#define SMMU_GERROR_MSI_GERROR_ABT (1 << 7)
> +#define SMMU_GERROR_SFM_ERR        (1 << 8)
> +
> +    SMMU_REG_GERRORN         = 0x64,
> +    SMMU_REG_GERROR_IRQ_CFG0 = 0x68,
> +    SMMU_REG_GERROR_IRQ_CFG1 = 0x70,
> +    SMMU_REG_GERROR_IRQ_CFG2 = 0x74,
> +
> +    /* SMMU_BASE_RA Applies to STRTAB_BASE, CMDQ_BASE and EVTQ_BASE */
> +#define SMMU_BASE_RA        (1ULL << 62)
> +    SMMU_REG_STRTAB_BASE     = 0x80,
> +    SMMU_REG_STRTAB_BASE_CFG = 0x88,
> +
> +    SMMU_REG_CMDQ_BASE       = 0x90,
> +    SMMU_REG_CMDQ_PROD       = 0x98,
> +    SMMU_REG_CMDQ_CONS       = 0x9c,
> +    /* CMD Consumer (CONS) */
> +#define SMMU_CMD_CONS_ERR_SHIFT        24
> +#define SMMU_CMD_CONS_ERR_BITS         7
> +
> +    SMMU_REG_EVTQ_BASE       = 0xa0,
> +    SMMU_REG_EVTQ_PROD       = 0xa8,
> +    SMMU_REG_EVTQ_CONS       = 0xac,
> +    SMMU_REG_EVTQ_IRQ_CFG0   = 0xb0,
> +    SMMU_REG_EVTQ_IRQ_CFG1   = 0xb8,
> +    SMMU_REG_EVTQ_IRQ_CFG2   = 0xbc,
> +
> +    SMMU_REG_PRIQ_BASE       = 0xc0,
> +    SMMU_REG_PRIQ_PROD       = 0xc8,
> +    SMMU_REG_PRIQ_CONS       = 0xcc,
> +    SMMU_REG_PRIQ_IRQ_CFG0   = 0xd0,
> +    SMMU_REG_PRIQ_IRQ_CFG1   = 0xd8,
> +    SMMU_REG_PRIQ_IRQ_CFG2   = 0xdc,
> +
> +    SMMU_ID_REGS_OFFSET      = 0xfd0,
> +
> +    /* Secure registers are not used for now */
> +    SMMU_SECURE_OFFSET       = 0x8000,
> +};
> +
> +/**********************
> + * Data Structures
> + **********************/
> +
> +struct __smmu_data2 {
> +    uint32_t word[2];
> +};
> +
> +struct __smmu_data8 {
> +    uint32_t word[8];
> +};
> +
> +struct __smmu_data16 {
> +    uint32_t word[16];
> +};
> +
> +struct __smmu_data4 {
> +    uint32_t word[4];
> +};
> +
> +typedef struct __smmu_data4  Cmd; /* Command Entry */
> +typedef struct __smmu_data8  Evt; /* Event Entry */
> +
> +/*****************************
> + *  Register Access Primitives
> + *****************************/
> +
> +static inline void smmu_write32_reg(SMMUV3State *s, uint32_t addr, uint32_t val)
> +{
> +    s->regs[addr >> 2] = val;
> +}
> +
> +static inline void smmu_write64_reg(SMMUV3State *s, uint32_t addr, uint64_t val)
> +{
> +    addr >>= 2;
> +    s->regs[addr] = extract64(val, 0, 32);
> +    s->regs[addr + 1] = extract64(val, 32, 32);
> +}
> +
> +static inline uint32_t smmu_read32_reg(SMMUV3State *s, uint32_t addr)
> +{
> +    return s->regs[addr >> 2];
> +}
> +
> +static inline uint64_t smmu_read64_reg(SMMUV3State *s, uint32_t addr)
> +{
> +    addr >>= 2;
> +    return s->regs[addr] | ((uint64_t)(s->regs[addr + 1]) << 32);
> +}
> +
> +static inline int smmu_enabled(SMMUV3State *s)
> +{
> +    return smmu_read32_reg(s, SMMU_REG_CR0) & SMMU_CR0_SMMU_ENABLE;
> +}
> +
> +#endif
> diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
> new file mode 100644
> index 0000000..0a7cd1c
> --- /dev/null
> +++ b/hw/arm/smmuv3.c
> @@ -0,0 +1,239 @@
> +/*
> + * Copyright (C) 2014-2016 Broadcom Corporation
> + * Copyright (c) 2017 Red Hat, Inc.
> + * Written by Prem Mallappa, Eric Auger
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation, either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License along
> + * with this program; if not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include "qemu/osdep.h"
> +#include "hw/boards.h"
> +#include "sysemu/sysemu.h"
> +#include "hw/sysbus.h"
> +#include "hw/pci/pci.h"
> +#include "exec/address-spaces.h"
> +#include "trace.h"
> +#include "qemu/error-report.h"
> +
> +#include "hw/arm/smmuv3.h"
> +#include "smmuv3-internal.h"
> +
> +static void smmuv3_init_regs(SMMUV3State *s)
> +{
> +    uint32_t data =
> +        SMMU_IDR0_STLEVEL << SMMU_IDR0_STLEVEL_SHIFT |
> +        SMMU_IDR0_TERM    << SMMU_IDR0_TERM_SHIFT    |
> +        SMMU_IDR0_STALL   << SMMU_IDR0_STALL_SHIFT   |
> +        SMMU_IDR0_VMID16  << SMMU_IDR0_VMID16_SHIFT  |
> +        SMMU_IDR0_PRI     << SMMU_IDR0_PRI_SHIFT     |
> +        SMMU_IDR0_ASID16  << SMMU_IDR0_ASID16_SHIFT  |
> +        SMMU_IDR0_ATS     << SMMU_IDR0_ATS_SHIFT     |
> +        SMMU_IDR0_HYP     << SMMU_IDR0_HYP_SHIFT     |
> +        SMMU_IDR0_HTTU    << SMMU_IDR0_HTTU_SHIFT    |
> +        SMMU_IDR0_COHACC  << SMMU_IDR0_COHACC_SHIFT  |
> +        SMMU_IDR0_TTF     << SMMU_IDR0_TTF_SHIFT     |
> +        SMMU_IDR0_S1P     << SMMU_IDR0_S1P_SHIFT     |
> +        SMMU_IDR0_S2P     << SMMU_IDR0_S2P_SHIFT;
> +
> +    smmu_write32_reg(s, SMMU_REG_IDR0, data);
> +
> +#define SMMU_QUEUE_SIZE_LOG2  19
> +    data =
> +        1 << 27 |                    /* Attr Types override */
> +        SMMU_QUEUE_SIZE_LOG2 << 21 | /* Cmd Q size */
> +        SMMU_QUEUE_SIZE_LOG2 << 16 | /* Event Q size */
> +        SMMU_QUEUE_SIZE_LOG2 << 11 | /* PRI Q size */
> +        0  << 6 |                    /* SSID not supported */
> +        SMMU_IDR1_SIDSIZE;
> +
> +    smmu_write32_reg(s, SMMU_REG_IDR1, data);
> +
> +    s->sid_size = SMMU_IDR1_SIDSIZE;
> +
> +    data = SMMU_IDR5_GRAN << SMMU_IDR5_GRAN_SHIFT | SMMU_IDR5_OAS;

For the VFIO case, shouldn't we set the granule size based on the underlying
page size bitmap derived from VFIO_IOMMU_GET_INFO? Otherwise, if the guest
kernel is built with a 4k page size and the host kernel uses 64k, we would
start getting map errors.
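
To make the suggestion concrete, here is a rough sketch of how the GRAN
field could be derived. It is only an illustration, not code from this
series: idr5_gran_from_pgsizes() and the GRAN_* names are hypothetical, and
host_pgsizes is assumed to be the iova_pgsizes bitmap reported by
VFIO_IOMMU_GET_INFO. The rule used (advertise a granule only if it is at
least as large as the smallest host page size) is my assumption about what
avoids the map errors.

```c
#include <stdint.h>

/* Bit positions inside the IDR5 granule field; the field itself starts
 * at bit 4 (SMMU_IDR5_GRAN_SHIFT in the patch). Names are hypothetical. */
#define GRAN4K   (1u << 0)
#define GRAN16K  (1u << 1)
#define GRAN64K  (1u << 2)

/* Granules the emulation advertises today (SMMU_IDR5_GRAN = 0b101). */
#define GRAN_EMULATED (GRAN4K | GRAN64K)

/*
 * Return the unshifted granule field to advertise, given the host IOMMU
 * page size bitmap (one bit set per supported page size, in bytes).
 * Assumption: a guest granule is usable only if it is >= the smallest
 * page size the host IOMMU can map.
 */
static uint32_t idr5_gran_from_pgsizes(uint64_t host_pgsizes)
{
    uint64_t min_pgsize;
    uint32_t gran = 0;

    if (host_pgsizes == 0) {
        /* No information from the host: keep the current default. */
        return GRAN_EMULATED;
    }

    /* Isolate the lowest set bit, i.e. the smallest host page size. */
    min_pgsize = host_pgsizes & (~host_pgsizes + 1);

    if (min_pgsize <= 0x1000) {
        gran |= GRAN4K;
    }
    if (min_pgsize <= 0x4000) {
        gran |= GRAN16K;
    }
    if (min_pgsize <= 0x10000) {
        gran |= GRAN64K;
    }

    /* Never advertise more than the emulation itself supports. */
    return gran & GRAN_EMULATED;
}
```

With something like this, smmuv3_init_regs() could shift the returned value
by SMMU_IDR5_GRAN_SHIFT instead of using the fixed SMMU_IDR5_GRAN, so a 64k
host would only expose GRAN64K to the guest.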



> +
> +    smmu_write32_reg(s, SMMU_REG_IDR5, data);
> +}
> +
> +static void smmuv3_init_queues(SMMUV3State *s)
> +{
> +    s->cmdq.prod = 0;
> +    s->cmdq.cons = 0;
> +    s->cmdq.wrap.prod = 0;
> +    s->cmdq.wrap.cons = 0;
> +
> +    s->evtq.prod = 0;
> +    s->evtq.cons = 0;
> +    s->evtq.wrap.prod = 0;
> +    s->evtq.wrap.cons = 0;
> +
> +    s->cmdq.entries = SMMU_QUEUE_SIZE_LOG2;
> +    s->cmdq.ent_size = sizeof(Cmd);
> +    s->evtq.entries = SMMU_QUEUE_SIZE_LOG2;
> +    s->evtq.ent_size = sizeof(Evt);
> +}
> +
> +static void smmuv3_init(SMMUV3State *s)
> +{
> +    smmuv3_init_regs(s);
> +    smmuv3_init_queues(s);
> +}
> +
> +static inline void smmu_update_base_reg(SMMUV3State *s, uint64_t *base,
> +                                        uint64_t val)
> +{
> +    *base = val & ~(SMMU_BASE_RA | 0x3fULL);
> +}
> +
> +static void smmu_write_mmio_fixup(SMMUV3State *s, hwaddr *addr)
> +{
> +    switch (*addr) {
> +    case 0x100a8: case 0x100ac:         /* Aliasing => page0 registers */
> +    case 0x100c8: case 0x100cc:
> +        *addr ^= (hwaddr)0x10000;
> +    }
> +}
> +
> +static void smmu_write_mmio(void *opaque, hwaddr addr,
> +                            uint64_t val, unsigned size)
> +{
> +}
> +
> +static uint64_t smmu_read_mmio(void *opaque, hwaddr addr, unsigned size)
> +{
> +    SMMUState *sys = opaque;
> +    SMMUV3State *s = SMMU_V3_DEV(sys);
> +    uint64_t val;
> +
> +    smmu_write_mmio_fixup(s, &addr);
> +
> +    /* Primecell/Corelink ID registers */
> +    switch (addr) {
> +    case 0xFF0 ... 0xFFC:
> +    case 0xFDC ... 0xFE4:
> +        val = 0;
> +        error_report("addr:0x%"PRIx64" val:0x%"PRIx64, addr, val);
> +        break;
> +    case SMMU_REG_STRTAB_BASE ... SMMU_REG_CMDQ_BASE:
> +    case SMMU_REG_EVTQ_BASE:
> +    case SMMU_REG_PRIQ_BASE ... SMMU_REG_PRIQ_IRQ_CFG1:
> +        val = smmu_read64_reg(s, addr);
> +        break;
> +    default:
> +        val = (uint64_t)smmu_read32_reg(s, addr);
> +        break;
> +    }
> +
> +    trace_smmuv3_read_mmio(addr, val, size);
> +    return val;
> +}
> +
> +static const MemoryRegionOps smmu_mem_ops = {
> +    .read = smmu_read_mmio,
> +    .write = smmu_write_mmio,
> +    .endianness = DEVICE_LITTLE_ENDIAN,
> +    .valid = {
> +        .min_access_size = 4,
> +        .max_access_size = 8,
> +    },
> +    .impl = {
> +        .min_access_size = 4,
> +        .max_access_size = 8,
> +    },
> +};
> +
> +static void smmu_init_irq(SMMUV3State *s, SysBusDevice *dev)
> +{
> +    int i;
> +
> +    for (i = 0; i < ARRAY_SIZE(s->irq); i++) {
> +        sysbus_init_irq(dev, &s->irq[i]);
> +    }
> +}
> +
> +static void smmu_reset(DeviceState *dev)
> +{
> +    SMMUV3State *s = SMMU_V3_DEV(dev);
> +    smmuv3_init(s);
> +}
> +
> +static void smmu_realize(DeviceState *d, Error **errp)
> +{
> +    SMMUState *sys = SMMU_SYS_DEV(d);
> +    SMMUV3State *s = SMMU_V3_DEV(sys);
> +    SysBusDevice *dev = SYS_BUS_DEVICE(d);
> +
> +    memory_region_init_io(&sys->iomem, OBJECT(s),
> +                          &smmu_mem_ops, sys, TYPE_SMMU_V3_DEV, 0x20000);
> +
> +    sys->mrtypename = g_strdup(TYPE_SMMUV3_IOMMU_MEMORY_REGION);
> +
> +    sysbus_init_mmio(dev, &sys->iomem);
> +
> +    smmu_init_irq(s, dev);
> +}
> +
> +static const VMStateDescription vmstate_smmuv3 = {
> +    .name = "smmuv3",
> +    .version_id = 1,
> +    .minimum_version_id = 1,
> +    .fields = (VMStateField[]) {
> +        VMSTATE_UINT32_ARRAY(regs, SMMUV3State, SMMU_NREGS),
> +        VMSTATE_END_OF_LIST(),
> +    },
> +};
> +
> +static void smmuv3_instance_init(Object *obj)
> +{
> +    /* Nothing much to do here as of now */
> +}
> +
> +static void smmuv3_class_init(ObjectClass *klass, void *data)
> +{
> +    DeviceClass *dc = DEVICE_CLASS(klass);
> +
> +    dc->reset   = smmu_reset;
> +    dc->vmsd    = &vmstate_smmuv3;
> +    dc->realize = smmu_realize;
> +}
> +
> +static void smmuv3_iommu_memory_region_class_init(ObjectClass *klass,
> +                                                  void *data)
> +{
> +}
> +
> +static const TypeInfo smmuv3_type_info = {
> +    .name          = TYPE_SMMU_V3_DEV,
> +    .parent        = TYPE_SMMU_DEV_BASE,
> +    .instance_size = sizeof(SMMUV3State),
> +    .instance_init = smmuv3_instance_init,
> +    .class_data    = NULL,
> +    .class_size    = sizeof(SMMUV3Class),
> +    .class_init    = smmuv3_class_init,
> +};
> +
> +static const TypeInfo smmuv3_iommu_memory_region_info = {
> +    .parent = TYPE_IOMMU_MEMORY_REGION,
> +    .name = TYPE_SMMUV3_IOMMU_MEMORY_REGION,
> +    .class_init = smmuv3_iommu_memory_region_class_init,
> +};
> +
> +static void smmuv3_register_types(void)
> +{
> +    type_register(&smmuv3_type_info);
> +    type_register(&smmuv3_iommu_memory_region_info);
> +}
> +
> +type_init(smmuv3_register_types)
> +
> diff --git a/hw/arm/trace-events b/hw/arm/trace-events
> index c67cd39..8affbf7 100644
> --- a/hw/arm/trace-events
> +++ b/hw/arm/trace-events
> @@ -14,3 +14,6 @@ smmu_page_walk_level_block_pte(int stage, int level, uint64_t baseaddr, uint64_t
>  smmu_page_walk_level_table_pte(int stage, int level, uint64_t baseaddr, uint64_t pteaddr, uint64_t pte, uint64_t address) "stage=%d, level=%d base@=0x%"PRIx64" pte@=0x%"PRIx64" pte=0x%"PRIx64" next table address = 0x%"PRIx64
>  smmu_get_pte(uint64_t baseaddr, int index, uint64_t pteaddr, uint64_t pte) "baseaddr=0x%"PRIx64" index=0x%x, pteaddr=0x%"PRIx64", pte=0x%"PRIx64
>  smmu_set_translated_address(hwaddr iova, hwaddr pa) "iova = 0x%"PRIx64" -> pa = 0x%"PRIx64
> +
> +#hw/arm/smmuv3.c
> +smmuv3_read_mmio(hwaddr addr, uint64_t val, unsigned size) "addr: 0x%"PRIx64" val:0x%"PRIx64" size: 0x%x"
> diff --git a/include/hw/arm/smmuv3.h b/include/hw/arm/smmuv3.h
> new file mode 100644
> index 0000000..0c8973d
> --- /dev/null
> +++ b/include/hw/arm/smmuv3.h
> @@ -0,0 +1,79 @@
> +/*
> + * Copyright (C) 2014-2016 Broadcom Corporation
> + * Copyright (c) 2017 Red Hat, Inc.
> + * Written by Prem Mallappa, Eric Auger
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation, either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License along
> + * with this program; if not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#ifndef HW_ARM_SMMUV3_H
> +#define HW_ARM_SMMUV3_H
> +
> +#include "hw/arm/smmu-common.h"
> +
> +#define TYPE_SMMUV3_IOMMU_MEMORY_REGION "smmuv3-iommu-memory-region"
> +
> +#define SMMU_NREGS            0x200
> +
> +typedef struct SMMUQueue {
> +     hwaddr base;
> +     uint32_t prod;
> +     uint32_t cons;
> +     union {
> +          struct {
> +               uint8_t prod:1;
> +               uint8_t cons:1;
> +          };
> +          uint8_t unused;
> +     } wrap;
> +
> +     uint16_t entries;           /* Number of entries */
> +     uint8_t  ent_size;          /* Size of entry in bytes */
> +     uint8_t  shift;             /* Size in log2 */
> +} SMMUQueue;
> +
> +typedef struct SMMUV3State {
> +    SMMUState     smmu_state;
> +
> +    /* Local cache of most-frequently used registers */
> +#define SMMU_FEATURE_2LVL_STE (1 << 0)
> +    uint32_t     features;
> +    uint16_t     sid_size;
> +    uint16_t     sid_split;
> +    uint64_t     strtab_base;
> +
> +    uint32_t    regs[SMMU_NREGS];
> +
> +    qemu_irq     irq[4];
> +    SMMUQueue    cmdq, evtq;
> +
> +} SMMUV3State;
> +
> +typedef enum {
> +    SMMU_IRQ_EVTQ,
> +    SMMU_IRQ_PRIQ,
> +    SMMU_IRQ_CMD_SYNC,
> +    SMMU_IRQ_GERROR,
> +} SMMUIrq;
> +
> +typedef struct {
> +    SMMUBaseClass smmu_base_class;
> +} SMMUV3Class;
> +
> +#define TYPE_SMMU_V3_DEV   "smmuv3"
> +#define SMMU_V3_DEV(obj) OBJECT_CHECK(SMMUV3State, (obj), TYPE_SMMU_V3_DEV)
> +#define SMMU_V3_DEVICE_GET_CLASS(obj)                              \
> +    OBJECT_GET_CLASS(SMMUBaseClass, (obj), TYPE_SMMU_V3_DEV)
> +
> +#endif
> -- 
> 2.5.5
> 
> 

-- 
Linu cherian

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [Qemu-devel] [Qemu-arm] [PATCH v7 05/20] hw/arm/smmuv3: Skeleton
  2017-09-08 10:52   ` [Qemu-devel] [Qemu-arm] " Linu Cherian
@ 2017-09-08 15:18     ` Auger Eric
  2017-09-12  6:14       ` Linu Cherian
  0 siblings, 1 reply; 72+ messages in thread
From: Auger Eric @ 2017-09-08 15:18 UTC (permalink / raw)
  To: Linu Cherian
  Cc: peter.maydell, drjones, tcain, Radha.Chintakuntla, Sunil.Goutham,
	mohun106, jean-philippe.brucker, tn, bharat.bhushan, mst,
	will.deacon, qemu-devel, peterx, alex.williamson, qemu-arm,
	christoffer.dall, linu.cherian, wtownsen, robin.murphy,
	prem.mallappa, eric.auger.pro

Hi Linu,

On 08/09/2017 12:52, Linu Cherian wrote:
> Hi Eric,
> 
> On Fri Sep 01, 2017 at 07:21:08PM +0200, Eric Auger wrote:
>> From: Prem Mallappa <prem.mallappa@broadcom.com>
>>
>> This patch implements a skeleton for the smmuv3 device.
>> Datatypes and register definitions are introduced. The MMIO
>> region, the interrupts and the queue are initialized (PRI is
>> not supported).
>>
>> Only the MMIO read operation is implemented here.
>>
>> Signed-off-by: Prem Mallappa <prem.mallappa@broadcom.com>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>
>> ---
>> v6 -> v7:
>> - split into several patches
>>
>> v5 -> v6:
>> - Use IOMMUMemoryregion
>> - regs become uint32_t and fix 64b MMIO access (.impl)
>> - trace_smmuv3_write/read_mmio take the size param
>>
>> v4 -> v5:
>> - change smmuv3_translate proto (IOMMUAccessFlags flag)
>> - has_stagex replaced by is_ste_stagex
>> - smmu_cfg_populate removed
>> - added smmuv3_decode_config and reworked error management
>> - rework the naming of IOMMU mrs
>> - fix SMMU_CMDQ_CONS offset
>>
>> v3 -> v4
>> - smmu_irq_update
>> - fix hash key allocation
>> - set smmu_iommu_ops
>> - set SMMU_REG_CR0,
>> - smmuv3_translate: ret.perm not set in bypass mode
>> - use trace events
>> - renamed STM2U64 into L1STD_L2PTR and STMSPAN into L1STD_SPAN
>> - rework smmu_find_ste
>> - fix tg2granule in TT0/0b10 corresponds to 16kB
>>
>> v2 -> v3:
>> - move creation of include/hw/arm/smmuv3.h to this patch to fix compil issue
>> - compilation allowed
>> - fix sbus allocation in smmu_init_pci_iommu
>> - restructure code into headers
>> - misc cleanups
>> ---
>>  hw/arm/Makefile.objs     |   2 +-
>>  hw/arm/smmuv3-internal.h | 201 +++++++++++++++++++++++++++++++++++++++
>>  hw/arm/smmuv3.c          | 239 +++++++++++++++++++++++++++++++++++++++++++++++
>>  hw/arm/trace-events      |   3 +
>>  include/hw/arm/smmuv3.h  |  79 ++++++++++++++++
>>  5 files changed, 523 insertions(+), 1 deletion(-)
>>  create mode 100644 hw/arm/smmuv3-internal.h
>>  create mode 100644 hw/arm/smmuv3.c
>>  create mode 100644 include/hw/arm/smmuv3.h
>>
>> diff --git a/hw/arm/Makefile.objs b/hw/arm/Makefile.objs
>> index 5b2d38d..a7c808b 100644
>> --- a/hw/arm/Makefile.objs
>> +++ b/hw/arm/Makefile.objs
>> @@ -19,4 +19,4 @@ obj-$(CONFIG_FSL_IMX31) += fsl-imx31.o kzm.o
>>  obj-$(CONFIG_FSL_IMX6) += fsl-imx6.o sabrelite.o
>>  obj-$(CONFIG_ASPEED_SOC) += aspeed_soc.o aspeed.o
>>  obj-$(CONFIG_MPS2) += mps2.o
>> -obj-$(CONFIG_ARM_SMMUV3) += smmu-common.o
>> +obj-$(CONFIG_ARM_SMMUV3) += smmu-common.o smmuv3.o
>> diff --git a/hw/arm/smmuv3-internal.h b/hw/arm/smmuv3-internal.h
>> new file mode 100644
>> index 0000000..488acc8
>> --- /dev/null
>> +++ b/hw/arm/smmuv3-internal.h
>> @@ -0,0 +1,201 @@
>> +/*
>> + * ARM SMMUv3 support - Internal API
>> + *
>> + * Copyright (C) 2014-2016 Broadcom Corporation
>> + * Copyright (c) 2017 Red Hat, Inc.
>> + * Written by Prem Mallappa, Eric Auger
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License as published by
>> + * the Free Software Foundation, either version 2 of the License, or
>> + * (at your option) any later version.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public License along
>> + * with this program; if not, see <http://www.gnu.org/licenses/>.
>> + */
>> +
>> +#ifndef HW_ARM_SMMU_V3_INTERNAL_H
>> +#define HW_ARM_SMMU_V3_INTERNAL_H
>> +
>> +#include "trace.h"
>> +#include "qemu/error-report.h"
>> +#include "hw/arm/smmu-common.h"
>> +
>> +/*****************************
>> + * MMIO Register
>> + *****************************/
>> +enum {
>> +    SMMU_REG_IDR0            = 0x0,
>> +
>> +/* IDR0 Field Values and supported features */
>> +
>> +#define SMMU_IDR0_S2P      1  /* stage 2 */
>> +#define SMMU_IDR0_S1P      1  /* stage 1 */
>> +#define SMMU_IDR0_TTF      2  /* Aarch64 only - not Aarch32 (LPAE) */
>> +#define SMMU_IDR0_COHACC   1  /* IO coherent access */
>> +#define SMMU_IDR0_HTTU     2  /* Access and Dirty flag update */
>> +#define SMMU_IDR0_HYP      0  /* Hypervisor Stage 1 contexts */
>> +#define SMMU_IDR0_ATS      0  /* PCIe RC ATS */
>> +#define SMMU_IDR0_ASID16   1  /* 16-bit ASID */
>> +#define SMMU_IDR0_PRI      0  /* Page Request Interface */
>> +#define SMMU_IDR0_VMID16   0  /* 16-bit VMID */
>> +#define SMMU_IDR0_CD2L     0  /* 2-level Context Descriptor table */
>> +#define SMMU_IDR0_STALL    1  /* Stalling fault model */
>> +#define SMMU_IDR0_TERM     1  /* Termination model behaviour */
>> +#define SMMU_IDR0_STLEVEL  1  /* Multi-level Stream Table */
>> +
>> +#define SMMU_IDR0_S2P_SHIFT      0
>> +#define SMMU_IDR0_S1P_SHIFT      1
>> +#define SMMU_IDR0_TTF_SHIFT      2
>> +#define SMMU_IDR0_COHACC_SHIFT   4
>> +#define SMMU_IDR0_HTTU_SHIFT     6
>> +#define SMMU_IDR0_HYP_SHIFT      9
>> +#define SMMU_IDR0_ATS_SHIFT      10
>> +#define SMMU_IDR0_ASID16_SHIFT   12
>> +#define SMMU_IDR0_PRI_SHIFT      16
>> +#define SMMU_IDR0_VMID16_SHIFT   18
>> +#define SMMU_IDR0_CD2L_SHIFT     19
>> +#define SMMU_IDR0_STALL_SHIFT    24
>> +#define SMMU_IDR0_TERM_SHIFT     26
>> +#define SMMU_IDR0_STLEVEL_SHIFT  27
>> +
>> +    SMMU_REG_IDR1            = 0x4,
>> +#define SMMU_IDR1_SIDSIZE 16
>> +    SMMU_REG_IDR2            = 0x8,
>> +    SMMU_REG_IDR3            = 0xc,
>> +    SMMU_REG_IDR4            = 0x10,
>> +    SMMU_REG_IDR5            = 0x14,
>> +#define SMMU_IDR5_GRAN_SHIFT 4
>> +#define SMMU_IDR5_GRAN       0b101 /* GRAN4K, GRAN64K */
>> +#define SMMU_IDR5_OAS        4     /* 44 bits */
>> +    SMMU_REG_IIDR            = 0x1c,
>> +    SMMU_REG_CR0             = 0x20,
>> +
>> +#define SMMU_CR0_SMMU_ENABLE (1 << 0)
>> +#define SMMU_CR0_PRIQ_ENABLE (1 << 1)
>> +#define SMMU_CR0_EVTQ_ENABLE (1 << 2)
>> +#define SMMU_CR0_CMDQ_ENABLE (1 << 3)
>> +#define SMMU_CR0_ATS_CHECK   (1 << 4)
>> +
>> +    SMMU_REG_CR0_ACK         = 0x24,
>> +    SMMU_REG_CR1             = 0x28,
>> +    SMMU_REG_CR2             = 0x2c,
>> +
>> +    SMMU_REG_STATUSR         = 0x40,
>> +
>> +    SMMU_REG_IRQ_CTRL        = 0x50,
>> +    SMMU_REG_IRQ_CTRL_ACK    = 0x54,
>> +
>> +#define SMMU_IRQ_CTRL_GERROR_EN (1 << 0)
>> +#define SMMU_IRQ_CTRL_EVENT_EN  (1 << 1)
>> +#define SMMU_IRQ_CTRL_PRI_EN    (1 << 2)
>> +
>> +    SMMU_REG_GERROR          = 0x60,
>> +
>> +#define SMMU_GERROR_CMDQ           (1 << 0)
>> +#define SMMU_GERROR_EVENTQ_ABT     (1 << 2)
>> +#define SMMU_GERROR_PRIQ_ABT       (1 << 3)
>> +#define SMMU_GERROR_MSI_CMDQ_ABT   (1 << 4)
>> +#define SMMU_GERROR_MSI_EVENTQ_ABT (1 << 5)
>> +#define SMMU_GERROR_MSI_PRIQ_ABT   (1 << 6)
>> +#define SMMU_GERROR_MSI_GERROR_ABT (1 << 7)
>> +#define SMMU_GERROR_SFM_ERR        (1 << 8)
>> +
>> +    SMMU_REG_GERRORN         = 0x64,
>> +    SMMU_REG_GERROR_IRQ_CFG0 = 0x68,
>> +    SMMU_REG_GERROR_IRQ_CFG1 = 0x70,
>> +    SMMU_REG_GERROR_IRQ_CFG2 = 0x74,
>> +
>> +    /* SMMU_BASE_RA Applies to STRTAB_BASE, CMDQ_BASE and EVTQ_BASE */
>> +#define SMMU_BASE_RA        (1ULL << 62)
>> +    SMMU_REG_STRTAB_BASE     = 0x80,
>> +    SMMU_REG_STRTAB_BASE_CFG = 0x88,
>> +
>> +    SMMU_REG_CMDQ_BASE       = 0x90,
>> +    SMMU_REG_CMDQ_PROD       = 0x98,
>> +    SMMU_REG_CMDQ_CONS       = 0x9c,
>> +    /* CMD Consumer (CONS) */
>> +#define SMMU_CMD_CONS_ERR_SHIFT        24
>> +#define SMMU_CMD_CONS_ERR_BITS         7
>> +
>> +    SMMU_REG_EVTQ_BASE       = 0xa0,
>> +    SMMU_REG_EVTQ_PROD       = 0xa8,
>> +    SMMU_REG_EVTQ_CONS       = 0xac,
>> +    SMMU_REG_EVTQ_IRQ_CFG0   = 0xb0,
>> +    SMMU_REG_EVTQ_IRQ_CFG1   = 0xb8,
>> +    SMMU_REG_EVTQ_IRQ_CFG2   = 0xbc,
>> +
>> +    SMMU_REG_PRIQ_BASE       = 0xc0,
>> +    SMMU_REG_PRIQ_PROD       = 0xc8,
>> +    SMMU_REG_PRIQ_CONS       = 0xcc,
>> +    SMMU_REG_PRIQ_IRQ_CFG0   = 0xd0,
>> +    SMMU_REG_PRIQ_IRQ_CFG1   = 0xd8,
>> +    SMMU_REG_PRIQ_IRQ_CFG2   = 0xdc,
>> +
>> +    SMMU_ID_REGS_OFFSET      = 0xfd0,
>> +
>> +    /* Secure registers are not used for now */
>> +    SMMU_SECURE_OFFSET       = 0x8000,
>> +};
>> +
>> +/**********************
>> + * Data Structures
>> + **********************/
>> +
>> +struct __smmu_data2 {
>> +    uint32_t word[2];
>> +};
>> +
>> +struct __smmu_data8 {
>> +    uint32_t word[8];
>> +};
>> +
>> +struct __smmu_data16 {
>> +    uint32_t word[16];
>> +};
>> +
>> +struct __smmu_data4 {
>> +    uint32_t word[4];
>> +};
>> +
>> +typedef struct __smmu_data4  Cmd; /* Command Entry */
>> +typedef struct __smmu_data8  Evt; /* Event Entry */
>> +
>> +/*****************************
>> + *  Register Access Primitives
>> + *****************************/
>> +
>> +static inline void smmu_write32_reg(SMMUV3State *s, uint32_t addr, uint32_t val)
>> +{
>> +    s->regs[addr >> 2] = val;
>> +}
>> +
>> +static inline void smmu_write64_reg(SMMUV3State *s, uint32_t addr, uint64_t val)
>> +{
>> +    addr >>= 2;
>> +    s->regs[addr] = extract64(val, 0, 32);
>> +    s->regs[addr + 1] = extract64(val, 32, 32);
>> +}
>> +
>> +static inline uint32_t smmu_read32_reg(SMMUV3State *s, uint32_t addr)
>> +{
>> +    return s->regs[addr >> 2];
>> +}
>> +
>> +static inline uint64_t smmu_read64_reg(SMMUV3State *s, uint32_t addr)
>> +{
>> +    addr >>= 2;
>> +    return s->regs[addr] | ((uint64_t)(s->regs[addr + 1]) << 32);
>> +}
>> +
>> +static inline int smmu_enabled(SMMUV3State *s)
>> +{
>> +    return smmu_read32_reg(s, SMMU_REG_CR0) & SMMU_CR0_SMMU_ENABLE;
>> +}
>> +
>> +#endif
>> diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
>> new file mode 100644
>> index 0000000..0a7cd1c
>> --- /dev/null
>> +++ b/hw/arm/smmuv3.c
>> @@ -0,0 +1,239 @@
>> +/*
>> + * Copyright (C) 2014-2016 Broadcom Corporation
>> + * Copyright (c) 2017 Red Hat, Inc.
>> + * Written by Prem Mallappa, Eric Auger
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License as published by
>> + * the Free Software Foundation, either version 2 of the License, or
>> + * (at your option) any later version.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public License along
>> + * with this program; if not, see <http://www.gnu.org/licenses/>.
>> + */
>> +
>> +#include "qemu/osdep.h"
>> +#include "hw/boards.h"
>> +#include "sysemu/sysemu.h"
>> +#include "hw/sysbus.h"
>> +#include "hw/pci/pci.h"
>> +#include "exec/address-spaces.h"
>> +#include "trace.h"
>> +#include "qemu/error-report.h"
>> +
>> +#include "hw/arm/smmuv3.h"
>> +#include "smmuv3-internal.h"
>> +
>> +static void smmuv3_init_regs(SMMUV3State *s)
>> +{
>> +    uint32_t data =
>> +        SMMU_IDR0_STLEVEL << SMMU_IDR0_STLEVEL_SHIFT |
>> +        SMMU_IDR0_TERM    << SMMU_IDR0_TERM_SHIFT    |
>> +        SMMU_IDR0_STALL   << SMMU_IDR0_STALL_SHIFT   |
>> +        SMMU_IDR0_VMID16  << SMMU_IDR0_VMID16_SHIFT  |
>> +        SMMU_IDR0_PRI     << SMMU_IDR0_PRI_SHIFT     |
>> +        SMMU_IDR0_ASID16  << SMMU_IDR0_ASID16_SHIFT  |
>> +        SMMU_IDR0_ATS     << SMMU_IDR0_ATS_SHIFT     |
>> +        SMMU_IDR0_HYP     << SMMU_IDR0_HYP_SHIFT     |
>> +        SMMU_IDR0_HTTU    << SMMU_IDR0_HTTU_SHIFT    |
>> +        SMMU_IDR0_COHACC  << SMMU_IDR0_COHACC_SHIFT  |
>> +        SMMU_IDR0_TTF     << SMMU_IDR0_TTF_SHIFT     |
>> +        SMMU_IDR0_S1P     << SMMU_IDR0_S1P_SHIFT     |
>> +        SMMU_IDR0_S2P     << SMMU_IDR0_S2P_SHIFT;
>> +
>> +    smmu_write32_reg(s, SMMU_REG_IDR0, data);
>> +
>> +#define SMMU_QUEUE_SIZE_LOG2  19
>> +    data =
>> +        1 << 27 |                    /* Attr Types override */
>> +        SMMU_QUEUE_SIZE_LOG2 << 21 | /* Cmd Q size */
>> +        SMMU_QUEUE_SIZE_LOG2 << 16 | /* Event Q size */
>> +        SMMU_QUEUE_SIZE_LOG2 << 11 | /* PRI Q size */
>> +        0  << 6 |                    /* SSID not supported */
>> +        SMMU_IDR1_SIDSIZE;
>> +
>> +    smmu_write32_reg(s, SMMU_REG_IDR1, data);
>> +
>> +    s->sid_size = SMMU_IDR1_SIDSIZE;
>> +
>> +    data = SMMU_IDR5_GRAN << SMMU_IDR5_GRAN_SHIFT | SMMU_IDR5_OAS;
> 
> For the VFIO case, should we not set the granule size based on the
> underlying page size bitmap derived from VFIO_IOMMU_GET_INFO? Otherwise,
> if the guest kernel is built with a 4k page size and the host kernel
> uses 64k, we would start getting map errors.

Yes, at the moment this is not implemented (the first target of the
series is virtio/vhost).

On Intel, if I understand correctly, the minimum required page sizes
are 4K and 2MB; 1GB is optional. I understand the emulated model does
not expose 1GB (FL1GP = 0).

On ARM nothing is mandatory, although the 4K and 64K granules are
"strongly recommended". Each supported granule adds the following
page sizes:

        if (reg & IDR5_GRAN64K)
                smmu->pgsize_bitmap |= SZ_64K | SZ_512M;
        if (reg & IDR5_GRAN16K)
                smmu->pgsize_bitmap |= SZ_16K | SZ_32M;
        if (reg & IDR5_GRAN4K)
                smmu->pgsize_bitmap |= SZ_4K | SZ_2M | SZ_1G;

Maybe we can override the IDR5 values using the vfio_memory_listener. I
will try to prototype this idea.
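
To make the idea concrete, here is a minimal sketch of how the IDR5
granule bits could be derived from a host page size bitmap such as the
one reported by VFIO_IOMMU_GET_INFO. This is not code from the series:
the helper names and the rule used (advertise a granule only when it is
at least as large as the smallest host page size) are assumptions made
for illustration. Only the field layout (granule bits at shift 4, OAS in
the low bits) follows the patch.

```c
#include <assert.h>
#include <stdint.h>

/* IDR5 granule field, laid out as in the patch (SMMU_IDR5_GRAN_SHIFT = 4) */
#define IDR5_GRAN_SHIFT  4
#define IDR5_GRAN4K      (1u << 0)
#define IDR5_GRAN16K     (1u << 1)
#define IDR5_GRAN64K     (1u << 2)

#define SZ_4K   0x1000ULL
#define SZ_16K  0x4000ULL
#define SZ_64K  0x10000ULL

/*
 * Hypothetical helper: choose the granules the guest may use, given the
 * host IOMMU page size bitmap (one bit set per supported page size).
 * Assumption: a granule is usable when it is at least as large as the
 * smallest host page size, so every guest mapping stays host-aligned.
 */
static uint32_t smmu_gran_from_host_pgsizes(uint64_t host_pgsizes)
{
    /* Isolate the lowest set bit, i.e. the smallest host page size */
    uint64_t min_pg = host_pgsizes & (~host_pgsizes + 1);
    uint32_t gran = 0;

    if (!min_pg) {
        return 0; /* empty bitmap: advertise no granule */
    }
    if (SZ_4K >= min_pg) {
        gran |= IDR5_GRAN4K;
    }
    if (SZ_16K >= min_pg) {
        gran |= IDR5_GRAN16K;
    }
    if (SZ_64K >= min_pg) {
        gran |= IDR5_GRAN64K;
    }
    return gran;
}

/* Hypothetical helper: combine the granule bits and the OAS field */
static uint32_t smmu_idr5_value(uint32_t gran, uint32_t oas)
{
    assert((gran & ~0x7u) == 0); /* three granule bits */
    assert((oas & ~0x7u) == 0);  /* OAS is a 3-bit field */
    return gran << IDR5_GRAN_SHIFT | oas;
}
```

With a 64K host (bitmap SZ_64K | SZ_512M) only GRAN64K would be
advertised under this rule, so a guest built with 4K pages would see at
probe time that its granule is unsupported rather than hit map errors
later.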

Thanks

Eric


> 
> 
> 
>> +
>> +    smmu_write32_reg(s, SMMU_REG_IDR5, data);
>> +}
>> +
>> +static void smmuv3_init_queues(SMMUV3State *s)
>> +{
>> +    s->cmdq.prod = 0;
>> +    s->cmdq.cons = 0;
>> +    s->cmdq.wrap.prod = 0;
>> +    s->cmdq.wrap.cons = 0;
>> +
>> +    s->evtq.prod = 0;
>> +    s->evtq.cons = 0;
>> +    s->evtq.wrap.prod = 0;
>> +    s->evtq.wrap.cons = 0;
>> +
>> +    s->cmdq.entries = SMMU_QUEUE_SIZE_LOG2;
>> +    s->cmdq.ent_size = sizeof(Cmd);
>> +    s->evtq.entries = SMMU_QUEUE_SIZE_LOG2;
>> +    s->evtq.ent_size = sizeof(Evt);
>> +}
>> +
>> +static void smmuv3_init(SMMUV3State *s)
>> +{
>> +    smmuv3_init_regs(s);
>> +    smmuv3_init_queues(s);
>> +}
>> +
>> +static inline void smmu_update_base_reg(SMMUV3State *s, uint64_t *base,
>> +                                        uint64_t val)
>> +{
>> +    *base = val & ~(SMMU_BASE_RA | 0x3fULL);
>> +}
>> +
>> +static void smmu_write_mmio_fixup(SMMUV3State *s, hwaddr *addr)
>> +{
>> +    switch (*addr) {
>> +    case 0x100a8: case 0x100ac:         /* Aliasing => page0 registers */
>> +    case 0x100c8: case 0x100cc:
>> +        *addr ^= (hwaddr)0x10000;
>> +    }
>> +}
>> +
>> +static void smmu_write_mmio(void *opaque, hwaddr addr,
>> +                            uint64_t val, unsigned size)
>> +{
>> +}
>> +
>> +static uint64_t smmu_read_mmio(void *opaque, hwaddr addr, unsigned size)
>> +{
>> +    SMMUState *sys = opaque;
>> +    SMMUV3State *s = SMMU_V3_DEV(sys);
>> +    uint64_t val;
>> +
>> +    smmu_write_mmio_fixup(s, &addr);
>> +
>> +    /* Primecell/Corelink ID registers */
>> +    switch (addr) {
>> +    case 0xFF0 ... 0xFFC:
>> +    case 0xFDC ... 0xFE4:
>> +        val = 0;
>> +        error_report("addr:0x%"PRIx64" val:0x%"PRIx64, addr, val);
>> +        break;
>> +    case SMMU_REG_STRTAB_BASE ... SMMU_REG_CMDQ_BASE:
>> +    case SMMU_REG_EVTQ_BASE:
>> +    case SMMU_REG_PRIQ_BASE ... SMMU_REG_PRIQ_IRQ_CFG1:
>> +        val = smmu_read64_reg(s, addr);
>> +        break;
>> +    default:
>> +        val = (uint64_t)smmu_read32_reg(s, addr);
>> +        break;
>> +    }
>> +
>> +    trace_smmuv3_read_mmio(addr, val, size);
>> +    return val;
>> +}
>> +
>> +static const MemoryRegionOps smmu_mem_ops = {
>> +    .read = smmu_read_mmio,
>> +    .write = smmu_write_mmio,
>> +    .endianness = DEVICE_LITTLE_ENDIAN,
>> +    .valid = {
>> +        .min_access_size = 4,
>> +        .max_access_size = 8,
>> +    },
>> +    .impl = {
>> +        .min_access_size = 4,
>> +        .max_access_size = 8,
>> +    },
>> +};
>> +
>> +static void smmu_init_irq(SMMUV3State *s, SysBusDevice *dev)
>> +{
>> +    int i;
>> +
>> +    for (i = 0; i < ARRAY_SIZE(s->irq); i++) {
>> +        sysbus_init_irq(dev, &s->irq[i]);
>> +    }
>> +}
>> +
>> +static void smmu_reset(DeviceState *dev)
>> +{
>> +    SMMUV3State *s = SMMU_V3_DEV(dev);
>> +    smmuv3_init(s);
>> +}
>> +
>> +static void smmu_realize(DeviceState *d, Error **errp)
>> +{
>> +    SMMUState *sys = SMMU_SYS_DEV(d);
>> +    SMMUV3State *s = SMMU_V3_DEV(sys);
>> +    SysBusDevice *dev = SYS_BUS_DEVICE(d);
>> +
>> +    memory_region_init_io(&sys->iomem, OBJECT(s),
>> +                          &smmu_mem_ops, sys, TYPE_SMMU_V3_DEV, 0x20000);
>> +
>> +    sys->mrtypename = g_strdup(TYPE_SMMUV3_IOMMU_MEMORY_REGION);
>> +
>> +    sysbus_init_mmio(dev, &sys->iomem);
>> +
>> +    smmu_init_irq(s, dev);
>> +}
>> +
>> +static const VMStateDescription vmstate_smmuv3 = {
>> +    .name = "smmuv3",
>> +    .version_id = 1,
>> +    .minimum_version_id = 1,
>> +    .fields = (VMStateField[]) {
>> +        VMSTATE_UINT32_ARRAY(regs, SMMUV3State, SMMU_NREGS),
>> +        VMSTATE_END_OF_LIST(),
>> +    },
>> +};
>> +
>> +static void smmuv3_instance_init(Object *obj)
>> +{
>> +    /* Nothing much to do here as of now */
>> +}
>> +
>> +static void smmuv3_class_init(ObjectClass *klass, void *data)
>> +{
>> +    DeviceClass *dc = DEVICE_CLASS(klass);
>> +
>> +    dc->reset   = smmu_reset;
>> +    dc->vmsd    = &vmstate_smmuv3;
>> +    dc->realize = smmu_realize;
>> +}
>> +
>> +static void smmuv3_iommu_memory_region_class_init(ObjectClass *klass,
>> +                                                  void *data)
>> +{
>> +}
>> +
>> +static const TypeInfo smmuv3_type_info = {
>> +    .name          = TYPE_SMMU_V3_DEV,
>> +    .parent        = TYPE_SMMU_DEV_BASE,
>> +    .instance_size = sizeof(SMMUV3State),
>> +    .instance_init = smmuv3_instance_init,
>> +    .class_data    = NULL,
>> +    .class_size    = sizeof(SMMUV3Class),
>> +    .class_init    = smmuv3_class_init,
>> +};
>> +
>> +static const TypeInfo smmuv3_iommu_memory_region_info = {
>> +    .parent = TYPE_IOMMU_MEMORY_REGION,
>> +    .name = TYPE_SMMUV3_IOMMU_MEMORY_REGION,
>> +    .class_init = smmuv3_iommu_memory_region_class_init,
>> +};
>> +
>> +static void smmuv3_register_types(void)
>> +{
>> +    type_register(&smmuv3_type_info);
>> +    type_register(&smmuv3_iommu_memory_region_info);
>> +}
>> +
>> +type_init(smmuv3_register_types)
>> +
>> diff --git a/hw/arm/trace-events b/hw/arm/trace-events
>> index c67cd39..8affbf7 100644
>> --- a/hw/arm/trace-events
>> +++ b/hw/arm/trace-events
>> @@ -14,3 +14,6 @@ smmu_page_walk_level_block_pte(int stage, int level, uint64_t baseaddr, uint64_t
>>  smmu_page_walk_level_table_pte(int stage, int level, uint64_t baseaddr, uint64_t pteaddr, uint64_t pte, uint64_t address) "stage=%d, level=%d base@=0x%"PRIx64" pte@=0x%"PRIx64" pte=0x%"PRIx64" next table address = 0x%"PRIx64
>>  smmu_get_pte(uint64_t baseaddr, int index, uint64_t pteaddr, uint64_t pte) "baseaddr=0x%"PRIx64" index=0x%x, pteaddr=0x%"PRIx64", pte=0x%"PRIx64
>>  smmu_set_translated_address(hwaddr iova, hwaddr pa) "iova = 0x%"PRIx64" -> pa = 0x%"PRIx64
>> +
>> +#hw/arm/smmuv3.c
>> +smmuv3_read_mmio(hwaddr addr, uint64_t val, unsigned size) "addr: 0x%"PRIx64" val:0x%"PRIx64" size: 0x%x"
>> diff --git a/include/hw/arm/smmuv3.h b/include/hw/arm/smmuv3.h
>> new file mode 100644
>> index 0000000..0c8973d
>> --- /dev/null
>> +++ b/include/hw/arm/smmuv3.h
>> @@ -0,0 +1,79 @@
>> +/*
>> + * Copyright (C) 2014-2016 Broadcom Corporation
>> + * Copyright (c) 2017 Red Hat, Inc.
>> + * Written by Prem Mallappa, Eric Auger
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License as published by
>> + * the Free Software Foundation, either version 2 of the License, or
>> + * (at your option) any later version.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public License along
>> + * with this program; if not, see <http://www.gnu.org/licenses/>.
>> + */
>> +
>> +#ifndef HW_ARM_SMMUV3_H
>> +#define HW_ARM_SMMUV3_H
>> +
>> +#include "hw/arm/smmu-common.h"
>> +
>> +#define TYPE_SMMUV3_IOMMU_MEMORY_REGION "smmuv3-iommu-memory-region"
>> +
>> +#define SMMU_NREGS            0x200
>> +
>> +typedef struct SMMUQueue {
>> +     hwaddr base;
>> +     uint32_t prod;
>> +     uint32_t cons;
>> +     union {
>> +          struct {
>> +               uint8_t prod:1;
>> +               uint8_t cons:1;
>> +          };
>> +          uint8_t unused;
>> +     } wrap;
>> +
>> +     uint16_t entries;           /* Number of entries */
>> +     uint8_t  ent_size;          /* Size of entry in bytes */
>> +     uint8_t  shift;             /* Size in log2 */
>> +} SMMUQueue;
>> +
>> +typedef struct SMMUV3State {
>> +    SMMUState     smmu_state;
>> +
>> +    /* Local cache of most-frequently used registers */
>> +#define SMMU_FEATURE_2LVL_STE (1 << 0)
>> +    uint32_t     features;
>> +    uint16_t     sid_size;
>> +    uint16_t     sid_split;
>> +    uint64_t     strtab_base;
>> +
>> +    uint32_t    regs[SMMU_NREGS];
>> +
>> +    qemu_irq     irq[4];
>> +    SMMUQueue    cmdq, evtq;
>> +
>> +} SMMUV3State;
>> +
>> +typedef enum {
>> +    SMMU_IRQ_EVTQ,
>> +    SMMU_IRQ_PRIQ,
>> +    SMMU_IRQ_CMD_SYNC,
>> +    SMMU_IRQ_GERROR,
>> +} SMMUIrq;
>> +
>> +typedef struct {
>> +    SMMUBaseClass smmu_base_class;
>> +} SMMUV3Class;
>> +
>> +#define TYPE_SMMU_V3_DEV   "smmuv3"
>> +#define SMMU_V3_DEV(obj) OBJECT_CHECK(SMMUV3State, (obj), TYPE_SMMU_V3_DEV)
>> +#define SMMU_V3_DEVICE_GET_CLASS(obj)                              \
>> +    OBJECT_GET_CLASS(SMMUBaseClass, (obj), TYPE_SMMU_V3_DEV)
>> +
>> +#endif
>> -- 
>> 2.5.5
>>
>>
> 


* Re: [Qemu-devel] [Qemu-arm] [PATCH v7 05/20] hw/arm/smmuv3: Skeleton
  2017-09-08 15:18     ` Auger Eric
@ 2017-09-12  6:14       ` Linu Cherian
  0 siblings, 0 replies; 72+ messages in thread
From: Linu Cherian @ 2017-09-12  6:14 UTC (permalink / raw)
  To: Auger Eric
  Cc: Linu Cherian, peter.maydell, drjones, tcain, Radha.Chintakuntla,
	Sunil.Goutham, mohun106, jean-philippe.brucker, tn,
	bharat.bhushan, mst, will.deacon, qemu-devel, peterx,
	alex.williamson, qemu-arm, christoffer.dall, wtownsen,
	robin.murphy, prem.mallappa, eric.auger.pro

On Fri Sep 08, 2017 at 05:18:19PM +0200, Auger Eric wrote:
> Hi Linu,
> 
> On 08/09/2017 12:52, Linu Cherian wrote:
> > Hi Eric,
> > 
> > On Fri Sep 01, 2017 at 07:21:08PM +0200, Eric Auger wrote:
> >> From: Prem Mallappa <prem.mallappa@broadcom.com>
> >>
> >> This patch implements a skeleton for the smmuv3 device.
> >> Datatypes and register definitions are introduced. The MMIO
> >> region, the interrupts and the queue are initialized (PRI is
> >> not supported).
> >>
> >> Only the MMIO read operation is implemented here.
> >>
> >> Signed-off-by: Prem Mallappa <prem.mallappa@broadcom.com>
> >> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> >>
> >> ---
> >> v6 -> v7:
> >> - split into several patches
> >>
> >> v5 -> v6:
> >> - Use IOMMUMemoryRegion
> >> - regs become uint32_t and fix 64b MMIO access (.impl)
> >> - trace_smmuv3_write/read_mmio take the size param
> >>
> >> v4 -> v5:
> >> - change smmuv3_translate proto (IOMMUAccessFlags flag)
> >> - has_stagex replaced by is_ste_stagex
> >> - smmu_cfg_populate removed
> >> - added smmuv3_decode_config and reworked error management
> >> - rework the naming of IOMMU mrs
> >> - fix SMMU_CMDQ_CONS offset
> >>
> >> v3 -> v4
> >> - smmu_irq_update
> >> - fix hash key allocation
> >> - set smmu_iommu_ops
> >> - set SMMU_REG_CR0,
> >> - smmuv3_translate: ret.perm not set in bypass mode
> >> - use trace events
> >> - renamed STM2U64 into L1STD_L2PTR and STMSPAN into L1STD_SPAN
> >> - rework smmu_find_ste
> >> - fix tg2granule in TT0/0b10 corresponds to 16kB
> >>
> >> v2 -> v3:
> >> - move creation of include/hw/arm/smmuv3.h to this patch to fix compilation issue
> >> - compilation allowed
> >> - fix sbus allocation in smmu_init_pci_iommu
> >> - restructure code into headers
> >> - misc cleanups
> >> ---
> >>  hw/arm/Makefile.objs     |   2 +-
> >>  hw/arm/smmuv3-internal.h | 201 +++++++++++++++++++++++++++++++++++++++
> >>  hw/arm/smmuv3.c          | 239 +++++++++++++++++++++++++++++++++++++++++++++++
> >>  hw/arm/trace-events      |   3 +
> >>  include/hw/arm/smmuv3.h  |  79 ++++++++++++++++
> >>  5 files changed, 523 insertions(+), 1 deletion(-)
> >>  create mode 100644 hw/arm/smmuv3-internal.h
> >>  create mode 100644 hw/arm/smmuv3.c
> >>  create mode 100644 include/hw/arm/smmuv3.h
> >>
> >> diff --git a/hw/arm/Makefile.objs b/hw/arm/Makefile.objs
> >> index 5b2d38d..a7c808b 100644
> >> --- a/hw/arm/Makefile.objs
> >> +++ b/hw/arm/Makefile.objs
> >> @@ -19,4 +19,4 @@ obj-$(CONFIG_FSL_IMX31) += fsl-imx31.o kzm.o
> >>  obj-$(CONFIG_FSL_IMX6) += fsl-imx6.o sabrelite.o
> >>  obj-$(CONFIG_ASPEED_SOC) += aspeed_soc.o aspeed.o
> >>  obj-$(CONFIG_MPS2) += mps2.o
> >> -obj-$(CONFIG_ARM_SMMUV3) += smmu-common.o
> >> +obj-$(CONFIG_ARM_SMMUV3) += smmu-common.o smmuv3.o
> >> diff --git a/hw/arm/smmuv3-internal.h b/hw/arm/smmuv3-internal.h
> >> new file mode 100644
> >> index 0000000..488acc8
> >> --- /dev/null
> >> +++ b/hw/arm/smmuv3-internal.h
> >> @@ -0,0 +1,201 @@
> >> +/*
> >> + * ARM SMMUv3 support - Internal API
> >> + *
> >> + * Copyright (C) 2014-2016 Broadcom Corporation
> >> + * Copyright (c) 2017 Red Hat, Inc.
> >> + * Written by Prem Mallappa, Eric Auger
> >> + *
> >> + * This program is free software; you can redistribute it and/or modify
> >> + * it under the terms of the GNU General Public License as published by
> >> + * the Free Software Foundation, either version 2 of the License, or
> >> + * (at your option) any later version.
> >> + *
> >> + * This program is distributed in the hope that it will be useful,
> >> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> >> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> >> + * GNU General Public License for more details.
> >> + *
> >> + * You should have received a copy of the GNU General Public License along
> >> + * with this program; if not, see <http://www.gnu.org/licenses/>.
> >> + */
> >> +
> >> +#ifndef HW_ARM_SMMU_V3_INTERNAL_H
> >> +#define HW_ARM_SMMU_V3_INTERNAL_H
> >> +
> >> +#include "trace.h"
> >> +#include "qemu/error-report.h"
> >> +#include "hw/arm/smmu-common.h"
> >> +
> >> +/*****************************
> >> + * MMIO Register
> >> + *****************************/
> >> +enum {
> >> +    SMMU_REG_IDR0            = 0x0,
> >> +
> >> +/* IDR0 Field Values and supported features */
> >> +
> >> +#define SMMU_IDR0_S2P      1  /* stage 2 */
> >> +#define SMMU_IDR0_S1P      1  /* stage 1 */
> >> +#define SMMU_IDR0_TTF      2  /* Aarch64 only - not Aarch32 (LPAE) */
> >> +#define SMMU_IDR0_COHACC   1  /* IO coherent access */
> >> +#define SMMU_IDR0_HTTU     2  /* Access and Dirty flag update */
> >> +#define SMMU_IDR0_HYP      0  /* Hypervisor Stage 1 contexts */
> >> +#define SMMU_IDR0_ATS      0  /* PCIe RC ATS */
> >> +#define SMMU_IDR0_ASID16   1  /* 16-bit ASID */
> >> +#define SMMU_IDR0_PRI      0  /* Page Request Interface */
> >> +#define SMMU_IDR0_VMID16   0  /* 16-bit VMID */
> >> +#define SMMU_IDR0_CD2L     0  /* 2-level Context Descriptor table */
> >> +#define SMMU_IDR0_STALL    1  /* Stalling fault model */
> >> +#define SMMU_IDR0_TERM     1  /* Termination model behaviour */
> >> +#define SMMU_IDR0_STLEVEL  1  /* Multi-level Stream Table */
> >> +
> >> +#define SMMU_IDR0_S2P_SHIFT      0
> >> +#define SMMU_IDR0_S1P_SHIFT      1
> >> +#define SMMU_IDR0_TTF_SHIFT      2
> >> +#define SMMU_IDR0_COHACC_SHIFT   4
> >> +#define SMMU_IDR0_HTTU_SHIFT     6
> >> +#define SMMU_IDR0_HYP_SHIFT      9
> >> +#define SMMU_IDR0_ATS_SHIFT      10
> >> +#define SMMU_IDR0_ASID16_SHIFT   12
> >> +#define SMMU_IDR0_PRI_SHIFT      16
> >> +#define SMMU_IDR0_VMID16_SHIFT   18
> >> +#define SMMU_IDR0_CD2L_SHIFT     19
> >> +#define SMMU_IDR0_STALL_SHIFT    24
> >> +#define SMMU_IDR0_TERM_SHIFT     26
> >> +#define SMMU_IDR0_STLEVEL_SHIFT  27
> >> +
> >> +    SMMU_REG_IDR1            = 0x4,
> >> +#define SMMU_IDR1_SIDSIZE 16
> >> +    SMMU_REG_IDR2            = 0x8,
> >> +    SMMU_REG_IDR3            = 0xc,
> >> +    SMMU_REG_IDR4            = 0x10,
> >> +    SMMU_REG_IDR5            = 0x14,
> >> +#define SMMU_IDR5_GRAN_SHIFT 4
> >> +#define SMMU_IDR5_GRAN       0b101 /* GRAN4K, GRAN64K */
> >> +#define SMMU_IDR5_OAS        4     /* 44 bits */
> >> +    SMMU_REG_IIDR            = 0x1c,
> >> +    SMMU_REG_CR0             = 0x20,
> >> +
> >> +#define SMMU_CR0_SMMU_ENABLE (1 << 0)
> >> +#define SMMU_CR0_PRIQ_ENABLE (1 << 1)
> >> +#define SMMU_CR0_EVTQ_ENABLE (1 << 2)
> >> +#define SMMU_CR0_CMDQ_ENABLE (1 << 3)
> >> +#define SMMU_CR0_ATS_CHECK   (1 << 4)
> >> +
> >> +    SMMU_REG_CR0_ACK         = 0x24,
> >> +    SMMU_REG_CR1             = 0x28,
> >> +    SMMU_REG_CR2             = 0x2c,
> >> +
> >> +    SMMU_REG_STATUSR         = 0x40,
> >> +
> >> +    SMMU_REG_IRQ_CTRL        = 0x50,
> >> +    SMMU_REG_IRQ_CTRL_ACK    = 0x54,
> >> +
> >> +#define SMMU_IRQ_CTRL_GERROR_EN (1 << 0)
> >> +#define SMMU_IRQ_CTRL_EVENT_EN  (1 << 1)
> >> +#define SMMU_IRQ_CTRL_PRI_EN    (1 << 2)
> >> +
> >> +    SMMU_REG_GERROR          = 0x60,
> >> +
> >> +#define SMMU_GERROR_CMDQ           (1 << 0)
> >> +#define SMMU_GERROR_EVENTQ_ABT     (1 << 2)
> >> +#define SMMU_GERROR_PRIQ_ABT       (1 << 3)
> >> +#define SMMU_GERROR_MSI_CMDQ_ABT   (1 << 4)
> >> +#define SMMU_GERROR_MSI_EVENTQ_ABT (1 << 5)
> >> +#define SMMU_GERROR_MSI_PRIQ_ABT   (1 << 6)
> >> +#define SMMU_GERROR_MSI_GERROR_ABT (1 << 7)
> >> +#define SMMU_GERROR_SFM_ERR        (1 << 8)
> >> +
> >> +    SMMU_REG_GERRORN         = 0x64,
> >> +    SMMU_REG_GERROR_IRQ_CFG0 = 0x68,
> >> +    SMMU_REG_GERROR_IRQ_CFG1 = 0x70,
> >> +    SMMU_REG_GERROR_IRQ_CFG2 = 0x74,
> >> +
> >> +    /* SMMU_BASE_RA Applies to STRTAB_BASE, CMDQ_BASE and EVTQ_BASE */
> >> +#define SMMU_BASE_RA        (1ULL << 62)
> >> +    SMMU_REG_STRTAB_BASE     = 0x80,
> >> +    SMMU_REG_STRTAB_BASE_CFG = 0x88,
> >> +
> >> +    SMMU_REG_CMDQ_BASE       = 0x90,
> >> +    SMMU_REG_CMDQ_PROD       = 0x98,
> >> +    SMMU_REG_CMDQ_CONS       = 0x9c,
> >> +    /* CMD Consumer (CONS) */
> >> +#define SMMU_CMD_CONS_ERR_SHIFT        24
> >> +#define SMMU_CMD_CONS_ERR_BITS         7
> >> +
> >> +    SMMU_REG_EVTQ_BASE       = 0xa0,
> >> +    SMMU_REG_EVTQ_PROD       = 0xa8,
> >> +    SMMU_REG_EVTQ_CONS       = 0xac,
> >> +    SMMU_REG_EVTQ_IRQ_CFG0   = 0xb0,
> >> +    SMMU_REG_EVTQ_IRQ_CFG1   = 0xb8,
> >> +    SMMU_REG_EVTQ_IRQ_CFG2   = 0xbc,
> >> +
> >> +    SMMU_REG_PRIQ_BASE       = 0xc0,
> >> +    SMMU_REG_PRIQ_PROD       = 0xc8,
> >> +    SMMU_REG_PRIQ_CONS       = 0xcc,
> >> +    SMMU_REG_PRIQ_IRQ_CFG0   = 0xd0,
> >> +    SMMU_REG_PRIQ_IRQ_CFG1   = 0xd8,
> >> +    SMMU_REG_PRIQ_IRQ_CFG2   = 0xdc,
> >> +
> >> +    SMMU_ID_REGS_OFFSET      = 0xfd0,
> >> +
> >> +    /* Secure registers are not used for now */
> >> +    SMMU_SECURE_OFFSET       = 0x8000,
> >> +};
> >> +
> >> +/**********************
> >> + * Data Structures
> >> + **********************/
> >> +
> >> +struct __smmu_data2 {
> >> +    uint32_t word[2];
> >> +};
> >> +
> >> +struct __smmu_data8 {
> >> +    uint32_t word[8];
> >> +};
> >> +
> >> +struct __smmu_data16 {
> >> +    uint32_t word[16];
> >> +};
> >> +
> >> +struct __smmu_data4 {
> >> +    uint32_t word[4];
> >> +};
> >> +
> >> +typedef struct __smmu_data4  Cmd; /* Command Entry */
> >> +typedef struct __smmu_data8  Evt; /* Event Entry */
> >> +
> >> +/*****************************
> >> + *  Register Access Primitives
> >> + *****************************/
> >> +
> >> +static inline void smmu_write32_reg(SMMUV3State *s, uint32_t addr, uint32_t val)
> >> +{
> >> +    s->regs[addr >> 2] = val;
> >> +}
> >> +
> >> +static inline void smmu_write64_reg(SMMUV3State *s, uint32_t addr, uint64_t val)
> >> +{
> >> +    addr >>= 2;
> >> +    s->regs[addr] = extract64(val, 0, 32);
> >> +    s->regs[addr + 1] = extract64(val, 32, 32);
> >> +}
> >> +
> >> +static inline uint32_t smmu_read32_reg(SMMUV3State *s, uint32_t addr)
> >> +{
> >> +    return s->regs[addr >> 2];
> >> +}
> >> +
> >> +static inline uint64_t smmu_read64_reg(SMMUV3State *s, uint32_t addr)
> >> +{
> >> +    addr >>= 2;
> >> +    return s->regs[addr] | ((uint64_t)(s->regs[addr + 1]) << 32);
> >> +}
> >> +
> >> +static inline int smmu_enabled(SMMUV3State *s)
> >> +{
> >> +    return smmu_read32_reg(s, SMMU_REG_CR0) & SMMU_CR0_SMMU_ENABLE;
> >> +}
> >> +
> >> +#endif
> >> diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
> >> new file mode 100644
> >> index 0000000..0a7cd1c
> >> --- /dev/null
> >> +++ b/hw/arm/smmuv3.c
> >> @@ -0,0 +1,239 @@
> >> +/*
> >> + * Copyright (C) 2014-2016 Broadcom Corporation
> >> + * Copyright (c) 2017 Red Hat, Inc.
> >> + * Written by Prem Mallappa, Eric Auger
> >> + *
> >> + * This program is free software; you can redistribute it and/or modify
> >> + * it under the terms of the GNU General Public License as published by
> >> + * the Free Software Foundation, either version 2 of the License, or
> >> + * (at your option) any later version.
> >> + *
> >> + * This program is distributed in the hope that it will be useful,
> >> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> >> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> >> + * GNU General Public License for more details.
> >> + *
> >> + * You should have received a copy of the GNU General Public License along
> >> + * with this program; if not, see <http://www.gnu.org/licenses/>.
> >> + */
> >> +
> >> +#include "qemu/osdep.h"
> >> +#include "hw/boards.h"
> >> +#include "sysemu/sysemu.h"
> >> +#include "hw/sysbus.h"
> >> +#include "hw/pci/pci.h"
> >> +#include "exec/address-spaces.h"
> >> +#include "trace.h"
> >> +#include "qemu/error-report.h"
> >> +
> >> +#include "hw/arm/smmuv3.h"
> >> +#include "smmuv3-internal.h"
> >> +
> >> +static void smmuv3_init_regs(SMMUV3State *s)
> >> +{
> >> +    uint32_t data =
> >> +        SMMU_IDR0_STLEVEL << SMMU_IDR0_STLEVEL_SHIFT |
> >> +        SMMU_IDR0_TERM    << SMMU_IDR0_TERM_SHIFT    |
> >> +        SMMU_IDR0_STALL   << SMMU_IDR0_STALL_SHIFT   |
> >> +        SMMU_IDR0_VMID16  << SMMU_IDR0_VMID16_SHIFT  |
> >> +        SMMU_IDR0_PRI     << SMMU_IDR0_PRI_SHIFT     |
> >> +        SMMU_IDR0_ASID16  << SMMU_IDR0_ASID16_SHIFT  |
> >> +        SMMU_IDR0_ATS     << SMMU_IDR0_ATS_SHIFT     |
> >> +        SMMU_IDR0_HYP     << SMMU_IDR0_HYP_SHIFT     |
> >> +        SMMU_IDR0_HTTU    << SMMU_IDR0_HTTU_SHIFT    |
> >> +        SMMU_IDR0_COHACC  << SMMU_IDR0_COHACC_SHIFT  |
> >> +        SMMU_IDR0_TTF     << SMMU_IDR0_TTF_SHIFT     |
> >> +        SMMU_IDR0_S1P     << SMMU_IDR0_S1P_SHIFT     |
> >> +        SMMU_IDR0_S2P     << SMMU_IDR0_S2P_SHIFT;
> >> +
> >> +    smmu_write32_reg(s, SMMU_REG_IDR0, data);
> >> +
> >> +#define SMMU_QUEUE_SIZE_LOG2  19
> >> +    data =
> >> +        1 << 27 |                    /* Attr Types override */
> >> +        SMMU_QUEUE_SIZE_LOG2 << 21 | /* Cmd Q size */
> >> +        SMMU_QUEUE_SIZE_LOG2 << 16 | /* Event Q size */
> >> +        SMMU_QUEUE_SIZE_LOG2 << 11 | /* PRI Q size */
> >> +        0  << 6 |                    /* SSID not supported */
> >> +        SMMU_IDR1_SIDSIZE;
> >> +
> >> +    smmu_write32_reg(s, SMMU_REG_IDR1, data);
> >> +
> >> +    s->sid_size = SMMU_IDR1_SIDSIZE;
> >> +
> >> +    data = SMMU_IDR5_GRAN << SMMU_IDR5_GRAN_SHIFT | SMMU_IDR5_OAS;
> > 
> > For the VFIO case, should we not set the granule size based on the
> > underlying page size bitmap derived from VFIO_IOMMU_GET_INFO? Otherwise,
> > if the guest kernel is built with a 4k page size and the host kernel
> > uses 64k, we would start getting map errors.
> 
> Yes, at the moment this is not implemented (the first target of the
> series is virtio/vhost).
> 
> On Intel, if I understand correctly, the minimum required page sizes
> are 4K and 2MB; 1GB is optional. I understand the emulated model does
> not expose 1GB (FL1GP = 0).
> 
> On ARM nothing is mandatory, although the 4K and 64K granules are
> "strongly recommended". Each supported granule adds the following
> page sizes:
> 
>         if (reg & IDR5_GRAN64K)
>                 smmu->pgsize_bitmap |= SZ_64K | SZ_512M;
>         if (reg & IDR5_GRAN16K)
>                 smmu->pgsize_bitmap |= SZ_16K | SZ_32M;
>         if (reg & IDR5_GRAN4K)
>                 smmu->pgsize_bitmap |= SZ_4K | SZ_2M | SZ_1G;
> 
> Maybe we can override the IDR5 values using the vfio_memory_listener. I
> will try to prototype this idea.

Ok. Thanks.


> 
> Thanks
> 
> Eric
> 
> 
> > 
> > 
> > 
> >> +
> >> +    smmu_write32_reg(s, SMMU_REG_IDR5, data);
> >> +}
> >> +
> >> +static void smmuv3_init_queues(SMMUV3State *s)
> >> +{
> >> +    s->cmdq.prod = 0;
> >> +    s->cmdq.cons = 0;
> >> +    s->cmdq.wrap.prod = 0;
> >> +    s->cmdq.wrap.cons = 0;
> >> +
> >> +    s->evtq.prod = 0;
> >> +    s->evtq.cons = 0;
> >> +    s->evtq.wrap.prod = 0;
> >> +    s->evtq.wrap.cons = 0;
> >> +
> >> +    s->cmdq.entries = SMMU_QUEUE_SIZE_LOG2;
> >> +    s->cmdq.ent_size = sizeof(Cmd);
> >> +    s->evtq.entries = SMMU_QUEUE_SIZE_LOG2;
> >> +    s->evtq.ent_size = sizeof(Evt);
> >> +}
> >> +
> >> +static void smmuv3_init(SMMUV3State *s)
> >> +{
> >> +    smmuv3_init_regs(s);
> >> +    smmuv3_init_queues(s);
> >> +}
> >> +
> >> +static inline void smmu_update_base_reg(SMMUV3State *s, uint64_t *base,
> >> +                                        uint64_t val)
> >> +{
> >> +    *base = val & ~(SMMU_BASE_RA | 0x3fULL);
> >> +}
> >> +
> >> +static void smmu_write_mmio_fixup(SMMUV3State *s, hwaddr *addr)
> >> +{
> >> +    switch (*addr) {
> >> +    case 0x100a8: case 0x100ac:         /* Aliasing => page0 registers */
> >> +    case 0x100c8: case 0x100cc:
> >> +        *addr ^= (hwaddr)0x10000;
> >> +    }
> >> +}
> >> +
> >> +static void smmu_write_mmio(void *opaque, hwaddr addr,
> >> +                            uint64_t val, unsigned size)
> >> +{
> >> +}
> >> +
> >> +static uint64_t smmu_read_mmio(void *opaque, hwaddr addr, unsigned size)
> >> +{
> >> +    SMMUState *sys = opaque;
> >> +    SMMUV3State *s = SMMU_V3_DEV(sys);
> >> +    uint64_t val;
> >> +
> >> +    smmu_write_mmio_fixup(s, &addr);
> >> +
> >> +    /* Primecell/Corelink ID registers */
> >> +    switch (addr) {
> >> +    case 0xFF0 ... 0xFFC:
> >> +    case 0xFDC ... 0xFE4:
> >> +        val = 0;
> >> +        error_report("addr:0x%"PRIx64" val:0x%"PRIx64, addr, val);
> >> +        break;
> >> +    case SMMU_REG_STRTAB_BASE ... SMMU_REG_CMDQ_BASE:
> >> +    case SMMU_REG_EVTQ_BASE:
> >> +    case SMMU_REG_PRIQ_BASE ... SMMU_REG_PRIQ_IRQ_CFG1:
> >> +        val = smmu_read64_reg(s, addr);
> >> +        break;
> >> +    default:
> >> +        val = (uint64_t)smmu_read32_reg(s, addr);
> >> +        break;
> >> +    }
> >> +
> >> +    trace_smmuv3_read_mmio(addr, val, size);
> >> +    return val;
> >> +}
> >> +
> >> +static const MemoryRegionOps smmu_mem_ops = {
> >> +    .read = smmu_read_mmio,
> >> +    .write = smmu_write_mmio,
> >> +    .endianness = DEVICE_LITTLE_ENDIAN,
> >> +    .valid = {
> >> +        .min_access_size = 4,
> >> +        .max_access_size = 8,
> >> +    },
> >> +    .impl = {
> >> +        .min_access_size = 4,
> >> +        .max_access_size = 8,
> >> +    },
> >> +};
> >> +
> >> +static void smmu_init_irq(SMMUV3State *s, SysBusDevice *dev)
> >> +{
> >> +    int i;
> >> +
> >> +    for (i = 0; i < ARRAY_SIZE(s->irq); i++) {
> >> +        sysbus_init_irq(dev, &s->irq[i]);
> >> +    }
> >> +}
> >> +
> >> +static void smmu_reset(DeviceState *dev)
> >> +{
> >> +    SMMUV3State *s = SMMU_V3_DEV(dev);
> >> +    smmuv3_init(s);
> >> +}
> >> +
> >> +static void smmu_realize(DeviceState *d, Error **errp)
> >> +{
> >> +    SMMUState *sys = SMMU_SYS_DEV(d);
> >> +    SMMUV3State *s = SMMU_V3_DEV(sys);
> >> +    SysBusDevice *dev = SYS_BUS_DEVICE(d);
> >> +
> >> +    memory_region_init_io(&sys->iomem, OBJECT(s),
> >> +                          &smmu_mem_ops, sys, TYPE_SMMU_V3_DEV, 0x20000);
> >> +
> >> +    sys->mrtypename = g_strdup(TYPE_SMMUV3_IOMMU_MEMORY_REGION);
> >> +
> >> +    sysbus_init_mmio(dev, &sys->iomem);
> >> +
> >> +    smmu_init_irq(s, dev);
> >> +}
> >> +
> >> +static const VMStateDescription vmstate_smmuv3 = {
> >> +    .name = "smmuv3",
> >> +    .version_id = 1,
> >> +    .minimum_version_id = 1,
> >> +    .fields = (VMStateField[]) {
> >> +        VMSTATE_UINT32_ARRAY(regs, SMMUV3State, SMMU_NREGS),
> >> +        VMSTATE_END_OF_LIST(),
> >> +    },
> >> +};
> >> +
> >> +static void smmuv3_instance_init(Object *obj)
> >> +{
> >> +    /* Nothing much to do here as of now */
> >> +}
> >> +
> >> +static void smmuv3_class_init(ObjectClass *klass, void *data)
> >> +{
> >> +    DeviceClass *dc = DEVICE_CLASS(klass);
> >> +
> >> +    dc->reset   = smmu_reset;
> >> +    dc->vmsd    = &vmstate_smmuv3;
> >> +    dc->realize = smmu_realize;
> >> +}
> >> +
> >> +static void smmuv3_iommu_memory_region_class_init(ObjectClass *klass,
> >> +                                                  void *data)
> >> +{
> >> +}
> >> +
> >> +static const TypeInfo smmuv3_type_info = {
> >> +    .name          = TYPE_SMMU_V3_DEV,
> >> +    .parent        = TYPE_SMMU_DEV_BASE,
> >> +    .instance_size = sizeof(SMMUV3State),
> >> +    .instance_init = smmuv3_instance_init,
> >> +    .class_data    = NULL,
> >> +    .class_size    = sizeof(SMMUV3Class),
> >> +    .class_init    = smmuv3_class_init,
> >> +};
> >> +
> >> +static const TypeInfo smmuv3_iommu_memory_region_info = {
> >> +    .parent = TYPE_IOMMU_MEMORY_REGION,
> >> +    .name = TYPE_SMMUV3_IOMMU_MEMORY_REGION,
> >> +    .class_init = smmuv3_iommu_memory_region_class_init,
> >> +};
> >> +
> >> +static void smmuv3_register_types(void)
> >> +{
> >> +    type_register(&smmuv3_type_info);
> >> +    type_register(&smmuv3_iommu_memory_region_info);
> >> +}
> >> +
> >> +type_init(smmuv3_register_types)
> >> +
> >> diff --git a/hw/arm/trace-events b/hw/arm/trace-events
> >> index c67cd39..8affbf7 100644
> >> --- a/hw/arm/trace-events
> >> +++ b/hw/arm/trace-events
> >> @@ -14,3 +14,6 @@ smmu_page_walk_level_block_pte(int stage, int level, uint64_t baseaddr, uint64_t
> >>  smmu_page_walk_level_table_pte(int stage, int level, uint64_t baseaddr, uint64_t pteaddr, uint64_t pte, uint64_t address) "stage=%d, level=%d base@=0x%"PRIx64" pte@=0x%"PRIx64" pte=0x%"PRIx64" next table address = 0x%"PRIx64
> >>  smmu_get_pte(uint64_t baseaddr, int index, uint64_t pteaddr, uint64_t pte) "baseaddr=0x%"PRIx64" index=0x%x, pteaddr=0x%"PRIx64", pte=0x%"PRIx64
> >>  smmu_set_translated_address(hwaddr iova, hwaddr pa) "iova = 0x%"PRIx64" -> pa = 0x%"PRIx64
> >> +
> >> +#hw/arm/smmuv3.c
> >> +smmuv3_read_mmio(hwaddr addr, uint64_t val, unsigned size) "addr: 0x%"PRIx64" val:0x%"PRIx64" size: 0x%x"
> >> diff --git a/include/hw/arm/smmuv3.h b/include/hw/arm/smmuv3.h
> >> new file mode 100644
> >> index 0000000..0c8973d
> >> --- /dev/null
> >> +++ b/include/hw/arm/smmuv3.h
> >> @@ -0,0 +1,79 @@
> >> +/*
> >> + * Copyright (C) 2014-2016 Broadcom Corporation
> >> + * Copyright (c) 2017 Red Hat, Inc.
> >> + * Written by Prem Mallappa, Eric Auger
> >> + *
> >> + * This program is free software; you can redistribute it and/or modify
> >> + * it under the terms of the GNU General Public License as published by
> >> + * the Free Software Foundation, either version 2 of the License, or
> >> + * (at your option) any later version.
> >> + *
> >> + * This program is distributed in the hope that it will be useful,
> >> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> >> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> >> + * GNU General Public License for more details.
> >> + *
> >> + * You should have received a copy of the GNU General Public License along
> >> + * with this program; if not, see <http://www.gnu.org/licenses/>.
> >> + */
> >> +
> >> +#ifndef HW_ARM_SMMUV3_H
> >> +#define HW_ARM_SMMUV3_H
> >> +
> >> +#include "hw/arm/smmu-common.h"
> >> +
> >> +#define TYPE_SMMUV3_IOMMU_MEMORY_REGION "smmuv3-iommu-memory-region"
> >> +
> >> +#define SMMU_NREGS            0x200
> >> +
> >> +typedef struct SMMUQueue {
> >> +     hwaddr base;
> >> +     uint32_t prod;
> >> +     uint32_t cons;
> >> +     union {
> >> +          struct {
> >> +               uint8_t prod:1;
> >> +               uint8_t cons:1;
> >> +          };
> >> +          uint8_t unused;
> >> +     } wrap;
> >> +
> >> +     uint16_t entries;           /* Number of entries */
> >> +     uint8_t  ent_size;          /* Size of entry in bytes */
> >> +     uint8_t  shift;             /* Size in log2 */
> >> +} SMMUQueue;
> >> +
> >> +typedef struct SMMUV3State {
> >> +    SMMUState     smmu_state;
> >> +
> >> +    /* Local cache of most-frequently used registers */
> >> +#define SMMU_FEATURE_2LVL_STE (1 << 0)
> >> +    uint32_t     features;
> >> +    uint16_t     sid_size;
> >> +    uint16_t     sid_split;
> >> +    uint64_t     strtab_base;
> >> +
> >> +    uint32_t    regs[SMMU_NREGS];
> >> +
> >> +    qemu_irq     irq[4];
> >> +    SMMUQueue    cmdq, evtq;
> >> +
> >> +} SMMUV3State;
> >> +
> >> +typedef enum {
> >> +    SMMU_IRQ_EVTQ,
> >> +    SMMU_IRQ_PRIQ,
> >> +    SMMU_IRQ_CMD_SYNC,
> >> +    SMMU_IRQ_GERROR,
> >> +} SMMUIrq;
> >> +
> >> +typedef struct {
> >> +    SMMUBaseClass smmu_base_class;
> >> +} SMMUV3Class;
> >> +
> >> +#define TYPE_SMMU_V3_DEV   "smmuv3"
> >> +#define SMMU_V3_DEV(obj) OBJECT_CHECK(SMMUV3State, (obj), TYPE_SMMU_V3_DEV)
> >> +#define SMMU_V3_DEVICE_GET_CLASS(obj)                              \
> >> +    OBJECT_GET_CLASS(SMMUBaseClass, (obj), TYPE_SMMU_V3_DEV)
> >> +
> >> +#endif
> >> -- 
> >> 2.5.5
> >>
> >>
> > 

-- 
Linu cherian

* Re: [Qemu-devel] [Qemu-arm] [PATCH v7 00/20] ARM SMMUv3 Emulation Support
  2017-09-01 17:21 [Qemu-devel] [PATCH v7 00/20] ARM SMMUv3 Emulation Support Eric Auger
                   ` (21 preceding siblings ...)
  2017-09-08  5:47 ` Michael S. Tsirkin
@ 2017-09-12  6:18 ` Linu Cherian
  2017-09-12  6:38   ` Auger Eric
  2017-09-28  6:43 ` Linu Cherian
  2017-10-24  5:38 ` Linu Cherian
  24 siblings, 1 reply; 72+ messages in thread
From: Linu Cherian @ 2017-09-12  6:18 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, peter.maydell, qemu-arm, qemu-devel,
	prem.mallappa, alex.williamson, mohun106, drjones, tcain,
	Radha.Chintakuntla, Sunil.Goutham, mst, jean-philippe.brucker,
	tn, will.deacon, robin.murphy, peterx, bharat.bhushan,
	christoffer.dall, wtownsen, linu.cherian

Hi Eric,

On Fri Sep 01, 2017 at 07:21:03PM +0200, Eric Auger wrote:
> This series implements the emulation code for ARM SMMUv3.
> 
> Changes since v6:
> - DPDK testpmd now running on guest with 2 assigned VFs
> - Changed the instantiation method: add the following option to
>   the QEMU command line
>   -device smmuv3 # for virtio/vhost use cases
>   -device smmuv3,caching-mode # for vfio use cases (based on [1])
> - splitted the series into smaller patches to allow the review
> - the VFIO integration based on "tlbi-on-map" smmuv3 driver
>   is isolated from the rest: last 2 patches, not for upstream.
>   This is shipped for testing/bench until a better solution is found.
> - Reworked permission flag checks and event generation
> 
> testing:
> - in dt and ACPI modes
> - virtio-net-pci and vhost-net devices using dma ops with various
>   guest page sizes [2]
> - assigned VFs using dma ops [3]:
>   - AMD Overdrive and igbvf passthrough (using gsi direct mapping)
>   - Cavium ThunderX and ixgbevf passthrough (using KVM MSI routing)
> - DPDK testpmd on guest running with VFIO user space drivers (2 igbvf) [3]
>   with guest and host page size equal (4kB)
> 
> Known limitations:
> - no VMSAv8-32 suport
> - no nested stage support (S1 + S2)
> - no support for HYP mappings
> - register fine emulation, commands, interrupts and errors were
>   not accurately tested. Handling is sufficient to run use cases
>   described above though.
> - interrupts and event generation not observed yet.
> 

By design, shouldn't this work on hardware with SMMUv2 implementations as well,
i.e. a guest with SMMUv3 emulation + a host with SMMUv2 hardware?

Or are there any known limitations for this?

-- 
Linu cherian

* Re: [Qemu-devel] [Qemu-arm] [PATCH v7 00/20] ARM SMMUv3 Emulation Support
  2017-09-12  6:18 ` [Qemu-devel] [Qemu-arm] " Linu Cherian
@ 2017-09-12  6:38   ` Auger Eric
  0 siblings, 0 replies; 72+ messages in thread
From: Auger Eric @ 2017-09-12  6:38 UTC (permalink / raw)
  To: Linu Cherian
  Cc: eric.auger.pro, peter.maydell, qemu-arm, qemu-devel,
	prem.mallappa, alex.williamson, mohun106, drjones, tcain,
	Radha.Chintakuntla, Sunil.Goutham, mst, jean-philippe.brucker,
	tn, will.deacon, robin.murphy, peterx, bharat.bhushan,
	christoffer.dall, wtownsen, linu.cherian

Hi Linu,

On 12/09/2017 08:18, Linu Cherian wrote:
> Hi Eric,
> 
> On Fri Sep 01, 2017 at 07:21:03PM +0200, Eric Auger wrote:
>> This series implements the emulation code for ARM SMMUv3.
>>
>> Changes since v6:
>> - DPDK testpmd now running on guest with 2 assigned VFs
>> - Changed the instantiation method: add the following option to
>>   the QEMU command line
>>   -device smmuv3 # for virtio/vhost use cases
>>   -device smmuv3,caching-mode # for vfio use cases (based on [1])
>> - splitted the series into smaller patches to allow the review
>> - the VFIO integration based on "tlbi-on-map" smmuv3 driver
>>   is isolated from the rest: last 2 patches, not for upstream.
>>   This is shipped for testing/bench until a better solution is found.
>> - Reworked permission flag checks and event generation
>>
>> testing:
>> - in dt and ACPI modes
>> - virtio-net-pci and vhost-net devices using dma ops with various
>>   guest page sizes [2]
>> - assigned VFs using dma ops [3]:
>>   - AMD Overdrive and igbvf passthrough (using gsi direct mapping)
>>   - Cavium ThunderX and ixgbevf passthrough (using KVM MSI routing)
>> - DPDK testpmd on guest running with VFIO user space drivers (2 igbvf) [3]
>>   with guest and host page size equal (4kB)
>>
>> Known limitations:
>> - no VMSAv8-32 suport
>> - no nested stage support (S1 + S2)
>> - no support for HYP mappings
>> - register fine emulation, commands, interrupts and errors were
>>   not accurately tested. Handling is sufficient to run use cases
>>   described above though.
>> - interrupts and event generation not observed yet.
>>
> 
> By design, shouldnt this work on hardware with smmuv2 implementations as well. 
> ie. Guest with smmuv3 emulation + Host with smmuv2 hardware.

Yes indeed. I am mostly testing with a host featuring smmuv2 at the moment.

Thanks

Eric
> 
> Or Is there any known limitations for this ?
> 

* Re: [Qemu-devel] [Qemu-arm] [PATCH v7 13/20] hw/arm/smmuv3: Implement IOMMU memory region replay callback
  2017-09-01 17:21 ` [Qemu-devel] [PATCH v7 13/20] hw/arm/smmuv3: Implement IOMMU memory region replay callback Eric Auger
@ 2017-09-14  9:27   ` Linu Cherian
  2017-09-14 14:31     ` Tomasz Nowicki
  2017-09-15  7:23     ` Auger Eric
  0 siblings, 2 replies; 72+ messages in thread
From: Linu Cherian @ 2017-09-14  9:27 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, peter.maydell, qemu-arm, qemu-devel,
	prem.mallappa, alex.williamson, mohun106, drjones, tcain,
	Radha.Chintakuntla, Sunil.Goutham, mst, jean-philippe.brucker,
	tn, will.deacon, robin.murphy, peterx, bharat.bhushan,
	christoffer.dall, wtownsen, linu.cherian

Hi Eric,

On Fri Sep 01, 2017 at 07:21:16PM +0200, Eric Auger wrote:
> memory_region_iommu_replay() is used for VFIO integration.
> 
> However its default implementation is not adapted to SMMUv3
> IOMMU memory region. Indeed the input address range is too
> huge and its execution is too slow as it calls the translate()
> callback on each granule.
> 
> Let's implement the replay callback which hierarchically walk
> over the page table structure and notify only the segments
> that are populated with valid entries.
> 
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> ---
>  hw/arm/smmuv3.c     | 36 ++++++++++++++++++++++++++++++++++++
>  hw/arm/trace-events |  1 +
>  2 files changed, 37 insertions(+)
> 
> diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
> index 8e7d10d..c43bd93 100644
> --- a/hw/arm/smmuv3.c
> +++ b/hw/arm/smmuv3.c
> @@ -657,6 +657,41 @@ static int smmuv3_notify_entry(IOMMUTLBEntry *entry, void *private)
>      return 0;
>  }
>  
> +/* Unmap the whole notifier's range */
> +static void smmuv3_unmap_notifier_range(IOMMUNotifier *n)
> +{
> +    IOMMUTLBEntry entry;
> +    hwaddr size = n->end - n->start + 1;
> +
> +    entry.target_as = &address_space_memory;
> +    entry.iova = n->start & ~(size - 1);
> +    entry.perm = IOMMU_NONE;
> +    entry.addr_mask = size - 1;
> +
> +    memory_region_notify_one(n, &entry);
> +}
> +
> +static void smmuv3_replay(IOMMUMemoryRegion *mr, IOMMUNotifier *n)
> +{
> +    SMMUTransCfg cfg = {};
> +    int ret;
> +
> +    trace_smmuv3_replay(mr->parent_obj.name, n, n->start, n->end);
> +    smmuv3_unmap_notifier_range(n);
> +
> +    ret = smmuv3_decode_config(mr, &cfg);
> +    if (ret) {
> +        error_report("%s error decoding the configuration for iommu mr=%s",
> +                     __func__, mr->parent_obj.name);
> +    }
> 

When an invalid config is found, shouldn't we return rather than proceeding with
the page table walk? For example, on an invalid stream table entry.
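
A minimal sketch of the suggested early return, under stated assumptions:
MockTransCfg, mock_decode_config() and mock_page_walk() are hypothetical
stand-ins for SMMUTransCfg, smmuv3_decode_config() and smmu_page_walk(), not
the real QEMU definitions. It also assumes a failed decode leaves the config
zeroed; in that case tsz would be 0 and the walk bound
(1ULL << (64 - tsz)) - 1 would shift by 64, which is undefined in C.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for SMMUTransCfg: only the fields used below. */
typedef struct {
    bool disabled;
    bool bypassed;
    uint8_t tsz;
} MockTransCfg;

/* Counts how many page table walks were started. */
static int walks_started;

/*
 * Hypothetical stand-in for smmuv3_decode_config().
 * Returns 0 on success, a negative value when the stream table
 * entry is invalid. On failure the config is left untouched.
 */
static int mock_decode_config(bool ste_valid, MockTransCfg *cfg)
{
    if (!ste_valid) {
        return -1;
    }
    cfg->disabled = false;
    cfg->bypassed = false;
    cfg->tsz = 16; /* illustrative value: 48-bit input range */
    return 0;
}

/* Hypothetical stand-in for smmu_page_walk(): only computes the range. */
static void mock_page_walk(const MockTransCfg *cfg)
{
    /* The shift count must stay below 64 to be well defined. */
    assert(cfg->tsz > 0 && cfg->tsz < 64);
    uint64_t end = (1ULL << (64 - cfg->tsz)) - 1;

    fprintf(stderr, "walking [0x0, 0x%llx]\n", (unsigned long long)end);
    walks_started++;
}

/* Returns true if a walk was performed, false if the replay stopped early. */
static bool mock_replay(bool ste_valid)
{
    MockTransCfg cfg = {0};
    int ret = mock_decode_config(ste_valid, &cfg);

    if (ret) {
        fprintf(stderr, "%s: error decoding the configuration\n", __func__);
        return false; /* suggested early return: do not walk */
    }
    if (cfg.disabled || cfg.bypassed) {
        return false;
    }
    mock_page_walk(&cfg);
    return true;
}
```

With the early return, mock_replay(false) reports the error and never reaches
the walk, while mock_replay(true) walks once.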

> +
> +    if (cfg.disabled || cfg.bypassed) {
> +        return;
> +    }
> +    /* walk the page tables and replay valid entries */
> +    smmu_page_walk(&cfg, 0, (1ULL << (64 - cfg.tsz)) - 1, false,
> +                   smmuv3_notify_entry, n);
> +}
>  static void smmuv3_notify_iova_range(IOMMUMemoryRegion *mr, IOMMUNotifier *n,
>                                       uint64_t iova, size_t size)
>  {
> @@ -1095,6 +1130,7 @@ static void smmuv3_iommu_memory_region_class_init(ObjectClass *klass,
>  
>      imrc->translate = smmuv3_translate;
>      imrc->notify_flag_changed = smmuv3_notify_flag_changed;
> +    imrc->replay = smmuv3_replay;
>  }
>  
>  static const TypeInfo smmuv3_type_info = {
> diff --git a/hw/arm/trace-events b/hw/arm/trace-events
> index 4ac264d..15f84d6 100644
> --- a/hw/arm/trace-events
> +++ b/hw/arm/trace-events
> @@ -46,5 +46,6 @@ smmuv3_cfg_stage(int s, uint32_t oas, uint32_t tsz, uint64_t ttbr, bool aa64, ui
>  smmuv3_notify_flag_add(const char *iommu) "ADD SMMUNotifier node for iommu mr=%s"
>  smmuv3_notify_flag_del(const char *iommu) "DEL SMMUNotifier node for iommu mr=%s"
>  smmuv3_replay_mr(const char *name) "iommu mr=%s"
> +smmuv3_replay(const char *name, void *n, hwaddr start, hwaddr end) "iommu mr=%s notifier=%p [0x%"PRIx64",0x%"PRIx64"]"
>  smmuv3_notify_entry(hwaddr iova, hwaddr pa, hwaddr mask, int perm) "iova=0x%"PRIx64" pa=0x%" PRIx64" mask=0x%"PRIx64" perm=%d"
>  smmuv3_notify_iova_range(const char *name, uint64_t iova, size_t size, void *n) "iommu mr=%s iova=0x%"PRIx64" size=0x%lx n=%p"
> -- 
> 2.5.5
> 
> 

-- 
Linu cherian

* Re: [Qemu-devel] [Qemu-arm] [PATCH v7 13/20] hw/arm/smmuv3: Implement IOMMU memory region replay callback
  2017-09-14  9:27   ` [Qemu-devel] [Qemu-arm] " Linu Cherian
@ 2017-09-14 14:31     ` Tomasz Nowicki
  2017-09-14 14:43       ` Tomasz Nowicki
  2017-09-15  7:23     ` Auger Eric
  1 sibling, 1 reply; 72+ messages in thread
From: Tomasz Nowicki @ 2017-09-14 14:31 UTC (permalink / raw)
  To: Linu Cherian, Eric Auger
  Cc: eric.auger.pro, peter.maydell, qemu-arm, qemu-devel,
	prem.mallappa, alex.williamson, mohun106, drjones, tcain,
	Radha.Chintakuntla, Sunil.Goutham, mst, jean-philippe.brucker,
	will.deacon, robin.murphy, peterx, bharat.bhushan,
	christoffer.dall, wtownsen, linu.cherian

Hi Eric,

On 14.09.2017 11:27, Linu Cherian wrote:
> Hi Eric,
> 
> On Fri Sep 01, 2017 at 07:21:16PM +0200, Eric Auger wrote:
>> memory_region_iommu_replay() is used for VFIO integration.
>>
>> However its default implementation is not adapted to SMMUv3
>> IOMMU memory region. Indeed the input address range is too
>> huge and its execution is too slow as it calls the translate()
>> callback on each granule.
>>
>> Let's implement the replay callback which hierarchically walk
>> over the page table structure and notify only the segments
>> that are populated with valid entries.
>>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>> ---
>>   hw/arm/smmuv3.c     | 36 ++++++++++++++++++++++++++++++++++++
>>   hw/arm/trace-events |  1 +
>>   2 files changed, 37 insertions(+)
>>
>> diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
>> index 8e7d10d..c43bd93 100644
>> --- a/hw/arm/smmuv3.c
>> +++ b/hw/arm/smmuv3.c
>> @@ -657,6 +657,41 @@ static int smmuv3_notify_entry(IOMMUTLBEntry *entry, void *private)
>>       return 0;
>>   }
>>   
>> +/* Unmap the whole notifier's range */
>> +static void smmuv3_unmap_notifier_range(IOMMUNotifier *n)
>> +{
>> +    IOMMUTLBEntry entry;
>> +    hwaddr size = n->end - n->start + 1;
>> +
>> +    entry.target_as = &address_space_memory;
>> +    entry.iova = n->start & ~(size - 1);
>> +    entry.perm = IOMMU_NONE;
>> +    entry.addr_mask = size - 1;
>> +
>> +    memory_region_notify_one(n, &entry);
>> +}
>> +
>> +static void smmuv3_replay(IOMMUMemoryRegion *mr, IOMMUNotifier *n)
>> +{
>> +    SMMUTransCfg cfg = {};
>> +    int ret;
>> +
>> +    trace_smmuv3_replay(mr->parent_obj.name, n, n->start, n->end);
>> +    smmuv3_unmap_notifier_range(n);
>> +
>> +    ret = smmuv3_decode_config(mr, &cfg);
>> +    if (ret) {
>> +        error_report("%s error decoding the configuration for iommu mr=%s",
>> +                     __func__, mr->parent_obj.name);
>> +    }
>>
> 
> On an invalid config being found, shouldnt we return rather than proceeding with
> page table walk. For example on an invalid Stream table entry.

Indeed, without a return here the vhost case is not working for me.

Thanks,
Tomasz

> 
>> +
>> +    if (cfg.disabled || cfg.bypassed) {
>> +        return;
>> +    }
>> +    /* walk the page tables and replay valid entries */
>> +    smmu_page_walk(&cfg, 0, (1ULL << (64 - cfg.tsz)) - 1, false,
>> +                   smmuv3_notify_entry, n);
>> +}
>>   static void smmuv3_notify_iova_range(IOMMUMemoryRegion *mr, IOMMUNotifier *n,
>>                                        uint64_t iova, size_t size)
>>   {
>> @@ -1095,6 +1130,7 @@ static void smmuv3_iommu_memory_region_class_init(ObjectClass *klass,
>>   
>>       imrc->translate = smmuv3_translate;
>>       imrc->notify_flag_changed = smmuv3_notify_flag_changed;
>> +    imrc->replay = smmuv3_replay;
>>   }
>>   
>>   static const TypeInfo smmuv3_type_info = {
>> diff --git a/hw/arm/trace-events b/hw/arm/trace-events
>> index 4ac264d..15f84d6 100644
>> --- a/hw/arm/trace-events
>> +++ b/hw/arm/trace-events
>> @@ -46,5 +46,6 @@ smmuv3_cfg_stage(int s, uint32_t oas, uint32_t tsz, uint64_t ttbr, bool aa64, ui
>>   smmuv3_notify_flag_add(const char *iommu) "ADD SMMUNotifier node for iommu mr=%s"
>>   smmuv3_notify_flag_del(const char *iommu) "DEL SMMUNotifier node for iommu mr=%s"
>>   smmuv3_replay_mr(const char *name) "iommu mr=%s"
>> +smmuv3_replay(const char *name, void *n, hwaddr start, hwaddr end) "iommu mr=%s notifier=%p [0x%"PRIx64",0x%"PRIx64"]"
>>   smmuv3_notify_entry(hwaddr iova, hwaddr pa, hwaddr mask, int perm) "iova=0x%"PRIx64" pa=0x%" PRIx64" mask=0x%"PRIx64" perm=%d"
>>   smmuv3_notify_iova_range(const char *name, uint64_t iova, size_t size, void *n) "iommu mr=%s iova=0x%"PRIx64" size=0x%lx n=%p"
>> -- 
>> 2.5.5
>>
>>
> 

* Re: [Qemu-devel] [Qemu-arm] [PATCH v7 13/20] hw/arm/smmuv3: Implement IOMMU memory region replay callback
  2017-09-14 14:31     ` Tomasz Nowicki
@ 2017-09-14 14:43       ` Tomasz Nowicki
  2017-09-15  7:30         ` Auger Eric
  0 siblings, 1 reply; 72+ messages in thread
From: Tomasz Nowicki @ 2017-09-14 14:43 UTC (permalink / raw)
  To: Linu Cherian, Eric Auger
  Cc: eric.auger.pro, peter.maydell, qemu-arm, qemu-devel,
	prem.mallappa, alex.williamson, mohun106, drjones, tcain,
	Radha.Chintakuntla, Sunil.Goutham, mst, jean-philippe.brucker,
	will.deacon, robin.murphy, peterx, bharat.bhushan,
	christoffer.dall, wtownsen, linu.cherian

On 14.09.2017 16:31, Tomasz Nowicki wrote:
> Hi Eric,
> 
> On 14.09.2017 11:27, Linu Cherian wrote:
>> Hi Eric,
>>
>> On Fri Sep 01, 2017 at 07:21:16PM +0200, Eric Auger wrote:
>>> memory_region_iommu_replay() is used for VFIO integration.
>>>
>>> However its default implementation is not adapted to SMMUv3
>>> IOMMU memory region. Indeed the input address range is too
>>> huge and its execution is too slow as it calls the translate()
>>> callback on each granule.
>>>
>>> Let's implement the replay callback which hierarchically walk
>>> over the page table structure and notify only the segments
>>> that are populated with valid entries.
>>>
>>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>> ---
>>>   hw/arm/smmuv3.c     | 36 ++++++++++++++++++++++++++++++++++++
>>>   hw/arm/trace-events |  1 +
>>>   2 files changed, 37 insertions(+)
>>>
>>> diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
>>> index 8e7d10d..c43bd93 100644
>>> --- a/hw/arm/smmuv3.c
>>> +++ b/hw/arm/smmuv3.c
>>> @@ -657,6 +657,41 @@ static int smmuv3_notify_entry(IOMMUTLBEntry 
>>> *entry, void *private)
>>>       return 0;
>>>   }
>>> +/* Unmap the whole notifier's range */
>>> +static void smmuv3_unmap_notifier_range(IOMMUNotifier *n)
>>> +{
>>> +    IOMMUTLBEntry entry;
>>> +    hwaddr size = n->end - n->start + 1;
>>> +
>>> +    entry.target_as = &address_space_memory;
>>> +    entry.iova = n->start & ~(size - 1);
>>> +    entry.perm = IOMMU_NONE;
>>> +    entry.addr_mask = size - 1;
>>> +
>>> +    memory_region_notify_one(n, &entry);
>>> +}
>>> +
>>> +static void smmuv3_replay(IOMMUMemoryRegion *mr, IOMMUNotifier *n)
>>> +{
>>> +    SMMUTransCfg cfg = {};
>>> +    int ret;
>>> +
>>> +    trace_smmuv3_replay(mr->parent_obj.name, n, n->start, n->end);
>>> +    smmuv3_unmap_notifier_range(n);
>>> +
>>> +    ret = smmuv3_decode_config(mr, &cfg);
>>> +    if (ret) {
>>> +        error_report("%s error decoding the configuration for iommu 
>>> mr=%s",
>>> +                     __func__, mr->parent_obj.name);
>>> +    }
>>>
>>
>> On an invalid config being found, shouldnt we return rather than 
>> proceeding with
>> page table walk. For example on an invalid Stream table entry.
> 
> Indeed, without return here vhost case is not working for me.

I was just lucky that one time. Adding the return here has no effect; vhost is
still not working. Sorry for the noise.

Tomasz

> 
> Thanks,
> Tomasz
> 
>>
>> +
>>> +    if (cfg.disabled || cfg.bypassed) {
>>> +        return;
>>> +    }
>>> +    /* walk the page tables and replay valid entries */
>>> +    smmu_page_walk(&cfg, 0, (1ULL << (64 - cfg.tsz)) - 1, false,
>>> +                   smmuv3_notify_entry, n);
>>> +}
>>>   static void smmuv3_notify_iova_range(IOMMUMemoryRegion *mr, 
>>> IOMMUNotifier *n,
>>>                                        uint64_t iova, size_t size)
>>>   {
>>> @@ -1095,6 +1130,7 @@ static void 
>>> smmuv3_iommu_memory_region_class_init(ObjectClass *klass,
>>>       imrc->translate = smmuv3_translate;
>>>       imrc->notify_flag_changed = smmuv3_notify_flag_changed;
>>> +    imrc->replay = smmuv3_replay;
>>>   }
>>>   static const TypeInfo smmuv3_type_info = {
>>> diff --git a/hw/arm/trace-events b/hw/arm/trace-events
>>> index 4ac264d..15f84d6 100644
>>> --- a/hw/arm/trace-events
>>> +++ b/hw/arm/trace-events
>>> @@ -46,5 +46,6 @@ smmuv3_cfg_stage(int s, uint32_t oas, uint32_t tsz, 
>>> uint64_t ttbr, bool aa64, ui
>>>   smmuv3_notify_flag_add(const char *iommu) "ADD SMMUNotifier node 
>>> for iommu mr=%s"
>>>   smmuv3_notify_flag_del(const char *iommu) "DEL SMMUNotifier node 
>>> for iommu mr=%s"
>>>   smmuv3_replay_mr(const char *name) "iommu mr=%s"
>>> +smmuv3_replay(const char *name, void *n, hwaddr start, hwaddr end) 
>>> "iommu mr=%s notifier=%p [0x%"PRIx64",0x%"PRIx64"]"
>>>   smmuv3_notify_entry(hwaddr iova, hwaddr pa, hwaddr mask, int perm) 
>>> "iova=0x%"PRIx64" pa=0x%" PRIx64" mask=0x%"PRIx64" perm=%d"
>>>   smmuv3_notify_iova_range(const char *name, uint64_t iova, size_t 
>>> size, void *n) "iommu mr=%s iova=0x%"PRIx64" size=0x%lx n=%p"
>>> -- 
>>> 2.5.5
>>>
>>>
>>


* Re: [Qemu-devel] [Qemu-arm] [PATCH v7 13/20] hw/arm/smmuv3: Implement IOMMU memory region replay callback
  2017-09-14  9:27   ` [Qemu-devel] [Qemu-arm] " Linu Cherian
  2017-09-14 14:31     ` Tomasz Nowicki
@ 2017-09-15  7:23     ` Auger Eric
  1 sibling, 0 replies; 72+ messages in thread
From: Auger Eric @ 2017-09-15  7:23 UTC (permalink / raw)
  To: Linu Cherian
  Cc: peter.maydell, drjones, tcain, Radha.Chintakuntla, Sunil.Goutham,
	mohun106, jean-philippe.brucker, tn, bharat.bhushan, mst,
	will.deacon, qemu-devel, peterx, alex.williamson, qemu-arm,
	christoffer.dall, linu.cherian, wtownsen, robin.murphy,
	prem.mallappa, eric.auger.pro

Hi Linu,

On 14/09/2017 11:27, Linu Cherian wrote:
> Hi Eric,
> 
> On Fri Sep 01, 2017 at 07:21:16PM +0200, Eric Auger wrote:
>> memory_region_iommu_replay() is used for VFIO integration.
>>
>> However its default implementation is not adapted to SMMUv3
>> IOMMU memory region. Indeed the input address range is too
>> huge and its execution is too slow as it calls the translate()
>> callback on each granule.
>>
>> Let's implement the replay callback which hierarchically walk
>> over the page table structure and notify only the segments
>> that are populated with valid entries.
>>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>> ---
>>  hw/arm/smmuv3.c     | 36 ++++++++++++++++++++++++++++++++++++
>>  hw/arm/trace-events |  1 +
>>  2 files changed, 37 insertions(+)
>>
>> diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
>> index 8e7d10d..c43bd93 100644
>> --- a/hw/arm/smmuv3.c
>> +++ b/hw/arm/smmuv3.c
>> @@ -657,6 +657,41 @@ static int smmuv3_notify_entry(IOMMUTLBEntry *entry, void *private)
>>      return 0;
>>  }
>>  
>> +/* Unmap the whole notifier's range */
>> +static void smmuv3_unmap_notifier_range(IOMMUNotifier *n)
>> +{
>> +    IOMMUTLBEntry entry;
>> +    hwaddr size = n->end - n->start + 1;
>> +
>> +    entry.target_as = &address_space_memory;
>> +    entry.iova = n->start & ~(size - 1);
>> +    entry.perm = IOMMU_NONE;
>> +    entry.addr_mask = size - 1;
>> +
>> +    memory_region_notify_one(n, &entry);
>> +}
>> +
>> +static void smmuv3_replay(IOMMUMemoryRegion *mr, IOMMUNotifier *n)
>> +{
>> +    SMMUTransCfg cfg = {};
>> +    int ret;
>> +
>> +    trace_smmuv3_replay(mr->parent_obj.name, n, n->start, n->end);
>> +    smmuv3_unmap_notifier_range(n);
>> +
>> +    ret = smmuv3_decode_config(mr, &cfg);
>> +    if (ret) {
>> +        error_report("%s error decoding the configuration for iommu mr=%s",
>> +                     __func__, mr->parent_obj.name);
>> +    }
>>
> 
> On an invalid config being found, shouldnt we return rather than proceeding with
> page table walk. For example on an invalid Stream table entry.
Yes, that's correct. I am going to fix that.
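
For illustration only, here is a minimal self-contained sketch of the early
return discussed above. All names in it (cfg_sketch, decode_config_sketch,
page_walk_sketch, replay_sketch, walk_calls) are hypothetical stand-ins and
not QEMU's SMMUTransCfg, smmuv3_decode_config() or smmu_page_walk(); the
actual follow-up fix may look different.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the decoded translation configuration. */
struct cfg_sketch {
    bool disabled;
    bool bypassed;
};

/* Counts how often the (stand-in) page table walk was started. */
static int walk_calls;

/* Stand-in decoder: returns non-zero when the configuration is invalid. */
static int decode_config_sketch(bool valid, struct cfg_sketch *cfg)
{
    assert(cfg != NULL);
    if (!valid) {
        return -1;
    }
    cfg->disabled = false;
    cfg->bypassed = false;
    return 0;
}

/* Stand-in for the page table walk; it only records that it was called. */
static void page_walk_sketch(const struct cfg_sketch *cfg)
{
    (void)cfg;
    walk_calls++;
}

/* Returns 0 if the walk ran, 1 if translation is off, -1 on decode error. */
static int replay_sketch(bool config_valid)
{
    struct cfg_sketch cfg = { 0 };
    int ret = decode_config_sketch(config_valid, &cfg);

    if (ret) {
        fprintf(stderr, "%s: error decoding the configuration\n", __func__);
        return -1; /* never walk page tables from an undecoded config */
    }
    if (cfg.disabled || cfg.bypassed) {
        return 1;
    }
    page_walk_sketch(&cfg);
    return 0;
}
```

The point of the sketch is only the control flow: once decoding fails, the
function reports the error and returns before the walk is reached.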

Thanks!

Eric
> 
> +
>> +    if (cfg.disabled || cfg.bypassed) {
>> +        return;
>> +    }
>> +    /* walk the page tables and replay valid entries */
>> +    smmu_page_walk(&cfg, 0, (1ULL << (64 - cfg.tsz)) - 1, false,
>> +                   smmuv3_notify_entry, n);
>> +}
>>  static void smmuv3_notify_iova_range(IOMMUMemoryRegion *mr, IOMMUNotifier *n,
>>                                       uint64_t iova, size_t size)
>>  {
>> @@ -1095,6 +1130,7 @@ static void smmuv3_iommu_memory_region_class_init(ObjectClass *klass,
>>  
>>      imrc->translate = smmuv3_translate;
>>      imrc->notify_flag_changed = smmuv3_notify_flag_changed;
>> +    imrc->replay = smmuv3_replay;
>>  }
>>  
>>  static const TypeInfo smmuv3_type_info = {
>> diff --git a/hw/arm/trace-events b/hw/arm/trace-events
>> index 4ac264d..15f84d6 100644
>> --- a/hw/arm/trace-events
>> +++ b/hw/arm/trace-events
>> @@ -46,5 +46,6 @@ smmuv3_cfg_stage(int s, uint32_t oas, uint32_t tsz, uint64_t ttbr, bool aa64, ui
>>  smmuv3_notify_flag_add(const char *iommu) "ADD SMMUNotifier node for iommu mr=%s"
>>  smmuv3_notify_flag_del(const char *iommu) "DEL SMMUNotifier node for iommu mr=%s"
>>  smmuv3_replay_mr(const char *name) "iommu mr=%s"
>> +smmuv3_replay(const char *name, void *n, hwaddr start, hwaddr end) "iommu mr=%s notifier=%p [0x%"PRIx64",0x%"PRIx64"]"
>>  smmuv3_notify_entry(hwaddr iova, hwaddr pa, hwaddr mask, int perm) "iova=0x%"PRIx64" pa=0x%" PRIx64" mask=0x%"PRIx64" perm=%d"
>>  smmuv3_notify_iova_range(const char *name, uint64_t iova, size_t size, void *n) "iommu mr=%s iova=0x%"PRIx64" size=0x%lx n=%p"
>> -- 
>> 2.5.5
>>
>>
> 


* Re: [Qemu-devel] [Qemu-arm] [PATCH v7 13/20] hw/arm/smmuv3: Implement IOMMU memory region replay callback
  2017-09-14 14:43       ` Tomasz Nowicki
@ 2017-09-15  7:30         ` Auger Eric
  2017-09-15  7:41           ` Auger Eric
  2017-09-15 10:42           ` tn
  0 siblings, 2 replies; 72+ messages in thread
From: Auger Eric @ 2017-09-15  7:30 UTC (permalink / raw)
  To: Tomasz Nowicki, Linu Cherian
  Cc: eric.auger.pro, peter.maydell, qemu-arm, qemu-devel,
	prem.mallappa, alex.williamson, mohun106, drjones, tcain,
	Radha.Chintakuntla, Sunil.Goutham, mst, jean-philippe.brucker,
	will.deacon, robin.murphy, peterx, bharat.bhushan,
	christoffer.dall, wtownsen, linu.cherian

Hi Tomasz,

On 14/09/2017 16:43, Tomasz Nowicki wrote:
> On 14.09.2017 16:31, Tomasz Nowicki wrote:
>> Hi Eric,
>>
>> On 14.09.2017 11:27, Linu Cherian wrote:
>>> Hi Eric,
>>>
>>> On Fri Sep 01, 2017 at 07:21:16PM +0200, Eric Auger wrote:
>>>> memory_region_iommu_replay() is used for VFIO integration.
>>>>
>>>> However its default implementation is not adapted to SMMUv3
>>>> IOMMU memory region. Indeed the input address range is too
>>>> huge and its execution is too slow as it calls the translate()
>>>> callback on each granule.
>>>>
>>>> Let's implement the replay callback which hierarchically walk
>>>> over the page table structure and notify only the segments
>>>> that are populated with valid entries.
>>>>
>>>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>>> ---
>>>>   hw/arm/smmuv3.c     | 36 ++++++++++++++++++++++++++++++++++++
>>>>   hw/arm/trace-events |  1 +
>>>>   2 files changed, 37 insertions(+)
>>>>
>>>> diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
>>>> index 8e7d10d..c43bd93 100644
>>>> --- a/hw/arm/smmuv3.c
>>>> +++ b/hw/arm/smmuv3.c
>>>> @@ -657,6 +657,41 @@ static int smmuv3_notify_entry(IOMMUTLBEntry
>>>> *entry, void *private)
>>>>       return 0;
>>>>   }
>>>> +/* Unmap the whole notifier's range */
>>>> +static void smmuv3_unmap_notifier_range(IOMMUNotifier *n)
>>>> +{
>>>> +    IOMMUTLBEntry entry;
>>>> +    hwaddr size = n->end - n->start + 1;
>>>> +
>>>> +    entry.target_as = &address_space_memory;
>>>> +    entry.iova = n->start & ~(size - 1);
>>>> +    entry.perm = IOMMU_NONE;
>>>> +    entry.addr_mask = size - 1;
>>>> +
>>>> +    memory_region_notify_one(n, &entry);
>>>> +}
>>>> +
>>>> +static void smmuv3_replay(IOMMUMemoryRegion *mr, IOMMUNotifier *n)
>>>> +{
>>>> +    SMMUTransCfg cfg = {};
>>>> +    int ret;
>>>> +
>>>> +    trace_smmuv3_replay(mr->parent_obj.name, n, n->start, n->end);
>>>> +    smmuv3_unmap_notifier_range(n);
>>>> +
>>>> +    ret = smmuv3_decode_config(mr, &cfg);
>>>> +    if (ret) {
>>>> +        error_report("%s error decoding the configuration for iommu
>>>> mr=%s",
>>>> +                     __func__, mr->parent_obj.name);
>>>> +    }
>>>>
>>>
>>> On an invalid config being found, shouldnt we return rather than
>>> proceeding with
>>> page table walk. For example on an invalid Stream table entry.
>>
>> Indeed, without return here vhost case is not working for me.
> 
> I was just lucky one time. return here has no influence. Vhost still not
> working. Sorry for noise.

As far as I understand, the replay() callback is only called in the VFIO use
case, so this shouldn't impact vhost.

I can't reproduce your vhost issue on my side. I will review the
invalidation code again and compare it against the last version.

What is the page size used by your guest?

Thanks

Eric
> 
> Tomasz
> 
>>
>> Thanks,
>> Tomasz
>>
>>>
>>> +
>>>> +    if (cfg.disabled || cfg.bypassed) {
>>>> +        return;
>>>> +    }
>>>> +    /* walk the page tables and replay valid entries */
>>>> +    smmu_page_walk(&cfg, 0, (1ULL << (64 - cfg.tsz)) - 1, false,
>>>> +                   smmuv3_notify_entry, n);
>>>> +}
>>>>   static void smmuv3_notify_iova_range(IOMMUMemoryRegion *mr,
>>>> IOMMUNotifier *n,
>>>>                                        uint64_t iova, size_t size)
>>>>   {
>>>> @@ -1095,6 +1130,7 @@ static void
>>>> smmuv3_iommu_memory_region_class_init(ObjectClass *klass,
>>>>       imrc->translate = smmuv3_translate;
>>>>       imrc->notify_flag_changed = smmuv3_notify_flag_changed;
>>>> +    imrc->replay = smmuv3_replay;
>>>>   }
>>>>   static const TypeInfo smmuv3_type_info = {
>>>> diff --git a/hw/arm/trace-events b/hw/arm/trace-events
>>>> index 4ac264d..15f84d6 100644
>>>> --- a/hw/arm/trace-events
>>>> +++ b/hw/arm/trace-events
>>>> @@ -46,5 +46,6 @@ smmuv3_cfg_stage(int s, uint32_t oas, uint32_t
>>>> tsz, uint64_t ttbr, bool aa64, ui
>>>>   smmuv3_notify_flag_add(const char *iommu) "ADD SMMUNotifier node
>>>> for iommu mr=%s"
>>>>   smmuv3_notify_flag_del(const char *iommu) "DEL SMMUNotifier node
>>>> for iommu mr=%s"
>>>>   smmuv3_replay_mr(const char *name) "iommu mr=%s"
>>>> +smmuv3_replay(const char *name, void *n, hwaddr start, hwaddr end)
>>>> "iommu mr=%s notifier=%p [0x%"PRIx64",0x%"PRIx64"]"
>>>>   smmuv3_notify_entry(hwaddr iova, hwaddr pa, hwaddr mask, int perm)
>>>> "iova=0x%"PRIx64" pa=0x%" PRIx64" mask=0x%"PRIx64" perm=%d"
>>>>   smmuv3_notify_iova_range(const char *name, uint64_t iova, size_t
>>>> size, void *n) "iommu mr=%s iova=0x%"PRIx64" size=0x%lx n=%p"
>>>> -- 
>>>> 2.5.5
>>>>
>>>>
>>>


* Re: [Qemu-devel] [Qemu-arm] [PATCH v7 13/20] hw/arm/smmuv3: Implement IOMMU memory region replay callback
  2017-09-15  7:30         ` Auger Eric
@ 2017-09-15  7:41           ` Auger Eric
  2017-09-15 10:42           ` tn
  1 sibling, 0 replies; 72+ messages in thread
From: Auger Eric @ 2017-09-15  7:41 UTC (permalink / raw)
  To: Tomasz Nowicki, Linu Cherian
  Cc: eric.auger.pro, peter.maydell, qemu-arm, qemu-devel,
	prem.mallappa, alex.williamson, mohun106, drjones, tcain,
	Radha.Chintakuntla, Sunil.Goutham, mst, jean-philippe.brucker,
	will.deacon, robin.murphy, peterx, bharat.bhushan,
	christoffer.dall, wtownsen, linu.cherian



On 15/09/2017 09:30, Auger Eric wrote:
> Hi Tomasz,
> 
> On 14/09/2017 16:43, Tomasz Nowicki wrote:
>> On 14.09.2017 16:31, Tomasz Nowicki wrote:
>>> Hi Eric,
>>>
>>> On 14.09.2017 11:27, Linu Cherian wrote:
>>>> Hi Eric,
>>>>
>>>> On Fri Sep 01, 2017 at 07:21:16PM +0200, Eric Auger wrote:
>>>>> memory_region_iommu_replay() is used for VFIO integration.
>>>>>
>>>>> However its default implementation is not adapted to SMMUv3
>>>>> IOMMU memory region. Indeed the input address range is too
>>>>> huge and its execution is too slow as it calls the translate()
>>>>> callback on each granule.
>>>>>
>>>>> Let's implement the replay callback which hierarchically walk
>>>>> over the page table structure and notify only the segments
>>>>> that are populated with valid entries.
>>>>>
>>>>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>>>> ---
>>>>>   hw/arm/smmuv3.c     | 36 ++++++++++++++++++++++++++++++++++++
>>>>>   hw/arm/trace-events |  1 +
>>>>>   2 files changed, 37 insertions(+)
>>>>>
>>>>> diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
>>>>> index 8e7d10d..c43bd93 100644
>>>>> --- a/hw/arm/smmuv3.c
>>>>> +++ b/hw/arm/smmuv3.c
>>>>> @@ -657,6 +657,41 @@ static int smmuv3_notify_entry(IOMMUTLBEntry
>>>>> *entry, void *private)
>>>>>       return 0;
>>>>>   }
>>>>> +/* Unmap the whole notifier's range */
>>>>> +static void smmuv3_unmap_notifier_range(IOMMUNotifier *n)
>>>>> +{
>>>>> +    IOMMUTLBEntry entry;
>>>>> +    hwaddr size = n->end - n->start + 1;
>>>>> +
>>>>> +    entry.target_as = &address_space_memory;
>>>>> +    entry.iova = n->start & ~(size - 1);
>>>>> +    entry.perm = IOMMU_NONE;
>>>>> +    entry.addr_mask = size - 1;
>>>>> +
>>>>> +    memory_region_notify_one(n, &entry);
>>>>> +}
>>>>> +
>>>>> +static void smmuv3_replay(IOMMUMemoryRegion *mr, IOMMUNotifier *n)
>>>>> +{
>>>>> +    SMMUTransCfg cfg = {};
>>>>> +    int ret;
>>>>> +
>>>>> +    trace_smmuv3_replay(mr->parent_obj.name, n, n->start, n->end);
>>>>> +    smmuv3_unmap_notifier_range(n);
>>>>> +
>>>>> +    ret = smmuv3_decode_config(mr, &cfg);
>>>>> +    if (ret) {
>>>>> +        error_report("%s error decoding the configuration for iommu
>>>>> mr=%s",
>>>>> +                     __func__, mr->parent_obj.name);
>>>>> +    }
>>>>>
>>>>
>>>> On an invalid config being found, shouldnt we return rather than
>>>> proceeding with
>>>> page table walk. For example on an invalid Stream table entry.
>>>
>>> Indeed, without return here vhost case is not working for me.
>>
>> I was just lucky one time. return here has no influence. Vhost still not
>> working. Sorry for noise.
> 
> As far as I understand the replay() callback only is called in VFIO use
> case. So this shouldn't impact vhost.
Forget that: replay() is potentially also called from some invalidation
commands in the vhost case.

Thanks

Eric
> 
> I can't reproduce your vhost issue on my side. I will review the
> invalidate code again and compare against the last version.
> 
> What is the page size used by your guest?
> 
> Thanks
> 
> Eric
>>
>> Tomasz
>>
>>>
>>> Thanks,
>>> Tomasz
>>>
>>>>
>>>> +
>>>>> +    if (cfg.disabled || cfg.bypassed) {
>>>>> +        return;
>>>>> +    }
>>>>> +    /* walk the page tables and replay valid entries */
>>>>> +    smmu_page_walk(&cfg, 0, (1ULL << (64 - cfg.tsz)) - 1, false,
>>>>> +                   smmuv3_notify_entry, n);
>>>>> +}
>>>>>   static void smmuv3_notify_iova_range(IOMMUMemoryRegion *mr,
>>>>> IOMMUNotifier *n,
>>>>>                                        uint64_t iova, size_t size)
>>>>>   {
>>>>> @@ -1095,6 +1130,7 @@ static void
>>>>> smmuv3_iommu_memory_region_class_init(ObjectClass *klass,
>>>>>       imrc->translate = smmuv3_translate;
>>>>>       imrc->notify_flag_changed = smmuv3_notify_flag_changed;
>>>>> +    imrc->replay = smmuv3_replay;
>>>>>   }
>>>>>   static const TypeInfo smmuv3_type_info = {
>>>>> diff --git a/hw/arm/trace-events b/hw/arm/trace-events
>>>>> index 4ac264d..15f84d6 100644
>>>>> --- a/hw/arm/trace-events
>>>>> +++ b/hw/arm/trace-events
>>>>> @@ -46,5 +46,6 @@ smmuv3_cfg_stage(int s, uint32_t oas, uint32_t
>>>>> tsz, uint64_t ttbr, bool aa64, ui
>>>>>   smmuv3_notify_flag_add(const char *iommu) "ADD SMMUNotifier node
>>>>> for iommu mr=%s"
>>>>>   smmuv3_notify_flag_del(const char *iommu) "DEL SMMUNotifier node
>>>>> for iommu mr=%s"
>>>>>   smmuv3_replay_mr(const char *name) "iommu mr=%s"
>>>>> +smmuv3_replay(const char *name, void *n, hwaddr start, hwaddr end)
>>>>> "iommu mr=%s notifier=%p [0x%"PRIx64",0x%"PRIx64"]"
>>>>>   smmuv3_notify_entry(hwaddr iova, hwaddr pa, hwaddr mask, int perm)
>>>>> "iova=0x%"PRIx64" pa=0x%" PRIx64" mask=0x%"PRIx64" perm=%d"
>>>>>   smmuv3_notify_iova_range(const char *name, uint64_t iova, size_t
>>>>> size, void *n) "iommu mr=%s iova=0x%"PRIx64" size=0x%lx n=%p"
>>>>> -- 
>>>>> 2.5.5
>>>>>
>>>>>
>>>>


* Re: [Qemu-devel] [Qemu-arm] [PATCH v7 13/20] hw/arm/smmuv3: Implement IOMMU memory region replay callback
  2017-09-15  7:30         ` Auger Eric
  2017-09-15  7:41           ` Auger Eric
@ 2017-09-15 10:42           ` tn
  2017-09-15 13:19             ` Auger Eric
  2017-09-15 14:50             ` Auger Eric
  1 sibling, 2 replies; 72+ messages in thread
From: tn @ 2017-09-15 10:42 UTC (permalink / raw)
  To: Auger Eric, Tomasz Nowicki, Linu Cherian
  Cc: eric.auger.pro, peter.maydell, qemu-arm, qemu-devel,
	prem.mallappa, alex.williamson, mohun106, drjones, tcain,
	Radha.Chintakuntla, Sunil.Goutham, mst, jean-philippe.brucker,
	will.deacon, robin.murphy, peterx, bharat.bhushan,
	christoffer.dall, wtownsen, linu.cherian

Hi Eric,

On 15.09.2017 09:30, Auger Eric wrote:
> Hi Tomasz,
> 
> On 14/09/2017 16:43, Tomasz Nowicki wrote:
>> On 14.09.2017 16:31, Tomasz Nowicki wrote:
>>> Hi Eric,
>>>
>>> On 14.09.2017 11:27, Linu Cherian wrote:
>>>> Hi Eric,
>>>>
>>>> On Fri Sep 01, 2017 at 07:21:16PM +0200, Eric Auger wrote:
>>>>> memory_region_iommu_replay() is used for VFIO integration.
>>>>>
>>>>> However its default implementation is not adapted to SMMUv3
>>>>> IOMMU memory region. Indeed the input address range is too
>>>>> huge and its execution is too slow as it calls the translate()
>>>>> callback on each granule.
>>>>>
>>>>> Let's implement the replay callback which hierarchically walk
>>>>> over the page table structure and notify only the segments
>>>>> that are populated with valid entries.
>>>>>
>>>>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>>>> ---
>>>>>    hw/arm/smmuv3.c     | 36 ++++++++++++++++++++++++++++++++++++
>>>>>    hw/arm/trace-events |  1 +
>>>>>    2 files changed, 37 insertions(+)
>>>>>
>>>>> diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
>>>>> index 8e7d10d..c43bd93 100644
>>>>> --- a/hw/arm/smmuv3.c
>>>>> +++ b/hw/arm/smmuv3.c
>>>>> @@ -657,6 +657,41 @@ static int smmuv3_notify_entry(IOMMUTLBEntry
>>>>> *entry, void *private)
>>>>>        return 0;
>>>>>    }
>>>>> +/* Unmap the whole notifier's range */
>>>>> +static void smmuv3_unmap_notifier_range(IOMMUNotifier *n)
>>>>> +{
>>>>> +    IOMMUTLBEntry entry;
>>>>> +    hwaddr size = n->end - n->start + 1;
>>>>> +
>>>>> +    entry.target_as = &address_space_memory;
>>>>> +    entry.iova = n->start & ~(size - 1);
>>>>> +    entry.perm = IOMMU_NONE;
>>>>> +    entry.addr_mask = size - 1;
>>>>> +
>>>>> +    memory_region_notify_one(n, &entry);
>>>>> +}
>>>>> +
>>>>> +static void smmuv3_replay(IOMMUMemoryRegion *mr, IOMMUNotifier *n)
>>>>> +{
>>>>> +    SMMUTransCfg cfg = {};
>>>>> +    int ret;
>>>>> +
>>>>> +    trace_smmuv3_replay(mr->parent_obj.name, n, n->start, n->end);
>>>>> +    smmuv3_unmap_notifier_range(n);
>>>>> +
>>>>> +    ret = smmuv3_decode_config(mr, &cfg);
>>>>> +    if (ret) {
>>>>> +        error_report("%s error decoding the configuration for iommu
>>>>> mr=%s",
>>>>> +                     __func__, mr->parent_obj.name);
>>>>> +    }
>>>>>
>>>>
>>>> On an invalid config being found, shouldnt we return rather than
>>>> proceeding with
>>>> page table walk. For example on an invalid Stream table entry.
>>>
>>> Indeed, without return here vhost case is not working for me.
>>
>> I was just lucky one time. return here has no influence. Vhost still not
>> working. Sorry for noise.
> 
> As far as I understand the replay() callback only is called in VFIO use
> case. So this shouldn't impact vhost.
> 
> I can't reproduce your vhost issue on my side. I will review the
> invalidate code again and compare against the last version.
> 
> What is the page size used by your guest?

The guest and the host both use a 64K page size.

However, I've just checked with a 4K page size for the guest, and then vhost
works fine.

Thanks,
Tomasz


* Re: [Qemu-devel] [Qemu-arm] [PATCH v7 13/20] hw/arm/smmuv3: Implement IOMMU memory region replay callback
  2017-09-15 10:42           ` tn
@ 2017-09-15 13:19             ` Auger Eric
  2017-09-15 14:50             ` Auger Eric
  1 sibling, 0 replies; 72+ messages in thread
From: Auger Eric @ 2017-09-15 13:19 UTC (permalink / raw)
  To: tn, Tomasz Nowicki, Linu Cherian
  Cc: eric.auger.pro, peter.maydell, qemu-arm, qemu-devel,
	prem.mallappa, alex.williamson, mohun106, drjones, tcain,
	Radha.Chintakuntla, Sunil.Goutham, mst, jean-philippe.brucker,
	will.deacon, robin.murphy, peterx, bharat.bhushan,
	christoffer.dall, wtownsen, linu.cherian

Hi Tomasz,
On 15/09/2017 12:42, tn wrote:
> Hi Eric,
> 
> On 15.09.2017 09:30, Auger Eric wrote:
>> Hi Tomasz,
>>
>> On 14/09/2017 16:43, Tomasz Nowicki wrote:
>>> On 14.09.2017 16:31, Tomasz Nowicki wrote:
>>>> Hi Eric,
>>>>
>>>> On 14.09.2017 11:27, Linu Cherian wrote:
>>>>> Hi Eric,
>>>>>
>>>>> On Fri Sep 01, 2017 at 07:21:16PM +0200, Eric Auger wrote:
>>>>>> memory_region_iommu_replay() is used for VFIO integration.
>>>>>>
>>>>>> However its default implementation is not adapted to SMMUv3
>>>>>> IOMMU memory region. Indeed the input address range is too
>>>>>> huge and its execution is too slow as it calls the translate()
>>>>>> callback on each granule.
>>>>>>
>>>>>> Let's implement the replay callback which hierarchically walk
>>>>>> over the page table structure and notify only the segments
>>>>>> that are populated with valid entries.
>>>>>>
>>>>>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>>>>> ---
>>>>>>    hw/arm/smmuv3.c     | 36 ++++++++++++++++++++++++++++++++++++
>>>>>>    hw/arm/trace-events |  1 +
>>>>>>    2 files changed, 37 insertions(+)
>>>>>>
>>>>>> diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
>>>>>> index 8e7d10d..c43bd93 100644
>>>>>> --- a/hw/arm/smmuv3.c
>>>>>> +++ b/hw/arm/smmuv3.c
>>>>>> @@ -657,6 +657,41 @@ static int smmuv3_notify_entry(IOMMUTLBEntry
>>>>>> *entry, void *private)
>>>>>>        return 0;
>>>>>>    }
>>>>>> +/* Unmap the whole notifier's range */
>>>>>> +static void smmuv3_unmap_notifier_range(IOMMUNotifier *n)
>>>>>> +{
>>>>>> +    IOMMUTLBEntry entry;
>>>>>> +    hwaddr size = n->end - n->start + 1;
>>>>>> +
>>>>>> +    entry.target_as = &address_space_memory;
>>>>>> +    entry.iova = n->start & ~(size - 1);
>>>>>> +    entry.perm = IOMMU_NONE;
>>>>>> +    entry.addr_mask = size - 1;
>>>>>> +
>>>>>> +    memory_region_notify_one(n, &entry);
>>>>>> +}
>>>>>> +
>>>>>> +static void smmuv3_replay(IOMMUMemoryRegion *mr, IOMMUNotifier *n)
>>>>>> +{
>>>>>> +    SMMUTransCfg cfg = {};
>>>>>> +    int ret;
>>>>>> +
>>>>>> +    trace_smmuv3_replay(mr->parent_obj.name, n, n->start, n->end);
>>>>>> +    smmuv3_unmap_notifier_range(n);
>>>>>> +
>>>>>> +    ret = smmuv3_decode_config(mr, &cfg);
>>>>>> +    if (ret) {
>>>>>> +        error_report("%s error decoding the configuration for iommu
>>>>>> mr=%s",
>>>>>> +                     __func__, mr->parent_obj.name);
>>>>>> +    }
>>>>>>
>>>>>
>>>>> On an invalid config being found, shouldnt we return rather than
>>>>> proceeding with
>>>>> page table walk. For example on an invalid Stream table entry.
>>>>
>>>> Indeed, without return here vhost case is not working for me.
>>>
>>> I was just lucky one time. return here has no influence. Vhost still not
>>> working. Sorry for noise.
>>
>> As far as I understand the replay() callback only is called in VFIO use
>> case. So this shouldn't impact vhost.
>>
>> I can't reproduce your vhost issue on my side. I will review the
>> invalidate code again and compare against the last version.
>>
>> What is the page size used by your guest?
> 
> 64K page size for guest as well as for host.
> 
> However, I've just checked 4K page size for guest and then vhost is
> working fine.
I can reproduce the issue with vhost on a 64KB page guest. Currently
investigating...

Thanks!

Eric
> 
> Thanks,
> Tomasz


* Re: [Qemu-devel] [Qemu-arm] [PATCH v7 13/20] hw/arm/smmuv3: Implement IOMMU memory region replay callback
  2017-09-15 10:42           ` tn
  2017-09-15 13:19             ` Auger Eric
@ 2017-09-15 14:50             ` Auger Eric
  2017-09-18  9:50               ` Tomasz Nowicki
  1 sibling, 1 reply; 72+ messages in thread
From: Auger Eric @ 2017-09-15 14:50 UTC (permalink / raw)
  To: tn, Tomasz Nowicki, Linu Cherian
  Cc: eric.auger.pro, peter.maydell, qemu-arm, qemu-devel,
	prem.mallappa, alex.williamson, mohun106, drjones, tcain,
	Radha.Chintakuntla, Sunil.Goutham, mst, jean-philippe.brucker,
	will.deacon, robin.murphy, peterx, bharat.bhushan,
	christoffer.dall, wtownsen, linu.cherian

Hi,

On 15/09/2017 12:42, tn wrote:
> Hi Eric,
> 
> On 15.09.2017 09:30, Auger Eric wrote:
>> Hi Tomasz,
>>
>> On 14/09/2017 16:43, Tomasz Nowicki wrote:
>>> On 14.09.2017 16:31, Tomasz Nowicki wrote:
>>>> Hi Eric,
>>>>
>>>> On 14.09.2017 11:27, Linu Cherian wrote:
>>>>> Hi Eric,
>>>>>
>>>>> On Fri Sep 01, 2017 at 07:21:16PM +0200, Eric Auger wrote:
>>>>>> memory_region_iommu_replay() is used for VFIO integration.
>>>>>>
>>>>>> However its default implementation is not adapted to SMMUv3
>>>>>> IOMMU memory region. Indeed the input address range is too
>>>>>> huge and its execution is too slow as it calls the translate()
>>>>>> callback on each granule.
>>>>>>
>>>>>> Let's implement the replay callback which hierarchically walk
>>>>>> over the page table structure and notify only the segments
>>>>>> that are populated with valid entries.
>>>>>>
>>>>>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>>>>> ---
>>>>>>    hw/arm/smmuv3.c     | 36 ++++++++++++++++++++++++++++++++++++
>>>>>>    hw/arm/trace-events |  1 +
>>>>>>    2 files changed, 37 insertions(+)
>>>>>>
>>>>>> diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
>>>>>> index 8e7d10d..c43bd93 100644
>>>>>> --- a/hw/arm/smmuv3.c
>>>>>> +++ b/hw/arm/smmuv3.c
>>>>>> @@ -657,6 +657,41 @@ static int smmuv3_notify_entry(IOMMUTLBEntry
>>>>>> *entry, void *private)
>>>>>>        return 0;
>>>>>>    }
>>>>>> +/* Unmap the whole notifier's range */
>>>>>> +static void smmuv3_unmap_notifier_range(IOMMUNotifier *n)
>>>>>> +{
>>>>>> +    IOMMUTLBEntry entry;
>>>>>> +    hwaddr size = n->end - n->start + 1;
>>>>>> +
>>>>>> +    entry.target_as = &address_space_memory;
>>>>>> +    entry.iova = n->start & ~(size - 1);
>>>>>> +    entry.perm = IOMMU_NONE;
>>>>>> +    entry.addr_mask = size - 1;
>>>>>> +
>>>>>> +    memory_region_notify_one(n, &entry);
>>>>>> +}
>>>>>> +
>>>>>> +static void smmuv3_replay(IOMMUMemoryRegion *mr, IOMMUNotifier *n)
>>>>>> +{
>>>>>> +    SMMUTransCfg cfg = {};
>>>>>> +    int ret;
>>>>>> +
>>>>>> +    trace_smmuv3_replay(mr->parent_obj.name, n, n->start, n->end);
>>>>>> +    smmuv3_unmap_notifier_range(n);
>>>>>> +
>>>>>> +    ret = smmuv3_decode_config(mr, &cfg);
>>>>>> +    if (ret) {
>>>>>> +        error_report("%s error decoding the configuration for iommu
>>>>>> mr=%s",
>>>>>> +                     __func__, mr->parent_obj.name);
>>>>>> +    }
>>>>>>
>>>>>
>>>>> On an invalid config being found, shouldnt we return rather than
>>>>> proceeding with
>>>>> page table walk. For example on an invalid Stream table entry.
>>>>
>>>> Indeed, without return here vhost case is not working for me.
>>>
>>> I was just lucky one time. return here has no influence. Vhost still not
>>> working. Sorry for noise.
>>
>> As far as I understand the replay() callback only is called in VFIO use
>> case. So this shouldn't impact vhost.
>>
>> I can't reproduce your vhost issue on my side. I will review the
>> invalidate code again and compare against the last version.
>>
>> What is the page size used by your guest?
> 
> 64K page size for guest as well as for host.
> 
> However, I've just checked 4K page size for guest and then vhost is
> working fine.
So the bug stems from an incorrect target page size being used for the
SMMU_CMD_TLBI_NH_VA invalidation. This is now corrected by using the
actual config granule size, which fixes the vhost use case with a
64KB page guest. I will release this fix early next week. Sorry for
the inconvenience.
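
To illustrate why the granule size matters here, a minimal self-contained
sketch follows. It is not the actual fix: the helper names
(granule_mask_sketch, make_inval_range_sketch, range_covers_sketch) and the
struct are hypothetical, and it assumes that a single-page invalidation
should cover one page of 2^granule_shift bytes (shift 12 for 4KB, 16 for
64KB) containing the given address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: an invalidated IOVA range [iova, iova + addr_mask]. */
struct inval_range_sketch {
    uint64_t iova;
    uint64_t addr_mask;
};

/* Address mask covering one page of 2^granule_shift bytes. */
static uint64_t granule_mask_sketch(unsigned granule_shift)
{
    assert(granule_shift > 0 && granule_shift < 64);
    return (UINT64_C(1) << granule_shift) - 1;
}

/* Build the range for a single-page invalidation at the given granule. */
static struct inval_range_sketch
make_inval_range_sketch(uint64_t addr, unsigned granule_shift)
{
    struct inval_range_sketch r;

    r.addr_mask = granule_mask_sketch(granule_shift);
    r.iova = addr & ~r.addr_mask; /* align down to the page start */
    return r;
}

/* True if addr falls inside the invalidated range. */
static bool range_covers_sketch(struct inval_range_sketch r, uint64_t addr)
{
    return addr >= r.iova && addr - r.iova <= r.addr_mask;
}
```

With these assumptions, an invalidation built with shift 12 for an address
inside a 64KB guest page covers only 4KB of that page, so the rest of the
page is not invalidated. That is consistent with the 4K guest working and
the 64K guest failing.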

Thanks

Eric
> 
> Thanks,
> Tomasz


* Re: [Qemu-devel] [Qemu-arm] [PATCH v7 13/20] hw/arm/smmuv3: Implement IOMMU memory region replay callback
  2017-09-15 14:50             ` Auger Eric
@ 2017-09-18  9:50               ` Tomasz Nowicki
  0 siblings, 0 replies; 72+ messages in thread
From: Tomasz Nowicki @ 2017-09-18  9:50 UTC (permalink / raw)
  To: Auger Eric, tn, Linu Cherian
  Cc: eric.auger.pro, peter.maydell, qemu-arm, qemu-devel,
	prem.mallappa, alex.williamson, mohun106, drjones, tcain,
	Radha.Chintakuntla, Sunil.Goutham, mst, jean-philippe.brucker,
	will.deacon, robin.murphy, peterx, bharat.bhushan,
	christoffer.dall, wtownsen, linu.cherian

Hi Eric,

On 15.09.2017 16:50, Auger Eric wrote:
> Hi,
> 
> On 15/09/2017 12:42, tn wrote:
>> Hi Eric,
>>
>> On 15.09.2017 09:30, Auger Eric wrote:
>>> Hi Tomasz,
>>>
>>> On 14/09/2017 16:43, Tomasz Nowicki wrote:
>>>> On 14.09.2017 16:31, Tomasz Nowicki wrote:
>>>>> Hi Eric,
>>>>>
>>>>> On 14.09.2017 11:27, Linu Cherian wrote:
>>>>>> Hi Eric,
>>>>>>
>>>>>> On Fri Sep 01, 2017 at 07:21:16PM +0200, Eric Auger wrote:
>>>>>>> memory_region_iommu_replay() is used for VFIO integration.
>>>>>>>
>>>>>>> However its default implementation is not adapted to SMMUv3
>>>>>>> IOMMU memory region. Indeed the input address range is too
>>>>>>> huge and its execution is too slow as it calls the translate()
>>>>>>> callback on each granule.
>>>>>>>
>>>>>>> Let's implement the replay callback which hierarchically walk
>>>>>>> over the page table structure and notify only the segments
>>>>>>> that are populated with valid entries.
>>>>>>>
>>>>>>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>>>>>> ---
>>>>>>>     hw/arm/smmuv3.c     | 36 ++++++++++++++++++++++++++++++++++++
>>>>>>>     hw/arm/trace-events |  1 +
>>>>>>>     2 files changed, 37 insertions(+)
>>>>>>>
>>>>>>> diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
>>>>>>> index 8e7d10d..c43bd93 100644
>>>>>>> --- a/hw/arm/smmuv3.c
>>>>>>> +++ b/hw/arm/smmuv3.c
>>>>>>> @@ -657,6 +657,41 @@ static int smmuv3_notify_entry(IOMMUTLBEntry
>>>>>>> *entry, void *private)
>>>>>>>         return 0;
>>>>>>>     }
>>>>>>> +/* Unmap the whole notifier's range */
>>>>>>> +static void smmuv3_unmap_notifier_range(IOMMUNotifier *n)
>>>>>>> +{
>>>>>>> +    IOMMUTLBEntry entry;
>>>>>>> +    hwaddr size = n->end - n->start + 1;
>>>>>>> +
>>>>>>> +    entry.target_as = &address_space_memory;
>>>>>>> +    entry.iova = n->start & ~(size - 1);
>>>>>>> +    entry.perm = IOMMU_NONE;
>>>>>>> +    entry.addr_mask = size - 1;
>>>>>>> +
>>>>>>> +    memory_region_notify_one(n, &entry);
>>>>>>> +}
>>>>>>> +
>>>>>>> +static void smmuv3_replay(IOMMUMemoryRegion *mr, IOMMUNotifier *n)
>>>>>>> +{
>>>>>>> +    SMMUTransCfg cfg = {};
>>>>>>> +    int ret;
>>>>>>> +
>>>>>>> +    trace_smmuv3_replay(mr->parent_obj.name, n, n->start, n->end);
>>>>>>> +    smmuv3_unmap_notifier_range(n);
>>>>>>> +
>>>>>>> +    ret = smmuv3_decode_config(mr, &cfg);
>>>>>>> +    if (ret) {
>>>>>>> +        error_report("%s error decoding the configuration for iommu
>>>>>>> mr=%s",
>>>>>>> +                     __func__, mr->parent_obj.name);
>>>>>>> +    }
>>>>>>>
>>>>>>
>>>>>> On an invalid config being found, shouldnt we return rather than
>>>>>> proceeding with
>>>>>> page table walk. For example on an invalid Stream table entry.
>>>>>
>>>>> Indeed, without return here vhost case is not working for me.
>>>>
>>>> I was just lucky one time. return here has no influence. Vhost still not
>>>> working. Sorry for noise.
>>>
>>> As far as I understand the replay() callback only is called in VFIO use
>>> case. So this shouldn't impact vhost.
>>>
>>> I can't reproduce your vhost issue on my side. I will review the
>>> invalidate code again and compare against the last version.
>>>
>>> What is the page size used by your guest?
>>
>> 64K page size for guest as well as for host.
>>
>> However, I've just checked 4K page size for guest and then vhost is
>> working fine.
> So the bug stems from the incorrect target page size used on
> SMMU_CMD_TLBI_NH_VA invalidation. This is now corrected by using the
> actual config granule size and this fixes the issue with vhost use case
> and 64KB page guest. I will release this fix early next week. Sorry for
> the inconvenience.
> 

Yes, that was it. I will provide my t-b for the next series version.

Thanks,
Tomasz


* Re: [Qemu-devel] [PATCH v7 01/20] hw/arm/smmu-common: smmu base device and datatypes
  2017-09-01 17:21 ` [Qemu-devel] [PATCH v7 01/20] hw/arm/smmu-common: smmu base device and datatypes Eric Auger
@ 2017-09-27 17:38   ` Peter Maydell
  2017-09-28  7:57     ` Auger Eric
  2017-09-30  8:28     ` Prem Mallappa
  0 siblings, 2 replies; 72+ messages in thread
From: Peter Maydell @ 2017-09-27 17:38 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, qemu-arm, QEMU Developers, Prem Mallappa,
	Alex Williamson, Andrew Jones, Christoffer Dall,
	Radha.Chintakuntla, Sunil.Goutham, Radha Mohan, Trey Cain,
	Bharat Bhushan, Tomasz Nowicki, Michael S. Tsirkin, Will Deacon,
	jean-philippe.brucker, robin.murphy, Peter Xu, Edgar E. Iglesias,
	wtownsen

On 1 September 2017 at 10:21, Eric Auger <eric.auger@redhat.com> wrote:
> The patch introduces the smmu base device and class for the ARM
> smmu. Devices for specific versions will be derived from this
> base device.
>
> We also introduce some important datatypes.
>
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> Signed-off-by: Prem Mallappa <prem.mallappa@broadcom.com>

> +/*
> + * Copyright (C) 2014-2016 Broadcom Corporation
> + * Copyright (c) 2017 Red Hat, Inc.
> + * Written by Prem Mallappa, Eric Auger
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.

Really GPL-2-only ?

Unless there's a good reason, you should probably fix this (including
getting an ack from Prem/Broadcom where it applies to code that came
from him).

thanks
-- PMM


* Re: [Qemu-devel] [Qemu-arm] [PATCH v7 00/20] ARM SMMUv3 Emulation Support
  2017-09-01 17:21 [Qemu-devel] [PATCH v7 00/20] ARM SMMUv3 Emulation Support Eric Auger
                   ` (22 preceding siblings ...)
  2017-09-12  6:18 ` [Qemu-devel] [Qemu-arm] " Linu Cherian
@ 2017-09-28  6:43 ` Linu Cherian
  2017-09-28  7:13   ` Peter Xu
  2017-10-24  5:38 ` Linu Cherian
  24 siblings, 1 reply; 72+ messages in thread
From: Linu Cherian @ 2017-09-28  6:43 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, peter.maydell, qemu-arm, qemu-devel,
	prem.mallappa, alex.williamson, mohun106, drjones, tcain,
	Radha.Chintakuntla, Sunil.Goutham, mst, jean-philippe.brucker,
	tn, will.deacon, robin.murphy, peterx, bharat.bhushan,
	christoffer.dall, wtownsen

Hi Eric,


On Fri Sep 01, 2017 at 07:21:03PM +0200, Eric Auger wrote:
> This series implements the emulation code for ARM SMMUv3.
> 
> Changes since v6:
> - DPDK testpmd now running on guest with 2 assigned VFs
> - Changed the instantiation method: add the following option to
>   the QEMU command line
>   -device smmuv3 # for virtio/vhost use cases
>   -device smmuv3,caching-mode # for vfio use cases (based on [1])
> - splitted the series into smaller patches to allow the review
> - the VFIO integration based on "tlbi-on-map" smmuv3 driver
>   is isolated from the rest: last 2 patches, not for upstream.
>   This is shipped for testing/bench until a better solution is found.
> - Reworked permission flag checks and event generation
> 
> testing:
> - in dt and ACPI modes
> - virtio-net-pci and vhost-net devices using dma ops with various
>   guest page sizes [2]
> - assigned VFs using dma ops [3]:
>   - AMD Overdrive and igbvf passthrough (using gsi direct mapping)
>   - Cavium ThunderX and ixgbevf passthrough (using KVM MSI routing)
> - DPDK testpmd on guest running with VFIO user space drivers (2 igbvf) [3]
>   with guest and host page size equal (4kB)
> 
> Known limitations:
> - no VMSAv8-32 suport
> - no nested stage support (S1 + S2)
> - no support for HYP mappings
> - register fine emulation, commands, interrupts and errors were
>   not accurately tested. Handling is sufficient to run use cases
>   described above though.
> - interrupts and event generation not observed yet.

While testing with vfio-pci, I observed that the two QEMU command lines
below result in different behaviour. Is this expected by design?

Case 1:
# -device vfio-pci,host=0002:01:00.3 -device smmuv3,caching-mode
 Here iommu is not attached to the pci bus in Qemu backend, since
 pci_setup_iommu is not called before vfio_realize.

Case 2:
# -device smmuv3,caching-mode -device vfio-pci,host=0002:01:00.3
This works as expected, iommu is attached to the pci bus.

> 
> Best Regards
> 
> Eric
> 
> This series can be found at:
> v7: https://github.com/eauger/qemu/tree/v2.10.0-SMMU-v7
> Previous version at:
> v6: https://github.com/eauger/qemu/tree/v2.10.0-rc2-SMMU-v6
> 
> References:
> [1] [RFC v2 0/4] arm-smmu-v3 tlbi-on-map option
>     https://lkml.org/lkml/2017/8/11/426
> 
> [2] qemu cmd line excerpt:
> -device smmuv3 \
> -netdev tap,id=tap0,script=no,downscript=no,ifname=tap0,vhost=off \
> -device virtio-net-pci,netdev=tap0,mac=6a:f5:10:b1:3d:d2,iommu_platform,disable-modern=off,disable-legacy=on \
> [3] use -device smmuv3,caching-mode
> 
> 
> History:
> v6 -> v7:
> - see above
> 
> v5 -> v6:
> - Rebase on 2.10 and IOMMUMemoryRegion
> - add ACPI TLBI_ON_MAP support (VFIO integration also works in
>   ACPI mode)
> - fix block replay
> - handle implementation defined SMMU_CMD_TLBI_NH_VA_AM cmd
>   (goes along with TLBI_ON_MAP FW quirk)
> - replay systematically unmap the whole range first
> - smmuv3_map_hook does not unmap anymore and the unmap is done
>   before the replay
> - add and use smmuv3_context_device_invalidate instead of
>   blindly replaying everything
> 
> v4 -> v5:
> - initial_level now part of SMMUTransCfg
> - smmu_page_walk_64 takes into account the max input size
> - implement sys->iommu_ops.replay and sys->iommu_ops.notify_flag_changed
> - smmuv3_translate: bug fix: don't walk on bypass
> - smmu_update_qreg: fix PROD index update
> - I did not yet address Peter's comments as the code is not mature enough
>   to be split into sub patches.
> 
> v3 -> v4 [Eric]:
> - page table walk rewritten to allow scan of the page table within a
>   range of IOVA. This prepares for VFIO integration and replay.
> - configuration parsing partially reworked.
> - do not advertise unsupported/untested features: S2, S1 + S2, HYP,
>   PRI, ATS, ..
> - added ACPI table generation
> - migrated to dynamic traces
> - mingw compilation fix
> 
> v2 -> v3 [Eric]:
> - rebased on 2.9
> - mostly code and patch reorganization to ease the review process
> - optional patches removed. They may be handled separately. I am currently
>   working on ACPI enablement.
> - optional instantiation of the smmu in mach-virt
> - removed [2/9] (fdt functions) since not mandated
> - start splitting main patch into base and derived object
> - no new function feature added
> 
> v1 -> v2 [Prem]:
> - Adopted review comments from Eric Auger
>         - Make SMMU_DPRINTF to internally call qemu_log
>             (since translation requests are too many, we need control
>              on the type of log we want)
>         - SMMUTransCfg modified to suite simplicity
>         - Change RegInfo to uint64 register array
>         - Code cleanup
>         - Test cleanups
> - Reshuffled patches
> 
> v0 -> v1 [Prem]:
> - As per SMMUv3 spec 16.0 (only is_ste_consistant() is noticeable)
> - Reworked register access/update logic
> - Factored out translation code for
>         - single point bug fix
>         - sharing/removal in future
> - (optional) Unit tests added, with PCI test device
>         - S1 with 4k/64k, S1+S2 with 4k/64k
>         - (S1 or S2) only can be verified by Linux 4.7 driver
>         - (optional) Priliminary ACPI support
> 
> v0 [Prem]:
> - Implements SMMUv3 spec 11.0
> - Supported for PCIe devices,
> - Command Queue and Event Queue supported
> - LPAE only, S1 is supported and Tested, S2 not tested
> - BE mode Translation not supported
> - IRQ support (legacy, no MSI)
> 
> Eric Auger (18):
>   hw/arm/smmu-common: smmu base device and datatypes
>   hw/arm/smmu-common: IOMMU memory region and address space setup
>   hw/arm/smmu-common: smmu_read/write_sysmem
>   hw/arm/smmu-common: VMSAv8-64 page table walk
>   hw/arm/smmuv3: Wired IRQ and GERROR helpers
>   hw/arm/smmuv3: Queue helpers
>   hw/arm/smmuv3: Implement MMIO write operations
>   hw/arm/smmuv3: Event queue recording helper
>   hw/arm/smmuv3: Implement translate callback
>   target/arm/kvm: Translate the MSI doorbell in kvm_arch_fixup_msi_route
>   hw/arm/smmuv3: Implement data structure and TLB invalidation
>     notifications
>   hw/arm/smmuv3: Implement IOMMU memory region replay callback
>   hw/arm/virt: Store the PCI host controller dt phandle
>   hw/arm/sysbus-fdt: Pass the VirtMachineState to the node creation
>     functions
>   hw/arm/sysbus-fdt: Pass the platform bus base address in
>     PlatformBusFDTData
>   hw/arm/sysbus-fdt: Allow smmuv3 dynamic instantiation
>   hw/arm/smmuv3: [not for upstream] add SMMU_CMD_TLBI_NH_VA_AM handling
>   hw/arm/smmuv3: [not for upstream] Add caching-mode option
> 
> Prem Mallappa (2):
>   hw/arm/smmuv3: Skeleton
>   hw/arm/virt-acpi-build: Add smmuv3 node in IORT table
> 
>  default-configs/aarch64-softmmu.mak |    1 +
>  hw/arm/Makefile.objs                |    1 +
>  hw/arm/smmu-common.c                |  527 ++++++++++++++++
>  hw/arm/smmu-internal.h              |  105 ++++
>  hw/arm/smmuv3-internal.h            |  584 +++++++++++++++++
>  hw/arm/smmuv3.c                     | 1181 +++++++++++++++++++++++++++++++++++
>  hw/arm/sysbus-fdt.c                 |  129 +++-
>  hw/arm/trace-events                 |   48 ++
>  hw/arm/virt-acpi-build.c            |   63 +-
>  hw/arm/virt.c                       |    6 +-
>  include/hw/acpi/acpi-defs.h         |   15 +
>  include/hw/arm/smmu-common.h        |  123 ++++
>  include/hw/arm/smmuv3.h             |   80 +++
>  include/hw/arm/sysbus-fdt.h         |    2 +
>  include/hw/arm/virt.h               |   15 +
>  target/arm/kvm.c                    |   27 +
>  target/arm/trace-events             |    3 +
>  17 files changed, 2886 insertions(+), 24 deletions(-)
>  create mode 100644 hw/arm/smmu-common.c
>  create mode 100644 hw/arm/smmu-internal.h
>  create mode 100644 hw/arm/smmuv3-internal.h
>  create mode 100644 hw/arm/smmuv3.c
>  create mode 100644 include/hw/arm/smmu-common.h
>  create mode 100644 include/hw/arm/smmuv3.h
> 
> -- 
> 2.5.5
> 
> 

-- 
Linu cherian


* Re: [Qemu-devel] [Qemu-arm] [PATCH v7 00/20] ARM SMMUv3 Emulation Support
  2017-09-28  6:43 ` Linu Cherian
@ 2017-09-28  7:13   ` Peter Xu
  2017-09-28  7:54     ` Auger Eric
  0 siblings, 1 reply; 72+ messages in thread
From: Peter Xu @ 2017-09-28  7:13 UTC (permalink / raw)
  To: Linu Cherian
  Cc: Eric Auger, eric.auger.pro, peter.maydell, qemu-arm, qemu-devel,
	prem.mallappa, alex.williamson, mohun106, drjones, tcain,
	Radha.Chintakuntla, Sunil.Goutham, mst, jean-philippe.brucker,
	tn, will.deacon, robin.murphy, bharat.bhushan, christoffer.dall,
	wtownsen

On Thu, Sep 28, 2017 at 12:13:12PM +0530, Linu Cherian wrote:
> Hi Eric,
> 
> 
> On Fri Sep 01, 2017 at 07:21:03PM +0200, Eric Auger wrote:
> > This series implements the emulation code for ARM SMMUv3.
> > 
> > Changes since v6:
> > - DPDK testpmd now running on guest with 2 assigned VFs
> > - Changed the instantiation method: add the following option to
> >   the QEMU command line
> >   -device smmuv3 # for virtio/vhost use cases
> >   -device smmuv3,caching-mode # for vfio use cases (based on [1])
> > - splitted the series into smaller patches to allow the review
> > - the VFIO integration based on "tlbi-on-map" smmuv3 driver
> >   is isolated from the rest: last 2 patches, not for upstream.
> >   This is shipped for testing/bench until a better solution is found.
> > - Reworked permission flag checks and event generation
> > 
> > testing:
> > - in dt and ACPI modes
> > - virtio-net-pci and vhost-net devices using dma ops with various
> >   guest page sizes [2]
> > - assigned VFs using dma ops [3]:
> >   - AMD Overdrive and igbvf passthrough (using gsi direct mapping)
> >   - Cavium ThunderX and ixgbevf passthrough (using KVM MSI routing)
> > - DPDK testpmd on guest running with VFIO user space drivers (2 igbvf) [3]
> >   with guest and host page size equal (4kB)
> > 
> > Known limitations:
> > - no VMSAv8-32 suport
> > - no nested stage support (S1 + S2)
> > - no support for HYP mappings
> > - register fine emulation, commands, interrupts and errors were
> >   not accurately tested. Handling is sufficient to run use cases
> >   described above though.
> > - interrupts and event generation not observed yet.
> 
> While testing with vfio-pci, observed that the below two  Qemu command,
> results in two different behaviour. Is this expected by design ?
> 
> Case 1:
> # -device vfio-pci,host=0002:01:00.3 -device smmuv3,caching-mode
>  Here iommu is not attached to the pci bus in Qemu backend, since
>  pci_setup_iommu is not called before vfio_realize.
> 
> Case 2:
> # -device smmuv3,caching-mode -device vfio-pci,host=0002:01:00.3
> This works as expected, iommu is attached to the pci bus.

Not sure about SMMU, but VT-d should have a similar issue - the vIOMMU
device needs to be created before the rest of the devices.

Now for VT-d the ordering of devices should be assured by Libvirt:

https://bugzilla.redhat.com/show_bug.cgi?id=1427005

For your reference only.  Thanks,

-- 
Peter Xu


* Re: [Qemu-devel] [Qemu-arm] [PATCH v7 00/20] ARM SMMUv3 Emulation Support
  2017-09-28  7:13   ` Peter Xu
@ 2017-09-28  7:54     ` Auger Eric
  2017-09-28  9:21       ` Linu Cherian
  0 siblings, 1 reply; 72+ messages in thread
From: Auger Eric @ 2017-09-28  7:54 UTC (permalink / raw)
  To: Peter Xu, Linu Cherian
  Cc: peter.maydell, drjones, tcain, Radha.Chintakuntla, Sunil.Goutham,
	mohun106, jean-philippe.brucker, tn, bharat.bhushan, mst,
	will.deacon, qemu-devel, alex.williamson, qemu-arm,
	christoffer.dall, wtownsen, robin.murphy, prem.mallappa,
	eric.auger.pro

Hi Linu, Peter,

On 28/09/2017 09:13, Peter Xu wrote:
> On Thu, Sep 28, 2017 at 12:13:12PM +0530, Linu Cherian wrote:
>> Hi Eric,
>>
>>
>> On Fri Sep 01, 2017 at 07:21:03PM +0200, Eric Auger wrote:
>>> This series implements the emulation code for ARM SMMUv3.
>>>
>>> Changes since v6:
>>> - DPDK testpmd now running on guest with 2 assigned VFs
>>> - Changed the instantiation method: add the following option to
>>>   the QEMU command line
>>>   -device smmuv3 # for virtio/vhost use cases
>>>   -device smmuv3,caching-mode # for vfio use cases (based on [1])
>>> - splitted the series into smaller patches to allow the review
>>> - the VFIO integration based on "tlbi-on-map" smmuv3 driver
>>>   is isolated from the rest: last 2 patches, not for upstream.
>>>   This is shipped for testing/bench until a better solution is found.
>>> - Reworked permission flag checks and event generation
>>>
>>> testing:
>>> - in dt and ACPI modes
>>> - virtio-net-pci and vhost-net devices using dma ops with various
>>>   guest page sizes [2]
>>> - assigned VFs using dma ops [3]:
>>>   - AMD Overdrive and igbvf passthrough (using gsi direct mapping)
>>>   - Cavium ThunderX and ixgbevf passthrough (using KVM MSI routing)
>>> - DPDK testpmd on guest running with VFIO user space drivers (2 igbvf) [3]
>>>   with guest and host page size equal (4kB)
>>>
>>> Known limitations:
>>> - no VMSAv8-32 suport
>>> - no nested stage support (S1 + S2)
>>> - no support for HYP mappings
>>> - register fine emulation, commands, interrupts and errors were
>>>   not accurately tested. Handling is sufficient to run use cases
>>>   described above though.
>>> - interrupts and event generation not observed yet.
>>
>> While testing with vfio-pci, observed that the below two  Qemu command,
>> results in two different behaviour. Is this expected by design ?
>>
>> Case 1:
>> # -device vfio-pci,host=0002:01:00.3 -device smmuv3,caching-mode
>>  Here iommu is not attached to the pci bus in Qemu backend, since
>>  pci_setup_iommu is not called before vfio_realize.
>>
>> Case 2:
>> # -device smmuv3,caching-mode -device vfio-pci,host=0002:01:00.3
>> This works as expected, iommu is attached to the pci bus.
> 
> Not sure about SMMU, but VT-d should have similar issue - the vIOMMU
> device needs to be created before the rest of the devices.

Yes this is an expected limitation right now. I should have documented
it though. As you noticed, pci_setup_iommu() is called on virtio-iommu
realize and it relies on the fact that the PCIe devices are already realized.

Maybe we could relax this constraint by calling pci_setup_iommu() in a
machine init done notifier.
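
As a rough sketch of that idea, assuming a simplified stand-alone model
rather than the real QEMU objects: the SMMU only records the bus at
realize time and registers a notifier, and the actual attachment
happens once all devices have been created, whatever the order of the
-device options. All Demo* and demo_* names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical stand-in for a PCI bus: records which IOMMU is attached. */
typedef struct DemoBus {
    void *iommu;
} DemoBus;

/* Minimal notifier list, run once after all devices have been created. */
typedef struct DemoNotifier DemoNotifier;
struct DemoNotifier {
    void (*notify)(DemoNotifier *n);
    DemoNotifier *next;
};

static DemoNotifier *demo_init_done_list;

static void demo_add_init_done_notifier(DemoNotifier *n)
{
    n->next = demo_init_done_list;
    demo_init_done_list = n;
}

static void demo_run_init_done_notifiers(void)
{
    for (DemoNotifier *n = demo_init_done_list; n; n = n->next) {
        n->notify(n);
    }
}

typedef struct DemoSmmu {
    DemoNotifier init_done; /* must stay first: the cast below relies on it */
    DemoBus *bus;
} DemoSmmu;

/* Deferred step: attach the IOMMU to the bus at "machine init done". */
static void demo_smmu_init_done(DemoNotifier *n)
{
    DemoSmmu *s = (DemoSmmu *)n;

    s->bus->iommu = s;
}

/* Realize only remembers the bus and registers the deferred attachment. */
static void demo_smmu_realize(DemoSmmu *s, DemoBus *bus)
{
    s->bus = bus;
    s->init_done.notify = demo_smmu_init_done;
    demo_add_init_done_notifier(&s->init_done);
}
```

Whether this would be enough on its own is an open question: a device
that looks up its IOMMU address space during its own realize, as Linu
describes for vfio, would still see no IOMMU at that point and would
have to redo the lookup later.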

Thanks

Eric


> 
> Now for VT-d the ordering of devices should be assured by Libvirt:
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=1427005
> 
> For your reference only.  Thanks,
> 


* Re: [Qemu-devel] [PATCH v7 01/20] hw/arm/smmu-common: smmu base device and datatypes
  2017-09-27 17:38   ` Peter Maydell
@ 2017-09-28  7:57     ` Auger Eric
  2017-09-30  8:28     ` Prem Mallappa
  1 sibling, 0 replies; 72+ messages in thread
From: Auger Eric @ 2017-09-28  7:57 UTC (permalink / raw)
  To: Peter Maydell
  Cc: eric.auger.pro, qemu-arm, QEMU Developers, Prem Mallappa,
	Alex Williamson, Andrew Jones, Christoffer Dall,
	Radha.Chintakuntla, Sunil.Goutham, Radha Mohan, Trey Cain,
	Bharat Bhushan, Tomasz Nowicki, Michael S. Tsirkin, Will Deacon,
	jean-philippe.brucker, robin.murphy, Peter Xu, Edgar E. Iglesias,
	wtownsen

Hi Peter,
On 27/09/2017 19:38, Peter Maydell wrote:
> On 1 September 2017 at 10:21, Eric Auger <eric.auger@redhat.com> wrote:
>> The patch introduces the smmu base device and class for the ARM
>> smmu. Devices for specific versions will be derived from this
>> base device.
>>
>> We also introduce some important datatypes.
>>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>> Signed-off-by: Prem Mallappa <prem.mallappa@broadcom.com>
> 
>> +/*
>> + * Copyright (C) 2014-2016 Broadcom Corporation
>> + * Copyright (c) 2017 Red Hat, Inc.
>> + * Written by Prem Mallappa, Eric Auger
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
> 
> Really GPL-2-only ?
> 
> Unless there's a good reason, you should probably fix this (including
> getting an ack from Prem/Broadcom where it applies to code that came
> from him).
OK, thanks for the notice. I will investigate how/if we can change the
license.

Best Regards

Eric
> 
> thanks
> -- PMM
> 


* Re: [Qemu-devel] [Qemu-arm] [PATCH v7 00/20] ARM SMMUv3 Emulation Support
  2017-09-28  7:54     ` Auger Eric
@ 2017-09-28  9:21       ` Linu Cherian
  0 siblings, 0 replies; 72+ messages in thread
From: Linu Cherian @ 2017-09-28  9:21 UTC (permalink / raw)
  To: Auger Eric
  Cc: Peter Xu, peter.maydell, drjones, tcain, Radha.Chintakuntla,
	Sunil.Goutham, mohun106, jean-philippe.brucker, tn,
	bharat.bhushan, mst, will.deacon, qemu-devel, alex.williamson,
	qemu-arm, christoffer.dall, wtownsen, robin.murphy,
	prem.mallappa, eric.auger.pro, linu.cherian

On Thu Sep 28, 2017 at 09:54:20AM +0200, Auger Eric wrote:
> Hi Linu, Peter,
> 
> On 28/09/2017 09:13, Peter Xu wrote:
> > On Thu, Sep 28, 2017 at 12:13:12PM +0530, Linu Cherian wrote:
> >> Hi Eric,
> >>
> >>
> >> On Fri Sep 01, 2017 at 07:21:03PM +0200, Eric Auger wrote:
> >>> This series implements the emulation code for ARM SMMUv3.
> >>>
> >>> Changes since v6:
> >>> - DPDK testpmd now running on guest with 2 assigned VFs
> >>> - Changed the instantiation method: add the following option to
> >>>   the QEMU command line
> >>>   -device smmuv3 # for virtio/vhost use cases
> >>>   -device smmuv3,caching-mode # for vfio use cases (based on [1])
> >>> - splitted the series into smaller patches to allow the review
> >>> - the VFIO integration based on "tlbi-on-map" smmuv3 driver
> >>>   is isolated from the rest: last 2 patches, not for upstream.
> >>>   This is shipped for testing/bench until a better solution is found.
> >>> - Reworked permission flag checks and event generation
> >>>
> >>> testing:
> >>> - in dt and ACPI modes
> >>> - virtio-net-pci and vhost-net devices using dma ops with various
> >>>   guest page sizes [2]
> >>> - assigned VFs using dma ops [3]:
> >>>   - AMD Overdrive and igbvf passthrough (using gsi direct mapping)
> >>>   - Cavium ThunderX and ixgbevf passthrough (using KVM MSI routing)
> >>> - DPDK testpmd on guest running with VFIO user space drivers (2 igbvf) [3]
> >>>   with guest and host page size equal (4kB)
> >>>
> >>> Known limitations:
> >>> - no VMSAv8-32 suport
> >>> - no nested stage support (S1 + S2)
> >>> - no support for HYP mappings
> >>> - register fine emulation, commands, interrupts and errors were
> >>>   not accurately tested. Handling is sufficient to run use cases
> >>>   described above though.
> >>> - interrupts and event generation not observed yet.
> >>
> >> While testing with vfio-pci, observed that the below two  Qemu command,
> >> results in two different behaviour. Is this expected by design ?
> >>
> >> Case 1:
> >> # -device vfio-pci,host=0002:01:00.3 -device smmuv3,caching-mode
> >>  Here iommu is not attached to the pci bus in Qemu backend, since
> >>  pci_setup_iommu is not called before vfio_realize.
> >>
> >> Case 2:
> >> # -device smmuv3,caching-mode -device vfio-pci,host=0002:01:00.3
> >> This works as expected, iommu is attached to the pci bus.
> > 
> > Not sure about SMMU, but VT-d should have similar issue - the vIOMMU
> > device needs to be created before the rest of the devices.
> 
> Yes this is an expected limitation right now. I should have documented
> it though. As you noticed, the pci_set_iommu() is called on virtio-iommu
> realize and it relies on the fact the PCIe devices already are realized.
> 
> Maybe we could relax this constraint by calling the pci_set_iommu in a
> machine init done notifier.
> 
> Thanks
> 
> Eric

Thanks for confirming. 

> 
> 
> > 
> > Now for VT-d the ordering of devices should be assured by Libvirt:
> > 
> > https://bugzilla.redhat.com/show_bug.cgi?id=1427005
> > 
> > For your reference only.  Thanks,
> > 

-- 
Linu cherian


* Re: [Qemu-devel] [PATCH v7 01/20] hw/arm/smmu-common: smmu base device and datatypes
  2017-09-27 17:38   ` Peter Maydell
  2017-09-28  7:57     ` Auger Eric
@ 2017-09-30  8:28     ` Prem Mallappa
  2017-10-02  7:43       ` Auger Eric
  1 sibling, 1 reply; 72+ messages in thread
From: Prem Mallappa @ 2017-09-30  8:28 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Eric Auger, eric.auger.pro, qemu-arm, QEMU Developers,
	Alex Williamson, Andrew Jones, Christoffer Dall,
	Radha.Chintakuntla, Sunil.Goutham, Radha Mohan, Trey Cain,
	Bharat Bhushan, Tomasz Nowicki, Michael S. Tsirkin, Will Deacon,
	jean-philippe.brucker, robin.murphy, Peter Xu, Edgar E. Iglesias,
	wtownsen

All,

I am no longer associated with Broadcom, so can't help much here I guess.

If there is anything else I can do; please let me know.

Cheers,
/Prem



--
Prem

On Wed, Sep 27, 2017 at 11:08 PM, Peter Maydell <peter.maydell@linaro.org>
wrote:

> On 1 September 2017 at 10:21, Eric Auger <eric.auger@redhat.com> wrote:
> > The patch introduces the smmu base device and class for the ARM
> > smmu. Devices for specific versions will be derived from this
> > base device.
> >
> > We also introduce some important datatypes.
> >
> > Signed-off-by: Eric Auger <eric.auger@redhat.com>
> > Signed-off-by: Prem Mallappa <prem.mallappa@broadcom.com>
>
> > +/*
> > + * Copyright (C) 2014-2016 Broadcom Corporation
> > + * Copyright (c) 2017 Red Hat, Inc.
> > + * Written by Prem Mallappa, Eric Auger
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU General Public License version 2 as
> > + * published by the Free Software Foundation.
>
> Really GPL-2-only ?
>
> Unless there's a good reason, you should probably fix this (including
> getting an ack from Prem/Broadcom where it applies to code that came
> from him).
>
> thanks
> -- PMM
>


* Re: [Qemu-devel] [PATCH v7 01/20] hw/arm/smmu-common: smmu base device and datatypes
  2017-09-30  8:28     ` Prem Mallappa
@ 2017-10-02  7:43       ` Auger Eric
  0 siblings, 0 replies; 72+ messages in thread
From: Auger Eric @ 2017-10-02  7:43 UTC (permalink / raw)
  To: Prem Mallappa, Peter Maydell
  Cc: eric.auger.pro, qemu-arm, QEMU Developers, Alex Williamson,
	Andrew Jones, Christoffer Dall, Radha.Chintakuntla,
	Sunil.Goutham, Radha Mohan, Trey Cain, Bharat Bhushan,
	Tomasz Nowicki, Michael S. Tsirkin, Will Deacon,
	jean-philippe.brucker, robin.murphy, Peter Xu, Edgar E. Iglesias,
	wtownsen

Hi Prem,

On 30/09/2017 10:28, Prem Mallappa wrote:
> All,
> 
> I am no longer associated with Broadcom, so can't help much here I guess.
> 
> If there is anything else I can do; please let me know.

Thank you very much for your reply. We are going to ask Broadcom's
representatives.

Thanks

Eric
> 
> Cheers,
> /Prem
> 
> 
> 
> --
> Prem
> 
> On Wed, Sep 27, 2017 at 11:08 PM, Peter Maydell
> <peter.maydell@linaro.org> wrote:
> 
>     On 1 September 2017 at 10:21, Eric Auger <eric.auger@redhat.com> wrote:
>     > The patch introduces the smmu base device and class for the ARM
>     > smmu. Devices for specific versions will be derived from this
>     > base device.
>     >
>     > We also introduce some important datatypes.
>     >
>     > Signed-off-by: Eric Auger <eric.auger@redhat.com>
>     > Signed-off-by: Prem Mallappa <prem.mallappa@broadcom.com>
> 
>     > +/*
>     > + * Copyright (C) 2014-2016 Broadcom Corporation
>     > + * Copyright (c) 2017 Red Hat, Inc.
>     > + * Written by Prem Mallappa, Eric Auger
>     > + *
>     > + * This program is free software; you can redistribute it and/or modify
>     > + * it under the terms of the GNU General Public License version 2 as
>     > + * published by the Free Software Foundation.
> 
>     Really GPL-2-only ?
> 
>     Unless there's a good reason, you should probably fix this (including
>     getting an ack from Prem/Broadcom where it applies to code that came
>     from him).
> 
>     thanks
>     -- PMM
> 
> 


* Re: [Qemu-devel] [PATCH v7 02/20] hw/arm/smmu-common: IOMMU memory region and address space setup
  2017-09-01 17:21 ` [Qemu-devel] [PATCH v7 02/20] hw/arm/smmu-common: IOMMU memory region and address space setup Eric Auger
@ 2017-10-09 14:39   ` Peter Maydell
  0 siblings, 0 replies; 72+ messages in thread
From: Peter Maydell @ 2017-10-09 14:39 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, qemu-arm, QEMU Developers, Prem Mallappa,
	Alex Williamson, Andrew Jones, Christoffer Dall,
	Radha.Chintakuntla, Sunil.Goutham, Radha Mohan, Trey Cain,
	Bharat Bhushan, Tomasz Nowicki, Michael S. Tsirkin, Will Deacon,
	jean-philippe.brucker, robin.murphy, Peter Xu, Edgar E. Iglesias,
	wtownsen

On 1 September 2017 at 18:21, Eric Auger <eric.auger@redhat.com> wrote:
> We enumerate all the PCI devices attached to the SMMU and
> initialize an associated IOMMU memory region and address space.
> This happens on SMMU base instance init.
>
> This information is stored in SMMUDevice objects. The devices are
> grouped according to the PCIBus they belong to. A hash table
> indexed by the PCIBus pointer is used. Also an array indexed by
> the bus number allows finding the list of SMMUDevices.
>
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> ---
>  hw/arm/smmu-common.c         | 89 ++++++++++++++++++++++++++++++++++++++++++++
>  include/hw/arm/smmu-common.h |  6 +++
>  2 files changed, 95 insertions(+)
>
> diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
> index 56608f1..3e67992 100644
> --- a/hw/arm/smmu-common.c
> +++ b/hw/arm/smmu-common.c
> @@ -30,8 +30,97 @@
>  #include "qemu/error-report.h"
>  #include "hw/arm/smmu-common.h"
>
> +/******************/
> +/* Infrastructure */
> +/******************/

Minor thing, but we don't really need this kind of fancy comment
formatting.

> +static inline gboolean smmu_uint64_equal(gconstpointer v1, gconstpointer v2)
> +{
> +    return *((const uint64_t *)v1) == *((const uint64_t *)v2);
> +}
> +
> +static inline guint smmu_uint64_hash(gconstpointer v)
> +{
> +    return (guint)*(const uint64_t *)v;
> +}
> +
> +SMMUPciBus *smmu_find_as_from_bus_num(SMMUState *s, uint8_t bus_num)
> +{
> +    SMMUPciBus *smmu_pci_bus = s->smmu_as_by_bus_num[bus_num];
> +
> +    if (!smmu_pci_bus) {
> +        GHashTableIter iter;
> +
> +        g_hash_table_iter_init(&iter, s->smmu_as_by_busptr);
> +        while (g_hash_table_iter_next(&iter, NULL, (void **)&smmu_pci_bus)) {
> +            if (pci_bus_num(smmu_pci_bus->bus) == bus_num) {
> +                s->smmu_as_by_bus_num[bus_num] = smmu_pci_bus;
> +                return smmu_pci_bus;
> +            }
> +        }
> +    }
> +    return smmu_pci_bus;
> +}
> +
> +static AddressSpace *smmu_find_add_as(PCIBus *bus, void *opaque, int devfn)
> +{
> +    SMMUState *s = opaque;
> +    uintptr_t key = (uintptr_t)bus;
> +    SMMUPciBus *sbus = g_hash_table_lookup(s->smmu_as_by_busptr, &key);
> +    SMMUDevice *sdev;
> +
> +    if (!sbus) {
> +        uintptr_t *new_key = g_malloc(sizeof(*new_key));
> +
> +        *new_key = (uintptr_t)bus;
> +        sbus = g_malloc0(sizeof(SMMUPciBus) +
> +                         sizeof(SMMUDevice *) * SMMU_PCI_DEVFN_MAX);
> +        sbus->bus = bus;
> +        g_hash_table_insert(s->smmu_as_by_busptr, new_key, sbus);

Why do we allocate memory containing a uintptr_t which we set to
be the (integer value of the) pointer to the bus, and then use the
pointer to that uintptr_t as the key, when we could just use the
pointer to the bus as the key ? That would save you having a specialist
equal function, hash function and having to free the keys.

> +    }
> +
> +    sdev = sbus->pbdev[devfn];
> +    if (!sdev) {
> +        char *name = g_strdup_printf("%s-%d-%d",
> +                                     s->mrtypename,
> +                                     pci_bus_num(bus), devfn);
> +        sdev = sbus->pbdev[devfn] = g_malloc0(sizeof(SMMUDevice));

g_new0() is slightly stylistically preferable for this kind of thing.

> +
> +        sdev->smmu = s;
> +        sdev->bus = bus;
> +        sdev->devfn = devfn;
> +
> +        memory_region_init_iommu(&sdev->iommu, sizeof(sdev->iommu),
> +                                 s->mrtypename,
> +                                 OBJECT(s), name, 1ULL << 48);

What is this 1ULL << 48 ? Is it intended to be the input address
size, intermediate address size or output address size? It's not
clear to me that hardcoded 1 << 48 is right in any of those cases...

> +        address_space_init(&sdev->as,
> +                           MEMORY_REGION(&sdev->iommu), name);
> +    }
> +
> +    return &sdev->as;
> +}
> +
> +static void smmu_init_iommu_as(SMMUState *s)
> +{
> +    PCIBus *pcibus = pci_find_primary_bus();

This looks odd. I would expect the board model to be
instantiating and wiring up the SMMU somehow so that
it is in the path of whatever PCI bus it is sitting in
front of. It shouldn't need to look for the PCI bus like
this, which prevents modelling a system where there are
two PCI buses each of which has its own SMMU.

> +
> +    if (pcibus) {
> +        pci_setup_iommu(pcibus, smmu_find_add_as, s);
> +    } else {
> +        error_report("No PCI bus, SMMU is not registered");
> +    }
> +}
> +
>  static void smmu_base_instance_init(Object *obj)
>  {
> +    SMMUState *s = SMMU_SYS_DEV(obj);
> +
> +    memset(s->smmu_as_by_bus_num, 0, sizeof(s->smmu_as_by_bus_num));

Instance init doesn't need to clear the data structure.

> +
> +    s->smmu_as_by_busptr = g_hash_table_new_full(smmu_uint64_hash,
> +                                                 smmu_uint64_equal,
> +                                                 g_free, g_free);
> +    smmu_init_iommu_as(s);
>  }
>
>  static void smmu_base_class_init(ObjectClass *klass, void *data)
> diff --git a/include/hw/arm/smmu-common.h b/include/hw/arm/smmu-common.h
> index 38cd18f..20f3fe6 100644
> --- a/include/hw/arm/smmu-common.h
> +++ b/include/hw/arm/smmu-common.h
> @@ -105,4 +105,10 @@ typedef struct {
>  #define SMMU_DEVICE_CLASS(klass)                                    \
>      OBJECT_CLASS_CHECK(SMMUBaseClass, (klass), TYPE_SMMU_DEV_BASE)
>
> +SMMUPciBus *smmu_find_as_from_bus_num(SMMUState *s, uint8_t bus_num);
> +
> +static inline uint16_t smmu_get_sid(SMMUDevice *sdev)
> +{
> +    return  ((pci_bus_num(sdev->bus) & 0xff) << 8) | sdev->devfn;
> +}
>  #endif  /* HW_ARM_SMMU_COMMON */
> --
> 2.5.5

thanks
-- PMM

* Re: [Qemu-devel] [PATCH v7 03/20] hw/arm/smmu-common: smmu_read/write_sysmem
  2017-09-01 17:21 ` [Qemu-devel] [PATCH v7 03/20] hw/arm/smmu-common: smmu_read/write_sysmem Eric Auger
@ 2017-10-09 14:46   ` Peter Maydell
  0 siblings, 0 replies; 72+ messages in thread
From: Peter Maydell @ 2017-10-09 14:46 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, qemu-arm, QEMU Developers, Prem Mallappa,
	Alex Williamson, Andrew Jones, Christoffer Dall,
	Radha.Chintakuntla, Sunil.Goutham, Radha Mohan, Trey Cain,
	Bharat Bhushan, Tomasz Nowicki, Michael S. Tsirkin, Will Deacon,
	jean-philippe.brucker, robin.murphy, Peter Xu, Edgar E. Iglesias,
	wtownsen

On 1 September 2017 at 18:21, Eric Auger <eric.auger@redhat.com> wrote:
> Those two functions will be used to access configuration
> data (STE, CD) and page table entries in guest RAM.
>
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> ---
>  hw/arm/smmu-common.c         | 37 +++++++++++++++++++++++++++++++++++++
>  include/hw/arm/smmu-common.h |  5 +++++
>  2 files changed, 42 insertions(+)
>
> diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
> index 3e67992..2a94547 100644
> --- a/hw/arm/smmu-common.c
> +++ b/hw/arm/smmu-common.c
> @@ -30,6 +30,43 @@
>  #include "qemu/error-report.h"
>  #include "hw/arm/smmu-common.h"
>
> +inline MemTxResult smmu_read_sysmem(dma_addr_t addr, void *buf, dma_addr_t len,
> +                                    bool secure)
> +{
> +    MemTxAttrs attrs = {.unspecified = 1, .secure = secure};

This isn't right. .unspecified = 1 means "transaction master has
not explicitly specified any attributes", but you are specifying
one (the secure one).

> +    switch (len) {
> +    case 4:
> +        *(uint32_t *)buf = ldl_le_phys(&address_space_memory, addr);
> +        break;
> +    case 8:
> +        *(uint64_t *)buf = ldq_le_phys(&address_space_memory, addr);
> +        break;
> +    default:
> +        return address_space_rw(&address_space_memory, addr,
> +                                attrs, buf, len, false);

Why do we have the special cases for 4 and 8? In particular, those
code paths will not correctly detect memory transaction failures.

> +    }
> +    return MEMTX_OK;
> +}
> +
> +inline void
> +smmu_write_sysmem(dma_addr_t addr, void *buf, dma_addr_t len, bool secure)
> +{
> +    MemTxAttrs attrs = {.unspecified = 1, .secure = secure};
> +
> +    switch (len) {
> +    case 4:
> +        stl_le_phys(&address_space_memory, addr, *(uint32_t *)buf);
> +        break;
> +    case 8:
> +        stq_le_phys(&address_space_memory, addr, *(uint64_t *)buf);
> +        break;
> +    default:
> +        address_space_rw(&address_space_memory, addr,
> +                         attrs, buf, len, true);
> +    }
> +}
> +
>  /******************/
>  /* Infrastructure */
>  /******************/
> diff --git a/include/hw/arm/smmu-common.h b/include/hw/arm/smmu-common.h
> index 20f3fe6..a5999b0 100644
> --- a/include/hw/arm/smmu-common.h
> +++ b/include/hw/arm/smmu-common.h
> @@ -111,4 +111,9 @@ static inline uint16_t smmu_get_sid(SMMUDevice *sdev)
>  {
>      return  ((pci_bus_num(sdev->bus) & 0xff) << 8) | sdev->devfn;
>  }
> +
> +MemTxResult smmu_read_sysmem(dma_addr_t addr, void *buf,
> +                             dma_addr_t len, bool secure);
> +void smmu_write_sysmem(dma_addr_t addr, void *buf, dma_addr_t len, bool secure);
> +

There are so few callers of this that I'm inclined to think you
should just open code the right kind of memory accessor function
in the callsites rather than having a weird switch statement.

>  #endif  /* HW_ARM_SMMU_COMMON */
> --
> 2.5.5
>

thanks
-- PMM

* Re: [Qemu-devel] [PATCH v7 04/20] hw/arm/smmu-common: VMSAv8-64 page table walk
  2017-09-01 17:21 ` [Qemu-devel] [PATCH v7 04/20] hw/arm/smmu-common: VMSAv8-64 page table walk Eric Auger
@ 2017-10-09 15:36   ` Peter Maydell
  0 siblings, 0 replies; 72+ messages in thread
From: Peter Maydell @ 2017-10-09 15:36 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, qemu-arm, QEMU Developers, Prem Mallappa,
	Alex Williamson, Andrew Jones, Christoffer Dall,
	Radha.Chintakuntla, Sunil.Goutham, Radha Mohan, Trey Cain,
	Bharat Bhushan, Tomasz Nowicki, Michael S. Tsirkin, Will Deacon,
	jean-philippe.brucker, robin.murphy, Peter Xu, Edgar E. Iglesias,
	wtownsen

On 1 September 2017 at 18:21, Eric Auger <eric.auger@redhat.com> wrote:
> This patch implements the page table walk for VMSAv8-64.
>
> The page table walk function is devised to walk the tables
> for a range of IOVAs and to call a callback for each valid
> leaf entry (frame or block).
>
> smmu_page_walk_level_64() handles the walk from a specific level.
> The advantage of recursion is that invalid entries are easily
> skipped at any stage. We walk level n+1 only if the entry at
> level n is valid; otherwise we jump to the next index of
> level n.
>
> Walk for an IOVA range will be used for SMMU memory region custom
> replay. Translation function uses the same function for a granule.
>
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>
> ---
> v6 -> v7:
> - fix wrong error handling in walk_page_table
> - check perm in smmu_translate
>
> v5 -> v6:
> - use IOMMUMemoryRegion
> - remove initial_lookup_level()
> - fix block replay
>
> v4 -> v5:
> - add initial level in translation config
> - implement block pte
> - rename must_translate into nofail
> - introduce call_entry_hook
> - small changes to dynamic traces
> - smmu_page_walk code moved from smmuv3.c to this file
> - remove smmu_translate*
>
> v3 -> v4:
> - reworked page table walk to prepare for VFIO integration
>   (capability to scan a range of IOVA). Same function is used
>   for translate for a single iova. This is largely inspired
>   from intel_iommu.c
> - as the translate function was not straightforward to me,
>   I tried to stick more closely to the VMSA spec.
> - remove support of nested stage (kernel driver does not
>   support it anyway)
> - use error_report and trace events
> - add aa64[] field in SMMUTransCfg
> ---
>  hw/arm/smmu-common.c         | 343 +++++++++++++++++++++++++++++++++++++++++++
>  hw/arm/smmu-internal.h       | 105 +++++++++++++
>  hw/arm/trace-events          |  12 ++
>  include/hw/arm/smmu-common.h |   4 +
>  4 files changed, 464 insertions(+)
>  create mode 100644 hw/arm/smmu-internal.h
>
> diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
> index 2a94547..f476120 100644
> --- a/hw/arm/smmu-common.c
> +++ b/hw/arm/smmu-common.c
> @@ -29,6 +29,349 @@
>
>  #include "qemu/error-report.h"
>  #include "hw/arm/smmu-common.h"
> +#include "smmu-internal.h"
> +
> +/*************************/
> +/* VMSAv8-64 Translation */
> +/*************************/
> +
> +/**
> + * get_pte - Get the content of a page table entry located in
> + * @base_addr[@index]
> + */
> +static uint64_t get_pte(dma_addr_t baseaddr, uint32_t index)
> +{
> +    uint64_t pte;
> +
> +    if (smmu_read_sysmem(baseaddr + index * sizeof(pte),
> +                         &pte, sizeof(pte), false)) {
> +        error_report("can't read pte at address=0x%"PRIx64,
> +                     baseaddr + index * sizeof(pte));

This is just a "guest has misprogrammed something" error presumably;
those are LOG_GUEST_ERROR. Or if it can happen in normal use then
don't log it at all.

> +        pte = (uint64_t)-1;

This doesn't look right. "Successfully read -1 from memory" and
"Failed to read memory" are different things, so you don't want to
mash them together into the same return code.
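
A minimal sketch of the separation suggested here, assuming a
hypothetical read_pte() helper and a small array standing in for guest
RAM (neither is a QEMU API): the status travels in the return value and
the descriptor in an out-parameter, so an all-ones descriptor is still a
successful read.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for guest RAM; not a QEMU API. */
#define FAKE_RAM_ENTRIES 4
static const uint64_t fake_ram[FAKE_RAM_ENTRIES] = {
    0x1003, UINT64_MAX, 0x0, 0x2003,
};

/*
 * Read one descriptor. Returns true on success and stores the value in
 * *pte; returns false when the access itself fails. An all-ones
 * descriptor is therefore reported as a successful read.
 */
static bool read_pte(size_t index, uint64_t *pte)
{
    assert(pte != NULL);
    if (index >= FAKE_RAM_ENTRIES) {
        return false;           /* models a failed memory transaction */
    }
    *pte = fake_ram[index];
    return true;
}
```

The caller can then map a false return to the external-abort error and
treat every stored value, including all-ones, as descriptor contents.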

> +        return pte;
> +    }
> +    trace_smmu_get_pte(baseaddr, index, baseaddr + index * sizeof(pte), pte);
> +    /* TODO: handle endianness */
> +    return pte;
> +}
> +
> +/* VMSAv8-64 Translation Table Format Descriptor Decoding */
> +
> +#define PTE_ADDRESS(pte, shift) (extract64(pte, shift, 47 - shift) << shift)
> +
> +/**
> + * get_page_pte_address - returns the L3 descriptor output address,
> + * ie. the page frame
> + * ARM ARM spec: Figure D4-17 VMSAv8-64 level 3 descriptor format
> + */
> +static inline hwaddr get_page_pte_address(uint64_t pte, int granule_sz)
> +{
> +    return PTE_ADDRESS(pte, granule_sz);
> +}
> +
> +/**
> + * get_table_pte_address - return table descriptor output address,
> + * ie. address of next level table
> + * ARM ARM Figure D4-16 VMSAv8-64 level0, level1, and level 2 descriptor formats
> + */
> +static inline hwaddr get_table_pte_address(uint64_t pte, int granule_sz)
> +{
> +    return PTE_ADDRESS(pte, granule_sz);
> +}
> +
> +/**
> + * get_block_pte_address - return block descriptor output address and block size
> + * ARM ARM Figure D4-16 VMSAv8-64 level0, level1, and level 2 descriptor formats
> + */
> +static hwaddr get_block_pte_address(uint64_t pte, int level, int granule_sz,
> +                                    uint64_t *bsz)
> +{
> +    int n;
> +
> +    switch (granule_sz) {
> +    case 12:
> +        if (level == 1) {
> +            n = 30;
> +        } else if (level == 2) {
> +            n = 21;
> +        } else {
> +            goto error_out;
> +        }
> +        break;
> +    case 14:
> +        if (level == 2) {
> +            n = 25;
> +        } else {
> +            goto error_out;
> +        }
> +        break;
> +    case 16:
> +        if (level == 2) {
> +            n = 29;
> +        } else {
> +            goto error_out;
> +        }
> +        break;
> +    default:
> +            goto error_out;
> +    }

This is essentially a check that the initial SMMUTransCfg didn't
specify an incompatible initial_level and granule, right? We should
check that earlier, rather than here, and if it's strictly a QEMU
code bug to get that wrong we should just assert. If a guest misconfig
can cause it then we shouldn't use error_report().
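
A sketch of doing that check up front, assuming hypothetical helpers
block_shift() and block_size(): the accepted (granule, level) pairs are
the ones from the switch above, and the shift values fall out of the
level_shift() formula the patch already has in smmu-internal.h. The
size is computed with a 64-bit shift.

```c
#include <assert.h>
#include <stdint.h>

/* Same formula as level_shift() in the patch's smmu-internal.h. */
static int level_shift(int level, int granule_sz)
{
    return granule_sz + (3 - level) * (granule_sz - 3);
}

/*
 * Hypothetical helper: returns the block shift for a (granule, level)
 * pair that the switch in get_block_pte_address() accepts, or -1
 * for any other combination.
 */
static int block_shift(int granule_sz, int level)
{
    int ok = (granule_sz == 12 && (level == 1 || level == 2)) ||
             ((granule_sz == 14 || granule_sz == 16) && level == 2);

    return ok ? level_shift(level, granule_sz) : -1;
}

/* Hypothetical helper: block size once the config has been validated. */
static uint64_t block_size(int granule_sz, int level)
{
    int n = block_shift(granule_sz, level);

    assert(n >= 0);     /* the configuration was validated earlier */
    return UINT64_C(1) << n;
}
```

With the validation done once when the translation config is decoded,
the walk itself never needs an error path for this case.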

> +    *bsz = 1 << n;
> +    return PTE_ADDRESS(pte, n);
> +
> +error_out:
> +
> +    error_report("unexpected granule_sz=%d/level=%d for block pte",
> +                 granule_sz, level);
> +    *bsz = 0;
> +    return (hwaddr)-1;
> +}
> +
> +static int call_entry_hook(uint64_t iova, uint64_t mask, uint64_t gpa,
> +                           int perm, smmu_page_walk_hook hook_fn, void *private)
> +{
> +    IOMMUTLBEntry entry;
> +    int ret;
> +
> +    entry.target_as = &address_space_memory;
> +    entry.iova = iova & mask;
> +    entry.translated_addr = gpa;
> +    entry.addr_mask = ~mask;
> +    entry.perm = perm;
> +
> +    ret = hook_fn(&entry, private);
> +    if (ret) {
> +        error_report("%s hook returned %d", __func__, ret);
> +    }
> +    return ret;
> +}
> +
> +/**
> + * smmu_page_walk_level_64 - Walk an IOVA range from a specific level
> + * @baseaddr: table base address corresponding to @level
> + * @level: level
> + * @cfg: translation config
> + * @start: start of the IOVA range
> + * @end: end of the IOVA range
> + * @hook_fn: the hook to be called for each detected area
> + * @private: private data for the hook function
> + * @flags: access flags of the parent
> + * @nofail: indicates whether each iova of the range
> + *  must be translated or whether failure is allowed
> + *
> + * Return 0 on success, < 0 on errors not related to translation
> + * process, > 1 on errors related to translation process (only
> + * if nofail is set)
> + */
> +static int
> +smmu_page_walk_level_64(dma_addr_t baseaddr, int level,
> +                        SMMUTransCfg *cfg, uint64_t start, uint64_t end,
> +                        smmu_page_walk_hook hook_fn, void *private,
> +                        IOMMUAccessFlags flags, bool nofail)
> +{
> +    uint64_t subpage_size, subpage_mask, pte, iova = start;
> +    int ret, granule_sz, stage, perm;
> +
> +    granule_sz = cfg->granule_sz;
> +    stage = cfg->stage;
> +    subpage_size = 1ULL << level_shift(level, granule_sz);
> +    subpage_mask = level_page_mask(level, granule_sz);
> +
> +    trace_smmu_page_walk_level_in(level, baseaddr, granule_sz,
> +                                  start, end, flags, subpage_size);
> +
> +    while (iova < end) {
> +        dma_addr_t next_table_baseaddr;
> +        uint64_t iova_next, pte_addr;
> +        uint32_t offset;
> +
> +        iova_next = (iova & subpage_mask) + subpage_size;
> +        offset = iova_level_offset(iova, level, granule_sz);
> +        pte_addr = baseaddr + offset * sizeof(pte);
> +        pte = get_pte(baseaddr, offset);
> +
> +        trace_smmu_page_walk_level(level, iova, subpage_size,
> +                                   baseaddr, offset, pte);
> +
> +        if (pte == (uint64_t)-1) {
> +            if (nofail) {
> +                return SMMU_TRANS_ERR_WALK_EXT_ABRT;
> +            }
> +            goto next;
> +        }
> +        if (is_invalid_pte(pte) || is_reserved_pte(pte, level)) {
> +            trace_smmu_page_walk_level_res_invalid_pte(stage, level, baseaddr,
> +                                                       pte_addr, offset, pte);
> +            if (nofail) {
> +                return SMMU_TRANS_ERR_TRANS;
> +            }
> +            goto next;
> +        }
> +
> +        if (is_page_pte(pte, level)) {
> +            uint64_t gpa = get_page_pte_address(pte, granule_sz);
> +
> +            perm = flags & pte_ap_to_perm(pte, true);
> +
> +            trace_smmu_page_walk_level_page_pte(stage, level, iova,
> +                                                baseaddr, pte_addr, pte, gpa);
> +            ret = call_entry_hook(iova, subpage_mask, gpa, perm,
> +                                  hook_fn, private);
> +            if (ret) {
> +                return ret;
> +            }
> +            goto next;
> +        }
> +        if (is_block_pte(pte, level)) {

A block descriptor and a page descriptor are basically the same format,
and you can see in the CPU TLB walk code that we thus treat them basically
the same way. It's a bit odd that the code here handles them totally
separately and in apparently significantly different ways.
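
A sketch of treating both leaf kinds uniformly, assuming a hypothetical
leaf_output_addr() helper and a 48-bit output address: the only
difference between a block and a page is the shift, which
level_shift() already provides.

```c
#include <assert.h>
#include <stdint.h>

/* Same formula as level_shift() in the patch's smmu-internal.h. */
static int level_shift(int level, int granule_sz)
{
    return granule_sz + (3 - level) * (granule_sz - 3);
}

/*
 * Hypothetical helper: translate @iova through a leaf descriptor, whether
 * it is a block (level < 3) or a page (level 3). Assumes a 48-bit output
 * address, so attribute bits above bit 47 and below the shift are dropped.
 */
static uint64_t leaf_output_addr(uint64_t pte, uint64_t iova,
                                 int level, int granule_sz)
{
    int shift;
    uint64_t offset_mask, oa_mask;

    assert(level >= 0 && level <= 3);
    shift = level_shift(level, granule_sz);
    offset_mask = (UINT64_C(1) << shift) - 1;
    oa_mask = ((UINT64_C(1) << 48) - 1) & ~offset_mask;
    return (pte & oa_mask) | (iova & offset_mask);
}
```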

> +            size_t target_page_size = qemu_target_page_size();;

Stray extra semicolon. Also, this isn't really the current CPU target
page size (which in any case the SMMU has no way of knowing) so I'm
suspicious that it's not what you really want. (What do you want?)

> +            uint64_t block_size, top_iova;
> +            hwaddr gpa, block_gpa;
> +
> +            block_gpa = get_block_pte_address(pte, level, granule_sz,
> +                                              &block_size);
> +            perm = flags & pte_ap_to_perm(pte, true);
> +
> +            if (block_gpa == -1) {
> +                if (nofail) {
> +                    return SMMU_TRANS_ERR_WALK_EXT_ABRT;
> +                } else {
> +                    goto next;
> +                }
> +            }
> +            trace_smmu_page_walk_level_block_pte(stage, level, baseaddr,
> +                                                 pte_addr, pte, iova, block_gpa,
> +                                                 (int)(block_size >> 20));
> +
> +            gpa = block_gpa + (iova & (block_size - 1));
> +            if ((block_gpa == gpa) && (end >= iova_next - 1)) {
> +                ret = call_entry_hook(iova, ~(block_size - 1), block_gpa,
> +                                      perm, hook_fn, private);
> +                if (ret) {
> +                    return ret;
> +                }
> +                goto next;
> +            } else {
> +                top_iova = MIN(end, iova_next);
> +                while (iova < top_iova) {
> +                    gpa = block_gpa + (iova & (block_size - 1));
> +                    ret = call_entry_hook(iova, ~(target_page_size - 1),
> +                                          gpa, perm, hook_fn, private);
> +                    if (ret) {
> +                        return ret;
> +                    }
> +                    iova += target_page_size;
> +                }

No "goto next" ? All the other parts of this loop seem to do that
(though it also suggests that you want if ... else if ... else if ... else).

> +            }
> +        }
> +        if (level  == 3) {
> +            goto next;

Yuck!

> +        }
> +        /* table pte */
> +        next_table_baseaddr = get_table_pte_address(pte, granule_sz);
> +        trace_smmu_page_walk_level_table_pte(stage, level, baseaddr, pte_addr,
> +                                             pte, next_table_baseaddr);
> +        perm = flags & pte_ap_to_perm(pte, false);

This is converting the architectural TLB entry attribute flags into
the QEMU architecture independent IOMMUAccessFlags and then passing
the latter onto the next stage of the TLB walk. I think it would be
better to stick to using the architectural attribute flags, as the
CPU TLB walk code does.

> +        ret = smmu_page_walk_level_64(next_table_baseaddr, level + 1, cfg,
> +                                      iova, MIN(iova_next, end),
> +                                      hook_fn, private, perm, nofail);
> +        if (ret) {
> +            return ret;
> +        }
> +
> +next:
> +        iova = iova_next;
> +    }

The usual way to write "while (cond) { ... goto next; ...   next: something; }"
is "for (; cond; something) { ... ;continue; ...}".
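
A minimal sketch of that loop shape, assuming a hypothetical
count_valid_chunks() function and a byte array standing in for the
descriptor validity checks: the cursor advances in the third clause of
the for statement, so every skip path is a plain continue.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Counts how many fixed-size chunks in [start, end) pass a filter,
 * advancing the cursor in the for-statement's third clause so that
 * every skipped case can use continue instead of goto.
 */
static unsigned count_valid_chunks(uint64_t start, uint64_t end,
                                   uint64_t chunk_size,
                                   const uint8_t *valid, unsigned nvalid)
{
    uint64_t mask = ~(chunk_size - 1);
    uint64_t iova, iova_next;
    unsigned hits = 0;

    /* chunk_size must be a non-zero power of two */
    assert(chunk_size != 0 && (chunk_size & (chunk_size - 1)) == 0);

    for (iova = start; iova < end; iova = iova_next) {
        uint64_t idx = iova / chunk_size;

        /* computed before any continue, so the third clause can use it */
        iova_next = (iova & mask) + chunk_size;
        if (idx >= nvalid || !valid[idx]) {
            continue;           /* was: goto next */
        }
        hits++;
    }
    return hits;
}
```

The one constraint is that iova_next has to be assigned before the
first continue in the body.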

> +
> +    return SMMU_TRANS_ERR_NONE;
> +}
> +
> +/**
> + * smmu_page_walk - walk a specific IOVA range from the initial
> + * lookup level, and call the hook for each valid entry
> + *
> + * @cfg: translation config
> + * @start: start of the IOVA range
> + * @end: end of the IOVA range
> + * @nofail: if true, each IOVA within the range must have a translation
> + * @hook_fn: the hook to be called for each detected area
> + * @private: private data for the hook function
> + */
> +int smmu_page_walk(SMMUTransCfg *cfg, uint64_t start, uint64_t end,
> +                   bool nofail, smmu_page_walk_hook hook_fn, void *private)
> +{
> +    uint64_t roof = MIN(end, (1ULL << (64 - cfg->tsz)) - 1);
> +    IOMMUAccessFlags perm = IOMMU_ACCESS_FLAG(true, true);
> +    int stage = cfg->stage;
> +    dma_addr_t ttbr;
> +
> +    if (!hook_fn) {
> +        return 0;
> +    }
> +
> +    if (!cfg->aa64) {
> +        error_report("VMSAv8-32 page walk is not yet implemented");
> +        abort();
> +    }
> +
> +    ttbr = extract64(cfg->ttbr, 0, 48);
> +    trace_smmu_page_walk(stage, cfg->ttbr, cfg->initial_level, start, roof);
> +
> +    return smmu_page_walk_level_64(ttbr, cfg->initial_level, cfg, start, roof,
> +                                   hook_fn, private, perm, nofail);
> +}
> +
> +/**
> + * set_translated_address: page table walk callback for smmu_translate
> + *
> + * once a leaf entry is found, applies the offset to the translated address
> + * and check the permission
> + *
> + * @entry: entry filled by the page table walk function, ie. contains the
> + * leaf entry iova/translated addr and permission flags
> + * @private: pointer to the original entry that must be translated
> + */
> +static int set_translated_address(IOMMUTLBEntry *entry, void *private)
> +{
> +    IOMMUTLBEntry *tlbe_in = (IOMMUTLBEntry *)private;
> +    size_t offset = tlbe_in->iova - entry->iova;
> +
> +    if (((tlbe_in->perm & IOMMU_RO) && !(entry->perm & IOMMU_RO)) ||
> +        ((tlbe_in->perm & IOMMU_WO) && !(entry->perm & IOMMU_WO))) {
> +        return SMMU_TRANS_ERR_PERM;
> +    }
> +    tlbe_in->translated_addr = entry->translated_addr + offset;
> +    trace_smmu_set_translated_address(tlbe_in->iova, tlbe_in->translated_addr);
> +    return 0;

return SMMU_TRANS_ERR_NONE; ?

> +}
> +
> +/**
> + * smmu_translate - Attempt to translate a given entry according to @cfg
> + *
> + * @cfg: translation configuration
> + * @tlbe: entry pre-filled with the input iova, mask
> + *
> + * return: !=0 if no mapping is found for the tlbe->iova or access permission
> + * does not match
> + */
> +int smmu_translate(SMMUTransCfg *cfg, IOMMUTLBEntry *tlbe)
> +{
> +    int ret = 0;
> +
> +    if (cfg->bypassed || cfg->disabled) {
> +        return 0;
> +    }
> +
> +    ret = smmu_page_walk(cfg, tlbe->iova, tlbe->iova + 1, true /* nofail */,
> +                         set_translated_address, tlbe);
> +
> +    if (ret) {
> +        error_report("translation failed for iova=0x%"PRIx64" perm=%d (%d)",
> +                     tlbe->iova, tlbe->perm, ret);
> +        goto exit;
> +    }
> +
> +exit:

Not much point in goto to next statement.

> +    return ret;
> +}
>
>  inline MemTxResult smmu_read_sysmem(dma_addr_t addr, void *buf, dma_addr_t len,
>                                      bool secure)
> diff --git a/hw/arm/smmu-internal.h b/hw/arm/smmu-internal.h
> new file mode 100644
> index 0000000..aeeadd4
> --- /dev/null
> +++ b/hw/arm/smmu-internal.h
> @@ -0,0 +1,105 @@
> +/*
> + * ARM SMMU support - Internal API
> + *
> + * Copyright (c) 2017 Red Hat, Inc.
> + * Written by Eric Auger
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation, either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License along
> + * with this program; if not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#ifndef HW_ARM_SMMU_INTERNAL_H
> +#define HW_ARM_SMMU_INTERNAL_H
> +
> +#define ARM_LPAE_MAX_ADDR_BITS          48

Another 48... which address size limit is this intended to represent?

> +#define ARM_LPAE_MAX_LEVELS             4
> +
> +/* PTE Manipulation */
> +
> +#define ARM_LPAE_PTE_TYPE_SHIFT         0
> +#define ARM_LPAE_PTE_TYPE_MASK          0x3
> +
> +#define ARM_LPAE_PTE_TYPE_BLOCK         1
> +#define ARM_LPAE_PTE_TYPE_RESERVED      1
> +#define ARM_LPAE_PTE_TYPE_TABLE         3
> +#define ARM_LPAE_PTE_TYPE_PAGE          3

This looks weird, because several of these are the same as each other.
That's because they're really distinct sets of values, some for
L0/1/2 descriptors, and some for L3 descriptors. If you want
to define constant names for this can you make the prefixes different
for the different cases, please?
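
One possible naming, as a sketch only: the PTE_L012_* and PTE_L3_*
prefixes are hypothetical, chosen to make it visible which set of
values applies at which level.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PTE_TYPE_MASK           0x3

/* Levels 0, 1 and 2: bits [1:0] select block or table. */
#define PTE_L012_TYPE_BLOCK     1
#define PTE_L012_TYPE_TABLE     3

/* Level 3: bits [1:0] are either reserved or a page. */
#define PTE_L3_TYPE_RESERVED    1
#define PTE_L3_TYPE_PAGE        3

/* Extract the descriptor type field after checking the level range. */
static unsigned pte_type(uint64_t pte, int level)
{
    assert(level >= 0 && level <= 3);
    return pte & PTE_TYPE_MASK;
}

static bool is_block_pte(uint64_t pte, int level)
{
    return level < 3 && pte_type(pte, level) == PTE_L012_TYPE_BLOCK;
}

static bool is_table_pte(uint64_t pte, int level)
{
    return level < 3 && pte_type(pte, level) == PTE_L012_TYPE_TABLE;
}

static bool is_reserved_pte(uint64_t pte, int level)
{
    return level == 3 && pte_type(pte, level) == PTE_L3_TYPE_RESERVED;
}

static bool is_page_pte(uint64_t pte, int level)
{
    return level == 3 && pte_type(pte, level) == PTE_L3_TYPE_PAGE;
}
```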

> +
> +#define ARM_LPAE_PTE_VALID              (1 << 0)
> +
> +static inline bool is_invalid_pte(uint64_t pte)
> +{
> +    return !(pte & ARM_LPAE_PTE_VALID);
> +}
> +
> +static inline bool is_reserved_pte(uint64_t pte, int level)
> +{
> +    return ((level == 3) &&
> +            ((pte & ARM_LPAE_PTE_TYPE_MASK) == ARM_LPAE_PTE_TYPE_RESERVED));

return isn't a function so you don't need the outer brackets here.

> +}
> +
> +static inline bool is_block_pte(uint64_t pte, int level)
> +{
> +    return ((level < 3) &&
> +            ((pte & ARM_LPAE_PTE_TYPE_MASK) == ARM_LPAE_PTE_TYPE_BLOCK));
> +}
> +
> +static inline bool is_table_pte(uint64_t pte, int level)
> +{
> +    return ((level < 3) &&
> +            ((pte & ARM_LPAE_PTE_TYPE_MASK) == ARM_LPAE_PTE_TYPE_TABLE));
> +}
> +
> +static inline bool is_page_pte(uint64_t pte, int level)
> +{
> +    return ((level == 3) &&
> +            ((pte & ARM_LPAE_PTE_TYPE_MASK) == ARM_LPAE_PTE_TYPE_PAGE));
> +}
> +
> +static IOMMUAccessFlags pte_ap_to_perm(uint64_t pte, bool is_leaf)
> +{
> +    int ap;
> +    IOMMUAccessFlags flags;
> +
> +    if (is_leaf) {
> +        ap = extract64(pte, 6, 2);
> +    } else {
> +        ap = extract64(pte, 61, 2);
> +    }
> +    flags = IOMMU_ACCESS_FLAG(true, !(ap & 0x2));
> +    return flags;
> +}
> +
> +/* Level Indexing */
> +
> +static inline int level_shift(int level, int granule_sz)
> +{
> +    return granule_sz + (3 - level) * (granule_sz - 3);
> +}
> +
> +static inline uint64_t level_page_mask(int level, int granule_sz)
> +{
> +    return ~((1ULL << level_shift(level, granule_sz)) - 1);
> +}
> +
> +/**
> + * TODO: handle the case where the level resolves less than
> + * granule_sz -3 IA bits.
> + */
> +static inline
> +uint64_t iova_level_offset(uint64_t iova, int level, int granule_sz)
> +{
> +    return (iova >> level_shift(level, granule_sz)) &
> +            ((1ULL << (granule_sz - 3)) - 1);
> +}
> +
> +#endif
> diff --git a/hw/arm/trace-events b/hw/arm/trace-events
> index 193063e..c67cd39 100644
> --- a/hw/arm/trace-events
> +++ b/hw/arm/trace-events
> @@ -2,3 +2,15 @@
>
>  # hw/arm/virt-acpi-build.c
>  virt_acpi_setup(void) "No fw cfg or ACPI disabled. Bailing out."
> +
> +# hw/arm/smmu-common.c
> +
> +smmu_page_walk(int stage, uint64_t baseaddr, int first_level, uint64_t start, uint64_t end) "stage=%d, baseaddr=0x%"PRIx64", first level=%d, start=0x%"PRIx64", end=0x%"PRIx64
> +smmu_page_walk_level_in(int level, uint64_t baseaddr, int granule_sz, uint64_t start, uint64_t end, int flags, uint64_t subpage_size) "level=%d baseaddr=0x%"PRIx64" granule=%d, start=0x%"PRIx64" end=0x%"PRIx64" flags=%d subpage_size=0x%lx"
> +smmu_page_walk_level(int level, uint64_t iova, size_t subpage_size, uint64_t baseaddr, uint32_t offset, uint64_t pte) "level=%d iova=0x%lx subpage_sz=0x%lx baseaddr=0x%"PRIx64" offset=%d => pte=0x%lx"
> +smmu_page_walk_level_res_invalid_pte(int stage, int level, uint64_t baseaddr, uint64_t pteaddr, uint32_t offset, uint64_t pte) "stage=%d level=%d base@=0x%"PRIx64" pte@=0x%"PRIx64" offset=%d pte=0x%lx"
> +smmu_page_walk_level_page_pte(int stage, int level,  uint64_t iova, uint64_t baseaddr, uint64_t pteaddr, uint64_t pte, uint64_t address) "stage=%d level=%d iova=0x%"PRIx64" base@=0x%"PRIx64" pte@=0x%"PRIx64" pte=0x%"PRIx64" page address = 0x%"PRIx64
> +smmu_page_walk_level_block_pte(int stage, int level, uint64_t baseaddr, uint64_t pteaddr, uint64_t pte, uint64_t iova, uint64_t gpa, int bsize_mb) "stage=%d level=%d base@=0x%"PRIx64" pte@=0x%"PRIx64" pte=0x%"PRIx64" iova=0x%"PRIx64" block address = 0x%"PRIx64" block size = %d MiB"
> +smmu_page_walk_level_table_pte(int stage, int level, uint64_t baseaddr, uint64_t pteaddr, uint64_t pte, uint64_t address) "stage=%d, level=%d base@=0x%"PRIx64" pte@=0x%"PRIx64" pte=0x%"PRIx64" next table address = 0x%"PRIx64
> +smmu_get_pte(uint64_t baseaddr, int index, uint64_t pteaddr, uint64_t pte) "baseaddr=0x%"PRIx64" index=0x%x, pteaddr=0x%"PRIx64", pte=0x%"PRIx64
> +smmu_set_translated_address(hwaddr iova, hwaddr pa) "iova = 0x%"PRIx64" -> pa = 0x%"PRIx64
> diff --git a/include/hw/arm/smmu-common.h b/include/hw/arm/smmu-common.h
> index a5999b0..112a11c 100644
> --- a/include/hw/arm/smmu-common.h
> +++ b/include/hw/arm/smmu-common.h
> @@ -116,4 +116,8 @@ MemTxResult smmu_read_sysmem(dma_addr_t addr, void *buf,
>                               dma_addr_t len, bool secure);
>  void smmu_write_sysmem(dma_addr_t addr, void *buf, dma_addr_t len, bool secure);
>
> +int smmu_translate(SMMUTransCfg *cfg, IOMMUTLBEntry *tlbe);
> +int smmu_page_walk(SMMUTransCfg *cfg, uint64_t start, uint64_t end,
> +                   bool nofail, smmu_page_walk_hook hook_fn, void *private);
> +
>  #endif  /* HW_ARM_SMMU_COMMON */
> --

thanks
-- PMM

* Re: [Qemu-devel] [PATCH v7 05/20] hw/arm/smmuv3: Skeleton
  2017-09-01 17:21 ` [Qemu-devel] [PATCH v7 05/20] hw/arm/smmuv3: Skeleton Eric Auger
  2017-09-08 10:52   ` [Qemu-devel] [Qemu-arm] " Linu Cherian
@ 2017-10-09 16:17   ` Peter Maydell
  1 sibling, 0 replies; 72+ messages in thread
From: Peter Maydell @ 2017-10-09 16:17 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, qemu-arm, QEMU Developers, Prem Mallappa,
	Alex Williamson, Andrew Jones, Christoffer Dall,
	Radha.Chintakuntla, Sunil.Goutham, Radha Mohan, Trey Cain,
	Bharat Bhushan, Tomasz Nowicki, Michael S. Tsirkin, Will Deacon,
	jean-philippe.brucker, robin.murphy, Peter Xu, Edgar E. Iglesias,
	wtownsen

On 1 September 2017 at 18:21, Eric Auger <eric.auger@redhat.com> wrote:
> From: Prem Mallappa <prem.mallappa@broadcom.com>
>
> This patch implements a skeleton for the smmuv3 device.
> Datatypes and register definitions are introduced. The MMIO
> region, the interrupts and the queue are initialized (PRI is
> not supported).
>
> Only the MMIO read operation is implemented here.
>
> Signed-off-by: Prem Mallappa <prem.mallappa@broadcom.com>
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>
> ---
> v6 -> v7:
> - split into several patches
>
> v5 -> v6:
> - Use IOMMUMemoryregion
> - regs become uint32_t and fix 64b MMIO access (.impl)
> - trace_smmuv3_write/read_mmio take the size param
>
> v4 -> v5:
> - change smmuv3_translate proto (IOMMUAccessFlags flag)
> - has_stagex replaced by is_ste_stagex
> - smmu_cfg_populate removed
> - added smmuv3_decode_config and reworked error management
> - rework the naming of IOMMU mrs
> - fix SMMU_CMDQ_CONS offset
>
> v3 -> v4
> - smmu_irq_update
> - fix hash key allocation
> - set smmu_iommu_ops
> - set SMMU_REG_CR0,
> - smmuv3_translate: ret.perm not set in bypass mode
> - use trace events
> - renamed STM2U64 into L1STD_L2PTR and STMSPAN into L1STD_SPAN
> - rework smmu_find_ste
> - fix tg2granule in TT0/0b10 corresponds to 16kB
>
> v2 -> v3:
> - move creation of include/hw/arm/smmuv3.h to this patch to fix compilation issue
> - compilation allowed
> - fix sbus allocation in smmu_init_pci_iommu
> - restructure code into headers
> - misc cleanups
> ---
>  hw/arm/Makefile.objs     |   2 +-
>  hw/arm/smmuv3-internal.h | 201 +++++++++++++++++++++++++++++++++++++++
>  hw/arm/smmuv3.c          | 239 +++++++++++++++++++++++++++++++++++++++++++++++
>  hw/arm/trace-events      |   3 +
>  include/hw/arm/smmuv3.h  |  79 ++++++++++++++++
>  5 files changed, 523 insertions(+), 1 deletion(-)
>  create mode 100644 hw/arm/smmuv3-internal.h
>  create mode 100644 hw/arm/smmuv3.c
>  create mode 100644 include/hw/arm/smmuv3.h
>
> diff --git a/hw/arm/Makefile.objs b/hw/arm/Makefile.objs
> index 5b2d38d..a7c808b 100644
> --- a/hw/arm/Makefile.objs
> +++ b/hw/arm/Makefile.objs
> @@ -19,4 +19,4 @@ obj-$(CONFIG_FSL_IMX31) += fsl-imx31.o kzm.o
>  obj-$(CONFIG_FSL_IMX6) += fsl-imx6.o sabrelite.o
>  obj-$(CONFIG_ASPEED_SOC) += aspeed_soc.o aspeed.o
>  obj-$(CONFIG_MPS2) += mps2.o
> -obj-$(CONFIG_ARM_SMMUV3) += smmu-common.o
> +obj-$(CONFIG_ARM_SMMUV3) += smmu-common.o smmuv3.o
> diff --git a/hw/arm/smmuv3-internal.h b/hw/arm/smmuv3-internal.h
> new file mode 100644
> index 0000000..488acc8
> --- /dev/null
> +++ b/hw/arm/smmuv3-internal.h
> @@ -0,0 +1,201 @@
> +/*
> + * ARM SMMUv3 support - Internal API
> + *
> + * Copyright (C) 2014-2016 Broadcom Corporation
> + * Copyright (c) 2017 Red Hat, Inc.
> + * Written by Prem Mallappa, Eric Auger
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation, either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License along
> + * with this program; if not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#ifndef HW_ARM_SMMU_V3_INTERNAL_H
> +#define HW_ARM_SMMU_V3_INTERNAL_H
> +
> +#include "trace.h"
> +#include "qemu/error-report.h"
> +#include "hw/arm/smmu-common.h"
> +
> +/*****************************
> + * MMIO Register
> + *****************************/
> +enum {
> +    SMMU_REG_IDR0            = 0x0,
> +
> +/* IDR0 Field Values and supported features */
> +
> +#define SMMU_IDR0_S2P      1  /* stage 2 */
> +#define SMMU_IDR0_S1P      1  /* stage 1 */
> +#define SMMU_IDR0_TTF      2  /* Aarch64 only - not Aarch32 (LPAE) */

Two capital As in AArch32 and AArch64.

> +#define SMMU_IDR0_COHACC   1  /* IO coherent access */
> +#define SMMU_IDR0_HTTU     2  /* Access and Dirty flag update */
> +#define SMMU_IDR0_HYP      0  /* Hypervisor Stage 1 contexts */
> +#define SMMU_IDR0_ATS      0  /* PCIe RC ATS */
> +#define SMMU_IDR0_ASID16   1  /* 16-bit ASID */
> +#define SMMU_IDR0_PRI      0  /* Page Request Interface */
> +#define SMMU_IDR0_VMID16   0  /* 16-bit VMID */
> +#define SMMU_IDR0_CD2L     0  /* 2-level Context Descriptor table */
> +#define SMMU_IDR0_STALL    1  /* Stalling fault model */
> +#define SMMU_IDR0_TERM     1  /* Termination model behaviour */
> +#define SMMU_IDR0_STLEVEL  1  /* Multi-level Stream Table */
> +
> +#define SMMU_IDR0_S2P_SHIFT      0
> +#define SMMU_IDR0_S1P_SHIFT      1
> +#define SMMU_IDR0_TTF_SHIFT      2
> +#define SMMU_IDR0_COHACC_SHIFT   4
> +#define SMMU_IDR0_HTTU_SHIFT     6
> +#define SMMU_IDR0_HYP_SHIFT      9
> +#define SMMU_IDR0_ATS_SHIFT      10
> +#define SMMU_IDR0_ASID16_SHIFT   12
> +#define SMMU_IDR0_PRI_SHIFT      16
> +#define SMMU_IDR0_VMID16_SHIFT   18
> +#define SMMU_IDR0_CD2L_SHIFT     19
> +#define SMMU_IDR0_STALL_SHIFT    24
> +#define SMMU_IDR0_TERM_SHIFT     26
> +#define SMMU_IDR0_STLEVEL_SHIFT  27

Optional, but you might look at whether you like the FIELD()
macro in include/hw/registerfields.h for defining shift and
mask constants.
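
To give an idea of what that buys you, here is a rough standalone
sketch. DEMO_FIELD and the demo_* helpers are simplified stand-ins
written for this mail, not the real macros from registerfields.h, and
the field widths are my reading of the IDR0 layout, so check them
against the spec:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Simplified stand-in for a FIELD()-style macro: one line per field
 * yields the shift, length and mask constants together.
 * (Hypothetical names, for illustration only.)
 */
#define DEMO_FIELD(reg, field, shift, length)                          \
    enum { reg##_##field##_SHIFT = (shift) };                          \
    enum { reg##_##field##_LENGTH = (length) };                        \
    enum { reg##_##field##_MASK = ((1u << (length)) - 1) << (shift) };

/* Field widths are assumptions; only the shifts come from the patch. */
DEMO_FIELD(IDR0, S2P,    0, 1)
DEMO_FIELD(IDR0, S1P,    1, 1)
DEMO_FIELD(IDR0, TTF,    2, 2)
DEMO_FIELD(IDR0, COHACC, 4, 1)
DEMO_FIELD(IDR0, HTTU,   6, 2)

/* Deposit @val into the field described by @shift/@length. */
static inline uint32_t demo_field_dp32(uint32_t reg, unsigned shift,
                                       unsigned length, uint32_t val)
{
    uint32_t mask;

    assert(length > 0 && length < 32 && shift + length <= 32);
    mask = ((1u << length) - 1) << shift;
    return (reg & ~mask) | ((val << shift) & mask);
}

/* Extract the field described by @shift/@length. */
static inline uint32_t demo_field_ex32(uint32_t reg, unsigned shift,
                                       unsigned length)
{
    assert(length > 0 && length < 32 && shift + length <= 32);
    return (reg >> shift) & ((1u << length) - 1);
}

/* Build the low byte of IDR0 with the same values the patch uses. */
uint32_t demo_build_idr0(void)
{
    uint32_t v = 0;

    v = demo_field_dp32(v, IDR0_S2P_SHIFT, IDR0_S2P_LENGTH, 1);
    v = demo_field_dp32(v, IDR0_S1P_SHIFT, IDR0_S1P_LENGTH, 1);
    v = demo_field_dp32(v, IDR0_TTF_SHIFT, IDR0_TTF_LENGTH, 2);
    v = demo_field_dp32(v, IDR0_COHACC_SHIFT, IDR0_COHACC_LENGTH, 1);
    v = demo_field_dp32(v, IDR0_HTTU_SHIFT, IDR0_HTTU_LENGTH, 2);
    return v;
}
```

The point is that the shift and the width of a field are declared in
one place, so a value that does not fit its field gets masked rather
than silently spilling into the neighbouring field.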

> +
> +    SMMU_REG_IDR1            = 0x4,
> +#define SMMU_IDR1_SIDSIZE 16
> +    SMMU_REG_IDR2            = 0x8,
> +    SMMU_REG_IDR3            = 0xc,
> +    SMMU_REG_IDR4            = 0x10,
> +    SMMU_REG_IDR5            = 0x14,
> +#define SMMU_IDR5_GRAN_SHIFT 4
> +#define SMMU_IDR5_GRAN       0b101 /* GRAN4K, GRAN64K */
> +#define SMMU_IDR5_OAS        4     /* 44 bits */
> +    SMMU_REG_IIDR            = 0x1c,
> +    SMMU_REG_CR0             = 0x20,
> +
> +#define SMMU_CR0_SMMU_ENABLE (1 << 0)
> +#define SMMU_CR0_PRIQ_ENABLE (1 << 1)
> +#define SMMU_CR0_EVTQ_ENABLE (1 << 2)
> +#define SMMU_CR0_CMDQ_ENABLE (1 << 3)
> +#define SMMU_CR0_ATS_CHECK   (1 << 4)
> +
> +    SMMU_REG_CR0_ACK         = 0x24,
> +    SMMU_REG_CR1             = 0x28,
> +    SMMU_REG_CR2             = 0x2c,
> +
> +    SMMU_REG_STATUSR         = 0x40,
> +
> +    SMMU_REG_IRQ_CTRL        = 0x50,
> +    SMMU_REG_IRQ_CTRL_ACK    = 0x54,
> +
> +#define SMMU_IRQ_CTRL_GERROR_EN (1 << 0)
> +#define SMMU_IRQ_CTRL_EVENT_EN  (1 << 1)
> +#define SMMU_IRQ_CTRL_PRI_EN    (1 << 2)
> +
> +    SMMU_REG_GERROR          = 0x60,
> +
> +#define SMMU_GERROR_CMDQ           (1 << 0)
> +#define SMMU_GERROR_EVENTQ_ABT     (1 << 2)
> +#define SMMU_GERROR_PRIQ_ABT       (1 << 3)
> +#define SMMU_GERROR_MSI_CMDQ_ABT   (1 << 4)
> +#define SMMU_GERROR_MSI_EVENTQ_ABT (1 << 5)
> +#define SMMU_GERROR_MSI_PRIQ_ABT   (1 << 6)
> +#define SMMU_GERROR_MSI_GERROR_ABT (1 << 7)
> +#define SMMU_GERROR_SFM_ERR        (1 << 8)
> +
> +    SMMU_REG_GERRORN         = 0x64,
> +    SMMU_REG_GERROR_IRQ_CFG0 = 0x68,
> +    SMMU_REG_GERROR_IRQ_CFG1 = 0x70,
> +    SMMU_REG_GERROR_IRQ_CFG2 = 0x74,
> +
> +    /* SMMU_BASE_RA Applies to STRTAB_BASE, CMDQ_BASE and EVTQ_BASE */
> +#define SMMU_BASE_RA        (1ULL << 62)
> +    SMMU_REG_STRTAB_BASE     = 0x80,
> +    SMMU_REG_STRTAB_BASE_CFG = 0x88,
> +
> +    SMMU_REG_CMDQ_BASE       = 0x90,
> +    SMMU_REG_CMDQ_PROD       = 0x98,
> +    SMMU_REG_CMDQ_CONS       = 0x9c,
> +    /* CMD Consumer (CONS) */
> +#define SMMU_CMD_CONS_ERR_SHIFT        24
> +#define SMMU_CMD_CONS_ERR_BITS         7
> +
> +    SMMU_REG_EVTQ_BASE       = 0xa0,
> +    SMMU_REG_EVTQ_PROD       = 0xa8,
> +    SMMU_REG_EVTQ_CONS       = 0xac,
> +    SMMU_REG_EVTQ_IRQ_CFG0   = 0xb0,
> +    SMMU_REG_EVTQ_IRQ_CFG1   = 0xb8,
> +    SMMU_REG_EVTQ_IRQ_CFG2   = 0xbc,
> +
> +    SMMU_REG_PRIQ_BASE       = 0xc0,
> +    SMMU_REG_PRIQ_PROD       = 0xc8,
> +    SMMU_REG_PRIQ_CONS       = 0xcc,
> +    SMMU_REG_PRIQ_IRQ_CFG0   = 0xd0,
> +    SMMU_REG_PRIQ_IRQ_CFG1   = 0xd8,
> +    SMMU_REG_PRIQ_IRQ_CFG2   = 0xdc,
> +
> +    SMMU_ID_REGS_OFFSET      = 0xfd0,
> +
> +    /* Secure registers are not used for now */
> +    SMMU_SECURE_OFFSET       = 0x8000,
> +};
> +
> +/**********************
> + * Data Structures
> + **********************/
> +
> +struct __smmu_data2 {
> +    uint32_t word[2];
> +};

Don't use __ prefixes -- identifiers starting with a double
underscore are reserved for the C implementation (compiler and
system headers). These structures also don't look very useful
anyway.

> +
> +struct __smmu_data8 {
> +    uint32_t word[8];
> +};
> +
> +struct __smmu_data16 {
> +    uint32_t word[16];
> +};
> +
> +struct __smmu_data4 {
> +    uint32_t word[4];
> +};
> +
> +typedef struct __smmu_data4  Cmd; /* Command Entry */
> +typedef struct __smmu_data8  Evt; /* Event Entry */
> +
> +/*****************************
> + *  Register Access Primitives
> + *****************************/
> +
> +static inline void smmu_write32_reg(SMMUV3State *s, uint32_t addr, uint32_t val)
> +{
> +    s->regs[addr >> 2] = val;
> +}
> +
> +static inline void smmu_write64_reg(SMMUV3State *s, uint32_t addr, uint64_t val)
> +{
> +    addr >>= 2;
> +    s->regs[addr] = extract64(val, 0, 32);
> +    s->regs[addr + 1] = extract64(val, 32, 32);
> +}
> +
> +static inline uint32_t smmu_read32_reg(SMMUV3State *s, uint32_t addr)
> +{
> +    return s->regs[addr >> 2];
> +}
> +
> +static inline uint64_t smmu_read64_reg(SMMUV3State *s, uint32_t addr)
> +{
> +    addr >>= 2;
> +    return s->regs[addr] | ((uint64_t)(s->regs[addr + 1]) << 32);
> +}

This kind of thing is why I'm not a fan of implementing device
register state as an array.

> +
> +static inline int smmu_enabled(SMMUV3State *s)
> +{
> +    return smmu_read32_reg(s, SMMU_REG_CR0) & SMMU_CR0_SMMU_ENABLE;
> +}
> +
> +#endif
> diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
> new file mode 100644
> index 0000000..0a7cd1c
> --- /dev/null
> +++ b/hw/arm/smmuv3.c
> @@ -0,0 +1,239 @@
> +/*
> + * Copyright (C) 2014-2016 Broadcom Corporation
> + * Copyright (c) 2017 Red Hat, Inc.
> + * Written by Prem Mallappa, Eric Auger
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation, either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License along
> + * with this program; if not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include "qemu/osdep.h"
> +#include "hw/boards.h"
> +#include "sysemu/sysemu.h"
> +#include "hw/sysbus.h"
> +#include "hw/pci/pci.h"
> +#include "exec/address-spaces.h"
> +#include "trace.h"
> +#include "qemu/error-report.h"
> +
> +#include "hw/arm/smmuv3.h"
> +#include "smmuv3-internal.h"
> +
> +static void smmuv3_init_regs(SMMUV3State *s)
> +{
> +    uint32_t data =
> +        SMMU_IDR0_STLEVEL << SMMU_IDR0_STLEVEL_SHIFT |
> +        SMMU_IDR0_TERM    << SMMU_IDR0_TERM_SHIFT    |
> +        SMMU_IDR0_STALL   << SMMU_IDR0_STALL_SHIFT   |
> +        SMMU_IDR0_VMID16  << SMMU_IDR0_VMID16_SHIFT  |
> +        SMMU_IDR0_PRI     << SMMU_IDR0_PRI_SHIFT     |
> +        SMMU_IDR0_ASID16  << SMMU_IDR0_ASID16_SHIFT  |
> +        SMMU_IDR0_ATS     << SMMU_IDR0_ATS_SHIFT     |
> +        SMMU_IDR0_HYP     << SMMU_IDR0_HYP_SHIFT     |
> +        SMMU_IDR0_HTTU    << SMMU_IDR0_HTTU_SHIFT    |
> +        SMMU_IDR0_COHACC  << SMMU_IDR0_COHACC_SHIFT  |
> +        SMMU_IDR0_TTF     << SMMU_IDR0_TTF_SHIFT     |
> +        SMMU_IDR0_S1P     << SMMU_IDR0_S1P_SHIFT     |
> +        SMMU_IDR0_S2P     << SMMU_IDR0_S2P_SHIFT;
> +
> +    smmu_write32_reg(s, SMMU_REG_IDR0, data);
> +
> +#define SMMU_QUEUE_SIZE_LOG2  19
> +    data =
> +        1 << 27 |                    /* Attr Types override */
> +        SMMU_QUEUE_SIZE_LOG2 << 21 | /* Cmd Q size */
> +        SMMU_QUEUE_SIZE_LOG2 << 16 | /* Event Q size */
> +        SMMU_QUEUE_SIZE_LOG2 << 11 | /* PRI Q size */
> +        0  << 6 |                    /* SSID not supported */
> +        SMMU_IDR1_SIDSIZE;
> +
> +    smmu_write32_reg(s, SMMU_REG_IDR1, data);
> +
> +    s->sid_size = SMMU_IDR1_SIDSIZE;
> +
> +    data = SMMU_IDR5_GRAN << SMMU_IDR5_GRAN_SHIFT | SMMU_IDR5_OAS;
> +
> +    smmu_write32_reg(s, SMMU_REG_IDR5, data);
> +}
> +
> +static void smmuv3_init_queues(SMMUV3State *s)
> +{
> +    s->cmdq.prod = 0;
> +    s->cmdq.cons = 0;
> +    s->cmdq.wrap.prod = 0;
> +    s->cmdq.wrap.cons = 0;
> +
> +    s->evtq.prod = 0;
> +    s->evtq.cons = 0;
> +    s->evtq.wrap.prod = 0;
> +    s->evtq.wrap.cons = 0;
> +
> +    s->cmdq.entries = SMMU_QUEUE_SIZE_LOG2;
> +    s->cmdq.ent_size = sizeof(Cmd);
> +    s->evtq.entries = SMMU_QUEUE_SIZE_LOG2;
> +    s->evtq.ent_size = sizeof(Evt);
> +}
> +
> +static void smmuv3_init(SMMUV3State *s)
> +{
> +    smmuv3_init_regs(s);
> +    smmuv3_init_queues(s);
> +}
> +
> +static inline void smmu_update_base_reg(SMMUV3State *s, uint64_t *base,
> +                                        uint64_t val)
> +{
> +    *base = val & ~(SMMU_BASE_RA | 0x3fULL);
> +}
> +
> +static void smmu_write_mmio_fixup(SMMUV3State *s, hwaddr *addr)
> +{
> +    switch (*addr) {
> +    case 0x100a8: case 0x100ac:         /* Aliasing => page0 registers */
> +    case 0x100c8: case 0x100cc:
> +        *addr ^= (hwaddr)0x10000;
> +    }

Maybe we should just take advantage of the CONSTRAINED UNPREDICTABLE
choice to have page0 and page1 be exact aliases, and have
   addr &= ~0x10000;
unconditionally?
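
Roughly like this, as a standalone sketch. The demo_* names are made
up for this mail; the 0x20000 region size is the one the patch passes
to memory_region_init_io():

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_SMMU_REGION_SIZE 0x20000u  /* page 0 + page 1, 64 KiB each */
#define DEMO_SMMU_PAGE1_BIT   0x10000u  /* set for accesses to page 1 */

/*
 * Treat page 1 as an exact alias of page 0 by dropping the page
 * select bit from every offset, instead of listing the aliased
 * registers one by one.
 */
static inline uint64_t demo_fold_page1(uint64_t addr)
{
    assert(addr < DEMO_SMMU_REGION_SIZE);
    return addr & ~(uint64_t)DEMO_SMMU_PAGE1_BIT;
}
```

With that, 0x100a8 and 0x100cc decode as 0xa8 and 0xcc exactly as in
the switch above, and page 0 offsets pass through unchanged.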

> +}
> +
> +static void smmu_write_mmio(void *opaque, hwaddr addr,
> +                            uint64_t val, unsigned size)
> +{
> +}
> +
> +static uint64_t smmu_read_mmio(void *opaque, hwaddr addr, unsigned size)
> +{
> +    SMMUState *sys = opaque;
> +    SMMUV3State *s = SMMU_V3_DEV(sys);
> +    uint64_t val;
> +
> +    smmu_write_mmio_fixup(s, &addr);
> +
> +    /* Primecell/Corelink ID registers */
> +    switch (addr) {
> +    case 0xFF0 ... 0xFFC:
> +    case 0xFDC ... 0xFE4:
> +        val = 0;

Section "6.3.72 ID_REGS" in the spec defines what these registers
should read as, and it's not all-zeroes. We can claim to be an ARM
implementation, as we do with the GIC.

> +        error_report("addr:0x%"PRIx64" val:0x%"PRIx64, addr, val);

error_report() when the guest reads the ID regs?? That is ordinary
guest behaviour, not an error.


> +        break;
> +    case SMMU_REG_STRTAB_BASE ... SMMU_REG_CMDQ_BASE:
> +    case SMMU_REG_EVTQ_BASE:
> +    case SMMU_REG_PRIQ_BASE ... SMMU_REG_PRIQ_IRQ_CFG1:
> +        val = smmu_read64_reg(s, addr);
> +        break;
> +    default:
> +        val = (uint64_t)smmu_read32_reg(s, addr);
> +        break;
> +    }
> +
> +    trace_smmuv3_read_mmio(addr, val, size);
> +    return val;
> +}
> +
> +static const MemoryRegionOps smmu_mem_ops = {
> +    .read = smmu_read_mmio,
> +    .write = smmu_write_mmio,
> +    .endianness = DEVICE_LITTLE_ENDIAN,
> +    .valid = {
> +        .min_access_size = 4,
> +        .max_access_size = 8,
> +    },
> +    .impl = {
> +        .min_access_size = 4,
> +        .max_access_size = 8,
> +    },
> +};
> +
> +static void smmu_init_irq(SMMUV3State *s, SysBusDevice *dev)
> +{
> +    int i;
> +
> +    for (i = 0; i < ARRAY_SIZE(s->irq); i++) {
> +        sysbus_init_irq(dev, &s->irq[i]);
> +    }
> +}
> +
> +static void smmu_reset(DeviceState *dev)
> +{
> +    SMMUV3State *s = SMMU_V3_DEV(dev);
> +    smmuv3_init(s);
> +}
> +
> +static void smmu_realize(DeviceState *d, Error **errp)
> +{
> +    SMMUState *sys = SMMU_SYS_DEV(d);
> +    SMMUV3State *s = SMMU_V3_DEV(sys);
> +    SysBusDevice *dev = SYS_BUS_DEVICE(d);
> +
> +    memory_region_init_io(&sys->iomem, OBJECT(s),
> +                          &smmu_mem_ops, sys, TYPE_SMMU_V3_DEV, 0x20000);
> +
> +    sys->mrtypename = g_strdup(TYPE_SMMUV3_IOMMU_MEMORY_REGION);
> +
> +    sysbus_init_mmio(dev, &sys->iomem);
> +
> +    smmu_init_irq(s, dev);
> +}
> +
> +static const VMStateDescription vmstate_smmuv3 = {
> +    .name = "smmuv3",
> +    .version_id = 1,
> +    .minimum_version_id = 1,
> +    .fields = (VMStateField[]) {
> +        VMSTATE_UINT32_ARRAY(regs, SMMUV3State, SMMU_NREGS),
> +        VMSTATE_END_OF_LIST(),
> +    },
> +};
> +
> +static void smmuv3_instance_init(Object *obj)
> +{
> +    /* Nothing much to do here as of now */
> +}
> +
> +static void smmuv3_class_init(ObjectClass *klass, void *data)
> +{
> +    DeviceClass *dc = DEVICE_CLASS(klass);
> +
> +    dc->reset   = smmu_reset;
> +    dc->vmsd    = &vmstate_smmuv3;
> +    dc->realize = smmu_realize;
> +}
> +
> +static void smmuv3_iommu_memory_region_class_init(ObjectClass *klass,
> +                                                  void *data)
> +{
> +}
> +
> +static const TypeInfo smmuv3_type_info = {
> +    .name          = TYPE_SMMU_V3_DEV,
> +    .parent        = TYPE_SMMU_DEV_BASE,
> +    .instance_size = sizeof(SMMUV3State),
> +    .instance_init = smmuv3_instance_init,
> +    .class_data    = NULL,

What is this for? There's no need to set .class_data to NULL
explicitly.

> +    .class_size    = sizeof(SMMUV3Class),
> +    .class_init    = smmuv3_class_init,
> +};
> +
> +static const TypeInfo smmuv3_iommu_memory_region_info = {
> +    .parent = TYPE_IOMMU_MEMORY_REGION,
> +    .name = TYPE_SMMUV3_IOMMU_MEMORY_REGION,
> +    .class_init = smmuv3_iommu_memory_region_class_init,
> +};
> +
> +static void smmuv3_register_types(void)
> +{
> +    type_register(&smmuv3_type_info);
> +    type_register(&smmuv3_iommu_memory_region_info);
> +}
> +
> +type_init(smmuv3_register_types)
> +
> diff --git a/hw/arm/trace-events b/hw/arm/trace-events
> index c67cd39..8affbf7 100644
> --- a/hw/arm/trace-events
> +++ b/hw/arm/trace-events
> @@ -14,3 +14,6 @@ smmu_page_walk_level_block_pte(int stage, int level, uint64_t baseaddr, uint64_t
>  smmu_page_walk_level_table_pte(int stage, int level, uint64_t baseaddr, uint64_t pteaddr, uint64_t pte, uint64_t address) "stage=%d, level=%d base@=0x%"PRIx64" pte@=0x%"PRIx64" pte=0x%"PRIx64" next table address = 0x%"PRIx64
>  smmu_get_pte(uint64_t baseaddr, int index, uint64_t pteaddr, uint64_t pte) "baseaddr=0x%"PRIx64" index=0x%x, pteaddr=0x%"PRIx64", pte=0x%"PRIx64
>  smmu_set_translated_address(hwaddr iova, hwaddr pa) "iova = 0x%"PRIx64" -> pa = 0x%"PRIx64
> +
> +#hw/arm/smmuv3.c
> +smmuv3_read_mmio(hwaddr addr, uint64_t val, unsigned size) "addr: 0x%"PRIx64" val:0x%"PRIx64" size: 0x%x"
> diff --git a/include/hw/arm/smmuv3.h b/include/hw/arm/smmuv3.h
> new file mode 100644
> index 0000000..0c8973d
> --- /dev/null
> +++ b/include/hw/arm/smmuv3.h
> @@ -0,0 +1,79 @@
> +/*
> + * Copyright (C) 2014-2016 Broadcom Corporation
> + * Copyright (c) 2017 Red Hat, Inc.
> + * Written by Prem Mallappa, Eric Auger
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation, either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License along
> + * with this program; if not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#ifndef HW_ARM_SMMUV3_H
> +#define HW_ARM_SMMUV3_H
> +
> +#include "hw/arm/smmu-common.h"
> +
> +#define TYPE_SMMUV3_IOMMU_MEMORY_REGION "smmuv3-iommu-memory-region"
> +
> +#define SMMU_NREGS            0x200
> +
> +typedef struct SMMUQueue {
> +     hwaddr base;
> +     uint32_t prod;
> +     uint32_t cons;
> +     union {
> +          struct {
> +               uint8_t prod:1;
> +               uint8_t cons:1;
> +          };
> +          uint8_t unused;
> +     } wrap;

This is an inefficient way to represent the prod/cons registers.
Those are carefully arranged so that the wrap bit is just above
the queue index, so that you can implement the wrap bit toggling
by considering the register as an N+1 width integer for
increment purposes, but an N width register for indexing
into the queue. For QEMU we should just have a uint32_t prod;
and then we can always increment (including wrap bit handling)
with 'prod++', we can get the index into the queue with
'prod & some_mask', and the register read/write would be
handled using a mask that's 1 bit wider than some_mask (plus
any other high bits in the same register).

This also means you don't need weird special casing to
handle the architected behaviour for what happens to the
value in a queue register when the guest changes the queue
size (see eg the spec description of SMMU_EVENTQ_CONS in
6.3.29).
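
Something along these lines, as a standalone sketch. DemoQueue and
the demo_* helpers are names invented for this mail, and this version
stores only the index and wrap bits, so it masks on increment rather
than on register access:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct DemoQueue {
    uint32_t prod;      /* index in bits [log2size-1:0], wrap bit above */
    uint32_t cons;      /* same layout as prod */
    uint8_t  log2size;  /* queue holds 1 << log2size entries */
} DemoQueue;

/* Mask covering the N index bits. */
static inline uint32_t demo_index_mask(const DemoQueue *q)
{
    assert(q->log2size < 31);
    return (1u << q->log2size) - 1;
}

/* Mask covering the N index bits plus the wrap bit. */
static inline uint32_t demo_wrap_mask(const DemoQueue *q)
{
    assert(q->log2size < 31);
    return (1u << (q->log2size + 1)) - 1;
}

/* Index into the queue: the wrap bit is simply masked away. */
static inline uint32_t demo_prod_index(const DemoQueue *q)
{
    return q->prod & demo_index_mask(q);
}

/* A plain increment toggles the wrap bit when the index overflows. */
static inline void demo_prod_incr(DemoQueue *q)
{
    q->prod = (q->prod + 1) & demo_wrap_mask(q);
}

static inline void demo_cons_incr(DemoQueue *q)
{
    q->cons = (q->cons + 1) & demo_wrap_mask(q);
}

/* Empty: same index and same wrap bit. */
static inline bool demo_q_empty(const DemoQueue *q)
{
    return ((q->prod ^ q->cons) & demo_wrap_mask(q)) == 0;
}

/* Full: same index but different wrap bit. */
static inline bool demo_q_full(const DemoQueue *q)
{
    return ((q->prod ^ q->cons) & demo_wrap_mask(q)) == (1u << q->log2size);
}
```

No separate wrap bitfields are needed, and the full/empty tests
become a single XOR of the two counters.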

> +
> +     uint16_t entries;           /* Number of entries */
> +     uint8_t  ent_size;          /* Size of entry in bytes */
> +     uint8_t  shift;             /* Size in log2 */
> +} SMMUQueue;
> +
> +typedef struct SMMUV3State {
> +    SMMUState     smmu_state;
> +
> +    /* Local cache of most-frequently used registers */
> +#define SMMU_FEATURE_2LVL_STE (1 << 0)
> +    uint32_t     features;
> +    uint16_t     sid_size;
> +    uint16_t     sid_split;
> +    uint64_t     strtab_base;
> +
> +    uint32_t    regs[SMMU_NREGS];
> +
> +    qemu_irq     irq[4];
> +    SMMUQueue    cmdq, evtq;

This data structure setup means you have some register state
you're keeping in potentially two places: regs[X] and also
in fields in SMMUQueue. This is awkward for vmstate save/restore
and it doesn't look like you quite get it right. You can either:
 * only save/restore the regs[] in vmstate, treating those as
   the canonical source of data, and have a post-load hook to
   reload the info into the SMMUQueue fields
 * don't have info in regs[] for queue registers, instead keeping
   the data canonically in the SMMUQueue fields, and have
   vmstate fields for migrating the SMMUQueue fields

> +
> +} SMMUV3State;
> +
> +typedef enum {
> +    SMMU_IRQ_EVTQ,
> +    SMMU_IRQ_PRIQ,
> +    SMMU_IRQ_CMD_SYNC,
> +    SMMU_IRQ_GERROR,
> +} SMMUIrq;
> +
> +typedef struct {
> +    SMMUBaseClass smmu_base_class;
> +} SMMUV3Class;
> +
> +#define TYPE_SMMU_V3_DEV   "smmuv3"
> +#define SMMU_V3_DEV(obj) OBJECT_CHECK(SMMUV3State, (obj), TYPE_SMMU_V3_DEV)
> +#define SMMU_V3_DEVICE_GET_CLASS(obj)                              \
> +    OBJECT_GET_CLASS(SMMUBaseClass, (obj), TYPE_SMMU_V3_DEV)
> +
> +#endif
> --
> 2.5.5
>

thanks
-- PMM

* Re: [Qemu-devel] [PATCH v7 06/20] hw/arm/smmuv3: Wired IRQ and GERROR helpers
  2017-09-01 17:21 ` [Qemu-devel] [PATCH v7 06/20] hw/arm/smmuv3: Wired IRQ and GERROR helpers Eric Auger
@ 2017-10-09 17:01   ` Peter Maydell
  0 siblings, 0 replies; 72+ messages in thread
From: Peter Maydell @ 2017-10-09 17:01 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, qemu-arm, QEMU Developers, Prem Mallappa,
	Alex Williamson, Andrew Jones, Christoffer Dall,
	Radha.Chintakuntla, Sunil.Goutham, Radha Mohan, Trey Cain,
	Bharat Bhushan, Tomasz Nowicki, Michael S. Tsirkin, Will Deacon,
	jean-philippe.brucker, robin.murphy, Peter Xu, Edgar E. Iglesias,
	wtownsen

On 1 September 2017 at 18:21, Eric Auger <eric.auger@redhat.com> wrote:
> We introduce some helpers to handle wired IRQs and especially
> GERROR interrupt. SMMU writes GERROR register on GERROR event
> and SW acks GERROR interrupts by setting GERRORn.
>
> The Wired interrupts are edge sensitive.
>
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>
> ---
>
> Is CMD_SYNC interrupt enabled somewhere?
> ---
>  hw/arm/smmuv3-internal.h | 20 ++++++++++++++++++
>  hw/arm/smmuv3.c          | 55 ++++++++++++++++++++++++++++++++++++++++++++++++
>  hw/arm/trace-events      |  2 ++
>  3 files changed, 77 insertions(+)
>
> diff --git a/hw/arm/smmuv3-internal.h b/hw/arm/smmuv3-internal.h
> index 488acc8..2b44ee2 100644
> --- a/hw/arm/smmuv3-internal.h
> +++ b/hw/arm/smmuv3-internal.h
> @@ -198,4 +198,24 @@ static inline int smmu_enabled(SMMUV3State *s)
>      return smmu_read32_reg(s, SMMU_REG_CR0) & SMMU_CR0_SMMU_ENABLE;
>  }
>
> +/*****************************
> + * Interrupts
> + *****************************/
> +
> +#define smmu_evt_irq_enabled(s)                   \
> +    (smmu_read64_reg(s, SMMU_REG_IRQ_CTRL) & SMMU_IRQ_CTRL_EVENT_EN)
> +#define smmu_gerror_irq_enabled(s)                  \
> +    (smmu_read64_reg(s, SMMU_REG_IRQ_CTRL) & SMMU_IRQ_CTRL_GERROR_EN)
> +#define smmu_pri_irq_enabled(s)                 \
> +    (smmu_read64_reg(s, SMMU_REG_IRQ_CTRL) & SMMU_IRQ_CTRL_PRI_EN)
> +
> +#define SMMU_PENDING_GERRORS(s) \
> +    (smmu_read32_reg(s, SMMU_REG_GERROR) ^ \
> +     smmu_read32_reg(s, SMMU_REG_GERRORN))

Hiding this XOR inside a macro is very confusing.

> +
> +#define SMMU_CMDQ_ERR(s) (SMMU_PENDING_GERRORS(s) & SMMU_GERROR_CMDQ)
> +
> +void smmuv3_irq_trigger(SMMUV3State *s, SMMUIrq irq, uint32_t gerror_val);
> +void smmuv3_write_gerrorn(SMMUV3State *s, uint32_t gerrorn);
> +
>  #endif
> diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
> index 0a7cd1c..468134f 100644
> --- a/hw/arm/smmuv3.c
> +++ b/hw/arm/smmuv3.c
> @@ -29,6 +29,61 @@
>  #include "hw/arm/smmuv3.h"
>  #include "smmuv3-internal.h"
>
> +/**
> + * smmuv3_irq_trigger - pulse @irq if enabled and update
> + * GERROR register in case of GERROR interrupt
> + *
> + * @irq: irq type
> + * @gerror: gerror new value, only relevant if @irq is GERROR
> + */
> +void smmuv3_irq_trigger(SMMUV3State *s, SMMUIrq irq, uint32_t gerror_val)
> +{
> +    uint32_t pending_gerrors = SMMU_PENDING_GERRORS(s);
> +    bool pulse = false;
> +
> +    switch (irq) {
> +    case SMMU_IRQ_EVTQ:
> +        pulse = smmu_evt_irq_enabled(s);
> +        break;
> +    case SMMU_IRQ_PRIQ:
> +        pulse = smmu_pri_irq_enabled(s);
> +        break;
> +    case SMMU_IRQ_CMD_SYNC:
> +        pulse = true;
> +        break;
> +    case SMMU_IRQ_GERROR:
> +    {
> +        /* don't toggle an already pending error */
> +        bool new_gerrors = ~pending_gerrors & gerror_val;
> +        uint32_t gerror = smmu_read32_reg(s, SMMU_REG_GERROR);
> +
> +        smmu_write32_reg(s, SMMU_REG_GERROR, gerror | new_gerrors);

The SMMU toggles GERROR bits to indicate that they have been
set, it doesn't just set them to 1.
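
As a standalone sketch of the toggle protocol (DemoGerrorState and the
demo_* helpers are invented for this mail, they are not the patch's
API):

```c
#include <assert.h>
#include <stdint.h>

typedef struct DemoGerrorState {
    uint32_t gerror;    /* toggled by the SMMU to raise an error */
    uint32_t gerrorn;   /* toggled by software to acknowledge it */
} DemoGerrorState;

/* An error is pending while its GERROR and GERRORN bits differ. */
static inline uint32_t demo_pending(const DemoGerrorState *s)
{
    return s->gerror ^ s->gerrorn;
}

/*
 * Raise the errors in @errors that are not already pending by
 * toggling their GERROR bits, and return the set actually raised.
 * Note the uint32_t: a bool here would keep only "some bit was set".
 */
static inline uint32_t demo_raise_gerrors(DemoGerrorState *s,
                                          uint32_t errors)
{
    uint32_t new_errors = errors & ~demo_pending(s);

    s->gerror ^= new_errors;
    /* Every newly raised error must now read as pending. */
    assert((demo_pending(s) & new_errors) == new_errors);
    return new_errors;
}
```

The case where OR-ing goes wrong is the second occurrence of an error:
after the guest has acked once, both bits are 1, so OR-ing in a 1
changes nothing and the new error is never seen as pending. Toggling
takes GERROR back to 0, which differs from GERRORN again.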

> +
> +        /* pulse the GERROR irq only if all fields were acked */
> +        pulse = smmu_gerror_irq_enabled(s) && !pending_gerrors;
> +        break;
> +    }
> +    }
> +    if (pulse) {
> +            trace_smmuv3_irq_trigger(irq,
> +                                     smmu_read32_reg(s, SMMU_REG_GERROR),
> +                                     SMMU_PENDING_GERRORS(s));
> +            qemu_irq_pulse(s->irq[irq]);

qemu_irq_pulse() is very rarely the right thing for an interrupt
line, but it looks like it's right here (per spec 3.18.2).

> +    }
> +}
> +
> +void smmuv3_write_gerrorn(SMMUV3State *s, uint32_t gerrorn)
> +{
> +    uint32_t pending_gerrors = SMMU_PENDING_GERRORS(s);
> +    uint32_t sanitized;
> +
> +    /* Make sure SW does not toggle irqs that are not active */

If it does, this is CONSTRAINED UNPREDICTABLE. That is worth
remarking in either a comment or a LOG_GUEST_ERROR warning.

> +    sanitized = gerrorn & pending_gerrors;

This isn't right -- if you want to sanitize the result
being written then you need to make it only change bits
that are in pending_gerrors, but this expression allows
the guest to write a 0 to a field that is for a non-pending
error. (Or you could just not sanitize at all, that's allowed
by the CONSTRAINED UNPREDICTABLE.)
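
If you do want to sanitize, something like this standalone sketch
would do it (demo_sanitize_gerrorn is a name invented for this mail):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Compute the new GERRORN value for a guest write of @written:
 * bits for pending errors take the written value, all other bits
 * keep their old value whatever the guest wrote.
 */
static inline uint32_t demo_sanitize_gerrorn(uint32_t gerror,
                                             uint32_t old_gerrorn,
                                             uint32_t written)
{
    uint32_t pending = gerror ^ old_gerrorn;
    uint32_t result = (old_gerrorn & ~pending) | (written & pending);

    /* Bits for non-pending errors must be left untouched. */
    assert(((result ^ old_gerrorn) & ~pending) == 0);
    return result;
}
```

Take GERROR = 0x5 and GERRORN = 0x4, so only bit 0 is pending and
bit 2 was acked earlier. If the guest writes 0x1, 'gerrorn & pending'
stores 0x1 and bit 2 suddenly looks pending again; the version above
stores 0x5.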

> +
> +    smmu_write32_reg(s, SMMU_REG_GERRORN, sanitized);
> +    trace_smmuv3_write_gerrorn(gerrorn, sanitized, SMMU_PENDING_GERRORS(s));
> +}
> +
>  static void smmuv3_init_regs(SMMUV3State *s)
>  {
>      uint32_t data =
> diff --git a/hw/arm/trace-events b/hw/arm/trace-events
> index 8affbf7..c1ce8eb 100644
> --- a/hw/arm/trace-events
> +++ b/hw/arm/trace-events
> @@ -17,3 +17,5 @@ smmu_set_translated_address(hwaddr iova, hwaddr pa) "iova = 0x%"PRIx64" -> pa =
>
>  #hw/arm/smmuv3.c
>  smmuv3_read_mmio(hwaddr addr, uint64_t val, unsigned size) "addr: 0x%"PRIx64" val:0x%"PRIx64" size: 0x%x"
> +smmuv3_irq_trigger(int irq, uint32_t gerror, uint32_t pending) "irq=%d gerror=0x%x pending gerrors=0x%x"
> +smmuv3_write_gerrorn(uint32_t gerrorn, uint32_t sanitized, uint32_t pending) "gerrorn=0x%x sanitized=0x%x pending=0x%x"
> --
> 2.5.5

thanks
-- PMM

* Re: [Qemu-devel] [PATCH v7 07/20] hw/arm/smmuv3: Queue helpers
  2017-09-01 17:21 ` [Qemu-devel] [PATCH v7 07/20] hw/arm/smmuv3: Queue helpers Eric Auger
@ 2017-10-09 17:12   ` Peter Maydell
  0 siblings, 0 replies; 72+ messages in thread
From: Peter Maydell @ 2017-10-09 17:12 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, qemu-arm, QEMU Developers, Prem Mallappa,
	Alex Williamson, Andrew Jones, Christoffer Dall,
	Radha.Chintakuntla, Sunil.Goutham, Radha Mohan, Trey Cain,
	Bharat Bhushan, Tomasz Nowicki, Michael S. Tsirkin, Will Deacon,
	jean-philippe.brucker, robin.murphy, Peter Xu, Edgar E. Iglesias,
	wtownsen

On 1 September 2017 at 18:21, Eric Auger <eric.auger@redhat.com> wrote:
> We introduce helpers to read/write into the circular queues.
> smmuv3_read_cmdq and smmuv3_write_evtq will become static
> later on.
>
> Signed-off-by: Eric Auger <eric.auger@redhat.com>

See my comments on patch 05/20 (Skeleton), where I suggest a better
way to implement the queue increment/wrapping handling.

> +typedef enum {
> +    CMD_Q_EMPTY,
> +    CMD_Q_FULL,
> +    CMD_Q_PARTIALLY_FILLED,
> +} SMMUQStatus;
> +
> +#define Q_ENTRY(q, idx)  (q->base + q->ent_size * idx)
> +#define Q_WRAP(q, pc)    ((pc) >> (q)->shift)
> +#define Q_IDX(q, pc)     ((pc) & ((1 << (q)->shift) - 1))
> +
> +static inline SMMUQStatus __smmu_queue_status(SMMUV3State *s, SMMUQueue *q)

No __ prefixes, please.

> +{
> +    uint32_t prod = Q_IDX(q, q->prod);
> +    uint32_t cons = Q_IDX(q, q->cons);
> +
> +    if ((prod == cons) && (q->wrap.prod != q->wrap.cons)) {
> +        return CMD_Q_FULL;
> +    } else if ((prod == cons) && (q->wrap.prod == q->wrap.cons)) {
> +        return CMD_Q_EMPTY;
> +    }
> +    return CMD_Q_PARTIALLY_FILLED;
> +}
> +#define smmu_is_q_full(s, q) (__smmu_queue_status(s, q) == CMD_Q_FULL)
> +#define smmu_is_q_empty(s, q) (__smmu_queue_status(s, q) == CMD_Q_EMPTY)
> +
> +static inline int __smmu_q_enabled(SMMUV3State *s, uint32_t q)
> +{
> +    return smmu_read32_reg(s, SMMU_REG_CR0) & q;
> +}
> +#define smmu_cmd_q_enabled(s) __smmu_q_enabled(s, SMMU_CR0_CMDQ_ENABLE)
> +#define smmu_evt_q_enabled(s) __smmu_q_enabled(s, SMMU_CR0_EVTQ_ENABLE)

This code seems to be rather macro-happy; most of these wrappers
could be plain static inline functions.

thanks
-- PMM

* Re: [Qemu-devel] [PATCH v7 08/20] hw/arm/smmuv3: Implement MMIO write operations
  2017-09-01 17:21 ` [Qemu-devel] [PATCH v7 08/20] hw/arm/smmuv3: Implement MMIO write operations Eric Auger
@ 2017-10-09 17:17   ` Peter Maydell
  0 siblings, 0 replies; 72+ messages in thread
From: Peter Maydell @ 2017-10-09 17:17 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, qemu-arm, QEMU Developers, Prem Mallappa,
	Alex Williamson, Andrew Jones, Christoffer Dall,
	Radha.Chintakuntla, Sunil.Goutham, Radha Mohan, Trey Cain,
	Bharat Bhushan, Tomasz Nowicki, Michael S. Tsirkin, Will Deacon,
	jean-philippe.brucker, robin.murphy, Peter Xu, Edgar E. Iglesias,
	wtownsen

On 1 September 2017 at 18:21, Eric Auger <eric.auger@redhat.com> wrote:
> Now we have relevant helpers for queue and irq
> management, let's implement MMIO write operations
>
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> ---
>  hw/arm/smmuv3-internal.h | 103 +++++++++++++++++++++++-
>  hw/arm/smmuv3.c          | 204 ++++++++++++++++++++++++++++++++++++++++++++++-
>  hw/arm/trace-events      |  15 ++++
>  3 files changed, 317 insertions(+), 5 deletions(-)
>
> diff --git a/hw/arm/smmuv3-internal.h b/hw/arm/smmuv3-internal.h
> index d88f141..a5d60b4 100644
> --- a/hw/arm/smmuv3-internal.h
> +++ b/hw/arm/smmuv3-internal.h
> @@ -215,8 +215,6 @@ static inline int smmu_enabled(SMMUV3State *s)
>
>  #define SMMU_CMDQ_ERR(s) (SMMU_PENDING_GERRORS(s) & SMMU_GERROR_CMDQ)
>
> -void smmuv3_write_gerrorn(SMMUV3State *s, uint32_t gerrorn);
> -
>  /***************************
>   * Queue Handling
>   ***************************/
> @@ -261,7 +259,106 @@ static inline void smmu_write_cmdq_err(SMMUV3State *s, uint32_t err_type)
>                          regval | err_type << SMMU_CMD_CONS_ERR_SHIFT);
>  }
>
> -MemTxResult smmuv3_read_cmdq(SMMUV3State *s, Cmd *cmd);
>  void smmuv3_write_evtq(SMMUV3State *s, Evt *evt);
>
> +/*****************************
> + * Commands
> + *****************************/
> +
> +enum {
> +    SMMU_CMD_PREFETCH_CONFIG = 0x01,
> +    SMMU_CMD_PREFETCH_ADDR,
> +    SMMU_CMD_CFGI_STE,
> +    SMMU_CMD_CFGI_STE_RANGE,
> +    SMMU_CMD_CFGI_CD,
> +    SMMU_CMD_CFGI_CD_ALL,
> +    SMMU_CMD_CFGI_ALL,
> +    SMMU_CMD_TLBI_NH_ALL     = 0x10,
> +    SMMU_CMD_TLBI_NH_ASID,
> +    SMMU_CMD_TLBI_NH_VA,
> +    SMMU_CMD_TLBI_NH_VAA,
> +    SMMU_CMD_TLBI_EL3_ALL    = 0x18,
> +    SMMU_CMD_TLBI_EL3_VA     = 0x1a,
> +    SMMU_CMD_TLBI_EL2_ALL    = 0x20,
> +    SMMU_CMD_TLBI_EL2_ASID,
> +    SMMU_CMD_TLBI_EL2_VA,
> +    SMMU_CMD_TLBI_EL2_VAA,  /* 0x23 */
> +    SMMU_CMD_TLBI_S12_VMALL  = 0x28,
> +    SMMU_CMD_TLBI_S2_IPA     = 0x2a,
> +    SMMU_CMD_TLBI_NSNH_ALL   = 0x30,
> +    SMMU_CMD_ATC_INV         = 0x40,
> +    SMMU_CMD_PRI_RESP,
> +    SMMU_CMD_RESUME          = 0x44,
> +    SMMU_CMD_STALL_TERM,
> +    SMMU_CMD_SYNC,          /* 0x46 */
> +};
> +
> +static const char *cmd_stringify[] = {
> +    [SMMU_CMD_PREFETCH_CONFIG] = "SMMU_CMD_PREFETCH_CONFIG",
> +    [SMMU_CMD_PREFETCH_ADDR]   = "SMMU_CMD_PREFETCH_ADDR",
> +    [SMMU_CMD_CFGI_STE]        = "SMMU_CMD_CFGI_STE",
> +    [SMMU_CMD_CFGI_STE_RANGE]  = "SMMU_CMD_CFGI_STE_RANGE",
> +    [SMMU_CMD_CFGI_CD]         = "SMMU_CMD_CFGI_CD",
> +    [SMMU_CMD_CFGI_CD_ALL]     = "SMMU_CMD_CFGI_CD_ALL",
> +    [SMMU_CMD_CFGI_ALL]        = "SMMU_CMD_CFGI_ALL",
> +    [SMMU_CMD_TLBI_NH_ALL]     = "SMMU_CMD_TLBI_NH_ALL",
> +    [SMMU_CMD_TLBI_NH_ASID]    = "SMMU_CMD_TLBI_NH_ASID",
> +    [SMMU_CMD_TLBI_NH_VA]      = "SMMU_CMD_TLBI_NH_VA",
> +    [SMMU_CMD_TLBI_NH_VAA]     = "SMMU_CMD_TLBI_NH_VAA",
> +    [SMMU_CMD_TLBI_EL3_ALL]    = "SMMU_CMD_TLBI_EL3_ALL",
> +    [SMMU_CMD_TLBI_EL3_VA]     = "SMMU_CMD_TLBI_EL3_VA",
> +    [SMMU_CMD_TLBI_EL2_ALL]    = "SMMU_CMD_TLBI_EL2_ALL",
> +    [SMMU_CMD_TLBI_EL2_ASID]   = "SMMU_CMD_TLBI_EL2_ASID",
> +    [SMMU_CMD_TLBI_EL2_VA]     = "SMMU_CMD_TLBI_EL2_VA",
> +    [SMMU_CMD_TLBI_EL2_VAA]    = "SMMU_CMD_TLBI_EL2_VAA",
> +    [SMMU_CMD_TLBI_S12_VMALL]  = "SMMU_CMD_TLBI_S12_VMALL",
> +    [SMMU_CMD_TLBI_S2_IPA]     = "SMMU_CMD_TLBI_S2_IPA",
> +    [SMMU_CMD_TLBI_NSNH_ALL]   = "SMMU_CMD_TLBI_NSNH_ALL",
> +    [SMMU_CMD_ATC_INV]         = "SMMU_CMD_ATC_INV",
> +    [SMMU_CMD_PRI_RESP]        = "SMMU_CMD_PRI_RESP",
> +    [SMMU_CMD_RESUME]          = "SMMU_CMD_RESUME",
> +    [SMMU_CMD_STALL_TERM]      = "SMMU_CMD_STALL_TERM",
> +    [SMMU_CMD_SYNC]            = "SMMU_CMD_SYNC",
> +};
> +
> +/*****************************
> + * CMDQ fields
> + *****************************/
> +
> +typedef enum {
> +    SMMU_CERROR_NONE = 0,
> +    SMMU_CERROR_ILL,
> +    SMMU_CERROR_ABT,
> +    SMMU_CERROR_ATC_INV_SYNC,
> +} SMMUCmdError;
> +
> +enum { /* Command completion notification */
> +    CMD_SYNC_SIG_NONE,
> +    CMD_SYNC_SIG_IRQ,
> +    CMD_SYNC_SIG_SEV,
> +};
> +
> +#define CMD_TYPE(x)  extract32((x)->word[0], 0, 8)
> +#define CMD_SEC(x)   extract32((x)->word[0], 9, 1)
> +#define CMD_SEV(x)   extract32((x)->word[0], 10, 1)
> +#define CMD_AC(x)    extract32((x)->word[0], 12, 1)
> +#define CMD_AB(x)    extract32((x)->word[0], 13, 1)
> +#define CMD_CS(x)    extract32((x)->word[0], 12, 2)
> +#define CMD_SSID(x)  extract32((x)->word[0], 16, 16)
> +#define CMD_SID(x)   ((x)->word[1])
> +#define CMD_VMID(x)  extract32((x)->word[1], 0, 16)
> +#define CMD_ASID(x)  extract32((x)->word[1], 16, 16)
> +#define CMD_STAG(x)  extract32((x)->word[2], 0, 16)
> +#define CMD_RESP(x)  extract32((x)->word[2], 11, 2)
> +#define CMD_GRPID(x) extract32((x)->word[3], 0, 8)
> +#define CMD_SIZE(x)  extract32((x)->word[3], 0, 16)
> +#define CMD_LEAF(x)  extract32((x)->word[3], 0, 1)
> +#define CMD_SPAN(x)  extract32((x)->word[3], 0, 5)
> +#define CMD_ADDR(x) ({                                  \
> +            uint64_t addr = (uint64_t)(x)->word[3];     \
> +            addr <<= 32;                                \
> +            addr |=  extract32((x)->word[3], 12, 20);   \
> +            addr;                                       \
> +        })

This definitely seems to be reimplementing some of the registerfields.h
functionality.
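
For illustration, a sketch of how a few of the command fields might
look in the registerfields.h style. The FIELD() and FIELD_EX32()
definitions below are simplified stand-ins so that the example builds
outside the QEMU tree (the real ones live in hw/registerfields.h), and
the CMD_W0/CMD_W1 pseudo-register names and the cmd_* accessors are
assumptions made up for the example. Bit positions are taken from the
quoted CMD_* macros.

```c
#include <stdint.h>

/* Stand-in for QEMU's extract32(), so that the sketch compiles on its own. */
static inline uint32_t extract32(uint32_t value, int start, int length)
{
    return (value >> start) & (~0u >> (32 - length));
}

/*
 * Simplified stand-ins for FIELD() and FIELD_EX32(); the real macros
 * in hw/registerfields.h also define a mask constant.
 */
#define FIELD(reg, field, shift, length)                      \
    enum { R_ ## reg ## _ ## field ## _SHIFT = (shift),       \
           R_ ## reg ## _ ## field ## _LENGTH = (length) };
#define FIELD_EX32(storage, reg, field)                       \
    extract32((storage), R_ ## reg ## _ ## field ## _SHIFT,   \
              R_ ## reg ## _ ## field ## _LENGTH)

/* Hypothetical command entry: four 32-bit words. */
typedef struct Cmd {
    uint32_t word[4];
} Cmd;

/* Hypothetical names: one pseudo-register per command word. */
FIELD(CMD_W0, TYPE, 0, 8)
FIELD(CMD_W0, CS, 12, 2)
FIELD(CMD_W0, SSID, 16, 16)
FIELD(CMD_W1, VMID, 0, 16)
FIELD(CMD_W1, ASID, 16, 16)

static inline uint32_t cmd_type(const Cmd *c)
{
    return FIELD_EX32(c->word[0], CMD_W0, TYPE);
}

static inline uint32_t cmd_cs(const Cmd *c)
{
    return FIELD_EX32(c->word[0], CMD_W0, CS);
}

static inline uint32_t cmd_ssid(const Cmd *c)
{
    return FIELD_EX32(c->word[0], CMD_W0, SSID);
}

static inline uint32_t cmd_vmid(const Cmd *c)
{
    return FIELD_EX32(c->word[1], CMD_W1, VMID);
}

static inline uint32_t cmd_asid(const Cmd *c)
{
    return FIELD_EX32(c->word[1], CMD_W1, ASID);
}
```

The address field is left out on purpose: the quoted CMD_ADDR reads
word[3] for both halves, while the TLBI_NH_VA handler in smmuv3.c takes
the low bits from word[2], so the intended layout would need checking
before writing it down once.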

> +
>  #endif
> diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
> index 2f96463..f35fadc 100644
> --- a/hw/arm/smmuv3.c
> +++ b/hw/arm/smmuv3.c
> @@ -72,7 +72,7 @@ static void smmuv3_irq_trigger(SMMUV3State *s, SMMUIrq irq, uint32_t gerror_val)
>      }
>  }
>
> -void smmuv3_write_gerrorn(SMMUV3State *s, uint32_t gerrorn)
> +static void smmuv3_write_gerrorn(SMMUV3State *s, uint32_t gerrorn)
>  {
>      uint32_t pending_gerrors = SMMU_PENDING_GERRORS(s);
>      uint32_t sanitized;
> @@ -116,7 +116,7 @@ static void smmu_q_write(SMMUQueue *q, void *data)
>      }
>  }
>
> -MemTxResult smmuv3_read_cmdq(SMMUV3State *s, Cmd *cmd)
> +static MemTxResult smmuv3_read_cmdq(SMMUV3State *s, Cmd *cmd)
>  {
>      SMMUQueue *q = &s->cmdq;
>      MemTxResult ret = smmu_q_read(q, cmd);
> @@ -224,6 +224,147 @@ static inline void smmu_update_base_reg(SMMUV3State *s, uint64_t *base,
>      *base = val & ~(SMMU_BASE_RA | 0x3fULL);
>  }
>
> +static int smmuv3_cmdq_consume(SMMUV3State *s)
> +{
> +    SMMUCmdError cmd_error = SMMU_CERROR_NONE;
> +
> +    trace_smmuv3_cmdq_consume(SMMU_CMDQ_ERR(s), smmu_cmd_q_enabled(s),
> +                              s->cmdq.prod, s->cmdq.cons,
> +                              s->cmdq.wrap.prod, s->cmdq.wrap.cons);
> +
> +    if (!smmu_cmd_q_enabled(s)) {
> +        return 0;
> +    }
> +
> +    while (!SMMU_CMDQ_ERR(s) && !smmu_is_q_empty(s, &s->cmdq)) {
> +        uint32_t type;
> +        Cmd cmd;
> +
> +        if (smmuv3_read_cmdq(s, &cmd) != MEMTX_OK) {
> +            cmd_error = SMMU_CERROR_ABT;
> +            break;
> +        }
> +
> +        type = CMD_TYPE(&cmd);
> +
> +        trace_smmuv3_cmdq_opcode(cmd_stringify[type]);
> +
> +        switch (CMD_TYPE(&cmd)) {
> +        case SMMU_CMD_SYNC:
> +            if (CMD_CS(&cmd) & CMD_SYNC_SIG_IRQ) {
> +                smmuv3_irq_trigger(s, SMMU_IRQ_CMD_SYNC, 0);
> +            }
> +            break;
> +        case SMMU_CMD_PREFETCH_CONFIG:
> +        case SMMU_CMD_PREFETCH_ADDR:
> +            break;
> +        case SMMU_CMD_CFGI_STE:
> +        {
> +             uint32_t streamid = cmd.word[1];
> +
> +             trace_smmuv3_cmdq_cfgi_ste(streamid);
> +            break;
> +        }
> +        case SMMU_CMD_CFGI_STE_RANGE: /* same as SMMU_CMD_CFGI_ALL */
> +        {
> +            uint32_t start = cmd.word[1], range, end;
> +
> +            range = extract32(cmd.word[2], 0, 5);
> +            end = start + (1 << (range + 1)) - 1;
> +            trace_smmuv3_cmdq_cfgi_ste_range(start, end);
> +            break;
> +        }
> +        case SMMU_CMD_CFGI_CD:
> +        case SMMU_CMD_CFGI_CD_ALL:
> +            trace_smmuv3_unhandled_cmd(type);

Do we cache anything that actually needs to be invalidated?

> +            break;
> +        case SMMU_CMD_TLBI_NH_ALL:
> +        case SMMU_CMD_TLBI_NH_ASID:
> +            trace_smmuv3_unhandled_cmd(type);

Ditto.

> +            break;
> +        case SMMU_CMD_TLBI_NH_VA:
> +        {
> +            int asid = extract32(cmd.word[1], 16, 16);
> +            int vmid = extract32(cmd.word[1], 0, 16);
> +            uint64_t low = extract32(cmd.word[2], 12, 20);
> +            uint64_t high = cmd.word[3];
> +            uint64_t addr = high << 32 | (low << 12);
> +
> +            trace_smmuv3_cmdq_tlbi_nh_va(asid, vmid, addr);
> +            break;
> +        }
> +        case SMMU_CMD_TLBI_NH_VAA:
> +        case SMMU_CMD_TLBI_EL3_ALL:
> +        case SMMU_CMD_TLBI_EL3_VA:
> +        case SMMU_CMD_TLBI_EL2_ALL:
> +        case SMMU_CMD_TLBI_EL2_ASID:
> +        case SMMU_CMD_TLBI_EL2_VA:
> +        case SMMU_CMD_TLBI_EL2_VAA:
> +        case SMMU_CMD_TLBI_S12_VMALL:
> +        case SMMU_CMD_TLBI_S2_IPA:
> +        case SMMU_CMD_TLBI_NSNH_ALL:
> +            trace_smmuv3_unhandled_cmd(type);
> +            break;
> +        case SMMU_CMD_ATC_INV:
> +        case SMMU_CMD_PRI_RESP:
> +        case SMMU_CMD_RESUME:
> +        case SMMU_CMD_STALL_TERM:
> +            trace_smmuv3_unhandled_cmd(type);
> +            break;

You could merge these two sets of cases (or maybe the TLBI are
trivial nops?)
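
Something along these lines, as a sketch only. The opcode values are
the ones from the enum in smmuv3-internal.h above; the helper name
smmu_cmd_is_unhandled() is made up for the example, and in the patch
the merged labels would simply sit in the existing switch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Subset of the command opcodes, with the values from the quoted enum. */
enum {
    SMMU_CMD_TLBI_NH_VAA    = 0x13,
    SMMU_CMD_TLBI_EL3_ALL   = 0x18,
    SMMU_CMD_TLBI_EL3_VA    = 0x1a,
    SMMU_CMD_TLBI_EL2_ALL   = 0x20,
    SMMU_CMD_TLBI_EL2_ASID  = 0x21,
    SMMU_CMD_TLBI_EL2_VA    = 0x22,
    SMMU_CMD_TLBI_EL2_VAA   = 0x23,
    SMMU_CMD_TLBI_S12_VMALL = 0x28,
    SMMU_CMD_TLBI_S2_IPA    = 0x2a,
    SMMU_CMD_TLBI_NSNH_ALL  = 0x30,
    SMMU_CMD_ATC_INV        = 0x40,
    SMMU_CMD_PRI_RESP       = 0x41,
    SMMU_CMD_RESUME         = 0x44,
    SMMU_CMD_STALL_TERM     = 0x45,
    SMMU_CMD_SYNC           = 0x46,
};

/*
 * Hypothetical helper: true for commands that are recognised but only
 * traced, with both sets of case labels folded into a single group.
 */
static bool smmu_cmd_is_unhandled(uint32_t type)
{
    switch (type) {
    case SMMU_CMD_TLBI_NH_VAA:
    case SMMU_CMD_TLBI_EL3_ALL:
    case SMMU_CMD_TLBI_EL3_VA:
    case SMMU_CMD_TLBI_EL2_ALL:
    case SMMU_CMD_TLBI_EL2_ASID:
    case SMMU_CMD_TLBI_EL2_VA:
    case SMMU_CMD_TLBI_EL2_VAA:
    case SMMU_CMD_TLBI_S12_VMALL:
    case SMMU_CMD_TLBI_S2_IPA:
    case SMMU_CMD_TLBI_NSNH_ALL:
    case SMMU_CMD_ATC_INV:
    case SMMU_CMD_PRI_RESP:
    case SMMU_CMD_RESUME:
    case SMMU_CMD_STALL_TERM:
        return true;
    default:
        return false;
    }
}
```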

> +        default:
> +            cmd_error = SMMU_CERROR_ILL;
> +            error_report("Illegal command type: %d", CMD_TYPE(&cmd));
> +            break;
> +        }
> +    }
> +
> +    if (cmd_error) {
> +        error_report("GERROR_CMDQ: CONS.ERR=%d", cmd_error);
> +        smmu_write_cmdq_err(s, cmd_error);
> +        smmuv3_irq_trigger(s, SMMU_IRQ_GERROR, SMMU_GERROR_CMDQ);
> +    }
> +
> +    trace_smmuv3_cmdq_consume_out(s->cmdq.wrap.prod, s->cmdq.prod,
> +                                  s->cmdq.wrap.cons, s->cmdq.cons);
> +
> +    return 0;


>  {
> +    SMMUState *sys = opaque;
> +    SMMUV3State *s = SMMU_V3_DEV(sys);
> +
> +    smmu_write_mmio_fixup(s, &addr);
> +
> +    trace_smmuv3_write_mmio(addr, val, size);
> +
> +    switch (addr) {
> +    case 0xFDC ... 0xFFC:
> +    case SMMU_REG_IDR0 ... SMMU_REG_IDR5:
> +        trace_smmuv3_write_mmio_idr(addr, val);
> +        return;
> +    case SMMU_REG_GERRORN:
> +        smmuv3_write_gerrorn(s, val);
> +        /*
> +         * By acknowledging the CMDQ_ERR, SW may notify cmds can
> +         * be processed again
> +         */
> +        smmuv3_cmdq_consume(s);
> +        return;
> +    case SMMU_REG_CR0:
> +        smmu_write32_reg(s, SMMU_REG_CR0, val);
> +        /* immediatly reflect the changes in CR0_ACK */
> +        smmu_write32_reg(s, SMMU_REG_CR0_ACK, val);

"immediately"

thanks
-- PMM

* Re: [Qemu-devel] [PATCH v7 09/20] hw/arm/smmuv3: Event queue recording helper
  2017-09-01 17:21 ` [Qemu-devel] [PATCH v7 09/20] hw/arm/smmuv3: Event queue recording helper Eric Auger
@ 2017-10-09 17:34   ` Peter Maydell
  0 siblings, 0 replies; 72+ messages in thread
From: Peter Maydell @ 2017-10-09 17:34 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, qemu-arm, QEMU Developers, Prem Mallappa,
	Alex Williamson, Andrew Jones, Christoffer Dall,
	Radha.Chintakuntla, Sunil.Goutham, Radha Mohan, Trey Cain,
	Bharat Bhushan, Tomasz Nowicki, Michael S. Tsirkin, Will Deacon,
	jean-philippe.brucker, robin.murphy, Peter Xu, Edgar E. Iglesias,
	wtownsen

On 1 September 2017 at 18:21, Eric Auger <eric.auger@redhat.com> wrote:
> Let's introduce a helper function aiming at recording an
> event in the event queue.
>
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>
> ---
>
> At the moment, for some events we do not fill all the fields.
> Typically, filling the FetchAddr field after an abort during a
> page table walk would require the walk to return more
> information in case of error.
>
> However with enabled use cases I have not seen any event
> recorded yet.
> ---
>  hw/arm/smmuv3-internal.h | 45 ++++++++++++++++++++++++--
>  hw/arm/smmuv3.c          | 84 +++++++++++++++++++++++++++++++++++++++++++++++-
>  2 files changed, 126 insertions(+), 3 deletions(-)
>
> diff --git a/hw/arm/smmuv3-internal.h b/hw/arm/smmuv3-internal.h
> index a5d60b4..e3e9828 100644
> --- a/hw/arm/smmuv3-internal.h
> +++ b/hw/arm/smmuv3-internal.h
> @@ -259,8 +259,6 @@ static inline void smmu_write_cmdq_err(SMMUV3State *s, uint32_t err_type)
>                          regval | err_type << SMMU_CMD_CONS_ERR_SHIFT);
>  }
>
> -void smmuv3_write_evtq(SMMUV3State *s, Evt *evt);
> -
>  /*****************************
>   * Commands
>   *****************************/
> @@ -361,4 +359,47 @@ enum { /* Command completion notification */
>              addr;                                       \
>          })
>
> +/*****************************
> + * EVTQ fields
> + *****************************/
> +
> +#define EVT_Q_OVERFLOW        (1 << 31)
> +
> +#define EVT_SET_TYPE(x, t)    deposit32((x)->word[0], 0, 8, t)
> +#define EVT_SET_SID(x, s)     ((x)->word[1] =  s)
> +#define EVT_SET_INPUT_ADDR(x, addr) ({                    \
> +            (x)->word[5] = (uint32_t)(addr >> 32);        \
> +            (x)->word[4] = (uint32_t)(addr & 0xffffffff); \
> +        })
> +#define EVT_SET_RNW(x, rnw)     deposit32((x)->word[3], 3, 1, rnw)
> +
> +/*****************************
> + * Events
> + *****************************/
> +
> +typedef enum evt_err {
> +    SMMU_EVT_OK,
> +    SMMU_EVT_F_UUT,
> +    SMMU_EVT_C_BAD_SID,
> +    SMMU_EVT_F_STE_FETCH,
> +    SMMU_EVT_C_BAD_STE,
> +    SMMU_EVT_F_BAD_ATS_REQ,
> +    SMMU_EVT_F_STREAM_DISABLED,
> +    SMMU_EVT_F_TRANS_FORBIDDEN,
> +    SMMU_EVT_C_BAD_SSID,
> +    SMMU_EVT_F_CD_FETCH,
> +    SMMU_EVT_C_BAD_CD,
> +    SMMU_EVT_F_WALK_EXT_ABRT,
> +    SMMU_EVT_F_TRANS        = 0x10,
> +    SMMU_EVT_F_ADDR_SZ,
> +    SMMU_EVT_F_ACCESS,
> +    SMMU_EVT_F_PERM,
> +    SMMU_EVT_F_TLB_CONFLICT = 0x20,
> +    SMMU_EVT_F_CFG_CONFLICT = 0x21,
> +    SMMU_EVT_E_PAGE_REQ     = 0x24,
> +} SMMUEvtErr;
> +
> +void smmuv3_record_event(SMMUV3State *s, hwaddr iova,
> +                         uint32_t sid, bool is_write, SMMUEvtErr type);
> +
>  #endif
> diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
> index f35fadc..7470576 100644
> --- a/hw/arm/smmuv3.c
> +++ b/hw/arm/smmuv3.c
> @@ -132,7 +132,7 @@ static MemTxResult smmuv3_read_cmdq(SMMUV3State *s, Cmd *cmd)
>      return ret;
>  }
>
> -void smmuv3_write_evtq(SMMUV3State *s, Evt *evt)
> +static void smmuv3_write_evtq(SMMUV3State *s, Evt *evt)
>  {
>      SMMUQueue *q = &s->evtq;
>      bool was_empty = smmu_is_q_empty(s, q);
> @@ -157,6 +157,88 @@ void smmuv3_write_evtq(SMMUV3State *s, Evt *evt)
>      }
>  }
>
> +/*
> + * smmuv3_record_event - Record an event
> + */
> +void smmuv3_record_event(SMMUV3State *s, hwaddr iova,

(Are you sure you want a hwaddr and not a uint64_t?)

> +                         uint32_t sid, IOMMUAccessFlags perm,
> +                         SMMUEvtErr type)
> +{
> +    Evt evt;
> +    bool rnw = perm & IOMMU_RO;

This doesn't feel like a great way to structure this, because
every event has different fields and so you'll end up with
a huge number of parameters to this function, half of which
are unused for any given callsite.

I think a better approach is to have one utility function
per event type (the syn_aa64_* functions in target/arm/internals.h
for filling out syndrome register values are an example of this).
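
A minimal sketch of the function-per-event idea, under some
assumptions: the Evt layout (eight 32-bit words) and the evt_* helper
names are made up for the example, deposit32() is a local stand-in for
the QEMU one, and the field positions are the ones used by the EVT_SET_*
macros above. Which fields each event really carries has to come from
the spec; the two helpers here only show the shape.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical event record: eight 32-bit words. */
typedef struct Evt {
    uint32_t word[8];
} Evt;

/* Event type values as in the quoted SMMUEvtErr enum. */
enum {
    SMMU_EVT_C_BAD_SID = 0x02,
    SMMU_EVT_F_TRANS   = 0x10,
};

/* Stand-in for QEMU's deposit32(); note that it returns the new value. */
static inline uint32_t deposit32(uint32_t value, int start, int length,
                                 uint32_t fieldval)
{
    uint32_t mask = (~0u >> (32 - length)) << start;

    return (value & ~mask) | ((fieldval << start) & mask);
}

/* Fields common to every record in this sketch: type and stream ID. */
static Evt evt_init(uint8_t type, uint32_t sid)
{
    Evt evt = { { 0 } };

    evt.word[0] = deposit32(evt.word[0], 0, 8, type);
    evt.word[1] = sid;
    return evt;
}

/* In this sketch a bad stream ID event records nothing but the SID. */
static Evt evt_c_bad_sid(uint32_t sid)
{
    return evt_init(SMMU_EVT_C_BAD_SID, sid);
}

/* A translation fault additionally records the input address and RnW. */
static Evt evt_f_translation(uint32_t sid, uint64_t iova, bool rnw)
{
    Evt evt = evt_init(SMMU_EVT_F_TRANS, sid);

    evt.word[3] = deposit32(evt.word[3], 3, 1, rnw);
    evt.word[4] = (uint32_t)iova;
    evt.word[5] = (uint32_t)(iova >> 32);
    return evt;
}
```

Each call site then passes only what its event needs, for example
evt_f_translation(sid, iova, rnw), and hands the result to the code
that writes the event queue.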

Or you could have a struct-and-union setup

typedef struct {
    uint8_t event_type;
    union {
        struct {
            hwaddr iova;
            bool rnw;
            // etc
        } f_uut;
        struct {
            // etc
        } c_bad_streamid;
        // etc
    } u;
} EventInfo;

and have callers fill that in for this function to then
marshal into the record structure. That might be a bit
heavyweight here compared to just having a function per
event -- depends a bit on whether your callsites are
going to find it helpful to construct an EventInfo in
one place and then hand it around/return it from a function
before recording it in the queue, or if you always have
all the info you need for the event right at the point
where you want to record it.
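
To make the struct-and-union variant a bit more concrete, here is a
sketch of the marshalling side. The common sid member, the
f_translation arm, the Evt layout and the smmu_event_marshal() name are
all assumptions for the example; the field positions follow the
EVT_SET_* macros in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical event record: eight 32-bit words. */
typedef struct Evt {
    uint32_t word[8];
} Evt;

/* Event type values as in the quoted SMMUEvtErr enum. */
enum {
    SMMU_EVT_F_UUT     = 0x01,
    SMMU_EVT_C_BAD_SID = 0x02,
    SMMU_EVT_F_TRANS   = 0x10,
};

typedef struct EventInfo {
    uint8_t event_type;
    uint32_t sid;               /* assumed common to all events here */
    union {
        struct {
            uint64_t iova;
            bool rnw;
        } f_uut;
        struct {
            uint64_t iova;
            bool rnw;
        } f_translation;
        /* C_BAD_SID needs nothing beyond the common fields */
    } u;
} EventInfo;

/* Input address in words 4/5, RnW in word 3 bit 3. */
static void evt_set_addr_rnw(Evt *evt, uint64_t iova, bool rnw)
{
    evt->word[3] |= (uint32_t)rnw << 3;
    evt->word[4] = (uint32_t)iova;
    evt->word[5] = (uint32_t)(iova >> 32);
}

/* Turn the caller-filled description into a queue record. */
static Evt smmu_event_marshal(const EventInfo *info)
{
    Evt evt = { { 0 } };

    evt.word[0] = info->event_type;     /* type is word[0] bits [7:0] */
    evt.word[1] = info->sid;

    switch (info->event_type) {
    case SMMU_EVT_F_UUT:
        evt_set_addr_rnw(&evt, info->u.f_uut.iova, info->u.f_uut.rnw);
        break;
    case SMMU_EVT_F_TRANS:
        evt_set_addr_rnw(&evt, info->u.f_translation.iova,
                         info->u.f_translation.rnw);
        break;
    case SMMU_EVT_C_BAD_SID:
    default:
        break;
    }
    return evt;
}
```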

thanks
-- PMM

* Re: [Qemu-devel] [PATCH v7 10/20] hw/arm/smmuv3: Implement translate callback
  2017-09-01 17:21 ` [Qemu-devel] [PATCH v7 10/20] hw/arm/smmuv3: Implement translate callback Eric Auger
@ 2017-10-09 17:45   ` Peter Maydell
  2018-02-06 12:19     ` Auger Eric
  0 siblings, 1 reply; 72+ messages in thread
From: Peter Maydell @ 2017-10-09 17:45 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, qemu-arm, QEMU Developers, Prem Mallappa,
	Alex Williamson, Andrew Jones, Christoffer Dall,
	Radha.Chintakuntla, Sunil.Goutham, Radha Mohan, Trey Cain,
	Bharat Bhushan, Tomasz Nowicki, Michael S. Tsirkin, Will Deacon,
	jean-philippe.brucker, robin.murphy, Peter Xu, Edgar E. Iglesias,
	wtownsen

On 1 September 2017 at 18:21, Eric Auger <eric.auger@redhat.com> wrote:
> This patch implements the IOMMU Memory Region translate()
> callback. Most of the code relates to the translation
> configuration decoding and check (STE, CD).
>
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> ---
>  hw/arm/smmuv3-internal.h | 182 +++++++++++++++++++++++-
>  hw/arm/smmuv3.c          | 351 ++++++++++++++++++++++++++++++++++++++++++++++-
>  hw/arm/trace-events      |   9 ++
>  3 files changed, 537 insertions(+), 5 deletions(-)
>
> diff --git a/hw/arm/smmuv3-internal.h b/hw/arm/smmuv3-internal.h
> index e3e9828..f9f95ae 100644
> --- a/hw/arm/smmuv3-internal.h
> +++ b/hw/arm/smmuv3-internal.h
> @@ -399,7 +399,185 @@ typedef enum evt_err {
>      SMMU_EVT_E_PAGE_REQ     = 0x24,
>  } SMMUEvtErr;
>
> -void smmuv3_record_event(SMMUV3State *s, hwaddr iova,
> -                         uint32_t sid, bool is_write, SMMUEvtErr type);
> +/*****************************
> + * Configuration Data
> + *****************************/
> +
> +typedef struct __smmu_data2  STEDesc; /* STE Level 1 Descriptor */
> +typedef struct __smmu_data16 Ste;     /* Stream Table Entry(STE) */
> +typedef struct __smmu_data2  CDDesc;  /* CD Level 1 Descriptor */
> +typedef struct __smmu_data16 Cd;      /* Context Descriptor(CD) */
> +
> +/*****************************
> + * STE fields
> + *****************************/
> +
> +#define STE_VALID(x)   extract32((x)->word[0], 0, 1) /* 0 */
> +#define STE_CONFIG(x)  extract32((x)->word[0], 1, 3)
> +enum {
> +    STE_CONFIG_NONE      = 0,
> +    STE_CONFIG_BYPASS    = 4,       /* S1 Bypass    , S2 Bypass */
> +    STE_CONFIG_S1        = 5,       /* S1 Translate , S2 Bypass */
> +    STE_CONFIG_S2        = 6,       /* S1 Bypass    , S2 Translate */
> +    STE_CONFIG_NESTED    = 7,       /* S1 Translate , S2 Translate */
> +};
> +#define STE_S1FMT(x)   extract32((x)->word[0], 4, 2)
> +#define STE_S1CDMAX(x) extract32((x)->word[1], 27, 5)
> +#define STE_EATS(x)    extract32((x)->word[2], 28, 2)
> +#define STE_STRW(x)    extract32((x)->word[2], 30, 2)
> +#define STE_S2VMID(x)  extract32((x)->word[4], 0, 16)
> +#define STE_S2T0SZ(x)  extract32((x)->word[5], 0, 6)
> +#define STE_S2SL0(x)   extract32((x)->word[5], 6, 2)
> +#define STE_S2TG(x)    extract32((x)->word[5], 14, 2)
> +#define STE_S2PS(x)    extract32((x)->word[5], 16, 3)
> +#define STE_S2AA64(x)  extract32((x)->word[5], 19, 1)
> +#define STE_S2HD(x)    extract32((x)->word[5], 24, 1)
> +#define STE_S2HA(x)    extract32((x)->word[5], 25, 1)
> +#define STE_S2S(x)     extract32((x)->word[5], 26, 1)
> +#define STE_CTXPTR(x)                                           \
> +    ({                                                          \
> +        unsigned long addr;                                     \
> +        addr = (uint64_t)extract32((x)->word[1], 0, 16) << 32;  \
> +        addr |= (uint64_t)((x)->word[0] & 0xffffffc0);          \
> +        addr;                                                   \
> +    })
> +
> +#define STE_S2TTB(x)                                            \
> +    ({                                                          \
> +        unsigned long addr;                                     \
> +        addr = (uint64_t)extract32((x)->word[7], 0, 16) << 32;  \
> +        addr |= (uint64_t)((x)->word[6] & 0xfffffff0);          \
> +        addr;                                                   \
> +    })
> +
> +static inline int is_ste_bypass(Ste *ste)

Types which are acronyms are all-caps, so STE, CD.

> +{
> +    return STE_CONFIG(ste) == STE_CONFIG_BYPASS;
> +}
> +
> +static inline bool is_ste_stage1(Ste *ste)
> +{
> +    return STE_CONFIG(ste) == STE_CONFIG_S1;
> +}
> +
> +static inline bool is_ste_stage2(Ste *ste)
> +{
> +    return STE_CONFIG(ste) == STE_CONFIG_S2;
> +}
> +
> +/**
> + * is_s2granule_valid - Check the stage 2 translation granule size
> + * advertised in the STE matches any IDR5 supported value
> + */
> +static inline bool is_s2granule_valid(Ste *ste)
> +{
> +    int idr5_format = 0;
> +
> +    switch (STE_S2TG(ste)) {
> +    case 0: /* 4kB */
> +        idr5_format = 0x1;
> +        break;
> +    case 1: /* 64 kB */
> +        idr5_format = 0x4;
> +        break;
> +    case 2: /* 16 kB */
> +        idr5_format = 0x2;
> +        break;
> +    case 3: /* reserved */
> +        break;
> +    }
> +    idr5_format &= SMMU_IDR5_GRAN;
> +    return idr5_format;
> +}
> +
> +static inline int oas2bits(int oas_field)
> +{
> +    switch (oas_field) {
> +    case 0b011:
> +        return 42;
> +    case 0b100:
> +        return 44;
> +    default:
> +        return 32 + (1 << oas_field);
> +   }
> +}
> +
> +static inline int pa_range(Ste *ste)
> +{
> +    int oas_field = MIN(STE_S2PS(ste), SMMU_IDR5_OAS);
> +
> +    if (!STE_S2AA64(ste)) {
> +        return 40;
> +    }
> +
> +    return oas2bits(oas_field);
> +}
> +
> +#define MAX_PA(ste) ((1 << pa_range(ste)) - 1)
> +
> +/*****************************
> + * CD fields
> + *****************************/
> +#define CD_VALID(x)   extract32((x)->word[0], 30, 1)
> +#define CD_ASID(x)    extract32((x)->word[1], 16, 16)
> +#define CD_TTB(x, sel)                                      \
> +    ({                                                      \
> +        uint64_t hi, lo;                                    \
> +        hi = extract32((x)->word[(sel) * 2 + 3], 0, 16);    \
> +        hi <<= 32;                                          \
> +        lo = (x)->word[(sel) * 2 + 2] & ~0xf;               \
> +        hi | lo;                                            \
> +    })
> +
> +#define CD_TSZ(x, sel)   extract32((x)->word[0], (16 * (sel)) + 0, 6)
> +#define CD_TG(x, sel)    extract32((x)->word[0], (16 * (sel)) + 6, 2)
> +#define CD_EPD(x, sel)   extract32((x)->word[0], (16 * (sel)) + 14, 1)
> +
> +#define CD_T0SZ(x)    CD_TSZ((x), 0)
> +#define CD_T1SZ(x)    CD_TSZ((x), 1)
> +#define CD_TG0(x)     CD_TG((x), 0)
> +#define CD_TG1(x)     CD_TG((x), 1)
> +#define CD_EPD0(x)    CD_EPD((x), 0)
> +#define CD_EPD1(x)    CD_EPD((x), 1)
> +#define CD_IPS(x)     extract32((x)->word[1], 0, 3)
> +#define CD_AARCH64(x) extract32((x)->word[1], 9, 1)
> +#define CD_TTB0(x)    CD_TTB((x), 0)
> +#define CD_TTB1(x)    CD_TTB((x), 1)
> +
> +#define CDM_VALID(x)    ((x)->word[0] & 0x1)
> +
> +static inline int is_cd_valid(SMMUV3State *s, Ste *ste, Cd *cd)
> +{
> +    return CD_VALID(cd);
> +}
> +
> +/**
> + * tg2granule - Decodes the CD translation granule size field according
> + * to the TT in use
> + * @bits: TG0/1 field
> + * @tg1: if set, @bits belong to TG1, otherwise belong to TG0
> + */
> +static inline int tg2granule(int bits, bool tg1)
> +{
> +    switch (bits) {
> +    case 1:
> +        return tg1 ? 14 : 16;
> +    case 2:
> +        return tg1 ? 12 : 14;
> +    case 3:
> +        return tg1 ? 16 : 12;
> +    default:
> +        return 12;
> +    }
> +}
> +
> +#define L1STD_L2PTR(stm) ({                                 \
> +            uint64_t hi, lo;                            \
> +            hi = (stm)->word[1];                        \
> +            lo = (stm)->word[0] & ~(uint64_t)0x1f;      \
> +            hi << 32 | lo;                              \
> +        })
> +
> +#define L1STD_SPAN(stm) (extract32((stm)->word[0], 0, 4))
>
>  #endif
> diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
> index 7470576..20fbce6 100644
> --- a/hw/arm/smmuv3.c
> +++ b/hw/arm/smmuv3.c
> @@ -160,9 +160,9 @@ static void smmuv3_write_evtq(SMMUV3State *s, Evt *evt)
>  /*
>   * smmuv3_record_event - Record an event
>   */
> -void smmuv3_record_event(SMMUV3State *s, hwaddr iova,
> -                         uint32_t sid, IOMMUAccessFlags perm,
> -                         SMMUEvtErr type)
> +static void smmuv3_record_event(SMMUV3State *s, hwaddr iova,
> +                                uint32_t sid, IOMMUAccessFlags perm,
> +                                SMMUEvtErr type)
>  {
>      Evt evt;
>      bool rnw = perm & IOMMU_RO;
> @@ -306,6 +306,348 @@ static inline void smmu_update_base_reg(SMMUV3State *s, uint64_t *base,
>      *base = val & ~(SMMU_BASE_RA | 0x3fULL);
>  }
>
> +/*
> + * All SMMU data structures are little endian, and are aligned to 8 bytes
> + * L1STE/STE/L1CD/CD, Queue entries in CMDQ/EVTQ/PRIQ
> + */
> +static inline int smmu_get_ste(SMMUV3State *s, hwaddr addr, Ste *buf)
> +{
> +    trace_smmuv3_get_ste(addr);
> +    return dma_memory_read(&address_space_memory, addr, buf, sizeof(*buf));

Why dma_memory_read() here when we were using other memory access
functions for things like TLB table walks earlier?

Incidentally, the spec requires us to perform memory accesses as
at least 64-bit single-copy atomic (see 3.21.3) -- does this do that?
(This gets important with SMP when the guest on another CPU might
be updating the STE or page table entry at the same time as we're
reading it...)
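
To illustrate the granularity in question (this is not how QEMU's
memory API is used, just a standalone sketch in C11): a 64-byte STE
read as eight 64-bit atomic loads, so that each doubleword is observed
whole even if another thread updates it concurrently. The STE layout,
the function name and the little-endian host assumption are all part of
the sketch, not of the patch.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical in-memory STE: sixteen 32-bit words, 64 bytes. */
typedef struct STE {
    uint32_t word[16];
} STE;

/*
 * Copy an STE out of shared memory one 64-bit doubleword at a time.
 * Each load is single-copy atomic, so a concurrent 64-bit store to the
 * same doubleword is seen either entirely or not at all. Assumes a
 * little-endian host, i.e. the low half of each doubleword is the
 * lower-numbered word.
 */
static void ste_read_atomic64(STE *dst, _Atomic uint64_t *src)
{
    for (int i = 0; i < 8; i++) {
        uint64_t v = atomic_load_explicit(&src[i], memory_order_relaxed);

        dst->word[2 * i] = (uint32_t)v;
        dst->word[2 * i + 1] = (uint32_t)(v >> 32);
    }
}
```

Note that this says nothing about consistency across doublewords; the
guarantee is per 64-bit access only.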

> +}
> +
> +/*
> + * For now we only support CD with a single entry, 'ssid' is used to identify
> + * otherwise
> + */
> +static inline int smmu_get_cd(SMMUV3State *s, Ste *ste, uint32_t ssid, Cd *buf)
> +{
> +    hwaddr addr = STE_CTXPTR(ste);
> +
> +    if (STE_S1CDMAX(ste) != 0) {
> +        error_report("Multilevel Ctx Descriptor not supported yet");
> +    }
> +
> +    trace_smmuv3_get_cd(addr);
> +    return dma_memory_read(&address_space_memory, addr, buf, sizeof(*buf));
> +}
> +
> +/**
> + * is_ste_consistent - Check validity of STE
> + * according to 6.2.1 Validity of STE
> + * TODO: check the relevance of each check and compliance
> + * with this spec chapter

Good idea :-)

thanks
-- PMM

* Re: [Qemu-devel] [PATCH v7 15/20] hw/arm/sysbus-fdt: Pass the VirtMachineState to the node creation functions
  2017-09-01 17:21 ` [Qemu-devel] [PATCH v7 15/20] hw/arm/sysbus-fdt: Pass the VirtMachineState to the node creation functions Eric Auger
@ 2017-10-09 17:47   ` Peter Maydell
  2017-11-13 13:00     ` Auger Eric
  0 siblings, 1 reply; 72+ messages in thread
From: Peter Maydell @ 2017-10-09 17:47 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, qemu-arm, QEMU Developers, Prem Mallappa,
	Alex Williamson, Andrew Jones, Christoffer Dall,
	Radha.Chintakuntla, Sunil.Goutham, Radha Mohan, Trey Cain,
	Bharat Bhushan, Tomasz Nowicki, Michael S. Tsirkin, Will Deacon,
	jean-philippe.brucker, robin.murphy, Peter Xu, Edgar E. Iglesias,
	wtownsen

On 1 September 2017 at 18:21, Eric Auger <eric.auger@redhat.com> wrote:
> The VirtMachineState contains some dt phandles that will be used
> in some node creation functions. For instance we plan to use the
> PCI host controller phandle in the smmu node creation function. So
> let's pass the VirtMachineState handle down to the node creation
> functions by enhancing the involved datatypes.
>
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> ---
>  hw/arm/sysbus-fdt.c         | 3 +++
>  hw/arm/virt.c               | 1 +
>  include/hw/arm/sysbus-fdt.h | 2 ++
>  3 files changed, 6 insertions(+)
>
> diff --git a/hw/arm/sysbus-fdt.c b/hw/arm/sysbus-fdt.c
> index d68e3dc..d92a983 100644
> --- a/hw/arm/sysbus-fdt.c
> +++ b/hw/arm/sysbus-fdt.c
> @@ -36,6 +36,7 @@
>  #include "hw/vfio/vfio-platform.h"
>  #include "hw/vfio/vfio-calxeda-xgmac.h"
>  #include "hw/vfio/vfio-amd-xgbe.h"
> +#include "hw/arm/virt.h"
>  #include "hw/arm/fdt.h"
>
>  /*
> @@ -47,6 +48,7 @@ typedef struct PlatformBusFDTData {
>      int irq_start; /* index of the first IRQ usable by platform bus devices */
>      const char *pbus_node_name; /* name of the platform bus node */
>      PlatformBusDevice *pbus;
> +    VirtMachineState *vms;
>  } PlatformBusFDTData;

sysbus-fdt isn't virt specific, so this doesn't belong here.

More generally, why is sysbus-fdt involved in this at all?
I expected that instantiating and wiring up the SMMU would
be the job of hw/arm/virt.c, like any other device we
might have on the board.

thanks
-- PMM

* Re: [Qemu-devel] [PATCH v7 19/20] hw/arm/smmuv3: [not for upstream] add SMMU_CMD_TLBI_NH_VA_AM handling
  2017-09-01 17:21 ` [Qemu-devel] [PATCH v7 19/20] hw/arm/smmuv3: [not for upstream] add SMMU_CMD_TLBI_NH_VA_AM handling Eric Auger
@ 2017-10-09 17:48   ` Peter Maydell
  2017-10-17 15:06   ` [Qemu-devel] [Qemu-arm] " Linu Cherian
  1 sibling, 0 replies; 72+ messages in thread
From: Peter Maydell @ 2017-10-09 17:48 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, qemu-arm, QEMU Developers, Prem Mallappa,
	Alex Williamson, Andrew Jones, Christoffer Dall,
	Radha.Chintakuntla, Sunil.Goutham, Radha Mohan, Trey Cain,
	Bharat Bhushan, Tomasz Nowicki, Michael S. Tsirkin, Will Deacon,
	jean-philippe.brucker, robin.murphy, Peter Xu, Edgar E. Iglesias,
	wtownsen

On 1 September 2017 at 18:21, Eric Auger <eric.auger@redhat.com> wrote:
> SMMUV3 does not support any IOVA range TLBI command:
> SMMU_CMD_TLBI_NH_VA invalidates TLB entries by page.
> That's an issue when running DPDK on guest. DPDK uses
> hugepages but each time a hugepage is mapped on guest side,
> a storm of SMMU_CMD_TLBI_NH_VA commands get sent by the
> guest smmuv3 driver and trapped by QEMU for VFIO replay.
>
> Let's get prepared to handle an implementation defined command,
> SMMU_CMD_TLBI_NH_VA_AM, which invalidates a range of IOVAs.
>
> Upon this command, we notify the whole range in one host.
>
> Signed-off-by: Eric Auger <eric.auger@redhat.com>

Definite NACK for this, I'm afraid. We should emulate the
hardware, not new things of our own devising.

thanks
-- PMM

* Re: [Qemu-devel] [PATCH v7 20/20] hw/arm/smmuv3: [not for upstream] Add caching-mode option
  2017-09-01 17:21 ` [Qemu-devel] [PATCH v7 20/20] hw/arm/smmuv3: [not for upstream] Add caching-mode option Eric Auger
@ 2017-10-09 17:49   ` Peter Maydell
  0 siblings, 0 replies; 72+ messages in thread
From: Peter Maydell @ 2017-10-09 17:49 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, qemu-arm, QEMU Developers, Prem Mallappa,
	Alex Williamson, Andrew Jones, Christoffer Dall,
	Radha.Chintakuntla, Sunil.Goutham, Radha Mohan, Trey Cain,
	Bharat Bhushan, Tomasz Nowicki, Michael S. Tsirkin, Will Deacon,
	jean-philippe.brucker, robin.murphy, Peter Xu, Edgar E. Iglesias,
	wtownsen

On 1 September 2017 at 18:21, Eric Auger <eric.auger@redhat.com> wrote:
> In VFIO use cases, the virtual smmu translates IOVA->IPA (stage 1)
> whereas the physical SMMU translates IPA -> host PA (stage 2).
>
> The 2 stages of the physical SMMU are currently not used.
> Instead both stage 1 and stage2 mappings are combined together
> and programmed in a single stage (S1) in the physical SMMU.
>
> The drawback of this approach is each time the IOVA->IPA mapping
> is changed by the guest, the host must be notified to re-program
> the physical SMMU with the combined stages.
>
> So we need to trap into the QEMU device each time the guest alters
> the configuration or TLB data. Unfortunately the SMMU does not
> expose any caching mode as the Intel IOMMU. On Intel, this caching
> mode HW bit informs the OS that each time it updates the remapping
> structures (even on map) it must invalidate the caches. Those
> invalidate commands are used to notify the host that it must
> recompute S1+S2 mappings and reprogram the HW.
>
> As we don't have the HW bit on ARM, we currently rely on a
> FW quirk on the guest smmuv3 driver side. When this FW quirk is
> applied the driver performs TLB invalidations on map and
> sends SMMU_CMD_TLBI_NH_VA_AM commands.
>
> Those TLB invalidations are used to trap changes in the
> translation tables.
>
> We introduced a new implementation defined SMMU_CMD_TLBI_NH_VA_AM
> command since it allows invalidating a whole range instead
> of invalidating a single page (native SMMU_CMD_TLBI_NH_VA command).
>
> As a consequence anybody wanting to use virtual smmuv3 in VFIO
> use case must add
> -device smmuv3,caching-mode
> to the option line.
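
To make the cost described in the quoted commit message concrete, here is a
toy model, not QEMU code. Two flat page maps stand in for the guest-owned
stage 1 (IOVA -> IPA) and the host-owned stage 2 (IPA -> host PA), and a
third holds the combined IOVA -> PA mapping that would be programmed into
the single physical stage. All names (ToySmmu, toy_guest_map, toy_replay)
and the 8-page table size are invented for illustration. The sketch only
shows that a stage 1 update leaves the combined entry stale until the host
recomputes it, which is why the guest's change has to be trapped.

```c
#include <assert.h>
#include <stdint.h>

#define NPAGES  8            /* toy address space: 8 pages per map */
#define INVALID UINT32_MAX   /* marks an unmapped page */

/* Hypothetical flat "page tables", indexed by page number. */
typedef struct {
    uint32_t s1[NPAGES];        /* guest-owned: IOVA page -> IPA page */
    uint32_t s2[NPAGES];        /* host-owned:  IPA page  -> PA page  */
    uint32_t combined[NPAGES];  /* programmed in HW: IOVA page -> PA page */
} ToySmmu;

static void toy_init(ToySmmu *t)
{
    for (int i = 0; i < NPAGES; i++) {
        t->s1[i] = INVALID;
        t->s2[i] = INVALID;
        t->combined[i] = INVALID;
    }
}

/* Guest updates stage 1 only; the combined entry is NOT touched here. */
static void toy_guest_map(ToySmmu *t, uint32_t iova_page, uint32_t ipa_page)
{
    assert(iova_page < NPAGES);
    t->s1[iova_page] = ipa_page;
}

/* Host-side work done on each trap: recompute S1+S2 for one IOVA page. */
static void toy_replay(ToySmmu *t, uint32_t iova_page)
{
    assert(iova_page < NPAGES);
    uint32_t ipa = t->s1[iova_page];
    t->combined[iova_page] = (ipa < NPAGES) ? t->s2[ipa] : INVALID;
}
```

In this model, a guest map followed by no replay leaves combined[] pointing
nowhere; the invalidation command on map is what gives the host the chance
to call the replay step.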

Even more of a NACK on this one. We shouldn't need to do
weird things to be able to use the SMMU in a VM. We need
to figure out how the spec expects us (and the kernel)
to be using the SMMU, and do that.

thanks
-- PMM

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [Qemu-devel] [Qemu-arm] [PATCH v7 19/20] hw/arm/smmuv3: [not for upstream] add SMMU_CMD_TLBI_NH_VA_AM handling
  2017-09-01 17:21 ` [Qemu-devel] [PATCH v7 19/20] hw/arm/smmuv3: [not for upstream] add SMMU_CMD_TLBI_NH_VA_AM handling Eric Auger
  2017-10-09 17:48   ` Peter Maydell
@ 2017-10-17 15:06   ` Linu Cherian
  1 sibling, 0 replies; 72+ messages in thread
From: Linu Cherian @ 2017-10-17 15:06 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, peter.maydell, qemu-arm, qemu-devel,
	prem.mallappa, alex.williamson, mohun106, drjones, tcain,
	Radha.Chintakuntla, Sunil.Goutham, mst, jean-philippe.brucker,
	tn, will.deacon, robin.murphy, peterx, bharat.bhushan,
	christoffer.dall, wtownsen, linu.cherian

Hi Eric,

On Fri Sep 01, 2017 at 07:21:22PM +0200, Eric Auger wrote:
> SMMUV3 does not support any IOVA range TLBI command:
> SMMU_CMD_TLBI_NH_VA invalidates TLB entries by page.
> That's an issue when running DPDK on guest. DPDK uses
> hugepages but each time a hugepage is mapped on guest side,
> a storm of SMMU_CMD_TLBI_NH_VA commands get sent by the
> guest smmuv3 driver and trapped by QEMU for VFIO replay.
> 
> Let's get prepared to handle implementation defined commands,
> SMMU_CMD_TLBI_NH_VA_VM, which invalidate a range of IOVAs.
> 
> Upon this command, we notify the whole range in one host.
> 
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> ---
>  hw/arm/smmuv3-internal.h |  1 +
>  hw/arm/smmuv3.c          | 13 +++++++++++++
>  hw/arm/trace-events      |  1 +
>  3 files changed, 15 insertions(+)
> 
> diff --git a/hw/arm/smmuv3-internal.h b/hw/arm/smmuv3-internal.h
> index f9f95ae..e70cf76 100644
> --- a/hw/arm/smmuv3-internal.h
> +++ b/hw/arm/smmuv3-internal.h
> @@ -289,6 +289,7 @@ enum {
>      SMMU_CMD_RESUME          = 0x44,
>      SMMU_CMD_STALL_TERM,
>      SMMU_CMD_SYNC,          /* 0x46 */
> +    SMMU_CMD_TLBI_NH_VA_AM   = 0x8F, /* VIOMMU Impl Defined */
>  };
>  
>  static const char *cmd_stringify[] = {
> diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
> index 9c8640f..55dc80b 100644
> --- a/hw/arm/smmuv3.c
> +++ b/hw/arm/smmuv3.c
> @@ -880,6 +880,19 @@ static int smmuv3_cmdq_consume(SMMUV3State *s)
>              smmuv3_replay_iova_range(&s->smmu_state, addr, size);
>              break;
>          }
> +        case SMMU_CMD_TLBI_NH_VA_AM:
> +        {
> +            int asid = extract32(cmd.word[1], 16, 16);
> +            int am = extract32(cmd.word[1], 0, 16);
> +            uint64_t low = extract32(cmd.word[2], 12, 20);
> +            uint64_t high = cmd.word[3];
> +            uint64_t addr = high << 32 | (low << 12);
> +            size_t size = am << 12;
> 


While testing DPDK, I observed map requests reaching the
arm smmuv3 driver with a size greater than 256M. Since the current
code supports only 256M (16 + 12 bits), I had to abuse the asid field
to pass the extra bits of the address mask to get things working.

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 04e2d75..51b1d07 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -1418,7 +1418,10 @@ static void arm_smmu_tlb_inv_range_nosync(unsigned long iova, size_t size,
                if (smmu->options & ARM_SMMU_OPT_TLBI_ON_MAP) {
                        cmd.opcode      = CMDQ_OP_TLBI_NH_VA_AM;
                        cmd.tlbi.am   = size >> 12;
+                       cmd.tlbi.asid = size >> 28;
                        granule = size;


On the QEMU side:

  if (cfg.disabled || cfg.bypassed) {
@@ -884,12 +899,15 @@ static int smmuv3_cmdq_consume(SMMUV3State *s)
         case SMMU_CMD_TLBI_NH_VA_AM:
         {
             int asid = extract32(cmd.word[1], 16, 16);
             int am = extract32(cmd.word[1], 0, 16);
             uint64_t low = extract32(cmd.word[2], 12, 20);
             uint64_t high = cmd.word[3];
             uint64_t addr = high << 32 | (low << 12);
             size_t size = am << 12;
 
+           am = am | asid << 16;
+           size = am << 12;        
+
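
The arithmetic behind the workaround above can be sketched in isolation.
This is a minimal illustration, not the driver or QEMU code: the struct and
the helper names (TlbiAmFields, tlbi_am_encode, tlbi_am_decode_size) are
hypothetical. It assumes a 4 KiB granule and treats the 16-bit am field as
the low half and the borrowed 16-bit asid field as the high half of a page
count. The decode is done in 64-bit arithmetic, since shifting an int left
by 12 would overflow for large ranges.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12   /* 4 KiB granule, matching "size >> 12" above */

/* Hypothetical container for the two 16-bit command fields. */
typedef struct {
    uint16_t am;    /* low 16 bits of the page count */
    uint16_t asid;  /* borrowed for the high 16 bits of the page count */
} TlbiAmFields;

/* Split a page-aligned size into the two 16-bit fields. */
static TlbiAmFields tlbi_am_encode(uint64_t size)
{
    uint64_t pages = size >> TOY_PAGE_SHIFT;
    TlbiAmFields f;

    assert((size & ((1ULL << TOY_PAGE_SHIFT) - 1)) == 0); /* page aligned */
    assert(pages <= UINT32_MAX);         /* must fit in 16 + 16 bits */
    f.am = (uint16_t)(pages & 0xffff);
    f.asid = (uint16_t)(pages >> 16);
    return f;
}

/* Rebuild the size, widening to 64 bits before any shift. */
static uint64_t tlbi_am_decode_size(TlbiAmFields f)
{
    uint64_t pages = ((uint64_t)f.asid << 16) | f.am;
    return pages << TOY_PAGE_SHIFT;
}
```

With these helpers a 256 MiB range encodes as am = 0, asid = 1 and decodes
back to 0x10000000, whereas the 16-bit am field alone would wrap to 0. The
price is that the real ASID can no longer be carried in the command.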



+
> +            trace_smmuv3_cmdq_tlbi_nh_va_am(asid, am, addr, size);
> +            smmuv3_replay_iova_range(&s->smmu_state, addr, size);
> +            break;
> +        }
>          case SMMU_CMD_TLBI_NH_VAA:
>          case SMMU_CMD_TLBI_EL3_ALL:
>          case SMMU_CMD_TLBI_EL3_VA:
> diff --git a/hw/arm/trace-events b/hw/arm/trace-events
> index 15f84d6..fba33ac 100644
> --- a/hw/arm/trace-events
> +++ b/hw/arm/trace-events
> @@ -26,6 +26,7 @@ smmuv3_cmdq_opcode(const char *opcode) "<--- %s"
>  smmuv3_cmdq_cfgi_ste(int streamid) "     |_ streamid =%d"
>  smmuv3_cmdq_cfgi_ste_range(int start, int end) "     |_ start=0x%d - end=0x%d"
>  smmuv3_cmdq_tlbi_nh_va(int asid, int vmid, uint64_t addr) "     |_ asid =%d vmid =%d addr=0x%"PRIx64
> +smmuv3_cmdq_tlbi_nh_va_am(int asid, int am, size_t size, uint64_t addr) "     |_ asid =%d am =%d size=0x%lx addr=0x%"PRIx64
>  smmuv3_cmdq_consume_out(uint8_t prod_wrap, uint32_t prod, uint8_t cons_wrap, uint32_t cons) "prod_wrap:%d, prod:0x%x cons_wrap:%d cons:0x%x"
>  smmuv3_update(bool is_empty, uint32_t prod, uint32_t cons, uint8_t prod_wrap, uint8_t cons_wrap) "q empty:%d prod:%d cons:%d p.wrap:%d p.cons:%d"
>  smmuv3_update_check_cmd(int error) "cmdq not enabled or error :0x%x"
> -- 
> 2.5.5
> 
> 

-- 
Linu cherian

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* Re: [Qemu-devel] [Qemu-arm] [PATCH v7 00/20] ARM SMMUv3 Emulation Support
  2017-09-01 17:21 [Qemu-devel] [PATCH v7 00/20] ARM SMMUv3 Emulation Support Eric Auger
                   ` (23 preceding siblings ...)
  2017-09-28  6:43 ` Linu Cherian
@ 2017-10-24  5:38 ` Linu Cherian
  2017-10-24 10:20   ` Will Deacon
  24 siblings, 1 reply; 72+ messages in thread
From: Linu Cherian @ 2017-10-24  5:38 UTC (permalink / raw)
  To: Eric Auger
  Cc: eric.auger.pro, peter.maydell, qemu-arm, qemu-devel,
	prem.mallappa, alex.williamson, mohun106, drjones, tcain,
	Radha.Chintakuntla, Sunil.Goutham, mst, jean-philippe.brucker,
	tn, will.deacon, robin.murphy, peterx, bharat.bhushan,
	christoffer.dall, wtownsen

Hi Eric,


On Fri Sep 01, 2017 at 07:21:03PM +0200, Eric Auger wrote:
> This series implements the emulation code for ARM SMMUv3.
> 
> Changes since v6:
> - DPDK testpmd now running on guest with 2 assigned VFs
> - Changed the instantiation method: add the following option to
>   the QEMU command line
>   -device smmuv3 # for virtio/vhost use cases
>   -device smmuv3,caching-mode # for vfio use cases (based on [1])
> - split the series into smaller patches to ease the review
> - the VFIO integration based on "tlbi-on-map" smmuv3 driver
>   is isolated from the rest: last 2 patches, not for upstream.
>   This is shipped for testing/bench until a better solution is found.
> - Reworked permission flag checks and event generation
> 
> testing:
> - in dt and ACPI modes
> - virtio-net-pci and vhost-net devices using dma ops with various
>   guest page sizes [2]
> - assigned VFs using dma ops [3]:
>   - AMD Overdrive and igbvf passthrough (using gsi direct mapping)
>   - Cavium ThunderX and ixgbevf passthrough (using KVM MSI routing)
> - DPDK testpmd on guest running with VFIO user space drivers (2 igbvf) [3]
>   with guest and host page size equal (4kB)
> 
> Known limitations:
> - no VMSAv8-32 support
> - no nested stage support (S1 + S2)
> - no support for HYP mappings
> - register fine emulation, commands, interrupts and errors were
>   not accurately tested. Handling is sufficient to run use cases
>   described above though.
> - interrupts and event generation not observed yet.
> 
> Best Regards
> 
> Eric
>

I was looking at options to get rid of the existing hacks we have
in this implementation (last two patches) and also to reduce the
map/unmap/translation overhead for the guest kernel devices.

Interestingly, the nested stage translation + SMMU emulation in the kernel
that we were exploring has already been tried by Will Deacon:
https://www.linuxplumbersconf.org/2014/ocw/system/presentations/2019/original/vsmmu-lpc14.pdf
https://lists.gnu.org/archive/html/qemu-devel/2015-06/msg03379.html


It would be nice to understand why this solution was not pursued, at least
for vfio-pci devices. Or, if you already have plans to do nested stage
support in the future, I would be interested to know about them.



 
> This series can be found at:
> v7: https://github.com/eauger/qemu/tree/v2.10.0-SMMU-v7
> Previous version at:
> v6: https://github.com/eauger/qemu/tree/v2.10.0-rc2-SMMU-v6
> 
> References:
> [1] [RFC v2 0/4] arm-smmu-v3 tlbi-on-map option
>     https://lkml.org/lkml/2017/8/11/426
> 
> [2] qemu cmd line excerpt:
> -device smmuv3 \
> -netdev tap,id=tap0,script=no,downscript=no,ifname=tap0,vhost=off \
> -device virtio-net-pci,netdev=tap0,mac=6a:f5:10:b1:3d:d2,iommu_platform,disable-modern=off,disable-legacy=on \
> [3] use -device smmuv3,caching-mode
> 
> 
> History:
> v6 -> v7:
> - see above
> 
> v5 -> v6:
> - Rebase on 2.10 and IOMMUMemoryRegion
> - add ACPI TLBI_ON_MAP support (VFIO integration also works in
>   ACPI mode)
> - fix block replay
> - handle implementation defined SMMU_CMD_TLBI_NH_VA_AM cmd
>   (goes along with TLBI_ON_MAP FW quirk)
> - replay systematically unmap the whole range first
> - smmuv3_map_hook does not unmap anymore and the unmap is done
>   before the replay
> - add and use smmuv3_context_device_invalidate instead of
>   blindly replaying everything
> 
> v4 -> v5:
> - initial_level now part of SMMUTransCfg
> - smmu_page_walk_64 takes into account the max input size
> - implement sys->iommu_ops.replay and sys->iommu_ops.notify_flag_changed
> - smmuv3_translate: bug fix: don't walk on bypass
> - smmu_update_qreg: fix PROD index update
> - I did not yet address Peter's comments as the code is not mature enough
>   to be split into sub patches.
> 
> v3 -> v4 [Eric]:
> - page table walk rewritten to allow scan of the page table within a
>   range of IOVA. This prepares for VFIO integration and replay.
> - configuration parsing partially reworked.
> - do not advertise unsupported/untested features: S2, S1 + S2, HYP,
>   PRI, ATS, ..
> - added ACPI table generation
> - migrated to dynamic traces
> - mingw compilation fix
> 
> v2 -> v3 [Eric]:
> - rebased on 2.9
> - mostly code and patch reorganization to ease the review process
> - optional patches removed. They may be handled separately. I am currently
>   working on ACPI enablement.
> - optional instantiation of the smmu in mach-virt
> - removed [2/9] (fdt functions) since not mandated
> - start splitting main patch into base and derived object
> - no new function feature added
> 
> v1 -> v2 [Prem]:
> - Adopted review comments from Eric Auger
>         - Make SMMU_DPRINTF to internally call qemu_log
>             (since translation requests are too many, we need control
>              on the type of log we want)
>         - SMMUTransCfg modified to suit simplicity
>         - Change RegInfo to uint64 register array
>         - Code cleanup
>         - Test cleanups
> - Reshuffled patches
> 
> v0 -> v1 [Prem]:
> - As per SMMUv3 spec 16.0 (only is_ste_consistant() is noticeable)
> - Reworked register access/update logic
> - Factored out translation code for
>         - single point bug fix
>         - sharing/removal in future
> - (optional) Unit tests added, with PCI test device
>         - S1 with 4k/64k, S1+S2 with 4k/64k
>         - (S1 or S2) only can be verified by Linux 4.7 driver
>         - (optional) Preliminary ACPI support
> 
> v0 [Prem]:
> - Implements SMMUv3 spec 11.0
> - Supported for PCIe devices,
> - Command Queue and Event Queue supported
> - LPAE only, S1 is supported and Tested, S2 not tested
> - BE mode Translation not supported
> - IRQ support (legacy, no MSI)
> 
> Eric Auger (18):
>   hw/arm/smmu-common: smmu base device and datatypes
>   hw/arm/smmu-common: IOMMU memory region and address space setup
>   hw/arm/smmu-common: smmu_read/write_sysmem
>   hw/arm/smmu-common: VMSAv8-64 page table walk
>   hw/arm/smmuv3: Wired IRQ and GERROR helpers
>   hw/arm/smmuv3: Queue helpers
>   hw/arm/smmuv3: Implement MMIO write operations
>   hw/arm/smmuv3: Event queue recording helper
>   hw/arm/smmuv3: Implement translate callback
>   target/arm/kvm: Translate the MSI doorbell in kvm_arch_fixup_msi_route
>   hw/arm/smmuv3: Implement data structure and TLB invalidation
>     notifications
>   hw/arm/smmuv3: Implement IOMMU memory region replay callback
>   hw/arm/virt: Store the PCI host controller dt phandle
>   hw/arm/sysbus-fdt: Pass the VirtMachineState to the node creation
>     functions
>   hw/arm/sysbus-fdt: Pass the platform bus base address in
>     PlatformBusFDTData
>   hw/arm/sysbus-fdt: Allow smmuv3 dynamic instantiation
>   hw/arm/smmuv3: [not for upstream] add SMMU_CMD_TLBI_NH_VA_AM handling
>   hw/arm/smmuv3: [not for upstream] Add caching-mode option
> 
> Prem Mallappa (2):
>   hw/arm/smmuv3: Skeleton
>   hw/arm/virt-acpi-build: Add smmuv3 node in IORT table
> 
>  default-configs/aarch64-softmmu.mak |    1 +
>  hw/arm/Makefile.objs                |    1 +
>  hw/arm/smmu-common.c                |  527 ++++++++++++++++
>  hw/arm/smmu-internal.h              |  105 ++++
>  hw/arm/smmuv3-internal.h            |  584 +++++++++++++++++
>  hw/arm/smmuv3.c                     | 1181 +++++++++++++++++++++++++++++++++++
>  hw/arm/sysbus-fdt.c                 |  129 +++-
>  hw/arm/trace-events                 |   48 ++
>  hw/arm/virt-acpi-build.c            |   63 +-
>  hw/arm/virt.c                       |    6 +-
>  include/hw/acpi/acpi-defs.h         |   15 +
>  include/hw/arm/smmu-common.h        |  123 ++++
>  include/hw/arm/smmuv3.h             |   80 +++
>  include/hw/arm/sysbus-fdt.h         |    2 +
>  include/hw/arm/virt.h               |   15 +
>  target/arm/kvm.c                    |   27 +
>  target/arm/trace-events             |    3 +
>  17 files changed, 2886 insertions(+), 24 deletions(-)
>  create mode 100644 hw/arm/smmu-common.c
>  create mode 100644 hw/arm/smmu-internal.h
>  create mode 100644 hw/arm/smmuv3-internal.h
>  create mode 100644 hw/arm/smmuv3.c
>  create mode 100644 include/hw/arm/smmu-common.h
>  create mode 100644 include/hw/arm/smmuv3.h
> 
> -- 
> 2.5.5
> 
> 

-- 
Linu cherian

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [Qemu-devel] [Qemu-arm] [PATCH v7 00/20] ARM SMMUv3 Emulation Support
  2017-10-24  5:38 ` Linu Cherian
@ 2017-10-24 10:20   ` Will Deacon
  2017-10-24 17:06     ` Linu Cherian
  0 siblings, 1 reply; 72+ messages in thread
From: Will Deacon @ 2017-10-24 10:20 UTC (permalink / raw)
  To: Linu Cherian
  Cc: Eric Auger, eric.auger.pro, peter.maydell, qemu-arm, qemu-devel,
	prem.mallappa, alex.williamson, mohun106, drjones, tcain,
	Radha.Chintakuntla, Sunil.Goutham, mst, jean-philippe.brucker,
	tn, robin.murphy, peterx, bharat.bhushan, christoffer.dall,
	wtownsen

On Tue, Oct 24, 2017 at 11:08:02AM +0530, Linu Cherian wrote:
> On Fri Sep 01, 2017 at 07:21:03PM +0200, Eric Auger wrote:
> > This series implements the emulation code for ARM SMMUv3.
> > 
> > Changes since v6:
> > - DPDK testpmd now running on guest with 2 assigned VFs
> > - Changed the instantiation method: add the following option to
> >   the QEMU command line
> >   -device smmuv3 # for virtio/vhost use cases
> >   -device smmuv3,caching-mode # for vfio use cases (based on [1])
> > - split the series into smaller patches to ease the review
> > - the VFIO integration based on "tlbi-on-map" smmuv3 driver
> >   is isolated from the rest: last 2 patches, not for upstream.
> >   This is shipped for testing/bench until a better solution is found.
> > - Reworked permission flag checks and event generation
> > 
> > testing:
> > - in dt and ACPI modes
> > - virtio-net-pci and vhost-net devices using dma ops with various
> >   guest page sizes [2]
> > - assigned VFs using dma ops [3]:
> >   - AMD Overdrive and igbvf passthrough (using gsi direct mapping)
> >   - Cavium ThunderX and ixgbevf passthrough (using KVM MSI routing)
> > - DPDK testpmd on guest running with VFIO user space drivers (2 igbvf) [3]
> >   with guest and host page size equal (4kB)
> > 
> > Known limitations:
> > - no VMSAv8-32 support
> > - no nested stage support (S1 + S2)
> > - no support for HYP mappings
> > - register fine emulation, commands, interrupts and errors were
> >   not accurately tested. Handling is sufficient to run use cases
> >   described above though.
> > - interrupts and event generation not observed yet.
> > 
> > Best Regards
> > 
> > Eric
> >
> 
> Was looking at options to get rid of the existing hacks we have
> in this implementation (last two patches) and also to reduce the map/unmap/translation 
> overhead for the guest kernel devices.
> 
> Interestingly, the nested stage translation + smmu emulation at kernel
>  that we were exploring, has been already tried by Will Deacon. 
> https://www.linuxplumbersconf.org/2014/ocw/system/presentations/2019/original/vsmmu-lpc14.pdf
> https://lists.gnu.org/archive/html/qemu-devel/2015-06/msg03379.html
> 
> 
> It would be nice to understand, why this solution was not pursued atleast for vfio-pci devices.
> OR
> If you have already plans to do nested stage support in the future, would be interested to know 
> about it.

I don't plan to revive that code. I got something well on the way to working
for SMMUv2, but it had some pretty major issues:

1. A huge amount of emulation code in the kernel
2. A horribly complicated user ABI
3. Keeping track of internal hardware caching state was a nightmare, so
   over-invalidation was rife
4. Errata workarounds meant trapping all SMMU accesses (inc. for stage 1)
5. I remember having issues with interrupts, but this was likely
   SMMUv2-specific
6. There was no scope for code re-use with other SMMU implementations (e.g.
   SMMUv3)

Overall, it was just an unmaintainable, non-performant
security-flaw-waiting-to-happen so I parked it. That's some of the
background behind me preferring a virtio-iommu approach, because there's
the potential for kernel acceleration using something like vhost.

Will

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [Qemu-devel] [Qemu-arm] [PATCH v7 00/20] ARM SMMUv3 Emulation Support
  2017-10-24 10:20   ` Will Deacon
@ 2017-10-24 17:06     ` Linu Cherian
  0 siblings, 0 replies; 72+ messages in thread
From: Linu Cherian @ 2017-10-24 17:06 UTC (permalink / raw)
  To: Will Deacon
  Cc: Eric Auger, eric.auger.pro, peter.maydell, qemu-arm, qemu-devel,
	prem.mallappa, alex.williamson, mohun106, drjones, tcain,
	Radha.Chintakuntla, Sunil.Goutham, mst, jean-philippe.brucker,
	tn, robin.murphy, peterx, bharat.bhushan, christoffer.dall,
	wtownsen, linu.cherian

Hi Will,

On Tue, Oct 24, 2017 at 11:20:29AM +0100, Will Deacon wrote:
> On Tue, Oct 24, 2017 at 11:08:02AM +0530, Linu Cherian wrote:
> > On Fri Sep 01, 2017 at 07:21:03PM +0200, Eric Auger wrote:
> > > This series implements the emulation code for ARM SMMUv3.
> > > 
> > > Changes since v6:
> > > - DPDK testpmd now running on guest with 2 assigned VFs
> > > - Changed the instantiation method: add the following option to
> > >   the QEMU command line
> > >   -device smmuv3 # for virtio/vhost use cases
> > >   -device smmuv3,caching-mode # for vfio use cases (based on [1])
> > > - split the series into smaller patches to ease the review
> > > - the VFIO integration based on "tlbi-on-map" smmuv3 driver
> > >   is isolated from the rest: last 2 patches, not for upstream.
> > >   This is shipped for testing/bench until a better solution is found.
> > > - Reworked permission flag checks and event generation
> > > 
> > > testing:
> > > - in dt and ACPI modes
> > > - virtio-net-pci and vhost-net devices using dma ops with various
> > >   guest page sizes [2]
> > > - assigned VFs using dma ops [3]:
> > >   - AMD Overdrive and igbvf passthrough (using gsi direct mapping)
> > >   - Cavium ThunderX and ixgbevf passthrough (using KVM MSI routing)
> > > - DPDK testpmd on guest running with VFIO user space drivers (2 igbvf) [3]
> > >   with guest and host page size equal (4kB)
> > > 
> > > Known limitations:
> > > - no VMSAv8-32 support
> > > - no nested stage support (S1 + S2)
> > > - no support for HYP mappings
> > > - register fine emulation, commands, interrupts and errors were
> > >   not accurately tested. Handling is sufficient to run use cases
> > >   described above though.
> > > - interrupts and event generation not observed yet.
> > > 
> > > Best Regards
> > > 
> > > Eric
> > >
> > 
> > Was looking at options to get rid of the existing hacks we have
> > in this implementation (last two patches) and also to reduce the map/unmap/translation 
> > overhead for the guest kernel devices.
> > 
> > Interestingly, the nested stage translation + smmu emulation at kernel
> >  that we were exploring, has been already tried by Will Deacon. 
> > https://www.linuxplumbersconf.org/2014/ocw/system/presentations/2019/original/vsmmu-lpc14.pdf
> > https://lists.gnu.org/archive/html/qemu-devel/2015-06/msg03379.html
> > 
> > 
> > It would be nice to understand, why this solution was not pursued atleast for vfio-pci devices.
> > OR
> > If you have already plans to do nested stage support in the future, would be interested to know 
> > about it.
> 
> I don't plan to revive that code. I got something well on the way to working
> for SMMUv2, but it had some pretty major issues:
> 
> 1. A huge amount of emulation code in the kernel
> 2. A horribly complicated user ABI
> 3. Keeping track of internal hardware caching state was a nightmare, so
>    over-invalidation was rife
> 4. Errata workarounds meant trapping all SMMU accesses (inc. for stage 1)
> 5. I remember having issues with interrupts, but this was likely
>    SMMUv2-specific
> 6. There was no scope for code re-use with other SMMU implementations (e.g.
>    SMMUv3)
> 
> Overall, it was just an unmaintainable, non-performant
> security-flaw-waiting-to-happen so I parked it. That's some of the
> background behind me preferring a virtio-iommu approach, because there's
> the potential for kernel acceleration using something like vhost.
> 
> Will

Thanks for the explanation.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [Qemu-devel] [PATCH v7 15/20] hw/arm/sysbus-fdt: Pass the VirtMachineState to the node creation functions
  2017-10-09 17:47   ` Peter Maydell
@ 2017-11-13 13:00     ` Auger Eric
  2017-11-13 13:08       ` Peter Maydell
  0 siblings, 1 reply; 72+ messages in thread
From: Auger Eric @ 2017-11-13 13:00 UTC (permalink / raw)
  To: Peter Maydell
  Cc: eric.auger.pro, qemu-arm, QEMU Developers, Prem Mallappa,
	Alex Williamson, Andrew Jones, Christoffer Dall,
	Radha.Chintakuntla, Sunil.Goutham, Radha Mohan, Trey Cain,
	Bharat Bhushan, Tomasz Nowicki, Michael S. Tsirkin, Will Deacon,
	jean-philippe.brucker, robin.murphy, Peter Xu, Edgar E. Iglesias,
	wtownsen

Hi Peter,

On 09/10/2017 19:47, Peter Maydell wrote:
> On 1 September 2017 at 18:21, Eric Auger <eric.auger@redhat.com> wrote:
>> The VirtMachineState contains some dt phandles that will be used
>> in some node creation functions. For instance we plan to use the
>> PCI host controller phandle in the smmu node creation function. So
>> let's pass the VirtMachineState handle down to the node creation
>> functions by enhancing the involved datatypes.
>>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>> ---
>>  hw/arm/sysbus-fdt.c         | 3 +++
>>  hw/arm/virt.c               | 1 +
>>  include/hw/arm/sysbus-fdt.h | 2 ++
>>  3 files changed, 6 insertions(+)
>>
>> diff --git a/hw/arm/sysbus-fdt.c b/hw/arm/sysbus-fdt.c
>> index d68e3dc..d92a983 100644
>> --- a/hw/arm/sysbus-fdt.c
>> +++ b/hw/arm/sysbus-fdt.c
>> @@ -36,6 +36,7 @@
>>  #include "hw/vfio/vfio-platform.h"
>>  #include "hw/vfio/vfio-calxeda-xgmac.h"
>>  #include "hw/vfio/vfio-amd-xgbe.h"
>> +#include "hw/arm/virt.h"
>>  #include "hw/arm/fdt.h"
>>
>>  /*
>> @@ -47,6 +48,7 @@ typedef struct PlatformBusFDTData {
>>      int irq_start; /* index of the first IRQ usable by platform bus devices */
>>      const char *pbus_node_name; /* name of the platform bus node */
>>      PlatformBusDevice *pbus;
>> +    VirtMachineState *vms;
>>  } PlatformBusFDTData;
> 
> sysbus-fdt isn't virt specific, so this doesn't belong here.
Correct. I plan to replace this with MachineState instead.
> 
> More generally, why is sysbus-fdt involved in this at all?
> I expected that instantiating and wiring up the SMMU would
> be the job of hw/arm/virt.c, like any other device we
> might have on the board.
I wanted the same type of option as on x86, where
"-device intel-iommu" is passed on the QEMU command line. The smmuv3 device
being a SysBusDevice, the natural framework to handle its node creation
function is sysbus-fdt. A -device approach is also practical for passing
other options to the device (this was typically the case for the
"caching-mode" option). On Intel there are the caching-mode and passthrough
(pt) options.

Although the smmu caching-mode option may not survive in this form, we
can foresee other options to come. In the future we may pass the PCI bus
number we want to plug the smmu into.

Also the invocation method would be identical for virtio-iommu.

I explored the other (and simpler) alternative of passing an option to
the virt machine, but I think this approach is less flexible. Also, as you
pointed out, the smmu would be on by default if we stick to the existing
convention. Given the perf overhead I don't think this is what we want.

Thanks

Eric
> 
> thanks
> -- PMM
> 

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [Qemu-devel] [PATCH v7 15/20] hw/arm/sysbus-fdt: Pass the VirtMachineState to the node creation functions
  2017-11-13 13:00     ` Auger Eric
@ 2017-11-13 13:08       ` Peter Maydell
  2017-11-13 13:37         ` Auger Eric
  0 siblings, 1 reply; 72+ messages in thread
From: Peter Maydell @ 2017-11-13 13:08 UTC (permalink / raw)
  To: Auger Eric
  Cc: eric.auger.pro, qemu-arm, QEMU Developers, Prem Mallappa,
	Alex Williamson, Andrew Jones, Christoffer Dall,
	Radha.Chintakuntla, Sunil.Goutham, Radha Mohan, Trey Cain,
	Bharat Bhushan, Tomasz Nowicki, Michael S. Tsirkin, Will Deacon,
	jean-philippe.brucker, robin.murphy, Peter Xu, Edgar E. Iglesias,
	wtownsen

On 13 November 2017 at 13:00, Auger Eric <eric.auger@redhat.com> wrote:
> On 09/10/2017 19:47, Peter Maydell wrote:
>> More generally, why is sysbus-fdt involved in this at all?
>> I expected that instantiating and wiring up the SMMU would
>> be the job of hw/arm/virt.c, like any other device we
>> might have on the board.
> I wished to have the same type of option as for x86 where
> "-device intel-iommu" is passed to the QEMU command line. smmuv3 device
> being a SysBusDevice, a natural framework to handle its node creation
> function is sysbus-fdt. Having a -device approach is practical to pass
> other options to the device (this was typically the case for the
> "caching-mode" option). On Intel there are caching-mode, passthrough
> (pt) options.

Not being able to conveniently wire up a sysbus device on the
command line or pass it options are general problems. I don't
think the SMMU is a special case that should work around these
general issues by being created in a different way to everything
else. If the "hard to pass options to the device" problem needs
solving (which it does anyway if we want to drop '-net' for
configuring embedded ethernet devices) we should solve it,
not have some small set of sysbus devices be weirdly magic.

thanks
-- PMM

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [Qemu-devel] [PATCH v7 15/20] hw/arm/sysbus-fdt: Pass the VirtMachineState to the node creation functions
  2017-11-13 13:08       ` Peter Maydell
@ 2017-11-13 13:37         ` Auger Eric
  2017-11-13 13:44           ` Peter Maydell
  0 siblings, 1 reply; 72+ messages in thread
From: Auger Eric @ 2017-11-13 13:37 UTC (permalink / raw)
  To: qemu-devel

Hi Peter,

On 13/11/2017 14:08, Peter Maydell wrote:
> On 13 November 2017 at 13:00, Auger Eric <eric.auger@redhat.com> wrote:
>> On 09/10/2017 19:47, Peter Maydell wrote:
>>> More generally, why is sysbus-fdt involved in this at all?
>>> I expected that instantiating and wiring up the SMMU would
>>> be the job of hw/arm/virt.c, like any other device we
>>> might have on the board.
>> I wished to have the same type of option as for x86 where
>> "-device intel-iommu" is passed to the QEMU command line. smmuv3 device
>> being a SysBusDevice, a natural framework to handle its node creation
>> function is sysbus-fdt. Having a -device approach is practical to pass
>> other options to the device (this was typically the case for the
>> "caching-mode" option). On Intel there are caching-mode, passthrough
>> (pt) options.
> 
> Not being able to conveniently wire up a sysbus device on the
> command line or pass it options are general problems. I don't
> think the SMMU is a special case that should work around these
> general issues by being created in a different way to everything
> else. If the "hard to pass options to the device" problem needs
> solving (which it does anyway if we want to drop '-net' for
> configuring embedded ethernet devices) we should solve it,
> not have some small set of sysbus devices be weirdly magic.
Do you have examples of other SysBusDevices whose instantiation is made
conditional on a "-device" option and which address the dt node/ACPI
table creation in a more standard manner? Or do you want me to drop the
"-device" requirement?

I may manage to reach my goal with yet another machine init done
notifier that would create the dt node from virt code at a fixed base
address. But this solution may still be virt specific. Is this the
direction you want me to follow at the moment?

Thanks

Eric

> 
> thanks
> -- PMM
> 

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [Qemu-devel] [PATCH v7 15/20] hw/arm/sysbus-fdt: Pass the VirtMachineState to the node creation functions
  2017-11-13 13:37         ` Auger Eric
@ 2017-11-13 13:44           ` Peter Maydell
  2017-11-13 13:59             ` Auger Eric
  0 siblings, 1 reply; 72+ messages in thread
From: Peter Maydell @ 2017-11-13 13:44 UTC (permalink / raw)
  To: Auger Eric; +Cc: QEMU Developers

On 13 November 2017 at 13:37, Auger Eric <eric.auger@redhat.com> wrote:
> On 13/11/2017 14:08, Peter Maydell wrote:
>> Not being able to conveniently wire up a sysbus device on the
>> command line or pass it options are general problems. I don't
>> think the SMMU is a special case that should work around these
>> general issues by being created in a different way to everything
>> else. If the "hard to pass options to the device" problem needs
>> solving (which it does anyway if we want to drop '-net' for
>> configuring embedded ethernet devices) we should solve it,
>> not have some small set of sysbus devices be weirdly magic.
> do you have examples of other SysbusDevices whose instantiation is made
> conditional with "-device" option and which address the dt node/ACPI
> table creation in a more standard manner? Or do you want me to drop the
> "-device" requirement.

I think that you just can't create sysbus devices with -device.
They're hardwired into the machine model.

> I may manage to reach my goal with yet another machine init done
> notifier that would create the dt node from virt code at a fixed base
> address. But this solution may still be virt specific. Is this the
> direction you want me to follow at the moment?

I still don't really understand why the SMMU has to be any
different from the UART, or the PCI controller, or any of
the other devices in the virt board model. None of those
try to be creatable from the command line.

thanks
-- PMM

* Re: [Qemu-devel] [PATCH v7 15/20] hw/arm/sysbus-fdt: Pass the VirtMachineState to the node creation functions
  2017-11-13 13:44           ` Peter Maydell
@ 2017-11-13 13:59             ` Auger Eric
  0 siblings, 0 replies; 72+ messages in thread
From: Auger Eric @ 2017-11-13 13:59 UTC (permalink / raw)
  To: Peter Maydell; +Cc: QEMU Developers

Hi,

On 13/11/2017 14:44, Peter Maydell wrote:
> On 13 November 2017 at 13:37, Auger Eric <eric.auger@redhat.com> wrote:
>> On 13/11/2017 14:08, Peter Maydell wrote:
>>> Not being able to conveniently wire up a sysbus device on the
>>> command line or pass it options are general problems. I don't
>>> think the SMMU is a special case that should work around these
>>> general issues by being created in a different way to everything
>>> else. If the "hard to pass options to the device" problem needs
>>> solving (which it does anyway if we want to drop '-net' for
>>> configuring embedded ethernet devices) we should solve it,
>>> not have some small set of sysbus devices be weirdly magic.
>> do you have examples of other SysbusDevices whose instantiation is made
>> conditional with "-device" option and which address the dt node/ACPI
>> table creation in a more standard manner? Or do you want me to drop the
>> "-device" requirement.
> 
> I think that you just can't create sysbus devices with -device.
> They're hardwired into the machine model.
OK
> 
>> I may manage reaching my goal with yet another machine init done
>> notifier that would create the dt node from virt code at fixed base
>> address. But this solution still may be be virt specific. Is it the
>> direction you want me to follow at the moment?
> 
> I still don't really understand why the SMMU has to be any
> different from the UART, or the PCI controller, or any of
> the other devices in the virt board model. None of those
> try to be creatable from the command line.
To me the difference is that if you don't use the UART or the PCI
controller, you suffer no performance degradation from their being
instantiated. Conversely, if the SMMU is instantiated when you don't
need its functionality, you do suffer a performance degradation.

Thanks

Eric
> 
> thanks
> -- PMM
> 

* Re: [Qemu-devel] [PATCH v7 10/20] hw/arm/smmuv3: Implement translate callback
  2017-10-09 17:45   ` Peter Maydell
@ 2018-02-06 12:19     ` Auger Eric
  2018-02-06 12:43       ` Peter Maydell
  0 siblings, 1 reply; 72+ messages in thread
From: Auger Eric @ 2018-02-06 12:19 UTC (permalink / raw)
  To: Peter Maydell
  Cc: eric.auger.pro, qemu-arm, QEMU Developers, Prem Mallappa,
	Alex Williamson, Andrew Jones, Christoffer Dall,
	Radha.Chintakuntla, Sunil.Goutham, Radha Mohan, Trey Cain,
	Bharat Bhushan, Tomasz Nowicki, Michael S. Tsirkin, Will Deacon,
	jean-philippe.brucker, robin.murphy, Peter Xu, Edgar E. Iglesias,
	wtownsen

Hi Peter,
On 09/10/17 19:45, Peter Maydell wrote:
> On 1 September 2017 at 18:21, Eric Auger <eric.auger@redhat.com> wrote:
>> This patch implements the IOMMU Memory Region translate()
>> callback. Most of the code relates to the translation
>> configuration decoding and check (STE, CD).
>>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>> ---
>>  hw/arm/smmuv3-internal.h | 182 +++++++++++++++++++++++-
>>  hw/arm/smmuv3.c          | 351 ++++++++++++++++++++++++++++++++++++++++++++++-
>>  hw/arm/trace-events      |   9 ++
>>  3 files changed, 537 insertions(+), 5 deletions(-)
>>
>> diff --git a/hw/arm/smmuv3-internal.h b/hw/arm/smmuv3-internal.h
>> index e3e9828..f9f95ae 100644
>> --- a/hw/arm/smmuv3-internal.h
>> +++ b/hw/arm/smmuv3-internal.h
>> @@ -399,7 +399,185 @@ typedef enum evt_err {
>>      SMMU_EVT_E_PAGE_REQ     = 0x24,
>>  } SMMUEvtErr;
>>
>> -void smmuv3_record_event(SMMUV3State *s, hwaddr iova,
>> -                         uint32_t sid, bool is_write, SMMUEvtErr type);
>> +/*****************************
>> + * Configuration Data
>> + *****************************/
>> +
>> +typedef struct __smmu_data2  STEDesc; /* STE Level 1 Descriptor */
>> +typedef struct __smmu_data16 Ste;     /* Stream Table Entry(STE) */
>> +typedef struct __smmu_data2  CDDesc;  /* CD Level 1 Descriptor */
>> +typedef struct __smmu_data16 Cd;      /* Context Descriptor(CD) */
>> +
>> +/*****************************
>> + * STE fields
>> + *****************************/
>> +
>> +#define STE_VALID(x)   extract32((x)->word[0], 0, 1) /* 0 */
>> +#define STE_CONFIG(x)  extract32((x)->word[0], 1, 3)
>> +enum {
>> +    STE_CONFIG_NONE      = 0,
>> +    STE_CONFIG_BYPASS    = 4,       /* S1 Bypass    , S2 Bypass */
>> +    STE_CONFIG_S1        = 5,       /* S1 Translate , S2 Bypass */
>> +    STE_CONFIG_S2        = 6,       /* S1 Bypass    , S2 Translate */
>> +    STE_CONFIG_NESTED    = 7,       /* S1 Translate , S2 Translate */
>> +};
>> +#define STE_S1FMT(x)   extract32((x)->word[0], 4, 2)
>> +#define STE_S1CDMAX(x) extract32((x)->word[1], 27, 5)
>> +#define STE_EATS(x)    extract32((x)->word[2], 28, 2)
>> +#define STE_STRW(x)    extract32((x)->word[2], 30, 2)
>> +#define STE_S2VMID(x)  extract32((x)->word[4], 0, 16)
>> +#define STE_S2T0SZ(x)  extract32((x)->word[5], 0, 6)
>> +#define STE_S2SL0(x)   extract32((x)->word[5], 6, 2)
>> +#define STE_S2TG(x)    extract32((x)->word[5], 14, 2)
>> +#define STE_S2PS(x)    extract32((x)->word[5], 16, 3)
>> +#define STE_S2AA64(x)  extract32((x)->word[5], 19, 1)
>> +#define STE_S2HD(x)    extract32((x)->word[5], 24, 1)
>> +#define STE_S2HA(x)    extract32((x)->word[5], 25, 1)
>> +#define STE_S2S(x)     extract32((x)->word[5], 26, 1)
>> +#define STE_CTXPTR(x)                                           \
>> +    ({                                                          \
>> +        unsigned long addr;                                     \
>> +        addr = (uint64_t)extract32((x)->word[1], 0, 16) << 32;  \
>> +        addr |= (uint64_t)((x)->word[0] & 0xffffffc0);          \
>> +        addr;                                                   \
>> +    })
>> +
>> +#define STE_S2TTB(x)                                            \
>> +    ({                                                          \
>> +        unsigned long addr;                                     \
>> +        addr = (uint64_t)extract32((x)->word[7], 0, 16) << 32;  \
>> +        addr |= (uint64_t)((x)->word[6] & 0xfffffff0);          \
>> +        addr;                                                   \
>> +    })
>> +
>> +static inline int is_ste_bypass(Ste *ste)
> 
> Types which are acronyms are all-caps, so STE, CD.
> 
>> +{
>> +    return STE_CONFIG(ste) == STE_CONFIG_BYPASS;
>> +}
>> +
>> +static inline bool is_ste_stage1(Ste *ste)
>> +{
>> +    return STE_CONFIG(ste) == STE_CONFIG_S1;
>> +}
>> +
>> +static inline bool is_ste_stage2(Ste *ste)
>> +{
>> +    return STE_CONFIG(ste) == STE_CONFIG_S2;
>> +}
>> +
>> +/**
>> + * is_s2granule_valid - Check the stage 2 translation granule size
>> + * advertised in the STE matches any IDR5 supported value
>> + */
>> +static inline bool is_s2granule_valid(Ste *ste)
>> +{
>> +    int idr5_format = 0;
>> +
>> +    switch (STE_S2TG(ste)) {
>> +    case 0: /* 4kB */
>> +        idr5_format = 0x1;
>> +        break;
>> +    case 1: /* 64 kB */
>> +        idr5_format = 0x4;
>> +        break;
>> +    case 2: /* 16 kB */
>> +        idr5_format = 0x2;
>> +        break;
>> +    case 3: /* reserved */
>> +        break;
>> +    }
>> +    idr5_format &= SMMU_IDR5_GRAN;
>> +    return idr5_format;
>> +}
>> +
>> +static inline int oas2bits(int oas_field)
>> +{
>> +    switch (oas_field) {
>> +    case 0b011:
>> +        return 42;
>> +    case 0b100:
>> +        return 44;
>> +    default:
>> +        return 32 + (1 << oas_field);
>> +   }
>> +}
>> +
>> +static inline int pa_range(Ste *ste)
>> +{
>> +    int oas_field = MIN(STE_S2PS(ste), SMMU_IDR5_OAS);
>> +
>> +    if (!STE_S2AA64(ste)) {
>> +        return 40;
>> +    }
>> +
>> +    return oas2bits(oas_field);
>> +}
>> +
>> +#define MAX_PA(ste) ((1 << pa_range(ste)) - 1)
>> +
>> +/*****************************
>> + * CD fields
>> + *****************************/
>> +#define CD_VALID(x)   extract32((x)->word[0], 30, 1)
>> +#define CD_ASID(x)    extract32((x)->word[1], 16, 16)
>> +#define CD_TTB(x, sel)                                      \
>> +    ({                                                      \
>> +        uint64_t hi, lo;                                    \
>> +        hi = extract32((x)->word[(sel) * 2 + 3], 0, 16);    \
>> +        hi <<= 32;                                          \
>> +        lo = (x)->word[(sel) * 2 + 2] & ~0xf;               \
>> +        hi | lo;                                            \
>> +    })
>> +
>> +#define CD_TSZ(x, sel)   extract32((x)->word[0], (16 * (sel)) + 0, 6)
>> +#define CD_TG(x, sel)    extract32((x)->word[0], (16 * (sel)) + 6, 2)
>> +#define CD_EPD(x, sel)   extract32((x)->word[0], (16 * (sel)) + 14, 1)
>> +
>> +#define CD_T0SZ(x)    CD_TSZ((x), 0)
>> +#define CD_T1SZ(x)    CD_TSZ((x), 1)
>> +#define CD_TG0(x)     CD_TG((x), 0)
>> +#define CD_TG1(x)     CD_TG((x), 1)
>> +#define CD_EPD0(x)    CD_EPD((x), 0)
>> +#define CD_EPD1(x)    CD_EPD((x), 1)
>> +#define CD_IPS(x)     extract32((x)->word[1], 0, 3)
>> +#define CD_AARCH64(x) extract32((x)->word[1], 9, 1)
>> +#define CD_TTB0(x)    CD_TTB((x), 0)
>> +#define CD_TTB1(x)    CD_TTB((x), 1)
>> +
>> +#define CDM_VALID(x)    ((x)->word[0] & 0x1)
>> +
>> +static inline int is_cd_valid(SMMUV3State *s, Ste *ste, Cd *cd)
>> +{
>> +    return CD_VALID(cd);
>> +}
>> +
>> +/**
>> + * tg2granule - Decodes the CD translation granule size field according
>> + * to the TT in use
>> + * @bits: TG0/1 fiels
>> + * @tg1: if set, @bits belong to TG1, otherwise belong to TG0
>> + */
>> +static inline int tg2granule(int bits, bool tg1)
>> +{
>> +    switch (bits) {
>> +    case 1:
>> +        return tg1 ? 14 : 16;
>> +    case 2:
>> +        return tg1 ? 12 : 14;
>> +    case 3:
>> +        return tg1 ? 16 : 12;
>> +    default:
>> +        return 12;
>> +    }
>> +}
>> +
>> +#define L1STD_L2PTR(stm) ({                                 \
>> +            uint64_t hi, lo;                            \
>> +            hi = (stm)->word[1];                        \
>> +            lo = (stm)->word[0] & ~(uint64_t)0x1f;      \
>> +            hi << 32 | lo;                              \
>> +        })
>> +
>> +#define L1STD_SPAN(stm) (extract32((stm)->word[0], 0, 4))
>>
>>  #endif
>> diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
>> index 7470576..20fbce6 100644
>> --- a/hw/arm/smmuv3.c
>> +++ b/hw/arm/smmuv3.c
>> @@ -160,9 +160,9 @@ static void smmuv3_write_evtq(SMMUV3State *s, Evt *evt)
>>  /*
>>   * smmuv3_record_event - Record an event
>>   */
>> -void smmuv3_record_event(SMMUV3State *s, hwaddr iova,
>> -                         uint32_t sid, IOMMUAccessFlags perm,
>> -                         SMMUEvtErr type)
>> +static void smmuv3_record_event(SMMUV3State *s, hwaddr iova,
>> +                                uint32_t sid, IOMMUAccessFlags perm,
>> +                                SMMUEvtErr type)
>>  {
>>      Evt evt;
>>      bool rnw = perm & IOMMU_RO;
>> @@ -306,6 +306,348 @@ static inline void smmu_update_base_reg(SMMUV3State *s, uint64_t *base,
>>      *base = val & ~(SMMU_BASE_RA | 0x3fULL);
>>  }
>>
>> +/*
>> + * All SMMU data structures are little endian, and are aligned to 8 bytes
>> + * L1STE/STE/L1CD/CD, Queue entries in CMDQ/EVTQ/PRIQ
>> + */
>> +static inline int smmu_get_ste(SMMUV3State *s, hwaddr addr, Ste *buf)
>> +{
>> +    trace_smmuv3_get_ste(addr);
>> +    return dma_memory_read(&address_space_memory, addr, buf, sizeof(*buf));
> 
> Why dma_memory_read() here when we were using other memory access
> functions for things like TLB table walks earlier?
> 
> Incidentally, the spec requires us to perform memory accesses as
> at least 64-bit single-copy atomic (see 3.21.3) -- does this do that?
> (This gets important with SMP when the guest on another CPU might
> be updating the STE or page table entry at the same time as we're
> reading it...)

Among all your comments on v7, here is the one I am the least
comfortable with. I was not able to figure out whether
dma_memory_read(), which I use now for all the descriptor accesses
guarantees this 64b single copy atomicity.

Unrelated, I also did not change the command descriptor field decoding
(you suggested using registerfields.h). The cmd struct is a structure
laid out in guest RAM, so I was not sure how to use the register API for
this decoding.
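
To make the question concrete, here is a minimal standalone sketch of
decoding fields from a descriptor that lives in guest RAM as little-endian
bytes. It does not use registerfields.h. field32() is a local stand-in with
the same contract as extract32(), and DemoDesc, desc_word() and the demo_*
helpers are hypothetical names used only for illustration. The bit
positions are the ones used by the STE macros in this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in with the same contract as extract32(): return 'length'
 * bits of 'value' starting at bit 'start'. */
static inline uint32_t field32(uint32_t value, int start, int length)
{
    assert(start >= 0 && length > 0 && length <= 32 - start);
    return (value >> start) & (~0U >> (32 - length));
}

/* Hypothetical 64-byte descriptor, exactly as the bytes sit in guest RAM. */
typedef struct DemoDesc {
    uint8_t bytes[64];
} DemoDesc;

/* Assemble 32-bit word 'idx' from little-endian bytes, so the result does
 * not depend on the endianness of the host. */
static inline uint32_t desc_word(const DemoDesc *d, unsigned idx)
{
    const uint8_t *p;

    assert(idx < 16);
    p = &d->bytes[idx * 4];
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Field positions mirror STE_VALID, STE_CONFIG and STE_S2VMID. */
static inline uint32_t demo_ste_valid(const DemoDesc *d)
{
    return field32(desc_word(d, 0), 0, 1);
}

static inline uint32_t demo_ste_config(const DemoDesc *d)
{
    return field32(desc_word(d, 0), 1, 3);
}

static inline uint32_t demo_ste_s2vmid(const DemoDesc *d)
{
    return field32(desc_word(d, 4), 0, 16);
}
```

The only point of the sketch is that the byte order is handled once, in
desc_word(), and every field accessor then works on a host-order word.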

With respect to the GPLv2 license issue, I have so far not been able to
get an ACK from any Broadcom representative for changing it to
"v2 or later". I will keep trying to get this approval though.

The IOMMU is no longer instantiated using sysbus-fdt. It is
instantiated according to a machine option, still set to false by
default, given the performance overhead.

Otherwise I think I covered all your comments in v8.

As I mentioned in my cover letter I intend to submit separate patches
later on for
- vhost integration
- TLB emulation (as done on intel iommu),

as the code base is already huge and I am reluctant to add other
features until the basic functionality has stabilized.

Thanks

Eric
> 
>> +}
>> +
>> +/*
>> + * For now we only support CD with a single entry, 'ssid' is used to identify
>> + * otherwise
>> + */
>> +static inline int smmu_get_cd(SMMUV3State *s, Ste *ste, uint32_t ssid, Cd *buf)
>> +{
>> +    hwaddr addr = STE_CTXPTR(ste);
>> +
>> +    if (STE_S1CDMAX(ste) != 0) {
>> +        error_report("Multilevel Ctx Descriptor not supported yet");
>> +    }
>> +
>> +    trace_smmuv3_get_cd(addr);
>> +    return dma_memory_read(&address_space_memory, addr, buf, sizeof(*buf));
>> +}
>> +
>> +/**
>> + * is_ste_consistent - Check validity of STE
>> + * according to 6.2.1 Validity of STE
>> + * TODO: check the relevance of each check and compliance
>> + * with this spec chapter
> 
> Good idea :-)
> 
> thanks
> -- PMM
> 

* Re: [Qemu-devel] [PATCH v7 10/20] hw/arm/smmuv3: Implement translate callback
  2018-02-06 12:19     ` Auger Eric
@ 2018-02-06 12:43       ` Peter Maydell
  2018-02-06 12:56         ` Auger Eric
  0 siblings, 1 reply; 72+ messages in thread
From: Peter Maydell @ 2018-02-06 12:43 UTC (permalink / raw)
  To: Auger Eric
  Cc: eric.auger.pro, qemu-arm, QEMU Developers, Prem Mallappa,
	Alex Williamson, Andrew Jones, Christoffer Dall,
	Radha.Chintakuntla, Sunil.Goutham, Radha Mohan, Trey Cain,
	Bharat Bhushan, Tomasz Nowicki, Michael S. Tsirkin, Will Deacon,
	Jean-Philippe Brucker, robin.murphy, Peter Xu, Edgar E. Iglesias,
	wtownsen

On 6 February 2018 at 12:19, Auger Eric <eric.auger@redhat.com> wrote:
> On 09/10/17 19:45, Peter Maydell wrote:
>> Incidentally, the spec requires us to perform memory accesses as
>> at least 64-bit single-copy atomic (see 3.21.3) -- does this do that?
>> (This gets important with SMP when the guest on another CPU might
>> be updating the STE or page table entry at the same time as we're
>> reading it...)
>
> Among all your comments on v7, here is the one I am the least
> comfortable with. I was not able to figure out whether
> dma_memory_read(), which I use now for all the descriptor accesses
> guarantees this 64b single copy atomicity.

It doesn't -- it winds up in flatview_read_continue(), which will
do a memcpy() from guest ram into the buffer, and since it's
using an arbitrary passed-in length value the chances are high
it won't end up using an atomic access on the host.

This is a nasty issue which we haven't figured out at all yet;
see also this thread:
https://lists.gnu.org/archive/html/qemu-devel/2017-12/msg03067.html

For the moment my suggestion would be that when you need
to do guest memory accesses that have atomicity requirements you:
 (a) use an accessor function which specifically loads a
  quantity of the correct size, rather than one which takes
  an arbitrary start-and-length
  (when we add APIs which do have atomicity guarantees they'll
  be "load 64 bit word" etc, so using our existing APIs of
  that form should avoid needing to restructure this code later)
 (b) add a TODO comment noting the required atomicity
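
A rough standalone sketch of what (a) and (b) could look like, assuming a
toy byte array in place of guest RAM. demo_guest_ram, demo_ldq_le() and
demo_read_desc() are hypothetical names for illustration, not existing
QEMU functions, and the sketch does not itself provide the atomicity:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy stand-in for guest RAM; a real device model would go through the
 * memory API rather than a static array. */
static uint8_t demo_guest_ram[4096];

/*
 * Hypothetical fixed-size accessor: load one little-endian 64-bit word.
 *
 * TODO: the SMMUv3 spec (3.21.3) requires at least 64-bit single-copy
 * atomicity for these accesses; this plain copy does not guarantee it.
 */
static uint64_t demo_ldq_le(uint64_t addr)
{
    uint8_t raw[8];
    uint64_t v = 0;
    int i;

    assert((addr & 7) == 0 && addr + 8 <= sizeof(demo_guest_ram));
    memcpy(raw, &demo_guest_ram[addr], sizeof(raw));
    /* Rebuild the value from bytes so it is host-endian independent. */
    for (i = 7; i >= 0; i--) {
        v = (v << 8) | raw[i];
    }
    return v;
}

/* Read a 64-byte descriptor as eight separate 64-bit loads instead of a
 * single start-and-length copy. */
static void demo_read_desc(uint64_t addr, uint64_t out[8])
{
    int i;

    for (i = 0; i < 8; i++) {
        out[i] = demo_ldq_le(addr + 8 * (uint64_t)i);
    }
}
```

The caller only ever asks for "one 64-bit word at this address", so the
body of the accessor can later be swapped for one with real atomicity
guarantees without restructuring the descriptor-reading code.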

thanks
-- PMM

* Re: [Qemu-devel] [PATCH v7 10/20] hw/arm/smmuv3: Implement translate callback
  2018-02-06 12:43       ` Peter Maydell
@ 2018-02-06 12:56         ` Auger Eric
  0 siblings, 0 replies; 72+ messages in thread
From: Auger Eric @ 2018-02-06 12:56 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Radha Mohan, Andrew Jones, Trey Cain, Radha.Chintakuntla,
	Sunil.Goutham, Michael S. Tsirkin, Jean-Philippe Brucker,
	Tomasz Nowicki, Will Deacon, QEMU Developers, Peter Xu,
	Alex Williamson, qemu-arm, Christoffer Dall, Edgar E. Iglesias,
	robin.murphy, wtownsen, Bharat Bhushan, Prem Mallappa,
	eric.auger.pro

Hi Peter,
On 06/02/18 13:43, Peter Maydell wrote:
> On 6 February 2018 at 12:19, Auger Eric <eric.auger@redhat.com> wrote:
>> On 09/10/17 19:45, Peter Maydell wrote:
>>> Incidentally, the spec requires us to perform memory accesses as
>>> at least 64-bit single-copy atomic (see 3.21.3) -- does this do that?
>>> (This gets important with SMP when the guest on another CPU might
>>> be updating the STE or page table entry at the same time as we're
>>> reading it...)
>>
>> Among all your comments on v7, here is the one I am the least
>> comfortable with. I was not able to figure out whether
>> dma_memory_read(), which I use now for all the descriptor accesses
>> guarantees this 64b single copy atomicity.
> 
> It doesn't -- it winds up in flatview_read_continue(), which will
> do a memcpy() from guest ram into the buffer, and since it's
> using an arbitrary passed-in length value the chances are high
> it won't end up using an atomic access on the host.
> 
> This is a nasty issue which we haven't figured out at all yet;
> see also this thread:
> https://lists.gnu.org/archive/html/qemu-devel/2017-12/msg03067.html
> 
> For the moment my suggestion would be that when you need
> to do guest memory accesses that have atomicity requirements you:
>  (a) use an accessor function which specifically loads a
>   quantity of the correct size, rather than one which takes
>   an arbitrary start-and-length
>   (when we add APIs which do have atomicity guarantees they'll
>   be "load 64 bit word" etc, so using our existing APIs of
>   that form should avoid needing to restructure this code later)
>  (b) add a TODO comment noting the required atomicity

OK, thank you for the link. To start with, I will add a comment in the
next version then.

Thanks

Eric
> 
> thanks
> -- PMM
> 
